Add Stacked DiD tutorial (Tutorial 13) by igerber · Pull Request #174 · igerber/diff-diff

igerber · 2026-02-19T23:51:35Z

Summary

Add new Jupyter notebook tutorial (docs/tutorials/13_stacked_did.ipynb) for the Stacked DiD estimator (Wing, Freedman & Hollingsworth 2024)
Covers basic usage, event study, stacked data inspection, IC1/IC2 trimming, Q-weight schemes, clean control definitions, estimator comparison, anticipation, and clustering
Update CLAUDE.md and README.md with tutorial 13 entry

Methodology references (required if estimator / math changes)

N/A - no methodology changes, tutorial only

Validation

Tests added/updated: No test changes (tutorial notebook)
Backtest / simulation / notebook evidence: Tutorial validated via nbconvert --execute — all cells pass

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

New Jupyter notebook covering Wing, Freedman & Hollingsworth (2024) StackedDiD estimator: basic usage, event study, stacked data inspection, IC1/IC2 trimming, Q-weight schemes, clean control definitions, estimator comparison with C&S and Imputation DiD, anticipation, and clustering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-19T23:54:52Z

Overall assessment: ⚠️ Needs changes

Executive summary

P1: The tutorial’s description of the aggregate Q-weight estimand says “equal weight per adoption cohort,” but the registry/paper define it as weighting by treated cohort size (N_a^D / N_Ω^D). docs/tutorials/13_stacked_did.ipynb:L235-L239
P3: The tutorial calls the event window “symmetric,” but kappa_pre and kappa_post can differ in both the implementation and the registry. docs/tutorials/13_stacked_did.ipynb:L62-L63, docs/tutorials/13_stacked_did.ipynb:L191-L193
P3: README tutorial list now includes 13 but still omits existing tutorials 11 and 12, which makes the list incomplete. README.md:L91-L103
No estimator code changes; changes are documentation + tutorial only.

Methodology

Severity: P1 | Impact: Misstates the target estimand for aggregate weighting, which can mislead users about what the estimator identifies (registry defines weights proportional to treated cohort size, not equal per cohort). | Fix: Update the wording to “weights by treated cohort size (N_a^D / N_Ω^D)” and adjust the follow‑up sentence that contrasts aggregate vs population/sample_share. docs/tutorials/13_stacked_did.ipynb:L235-L239
Severity: P3 | Impact: “Symmetric event window” is inaccurate because the method and code allow asymmetric windows (kappa_pre ≠ kappa_post). | Fix: Replace “symmetric event window” with “event window (can be asymmetric)” in both locations. docs/tutorials/13_stacked_did.ipynb:L62-L63, docs/tutorials/13_stacked_did.ipynb:L191-L193
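
To make the P1 distinction concrete, here is a minimal sketch contrasting equal-per-cohort weights with weights proportional to treated cohort size (N_a^D / N_Ω^D). The cohort sizes and helper names are made up for illustration; they are not taken from the tutorial or from diff_diff.

```python
from fractions import Fraction

# Hypothetical number of treated units per adoption cohort (illustrative only).
treated_counts = {2004: 10, 2006: 30, 2008: 60}


def equal_cohort_weights(counts):
    """Weight every adoption cohort equally, regardless of its size."""
    return {a: Fraction(1, len(counts)) for a in counts}


def cohort_size_weights(counts):
    """Weight each cohort by its share of all treated units: N_a^D / N_Omega^D."""
    total = sum(counts.values())
    return {a: Fraction(n, total) for a, n in counts.items()}


equal_w = equal_cohort_weights(treated_counts)
size_w = cohort_size_weights(treated_counts)

# Print the two weighting schemes side by side for each cohort.
for a in treated_counts:
    print(a, float(equal_w[a]), float(size_w[a]))
```

With unequal cohort sizes the two schemes target different averages, which is why the wording matters for readers interpreting the estimand.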

Code Quality
No issues found.

Performance
No issues found.

Maintainability
No issues found.

Tech Debt
No issues found.

Security
No issues found.

Documentation/Tests

Severity: P3 | Impact: README tutorial index is now incomplete, which can misdirect users to missing content. | Fix: Add rows for 11_imputation_did.ipynb and 12_two_stage_did.ipynb and keep the list in numeric order. README.md:L91-L103

Extract development checklists to .claude/commands/dev-checklists.md (accessible via /dev-checklists) and documentation requirements + test guidelines to CONTRIBUTING.md. Remove module structure listing, performance architecture, benchmark docs, and test structure listing — all discoverable from the codebase. Update cross-references in review-plan.md, docs-check.md, and check-plan-review.sh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…able

Address PR #174 review feedback:
- Correct aggregate Q-weight from "equal weight" to "weight by treated cohort size" per REGISTRY.md methodology
- Fix "symmetric event window" wording to note kappa_pre/kappa_post can differ
- Add missing tutorials 11 and 12 to README table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

igerber · 2026-02-20T12:30:45Z

/ai-review

github-actions · 2026-02-20T12:34:38Z

🔁 AI review rerun (requested by @igerber)

Head SHA: d436af72121331ebdf4658ee00b49d4b3e885af9

Overall assessment: ✅ Looks good

Executive Summary

No estimator or inference code changed; the tutorial content aligns with the StackedDiD Methodology Registry and class docstrings on IC1/IC2 trimming, clean controls, anticipation, and clustering.
Two minor documentation clarity items: aggregate Q‑weight summary could mislead, and global warning suppression hides important trimming/assumption warnings.
AGENTS.md was not found in the repo root; proceeded with REGISTRY/docstrings and the diff only.

Methodology

P3 | Impact: For default weighting="aggregate", Q‑weights vary by event time; the “Q‑Weight Summary by Sub‑Experiment” collapses to unit‑level and reports a single “effective control” metric, which can imply weights are constant within a sub‑experiment. | Fix: Summarize at the observation/event‑time level, or add a note that aggregate Q‑weights vary by event time; alternatively set weighting="population" or sample_share for that summary. (docs/tutorials/13_stacked_did.ipynb:L164-L179)
No other findings.

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

P3 | Impact: warnings.filterwarnings('ignore') suppresses trimming/IC1/IC2 warnings and other diagnostics that are pedagogically useful in a methodology tutorial. | Fix: Remove the global filter or scope it to specific warnings/cells (e.g., context manager around noisy sections). (docs/tutorials/13_stacked_did.ipynb:L35-L41)
No other findings.
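
One way to apply the suggested fix is to scope the filter with the standard-library `warnings.catch_warnings()` context manager. The sketch below uses two stand-in functions (both hypothetical) in place of real tutorial cells. The warning categories are assumptions, since the review does not say which category the estimator's trimming warning uses.

```python
import warnings


def noisy_step():
    """Stand-in for a cell that emits an uninteresting warning (hypothetical)."""
    warnings.warn("cosmetic plotting notice", FutureWarning)
    return "done"


def trimming_step():
    """Stand-in for an estimator call that emits a useful diagnostic (hypothetical)."""
    warnings.warn("cohort trimmed by IC1/IC2", UserWarning)
    return "fit"


# The outer recorder is only here so the example can show which warnings surface.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Suppress one category, for this block only.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        noisy_step()

    # The inner filter has been restored, so this diagnostic stays visible.
    trimming_step()

visible = [str(w.message) for w in caught]
print(visible)
```

Only the trimming message is recorded, so readers still see the diagnostic while the noisy section stays quiet.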

Drop `warnings.filterwarnings('ignore')` so IC1/IC2 trimming warnings are visible to tutorial readers. Add explanatory markdown interpreting the warnings and a concrete asymmetric-window code example showing how to recover a trimmed cohort.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
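
For readers of this thread, the idea behind recovering a trimmed cohort with an asymmetric window can be sketched as follows. This is not the notebook's code. It assumes IC1 requires the whole event window [a - kappa_pre, a + kappa_post] to lie inside the observed panel; the exact condition in diff_diff (for example, how anticipation shifts the window) may differ. The panel range, cohorts, and function names are hypothetical.

```python
def passes_ic1(adoption_time, kappa_pre, kappa_post, t_min, t_max):
    """Assumed IC1: the full event window must fall inside the observed panel."""
    return (adoption_time - kappa_pre >= t_min) and (adoption_time + kappa_post <= t_max)


# Hypothetical panel observed from 2000 to 2010 with three adoption cohorts.
T_MIN, T_MAX = 2000, 2010
cohorts = [2002, 2005, 2008]


def kept_cohorts(kappa_pre, kappa_post):
    """Return the cohorts whose event window fits inside the panel."""
    return [a for a in cohorts if passes_ic1(a, kappa_pre, kappa_post, T_MIN, T_MAX)]


# Symmetric window: 2002 needs data from 1999 and 2008 needs data through 2011.
symmetric = kept_cohorts(kappa_pre=3, kappa_post=3)

# Shortening only the pre-period recovers the early cohort.
asymmetric = kept_cohorts(kappa_pre=2, kappa_post=3)

print(symmetric, asymmetric)
```

Shrinking only the side of the window that runs off the panel keeps more cohorts than shrinking both sides symmetrically.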

igerber · 2026-02-20T13:28:48Z

/ai-review

github-actions · 2026-02-20T13:34:00Z

🔁 AI review rerun (requested by @igerber)

Head SHA: fde8fffc2e2923f2eb7d24257ff48c434e718189

Overall assessment: ✅ Looks good

Executive Summary

Doc-only PR; no estimator, weighting, or inference code changes.
New StackedDiD tutorial aligns with docs/methodology/REGISTRY.md and StackedDiD docstrings on IC1/IC2, controls, clustering, and anticipation.
Two minor tutorial clarifications will reduce confusion around trimming warnings and Q-weight summaries.

Methodology

P3 Impact: The tutorial’s “Eff. Ctrl” uses one Q-weight per unique control unit even though Q-weights are per-observation (and vary by event_time for aggregate weighting), so the number doesn’t correspond to the estimator’s weighting. Fix: compute the sum of control observation weights (or label it as a rough proxy) instead of summing unique-unit weights. Location: docs/tutorials/13_stacked_did.ipynb:L162-L177 (weighting is per-observation in diff_diff/stacked_did.py:L704-L738).
P3 Impact: The quoted trimming warning omits the IC2 clause and implies an IC1-only reason, which doesn’t match the actual warning text and can confuse users comparing outputs. Fix: include the IC2 sentence or avoid quoting the exact string. Location: docs/tutorials/13_stacked_did.ipynb:L229-L233; actual warning in diff_diff/stacked_did.py:L492-L499.
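
The gap described in the first finding shows up on a toy example. The rows, weights, and helper names below are made up for illustration; they are not the notebook's code or diff_diff output. The point is only that taking one weight per unique control unit discards the per-observation variation.

```python
# Hypothetical stacked control rows for one sub-experiment:
# (unit, event_time, q_weight). Weights vary by event_time.
control_rows = [
    ("c1", -1, 0.50), ("c1", 0, 0.50), ("c1", 1, 0.75),
    ("c2", -1, 0.50), ("c2", 0, 0.50),  # c2 is unobserved at event_time 1
]


def sum_first_weight_per_unit(rows):
    """Mimic a drop-duplicates-by-unit summary: keep the first weight seen per unit."""
    first = {}
    for unit, _, weight in rows:
        first.setdefault(unit, weight)
    return sum(first.values())


def sum_observation_weights(rows):
    """Sum the weights over every control observation."""
    return sum(weight for _, _, weight in rows)


per_unit = sum_first_weight_per_unit(control_rows)
per_observation = sum_observation_weights(control_rows)

print(per_unit, per_observation)
```

The two totals differ whenever weights change across event times or the panel is unbalanced, so a per-unit figure is at best a rough proxy.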

Code Quality
No issues found.

Performance
No issues found.

Maintainability
No issues found.

Tech Debt
No issues found.

Security
No issues found.

Documentation/Tests
No issues found. Tests not run (doc-only review).

…trimming warning quote

Drop "Eff. Ctrl" column from Q-weight summary table (cell 9) — the per-unit drop_duplicates computation is incorrect for unbalanced panels. Append IC2 clause to trimming warning blockquote (cell 12) to match the actual warning emitted by the estimator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

igerber · 2026-02-20T13:49:05Z

/ai-review

github-actions · 2026-02-20T13:52:50Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 8c6d19426151513185d507fd80c1ce39748bc9c2

Overall assessment: ✅ Looks good

Executive summary

Docs-only PR: new Stacked DiD tutorial plus documentation workflow updates; no estimator/math/default behavior changes.
Cross-checked the tutorial against docs/methodology/REGISTRY.md and diff_diff/stacked_did.py; IC1/IC2 trimming, clean control definitions, anticipation, and clustering align with the registry and code.
One minor doc clarity item around aggregate Q-weights for unbalanced panels.

Methodology

Severity: P3 | Impact: The Q-weight scheme description implies aggregate weighting is purely by treated cohort size, which can mislead readers on unbalanced panels where implementation uses event-time observation counts (per R reference) to match the registry. | Fix: Add a sentence in the Q-weight schemes section noting that aggregate Q-weights are computed at the (event_time, sub_exp) level for unbalanced panels (reduces to cohort-size weighting only for balanced panels). Location: docs/tutorials/13_stacked_did.ipynb:L255-L272.
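
A small sketch of why the balanced/unbalanced distinction matters. It assumes the aggregate control weight is the treated share divided by the control share, Q = (N_a^D / N^D) / (N_a^C / N^C), with counts taken separately at each event time. That formula, the toy counts, and the function names are assumptions for illustration, not the diff_diff implementation.

```python
from collections import Counter
from fractions import Fraction


def build_rows(spec):
    """spec maps (sub_exp, event_time) -> (n_treated_obs, n_control_obs)."""
    rows = []
    for (sub_exp, event_time), (n_d, n_c) in spec.items():
        rows += [(sub_exp, event_time, True)] * n_d
        rows += [(sub_exp, event_time, False)] * n_c
    return rows


def aggregate_q_weights(rows):
    """Control Q-weights per (sub_exp, event_time) under the assumed formula."""
    cell = Counter()   # observations per (sub_exp, event_time, treated)
    total = Counter()  # observations per (event_time, treated), all sub-experiments
    for sub_exp, event_time, treated in rows:
        cell[(sub_exp, event_time, treated)] += 1
        total[(event_time, treated)] += 1

    weights = {}
    for (sub_exp, event_time, treated), n_control in cell.items():
        if treated:
            continue  # treated observations keep weight 1 in this sketch
        treated_share = Fraction(cell[(sub_exp, event_time, True)], total[(event_time, True)])
        control_share = Fraction(n_control, total[(event_time, False)])
        weights[(sub_exp, event_time)] = treated_share / control_share
    return weights


balanced = {("A", -1): (2, 4), ("A", 0): (2, 4), ("B", -1): (6, 4), ("B", 0): (6, 4)}
# Same design, but sub-experiment A loses two control observations at event_time 0.
unbalanced = {**balanced, ("A", 0): (2, 2)}

w_bal = aggregate_q_weights(build_rows(balanced))
w_unbal = aggregate_q_weights(build_rows(unbalanced))

print(w_bal)
print(w_unbal)
```

In the balanced design each sub-experiment's control weight is the same at every event time. Once observations go missing, the weight shifts at the affected event time for every sub-experiment, because the totals change.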

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

No findings. (Docs-only change; no tests expected.)

Add parenthetical to Q-weight schemes section noting that aggregate weights are computed at the (event_time, sub_exp) observation level, which reduces to cohort-size weighting only for balanced panels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

igerber · 2026-02-20T13:57:03Z

/ai-review

github-actions · 2026-02-20T14:01:09Z

🔁 AI review rerun (requested by @igerber)

Head SHA: b1b7d1c0799142644a6e8faab0dc4d48488cce6b

Overall assessment: Looks good

Executive Summary

Docs-only change: adds docs/tutorials/13_stacked_did.ipynb and doc housekeeping; no estimator/inference code touched.
Cross-checked tutorial statements against docs/methodology/REGISTRY.md and diff_diff/stacked_did.py; IC1/IC2 trimming, Q-weights, clean controls, anticipation, and clustering align.
Notebook examples appear consistent with current API (StackedDiDResults attributes and plot_event_study support).
No tests added (reasonable for documentation-only change).

Methodology

No findings. Tutorial content aligns with the StackedDiD registry and in-code docstrings.

Code Quality

No findings. No code changes.

Performance

No findings. Documentation-only change.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings. No secrets or PII observed in the new notebook.

Documentation/Tests

No findings. README/CONTRIBUTING/tutorial updates are consistent; no tests required for docs-only change.

igerber and others added 2 commits February 20, 2026 07:30

igerber merged commit 7f19f29 into main Feb 20, 2026

igerber deleted the stacked-did-tutorial branch February 20, 2026 14:14

igerber merged 6 commits into main from stacked-did-tutorial
