
Add Stacked DiD tutorial (Tutorial 13)#174

Merged
igerber merged 6 commits into main from stacked-did-tutorial
Feb 20, 2026

Conversation

@igerber
Owner

@igerber igerber commented Feb 19, 2026

Summary

  • Add new Jupyter notebook tutorial (docs/tutorials/13_stacked_did.ipynb) for the Stacked DiD estimator (Wing, Freedman & Hollingsworth 2024)
  • Covers basic usage, event study, stacked data inspection, IC1/IC2 trimming, Q-weight schemes, clean control definitions, estimator comparison, anticipation, and clustering
  • Update CLAUDE.md and README.md with tutorial 13 entry

Methodology references (required if estimator / math changes)

  • N/A - no methodology changes, tutorial only

Validation

  • Tests added/updated: No test changes (tutorial notebook)
  • Backtest / simulation / notebook evidence: Tutorial validated via nbconvert --execute — all cells pass

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

New Jupyter notebook covering Wing, Freedman & Hollingsworth (2024)
StackedDiD estimator: basic usage, event study, stacked data inspection,
IC1/IC2 trimming, Q-weight schemes, clean control definitions, estimator
comparison with C&S and Imputation DiD, anticipation, and clustering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

Overall assessment: ⚠️ Needs changes

Executive summary

  • P1: The tutorial’s description of the aggregate Q-weight estimand says “equal weight per adoption cohort,” but the registry/paper define it as weighting by treated cohort size (N_a^D / N_Ω^D). docs/tutorials/13_stacked_did.ipynb:L235-L239
  • P3: The tutorial calls the event window “symmetric,” but kappa_pre and kappa_post can differ in both the implementation and the registry. docs/tutorials/13_stacked_did.ipynb:L62-L63, docs/tutorials/13_stacked_did.ipynb:L191-L193
  • P3: README tutorial list now includes 13 but still omits existing tutorials 11 and 12, which makes the list incomplete. README.md:L91-L103
  • No estimator code changes; changes are documentation + tutorial only.

Methodology

  • Severity: P1 | Impact: Misstates the target estimand for aggregate weighting, which can mislead users about what the estimator identifies (registry defines weights proportional to treated cohort size, not equal per cohort). | Fix: Update the wording to “weights by treated cohort size (N_a^D / N_Ω^D)” and adjust the follow‑up sentence that contrasts aggregate vs population/sample_share. docs/tutorials/13_stacked_did.ipynb:L235-L239
  • Severity: P3 | Impact: “Symmetric event window” is inaccurate because the method and code allow asymmetric windows (kappa_pre ≠ kappa_post). | Fix: Replace “symmetric event window” with “event window (can be asymmetric)” in both locations. docs/tutorials/13_stacked_did.ipynb:L62-L63, docs/tutorials/13_stacked_did.ipynb:L191-L193
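
To make the P1 distinction concrete, here is a minimal sketch contrasting "equal weight per adoption cohort" with weights proportional to treated cohort size (N_a^D / N_Ω^D). The cohort labels and treated-unit counts are invented for illustration and do not come from the tutorial or the library.

```python
# Hypothetical treated-unit counts per adoption cohort (N_a^D); illustrative only.
treated_counts = {2004: 10, 2006: 30, 2008: 60}

# Total number of treated units across cohorts (N_Omega^D).
total_treated = sum(treated_counts.values())

# "Equal weight per adoption cohort": the wording the review flagged.
equal_weights = {a: 1 / len(treated_counts) for a in treated_counts}

# Weight by treated cohort size, N_a^D / N_Omega^D: the wording the review asks for.
size_weights = {a: n / total_treated for a, n in treated_counts.items()}

for a in treated_counts:
    print(a, round(equal_weights[a], 3), round(size_weights[a], 3))
```

With unequal cohort sizes the two schemes target different averages, which is why the wording matters for the estimand.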

Code Quality
No issues found.

Performance
No issues found.

Maintainability
No issues found.

Tech Debt
No issues found.

Security
No issues found.

Documentation/Tests

  • Severity: P3 | Impact: README tutorial index is now incomplete, which can misdirect users to missing content. | Fix: Add rows for 11_imputation_did.ipynb and 12_two_stage_did.ipynb and keep the list in numeric order. README.md:L91-L103

igerber and others added 2 commits February 20, 2026 07:30
Extract development checklists to .claude/commands/dev-checklists.md (accessible
via /dev-checklists) and documentation requirements + test guidelines to
CONTRIBUTING.md. Remove module structure listing, performance architecture,
benchmark docs, and test structure listing — all discoverable from the codebase.
Update cross-references in review-plan.md, docs-check.md, and
check-plan-review.sh.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…able

Address PR #174 review feedback:
- Correct aggregate Q-weight from "equal weight" to "weight by treated
  cohort size" per REGISTRY.md methodology
- Fix "symmetric event window" wording to note kappa_pre/kappa_post can differ
- Add missing tutorials 11 and 12 to README table

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Feb 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: d436af72121331ebdf4658ee00b49d4b3e885af9


Overall assessment: ✅ Looks good

Executive Summary

  • No estimator or inference code changed; the tutorial content aligns with the StackedDiD Methodology Registry and class docstrings on IC1/IC2 trimming, clean controls, anticipation, and clustering.
  • Two minor documentation clarity items: aggregate Q‑weight summary could mislead, and global warning suppression hides important trimming/assumption warnings.
  • AGENTS.md was not found in the repo root; proceeded with REGISTRY/docstrings and the diff only.

Methodology

  • P3 | Impact: For default weighting="aggregate", Q‑weights vary by event time; the “Q‑Weight Summary by Sub‑Experiment” collapses to unit‑level and reports a single “effective control” metric, which can imply weights are constant within a sub‑experiment. | Fix: Summarize at the observation/event‑time level, or add a note that aggregate Q‑weights vary by event time; alternatively set weighting="population" or sample_share for that summary. (docs/tutorials/13_stacked_did.ipynb:L164-L179)
  • No other findings.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • P3 | Impact: warnings.filterwarnings('ignore') suppresses trimming/IC1/IC2 warnings and other diagnostics that are pedagogically useful in a methodology tutorial. | Fix: Remove the global filter or scope it to specific warnings/cells (e.g., context manager around noisy sections). (docs/tutorials/13_stacked_did.ipynb:L35-L41)
  • No other findings.
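
One way to scope suppression as suggested in the fix above is a `warnings.catch_warnings()` context manager around only the noisy cell. The sketch below uses standard-library calls only; `noisy_step` and `trimming_step` are hypothetical stand-ins for tutorial cells, and the warning messages are invented. The outer `record=True` block exists only to show which warnings remain visible.

```python
import warnings


def noisy_step():
    # Hypothetical stand-in for a cell that emits an uninteresting warning.
    warnings.warn("cosmetic deprecation notice", DeprecationWarning)
    return "done"


def trimming_step():
    # Hypothetical stand-in for a cell whose warning readers should see.
    warnings.warn("cohort trimmed (made-up message)", UserWarning)
    return "done"


# Record everything that is NOT suppressed, so we can inspect it afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Suppress one category, and only inside this block.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        noisy_step()

    # Filters are restored here, so this warning is still surfaced.
    trimming_step()

print([str(w.message) for w in caught])
```

Filtering by category (or by message pattern) keeps diagnostic warnings visible while hiding unrelated noise.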

Drop `warnings.filterwarnings('ignore')` so IC1/IC2 trimming warnings
are visible to tutorial readers. Add explanatory markdown interpreting the
warnings and a concrete asymmetric-window code example showing how to
recover a trimmed cohort.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Feb 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: fde8fffc2e2923f2eb7d24257ff48c434e718189


Overall assessment: ✅ Looks good

Executive Summary

  • Doc-only PR; no estimator, weighting, or inference code changes.
  • New StackedDiD tutorial aligns with docs/methodology/REGISTRY.md and StackedDiD docstrings on IC1/IC2, controls, clustering, and anticipation.
  • Two minor tutorial clarifications will reduce confusion around trimming warnings and Q-weight summaries.

Methodology

  • P3 Impact: The tutorial’s “Eff. Ctrl” uses one Q-weight per unique control unit even though Q-weights are per-observation (and vary by event_time for aggregate weighting), so the number doesn’t correspond to the estimator’s weighting. Fix: compute the sum of control observation weights (or label it as a rough proxy) instead of summing unique-unit weights. Location: docs/tutorials/13_stacked_did.ipynb:L162-L177 (weighting is per-observation in diff_diff/stacked_did.py:L704-L738).
  • P3 Impact: The quoted trimming warning omits the IC2 clause and implies an IC1-only reason, which doesn’t match the actual warning text and can confuse users comparing outputs. Fix: include the IC2 sentence or avoid quoting the exact string. Location: docs/tutorials/13_stacked_did.ipynb:L229-L233; actual warning in diff_diff/stacked_did.py:L492-L499.
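
The gap between the two "Eff. Ctrl" computations can be seen in a small sketch. The rows, unit names, and weight values below are invented, and the per-unit logic is only an assumed approximation of what a `drop_duplicates`-style summary would do; it is not the tutorial's actual code.

```python
# Hypothetical stacked control rows for one sub-experiment:
# (unit, event_time, q_weight). Weights vary by event_time.
control_rows = [
    ("u1", -1, 0.5), ("u1", 0, 0.8),
    ("u2", -1, 0.5), ("u2", 0, 0.8),
    ("u3", 0, 0.8),  # u3 is not observed at event_time -1 (unbalanced panel)
]

# Per-unit proxy: keep the first weight seen for each unit, then sum.
first_weight = {}
for unit, _, q in control_rows:
    first_weight.setdefault(unit, q)
per_unit_sum = sum(first_weight.values())

# Per-observation total: sum the weight on every control row.
per_obs_sum = sum(q for _, _, q in control_rows)

print(per_unit_sum, per_obs_sum)
```

The per-unit figure depends on which row happens to come first for each unit, which is why it does not correspond to the weighting applied per observation.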

Code Quality
No issues found.

Performance
No issues found.

Maintainability
No issues found.

Tech Debt
No issues found.

Security
No issues found.

Documentation/Tests
No issues found. Tests not run (doc-only review).

…trimming warning quote

Drop "Eff. Ctrl" column from Q-weight summary table (cell 9) — the
per-unit drop_duplicates computation is incorrect for unbalanced panels.
Append IC2 clause to trimming warning blockquote (cell 12) to match
the actual warning emitted by the estimator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Feb 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 8c6d19426151513185d507fd80c1ce39748bc9c2


Overall assessment: ✅ Looks good

Executive summary

  • Docs-only PR: new Stacked DiD tutorial plus documentation workflow updates; no estimator/math/default behavior changes.
  • Cross-checked the tutorial against docs/methodology/REGISTRY.md and diff_diff/stacked_did.py; IC1/IC2 trimming, clean control definitions, anticipation, and clustering align with the registry and code.
  • One minor doc clarity item around aggregate Q-weights for unbalanced panels.

Methodology

  • Severity: P3 | Impact: The Q-weight scheme description implies aggregate weighting is purely by treated cohort size, which can mislead readers on unbalanced panels, where the implementation uses event-time observation counts (per the R reference) to match the registry. | Fix: Add a sentence in the Q-weight schemes section noting that aggregate Q-weights are computed at the (event_time, sub_exp) level for unbalanced panels (and reduce to cohort-size weighting only for balanced panels). Location: docs/tutorials/13_stacked_did.ipynb:255-272.
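
A rough sketch of weights computed at the (event_time, sub_exp) level follows. It assumes the ratio-of-shares form for control observations, Q = (treated share of the sub-experiment) / (control share of the sub-experiment), with counts taken separately at each event time. That form and all data below are assumptions for illustration; the code in diff_diff/stacked_did.py may differ in detail.

```python
from collections import Counter

# Hypothetical stacked rows: (sub_exp, event_time, treated flag).
rows = (
    [("a1", e, 1) for e in (-1, 0)] * 2    # a1: 2 treated obs per event time
    + [("a1", e, 0) for e in (-1, 0)] * 4  # a1: 4 control obs per event time
    + [("a2", e, 1) for e in (-1, 0)] * 6  # a2: 6 treated obs per event time
    + [("a2", 0, 0)] * 4                   # a2: 4 controls at event time 0
    + [("a2", -1, 0)] * 2                  # a2: only 2 controls at event time -1
)

# Observation counts by (sub_exp, event_time, treated) and by (event_time, treated).
cell_n = Counter(rows)
stack_n = Counter((e, d) for _, e, d in rows)


def control_q_weight(sub_exp, event_time):
    """Assumed control weight: treated share divided by control share."""
    treated_share = cell_n[(sub_exp, event_time, 1)] / stack_n[(event_time, 1)]
    control_share = cell_n[(sub_exp, event_time, 0)] / stack_n[(event_time, 0)]
    return treated_share / control_share


for s in ("a1", "a2"):
    for e in (-1, 0):
        print(s, e, control_q_weight(s, e))
```

In this toy data the treated counts are balanced but the a2 controls are not, so the control weight within each sub-experiment changes across event times. With a fully balanced panel the weight would be constant within a sub-experiment.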

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • No findings. (Docs-only change; no tests expected.)

Add parenthetical to Q-weight schemes section noting that aggregate
weights are computed at the (event_time, sub_exp) observation level,
which reduces to cohort-size weighting only for balanced panels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Owner Author

igerber commented Feb 20, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: b1b7d1c0799142644a6e8faab0dc4d48488cce6b


Overall assessment: Looks good

Executive Summary

  • Docs-only change: adds docs/tutorials/13_stacked_did.ipynb and doc housekeeping; no estimator/inference code touched.
  • Cross-checked tutorial statements against docs/methodology/REGISTRY.md and diff_diff/stacked_did.py; IC1/IC2 trimming, Q-weights, clean controls, anticipation, and clustering align.
  • Notebook examples appear consistent with current API (StackedDiDResults attributes and plot_event_study support).
  • No tests added (reasonable for documentation-only change).

Methodology

  • No findings. Tutorial content aligns with the StackedDiD registry and in-code docstrings.

Code Quality

  • No findings. No code changes.

Performance

  • No findings. Documentation-only change.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings. No secrets or PII observed in the new notebook.

Documentation/Tests

  • No findings. README/CONTRIBUTING/tutorial updates are consistent; no tests required for docs-only change.

@igerber igerber merged commit 7f19f29 into main Feb 20, 2026
@igerber igerber deleted the stacked-did-tutorial branch February 20, 2026 14:14