Add Stacked DiD estimator (Wing, Freedman & Hollingsworth 2024) #172
Merged
Implement the Stacked Difference-in-Differences estimator from NBER WP 32054. The estimator corrects bias in naive stacked regressions by applying Q-weights that ensure stable composition across event time.

Core implementation:
- StackedDiD class with IC1/IC2 trimming, Q-weight computation (aggregate, population, sample_share), WLS event study regression (Eq. 3), and delta-method SE for overall ATT
- Three clean control modes: not_yet_treated, strict, never_treated
- Clustering at unit or unit×sub-experiment level
- Anticipation parameter support

Validated against R reference implementation (stacked-did-weights by co-author Hollingsworth): ATT matches within 2.1e-11, SE within 4.0e-10, all event study coefficients match to machine epsilon.

Includes 72 tests, R/Python benchmark scripts, full documentation (README, API docs, REGISTRY.md, METHODOLOGY_REVIEW.md), and ROADMAP update.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
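For readers unfamiliar with Q-weights, the following is a minimal sketch of the aggregate scheme, not the code in this PR. It assumes the common formulation in which treated observations get weight 1 and controls in a sub-experiment get the ratio of that sub-experiment's treated share to its control share. The function name `aggregate_q_weights` and the `(sub_exp, treated)` row layout are hypothetical, and counts are taken per sub-experiment, which only coincides with per-event-time counts when the panel is balanced.

```python
from collections import Counter


def aggregate_q_weights(rows):
    """Return one weight per stacked observation.

    rows: list of (sub_exp, treated) tuples (hypothetical layout).
    Treated rows get weight 1; control rows in sub-experiment a get
    (N_treated_a / N_treated) / (N_control_a / N_control).
    """
    n_treat = Counter(s for s, d in rows if d)
    n_ctrl = Counter(s for s, d in rows if not d)
    tot_treat = sum(n_treat.values())
    tot_ctrl = sum(n_ctrl.values())

    weights = []
    for sub_exp, treated in rows:
        if treated:
            weights.append(1.0)
        else:
            treat_share = n_treat[sub_exp] / tot_treat
            ctrl_share = n_ctrl[sub_exp] / tot_ctrl
            weights.append(treat_share / ctrl_share)
    return weights


# Sub-experiment 1: 2 treated, 2 controls; sub-experiment 2: 1 treated, 4 controls.
rows = [(1, True)] * 2 + [(1, False)] * 2 + [(2, True)] * 1 + [(2, False)] * 4
weights = aggregate_q_weights(rows)
```

With these weights the controls of each sub-experiment carry the same share of the total control weight as its treated units carry of the treated total, which is the composition-stability property the description refers to.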
Overall assessment:
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
CRITICAL:
- Fix Q-weight computation for unbalanced panels: aggregate weighting now uses observation counts per (event_time, sub_exp), matching R reference compute_weights(). Population/sample_share unchanged.
- Fix anticipation parameter: reference period shifts to e=-1-anticipation, post-treatment includes anticipation periods, consistent with ImputationDiD/TwoStageDiD/SunAbraham.
- Remove aggregate='group' and aggregate='all': pooled stacked regression cannot produce cohort-specific effects. Raises ValueError with guidance.

MEDIUM:
- Fix n_sub_experiments to track actual built sub-experiments, warn on empty.
- Fix README/API parameter blocks to match actual constructor signature.
- Fix REGISTRY.md algorithm steps and add anticipation edge cases.
- Strengthen test assertions for anticipation, Q-weights, and unbalanced panels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
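The anticipation fix can be illustrated with a small sketch. The helper names `reference_period` and `is_post` are hypothetical and are not taken from the PR; they only encode the rule stated in the commit message, namely that the reference period is e = -1 - anticipation and that anticipation periods count as post-treatment.

```python
def reference_period(anticipation=0):
    """Event time omitted from the event study as the baseline."""
    return -1 - anticipation


def is_post(event_time, anticipation=0):
    """True if the event time counts as post-treatment.

    With anticipation > 0, the periods -anticipation..-1 are treated
    as post-treatment because units may already respond in them.
    """
    return event_time >= -anticipation


# With two anticipation periods, e = -3 is the reference and
# e = -2, -1, 0, 1, ... all count as post-treatment.
ref = reference_period(anticipation=2)
post_flags = {e: is_post(e, anticipation=2) for e in range(-4, 2)}
```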
/ai-review
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall assessment: ✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
If you want, I can also add the population-weighting unit test and update the event-window messaging.
Expand the minimal test dataset from 4 to 6 observations (3 units × 2 periods) to avoid a saturated model (n=k=4), which causes division by zero in the cluster-robust VCV adjustment. The Rust backend handles this case gracefully, but the pure Python (n-1)/(n-k) term requires n > k.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
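The failure mode described above comes down to a single ratio. The sketch below shows only the (n-1)/(n-k) term named in the commit message, with an explicit guard; the function name `small_sample_factor` is hypothetical, and the library's actual adjustment may include further terms, such as a cluster-count factor.

```python
def small_sample_factor(n, k):
    """Return the (n - 1) / (n - k) degrees-of-freedom term.

    n: number of observations, k: number of estimated parameters.
    A saturated model (n == k) would divide by zero, so it is
    rejected up front with a clear message.
    """
    if n <= k:
        raise ValueError(
            f"need n > k for the small-sample adjustment, got n={n}, k={k}"
        )
    return (n - 1) / (n - k)


# Six observations and four parameters, as in the expanded test dataset.
factor = small_sample_factor(6, 4)
```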
Summary
- StackedDiD class with IC1/IC2 trimming, Q-weight computation (3 schemes), WLS event study regression, delta-method SE
- StackedDiDResults dataclass with summary(), to_dataframe(), event study and group effects

Methodology references (required if estimator / math changes)
- create_sub_exp() + compute_weights())
- [a - kappa_pre, a + kappa_post] rather than paper text [a - kappa_pre - 1, a + kappa_post] (paper vs R discrepancy documented in wing-2024-review.md)

Validation
- tests/test_stacked_did.py (72 tests, 11 classes)
- benchmarks/R/benchmark_stacked_did.R, benchmarks/python/benchmark_stacked_did.py

Security / privacy
Generated with Claude Code
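The event-window discrepancy noted under the methodology references can be made concrete with a short sketch. The function names `event_window` and `window_fits_panel` and the `paper_text` flag are hypothetical. The second helper is only an assumed IC1-style check that the window fits inside the observed calendar range; it is not the PR's trimming code.

```python
def event_window(a, kappa_pre, kappa_post, paper_text=False):
    """Calendar periods included in the sub-experiment for adoption time a.

    Default follows the R-style window [a - kappa_pre, a + kappa_post];
    paper_text=True uses [a - kappa_pre - 1, a + kappa_post] instead.
    """
    start = a - kappa_pre - (1 if paper_text else 0)
    return list(range(start, a + kappa_post + 1))


def window_fits_panel(a, kappa_pre, kappa_post, t_min, t_max):
    """Assumed IC1-style check: the whole window lies within [t_min, t_max]."""
    return a - kappa_pre >= t_min and a + kappa_post <= t_max


r_window = event_window(2005, kappa_pre=2, kappa_post=3)
paper_window = event_window(2005, kappa_pre=2, kappa_post=3, paper_text=True)
```

The two conventions differ by exactly one leading pre-period, so the choice changes which sub-experiments survive trimming near the start of the panel.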