|
| 1 | +--- |
| 2 | +description: Development checklists for code changes (params, methodology, warnings, reviews, bugs) |
| 3 | +argument-hint: "[checklist-name]" |
| 4 | +--- |
| 5 | + |
| 6 | +# Development Checklists |
| 7 | + |
| 8 | +## Adding a New Parameter to Estimators |
| 9 | + |
| 10 | +When adding a new `__init__` parameter that should be available across estimators: |
| 11 | + |
| 12 | +1. **Implementation** (for each affected estimator): |
| 13 | + - [ ] Add to `__init__` signature with default value |
| 14 | + - [ ] Store as `self.param_name` |
| 15 | + - [ ] Add to `get_params()` return dict |
| 16 | + - [ ] Handle in `set_params()` (usually automatic via `hasattr`) |
| 17 | + |
| 18 | +2. **Consistency** - apply to all applicable estimators per the **Estimator inheritance** map in CLAUDE.md |
| 19 | + |
| 20 | +3. **Testing**: |
| 21 | + - [ ] Test `get_params()` includes new param |
| 22 | + - [ ] Test parameter affects estimator behavior |
| 23 | + - [ ] Test with non-default value |
| 24 | + |
| 25 | +4. **Downstream tracing**: |
| 26 | + - [ ] Before implementing: `grep -rn "self\.<param>" diff_diff/<module>.py` to find ALL downstream paths |
| 27 | + - [ ] Parameter handled in ALL aggregation methods (simple, event_study, group) |
| 28 | + - [ ] Parameter handled in bootstrap inference paths |
| 29 | + |
| 30 | +5. **Documentation**: |
| 31 | + - [ ] Update docstring in all affected classes |
| 32 | + - [ ] Update CLAUDE.md if it's a key design pattern |
| 33 | + |
| 34 | +## Implementing Methodology-Critical Code |
| 35 | + |
| 36 | +When implementing or modifying code that affects statistical methodology (estimators, SE calculation, inference, edge case handling): |
| 37 | + |
| 38 | +1. **Before coding - consult the Methodology Registry**: |
| 39 | + - [ ] Read the relevant estimator section in `docs/methodology/REGISTRY.md` |
| 40 | + - [ ] Identify the reference implementation(s) listed |
| 41 | + - [ ] Note the edge case handling requirements |
| 42 | + |
| 43 | +2. **During implementation**: |
| 44 | + - [ ] Follow the documented equations and formulas |
| 45 | + - [ ] Match reference implementation behavior for standard cases |
| 46 | + - [ ] For edge cases: either match reference OR document deviation |
| 47 | + |
| 48 | +3. **When deviating from reference implementations**: |
| 49 | + - [ ] Add a **Note** in the Methodology Registry explaining the deviation |
| 50 | + - [ ] Include rationale (e.g., "defensive enhancement", "R errors here") |
| 51 | + - [ ] Ensure the deviation is an improvement, not a bug |
| 52 | + |
| 53 | +4. **Testing methodology-aligned behavior**: |
| 54 | + - [ ] Test that edge cases produce documented behavior (NaN, warning, etc.) |
| 55 | + - [ ] Assert warnings are raised (not just captured) |
| 56 | + - [ ] Assert the warned-about behavior actually occurred |
| 57 | + - [ ] For NaN results: assert `np.isnan()`, don't just check "no exception" |
| 58 | + - [ ] All inference fields computed via `safe_inference()` (not inline) |
| 59 | + |
| 60 | +## Adding Warning/Error/Fallback Handling |
| 61 | + |
| 62 | +When adding code that emits warnings or handles errors: |
| 63 | + |
| 64 | +1. **Consult Methodology Registry first**: |
| 65 | + - [ ] Check if behavior is documented in edge cases section |
| 66 | + - [ ] If not documented, add it before implementing |
| 67 | + |
| 68 | +2. **Verify behavior matches message**: |
| 69 | + - [ ] Manually trace the code path after warning/error |
| 70 | + - [ ] Confirm the stated behavior actually occurs |
| 71 | + |
| 72 | +3. **Write behavioral tests**: |
| 73 | + - [ ] Don't just test "no exception raised" |
| 74 | + - [ ] Assert the expected outcome occurred |
| 75 | + - [ ] For fallbacks: verify fallback behavior was applied |
| 76 | + - [ ] Example: If warning says "setting NaN", assert `np.any(np.isnan(result))` |
| 77 | + |
| 78 | +4. **Protect arithmetic operations**: |
| 79 | + - [ ] Wrap ALL related operations in `np.errstate()`, not just the final one |
| 80 | + - [ ] Include division, matrix multiplication, and any operation that can overflow/underflow |
| 81 | + |
| 82 | +## Reviewing New Features or Code Paths |
| 83 | + |
| 84 | +When reviewing PRs that add new features, modes, or code paths (learned from PR #97 analysis): |
| 85 | + |
| 86 | +1. **Edge Case Coverage**: |
| 87 | + - [ ] Empty result sets (no matching data for a filter condition) |
| 88 | + - [ ] NaN/Inf propagation through ALL inference fields (SE, t-stat, p-value, CI) |
| 89 | + - [ ] Parameter interactions (e.g., new param x existing aggregation methods) |
| 90 | + - [ ] Control/comparison group composition for all code paths |
| 91 | + |
| 92 | +2. **Documentation Completeness**: |
| 93 | + - [ ] All new parameters have docstrings with type, default, and description |
| 94 | + - [ ] Methodology docs match implementation behavior (equations, edge cases) |
| 95 | + - [ ] Edge cases documented in `docs/methodology/REGISTRY.md` |
| 96 | + |
| 97 | +3. **Logic Audit for New Code Paths**: |
| 98 | + - [ ] When adding new modes (like `base_period="varying"`), trace ALL downstream effects |
| 99 | + - [ ] Check aggregation methods handle the new mode correctly |
| 100 | + - [ ] Check bootstrap/inference methods handle the new mode correctly |
| 101 | + - [ ] Explicitly test control group composition in new code paths |
| 102 | + |
| 103 | +4. **Pattern Consistency**: |
| 104 | + - [ ] Search for similar patterns in codebase (e.g., `t_stat = x / se if se > 0 else ...`) |
| 105 | + - [ ] Ensure new code follows established patterns or updates ALL instances |
| 106 | + - [ ] If fixing a pattern, grep for ALL occurrences first: |
| 107 | + ```bash |
| 108 | + grep -n "if.*se.*> 0.*else" diff_diff/*.py |
| 109 | + ``` |
| 110 | + |
| 111 | +## Fixing Bugs Across Multiple Locations |
| 112 | + |
| 113 | +When a bug fix involves a pattern that appears in multiple places (learned from PR #97 analysis): |
| 114 | + |
| 115 | +1. **Find All Instances First**: |
| 116 | + - [ ] Use grep/search to find ALL occurrences of the pattern before fixing |
| 117 | + - [ ] Document the locations found (file:line) |
| 118 | + - [ ] Example: `t_stat = effect / se if se > 0 else 0.0` appeared in 7 locations |
| 119 | +
|
| 120 | +2. **Fix Comprehensively in One Round**: |
| 121 | + - [ ] Fix ALL instances in the same PR/commit |
| 122 | + - [ ] Create a test that covers all locations |
| 123 | + - [ ] Don't fix incrementally across multiple review rounds |
| 124 | +
|
| 125 | +3. **Regression Test the Fix**: |
| 126 | + - [ ] Verify fix doesn't break other code paths |
| 127 | + - [ ] For early-return fixes: ensure downstream code still runs when needed |
| 128 | + - [ ] Example: Bootstrap early return must still compute per-effect SEs |
| 129 | +
|
| 130 | +4. **Common Patterns to Watch For**: |
| 131 | + - `if se > 0 else 0.0` -> should be `else np.nan` for undefined statistics |
| 132 | + - `if len(data) > 0 else return` -> check what downstream code expects |
| 133 | + - `mask = (condition)` -> verify mask logic for all parameter combinations |
| 134 | +
|
| 135 | +## Pre-Merge Review Checklist |
| 136 | +
|
| 137 | +Final checklist before approving a PR: |
| 138 | +
|
| 139 | +1. **Behavioral Completeness**: |
| 140 | + - [ ] Happy path tested |
| 141 | + - [ ] Edge cases tested (empty data, NaN inputs, boundary conditions) |
| 142 | + - [ ] Error/warning paths tested with behavioral assertions |
| 143 | +
|
| 144 | +2. **Inference Field Consistency**: |
| 145 | + - [ ] If one inference field (SE, t-stat, p-value) can be NaN, all related fields handle it |
| 146 | + - [ ] Aggregation methods propagate NaN correctly |
| 147 | + - [ ] Bootstrap methods handle NaN in base estimates |
| 148 | +
|
| 149 | +3. **Documentation Sync**: |
| 150 | + - [ ] Docstrings updated for all changed signatures |
| 151 | + - [ ] README updated if user-facing behavior changes |
| 152 | + - [ ] REGISTRY.md updated if methodology edge cases change |
| 153 | +
|
| 154 | +## Quick Reference: Common Patterns to Check |
| 155 | +
|
| 156 | +Before submitting methodology changes, verify these patterns: |
| 157 | +
|
| 158 | +```bash |
| 159 | +# Find potential NaN handling issues (should use np.nan, not 0.0) |
| 160 | +grep -n "if.*se.*>.*0.*else 0" diff_diff/*.py |
| 161 | +
|
| 162 | +# Find all t_stat calculations to ensure consistency |
| 163 | +grep -n "t_stat.*=" diff_diff/*.py |
| 164 | +
|
| 165 | +# Find all inference field assignments |
| 166 | +grep -n "\(se\|t_stat\|p_value\|ci_lower\|ci_upper\).*=" diff_diff/*.py | head -30 |
| 167 | +``` |
0 commit comments