Skip to content

Commit 4ea2f25

Browse files
authored
Merge pull request #175 from igerber/claude-md-cleanup
Trim CLAUDE.md from 792 to 160 lines
2 parents 330ac65 + 5b669e7 commit 4ea2f25

File tree

6 files changed

+310
-697
lines changed

6 files changed

+310
-697
lines changed

.claude/commands/dev-checklists.md

Lines changed: 167 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,167 @@
1+
---
2+
description: Development checklists for code changes (params, methodology, warnings, reviews, bugs)
3+
argument-hint: "[checklist-name]"
4+
---
5+
6+
# Development Checklists
7+
8+
## Adding a New Parameter to Estimators
9+
10+
When adding a new `__init__` parameter that should be available across estimators:
11+
12+
1. **Implementation** (for each affected estimator):
13+
- [ ] Add to `__init__` signature with default value
14+
- [ ] Store as `self.param_name`
15+
- [ ] Add to `get_params()` return dict
16+
- [ ] Handle in `set_params()` (usually automatic via `hasattr`)
17+
18+
2. **Consistency** - apply to all applicable estimators per the **Estimator inheritance** map in CLAUDE.md
19+
20+
3. **Testing**:
21+
- [ ] Test `get_params()` includes new param
22+
- [ ] Test parameter affects estimator behavior
23+
- [ ] Test with non-default value
24+
25+
4. **Downstream tracing**:
26+
- [ ] Before implementing: `grep -rn "self\.<param>" diff_diff/<module>.py` to find ALL downstream paths
27+
- [ ] Parameter handled in ALL aggregation methods (simple, event_study, group)
28+
- [ ] Parameter handled in bootstrap inference paths
29+
30+
5. **Documentation**:
31+
- [ ] Update docstring in all affected classes
32+
- [ ] Update CLAUDE.md if it's a key design pattern
33+
34+
## Implementing Methodology-Critical Code
35+
36+
When implementing or modifying code that affects statistical methodology (estimators, SE calculation, inference, edge case handling):
37+
38+
1. **Before coding - consult the Methodology Registry**:
39+
- [ ] Read the relevant estimator section in `docs/methodology/REGISTRY.md`
40+
- [ ] Identify the reference implementation(s) listed
41+
- [ ] Note the edge case handling requirements
42+
43+
2. **During implementation**:
44+
- [ ] Follow the documented equations and formulas
45+
- [ ] Match reference implementation behavior for standard cases
46+
- [ ] For edge cases: either match reference OR document deviation
47+
48+
3. **When deviating from reference implementations**:
49+
- [ ] Add a **Note** in the Methodology Registry explaining the deviation
50+
- [ ] Include rationale (e.g., "defensive enhancement", "R errors here")
51+
- [ ] Ensure the deviation is an improvement, not a bug
52+
53+
4. **Testing methodology-aligned behavior**:
54+
- [ ] Test that edge cases produce documented behavior (NaN, warning, etc.)
55+
- [ ] Assert warnings are raised (not just captured)
56+
- [ ] Assert the warned-about behavior actually occurred
57+
- [ ] For NaN results: assert `np.isnan()`, don't just check "no exception"
58+
- [ ] All inference fields computed via `safe_inference()` (not inline)
59+
60+
## Adding Warning/Error/Fallback Handling
61+
62+
When adding code that emits warnings or handles errors:
63+
64+
1. **Consult Methodology Registry first**:
65+
- [ ] Check if behavior is documented in edge cases section
66+
- [ ] If not documented, add it before implementing
67+
68+
2. **Verify behavior matches message**:
69+
- [ ] Manually trace the code path after warning/error
70+
- [ ] Confirm the stated behavior actually occurs
71+
72+
3. **Write behavioral tests**:
73+
- [ ] Don't just test "no exception raised"
74+
- [ ] Assert the expected outcome occurred
75+
- [ ] For fallbacks: verify fallback behavior was applied
76+
- [ ] Example: If warning says "setting NaN", assert `np.any(np.isnan(result))`
77+
78+
4. **Protect arithmetic operations**:
79+
- [ ] Wrap ALL related operations in `np.errstate()`, not just the final one
80+
- [ ] Include division, matrix multiplication, and any operation that can overflow/underflow
81+
82+
## Reviewing New Features or Code Paths
83+
84+
When reviewing PRs that add new features, modes, or code paths (learned from PR #97 analysis):
85+
86+
1. **Edge Case Coverage**:
87+
- [ ] Empty result sets (no matching data for a filter condition)
88+
- [ ] NaN/Inf propagation through ALL inference fields (SE, t-stat, p-value, CI)
89+
- [ ] Parameter interactions (e.g., new param x existing aggregation methods)
90+
- [ ] Control/comparison group composition for all code paths
91+
92+
2. **Documentation Completeness**:
93+
- [ ] All new parameters have docstrings with type, default, and description
94+
- [ ] Methodology docs match implementation behavior (equations, edge cases)
95+
- [ ] Edge cases documented in `docs/methodology/REGISTRY.md`
96+
97+
3. **Logic Audit for New Code Paths**:
98+
- [ ] When adding new modes (like `base_period="varying"`), trace ALL downstream effects
99+
- [ ] Check aggregation methods handle the new mode correctly
100+
- [ ] Check bootstrap/inference methods handle the new mode correctly
101+
- [ ] Explicitly test control group composition in new code paths
102+
103+
4. **Pattern Consistency**:
104+
- [ ] Search for similar patterns in codebase (e.g., `t_stat = x / se if se > 0 else ...`)
105+
- [ ] Ensure new code follows established patterns or updates ALL instances
106+
- [ ] If fixing a pattern, grep for ALL occurrences first:
107+
```bash
108+
grep -n "if.*se.*> 0.*else" diff_diff/*.py
109+
```
110+
111+
## Fixing Bugs Across Multiple Locations
112+
113+
When a bug fix involves a pattern that appears in multiple places (learned from PR #97 analysis):
114+
115+
1. **Find All Instances First**:
116+
- [ ] Use grep/search to find ALL occurrences of the pattern before fixing
117+
- [ ] Document the locations found (file:line)
118+
- [ ] Example: `t_stat = effect / se if se > 0 else 0.0` appeared in 7 locations
119+
120+
2. **Fix Comprehensively in One Round**:
121+
- [ ] Fix ALL instances in the same PR/commit
122+
- [ ] Create a test that covers all locations
123+
- [ ] Don't fix incrementally across multiple review rounds
124+
125+
3. **Regression Test the Fix**:
126+
- [ ] Verify fix doesn't break other code paths
127+
- [ ] For early-return fixes: ensure downstream code still runs when needed
128+
- [ ] Example: Bootstrap early return must still compute per-effect SEs
129+
130+
4. **Common Patterns to Watch For**:
131+
- `if se > 0 else 0.0` -> should be `else np.nan` for undefined statistics
132+
- `if len(data) > 0 else return` -> check what downstream code expects
133+
- `mask = (condition)` -> verify mask logic for all parameter combinations
134+
135+
## Pre-Merge Review Checklist
136+
137+
Final checklist before approving a PR:
138+
139+
1. **Behavioral Completeness**:
140+
- [ ] Happy path tested
141+
- [ ] Edge cases tested (empty data, NaN inputs, boundary conditions)
142+
- [ ] Error/warning paths tested with behavioral assertions
143+
144+
2. **Inference Field Consistency**:
145+
- [ ] If one inference field (SE, t-stat, p-value) can be NaN, all related fields handle it
146+
- [ ] Aggregation methods propagate NaN correctly
147+
- [ ] Bootstrap methods handle NaN in base estimates
148+
149+
3. **Documentation Sync**:
150+
- [ ] Docstrings updated for all changed signatures
151+
- [ ] README updated if user-facing behavior changes
152+
- [ ] REGISTRY.md updated if methodology edge cases change
153+
154+
## Quick Reference: Common Patterns to Check
155+
156+
Before submitting methodology changes, verify these patterns:
157+
158+
```bash
159+
# Find potential NaN handling issues (should use np.nan, not 0.0)
160+
grep -n "if.*se.*>.*0.*else 0" diff_diff/*.py
161+
162+
# Find all t_stat calculations to ensure consistency
163+
grep -n "t_stat.*=" diff_diff/*.py
164+
165+
# Find all inference field assignments
166+
grep -n "\(se\|t_stat\|p_value\|ci_lower\|ci_upper\).*=" diff_diff/*.py | head -30
167+
```

.claude/commands/docs-check.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -200,6 +200,6 @@ Summary: 15/18 checks passed, 3 issues found
200200
## Notes
201201

202202
- This check is especially important after adding new estimators
203-
- The CLAUDE.md file documents what documentation is required for new features
203+
- The CONTRIBUTING.md file documents what documentation is required for new features
204204
- Missing references should cite the original methodology paper, not textbooks
205205
- When adding new estimators, update this skill's tables accordingly

.claude/commands/review-plan.md

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -76,11 +76,12 @@ If no prior review is available from either source (conversation context or revi
7676
### Step 2: Read CLAUDE.md for Project Context
7777

7878
Read the project's `CLAUDE.md` file to understand:
79-
- Module structure and architecture
79+
- Key design patterns (sklearn-like API, formula interface, results objects, etc.)
8080
- Estimator inheritance map
81-
- Development checklists (adding parameters, methodology-critical code, etc.)
82-
- Test structure and conventions
83-
- Design patterns (sklearn-like API, formula interface, results objects, etc.)
81+
- Testing conventions
82+
- Key reference file pointers (methodology registry, development checklists, etc.)
83+
84+
Also read `.claude/commands/dev-checklists.md` for development checklists.
8485

8586
If the plan modifies estimator math, standard error formulas, inference logic, or edge-case handling, also read `docs/methodology/REGISTRY.md` to understand the academic foundations and reference implementations for the affected estimator(s).
8687

@@ -224,7 +225,7 @@ Check for **missing related changes**:
224225
- Tests for new/changed functionality
225226
- `__init__.py` export updates
226227
- `get_params()` / `set_params()` updates for new parameters
227-
- Documentation updates (README, RST, tutorials, CLAUDE.md)
228+
- Documentation updates (README, RST, tutorials, CONTRIBUTING.md, CLAUDE.md if design patterns change)
228229
- For bug fixes: did the plan grep for ALL occurrences of the pattern, or just the one reported?
229230

230231
Check for **unnecessary additions**:
@@ -354,9 +355,9 @@ Present the review in the following format. Number each issue sequentially withi
354355
355356
## Checklist Gaps
356357
357-
Cross-reference against the relevant CLAUDE.md checklists. List which checklist items are not addressed by the plan.
358+
Cross-reference against the relevant development checklists in `.claude/commands/dev-checklists.md`. List which checklist items are not addressed by the plan.
358359
359-
[Identify which CLAUDE.md checklist applies (e.g., "Adding a New Parameter to Estimators", "Implementing Methodology-Critical Code", "Fixing Bugs Across Multiple Locations") and list any items from that checklist that the plan doesn't cover.]
360+
[Identify which checklist applies (e.g., "Adding a New Parameter to Estimators", "Implementing Methodology-Critical Code", "Fixing Bugs Across Multiple Locations") and list any items from that checklist that the plan doesn't cover.]
360361
361362
**Registry Alignment** (if methodology files changed):
362363
- [ ] Plan equations match REGISTRY.md (or deviations documented)

.claude/hooks/check-plan-review.sh

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@
1818
# - A stale sentinel can approve an unreviewed newer plan. This occurs only when:
1919
# (a) the sentinel points to an older plan with a valid review, AND
2020
# (b) a newer plan was created without updating the sentinel.
21-
# CLAUDE.md instructs sentinel updates on new plan creation (line 647), making (b)
21+
# CLAUDE.md instructs sentinel updates on new plan creation (Plan Review section), making (b)
2222
# unlikely in practice. A runtime guard was tested and removed (cb8d5e3) because it
2323
# caused false denials in multi-plan workflows. Automated tests in
2424
# test-check-plan-review.sh verify existing behavior; full mitigation would require
@@ -102,7 +102,7 @@ if [ -f "$REVIEW_FILE" ]; then
102102
fi
103103
exit 0 # Review exists and is fresh, allow
104104
else
105-
deny "No plan review found. Expected: $REVIEW_FILE. Follow the Plan Review Before Approval instructions in CLAUDE.md."
105+
deny "No plan review found. Expected: $REVIEW_FILE. Run the plan review workflow: offer review via AskUserQuestion, spawn /review-plan if requested, or write a Skipped marker. See CLAUDE.md Plan Review section."
106106
fi
107107

108108
# Manual verification checklist:

0 commit comments

Comments
 (0)