Make country package purely deterministic - read stochastic variables from dataset #6635
… from dataset

This change removes all random number generation from policyengine-us. All stochastic take-up variables are now generated in policyengine-us-data and read from the dataset. The country package is now a purely deterministic rules engine.

## Key Changes

### Removed
- All take-up seed variables (snap_take_up_seed, aca_take_up_seed, medicaid_take_up_seed)
- All take-up rate parameters (moved to policyengine-us-data)

### Simplified
All takes_up_* variables now use dataset values with deterministic fallbacks:
- takes_up_snap_if_eligible (default: True)
- takes_up_aca_if_eligible (default: True)
- takes_up_medicaid_if_eligible (default: True)

## Trade-offs
**IMPORTANT**: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.

To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values in policyengine-us-data.

Related: policyengine-us-data PR (must be merged FIRST)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
I think the Mass branch got merged into this PR @MaxGhenis
- Create takes_up_head_start_if_eligible and takes_up_early_head_start_if_eligible
- Update head_start and early_head_start to use takeup in microsimulation
- Add unit=USD and simplify labels to match conventions
- Takeup is generated stochastically in dataset, defaults to True in policy calculator
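As a rough illustration of the second bullet, here is a minimal sketch of gating a benefit on both eligibility and take-up. The array names and the dollar amount are hypothetical and the real head_start formula may be structured differently; the point is only that a take-up flag of True everywhere (the policy calculator default) reduces to the pure eligibility rule.

```python
import numpy as np

# Hypothetical per-person inputs; none of these values come from the PR.
eligible = np.array([True, True, False])
takes_up = np.array([True, False, True])  # read from the dataset in microsimulation
amount = np.array([10_000.0, 10_000.0, 10_000.0])  # illustrative USD amount

# Benefit is paid only to people who are both eligible and take up.
head_start = np.where(eligible & takes_up, amount, 0.0)

# With the default (everyone takes up), only eligibility matters.
full_takeup = np.ones_like(eligible)
head_start_default = np.where(eligible & full_takeup, amount, 0.0)

print(head_start.tolist())
print(head_start_default.tolist())
```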
Codecov Report

❌ Patch coverage is …

Coverage Diff

| Metric | master | #6635 | +/- |
|---|---|---|---|
| Coverage | ? | 71.42% | |
| Files | ? | 7 | |
| Lines | ? | 84 | |
| Branches | ? | 2 | |
| Hits | ? | 60 | |
| Misses | ? | 24 | |
| Partials | ? | 0 | |
Changed np.any(programs) to programs > 0 to preserve array structure.

The np.any() call was collapsing the entire array into a single boolean, causing all people to be categorically eligible if ANY tax unit qualified. This manifested when using axes: eligibility showed True at all income levels even when income_eligible was correctly False at high incomes.

Fixes the issue where Early Head Start benefits were incorrectly given to high-income households (e.g., $200k) in vectorized calculations.
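The difference between the two expressions can be reproduced with plain NumPy. This is a minimal sketch with made-up values; `programs` here stands in for a per-person count of qualifying programs, which is an assumption about the variable's shape rather than the actual formula code.

```python
import numpy as np

# Hypothetical per-person count of qualifying programs.
programs = np.array([2, 0, 1, 0])

# Buggy: np.any reduces the whole array to a single boolean,
# which then applies to every person once it is broadcast back.
collapsed = np.any(programs)
buggy = np.full(programs.shape, collapsed)

# Fixed: the elementwise comparison keeps one boolean per person.
fixed = programs > 0

print(buggy.tolist())  # [True, True, True, True]
print(fixed.tolist())  # [True, False, True, False]
```

With axes, each array element is a different simulated income level, so the collapsed version marks every income level as eligible as soon as one of them qualifies.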
The vectorization fix is now in its own PR (PolicyEngine#6804) to keep the takeup migration PR focused on moving randomness to the data package.
These tests covered the old formula-based takeup using seed variables. In the new design, takeup is generated in the dataset (policyengine-us-data) and the variables have no formula (just default_value = True).

Removed:
- takes_up_snap_if_eligible.yaml
- takes_up_medicaid_if_eligible.yaml
- takes_up_aca_if_eligible.yaml

The stochastic behavior is now tested in the data package, not the rules engine.
Summary
This PR removes all random number generation from policyengine-us. All stochastic take-up variables are now generated in policyengine-us-data and read from the dataset. The country package is now a purely deterministic rules engine.
Changes
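Removed
- All take-up seed variables (snap_take_up_seed, aca_take_up_seed, medicaid_take_up_seed)
- All take-up rate parameters (moved to policyengine-us-data)

Simplified
All takes_up_* variables now use dataset values with deterministic fallbacks:
- takes_up_snap_if_eligible (default: True)
- takes_up_aca_if_eligible (default: True)
- takes_up_medicaid_if_eligible (default: True)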
These variables have no formula: when present in the dataset, OpenFisca uses the dataset value. In the policy calculator (non-microsimulation), they default to True (full take-up assumption).
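The lookup behavior described above can be pictured with a small stand-alone sketch. This is not OpenFisca's API: `resolve_takeup` and the dict-based `microdata` are hypothetical stand-ins used only to show the "dataset value if present, otherwise default" rule.

```python
import numpy as np

def resolve_takeup(dataset, name, n_units, default=True):
    """Return the dataset's values for `name`, or the default for every unit.

    Hypothetical helper: it mimics a formula-less variable with a default value.
    """
    if name in dataset:
        return np.asarray(dataset[name], dtype=bool)
    return np.full(n_units, default, dtype=bool)

# Microsimulation: the dataset carries pre-generated take-up flags.
microdata = {"takes_up_snap_if_eligible": [True, False, True]}
from_data = resolve_takeup(microdata, "takes_up_snap_if_eligible", 3)

# Policy calculator: no dataset value, so everyone is assumed to take up.
from_default = resolve_takeup({}, "takes_up_snap_if_eligible", 3)

print(from_data.tolist())     # [True, False, True]
print(from_default.tolist())  # [True, True, True]
```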
Trade-offs
IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.
To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values in policyengine-us-data.
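For readers unfamiliar with how such flags are typically produced, here is a minimal sketch of a seeded take-up draw of the kind the data package would need to rerun. The function name, seed, and rates are all hypothetical; the actual generation code in policyengine-us-data may differ.

```python
import numpy as np

def draw_takeup(n_units, rate, seed):
    """Draw one boolean take-up flag per unit; True with probability `rate`.

    Hypothetical helper, not the policyengine-us-data implementation.
    """
    rng = np.random.default_rng(seed)
    # rng.random returns floats in [0, 1), so rate=1.0 yields all True.
    return rng.random(n_units) < rate

# Illustrative rates only; real values live in policyengine-us-data.
baseline = draw_takeup(10_000, rate=0.8, seed=0)
higher = draw_takeup(10_000, rate=0.9, seed=0)

# Reusing the seed keeps the underlying draws fixed, so raising the rate
# only adds participants and never removes existing ones.
print(baseline.mean(), higher.mean())
```

Because the flags are stored in the microdata, a change like the one from `baseline` to `higher` requires rebuilding the dataset rather than applying a reform.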
Test Plan
Related PRs
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com