@MaxGhenis
Contributor

@MaxGhenis MaxGhenis commented Oct 5, 2025

Summary

This PR removes all random number generation from policyengine-us. All stochastic take-up variables are now generated in policyengine-us-data and read from the dataset. The country package is now a purely deterministic rules engine.

⚠️ MERGE ORDER: The companion PolicyEngine/policyengine-us-data#442 must be merged FIRST, then this PR

Changes

Removed

  • All take-up seed variables (snap_take_up_seed, aca_take_up_seed, medicaid_take_up_seed)
  • All take-up rate parameters (moved to policyengine-us-data)

Simplified

All takes_up_* variables now use dataset values with deterministic fallbacks:

  • takes_up_snap_if_eligible (default: True)
  • takes_up_aca_if_eligible (default: True)
  • takes_up_medicaid_if_eligible (default: True)

These variables have no formula: when a value is present in the dataset, OpenFisca uses it. In the policy calculator (non-microsimulation use), they default to True, which assumes full take-up.
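
As a rough illustration of this fallback, the plain-Python sketch below mimics the lookup: use the stored values when the variable is present in the data, otherwise return True for every record. The helper name `resolve_take_up` and the dict standing in for the dataset are assumptions made for illustration; they are not part of the OpenFisca or policyengine-us API.

```python
def resolve_take_up(dataset, name, n_records, default=True):
    """Return per-record take-up flags for `name`.

    Hypothetical helper: `dataset` is a plain dict of variable name -> list,
    standing in for the microdata. It is not a real policyengine-us object.
    """
    if name in dataset:
        # Microsimulation case: the data package already drew the flags.
        return list(dataset[name])
    # Policy-calculator case: no stored values, so assume full take-up.
    return [default] * n_records


microdata = {"takes_up_snap_if_eligible": [True, False, True]}
from_data = resolve_take_up(microdata, "takes_up_snap_if_eligible", 3)
fallback = resolve_take_up({}, "takes_up_snap_if_eligible", 3)
```

Here `from_data` carries the stored flags unchanged, while `fallback` is True for all three records.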

Trade-offs

IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.

To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values in policyengine-us-data.
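
To make the regeneration step concrete, here is a minimal sketch of how seeded take-up flags could be drawn on the data side. The actual generation code in policyengine-us-data is not shown in this PR, so the function `draw_take_up`, the rates 0.8 and 0.9, and the seed are all hypothetical.

```python
import random


def draw_take_up(n_records, take_up_rate, seed):
    """Draw one reproducible take-up flag per record.

    Hypothetical sketch: the real generation code lives in
    policyengine-us-data and is not shown in this PR.
    """
    rng = random.Random(seed)  # dedicated generator, so draws are reproducible
    return [rng.random() < take_up_rate for _ in range(n_records)]


# Changing the assumed rate means redrawing the flags and rebuilding the dataset.
baseline_flags = draw_take_up(1000, take_up_rate=0.8, seed=0)
higher_flags = draw_take_up(1000, take_up_rate=0.9, seed=0)
```

Because the flags are stored in the dataset, the rules engine only ever reads them; a different rate requires a new set of draws rather than a reform parameter.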

Test Plan

  • Package imports successfully
  • All existing tests pass
  • Microsimulations produce correct results
  • Policy calculator (non-microsim) still works

Related PRs

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

… from dataset

@PavelMakarchuk
Collaborator

I think the Mass branch got merged into this PR @MaxGhenis

- Create takes_up_head_start_if_eligible and takes_up_early_head_start_if_eligible
- Update head_start and early_head_start to use takeup in microsimulation
- Add unit=USD and simplify labels to match conventions
- Takeup is generated stochastically in dataset, defaults to True in policy calculator
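
A hedged sketch of what gating a benefit on take-up can look like follows. The Head Start formulas themselves are not shown in this conversation, so the helper `apply_take_up` and the amounts are hypothetical and only illustrate the idea that a unit receives the benefit when it is both eligible and flagged as taking it up.

```python
def apply_take_up(eligible, takes_up, amount):
    """Zero out the benefit for units that are ineligible or do not take up."""
    return [
        amt if (is_eligible and does_take_up) else 0.0
        for is_eligible, does_take_up, amt in zip(eligible, takes_up, amount)
    ]


# Hypothetical inputs: three units with the same potential benefit.
eligible = [True, True, False]
takes_up = [True, False, True]
amount = [10_000.0, 10_000.0, 10_000.0]
received = apply_take_up(eligible, takes_up, amount)
```

With the default of True in the policy calculator, the take-up flag never reduces the amount, so only eligibility matters there.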
@codecov

codecov bot commented Nov 10, 2025

Codecov Report

❌ Patch coverage is 71.42857% with 10 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@61702e1). Learn more about missing BASE report.
⚠️ Report is 26 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...s/variables/gov/hhs/head_start/early_head_start.py | 0.00% | 5 Missing ⚠️ |
| ...gine_us/variables/gov/hhs/head_start/head_start.py | 0.00% | 5 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #6635   +/-   ##
=========================================
  Coverage          ?   71.42%           
=========================================
  Files             ?        7           
  Lines             ?       84           
  Branches          ?        2           
=========================================
  Hits              ?       60           
  Misses            ?       24           
  Partials          ?        0           

| Flag | Coverage Δ |
|---|---|
| unittests | 71.42% <71.42%> (?) |

Changed np.any(programs) to programs > 0 to preserve array structure.
The np.any() call was collapsing the entire array into a single boolean,
causing all people to be categorically eligible if ANY tax unit qualified.

This manifested when using axes - eligibility showed True at all income levels
even when income_eligible was correctly False at high incomes.

Fixes the issue where Early Head Start benefits were incorrectly given to
high-income households (e.g., $200k) in vectorized calculations.
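
The difference between the two expressions can be reproduced in a few lines of NumPy. The arrays below are hypothetical, and combining the result with `income_eligible` via `|` is an assumption about how the flags are used, not the actual formula.

```python
import numpy as np

# Hypothetical per-person amounts from qualifying programs (0 means none).
programs = np.array([0.0, 150.0, 0.0])
income_eligible = np.array([False, False, False])

# Before: np.any reduces the whole array to a single boolean,
# which then broadcasts to every person.
collapsed = np.any(programs)
eligible_before = income_eligible | collapsed

# After: the comparison keeps one boolean per person.
elementwise = programs > 0
eligible_after = income_eligible | elementwise
```

In this sketch `eligible_before` is True for all three people, while `eligible_after` is True only for the second person, whose program amount is positive.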

The vectorization fix is now in its own PR (PolicyEngine#6804) to keep the
take-up migration PR focused on moving randomness to the data package.

These tests covered the old formula-based take-up, which used seed variables.
In the new design, take-up is generated in the dataset (policyengine-us-data)
and the variables have no formula (just default_value = True).

Removed:
- takes_up_snap_if_eligible.yaml
- takes_up_medicaid_if_eligible.yaml
- takes_up_aca_if_eligible.yaml

The stochastic behavior is now tested in the data package, not the rules engine.