@baogorek commented on Oct 17, 2025

Summary

This PR adds 75-year projections of federal income tax revenue (2025-2100) by combining PolicyEngine's economic microsimulation with Social Security Administration (SSA) demographic forecasts. The projections quantify the fiscal impact of population aging while preserving the full complexity of the tax code.

Key Features

Two-Stage Projection Methodology

Stage 1: Economic Uprating

  • 17 distinct income categories, each uprated according to economic fundamentals
  • Complete tax code implementation with all brackets, credits, deductions, and interactions
  • Calibrated to match IRS Statistics of Income aggregates
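As a rough illustration of the uprating step, the sketch below scales each income category by its own growth factor before any tax calculation. The category names, growth factors, and the `uprate` helper are hypothetical and chosen only for the example; the PR does not show how the actual factors are derived or applied.

```python
import numpy as np

# Hypothetical growth factors from the base year to a target year.
# These numbers are made up for illustration.
growth_factors = {
    "employment_income": 1.80,
    "taxable_interest": 1.50,
    "social_security": 1.65,
}

# Toy microdata: one array of base-year amounts per income category.
base_incomes = {
    "employment_income": np.array([50_000.0, 0.0, 120_000.0]),
    "taxable_interest": np.array([200.0, 1_000.0, 0.0]),
    "social_security": np.array([0.0, 24_000.0, 0.0]),
}


def uprate(incomes, factors):
    """Scale every income category by its own growth factor."""
    return {name: values * factors[name] for name, values in incomes.items()}


uprated = uprate(base_incomes, growth_factors)
```

Keeping one factor per category lets, for example, wages and interest income grow at different rates, which matters once the uprated amounts flow through brackets and credits.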

Stage 2: Demographic Reweighting

  • Two calibration methods available:
    • IPF (Iterative Proportional Fitting): Traditional raking approach, which minimizes KL divergence from the base weights
    • GREG (Generalized Regression): Regression-based calibration that also supports continuous variables
  • Age-specific targets from SSA Trustees Report (single-year age groups from 0 to 85+)
  • Optional calibration to Social Security benefit projections (GREG only)
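The raking idea behind IPF can be sketched in a few lines. This is a minimal illustration, not the code in run_household_projection.py: the `ipf` function, the second "region" margin, and all numbers are assumptions made for the example (the PR itself only names age targets). Each sweep rescales the weights so one margin's weighted totals match its targets, then moves on to the next margin, repeating until all margins agree.

```python
import numpy as np


def ipf(base_weights, margins, max_iter=100, tol=1e-10):
    """Rake weights until every margin's weighted totals match its targets.

    margins is a list of (categories, targets) pairs: categories assigns each
    record an integer category, targets holds the desired total per category.
    Every category is assumed to contain at least one record.
    """
    w = np.asarray(base_weights, dtype=float).copy()
    for _ in range(max_iter):
        worst = 0.0
        for cats, targets in margins:
            current = np.bincount(cats, weights=w, minlength=len(targets))
            # Track the largest relative miss seen before this adjustment.
            worst = max(worst, np.max(np.abs(current / targets - 1.0)))
            # Scale each record by the ratio for its own category.
            w *= (targets / current)[cats]
        if worst < tol:
            break
    return w


# Toy data (all values made up): six records, three age groups, two regions.
age_group = np.array([0, 0, 1, 1, 2, 2])
region = np.array([0, 1, 0, 1, 0, 1])
base_weights = np.full(6, 100.0)
age_targets = np.array([180.0, 220.0, 300.0])
region_targets = np.array([330.0, 370.0])

weights = ipf(base_weights, [(age_group, age_targets), (region, region_targets)])
```

With a single categorical margin, such as age alone, the loop converges after one sweep, because raking then reduces to simple post-stratification.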

What's Included

New Scripts

  • run_household_projection.py: Main projection engine with IPF/GREG support
  • create_reweighting_matrix.py: Builds demographic transition matrices
  • age_projection.py: Processes SSA age-specific population forecasts
  • extract_ssa_costs.py: Extracts Social Security benefit targets
  • transition_matrix_demo.py: Validation and methodology demonstration

Data Files

  • SSPopJul_TR2024.csv: SSA Trustees Report demographic projections (2025-2100)
  • social_security_aux.csv: OASDI cost projections from SSA Table VI.G9

Documentation

  • Comprehensive README explaining methodology and theoretical foundation
  • Technical notes on avoiding multicollinearity in GREG calibration
  • Validation metrics showing IPF/GREG agreement within 0.2%
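For readers unfamiliar with GREG, here is a minimal sketch of linear calibration with a continuous target alongside age indicators. It is an illustration under stated assumptions, not the PR's implementation: the `greg_weights` function, the benefit column (in thousands of dollars), and all numbers are made up. It also shows one common source of multicollinearity: a full set of age indicators already sums to a constant, so adding an all-ones "total population" column would make the system singular.

```python
import numpy as np


def greg_weights(d, X, targets):
    """Linear GREG calibration: w = d * (1 + X @ lam) with X.T @ w == targets."""
    gap = targets - X.T @ d            # how far the base weights miss each target
    M = X.T @ (d[:, None] * X)         # weighted cross-product matrix X' D X
    lam = np.linalg.solve(M, gap)      # fails if the columns of X are collinear
    return d * (1.0 + X @ lam)


# Toy data (made up): three age-group indicators plus one continuous column.
age_group = np.array([0, 0, 1, 1, 2, 2])
benefits = np.array([0.0, 0.0, 0.0, 0.0, 12.0, 24.0])  # thousands of dollars
X = np.column_stack([np.eye(3)[age_group], benefits])
d = np.full(6, 100.0)
targets = np.array([180.0, 220.0, 300.0, 5700.0])

w = greg_weights(d, X, targets)

# Adding an intercept does not raise the rank: the indicators already span it.
rank_with_intercept = np.linalg.matrix_rank(np.column_stack([np.ones(6), X]))
```

Unlike raking, the linear form can produce negative weights when targets are far from the base totals, so a real pipeline would typically check for that or use bounds.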

Impact

This enables analysis of:

  • Fiscal sustainability: Decompose revenue changes into economic vs demographic components
  • Policy design: Evaluate reforms in context of future demographics
  • Distributional analysis: Track tax burden shifts between age cohorts
  • Scenario planning: Model alternative demographic and economic scenarios
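One way the economic versus demographic decomposition could be computed is sequentially: first apply uprating while holding weights fixed, then apply the new weights. The sketch below assumes that approach; the function name and every number are hypothetical, and the PR does not specify how its decomposition is done.

```python
import numpy as np


def decompose_revenue_change(base_weights, new_weights, base_tax, uprated_tax):
    """Split a revenue change into economic and demographic parts.

    Economic: uprated tax liabilities under the base-year weights.
    Demographic: the further change from switching to the reweighted sample.
    """
    revenue_base = base_weights @ base_tax
    revenue_econ_only = base_weights @ uprated_tax
    revenue_full = new_weights @ uprated_tax
    return {
        "economic": revenue_econ_only - revenue_base,
        "demographic": revenue_full - revenue_econ_only,
        "total": revenue_full - revenue_base,
    }


# Toy inputs (made up): three households with per-household tax liabilities.
base_weights = np.array([100.0, 100.0, 100.0])
new_weights = np.array([80.0, 100.0, 150.0])
base_tax = np.array([5_000.0, 0.0, 20_000.0])
uprated_tax = np.array([9_000.0, 0.0, 36_000.0])

parts = decompose_revenue_change(base_weights, new_weights, base_tax, uprated_tax)
```

A sequential split like this depends on the order of the two steps; applying the demographic change first would attribute the interaction term differently.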

Usage

# Traditional IPF approach (default)
python run_household_projection.py 2050

# GREG calibration with demographics only
python run_household_projection.py 2050 --greg

# GREG with demographics + Social Security benefits
python run_household_projection.py 2050 --greg --use-ss

Validation

  • Population totals exactly match SSA projections
  • Age distributions preserved through reweighting
  • Social Security benefits match SSA Trustees Report (when using --use-ss)
  • IPF and GREG produce equivalent results (within 0.2%) for identical constraints
  • Validated against R's survey package
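A check of the first two validation points might look like the sketch below. It is illustrative only: the helper names and toy data are assumptions, and `MAX_SINGLE_AGE` borrows the constant name mentioned in a later commit on this PR without claiming to match how the scripts use it. Ages at or above 85 are pooled into one group, giving 86 age columns.

```python
import numpy as np

MAX_SINGLE_AGE = 85  # ages at or above this value share the pooled "85+" group


def age_indicator_matrix(ages):
    """One column per single year of age 0..84 plus one pooled 85+ column."""
    groups = np.minimum(ages, MAX_SINGLE_AGE)
    X = np.zeros((len(ages), MAX_SINGLE_AGE + 1))
    X[np.arange(len(ages)), groups] = 1.0
    return X


def max_relative_error(weights, X, targets):
    """Largest relative miss across targets; zero-valued targets are skipped."""
    achieved = X.T @ weights
    mask = targets > 0
    return np.max(np.abs(achieved[mask] - targets[mask]) / targets[mask])


# Toy data (made up): five people, two of them in the 85+ group.
ages = np.array([0, 30, 30, 85, 97])
weights = np.array([10.0, 20.0, 30.0, 5.0, 15.0])
X = age_indicator_matrix(ages)

targets = np.zeros(MAX_SINGLE_AGE + 1)
targets[[0, 30, 85]] = [10.0, 50.0, 20.0]

error = max_relative_error(weights, X, targets)
```

Because each person falls in exactly one age column, matching every age target also matches the population total.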

Future Extensions Needed

  • Update population calibration from CBO (ending 2055) to SSA (through 2100)
  • Incorporate CBO Long-Term Budget Outlook inflation projections
  • Extend income category projections beyond current CBO horizon using consistent methodology

This provides a foundation for evidence-based fiscal policy analysis under long-run demographic change.

baogorek and others added 6 commits October 17, 2025 16:55
The script depends on social_security_aux.csv for benefit projections and SSPopJul_TR2024.csv for population demographics. Without these files, the script cannot run from a fresh clone of the repository.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@baogorek changed the title from "Scaffolding for 75-year forecasting" to "75-year Projections based on calibration to SSA Trustees data" on Nov 10, 2025
@baogorek marked this pull request as ready for review on November 10, 2025 21:55
@baogorek requested a review from MaxGhenis on November 10, 2025 21:55
baogorek and others added 8 commits November 11, 2025 09:35
- Move SSA data files to centralized storage directory
- Add comprehensive Jupyter notebook with methodology and analysis
- Update README to reference notebook for detailed documentation
- Add notebook to MyST documentation structure
- Ignore Jupyter checkpoint files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The myst build command creates output in docs/_build/site, but the
deployment was looking for docs/_build/html. This fixes the path in both
the workflow and Makefile targets.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This allows manually deploying documentation and running the full
test suite without waiting for a version update push to main.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Corrected imputed variable count from 72 to 67
- Corrected calibration target count from 7,000+ to 2,813
- Removed inaccurate "two-stage" terminology
- Added SSA data source documentation in storage README
- Renamed notebook to clarify PWBM comparison scope (2025-2100)
- Added taxable payroll calibration target

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…projections

- Add start_year parameter with default 2025 for flexible projection windows
- Replace hardcoded 85 with MAX_SINGLE_AGE constant for clarity
- Remove csv_path parameter to match codebase conventions (hardcoded instead)

Addresses PR #443 review comments

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@baogorek requested a review from MaxGhenis on November 20, 2025 00:02
@baogorek merged commit e213759 into main on Nov 20, 2025
6 checks passed