@MaxGhenis
Contributor

Summary

  • Households with has_fuel_consumption=0 (non-vehicle owners or EV owners) now have petrol_spending and diesel_spending set to zero after imputation
  • This prevents the QRF from assigning fuel spending based on other predictors to households that shouldn't have any fuel consumption
  • Should reduce the fuel duty over-estimation seen in calibration (was £49.5B vs £24.4B target)
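
A minimal sketch of the post-imputation step described above, on invented toy data. The `household` dict, the `zero_fuel_spending` helper, and all values are hypothetical stand-ins rather than the repository's actual code. The PR itself assigns in place through a boolean mask; `np.where` is used here only to keep the sketch free of side effects.

```python
import numpy as np

# Hypothetical stand-in for the imputed household table (values invented).
household = {
    "petrol_spending": np.array([820.0, 310.0, 0.0, 1450.0, 95.0]),
    "diesel_spending": np.array([0.0, 240.0, 60.0, 0.0, 410.0]),
}
# 1 = household consumes fuel, 0 = no vehicle or EV only (values invented).
has_fuel_consumption = np.array([1, 0, 0, 1, 0])

FUEL_COLUMNS = ("petrol_spending", "diesel_spending")


def zero_fuel_spending(household, has_fuel_consumption, columns=FUEL_COLUMNS):
    """Return a copy of `household` with fuel spending zeroed where the flag is 0."""
    no_fuel = np.asarray(has_fuel_consumption) == 0
    result = dict(household)
    for column in columns:
        # Keep the imputed value for fuel households, force 0 for the rest.
        result[column] = np.where(no_fuel, 0.0, household[column])
    return result


before = sum(household[c].sum() for c in FUEL_COLUMNS)
cleaned = zero_fuel_spending(household, has_fuel_consumption)
after = sum(cleaned[c].sum() for c in FUEL_COLUMNS)
print(before, after)
```

In this toy example the aggregate falls because three of the five households had been given non-zero spending despite a zero flag; the size of the effect on the real dataset depends on how much spending the QRF assigned to such households.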

Context

After PR #244 added has_fuel_consumption as a predictor, fuel duties were over-estimated by ~100%. This happened because:

  1. The QRF could still assign fuel spending to non-fuel households based on other predictors
  2. The calibration couldn't reduce it because doing so would conflict with other targets
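
For reference, the "~100%" figure follows directly from the two numbers quoted in the summary. A quick check, using only those values:

```python
estimate_bn = 49.5  # fuel duty estimate seen in calibration, £bn (from the summary)
target_bn = 24.4    # target, £bn (from the summary)

ratio = estimate_bn / target_bn
overshoot_pct = (estimate_bn - target_bn) / target_bn * 100
print(f"ratio = {ratio:.2f}, overshoot = {overshoot_pct:.0f}%")
```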

Test plan

  • CI passes
  • Fuel duty estimate should be closer to £24.4B OBR target

🤖 Generated with Claude Code

Households with has_fuel_consumption=0 (non-vehicle owners or EV owners)
now have petrol_spending and diesel_spending set to zero after imputation.
This prevents the QRF from assigning fuel spending based on other predictors
to households that shouldn't have any.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@nikhilwoodruff
Contributor

@PolicyEngine can you PTAL at this and see if you can get it to a point we can merge

@policyengine policyengine bot left a comment

This PR correctly addresses the fuel duty over-estimation issue by zeroing out fuel spending for non-fuel households after QRF imputation. The logic is sound and follows the existing pattern in the codebase.

What's good:

  • The fix is minimal and targeted - just 6 lines of code
  • The logic correctly uses the already-computed has_fuel_consumption variable
  • Documentation is updated to explain the new step
  • The approach makes sense: QRF is a continuous model, so post-hoc zeroing for categorical exclusions is reasonable

Minor consideration:

  • The code assumes has_fuel_consumption is still in scope at line 347. From the surrounding context, it is computed around lines 336-340, before imputation, so this should work.

The fix should significantly reduce the fuel duty over-estimation from £49.5B towards the £24.4B target, since ~30% of households (non-vehicle owners + EV owners) will no longer incorrectly contribute to fuel spending.

for column in output_df.columns:
    dataset.household[column] = output_df[column].values

# Zero out fuel spending for households without fuel consumption

Quick verification: has_fuel_consumption is computed earlier in this function (around lines 336-340) and is a NumPy array that remains in scope. The == 0 comparison is correct for boolean-like integers. This looks good.

dataset.household[column] = output_df[column].values

# Zero out fuel spending for households without fuel consumption
# This ensures only ICE vehicle owners contribute to fuel duty

This direct assignment to a subset of dataset.household[...] works because the household dict stores numpy arrays. The boolean mask indexing correctly updates only the non-fuel households in place.
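
To illustrate the point about in-place updates, here is a small sketch. It assumes, as the comment states, that the container holds NumPy arrays; the `household` dict and its values are hypothetical.

```python
import numpy as np

household = {"petrol_spending": np.array([500.0, 120.0, 900.0])}
has_fuel_consumption = np.array([1, 0, 1])
mask = has_fuel_consumption == 0

stored = household["petrol_spending"]    # the same object the dict holds
household["petrol_spending"][mask] = 0   # dict lookup, then masked write on that array
updated_in_place = household["petrol_spending"] is stored

# Contrast: boolean indexing used to *read* returns a copy,
# so writing to that copy leaves the stored array untouched.
subset = household["petrol_spending"][~mask]
subset[:] = -1

print(household["petrol_spending"], subset, updated_in_place)
```

The masked write works because the dictionary lookup returns the stored array itself and the assignment mutates it. If the container were instead a pandas DataFrame, the same `df[column][mask] = 0` pattern would be chained assignment, which pandas does not guarantee to update the original; `df.loc[mask, column] = 0` is the documented form in that case.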

3. **Apply to LCFS**: Impute `has_fuel_consumption` to LCFS households before training consumption model

4. **At FRS imputation time**: Compute `has_fuel_consumption` directly from `num_vehicles` (already calibrated to NTS targets)

Good addition - the documentation clearly explains what happens and why.

@nikhilwoodruff nikhilwoodruff merged commit 827d9de into main Dec 7, 2025
4 checks passed
3 participants