
Reconcile SS sub-components after PUF imputation #552

Open
MaxGhenis wants to merge 6 commits into main from fix-ss-subcomponent-reconciliation


@MaxGhenis
Contributor

Summary

  • After PUF imputation replaces social_security values, rescales sub-components (social_security_retirement, social_security_disability, social_security_survivors, social_security_dependents) proportionally so they sum to the new total
  • New recipients (CPS had zero SS, PUF imputed positive) default to retirement
  • Adds 6 unit tests for the reconciliation logic
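The rescaling rule in the bullets above can be illustrated with a minimal sketch. This is not the PR's `reconcile_ss_subcomponents()` implementation: the function name `rescale_subcomponents`, the per-record dict layout, and the handling of a zero imputed total are assumptions made for illustration, and plain Python is used instead of array operations to keep the example self-contained.

```python
SUBCOMPONENTS = (
    "social_security_retirement",
    "social_security_disability",
    "social_security_survivors",
    "social_security_dependents",
)


def rescale_subcomponents(old_subs, new_total):
    """Hypothetical helper: make one record's sub-components sum to new_total.

    old_subs: dict of CPS-derived sub-component amounts (missing keys count as 0).
    new_total: the PUF-imputed social_security value for the same record.
    """
    zeros = {name: 0.0 for name in SUBCOMPONENTS}
    old_total = sum(old_subs.get(name, 0.0) for name in SUBCOMPONENTS)
    if new_total <= 0:
        # Assumption: a zero imputed total zeroes every sub-component.
        return zeros
    if old_total <= 0:
        # New recipient: there is no CPS split to preserve, so default to retirement.
        return {**zeros, "social_security_retirement": float(new_total)}
    # Existing recipient: keep the CPS proportions, scale to the new total.
    scale = new_total / old_total
    return {name: old_subs.get(name, 0.0) * scale for name in SUBCOMPONENTS}


existing = rescale_subcomponents(
    {"social_security_retirement": 6000.0, "social_security_disability": 2000.0},
    12000.0,
)
newcomer = rescale_subcomponents({}, 5000.0)
```

In this sketch the existing recipient keeps a 75/25 retirement/disability split at the new total, while the new recipient's full amount lands in retirement.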

Problem

PUF imputation overwrites social_security but not its sub-components. When policyengine-us crosses the base year boundary, social_security switches from the PUF-imputed input (2,281 nonzero records) to the formula path (sum of CPS-derived sub-components = 7,197 nonzero records), creating ~3x more SS recipients and artificial 9-point Gini swings.
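The mismatch described above can be reproduced on toy data. The sketch below is illustrative only: the three records and the helper name `nonzero_counts` are invented for the example, and the "formula path" is modeled simply as the sum of the four sub-components, as the paragraph describes.

```python
SUBCOMPONENTS = (
    "social_security_retirement",
    "social_security_disability",
    "social_security_survivors",
    "social_security_dependents",
)


def nonzero_counts(records):
    """Return (nonzero count on the input path, nonzero count on the formula path)."""
    input_path = [r["social_security"] for r in records]
    formula_path = [sum(r.get(name, 0.0) for name in SUBCOMPONENTS) for r in records]
    count = lambda values: sum(1 for v in values if v != 0)
    return count(input_path), count(formula_path)


# Toy records (made up): the total was overwritten, the sub-components were not.
records = [
    # Imputed total is zero, but the CPS-derived retirement amount is still positive.
    {"social_security": 0.0, "social_security_retirement": 10000.0},
    # Imputed total is positive and disagrees with the stale sub-component sum.
    {"social_security": 15000.0, "social_security_retirement": 12000.0},
    # Consistent non-recipient.
    {"social_security": 0.0},
]

input_nonzero, formula_nonzero = nonzero_counts(records)
```

Here the input path sees one recipient while the formula path sees two, which is the same kind of gap that appears when the model switches paths at the base-year boundary.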

Test plan

  • 6 unit tests covering proportional rescaling, new recipients, zero imputation, CPS half unchanged, missing sub-components, and no-op when social_security absent
  • Full calibration pipeline run to verify downstream effects

Fixes #551

🤖 Generated with Claude Code

MaxGhenis and others added 5 commits February 24, 2026 21:18
When PUF imputation replaces social_security values, the sub-components
(retirement, disability, survivors, dependents) were left unchanged,
creating a mismatch. This caused a base-year discontinuity where projected
years had ~3x more SS recipients than the base year, producing artificial
9-point Gini swings.

The new reconcile_ss_subcomponents() function rescales sub-components
proportionally after PUF imputation. New recipients (CPS had zero SS
but PUF imputed positive) default to retirement.

Fixes #551

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New PUF recipients (CPS had zero SS) now get assigned based on age:
>= 62 -> retirement, < 62 -> disability. Matches the CPS fallback
logic. Falls back to retirement if age is unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
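The age rule in this commit message can be sketched as follows. The function name `assign_new_recipient` and the constant name are hypothetical; only the threshold (62) and the fallback to retirement when age is unavailable come from the message itself.

```python
RETIREMENT_AGE_THRESHOLD = 62  # threshold stated in the commit message


def assign_new_recipient(new_total, age=None):
    """Hypothetical helper: place a new recipient's full SS amount in one category.

    Returns a (sub_component_name, amount) pair.
    """
    if age is None or age >= RETIREMENT_AGE_THRESHOLD:
        # Age unavailable, or 62 and over: treat as retirement.
        return "social_security_retirement", float(new_total)
    # Under 62: treat as disability.
    return "social_security_disability", float(new_total)


older = assign_new_recipient(18000.0, age=70)
younger = assign_new_recipient(9000.0, age=45)
unknown = assign_new_recipient(7000.0)
```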
Instead of the simple age >= 62 heuristic, train a QRF on CPS records
(where the reason-code split is known) to predict shares for new PUF
recipients. Uses age, gender, marital status, and other demographics.
Falls back to age heuristic when microimpute is unavailable or
training data is insufficient (< 100 records).

14 tests covering:
- Proportional rescaling (existing recipients)
- Age heuristic fallback (4 tests)
- QRF share prediction (4 tests including sum-to-one and
  elderly-predicted-as-retirement)
- Edge cases (zero imputation, missing subs, no SS)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
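The fallback and sum-to-one behaviour described in this commit message can be sketched without the model itself. In the sketch below, `predict` is a hypothetical stand-in for a trained QRF (the actual microimpute calls are not shown here), and clipping negative predictions before normalizing is an assumption; only the 100-record threshold and the age heuristic come from the message.

```python
MIN_TRAINING_RECORDS = 100  # threshold stated in the commit message
RETIREMENT_AGE_THRESHOLD = 62


def normalize_shares(raw):
    """Clip negatives to zero and rescale so the shares sum to one (None if all zero)."""
    clipped = [max(x, 0.0) for x in raw]
    total = sum(clipped)
    if total == 0:
        return None
    return [x / total for x in clipped]


def age_heuristic_shares(age):
    """Shares ordered as (retirement, disability, survivors, dependents)."""
    if age is None or age >= RETIREMENT_AGE_THRESHOLD:
        return [1.0, 0.0, 0.0, 0.0]
    return [0.0, 1.0, 0.0, 0.0]


def shares_for_new_recipient(age, n_training_records, predict=None):
    """Use the model when it is usable, otherwise fall back to the age heuristic."""
    if predict is None or n_training_records < MIN_TRAINING_RECORDS:
        return age_heuristic_shares(age)
    shares = normalize_shares(predict(age))
    return shares if shares is not None else age_heuristic_shares(age)


# A toy predictor (made up) that returns unnormalized, partly negative values.
toy_predict = lambda age: [3.0, 1.0, 0.0, -0.5]

modeled = shares_for_new_recipient(70, n_training_records=500, predict=toy_predict)
fallback = shares_for_new_recipient(45, n_training_records=50, predict=toy_predict)
```

With enough training records the toy prediction is normalized to a 75/25 split; with too few, the under-62 recipient falls back to disability.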
CPS-PUF link is statistical (not identity-based), so the paired CPS
record's sub-component split is just one noisy draw. A QRF trained
on all CPS SS recipients gives a better expected prediction. Also
removes unnecessary try/except ImportError guard for microimpute.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
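Once shares are predicted for a record, turning them into sub-component amounts is a single multiplication per category. The sketch below assumes the shares arrive as a dict keyed by sub-component name and already sum to one; the helper name `split_total` is hypothetical.

```python
SUBCOMPONENTS = (
    "social_security_retirement",
    "social_security_disability",
    "social_security_survivors",
    "social_security_dependents",
)


def split_total(new_total, shares):
    """Hypothetical helper: allocate an imputed SS total using predicted shares."""
    if new_total <= 0:
        # Nothing to allocate for a non-recipient.
        return {name: 0.0 for name in SUBCOMPONENTS}
    return {name: new_total * shares.get(name, 0.0) for name in SUBCOMPONENTS}


allocated = split_total(
    10000.0,
    {"social_security_retirement": 0.75, "social_security_disability": 0.25},
)
```

Because the shares sum to one, the allocated amounts sum back to the imputed total, so the input path and the formula path agree for that record.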
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MaxGhenis force-pushed the fix-ss-subcomponent-reconciliation branch from 15e6ed6 to 88f2aa4 on February 25, 2026 02:18
Fixes CI failure: DE TANF deficit_rate parameter started at 2024-10-01
in 1.570.7, causing ParameterNotFoundError for Jan 2024 simulations.
Fixed in newer releases (start date corrected to 2011-10-01).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

PUF imputation overwrites social_security but not sub-components, causing base-year discontinuity
