Skip to content

Pre-launch cleanup: dead code + plotly optional extra#288

Merged
MaxGhenis merged 2 commits intomainfrom
cleanup-dead-code
Apr 19, 2026
Merged

Pre-launch cleanup: dead code + plotly optional extra#288
MaxGhenis merged 2 commits intomainfrom
cleanup-dead-code

Conversation

@MaxGhenis
Copy link
Copy Markdown
Contributor

Summary

Removes three unambiguously dead code paths and moves `plotly` from the core install to a new `[plotting]` optional extra. Behavior-preserving for every downstream repo audited (`policyengine-api`, `policyengine-api-v2`, `policyengine-api-v2-alpha`).

First of several pre-launch restructuring PRs from the three-reviewer pass.

What's deleted

  1. `tax_benefit_models/us.py` and `uk.py` shim files. Python always resolves the `us/` / `uk/` package directory first, so the `.py` files were always shadowed. Worse: both files re-exported `general_policy_reform_analysis`, which is not defined anywhere — `from policyengine.tax_benefit_models.us import general_policy_reform_analysis` raises `ImportError` at runtime today.

  2. `_create_entity_output_model` + `PersonOutput` / `BenunitOutput` / `HouseholdEntityOutput` in `uk/analysis.py`. Built with `pydantic.create_model` at import time, grep-confirmed as referenced nowhere.

  3. `policyengine.core.DatasetVersion`. Single `Optional` field on `Dataset` (never set by anything), one `policyengine.core` re-export, no other callers.

What moves

  1. `plotly>=5.0.0` → `[plotting]` extra. Only `policyengine.utils.plotting` uses plotly, and nothing in `src/` imports that module — only `examples/` scripts do. `plotting.py` now soft-imports plotly with a clear install hint. `import policyengine` no longer drags in ~15MB of charting deps.

Downstream audit

Repo Pinned Uses deleted symbols?
`policyengine-api` pre-3.x No (different API)
`policyengine-api-v2` 3.4.0 No (uses `Simulation`, `uk_latest`, `Aggregate`, `Policy`, `Parameter`)
`policyengine-api-v2-alpha` 3.1.15 No (same as above)

Test plan

  • 216 tests pass locally across `test_release_manifests`, `test_trace_tro`, `test_results`, `test_household_impact`, `test_models`, `test_us_regions`, `test_uk_regions`, `test_region`, `test_manifest_version_mismatch`, `test_filtering`, `test_cache`, `test_scoping_strategy`
  • `ruff check .` + `ruff format --check .` clean

What's deferred (follow-up PRs)

The three-reviewer pass identified several larger moves worth making pre-launch. Each gets its own PR so this one stays mechanical:

  • PR B: `extra="forbid"` on household inputs (closes silent-typo hole)
  • PR C: naming pass — `calculate_household_impact` → `calculate_household`, `USHouseholdOutput` → `USHouseholdResult`, `economic_impact_analysis` → `analyze_reform`, `ProgramStatistics`/`ProgrammeStatistics` unified, plus deprecation shims
  • PR D: provenance layer — move `release_manifest.py` + `trace_tro.py` from `core/` to `policyengine/provenance/`; lazy-import `h5py` from `scoping_strategy.py`
  • PR E: extract `MicrosimulationModelVersion` base (~600 LOC duplication between `us/model.py` and `uk/model.py`)

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits April 18, 2026 19:39
Removes three unambiguously dead code paths and moves plotly out of
the core install so `import policyengine` doesn't pull the charting
stack. Changes are behavior-preserving for every downstream repo
surveyed (policyengine-api, policyengine-api-v2, policyengine-api-v2-alpha).

1. Delete `tax_benefit_models/{us,uk}.py` shim files. Python always
   resolves the `us/`/`uk/` package directory first, so the .py files
   were dead. Worse: both re-exported `general_policy_reform_analysis`
   which is not defined anywhere — `from policyengine.tax_benefit_models.us
   import general_policy_reform_analysis` raises ImportError at runtime.

2. Delete `_create_entity_output_model` + `PersonOutput` /
   `BenunitOutput` / `HouseholdEntityOutput` in uk/analysis.py. Built
   via pydantic.create_model at import time, referenced nowhere in
   the codebase.

3. Delete `policyengine.core.DatasetVersion`. One optional field on
   Dataset (never set by anything) and one core re-export. Nothing
   reads it downstream.

4. Move `plotly>=5.0.0` from base dependencies to a `[plotting]`
   optional extra. Only `policyengine.utils.plotting` uses plotly,
   and nothing in src/ imports that module — only `examples/` do.
   `plotting.py` now soft-imports with a clear install hint.

Downstream impact: none. Surveyed policyengine-api (pinned to a
pre-3.x API), policyengine-api-v2 (3.4.0), policyengine-api-v2-alpha
(3.1.15); none of them import the deleted symbols.

Tests: 216 passed locally across test_release_manifests,
test_trace_tro, test_results, test_household_impact, test_models,
test_us_regions, test_uk_regions, test_region,
test_manifest_version_mismatch, test_filtering, test_cache,
test_scoping_strategy.

Deferred (bigger refactors, follow-up PRs):
- filter_field/filter_value legacy path on Simulation (still wired
  through Region construction; needs migration)
- calculate_household_impact → calculate_household rename (with
  deprecation shim)
- Extract shared MicrosimulationModelVersion base (~600 LOC savings)
- Move release_manifest + trace_tro to policyengine/provenance/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
utils/__init__.py eagerly imported COLORS from plotting.py, which now
raises ImportError when plotly isn't installed. Every smoke-import job
on PR #288 failed because plotting.py blocked at module load.

Move COLORS + FONT_* constants to a new plotly-free utils/design.py;
plotting.py re-exports them for backward compatibility and adds them
to __all__. utils/__init__.py now pulls COLORS from design rather than
plotting. Confirmed locally that pip uninstall plotly still lets
'import policyengine' + 'from policyengine.utils import COLORS' +
'from policyengine.core.release_manifest import get_release_manifest'
all work cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis
Copy link
Copy Markdown
Contributor Author

Corrections to the PR body

CI: fixed

Smoke-import jobs were red — `utils/init.py` eagerly imported `COLORS` from `plotting.py`, which now raises when plotly is absent. Pushed `8e412d2` that extracts brand tokens (`COLORS`, `FONT_*`) into a new plotly-free `utils/design.py`; `plotting.py` re-exports them. Confirmed locally: `pip uninstall plotly` still lets `import policyengine` + `from policyengine.utils import COLORS` + `from policyengine.core.release_manifest import get_release_manifest` work clean.

Downstream audit — corrected

The "three downstreams audited" table I posted earlier conflated pin ranges. Actual state:

Repo Pin On 3.x surface?
`policyengine-api` `>0.12.0,<1` No — pre-3.x API (`policyengine.simulation`, `policyengine.outputs.macro`)
`policyengine-api-v2` `policyengine==0.13.0` No — same pre-3.x API
`policyengine-api-v2-alpha` `>=3.2.3` Yes — sole real consumer of the 3.x surface

So this PR and the rest of the v4 chain affects exactly one repo: `policyengine-api-v2-alpha`. Migration effort for v4: ~80-120 LOC, concentrated in two files, mostly `filter_field`→`RowFilterStrategy` and `entity_variables`→`default_output_columns` renames. I'll file a tracking issue there when v4 cuts.

@MaxGhenis MaxGhenis merged commit e339621 into main Apr 19, 2026
11 checks passed
MaxGhenis added a commit that referenced this pull request Apr 19, 2026
* Pre-launch cleanup: dead code + plotly optional extra

Removes three unambiguously dead code paths and moves plotly out of
the core install so `import policyengine` doesn't pull the charting
stack. Changes are behavior-preserving for every downstream repo
surveyed (policyengine-api, policyengine-api-v2, policyengine-api-v2-alpha).

1. Delete `tax_benefit_models/{us,uk}.py` shim files. Python always
   resolves the `us/`/`uk/` package directory first, so the .py files
   were dead. Worse: both re-exported `general_policy_reform_analysis`
   which is not defined anywhere — `from policyengine.tax_benefit_models.us
   import general_policy_reform_analysis` raises ImportError at runtime.

2. Delete `_create_entity_output_model` + `PersonOutput` /
   `BenunitOutput` / `HouseholdEntityOutput` in uk/analysis.py. Built
   via pydantic.create_model at import time, referenced nowhere in
   the codebase.

3. Delete `policyengine.core.DatasetVersion`. One optional field on
   Dataset (never set by anything) and one core re-export. Nothing
   reads it downstream.

4. Move `plotly>=5.0.0` from base dependencies to a `[plotting]`
   optional extra. Only `policyengine.utils.plotting` uses plotly,
   and nothing in src/ imports that module — only `examples/` do.
   `plotting.py` now soft-imports with a clear install hint.

Downstream impact: none. Surveyed policyengine-api (pinned to a
pre-3.x API), policyengine-api-v2 (3.4.0), policyengine-api-v2-alpha
(3.1.15); none of them import the deleted symbols.

Tests: 216 passed locally across test_release_manifests,
test_trace_tro, test_results, test_household_impact, test_models,
test_us_regions, test_uk_regions, test_region,
test_manifest_version_mismatch, test_filtering, test_cache,
test_scoping_strategy.

Deferred (bigger refactors, follow-up PRs):
- filter_field/filter_value legacy path on Simulation (still wired
  through Region construction; needs migration)
- calculate_household_impact → calculate_household rename (with
  deprecation shim)
- Extract shared MicrosimulationModelVersion base (~600 LOC savings)
- Move release_manifest + trace_tro to policyengine/provenance/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Extract brand tokens to utils.design so import works without plotly

utils/__init__.py eagerly imported COLORS from plotting.py, which now
raises ImportError when plotly isn't installed. Every smoke-import job
on PR #288 failed because plotting.py blocked at module load.

Move COLORS + FONT_* constants to a new plotly-free utils/design.py;
plotting.py re-exports them for backward compatibility and adds them
to __all__. utils/__init__.py now pulls COLORS from design rather than
plotting. Confirmed locally that pip uninstall plotly still lets
'import policyengine' + 'from policyengine.utils import COLORS' +
'from policyengine.core.release_manifest import get_release_manifest'
all work cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Drop legacy filter_field/filter_value scoping fields (v4 breaking)

Removes the two-way scoping contract in favour of the single
ScopingStrategy path. The legacy fields were labeled "kept for
backward compatibility" but became dead wiring the moment every
caller started passing scoping_strategy explicitly.

Changes:

Simulation (core/simulation.py)
  - Drop filter_field, filter_value fields.
  - Drop _auto_construct_strategy model_validator that rewrote those
    fields into a RowFilterStrategy.

Region (core/region.py)
  - Drop filter_field, filter_value, requires_filter fields.
  - Re-add requires_filter as a derived @Property: True iff
    scoping_strategy is not None.
  - Simplify get_dataset_regions / get_filter_regions to use
    dataset_path / scoping_strategy directly.

Country models (tax_benefit_models/us/model.py, .../uk/model.py)
  - Delete the `elif simulation.filter_field and simulation.filter_value:`
    fallback branch in run() — unreachable because nobody sets those
    fields anymore.
  - Delete the _filter_dataset_by_household_variable private method —
    only called from the elif branch. The underlying
    utils.entity_utils.filter_dataset_by_household_variable helper
    stays (it's what RowFilterStrategy.apply uses).
  - Drop the now-unused import.

Region factories (countries/{us,uk}/regions.py)
  - Stop setting requires_filter=True, filter_field=..., filter_value=...
    alongside scoping_strategy. The scoping_strategy is already the
    source of truth; the duplicate legacy triple was noise.

Tests
  - test_filtering.py: drop TestSimulationFilterParameters (fields gone)
    and TestUSFilterDatasetByHouseholdVariable /
    TestUKFilterDatasetByHouseholdVariable (method gone; underlying
    behaviour still covered by test_scoping_strategy.py
    TestRowFilterStrategy). Keep the build_entity_relationships tests.
  - test_scoping_strategy.py: drop three legacy-auto-construct tests,
    replace one with a direct WeightReplacementStrategy round-trip
    check.
  - test_region.py, test_us_regions.py, test_uk_regions.py: assertions
    move from `region.filter_field == "X"` to
    `region.scoping_strategy.variable_name == "X"`.
  - fixtures/region_fixtures.py: factories use
    scoping_strategy=RowFilterStrategy(...) directly.

212 tests pass. Downstream impact: policyengine-api-v2-alpha uses the
legacy fields in ~14 call sites (grep confirmed); they migrate to
RowFilterStrategy in a paired PR. The v4 migration guide will call out
this single search-and-replace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant