Related to #482.
Symptom
User-provided inputs that go through a custom set_input dispatcher can be lost after Simulation._invalidate_all_caches() runs, for example via apply_reform().
Minimal repro using the country template salary variable, which is monthly but accepts a yearly input through set_input_divide_by_period:
import numpy as np
from policyengine_core.country_template import CountryTaxBenefitSystem, situation_examples
from policyengine_core.model_api import Reform
from policyengine_core.simulations import SimulationBuilder
sim = SimulationBuilder().build_from_entities(
CountryTaxBenefitSystem(),
situation_examples.single,
)
sim.set_input("salary", "2017", np.array([12_000.0]))
assert sim.calculate("salary", "2017-01")[0] == 1_000
class NoOpReform(Reform):
def apply(self):
pass
sim.apply_reform(NoOpReform)
# Expected: still 1_000. Currently falls back to the default value.
assert sim.calculate("salary", "2017-01")[0] == 1_000
Observed locally while reviewing #483:
before apply_reform: [1000.]
after apply_reform: [0.]
Suspected mechanism
This appears to be the same broad class of issue as #482: cache invalidation preserves user inputs by replaying entries from _user_input_keys, but _user_input_keys can record a different period from the one actually written to storage.
For this case:
Simulation.set_input("salary", "2017", ...) / Holder.set_input(...) records (salary, default, 2017) in _user_input_keys.
- Because
salary.definition_period == MONTH and salary.set_input == set_input_divide_by_period, Holder.set_input dispatches to the helper instead of storing salary@2017 directly.
set_input_divide_by_period writes actual monthly values like salary@2017-01, ..., salary@2017-12 via holder._set(...).
_invalidate_all_caches() later tries to preserve salary@2017, misses the monthly storage entries, clears the holder storage, and the monthly user inputs are lost.
#482 covers the ETERNITY version of this mismatch: requested ordinary period, stored ETERNITY. This issue covers custom set_input dispatch where requested broad period is expanded into subperiod storage writes.
Expected behavior
User-provided inputs should survive cache invalidation, regardless of whether they are stored directly, canonicalized to ETERNITY, or expanded by a custom set_input dispatcher. Formula-computed cache outputs should still be invalidated.
Possible fix direction
Track the actual storage writes made as part of a user input rather than the caller-requested period. One option is to move preservation tracking closer to Holder._set, with an internal user-input context set by Holder.set_input, so dispatched writes record their real (variable, branch, period) keys. Branch handling would need care because existing dispatcher helpers call holder._set(sub_period, array) without an explicit branch_name.
Related to #482.
Symptom
User-provided inputs that go through a custom
set_inputdispatcher can be lost afterSimulation._invalidate_all_caches()runs, for example viaapply_reform().Minimal repro using the country template
salaryvariable, which is monthly but accepts a yearly input throughset_input_divide_by_period:Observed locally while reviewing #483:
Suspected mechanism
This appears to be the same broad class of issue as #482: cache invalidation preserves user inputs by replaying entries from
_user_input_keys, but_user_input_keyscan record a different period from the one actually written to storage.For this case:
Simulation.set_input("salary", "2017", ...)/Holder.set_input(...)records(salary, default, 2017)in_user_input_keys.salary.definition_period == MONTHandsalary.set_input == set_input_divide_by_period,Holder.set_inputdispatches to the helper instead of storingsalary@2017directly.set_input_divide_by_periodwrites actual monthly values likesalary@2017-01, ...,salary@2017-12viaholder._set(...)._invalidate_all_caches()later tries to preservesalary@2017, misses the monthly storage entries, clears the holder storage, and the monthly user inputs are lost.#482 covers the ETERNITY version of this mismatch: requested ordinary period, stored
ETERNITY. This issue covers customset_inputdispatch where requested broad period is expanded into subperiod storage writes.Expected behavior
User-provided inputs should survive cache invalidation, regardless of whether they are stored directly, canonicalized to
ETERNITY, or expanded by a customset_inputdispatcher. Formula-computed cache outputs should still be invalidated.Possible fix direction
Track the actual storage writes made as part of a user input rather than the caller-requested period. One option is to move preservation tracking closer to
Holder._set, with an internal user-input context set byHolder.set_input, so dispatched writes record their real(variable, branch, period)keys. Branch handling would need care because existing dispatcher helpers callholder._set(sub_period, array)without an explicitbranch_name.