Summary
The statistics fed into IntensityFree's LogNormalMixtureDistribution are computed on raw inter-event times $\tau$, but the variables are named mean_log_inter_time / std_log_inter_time and are consumed by an AffineTransform that, per the original IFL-TPP paper and reference implementation, expects the mean and std of $\log\tau$. This silently breaks the intended standardization of the GMM's base space and diverges from the upstream implementation (Shchur et al., ICLR 2020).
Where the issue is
1. easy_tpp/preprocess/dataset.py::TPPDataset.get_dt_stats
for dts, marks in zip(self.time_delta_seqs, self.type_seqs):
    dts = np.array(dts[1:-1 if marks[-1] == -1 else None])
    ...
    y_bar = dts.mean()  # mean of raw τ
    s_2_y = dts.var()   # var of raw τ
    ...
return x_bar, (s_2_x ** 0.5), min_dt, max_dt
Per-chunk inputs y_bar, s_2_y are computed on raw dts, not on np.log(dts).
2. easy_tpp/runner/base_runner.py (around L43)
mean_log_inter_time, std_log_inter_time, min_dt, max_dt = (
    self._data_loader.train_loader().dataset.get_dt_stats())
runner_config.model_config.set("mean_log_inter_time", mean_log_inter_time)
runner_config.model_config.set("std_log_inter_time", std_log_inter_time)
The values are assigned to keys that semantically promise "log-space statistics", with no log transform in between.
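For comparison, here is a minimal sketch of what a log-space version of these statistics could look like. It assumes the same trimming rule as the get_dt_stats snippet above. The helper name log_dt_stats is hypothetical and not part of EasyTPP. It pools all values in one pass instead of reproducing the per-chunk running update elided above, and it drops non-positive deltas so the logarithm is defined, which is also an assumption.

```python
import numpy as np


def log_dt_stats(time_delta_seqs, type_seqs):
    """Hypothetical helper: pooled mean and std of log(dt) over all sequences."""
    log_dts = []
    for dts, marks in zip(time_delta_seqs, type_seqs):
        # Same trimming as the snippet above: skip the first delta and,
        # if the last mark is -1, skip the last delta as well.
        dts = np.asarray(dts[1:-1 if marks[-1] == -1 else None], dtype=float)
        # Assumption: non-positive deltas are dropped so that log is defined.
        dts = dts[dts > 0]
        log_dts.append(np.log(dts))
    log_dts = np.concatenate(log_dts)
    # Statistics of log(dt), not of dt itself.
    return log_dts.mean(), log_dts.std()
```

Note that np.std defaults to the population estimate (ddof=0), whereas torch.Tensor.std in the upstream code defaults to the unbiased estimate, so small numerical differences from the reference implementation would be expected on short datasets.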
Comparison with the original IFL-TPP repo
Reference: shchur/ifl-tpp (ICLR 2020 official).
dpp/data/dataset.py computes log-space statistics explicitly:
def get_inter_time_statistics(self):
    """Get the mean and std of log(inter_time)."""
    all_inter_times = torch.cat([seq.inter_times[:-1] for seq in self.sequences])
    mean_log_inter_time = all_inter_times.log().mean()
    std_log_inter_time = all_inter_times.log().std()
    return mean_log_inter_time, std_log_inter_time
dpp/models/log_norm_mix.py documents the intended modelling chain:
x ~ GaussianMixtureModel(locs, log_scales, log_weights)
y = std_log_inter_time * x + mean_log_inter_time # <- expects log-space stats
z = exp(y)
The class structure (AffineTransform ∘ ExpTransform) in EasyTPP is the same, but the values supplied to it are computed on the wrong scale.
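The effect of the scale mismatch can be illustrated on synthetic data (not taken from any EasyTPP dataset). The sketch below draws log-normal inter-event times and inverts the affine step of the chain, x = (log τ − mean) / std, once with log-space statistics and once with raw statistics. The distribution parameters and the helper name to_base_space are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic inter-event times with log(tau) ~ Normal(2.0, 1.5) (illustrative values).
tau = rng.lognormal(mean=2.0, sigma=1.5, size=100_000)
log_tau = np.log(tau)


def to_base_space(log_tau, mean, std):
    # Inverse of y = std * x + mean, applied to y = log(tau).
    return (log_tau - mean) / std


# Log-space statistics, as in the upstream implementation.
x_log = to_base_space(log_tau, log_tau.mean(), log_tau.std())
# Raw-scale statistics, as currently passed under the *_log_inter_time keys.
x_raw = to_base_space(log_tau, tau.mean(), tau.std())

print("log-space stats:", x_log.mean(), x_log.std())
print("raw stats:      ", x_raw.mean(), x_raw.std())
```

With log-space statistics the base-space variable has mean close to 0 and std close to 1. With raw statistics it is shifted and strongly compressed, so the mixture has to absorb an arbitrary, dataset-dependent offset and scale.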
Is it a bug?