Merged
40 changes: 32 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
@@ -205,6 +205,11 @@ If you are not training locally, you can run inference using pre-trained models.
- `models/Ising-Decoder-SurfaceCode-1-Fast.pt` (receptive field R=9)
- `models/Ising-Decoder-SurfaceCode-1-Accurate.pt` (receptive field R=13)

These checkpoints target the uniform circuit-level depolarizing setting
encoded by the public configs. The pipeline below also supports training with
custom, non-uniform 25-parameter noise models; that is a training-time
customization, not a property of the shipped checkpoints.

Cloned repositories fetch these files via `git lfs pull`. Optionally, set `PREDECODER_MODEL_URL` to the LFS/raw URL so the files can be fetched when they are not in the working tree (e.g. in a minimal checkout or CI).

3. Set:
@@ -559,28 +564,39 @@ LOGICAL Z (lz):
#### Noise model (public default)

- `data.noise_model`: a **25-parameter circuit-level** noise model (SPAM, idles, and CNOT Pauli channels).
- The shipped configs use a **uniform circuit-level depolarizing** mapping, where all 25 values are derived from a single physical error rate `p` (for example `p_prep_{X,Z}=2*p/3`, `p_idle_cnot_{X,Y,Z}=p/3`, and `p_cnot_*=p/15`).
- You may edit `data.noise_model` to train on a non-uniform/custom 25-parameter model. In that case the Torch training generator refreshes the sampling probability vector from the active 25p model instead of collapsing back to the scalar uniform-depolarizing path.
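The uniform mapping described above can be sketched as follows. This is an illustrative helper, not code from the repo: the parameter names follow the README, the `p_prep`/`p_idle_cnot`/`p_cnot` ratios follow the README's example, and applying the same ratios to `p_meas_*` and `p_idle_spam_*` is an assumption.

```python
def uniform_depolarizing_25p(p: float) -> dict:
    """Expand a scalar physical error rate p into the 25 circuit-level
    parameters named in this README (hypothetical helper)."""
    model = {}
    for pauli in ("X", "Z"):
        model[f"p_prep_{pauli}"] = 2 * p / 3  # per README's example
        model[f"p_meas_{pauli}"] = 2 * p / 3  # assumed to mirror prep
    for idle in ("idle_cnot", "idle_spam"):
        for pauli in ("X", "Y", "Z"):
            model[f"p_{idle}_{pauli}"] = p / 3
    # The 15 non-identity two-qubit Pauli labels: IX, IY, ..., ZZ.
    labels = [a + b for a in "IXYZ" for b in "IXYZ" if a + b != "II"]
    for lab in labels:
        model[f"p_cnot_{lab}"] = p / 15
    assert len(model) == 25
    return model
```

Editing any of these 25 values independently is what the non-uniform path above refers to.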

#### Training noise upscaling (surface code)

When training a surface-code pre-decoder the noise parameters you specify may be very small (e.g. `p = 1e-4`), which produces extremely sparse syndromes and slow convergence. To address this, the training pipeline **automatically upscales** all 25 noise-model parameters so that the largest grouped total `max(P_prep, P_meas, P_idle_cnot, P_idle_spam, P_cnot)` equals a fixed target of **6 × 10⁻³** (just below the surface-code threshold of ~7.5 × 10⁻³).
When training a surface-code pre-decoder the noise parameters you specify may be very small (e.g. `p = 1e-4`), which produces extremely sparse syndromes and slow convergence. To address this, the training pipeline **automatically upscales** all 25 noise-model parameters so that the largest *effective* fault-channel probability equals a fixed target of **6 × 10⁻³** (just below the surface-code threshold of ~7.5 × 10⁻³).

The five grouped totals are:
The seven channels considered (the "capital P's") are:

| Group | Sum of |
|-------|--------|
| P_prep | `p_prep_X + p_prep_Z` |
| P_meas | `p_meas_X + p_meas_Z` |
| Channel | Value |
|---------|-------|
| P_prep_X | `p_prep_X` |
| P_prep_Z | `p_prep_Z` |
| P_meas_X | `p_meas_X` |
| P_meas_Z | `p_meas_Z` |
| P_idle_cnot | `p_idle_cnot_X + p_idle_cnot_Y + p_idle_cnot_Z` |
| P_idle_spam | `p_idle_spam_X + p_idle_spam_Y + p_idle_spam_Z` |
| P_idle_spam (effective) | `0.5 × (p_idle_spam_X + p_idle_spam_Y + p_idle_spam_Z)` |
| P_cnot | sum of all 15 `p_cnot_*` |

`max_group = max(P_prep_X, P_prep_Z, P_meas_X, P_meas_Z, P_idle_cnot, P_idle_spam_effective, P_cnot)`.

Two design notes:

- **X / Z prep and measurement are kept separate.** They are independent one-Pauli fault channels; summing `p_prep_X + p_prep_Z` (or `p_meas_X + p_meas_Z`) would double-count the effective channel probability and inflate `max_group` for an otherwise on-target depolarizing noise model.
- **`p_idle_spam_*` is halved before the comparison.** The SPAM-window idle is built from a two-step model (one step per state-prep half and one per ancilla-reset half), so the raw configured total represents two depolarizing steps. The scaling decision uses the per-step effective value `0.5 × p_idle_spam_raw`; the raw value is still reported in logs as `idle_spam_raw`.

**Upscaling rules:**

- If `max_group < 6e-3`: all 25 p's are multiplied by `6e-3 / max_group` for training data generation only. Evaluation always uses the original user-specified noise model as-is.
- If `max_group >= 6e-3`: parameters are **not** modified (the training log emits a warning in case this indicates a configuration error).
- Non-surface-code types (`code_type != "surface_code"`) are never upscaled.

**Algorithm in brief:** The pipeline stores `p_max = max(P_prep, P_meas, P_idle_cnot, P_idle_spam, P_cnot)` from the full 25-parameter noise vector and rescales the entire vector by `0.006 / p_max` so that `p_max` is raised to **0.6%** (6 × 10⁻³). The original noise model is preserved unchanged for evaluation.
**Algorithm in brief:** The pipeline computes the seven channels above, takes `p_max = max(...)`, and rescales the entire 25-parameter vector by `0.006 / p_max` so that `p_max` is raised to **0.6%** (6 × 10⁻³). The original noise model is preserved unchanged for evaluation.
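The decision logic can be sketched as below. This assumes the noise model is a flat dict keyed by the 25 parameter names from this README; the real pipeline's function and variable names may differ.

```python
def upscale_for_training(nm: dict, target: float = 6e-3) -> dict:
    """Sketch of the surface-code training upscaling decision."""
    cnot_keys = [k for k in nm if k.startswith("p_cnot_")]
    channels = {
        "P_prep_X": nm["p_prep_X"],
        "P_prep_Z": nm["p_prep_Z"],
        "P_meas_X": nm["p_meas_X"],
        "P_meas_Z": nm["p_meas_Z"],
        "P_idle_cnot": nm["p_idle_cnot_X"] + nm["p_idle_cnot_Y"] + nm["p_idle_cnot_Z"],
        # SPAM idle is a two-step model: compare the per-step value.
        "P_idle_spam": 0.5 * (nm["p_idle_spam_X"] + nm["p_idle_spam_Y"] + nm["p_idle_spam_Z"]),
        "P_cnot": sum(nm[k] for k in cnot_keys),
    }
    p_max = max(channels.values())
    if p_max >= target:
        return dict(nm)  # already at/above target: leave unchanged
    scale = target / p_max
    return {k: v * scale for k, v in nm.items()}
```

The rescaled dict is used for training data generation only; evaluation keeps the original `nm`.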

We have found that training on denser syndromes and then evaluating on sparser data produces better results than training directly on sparse data.

Expand Down Expand Up @@ -622,6 +638,14 @@ If frames are missing, the code can fall back to on-the-fly generation, but it i
python3 code/data/precompute_frames.py --distance 13 --n_rounds 13 --basis X Z --rotation O1
```

Precomputed DEM/frame artifacts are structural: they encode which detector
responses each possible error column can produce for a given distance, number of
rounds, basis, and rotation. The active scalar or 25-parameter noise model
controls the per-column sampling probabilities. Therefore cached structural
artifacts can be reused when only the probabilities change; the training
generator refreshes the probability vector from the active noise model at load
time.
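The separation of structure and probabilities can be sketched as follows. This is illustrative only: the real artifact filenames and archive keys in the repo may differ.

```python
import numpy as np

def load_structure_with_refreshed_p(structure_path, p_vector):
    """Reuse a cached structural artifact while swapping in per-column
    probabilities from the active noise model (illustrative sketch)."""
    bundle = np.load(structure_path)
    # Structural: which detectors each error column can trigger.
    H = bundle["H"]
    # Probabilities come from the active scalar or 25p noise model,
    # not from the cached values the artifact was exported with.
    p = np.asarray(p_vector, dtype=np.float64)
    if p.shape[0] != H.shape[1]:
        raise ValueError("p must have one entry per cached error column")
    return H, p
```

Because `H` depends only on distance, rounds, basis, and rotation, changing only the noise probabilities never invalidates the cached structure.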

### Resuming training and running inference on a trained model

- **Inference uses the trained model from `outputs/<experiment_name>/models/`**, so keep the same `EXPERIMENT_NAME` when you switch from training to inference.
109 changes: 88 additions & 21 deletions code/data/generator_torch.py
@@ -14,6 +14,7 @@
# limitations under the License.

import torch
from pathlib import Path


class QCDataGeneratorTorch:
@@ -76,17 +77,15 @@ def __init__(
raise ValueError(
"decompose_y is not supported in the Torch-only generator (set decompose_y=false)."
)
if noise_model is not None:
# TODO: wire noise_model through to precompute_dem_bundle_surface_code()
# so train/eval use the same 25-parameter noise distribution.
# build_single_p_marginal() already supports noise_model; only this
# generator-level plumbing is missing.
raise ValueError(
"noise_model is not supported in the Torch-only generator (simple single-p only)."
)
self.noise_model = noise_model

from qec.surface_code.memory_circuit_torch import MemoryCircuitTorch
from qec.precompute_dem import precompute_dem_bundle_surface_code
from qec.precompute_dem import (
build_probability_vector_surface_code,
dem_artifact_metadata_matches,
load_dem_artifact_metadata,
precompute_dem_bundle_surface_code,
)

import threading
self._early_compile_threads: list[threading.Thread] = []
@@ -110,17 +109,73 @@ def __init__(
self._early_compile_threads.append(t)

dem_cache = {}
if precomputed_frames_dir is None:
# Pick a nominal p for building the single-p marginal vector.
# (This matches existing behavior when using a precomputed directory.)
p_nom = float(
p_error if p_error is not None else (p_max if p_max is not None else 0.004)
p_overrides = {}
bases_needed = ["X", "Z"] if self._mixed else [self._single_basis]
p_nom = float(p_error if p_error is not None else (p_max if p_max is not None else 0.004))
effective_precomputed_frames_dir = precomputed_frames_dir
metadata_reason = None
if precomputed_frames_dir is not None:
precomputed_dir = Path(precomputed_frames_dir)
for b in bases_needed:
p_path = (
precomputed_dir /
f"surface_d{self.distance}_r{self.n_rounds}_{b}_frame_predecoder.p.npz"
)
if not p_path.exists():
# Asymmetric handling on purpose:
# - Scalar mode preserves the legacy behaviour of letting
# MemoryCircuitTorch raise its own clear FileNotFoundError
# downstream (so `continue` here keeps the dir intact).
# - 25p mode self-heals by falling back to an in-memory
# build, since the p artifact is needed to determine the
# cached error-column count.
if noise_model is None:
continue
effective_precomputed_frames_dir = None
metadata_reason = f"missing p artifact for basis {b}"
break
metadata = load_dem_artifact_metadata(p_path)
ok, reason = dem_artifact_metadata_matches(
metadata,
distance=self.distance,
n_rounds=self.n_rounds,
basis=b,
code_rotation=self.code_rotation,
p_scalar=p_nom,
noise_model=noise_model,
)
if not ok:
effective_precomputed_frames_dir = None
metadata_reason = f"basis {b}: {reason}"
break
p_overrides[b] = torch.from_numpy(
build_probability_vector_surface_code(
distance=self.distance,
n_rounds=self.n_rounds,
basis=b,
code_rotation=self.code_rotation,
p_scalar=p_nom,
noise_model=noise_model,
)
)

if effective_precomputed_frames_dir is None:
nm_tag = ", noise_model=25p" if noise_model is not None else ""
if precomputed_frames_dir is None:
source = "precomputed_frames_dir=None"
else:
source = f"precomputed DEM metadata mismatch ({metadata_reason})"
# Always announce a metadata-driven rebuild on rank 0 so silent
# rebuilds are visible in non-verbose distributed runs. The
# precomputed_frames_dir=None path is only logged when verbose.
should_log = self.verbose or (
precomputed_frames_dir is not None and int(self.rank) == 0
)
if self.verbose:
if should_log:
print(
f"[QCDataGeneratorTorch] precomputed_frames_dir=None -> building in-memory DEM bundle at p={p_nom}"
f"[QCDataGeneratorTorch] {source} -> building in-memory DEM bundle "
f"at p={p_nom}{nm_tag}"
)
bases_needed = ["X", "Z"] if self._mixed else [self._single_basis]
for b in bases_needed:
dem_cache[b] = precompute_dem_bundle_surface_code(
distance=self.distance,
@@ -132,8 +187,17 @@ def __init__(
device=self.device,
export=False,
return_artifacts=True,
# TODO: pass noise_model=noise_model here for circuit-level noise support
noise_model=noise_model,
)
elif self.verbose:
nm_tag = (
f", refreshed_p_from_noise_model={noise_model.sha256()}"
if noise_model is not None else f", refreshed_p_from_scalar={p_nom:g}"
)
print(
f"[QCDataGeneratorTorch] using disk DEM structure from "
f"{effective_precomputed_frames_dir}{nm_tag}"
)

_he_kwargs = dict(
timelike_he=timelike_he,
@@ -154,37 +218,40 @@ def __init__(
distance=self.distance,
n_rounds=self.n_rounds,
basis="X",
precomputed_frames_dir=precomputed_frames_dir,
precomputed_frames_dir=effective_precomputed_frames_dir,
code_rotation=self.code_rotation,
device=self.device,
H=(dem_cache.get("X", {}).get("H") if dem_cache else None),
p=(dem_cache.get("X", {}).get("p") if dem_cache else None),
A=(dem_cache.get("X", {}).get("A") if dem_cache else None),
p_override=p_overrides.get("X"),
**_he_kwargs,
)
self.sim_Z = MemoryCircuitTorch(
distance=self.distance,
n_rounds=self.n_rounds,
basis="Z",
precomputed_frames_dir=precomputed_frames_dir,
precomputed_frames_dir=effective_precomputed_frames_dir,
code_rotation=self.code_rotation,
device=self.device,
H=(dem_cache.get("Z", {}).get("H") if dem_cache else None),
p=(dem_cache.get("Z", {}).get("p") if dem_cache else None),
A=(dem_cache.get("Z", {}).get("A") if dem_cache else None),
p_override=p_overrides.get("Z"),
**_he_kwargs,
)
else:
self.sim = MemoryCircuitTorch(
distance=self.distance,
n_rounds=self.n_rounds,
basis=self._single_basis,
precomputed_frames_dir=precomputed_frames_dir,
precomputed_frames_dir=effective_precomputed_frames_dir,
code_rotation=self.code_rotation,
device=self.device,
H=(dem_cache.get(self._single_basis, {}).get("H") if dem_cache else None),
p=(dem_cache.get(self._single_basis, {}).get("p") if dem_cache else None),
A=(dem_cache.get(self._single_basis, {}).get("A") if dem_cache else None),
p_override=p_overrides.get(self._single_basis),
**_he_kwargs,
)

62 changes: 62 additions & 0 deletions code/data/precompute_frames.py
@@ -34,6 +34,7 @@
"""

import argparse
import json
from pathlib import Path
import sys
import time
@@ -44,6 +45,7 @@
sys.path.insert(0, str(Path(__file__).parent.parent))

from qec.precompute_dem import precompute_dem_bundle_surface_code
from qec.noise_model import NoiseModel


def _normalize_rotation(rotation: str) -> str:
@@ -56,6 +58,42 @@ def _normalize_rotation(rotation: str) -> str:
return internal_rot


def _load_config_mapping(path: str) -> dict:
path_obj = Path(path)
if path_obj.suffix.lower() == ".json":
with path_obj.open("r", encoding="utf-8") as f:
return json.load(f)

try:
from omegaconf import OmegaConf
cfg = OmegaConf.load(path_obj)
return OmegaConf.to_container(cfg, resolve=True)
except Exception:
try:
import yaml
except ImportError as exc:
raise RuntimeError(
"YAML noise model configs require omegaconf or PyYAML to be installed"
) from exc
with path_obj.open("r", encoding="utf-8") as f:
return yaml.safe_load(f)


def _load_noise_model(path: str) -> NoiseModel:
cfg = _load_config_mapping(path)
if not isinstance(cfg, dict):
raise ValueError(f"Noise model config must be a mapping, got {type(cfg).__name__}")

if isinstance(cfg.get("data"), dict) and isinstance(cfg["data"].get("noise_model"), dict):
noise_model_cfg = cfg["data"]["noise_model"]
elif isinstance(cfg.get("noise_model"), dict):
noise_model_cfg = cfg["noise_model"]
else:
noise_model_cfg = cfg

return NoiseModel.from_config_dict(noise_model_cfg)


def main() -> None:
parser = argparse.ArgumentParser(
description="Precompute DEM bundles for MemoryCircuitTorch",
@@ -101,6 +139,18 @@ def main() -> None:
default=0.01,
help="Scalar p for exporting single-p marginals",
)
parser.add_argument(
"--noise_model_config",
type=str,
default=None,
help="YAML/JSON config containing data.noise_model, noise_model, or a direct 25p mapping",
)
parser.add_argument(
"--noise_model_json",
type=str,
default=None,
help="JSON file containing data.noise_model, noise_model, or a direct 25p mapping",
)
parser.add_argument(
"--dem_output_dir",
type=str,
@@ -132,6 +182,13 @@
args.dem_output_dir = args.output_dir
if args.dem_output_dir is None:
args.dem_output_dir = str(Path(__file__).parent.parent / "frames_data")
if args.noise_model_config is not None and args.noise_model_json is not None:
print("Error: Use only one of --noise_model_config or --noise_model_json")
sys.exit(1)
noise_model = None
noise_model_path = args.noise_model_config or args.noise_model_json
if noise_model_path is not None:
noise_model = _load_noise_model(noise_model_path)

if args.n_rounds is None:
args.n_rounds = args.distance
@@ -158,6 +215,10 @@
print(f"# Bases: {args.basis}")
print(f"# Rotation: {args.rotation} (internal={internal_rot})")
print(f"# Output: {args.dem_output_dir}")
if noise_model is None:
print(f"# Noise: scalar p={float(args.p)}")
else:
print(f"# Noise: 25p noise_model sha256={noise_model.sha256()}")
print(f"{'#' * 60}")

total_t0 = time.time()
@@ -175,6 +236,7 @@
dem_output_dir=str(args.dem_output_dir),
device=device,
export=True,
noise_model=noise_model,
)
output_dirs.add(str(dem_dir))
except Exception as e:
Expand Down
6 changes: 3 additions & 3 deletions code/export/generate_test_data.py
@@ -106,9 +106,9 @@
"p_idle_cnot_X": 0.001,
"p_idle_cnot_Y": 0.001,
"p_idle_cnot_Z": 0.001,
"p_idle_spam_X": 0.001998,
"p_idle_spam_Y": 0.001998,
"p_idle_spam_Z": 0.001998,
"p_idle_spam_X": 0.001996,
"p_idle_spam_Y": 0.001996,
"p_idle_spam_Z": 0.001996,
"p_cnot_IX": 0.0002,
"p_cnot_IY": 0.0002,
"p_cnot_IZ": 0.0002,