Ancestors parallel by jeromekelleher · Pull Request #1124 · tskit-dev/tsinfer

jeromekelleher · 2026-03-12T12:46:07Z

Note to self: I'm not convinced the futures approach is correct here, as it could lead to unbounded memory use. I also need to figure out how to deal with potentially empty ancestors, and whether they are a real issue.

Extract per-ancestor work into _build_one_ancestor() and submit it to a thread pool during Pass 2. A SynchronousExecutor (matching bio2zarr's pattern) is used when num_threads<=0 so threaded and non-threaded paths share the same code. Futures are consumed in submission order to ensure deterministic output. The --threads CLI flag is wired through to infer_ancestors for both infer-ancestors and run commands.
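As a rough illustration of the shared code path described above, the following is a minimal sketch of a synchronous executor, assuming it subclasses `concurrent.futures.Executor`. The names `make_executor`, `square`, and `run` are hypothetical and are not taken from tsinfer or bio2zarr.

```python
import concurrent.futures as cf


class SynchronousExecutor(cf.Executor):
    """Run each submitted task immediately in the calling thread."""

    def submit(self, fn, /, *args, **kwargs):
        future = cf.Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            # Store the error so it surfaces from future.result(), as with a pool.
            future.set_exception(exc)
        return future


def make_executor(num_threads):
    # Hypothetical helper: run in the calling thread when num_threads <= 0.
    if num_threads <= 0:
        return SynchronousExecutor()
    return cf.ThreadPoolExecutor(max_workers=num_threads)


def square(x):
    # Stand-in for the per-ancestor work.
    return x * x


def run(num_threads):
    with make_executor(num_threads) as executor:
        futures = [executor.submit(square, j) for j in range(5)]
        # Consuming in submission order keeps the output deterministic.
        return [f.result() for f in futures]
```

Because both executors expose the same `submit` interface, the calling code does not need a separate non-threaded branch.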

Add INFO logging throughout infer_ancestors for source path, store dimensions, active filters, sample selection, site stats, position range, threading mode, and per-interval ancestor counts. Add DEBUG logging to AncestorWriter for chunk flush timing and finalize timing. Suppress zarr/numcodecs debug logs in CLI logging setup.
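One way the log suppression could look, as a sketch using only the standard `logging` module. The `setup_logging` helper and its verbosity mapping are assumptions for illustration, not the actual CLI code.

```python
import logging


def setup_logging(verbosity=0):
    # Hypothetical mapping: 0 -> WARNING, 1 -> INFO, 2 or more -> DEBUG.
    level = {0: logging.WARNING, 1: logging.INFO}.get(verbosity, logging.DEBUG)
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # Keep chatty third-party loggers quiet even when the root logger is at DEBUG.
    for name in ("zarr", "numcodecs"):
        logging.getLogger(name).setLevel(logging.WARNING)
    return level
```

Child loggers such as `zarr.storage` inherit the WARNING level from their parent unless they set their own.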

Null out each future in the list after its result is consumed so the AncestorResult and its large haplotype array can be garbage collected immediately rather than being kept alive for the entire interval.
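A minimal sketch of the null-out pattern, assuming a plain list of futures consumed in submission order. `build` and `consume_in_order` are hypothetical stand-ins for the per-ancestor work and the Pass 2 loop.

```python
import concurrent.futures as cf


def build(index):
    # Stand-in for per-ancestor work; the bytearray plays the haplotype array.
    return index, bytearray(1024)


def consume_in_order(num_tasks, num_threads=2):
    written = []
    with cf.ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = [executor.submit(build, j) for j in range(num_tasks)]
        for j in range(len(futures)):
            index, payload = futures[j].result()
            written.append(index)
            # Drop the list's reference so the future and its result can be
            # garbage collected before the interval finishes.
            futures[j] = None
            del payload
    return written
```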

codecov · 2026-03-12T12:47:25Z

Codecov Report

❌ Patch coverage is 92.69663% with 13 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (dev-1.0@f9fb2fc).

Additional details and impacted files

@@            Coverage Diff             @@
##             dev-1.0    #1124   +/-   ##
==========================================
  Coverage           ?   84.30%           
==========================================
  Files              ?       16           
  Lines              ?     4193           
  Branches           ?      683           
==========================================
  Hits               ?     3535           
  Misses             ?      530           
  Partials           ?      128

Flag	Coverage Δ
C	`81.57% <ø> (?)`
c-python	`58.31% <ø> (?)`
python-tests	`89.38% <92.69%> (?)`

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
Python API	`89.38% <0.00%> (?)`
Python C interface	`66.66% <0.00%> (?)`
C library	`88.68% <0.00%> (?)`

Make index a required parameter of add_ancestor and use a pending dict to buffer out-of-order arrivals, draining contiguously into the write buffer. Remove the all-missing ancestor guard from _build_one_ancestor since the C engine guarantees focal sites are always set, and use the C-returned start/end directly. Add thorough tests for indexed insertion across chunk boundaries, multiple intervals, and shuffled orders.
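The buffering idea can be sketched as follows. `IndexedWriter` is a hypothetical, simplified stand-in for the index-aware part of the writer; chunking and flushing are left out.

```python
class IndexedWriter:
    """Accept items in any order, emit them in index order."""

    def __init__(self):
        self.next_index = 0  # next index the write buffer is waiting for
        self.pending = {}    # out-of-order arrivals, keyed by index
        self.buffer = []     # items in final, contiguous order

    def add(self, index, item):
        if index < self.next_index or index in self.pending:
            raise ValueError(f"index {index} was already added")
        self.pending[index] = item
        # Drain every contiguous index that is now available.
        while self.next_index in self.pending:
            self.buffer.append(self.pending.pop(self.next_index))
            self.next_index += 1
```

For example, adding indexes 2, 0, 1 leaves the buffer empty after the first call, then holds one item after the second, and all three in order after the third.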

Use concurrent.futures.as_completed instead of consuming futures in submission order. The AncestorWriter's index-aware pending dict ensures deterministic output regardless of completion order, while as_completed gives better throughput by processing results as soon as they are ready.

Use a set for futures and discard each after its result is consumed, so the AncestorResult and its haplotype array can be garbage collected immediately rather than being retained for the entire interval.
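A small sketch of completion-order consumption with per-future discard. `build` and `process` are hypothetical; in the PR the results would go to the index-aware writer rather than into a dict.

```python
import concurrent.futures as cf


def build(index):
    # Stand-in for per-ancestor work.
    return index, index * index


def process(num_tasks, num_threads=2):
    results = {}
    with cf.ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = {executor.submit(build, j) for j in range(num_tasks)}
        # as_completed works on its own copy of the set, so discarding from
        # ours while iterating is safe.
        for future in cf.as_completed(futures):
            index, value = future.result()
            results[index] = value
            # Release our reference as soon as the result is consumed.
            futures.discard(future)
    return results, len(futures)
```

Keying each result by its index is what makes the output independent of completion order.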

Revert lazy future approach which reintroduced memory leak. Restore eager SynchronousExecutor with as_completed and per-future discard. Add CLI warning that ancestor-level progress requires threads >= 1.

Wire the existing C-level genotype_encoding flag through AncestorsConfig and into the AncestorBuilder constructor. TOML config accepts both integer (0/1) and string ("eight_bit"/"one_bit") values. Tests verify one-bit produces the same ancestor set as eight-bit across single/multi-interval, threaded, many-sample, and small-chunk scenarios.
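Accepting both forms of the setting might look like the sketch below. The function name is hypothetical, and the mapping of `"eight_bit"` to 0 and `"one_bit"` to 1 is an assumption inferred from the order in which the values are listed above.

```python
# Assumed mapping from config strings to the integer flag values.
GENOTYPE_ENCODINGS = {"eight_bit": 0, "one_bit": 1}


def normalise_genotype_encoding(value):
    """Return the integer flag for an int or str config value."""
    if isinstance(value, bool):
        # bool is a subclass of int, so reject it explicitly.
        raise TypeError("genotype_encoding must be an int or str")
    if isinstance(value, int):
        if value not in GENOTYPE_ENCODINGS.values():
            raise ValueError(f"unknown genotype_encoding {value}")
        return value
    if isinstance(value, str):
        if value not in GENOTYPE_ENCODINGS:
            raise ValueError(f"unknown genotype_encoding {value!r}")
        return GENOTYPE_ENCODINGS[value]
    raise TypeError("genotype_encoding must be an int or str")
```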

Report ab.mem_size in the per-interval info log so the builder's direct allocation can be compared against process RSS to isolate memory growth.

Replace AncestorResult with Ancestor dataclass that stores only the active haplotype fragment (start:end site range) instead of a full num_sites-length array. The full haplotype is expanded on demand at flush time in AncestorWriter. Also simplify AncestorWriter.add_ancestor to accept a single Ancestor object, and cap in-flight futures to 2*num_threads using wait(FIRST_COMPLETED) following the upstream threaded_map pattern.
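The two ideas in this commit can be sketched together: a fragment-only record expanded on demand, and a cap on in-flight futures. This is an illustration under stated assumptions, not the tsinfer implementation: the field names, the `MISSING` marker, the list-based haplotype, and `bounded_map` are all hypothetical.

```python
import concurrent.futures as cf
from dataclasses import dataclass

MISSING = -1  # assumed marker for sites outside the fragment


@dataclass
class Ancestor:
    index: int
    start: int
    end: int
    haplotype: list  # only sites start..end-1 are stored

    def full_haplotype(self, num_sites):
        # Expand to a full-length haplotype only when it is needed.
        full = [MISSING] * num_sites
        full[self.start:self.end] = self.haplotype
        return full


def build(index, num_sites):
    # Stand-in for per-ancestor work.
    start = index % num_sites
    end = min(num_sites, start + 3)
    return Ancestor(index, start, end, [1] * (end - start))


def bounded_map(num_tasks, num_sites, num_threads=2):
    max_in_flight = 2 * num_threads
    results, peak = {}, 0
    with cf.ThreadPoolExecutor(max_workers=num_threads) as executor:
        pending = set()
        for j in range(num_tasks):
            # Block until a slot frees up before submitting more work.
            while len(pending) >= max_in_flight:
                done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
                for future in done:
                    ancestor = future.result()
                    results[ancestor.index] = ancestor
            pending.add(executor.submit(build, j, num_sites))
            peak = max(peak, len(pending))
        # Drain whatever is still running after the last submission.
        for future in cf.as_completed(pending):
            ancestor = future.result()
            results[ancestor.index] = ancestor
    return results, peak
```

Capping submissions bounds how many results can exist at once, and storing only the fragment keeps each of them small.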

…act helpers

- Remove add_terminal_site() calls and +1 max_sites padding; per-interval builders naturally stop at the last site without a sentinel
- Replace conditional tqdm imports with module-level import and disable=not progress; remove interval-level bar, add genotype-loading bar
- Extract _open_source, _resolve_haplotype_count, _apply_site_stats, and _process_interval from infer_ancestors (~280 -> ~60 lines of orchestration)
- Add time.monotonic() timing for Pass 1, Pass 2, and total elapsed
- Make infer_ancestors accept only Source (not bare zarr.Group); tests use _src() helper to wrap in-memory groups

The resource module is Unix-only and causes ImportError on Windows. Use psutil (now a declared dependency) for portable RSS reporting.
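Portable RSS reporting with psutil could be sketched as below. The `rss_bytes` helper is hypothetical, and the `ImportError` fallback is only there so the example runs without psutil installed; the PR itself declares psutil as a dependency.

```python
import os


def rss_bytes():
    """Return the resident set size of this process in bytes, or None."""
    try:
        import psutil
    except ImportError:
        # Fallback for this sketch only.
        return None
    return psutil.Process(os.getpid()).memory_info().rss
```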

jeromekelleher added 3 commits March 12, 2026 12:01

Release futures after consumption to avoid retaining haplotype arrays

00dd229

jeromekelleher added 12 commits March 14, 2026 20:38

Fix memory leak by discarding futures after consumption

a057d04

Warn when --progress is used without --threads

67b79a6

Report memory usage

32f7a03

Memory debugging info

e4de44a

Log AncestorBuilder memory usage per interval

5a3651b

Add annotated example_config

4b58398

Replace resource module with psutil for cross-platform memory reporting

de055ca

jeromekelleher force-pushed the ancestors-parallel branch from e1cc135 to de055ca Compare March 15, 2026 22:07

jeromekelleher merged commit c519b0f into tskit-dev:dev-1.0 Mar 15, 2026
13 checks passed

jeromekelleher deleted the ancestors-parallel branch March 15, 2026 22:12

jeromekelleher merged 15 commits into tskit-dev:dev-1.0 from jeromekelleher:ancestors-parallel
