Ancestors parallel#1124
Merged
jeromekelleher merged 15 commits intotskit-dev:dev-1.0from Mar 15, 2026
Merged
Conversation
Extract per-ancestor work into _build_one_ancestor() and submit it to a thread pool during Pass 2. A SynchronousExecutor (matching bio2zarr's pattern) is used when num_threads<=0 so threaded and non-threaded paths share the same code. Futures are consumed in submission order to ensure deterministic output. The --threads CLI flag is wired through to infer_ancestors for both infer-ancestors and run commands.
Add INFO logging throughout infer_ancestors for source path, store dimensions, active filters, sample selection, site stats, position range, threading mode, and per-interval ancestor counts. Add DEBUG logging to AncestorWriter for chunk flush timing and finalize timing. Suppress zarr/numcodecs debug logs in CLI logging setup.
Null out each future in the list after its result is consumed so the AncestorResult and its large haplotype array can be garbage collected immediately rather than being kept alive for the entire interval.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## dev-1.0 #1124 +/- ##
==========================================
Coverage ? 84.30%
==========================================
Files ? 16
Lines ? 4193
Branches ? 683
==========================================
Hits ? 3535
Misses ? 530
Partials ? 128
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Make index a required parameter of add_ancestor and use a pending dict to buffer out-of-order arrivals, draining contiguously into the write buffer. Remove the all-missing ancestor guard from _build_one_ancestor since the C engine guarantees focal sites are always set, and use the C-returned start/end directly. Add thorough tests for indexed insertion across chunk boundaries, multiple intervals, and shuffled orders.
Use concurrent.futures.as_completed instead of consuming futures in submission order. The AncestorWriter's index-aware pending dict ensures deterministic output regardless of completion order, while as_completed gives better throughput by processing results as soon as they are ready.
Use a set for futures and discard each after its result is consumed, so the AncestorResult and its haplotype array can be garbage collected immediately rather than being retained for the entire interval.
Revert lazy future approach which reintroduced memory leak. Restore eager SynchronousExecutor with as_completed and per-future discard. Add CLI warning that ancestor-level progress requires threads >= 1.
Wire the existing C-level genotype_encoding flag through AncestorsConfig
and into the AncestorBuilder constructor. TOML config accepts both integer
(0/1) and string ("eight_bit"/"one_bit") values. Tests verify one-bit
produces the same ancestor set as eight-bit across single/multi-interval,
threaded, many-sample, and small-chunk scenarios.
Report ab.mem_size in the per-interval info log so the builder's direct allocation can be compared against process RSS to isolate memory growth.
Replace AncestorResult with Ancestor dataclass that stores only the active haplotype fragment (start:end site range) instead of a full num_sites-length array. The full haplotype is expanded on demand at flush time in AncestorWriter. Also simplify AncestorWriter.add_ancestor to accept a single Ancestor object, and cap in-flight futures to 2*num_threads using wait(FIRST_COMPLETED) following the upstream threaded_map pattern.
…act helpers - Remove add_terminal_site() calls and +1 max_sites padding; per-interval builders naturally stop at the last site without a sentinel - Replace conditional tqdm imports with module-level import and disable=not progress; remove interval-level bar, add genotype-loading bar - Extract _open_source, _resolve_haplotype_count, _apply_site_stats, and _process_interval from infer_ancestors (~280 -> ~60 lines of orchestration) - Add time.monotonic() timing for Pass 1, Pass 2, and total elapsed - Make infer_ancestors accept only Source (not bare zarr.Group); tests use _src() helper to wrap in-memory groups
The resource module is Unix-only and causes ImportError on Windows. Use psutil (now a declared dependency) for portable RSS reporting.
e1cc135 to
de055ca
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note to self - I'm not convinced futures approach is correct here and could lead to unbounded memory use. Need to figure out how to deal with potentially empty ancestors though and whether they are a real issue.