Refactor low-level code #830

Draft
jeromekelleher wants to merge 6 commits into tskit-dev:main from jeromekelleher:lshmm-directly-on-ts

@jeromekelleher
Member

Just posting here for discussion, definitely not ready for merging any time soon.

@benjeffery
Member

Looking good on a quick look through! Will have a proper look soon.

@jeromekelleher
Member Author

No rush - I'm just pushing here to be able to refer to it and for discussion when we're interested.

@codecov
codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 0% with 667 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (f087068) to head (3cbd349).

❗ There is a different number of reports uploaded between BASE (f087068) and HEAD (3cbd349).

HEAD has 7 uploads less than BASE

Flag          BASE (f087068)  HEAD (3cbd349)
c-python      1               0
python-tests  6               0

@@            Coverage Diff             @@
##             main    #830       +/-   ##
==========================================
- Coverage   86.44%   0.00%   -86.45%     
==========================================
  Files          18       6       -12     
  Lines        5651    2637     -3014     
  Branches      896     640      -256     
==========================================
- Hits         4885       0     -4885     
- Misses        593    2637     +2044     
+ Partials      173       0      -173     

Flag          Coverage Δ
C             0.00% <0.00%> (-81.58%) ⬇️
c-python      ?
python-tests  ?

Components          Coverage Δ
Python API          ∅ <ø> (∅)
Python C interface  ∅ <ø> (∅)
C library           0.00% <0.00%> (-88.69%) ⬇️

@jeromekelleher
Member Author

I'm bringing this branch up to date now with the intention of merging in as soon as it's usable.

@jeromekelleher force-pushed the lshmm-directly-on-ts branch 3 times, most recently from a774ed3 to 679cd8c (March 26, 2026 17:06)
from tree_sequence_builder and working directly on the tree sequence
instead.

Initial dump of required HMM code

First steps - isolated code works with shim Builder class

Some more tests

Remove TreeSequenceBuilder

Fake the sortedcontainers API for now

Fully move to sorted lists of edges.

Stuff

Partial

Abstract out the MatcherIndexes class

Pull out stuff necessary for static match

Compiling version without tree_sequence_builder

Partway through getting C tests running

Add test data

C code working and some tests

Factor out the "output" struct in matcher

Some basics for new Matcher infrastructure

Compiling with lwt interface

Partial update to use table collection

Convert C code to use tables

Roughly working Python-C infrastructure

Add basic debug support to the MatcherIndexes

Basic Python-C linkage works :hooray:

Basic high-level infrastructure for matching

Rename file

Refactor the Matcher infrastructure

Improve class infrastructure

Add vestigial root automatically

Fix up tests to remove hard-coded virtual root

Work on making matcher work with edges not on site values

Roughly working version with edges on genome coords

Python version looks like it's working

Add sites_position storage and coordinate_t type

Matching working in C (looks like)

Infer start and end from haplotype

Rough implementation of flank skipping

Change python code to use coords in path

Implement coordinate paths in C

Implement some kludges to support initial zero-site path

Minor updates

Sort-of working driver script

Fiddle with some tricky issues

Rename the test file to avoid collisions
16 tests across 4 hand-crafted tree topologies (star, binary, two-tree,
deep chain) with exact path and mutation assertions from the current
AncestorMatcher. These serve as regression tests when swapping to
MatcherIndexes / AncestorMatcher2.

Update the AncestorMatcher2 Python wrapper so find_path accepts the
same (h, start, end, match_out) arguments and returns the same
(left, right, parent) tuple as the current AncestorMatcher.
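
The wrapper change described above is essentially an adapter: keep the old find_path(h, start, end, match_out) call shape and the (left, right, parent) return value, and delegate to the new matcher underneath. A minimal sketch of that pattern follows. It is not the code in this PR: Segment, ToyInnerMatcher, its match method and FindPathAdapter are all hypothetical names invented for illustration, and plain Python lists stand in for whatever array types the real wrapper uses.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Segment:
    # Hypothetical path segment: copy from `parent` over [left, right).
    left: int
    right: int
    parent: int


class ToyInnerMatcher:
    """Hypothetical stand-in for a new-style matcher; always copies from node 0."""

    def match(
        self, h: Sequence[int], start: int, end: int
    ) -> Tuple[List[Segment], List[int]]:
        path = [Segment(start, end, 0)]
        matched = list(h[start:end])
        return path, matched


class FindPathAdapter:
    """Exposes an old-style find_path signature on top of a new-style matcher."""

    def __init__(self, inner: ToyInnerMatcher) -> None:
        self.inner = inner

    def find_path(self, h, start, end, match_out):
        path, matched = self.inner.match(h, start, end)
        # Write the matched haplotype into the caller-supplied buffer.
        match_out[start:end] = matched
        # Unpack the segments into the three parallel sequences callers expect.
        left = [seg.left for seg in path]
        right = [seg.right for seg in path]
        parent = [seg.parent for seg in path]
        return left, right, parent


adapter = FindPathAdapter(ToyInnerMatcher())
out = [-1] * 4
left, right, parent = adapter.find_path([0, 1, 1, 0], 1, 3, out)
```

Keeping the call shape identical means regression tests written against the old matcher can be pointed at the new one without changing their assertions.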

Remove Path, Match, find_match and zero_sites_path from matching.py.
Move Path and Match to test_lshmm.py where they are used by the
Python reference implementation.