Problem
78 reaches globally are labeled type=6 (ghost) but have topology neighbors in both directions (n_rch_up > 0 AND n_rch_down > 0). Ghost reaches should only exist at network boundaries (headwater or outlet), not midstream.
These are misclassified — they carry real data (width, WSE, facc, slope) and sit between real reaches. Many are tiny slivers (62–790 m long, median ~200 m).
Distribution
| Region | Count |
| --- | --- |
| AS | 27 |
| NA | 20 |
| EU | 12 |
| SA | 13 |
| AF | 5 |
| OC | 1 |
Characteristics
- 67/78 have end_reach=0 (midstream)
- 27/78 have lakeflag=1 (lake), likely lake/river boundary classification artifacts
- All have valid width, facc, dist_out
- Named rivers include Amazon (3,060 m wide!), Wisconsin, Grijalva, Elk, Mattagami, Savannah, Spree
Impact
- These break path_segs contiguity (N016 lint check): a path_segs=-9999 ghost in the middle of a segment splits it into two disconnected fragments. All 51 global N016 violations trace back to midstream ghosts.
- Every lint check needs type NOT IN (5, 6) exclusions to avoid false positives from these.
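To illustrate the contiguity problem, here is a minimal sketch that counts how many disconnected runs each path_segs id forms along one flow path. The actual N016 implementation is not shown in this issue, so the function name `fragments_per_segment` and the flat list-of-ids input are assumptions made only for illustration.

```python
from itertools import groupby

NODATA = -9999  # path_segs value carried by ghost reaches, per the issue text


def fragments_per_segment(path_segs_along_flow):
    """Count contiguous runs per path_segs id along an ordered chain of reaches.

    A healthy segment forms exactly one run; more than one run means the
    segment has been split into disconnected fragments.
    """
    counts = {}
    for seg_id, _run in groupby(path_segs_along_flow):
        if seg_id == NODATA:
            continue  # the ghost itself belongs to no segment
        counts[seg_id] = counts.get(seg_id, 0) + 1
    return counts


# Hypothetical chain: segment 12 interrupted by one midstream ghost.
broken = fragments_per_segment([12, 12, NODATA, 12, 12])
intact = fragments_per_segment([12, 12, 12, 12])
print(broken)  # segment 12 appears as two fragments
print(intact)  # segment 12 appears as one fragment
```

Any segment whose count exceeds 1 would be reported under this simplified model, which matches the failure mode described above.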
Detection
SELECT reach_id, region, river_name, width, reach_length, lakeflag, end_reach
FROM reaches
WHERE type = 6 AND n_rch_up > 0 AND n_rch_down > 0
ORDER BY region, river_name
Also available via workflow.find_incorrect_ghost_reaches().
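The query above can be tried end to end with a small sketch. It uses Python's standard-library sqlite3 only so that it runs standalone; the database engine actually used by the project is not stated here. The table layout is reduced to the columns the query touches, and the four rows are invented toy data, not real reaches. `find_midstream_ghosts` and `make_toy_db` are hypothetical helper names, not the signature of workflow.find_incorrect_ghost_reaches().

```python
import sqlite3

DETECT_SQL = """
SELECT reach_id, region, river_name, width, reach_length, lakeflag, end_reach
FROM reaches
WHERE type = 6 AND n_rch_up > 0 AND n_rch_down > 0
ORDER BY region, river_name
"""


def find_midstream_ghosts(conn):
    """Return ghost reaches (type=6) that have both upstream and downstream neighbors."""
    return conn.execute(DETECT_SQL).fetchall()


def make_toy_db():
    """Build an in-memory table with invented rows (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE reaches (
            reach_id INTEGER PRIMARY KEY, region TEXT, river_name TEXT,
            width REAL, reach_length REAL, lakeflag INTEGER, end_reach INTEGER,
            type INTEGER, n_rch_up INTEGER, n_rch_down INTEGER)"""
    )
    rows = [
        (1, "NA", "Toy River", 80.0, 10000.0, 0, 0, 1, 1, 1),  # ordinary river reach
        (2, "NA", "Toy River", 75.0, 200.0, 0, 0, 6, 1, 1),    # midstream ghost
        (3, "NA", "Toy River", 60.0, 300.0, 0, 1, 6, 0, 1),    # boundary ghost; end_reach is a placeholder
        (4, "AS", "Toy Lake", 900.0, 150.0, 1, 0, 6, 1, 1),    # midstream ghost on a lake
    ]
    conn.executemany("INSERT INTO reaches VALUES (?,?,?,?,?,?,?,?,?,?)", rows)
    return conn


toy = make_toy_db()
flagged_ids = [row[0] for row in find_midstream_ghosts(toy)]
print(flagged_ids)
```

Only reaches 2 and 4 are returned: reach 3 is a ghost with no upstream neighbor, so it sits at a network boundary and is left alone.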
Proposed fix
Reclassify based on lakeflag:
lakeflag=0 → type=1 (river)
lakeflag=1 → type=3 (lake_on_river)
lakeflag=2 → type=1 (canal, keep as river type)
After reclassification, re-run the v17c pipeline on the affected regions to recompute path_freq, stream_order, and path_segs for the formerly ghost reaches.
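The lakeflag mapping can be sketched as a small pure function. This is an illustration of the proposal, not the project's code: `reclassify` and the dict-based reach record are assumptions, and any lakeflag value not listed in the proposal is deliberately left unchanged so it can be reviewed by hand.

```python
GHOST_TYPE = 6

# Proposed mapping from the issue: lakeflag -> new type
LAKEFLAG_TO_TYPE = {
    0: 1,  # river
    1: 3,  # lake_on_river
    2: 1,  # canal, kept as river type
}


def reclassify(reach):
    """Return the proposed type for one reach record (a plain dict).

    Only midstream ghosts are changed; boundary ghosts and non-ghost
    reaches keep their current type.
    """
    is_midstream_ghost = (
        reach["type"] == GHOST_TYPE
        and reach["n_rch_up"] > 0
        and reach["n_rch_down"] > 0
    )
    if not is_midstream_ghost:
        return reach["type"]
    # Unlisted lakeflag values fall back to the current type (no change).
    return LAKEFLAG_TO_TYPE.get(reach["lakeflag"], reach["type"])


# Hypothetical records for illustration only.
river_ghost = {"type": 6, "n_rch_up": 1, "n_rch_down": 1, "lakeflag": 0}
lake_ghost = {"type": 6, "n_rch_up": 2, "n_rch_down": 1, "lakeflag": 1}
outlet_ghost = {"type": 6, "n_rch_up": 1, "n_rch_down": 0, "lakeflag": 0}

print(reclassify(river_ghost), reclassify(lake_ghost), reclassify(outlet_ghost))
```

Keeping the mapping in one dictionary makes it easy to audit against the proposal, and the function never touches ghosts at headwaters or outlets.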