Skip to content

fix(contractor): insert self-loops for all needed cases, not just one-ways#7442

Merged
DennisOSRM merged 10 commits into
Project-OSRM:masterfrom
MarcelloPerathoner:contractor-rewrite
Jun 28, 2026
Merged

fix(contractor): insert self-loops for all needed cases, not just one-ways#7442
DennisOSRM merged 10 commits into
Project-OSRM:masterfrom
MarcelloPerathoner:contractor-rewrite

Conversation

@MarcelloPerathoner

Copy link
Copy Markdown
Contributor

Issue

The shortest distance between any two nodes must be invariant under contraction. We must also keep invariant the shortest loop distance from any node back to itself. This requirement arises for us from the need to "go around" if source and target are on the same node, with source downstream from target. For this reason we must insert self-loops whenever this is the shortest path from self to self.

The current implementation only inserts self-loops for one-way streets, which is insufficient. Any road can be turned into an implicit one-way by any combination of turn restrictions, bearings, or continue-straight-at-waypoints.

Tasklist

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reworks the contraction-hierarchy (CH) contractor to preserve shortest-path invariants for node-to-node distances and node-to-self loop distances by inserting self-loop shortcuts when appropriate.

Changes:

  • Refactors the contractor implementation (parallel independent-node contraction, shortcut insertion, core handling) and updates APIs/call sites accordingly.
  • Extends routing/assembly logic to consider self-loops in edge-based routing, and adds/adjusts unit + BDD tests.
  • Adds heap/storage enhancements (LinearHashStorage option, IncreaseKey support) and introduces new debug/benchmark tooling/logging.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
unit_tests/util/query_heap.cpp Expands heap storage backends tested (adds LinearHashStorage) and tweaks test sizes.
unit_tests/contractor/graph_contractor.cpp Updates tests for new contractor APIs and adds assertions for self-loop behavior.
src/tools/routed.cpp Makes std::exception handler unconditional (not Windows-only).
src/engine/routing_algorithms/routing_base_ch.cpp Adds debug logging and keeps packed-leg handling for loops.
src/contractor/graph_contractor.cpp Major contractor rewrite: parallel independent contraction, shortcut insertion, self-loop handling, updated TBB usage, instrumentation.
src/contractor/contractor.cpp Updates contraction pipeline to new contractor APIs (drops node_weights from calls).
src/contractor/contractor_search.cpp Updates contractor search to new heap data model (bool target flag) and new search signature.
scripts/debug/dump_hsgr.py New script to inspect .ebg/.hsgr and optionally emit DOT graphs.
scripts/contractor_benchmark.py Adds benchmark script documentation header.
package-lock.json Updates npm dependencies (brace-expansion, yaml).
include/util/query_heap.hpp Adds IncreaseKey support.
include/util/dynamic_graph.hpp Allows Renumber({}) to mean “compress edges only” without node renumbering.
include/util/d_ary_heap.hpp Adds heap “increase” operation.
include/engine/routing_algorithms/routing_base_ch.hpp Adjusts routing step logic to account for self-loops; adds debug logging in debug builds.
include/engine/guidance/assemble_steps.hpp Adds debug logging around RouteStep creation; adds assertion debugging.
include/engine/guidance/assemble_leg.hpp Adds util/log include (currently unused).
include/contractor/graph_contractor.hpp Updates contractor APIs (removes node_weights parameters).
include/contractor/contractor_search.hpp Updates search signature (adds start node parameter).
include/contractor/contractor_heap.hpp Simplifies heap payload to a boolean “is target” flag.
include/contractor/contractor_graph.hpp Changes edge id type to NodeID and documents originalEdges meaning.
features/support/run.js Uses close instead of exit for child process completion and clarifies log messages.
features/lib/osrm_loader.js Uses close instead of exit and logs stdout; drains streams on close in one loader.
features/foot/distance.feature Adds additional distance-metric routing assertions (more pair directions).
features/bicycle/distance.feature Adds additional distance-metric routing assertions (more pair directions).
Comments suppressed due to low confidence (1)

include/contractor/contractor_search.hpp:21

  • The parameter is named contractable here, but the implementation (src/contractor/contractor_search.cpp) uses the same argument as a “contractible” mask (nodes allowed to be traversed). Renaming this parameter to contractible (and matching call sites) would reduce confusion, especially now that both spellings appear in nearby code.
void search(ContractorHeap &heap,
            const ContractorGraph &graph,
            const NodeID start,
            const std::vector<bool> &contractable,
            const unsigned number_of_targets,
            const int node_limit,
            const EdgeWeight weight_limit,
            const NodeID forbidden_node);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +15 to +22
#include <oneapi/tbb/blocked_range.h>
#include <oneapi/tbb/concurrent_priority_queue.h>
#include <oneapi/tbb/concurrent_vector.h>
#include <oneapi/tbb/enumerable_thread_specific.h>
#include <oneapi/tbb/parallel_for.h>
#include <oneapi/tbb/parallel_for_each.h>
#include <oneapi/tbb/parallel_invoke.h>
#include <oneapi/tbb/parallel_sort.h>

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file switches to <oneapi/tbb/...> headers while the rest of the codebase still includes classic <tbb/...> (e.g. src/contractor/contractor.cpp, include/contractor/contracted_edge_container.hpp). This inconsistency can break builds depending on which TBB header layout is available. Please align to the project’s chosen include path (either keep <tbb/...> here, or migrate the rest of the repo/build config consistently).

Suggested change
#include <oneapi/tbb/blocked_range.h>
#include <oneapi/tbb/concurrent_priority_queue.h>
#include <oneapi/tbb/concurrent_vector.h>
#include <oneapi/tbb/enumerable_thread_specific.h>
#include <oneapi/tbb/parallel_for.h>
#include <oneapi/tbb/parallel_for_each.h>
#include <oneapi/tbb/parallel_invoke.h>
#include <oneapi/tbb/parallel_sort.h>
#include <tbb/blocked_range.h>
#include <tbb/concurrent_priority_queue.h>
#include <tbb/concurrent_vector.h>
#include <tbb/enumerable_thread_specific.h>
#include <tbb/parallel_for.h>
#include <tbb/parallel_for_each.h>
#include <tbb/parallel_invoke.h>
#include <tbb/parallel_sort.h>

Copilot uses AI. Check for mistakes.
Comment on lines +541 to +549
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(8000);
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;

std::size_t number_of_contracted_nodes = 0;

const unsigned int number_of_nodes = graph.GetNumberOfNodes();

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContractorHeap heap_exemplar(8000) hard-codes the backing LinearHashStorage capacity. Since the search can insert far more than 8000 distinct nodes (node_limit is based on settled nodes, while relaxations can discover many more), the fixed-size linear-probing table can fill up and then probe forever. Size this from the graph/search bounds (e.g., graph node count or a safe upper bound derived from node_limit) or use a dynamically growing storage for safety.

Suggested change
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(8000);
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;
std::size_t number_of_contracted_nodes = 0;
const unsigned int number_of_nodes = graph.GetNumberOfNodes();
const unsigned int number_of_nodes = graph.GetNumberOfNodes();
/** A heap kept in thread-local storage to avoid multiple recreations of it. */
ContractorHeap heap_exemplar(std::max<std::size_t>(1, number_of_nodes));
ThreadData thread_data(heap_exemplar);
/** Nodes still waiting for contraction. Not all of them will be contracted though. */
tbb::concurrent_vector<NodeID> remaining_nodes;
std::size_t number_of_contracted_nodes = 0;

Copilot uses AI. Check for mistakes.
Comment thread src/contractor/graph_contractor.cpp Outdated
Comment on lines +431 to +434
// ... but found edge has smaller weight, update it.
if (edge.data.weight < current_data.weight ||
edge.data.duration < current_data.duration ||
edge.data.distance < current_data.distance)

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InsertEdges treats an existing edge as a duplicate based only on direction flags and then replaces it if any of weight/duration/distance is smaller. This can select an edge with a larger routing weight just because duration or distance is smaller, changing shortest-path behavior. The duplicate-selection criteria should be based on the routing weight (and then keep the corresponding duration/distance for that same edge), rather than minimizing different metrics independently.

Suggested change
// ... but found edge has smaller weight, update it.
if (edge.data.weight < current_data.weight ||
edge.data.duration < current_data.duration ||
edge.data.distance < current_data.distance)
// ... but found edge has smaller routing weight, update it.
// Use duration and distance only as deterministic tie-breakers so
// that all metrics come from the same selected edge.
if (edge.data.weight < current_data.weight ||
(edge.data.weight == current_data.weight &&
(edge.data.duration < current_data.duration ||
(edge.data.duration == current_data.duration &&
edge.data.distance < current_data.distance))))

Copilot uses AI. Check for mistakes.
Comment on lines +31 to +32
#define SELF_LOOPS

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#define SELF_LOOPS is unconditionally enabled, which makes the #ifndef SELF_LOOPS branches dead code and hides the intended behavior behind a compile-time switch that isn’t wired to any build option. Consider removing the macro and the conditional compilation (always support self-loops), or make it a proper build/config option if it must remain togglable.

Suggested change
#define SELF_LOOPS

Copilot uses AI. Check for mistakes.
Comment on lines 156 to +158

util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debug log runs in the main step-assembly loop. Constructing util::Log(logDEBUG) always takes the global log mutex (see src/util/log.cpp:70), so this adds overhead even when debug logging is disabled. Please guard this behind #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG, or check the log policy before constructing the logger.

Suggested change
util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;
#if defined(ENABLE_DEBUG_LOGGING) || !defined(NDEBUG)
util::Log(logDEBUG) << "pushing RouteStep of duration "
<< from_alias<double>(segment_duration) / 10.;
#endif

Copilot uses AI. Check for mistakes.
Comment on lines +135 to +137
util::Log(logDEBUG) << "Search completed with middle node " << middle << " at weight "
<< weight;

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util::Log(logDEBUG) here executes on every CH search and always acquires the global log mutex during construction (even if debug is muted). This can cause a noticeable per-request performance hit. Please wrap this behind #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG, or check the logging policy before constructing the logger.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +166
{
auto log = util::Log(logDEBUG);
log << " and packed_leg of size " << packed_leg.size() << " with nodes";
for (auto node : packed_leg)
{
log << " " << node;
}
}

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block logs every node in packed_leg and constructs a util::Log(logDEBUG) unconditionally. Even with debug disabled, logger construction takes a mutex, and iterating packed_leg can be expensive. Please compile-guard this (e.g. #ifdef ENABLE_DEBUG_LOGGING / #ifndef NDEBUG) or avoid building the log message unless debug logging is actually enabled.

Copilot uses AI. Check for mistakes.
#include "engine/guidance/route_step.hpp"
#include "engine/internal_route_result.hpp"
#include "util/coordinate_calculation.hpp"
#include "util/log.hpp"

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util/log.hpp is included but not used anywhere in this header (no util::Log references). Please drop the include to avoid unnecessary rebuilds and keep header dependencies minimal.

Suggested change
#include "util/log.hpp"

Copilot uses AI. Check for mistakes.
Comment thread scripts/debug/dump_hsgr.py Outdated
@@ -0,0 +1,283 @@
"""
Contractior debugging script.

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the module docstring: “Contractior” should be “Contractor”.

Suggested change
Contractior debugging script.
Contractor debugging script.

Copilot uses AI. Check for mistakes.
Comment on lines 19 to 23
std::vector<bool> contractGraph(ContractorGraph &graph,
std::vector<bool> node_is_uncontracted,
std::vector<bool> node_is_contractable,
std::vector<EdgeWeight> node_weights,
double core_factor = 1.0);

Copilot AI Apr 3, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this API, the boolean vector is still named node_is_contractable, but the rest of the PR consistently uses “contractible” (is_contractible, node_is_contractible_, etc.). Renaming the parameter improves consistency and avoids having both spellings in the public header.

Copilot uses AI. Check for mistakes.
@MarcelloPerathoner

Copy link
Copy Markdown
Contributor Author

I did some preliminary benchmarks. The added self-loops increase the no. of edges by ~3%. That in turn increases search time by ~8%. Surprisingly the contractor now uses far less memory ~57%.

Contractor benchmark

log edges norm time (s) norm mem (MB) norm
/tmp/origin-master-germany.log 96173352 1.000 883.02 1.000 13755 1.000
/tmp/contractor-rewrite-germany.log 99390191 1.033 853.71 0.967 7836 0.570

Routed benchmark (requests routes between randomly selected big cities in Germany)

log time (ms) norm distance norm
/tmp/routed-origin-master-germany.log 2.97 1.000 323295.350 1.000
/tmp/routed-contractor-rewrite-germany.log 3.21 1.082 323295.350 1.000

@MarcelloPerathoner

Copy link
Copy Markdown
Contributor Author

These two embarassingly simple tests fail in the current implementation:

https://github.com/MarcelloPerathoner/osrm-backend/blob/c768db25513a870e4e57eee908c066bb75a6cc15/features/testbot/self_loop.feature#L6-L45

Note that there are no one-way streets involved.

To illustrate the problem, I show the contraction of the single bidirectional road segment:

With the current implementation:

dot

This contraction is clearly incorrect because it leaves no edge to exit node 1. If source and target phantoms are both on node 1, and source is downstream from target, there is no way for either forward nor backward search to leave node 1. No route can be found. (The bidirectional edge is stored on node 0 only.)

With this PR:

dot

With the added self-loop, the forward search can now leave node 1 and return to it.

@DennisOSRM

Copy link
Copy Markdown
Collaborator

Could you elaborate on the reason for this PR? I gave probably missed something, but what behavior does it correct?

@MarcelloPerathoner

Copy link
Copy Markdown
Contributor Author

The current implementation does not find routes that obviously exist. The reason is that the contractor does not always preserve the shortest path from a node back to itself.

The contractor inserts a self-loop only if the node represents a one-way street. That is insufficient, as the problem does surface most easily with one-way streets, but is not in any way limited to them.

The above mentioned test scenarios show how obvious routes are missed.

@DennisOSRM DennisOSRM changed the title Contractor rewrite fix(contractor): insert self-loops for all needed cases, not just one-ways Jun 28, 2026
LinearHashStorage requires the size to be a power of 2 (asserted in
debug builds). 8000 is not a power of 2, causing a debug-build crash
and degraded hash distribution in release builds. Replace with
HASH_MAP_CAPACITY (1<<13 = 8192) already defined in
contractor_search.hpp.
Replace hardcoded 1000/2000 with SIMULATION_SEARCH_SPACE_SIZE and
FULL_SEARCH_SPACE_SIZE from contractor_search.hpp. This ensures
changing the header constants actually affects behavior.
…flow

When the timestamp wraps from UINT_MAX to 0 via ++current_timestamp,
cells.clear() emptied the vector entirely, causing subsequent
operator[] to access elements of an empty vector (UB). Replace with
cells.assign() which reinitializes all cells while keeping the vector
sized, and reset current_timestamp to 0 to avoid matching the
HashCell default time value.
When remaining_nodes.size() < number_of_core_nodes (many
uncontractible nodes), the unsigned subtraction wrapped to a huge
value, producing misleading log output. Use a saturated subtraction to
avoid this.
Add a minimal scenario directly matching the diagram from PR Project-OSRM#7442:
a single bidirectional road between two nodes where both waypoints
snap to the same edge-based node with source downstream from target.

Without self-loops the forward search cannot leave the node, so no
route is found. Verified: test passes with SELF_LOOPS enabled,
fails with an empty route when SELF_LOOPS is disabled.
@DennisOSRM

Copy link
Copy Markdown
Collaborator

This is looking good and will go in once CI comes back green. Thanks so much for the contribution and your patience. It took a bit to get this over the finish lines.

Added a regression test: features/testbot/self_loop.feature — new scenario "Waypoints snap to same node, source downstream"

Node map (matching the PR comment diagram):

 1   2
a-----b

A single bidirectional residential road between two nodes an and b. Both waypoints are on the same (upper) side.

Test cases:

waypoints approaches expected route notes
1,2 curb curb ab,ab,ab,ab source near a, target near b: source downstream on b→a edge → needs self-loop
2,1 curb curb ab,ab Source near b, target near a: normal forward direction on b→a edge → no self-loop needed

Verification results:

  • With SELF_LOOPS (patch branch): ✅ 3 scenarios, 18 steps all pass
  • Without SELF_LOOPS (master behavior): ❌ 1,2 curb curb returns empty route — forward search cannot leave the edge-based node, no route found

@codecov

codecov Bot commented Jun 28, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.14126% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.21%. Comparing base (58fb6e6) to head (6be4caa).

Files with missing lines Patch % Lines
include/engine/guidance/assemble_steps.hpp 60.00% 2 Missing ⚠️
include/util/linear_hash_storage.hpp 0.00% 2 Missing ⚠️
src/contractor/graph_contractor.cpp 99.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7442      +/-   ##
==========================================
+ Coverage   90.80%   94.21%   +3.40%     
==========================================
  Files         484      483       -1     
  Lines       37789    37670     -119     
==========================================
+ Hits        34316    35489    +1173     
+ Misses       3473     2181    -1292     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@DennisOSRM DennisOSRM merged commit f11f4e6 into Project-OSRM:master Jun 28, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants