BE-429: HashQL: Add island dependency graph with data requirement resolution#8501
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
PR SummaryMedium Risk Overview Introduces Reviewed by Cursor Bugbot for commit 69a8667. Bugbot is set up for automated code reviews on this repo. Configure here. |
🤖 Augment PR SummarySummary: Adds an island dependency graph and scheduler to the HashQL MIR execution engine, including automatic data requirement resolution across targets. Changes:
Technical Notes: Data island selection uses the first set origin target (backend-priority order), and schedule levels are derived from predecessor constraints to enable safe parallel execution. 🤖 Was this summary useful? React with 👍 or 👎 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## bm/be-428-hashql-simplify-traversal-tracking-to-path-recording #8501 +/- ##
==================================================================================================
+ Coverage 62.57% 62.75% +0.18%
==================================================================================================
Files 1322 1326 +4
Lines 134818 135540 +722
Branches 5508 5526 +18
==================================================================================================
+ Hits 84365 85063 +698
- Misses 49542 49563 +21
- Partials 911 914 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Benchmark results
|
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1
|
Flame Graph |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba
|
Flame Graph |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
922e42b to
572197c
Compare
7014842 to
e6439e0
Compare
572197c to
74ce977
Compare
e6439e0 to
2346ece
Compare
74ce977 to
b8fe72b
Compare
2346ece to
5e31131
Compare
|
Deployment failed with the following error: |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix prepared fixes for all 3 issues found in the latest run.
- ✅ Fixed: Cycles silently drop scheduled islands
- Added a full-node-count assertion in
schedule_inso cyclic control-flow now fails loudly instead of returning a partial schedule.
- Added a full-node-count assertion in
- ✅ Fixed: Range bound semantics are lost
- Updated
IdVec::copy_withinto preserveBound::{Included,Excluded,Unbounded}when converting typed ID bounds tousizebounds.
- Updated
- ✅ Fixed: Requirement resolution assumes acyclic control flow
- Replaced DFS-postorder-based ordering with Kahn topological ordering plus a cycle assertion in island requirement resolution.
Or push these changes by commenting:
@cursor push 27aca44a02
Preview (27aca44a02)
diff --git a/libs/@local/hashql/core/src/id/vec.rs b/libs/@local/hashql/core/src/id/vec.rs
--- a/libs/@local/hashql/core/src/id/vec.rs
+++ b/libs/@local/hashql/core/src/id/vec.rs
@@ -6,7 +6,7 @@
fmt::{self, Debug},
hash::{Hash, Hasher},
marker::PhantomData,
- ops::{Deref, DerefMut, RangeBounds},
+ ops::{Bound, Deref, DerefMut, RangeBounds},
slice,
};
@@ -510,8 +510,16 @@
where
T: Copy,
{
- let start = src.start_bound().copied().map(Id::as_usize);
- let end = src.end_bound().copied().map(Id::as_usize);
+ let start = match src.start_bound() {
+ Bound::Included(&bound) => Bound::Included(bound.as_usize()),
+ Bound::Excluded(&bound) => Bound::Excluded(bound.as_usize()),
+ Bound::Unbounded => Bound::Unbounded,
+ };
+ let end = match src.end_bound() {
+ Bound::Included(&bound) => Bound::Included(bound.as_usize()),
+ Bound::Excluded(&bound) => Bound::Excluded(bound.as_usize()),
+ Bound::Unbounded => Bound::Unbounded,
+ };
self.raw.copy_within((start, end), dst.as_usize());
}
@@ -796,3 +804,23 @@
Self::from_raw(Vec::from_iter_in(iter, alloc))
}
}
+
+#[cfg(test)]
+mod tests {
+ use super::IdVec;
+ use crate::id::{Id as _, newtype};
+
+ newtype!(#[id(crate = crate)] struct TestId(u32 is 0..=32));
+
+ #[test]
+ fn copy_within_preserves_inclusive_end_bound() {
+ let mut vec = IdVec::<TestId, i32>::from_raw(alloc::vec![10, 20, 30]);
+
+ vec.copy_within(
+ TestId::from_usize(1)..=TestId::from_usize(1),
+ TestId::from_usize(0),
+ );
+
+ assert_eq!(vec.as_slice().as_raw(), &[20, 20, 30]);
+ }
+}
diff --git a/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs b/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
--- a/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
+++ b/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
@@ -20,7 +20,7 @@
#[cfg(test)]
pub(crate) mod tests;
-use alloc::alloc::Global;
+use alloc::{alloc::Global, collections::VecDeque};
use core::{
alloc::Allocator,
ops::{Index, IndexMut},
@@ -29,11 +29,10 @@
use hashql_core::{
debug_panic,
graph::{
- DirectedGraph, EdgeId, LinkedGraph, NodeId, Predecessors, Successors, Traverse as _,
+ DirectedGraph, EdgeId, LinkedGraph, NodeId, Predecessors, Successors,
algorithms::{Dominators, dominators},
linked::Edge,
},
- heap::CollectIn as _,
id::{
HasId as _, Id as _,
bit_vec::{BitMatrix, DenseBitSet},
@@ -243,13 +242,39 @@
where
S: Allocator + Clone,
{
- let mut topo: Vec<IslandId, _> = self
- .inner
- .depth_first_forest_post_order()
- .map(|node| IslandId::new(node.as_u32()))
- .collect_in(scratch.clone());
- topo.reverse();
+ #[expect(clippy::cast_possible_truncation)]
+ let node_count = self.node_count();
+ let mut in_degree = IslandVec::from_elem_in(0_u32, node_count, scratch.clone());
+ for (island_id, _) in self.iter_nodes() {
+ in_degree[island_id] = self.predecessors(island_id).count() as u32;
+ }
+ let mut queue: VecDeque<IslandId, _> = VecDeque::new_in(scratch.clone());
+ for (island_id, _) in self.iter_nodes() {
+ if in_degree[island_id] == 0 {
+ queue.push_back(island_id);
+ }
+ }
+
+ let mut topo = Vec::with_capacity_in(node_count, scratch.clone());
+ while let Some(island_id) = queue.pop_front() {
+ topo.push(island_id);
+
+ for successor in self.successors(island_id) {
+ in_degree[successor] -= 1;
+
+ if in_degree[successor] == 0 {
+ queue.push_back(successor);
+ }
+ }
+ }
+
+ assert_eq!(
+ topo.len(),
+ node_count,
+ "island requirement resolution requires acyclic control flow",
+ );
+
let start = self.lookup[BasicBlockId::START];
RequirementResolver::new(self, start, scratch).resolve(&topo);
@@ -379,7 +404,7 @@
}
fn resolve(mut self, topo: &[IslandId]) {
- // Iterate in reverse for topological order
+ // Iterate in topological order so dominators are processed before dependents.
for &island_id in topo {
let island = &self.graph[island_id];
let IslandKind::Exec(_) = &island.kind else {
diff --git a/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs b/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
--- a/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
+++ b/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
@@ -122,6 +122,12 @@
}
}
+ assert_eq!(
+ entries.len(),
+ node_count,
+ "island schedule requires acyclic control flow",
+ );
+
entries.sort_by_key(|entry| entry.level);
IslandSchedule { entries }
}This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
| } | ||
|
|
||
| entries.sort_by_key(|entry| entry.level); | ||
| IslandSchedule { entries } |
There was a problem hiding this comment.
Cycles silently drop scheduled islands
High Severity
schedule_in uses Kahn’s algorithm but never checks whether all nodes were emitted. If the IslandGraph contains a control-flow cycle, queue can start empty or stall, and entries omits islands. The function still returns a seemingly valid IslandSchedule, which can cause missing execution steps.
Additional Locations (1)
5e31131 to
b32bd45
Compare
b8fe72b to
69a8667
Compare




🌟 What is the purpose of this PR?
This PR implements the island dependency graph construction and scheduling system for the HashQL MIR execution engine. It builds a directed graph over computation islands, resolves data requirements between islands, and computes a topological schedule with parallelism levels for execution.
🔗 Related links
🔍 What does this change?
first_set()method toFiniteBitSet- Returns the first set bit usingtrailing_zeros(), with comprehensive test coverage for empty sets, single bits, multiple bits, and wide integral typesIdVecwith new utility methods - Addsfrom_raw(),from_domain_derive(),extend_from_slice(),append(),into_iter_enumerated(), andcopy_within()methods with detailed documentation and examplesIslandGraph) - Creates a directed graph overIslandNodes connected by three edge types:ControlFlow(execution ordering),DataFlow(data dependencies), andInherits(path inheritance between same-target dominators)IslandSchedule) - Computes topological ordering with parallelism levels using Kahn's algorithm, allowing islands at the same level to execute concurrentlyoption_into_flat_iterfeature - Adds the unstable feature flag for iterator functionalityPre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🛡 What tests cover this?
FiniteBitSet::first_set()covering empty sets, single/multiple bits, edge cases❓ How to test this?
cargo test