IssunDB · habedi · Jun 21, 2026 · Jun 19, 2026 · Jun 20, 2026 · Jun 20, 2026
diff --git a/.gitignore b/.gitignore
@@ -10,6 +10,7 @@ venv/
 data/
 work/
 results/
+tmp/
 
 # Other files and directories to ignore
 .DS_Store
@@ -19,3 +20,6 @@ results/
 .codex
 .idea/
 .vscode/
+*.mdb
+*.idb
+*.lbdb
diff --git a/AGENTS.md b/AGENTS.md
@@ -29,7 +29,7 @@ Priorities, in order:
 - Use Oxford commas in inline lists: "a, b, and c" not "a, b, c".
 - Do not use em dashes. Restructure the sentence, or use a colon or semicolon instead.
 - Avoid colorful adjectives and adverbs. Write "instruction decoder" not "elegant instruction decoder".
-- Use noun phrases for checklist items, not imperative verbs. Write "opcode timing table" not "build the opcode timing table".
+- Prefer noun phrases over imperative verbs for checklist items. Write "opcode timing table" not "build the opcode timing table".
 - Headings in Markdown files must be in title case: "Build from Source" not "Build from source". Minor words (a, an, the, and, but, or, for, in, on,
   at, to, by, of) stay lowercase unless they are the first word.
 

diff --git a/Makefile b/Makefile
@@ -1,7 +1,7 @@
-SCALE ?= 10000
+SCALE ?= 100000
 SEED ?= 0
 ENGINES ?= issundb,ladybug,lance-graph,neo4j
-SCALES ?= 1000,10000,100000
+SCALES ?= 10000,100000,300000
 MIN_ROUNDS ?= 20
 TIME_BUDGET ?= 2.0
 WARMUP ?= 3
@@ -10,10 +10,10 @@ WARMUP ?= 3
 
 help:
 	@echo "graphbench targets:"
-	@echo "  make gen SCALE=10000      # Generate a synthetic graph dataset (SCALE=number of Person nodes)"
+	@echo "  make gen SCALE=100000      # Generate a synthetic graph dataset (SCALE=number of Person nodes)"
 	@echo "  make engines               # List which graph database engines are available"
 	@echo "  make run [ENGINES=issundb,ladybug]  # Run the benchmark for specified engines (default: all)"
-	@echo "  make sweep [SCALES=1000,10000,100000]  # Benchmark a series of scales and plot scaling curves"
+	@echo "  make sweep [SCALES=10000,100000,300000]  # Benchmark a series of scales and plot scaling curves"
 	@echo "  make report                # Generate a report from the results in the results/ directory"
 	@echo "  make test                  # Run the unit tests"
 	@echo "  make neo4j-up or neo4j-down # Start and stop the Neo4j container"

diff --git a/README.md b/README.md
@@ -17,6 +17,10 @@ against IssunDB.
 | 3 | **Lance-graph** | [lancedb/lance-graph](https://github.com/lancedb/lance-graph) |
 | 4 | **Neo4j**       | [neo4j.com](https://neo4j.com/)                               |
 
+> [!NOTE]
+> Technically `Lance-graph` is not a graph database, but an in-memory graph query engine over Apache Arrow tables.
+> In this repository, the terms `engine` and `graph engine` refer to `Lance-graph` as well as the graph databases in the table above.
+
 ### Schema and Queries
 
 #### Benchmark Graph Dataset
@@ -57,36 +61,28 @@ See the [query definitions](graphbench/queries.py) for more details.
 
 ### Methodology
 
-The benchmarks are created, so the published numbers are reproducible and hard to manipulate.
-That's achieved by:
-
-- **Engine-independent correctness oracle.** Every query is independently re-implemented in
-  [`graphbench/oracle.py`](graphbench/oracle.py) with polars over the raw Parquet dataset. Each engine's
-  result rows (over several parameter instantiations) are diffed against the oracle; no engine, including
-  IssunDB, is ever used as the reference. Mismatches are reported, never silently omitted from timing.
-- **Process isolation.** Each engine is built and timed in its own worker process
-  ([`graphbench/_worker.py`](graphbench/_worker.py)), so heap state, allocator fragmentation, and caches
-  never leak between engines, and the peak RSS reported per engine is attributable to that engine alone.
-- **Statistics.** Per query: a cold run (first execution after build) is reported separately; timed rounds
-  run with the garbage collector disabled until both a minimum round count and a time budget are met; the
-  report shows median latency with a distribution-free 95% confidence interval for the median (the
-  order-statistic method, not a normal approximation on the mean), and the plot carries p25 to p75 whiskers.
-- **Honest comparisons.** Engines are labeled by kind (embedded / in-memory / client-server) and ingestion
-  method; load times are never ranked across kinds, the client-server network round-trip caveat is stated
-  in every report, and Neo4j's server memory settings are captured from the live server into the results.
-- **Indexing differences.** Index models differ by engine and cannot be fully equalized: IssunDB
-  auto-indexes every scalar property, Neo4j uses a uniqueness index on `id` plus an explicit range index
-  on the filtered column, Ladybug indexes only its primary key, and lance-graph holds no index. The report
-  spells this out so a filtered-query result is read as the engine's indexing model, not raw speed alone.
-- **Determinism.** The dataset is generated from a single seed, byte-for-byte reproducible, with edge rows
-  shuffled so no engine gains a locality advantage from sorted insertion order. Hardware (CPU model, cores,
-  and RAM) is recorded in every result file.
-- **Scaling.** `make sweep` benchmarks a series of dataset scales and plots median latency vs scale per
-  query, so results are never a single-scale snapshot.
-
-Known limitations (deliberately out of scope so far): the suite measures single-threaded read-only latency;
-no concurrent throughput and no write/update workloads. Engines may not all support every query
-(e.g. variable-length patterns); unsupported queries show as `ERR` in the report rather than being dropped.
+To ensure reproducible, objective, and comparable performance metrics, the benchmark suite follows these practices:
+
+- **Correctness Oracle**: Every query is re-implemented in [`graphbench/oracle.py`](graphbench/oracle.py) using Polars. Engine result rows are diffed
+  against this oracle to verify correctness before timing, and mismatching queries fail validation.
+- **Process Isolation**: Each engine executes queries in a dedicated worker process ([`graphbench/_worker.py`](graphbench/_worker.py)) to prevent
+  cache, allocator, and heap contamination.
+- **Statistical Rigor**: Query timing runs with the garbage collector disabled until a minimum round count and a time budget are met. Reports display
+  the median latency, a distribution-free 95% confidence interval, and p25 to p75 error bars. Cold runs are measured and reported separately.
+- **Categorization**: Engines are categorized by architecture (embedded, in-memory, or client-server) and ingestion method. Latency reports include
+  network round-trip caveats for client-server engines and log live server settings.
+- **Index Disclosure**: Engine index models are documented (IssunDB auto-indexes properties, Neo4j uses range indexes, LadybugDB indexes only its
+  primary key, and Lance-graph holds no index) to provide context for query latency differences.
+- **Determinism**: Datasets are generated from a single seed, and edge rows are shuffled to eliminate insertion-order locality benefits. CPU, core
+  count, and RAM specifications are saved with every run.
+- **Multi-Scale Scaling**: The suite measures scaling characteristics by running a sweep across dataset sizes rather than relying on a single-point
+  snapshot.
+
+#### Scope and Limitations
+
+The suite currently measures single-threaded read-only latency.
+Concurrent throughput, write workloads, and update workloads are out of scope.
+Unsupported queries are reported as errors rather than being omitted.
 
 > [!IMPORTANT]
 > Benchmarking different systems (with different design philosophies, architectures, feature sets, etc.) is not straightforward and is tricky.

diff --git a/graphbench/engines/issundb_engine.py b/graphbench/engines/issundb_engine.py
@@ -36,7 +36,7 @@ def __init__(self, schema: Schema, workdir: Path):
         self._db_path = workdir / "social.issundb"
         if self._db_path.exists():
             shutil.rmtree(self._db_path)
-        self._db = IssunDB(str(self._db_path))
+        self._db = IssunDB(str(self._db_path), map_size_gb=16)
 
     @classmethod
     def probe(cls) -> EngineInfo:

diff --git a/graphbench/engines/neo4j_engine.py b/graphbench/engines/neo4j_engine.py
@@ -72,7 +72,7 @@ def probe(cls) -> EngineInfo:
     def _reset(self) -> None:
         # Batched delete so a wipe at large scales does not exhaust the heap.
         self._session.run(
-            "MATCH (n) CALL { WITH n DETACH DELETE n } IN TRANSACTIONS OF 50000 ROWS"
+            "MATCH (n) CALL (n) { DETACH DELETE n } IN TRANSACTIONS OF 50000 ROWS"
         )
         for label in self.schema.nodes:
             self._session.run(

diff --git a/pyproject.toml b/pyproject.toml
@@ -8,15 +8,15 @@ dependencies = [
     "pyarrow>=17.0",
     "matplotlib>=3.9",
     "polars>=1.0",
-    "issundb>=0.1.0a8",
+    "issundb>=0.1.0a9",
     "patchelf>=0.17.2",
 ]
 
 [project.optional-dependencies]
 ladybug = ["ladybug>=0.17"]
 lance-graph = ["lance-graph>=0.5"]
 neo4j = ["neo4j>=5.0"]
-all = ["ladybug>=0.15", "lance-graph>=0.5", "neo4j>=5.0"]
+all = ["ladybug>=0.17", "lance-graph>=0.5", "neo4j>=5.0"]
 
 [dependency-groups]
 dev = ["pytest>=8.0"]

diff --git a/uv.lock b/uv.lock