Okapi Performance Benchmarks

This directory contains benchmark methodology, run instructions, and historical results.

Running benchmarks

All benchmarks live in the okapi-benchmarks module and use JMH via the me.champeau.jmh Gradle plugin.

Full baseline (production-quality numbers)

Default JMH config in okapi-benchmarks/build.gradle.kts uses:

fork = 2 — isolated JVMs to neutralize JIT-profile variance
warmupIterations = 3, warmup = 10s — let JIT C2 settle
iterations = 5, timeOnIteration = 30s — statistically meaningful sample
-Xms2g -Xmx2g -XX:+UseG1GC — pinned memory and GC for reproducibility

./gradlew :okapi-benchmarks:jmh

Wall time: ~30 minutes (Testcontainers spin-up + 2 transports × 3 batchSize values × 8 iterations per fork, i.e. 3 warmup + 5 measurement).
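
As a rough cross-check of the wall-time figures, the measurement time for one parameter combination can be estimated from the JMH settings alone. The sketch below is illustrative arithmetic, not part of the benchmark module; the class and method names are made up for this example, and it ignores JVM start-up, Testcontainers spin-up, and per-benchmark setup.

```java
/** Rough estimate of JMH measurement time; ignores JVM start-up and container spin-up. */
final class WallTimeEstimate {

    /** Seconds spent on one parameter combination: every fork repeats warmup and measurement. */
    static long secondsPerCombination(int forks, int warmupIterations, long warmupSeconds,
                                      int iterations, long iterationSeconds) {
        long perFork = warmupIterations * warmupSeconds + iterations * iterationSeconds;
        return forks * perFork;
    }

    public static void main(String[] args) {
        // Defaults from build.gradle.kts: fork = 2, 3 x 10s warmup, 5 x 30s measurement.
        long full = secondsPerCombination(2, 3, 10, 5, 30);
        // Quick smoke run flags: -f 1 -wi 1 -i 2 -w 10s -r 15s.
        long quick = secondsPerCombination(1, 1, 10, 2, 15);
        System.out.println("full baseline: " + full + " s per combination");
        System.out.println("quick smoke:   " + quick + " s per combination");
    }
}
```

With the defaults this gives 360 s per combination, and 40 s with the quick smoke run flags. The total scales with the number of benchmark and parameter combinations JMH expands, so treat the result as a floor rather than a prediction.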

Result JSON: okapi-benchmarks/build/reports/jmh/results.json

Quick smoke run

For development iteration when you don't need statistically significant numbers:

./gradlew :okapi-benchmarks:jmhJar
java -jar okapi-benchmarks/build/libs/okapi-benchmarks-jmh.jar \
  "ThroughputBenchmark" -f 1 -wi 1 -i 2 -w 10s -r 15s

Wall time: ~5-8 minutes.

Single benchmark

java -jar okapi-benchmarks/build/libs/okapi-benchmarks-jmh.jar \
  "KafkaThroughputBenchmark" -p batchSize=50 -f 1 -wi 1 -i 2

What we measure

Throughput benchmarks (`*ThroughputBenchmark`)

End-to-end pipeline: insert N PENDING entries, then call OutboxProcessor.processNext() in a tight loop until drained. The OutboxScheduler is bypassed deliberately — we measure processing capacity, not polling cadence (which is a deployment-time knob).

KafkaThroughputBenchmark — real Postgres + real Kafka via Testcontainers
HttpThroughputBenchmark — real Postgres + WireMock HTTP target with @Param httpLatencyMs injecting 0/20/100 ms server-side delay (library-only ceiling vs realistic webhook)

Reported as ops/s where one op = one delivered message (via @OperationsPerInvocation).
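
The drain-until-empty shape described above can be sketched in plain Java. This is a minimal illustration under assumptions, not the benchmark's actual code: the `Processor` interface, its `int` return value, and the in-memory fake are hypothetical stand-ins, and the real OutboxProcessor.processNext() signature may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for the outbox; the real OutboxProcessor API may differ. */
final class DrainLoopSketch {

    /** Assumed contract: process up to one batch of pending entries, return how many were delivered. */
    interface Processor {
        int processNext();
    }

    /** In-memory fake: a queue of pending entries, "delivered" by removing them. */
    static final class InMemoryProcessor implements Processor {
        private final Deque<String> pending = new ArrayDeque<>();
        private final int batchSize;

        InMemoryProcessor(int entries, int batchSize) {
            this.batchSize = batchSize;
            for (int i = 0; i < entries; i++) {
                pending.add("entry-" + i);
            }
        }

        @Override
        public int processNext() {
            int delivered = 0;
            while (delivered < batchSize && !pending.isEmpty()) {
                pending.poll();
                delivered++;
            }
            return delivered;
        }
    }

    /** Calls processNext() in a tight loop until nothing is left; returns the total delivered. */
    static int drainAll(Processor processor) {
        int total = 0;
        int delivered;
        while ((delivered = processor.processNext()) > 0) {
            total += delivered;
        }
        return total;
    }

    public static void main(String[] args) {
        Processor processor = new InMemoryProcessor(1_000, 50);
        long start = System.nanoTime();
        int delivered = drainAll(processor);
        double seconds = (System.nanoTime() - start) / 1e9;
        // One drain is one invocation; dividing messages by elapsed time gives msg/s,
        // which is the scaling @OperationsPerInvocation applies to the JMH score.
        System.out.printf("delivered %d messages, %.0f msg/s%n", delivered, delivered / seconds);
    }
}
```

Because there is no scheduler in the loop, the elapsed time covers only processing work, which is why the result reflects capacity rather than polling cadence.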

Microbenchmarks (`DelivererMicroBenchmark`)

Single-entry deliver() calls with mocked I/O:

Kafka: MockProducer with auto-complete (no broker)
HTTP: WireMock on loopback

Measures pure code overhead (JSON deserialization, record/request construction, exception classification). Useful as a "did optimization X regress the hot path?" baseline.

How to read results

Throughput benchmarks report ops/s = msg/s thanks to @OperationsPerInvocation.

Benchmark                                  (batchSize)   Mode  Cnt   Score   Error  Units
KafkaThroughputBenchmark.drainAll                   10  thrpt    5  450.2 ± 18.3  ops/s
KafkaThroughputBenchmark.drainAll                   50  thrpt    5  890.5 ± 22.1  ops/s
KafkaThroughputBenchmark.drainAll                  100  thrpt    5  920.7 ± 31.4  ops/s

The Score is the headline number. The Error is the half-width of a 99.9% confidence interval. A tight error (< 5% of score) means the result is trustworthy; a wide error means you should run more iterations or investigate sources of variability (background processes, thermal throttling, GC).
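
The 5% rule of thumb is easy to apply mechanically. The helper below is a small sketch (class and method names are made up for this example) that computes the relative error for the sample rows above; all three come out under 5%.

```java
/** Applies the "error under 5% of score" rule of thumb to JMH result rows. */
final class ScoreErrorCheck {

    /** Error as a fraction of the score, e.g. 0.04 for 4%. */
    static double relativeError(double score, double error) {
        return error / score;
    }

    /** True when the error is below 5% of the score. */
    static boolean isTight(double score, double error) {
        return relativeError(score, error) < 0.05;
    }

    public static void main(String[] args) {
        // Score and Error columns from the sample table (batchSize 10, 50, 100).
        double[][] rows = { {450.2, 18.3}, {890.5, 22.1}, {920.7, 31.4} };
        for (double[] row : rows) {
            System.out.println(row[0] + " +/- " + row[1]
                    + " -> relative error " + relativeError(row[0], row[1])
                    + ", tight: " + isTight(row[0], row[1]));
        }
    }
}
```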

Caveats — important for honest reporting

Localhost Testcontainers ≠ production. Kafka container on the same host has ~0.5ms RTT; a real cluster typically has 5-50ms. Real-world throughput will be 2-10× lower than these benchmarks suggest. Treat numbers as upper bounds for the library's processing capacity.
HTTP benchmark uses WireMock in-JVM, which adds ~0.3 ms overhead per request (Jetty servlet pipeline). At httpLatencyMs=0 the measurement reflects "library + DB + WireMock overhead", not pure library throughput. For a tighter measurement, consider replacing WireMock with MockWebServer (Square); see the benchmark methodology notes in results-baseline-2026-04.md.
httpLatencyMs is server-side delay, not network RTT. Real production webhook latency is dominated by the target service's processing time + network — pick the value closest to your target. With httpLatencyMs=100, sequential delivery is bounded at 1000ms / 100ms = 10 msg/s/thread regardless of library efficiency.
Single-threaded scheduler. Current OutboxSchedulerConfig does not expose concurrency. Once that lands (planned), the throughput matrix will expand to batchSize × concurrency.
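
The sequential bound from the httpLatencyMs caveat can be written out as a one-line formula. The sketch below is illustrative only: the class name and the `threads` parameter are assumptions for this example, and the linear scaling with threads presumes perfect parallelism, which the current single-threaded scheduler does not offer.

```java
/** Upper bound on sequential delivery when each message waits for a fixed server-side delay. */
final class SequentialBound {

    /**
     * Maximum messages per second given a per-message latency in milliseconds.
     * Library, database, and mock-server overhead are ignored, so real numbers will be lower.
     */
    static double maxMessagesPerSecond(double latencyMs, int threads) {
        if (latencyMs <= 0) {
            // With no injected delay the latency term imposes no bound.
            return Double.POSITIVE_INFINITY;
        }
        return threads * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        for (double latencyMs : new double[] {20, 100}) {
            System.out.println("httpLatencyMs=" + latencyMs + " -> at most "
                    + maxMessagesPerSecond(latencyMs, 1) + " msg/s per thread");
        }
    }
}
```

For httpLatencyMs=100 and one thread this reproduces the 10 msg/s ceiling quoted above; a measured score close to that ceiling means the delay, not the library, is the limiting factor.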

Historical baselines

results-baseline-2026-04.md — pre-optimization baseline (sync sequential delivery, single-threaded scheduler)
