Add CI Visibility instrumentation for JMH benchmarks by robertpi · Pull Request #11498 · DataDog/dd-trace-java

robertpi · 2026-05-29T08:37:28Z

Summary

Adds CI Visibility support for JMH (Java Microbenchmark Harness, org.openjdk.jmh), the dominant Java microbenchmarking framework. Benchmark runs are now reported as test spans in the Datadog test explorer with performance metrics attached.

Closes SDTEST-930.

How it works

JMH's OutputFormat interface receives lifecycle callbacks exactly once per benchmark method. We instrument BaseRunner.<init> with bytecode advice to wrap the user's OutputFormat in our DDOutputFormat decorator. This is the only hook needed, and it adds no overhead on the benchmark hot path.
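As a rough illustration of the decorator idea, here is a minimal sketch. `BenchmarkOutput`, `TracingOutput`, and `NoopOutput` are hypothetical names, and the interface is a trimmed-down stand-in rather than JMH's real OutputFormat, which has more callbacks and different parameter types. It only shows the shape: forward every callback to the delegate, and do tracing work only on the once-per-benchmark callbacks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JMH's OutputFormat (assumption: the real
// interface has more callbacks and richer parameter types).
interface BenchmarkOutput {
  void startBenchmark(String benchmarkName);

  void iteration(String benchmarkName, int iteration);

  void endBenchmark(String benchmarkName, double score);
}

// Decorator: every callback is forwarded to the wrapped output; span events
// are recorded only on the start/end callbacks, never per iteration.
final class TracingOutput implements BenchmarkOutput {
  private final BenchmarkOutput delegate;
  final List<String> events = new ArrayList<>();

  TracingOutput(BenchmarkOutput delegate) {
    this.delegate = delegate;
  }

  @Override
  public void startBenchmark(String benchmarkName) {
    events.add("start:" + benchmarkName);
    delegate.startBenchmark(benchmarkName);
  }

  @Override
  public void iteration(String benchmarkName, int iteration) {
    // Pure pass-through: no tracing work on the per-iteration path.
    delegate.iteration(benchmarkName, iteration);
  }

  @Override
  public void endBenchmark(String benchmarkName, double score) {
    try {
      events.add("end:" + benchmarkName + ":" + score);
    } finally {
      // The user's output always sees the callback, even if tracing fails.
      delegate.endBenchmark(benchmarkName, score);
    }
  }
}

// Trivial delegate used only to exercise the decorator.
final class NoopOutput implements BenchmarkOutput {
  @Override
  public void startBenchmark(String benchmarkName) {}

  @Override
  public void iteration(String benchmarkName, int iteration) {}

  @Override
  public void endBenchmark(String benchmarkName, double score) {}
}
```

In the PR itself the wrapping happens through bytecode advice on the BaseRunner constructor; the sketch leaves that part out and only covers the delegation.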

Each benchmark method produces:

A suite span (test_suite_end) for the benchmark class
A test span (test) for the benchmark method

With benchmark-specific metric tags on the test span:

| Tag | Source |
| --- | --- |
| `benchmark.value` | Aggregated primary score |
| `benchmark.error` | 99.9% CI half-width |
| `benchmark.unit` | e.g. `"ns/op"`, `"ops/ms"` |
| `benchmark.run.mode` | e.g. `"avgt"`, `"thrpt"` |
| `benchmark.run.iterations` | Measurement iteration count |
| `benchmark.run.warmup_iterations` | Warmup iteration count |
| `benchmark.run.forks` | Fork count |
| `benchmark.run.threads` | Thread count |
| `benchmark.run.time_unit` | e.g. `"NANOSECONDS"` |
| `benchmark.p50/p90/p95/p99` | Percentiles (when N > 1) |
| `benchmark.min` / `benchmark.max` | Bounds |
| `benchmark.sample_count` | Total samples |
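To make the sample-derived tags concrete, here is a minimal sketch of how such a tag map could be assembled from raw samples. `BenchmarkTags` is a hypothetical helper, the individual percentile tag names (`benchmark.p50` and so on) are assumed from the table's shorthand, and the nearest-rank percentile is an assumption too; the values JMH itself reports may be computed differently.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR.
final class BenchmarkTags {
  // Nearest-rank percentile over an ascending-sorted array (assumption:
  // JMH's own statistics may interpolate instead).
  static double percentile(double[] sorted, double p) {
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  static Map<String, Object> fromSamples(double[] samples, String unit) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("at least one sample is required");
    }
    double[] sorted = samples.clone();
    Arrays.sort(sorted);

    Map<String, Object> tags = new LinkedHashMap<>();
    tags.put("benchmark.unit", unit);
    tags.put("benchmark.sample_count", sorted.length);
    tags.put("benchmark.min", sorted[0]);
    tags.put("benchmark.max", sorted[sorted.length - 1]);
    // Percentiles are only attached when there is more than one sample.
    if (sorted.length > 1) {
      for (int p : new int[] {50, 90, 95, 99}) {
        tags.put("benchmark.p" + p, percentile(sorted, p));
      }
    }
    return tags;
  }
}
```

With a single sample, the map carries the bounds and count but no percentile tags, which mirrors the "when N > 1" condition in the table.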

@Param-parameterised benchmarks follow the same test.parameters convention as JUnit 5 parameterized tests: {"metadata":{"test_name":"myMethod:size=1000"}}.
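A minimal sketch of how that test.parameters value could be built from a method name and its @Param values. `TestParameters` is a hypothetical helper; sorting the parameters by name and joining multiple parameters with a comma are assumptions, since the description only shows the single-parameter case.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the PR.
final class TestParameters {
  static String toJson(String methodName, Map<String, String> params) {
    // Assumption: parameters are sorted by name and comma-separated.
    String suffix =
        new TreeMap<String, String>(params)
            .entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
    String testName = suffix.isEmpty() ? methodName : methodName + ":" + suffix;
    return "{\"metadata\":{\"test_name\":\"" + escape(testName) + "\"}}";
  }

  // Minimal JSON string escaping for backslashes and double quotes only.
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

For the example in the description, `TestParameters.toJson("myMethod", Map.of("size", "1000"))` yields `{"metadata":{"test_name":"myMethod:size=1000"}}`.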

Changes

New module dd-java-agent/instrumentation/jmh/jmh-1.0/ — instrumentation + integration tests + fixture templates
New module dd-smoke-tests/jmh/ — end-to-end smoke test with real agent
Tags.java — 17 new benchmark.* tag constants
TestFrameworkInstrumentation — new JMH enum value
TestDecorator — new TEST_TYPE_BENCHMARK constant (for future use)
supported-configurations.json — DD_TRACE_JMH_ENABLED registered

Note on integration test style

JmhInstrumentationTest is a Groovy/Spock test extending CiVisibilityInstrumentationTest. This is an intentional exception to the JUnit 5 convention: CiVisibilityInstrumentationTest is a Spock Specification subclass whose setup()/cleanup() lifecycle cannot be triggered by the JUnit 5 runner. All other CI Visibility instrumentation tests in the codebase follow this same pattern.

Test plan

./gradlew :dd-java-agent:instrumentation:jmh:jmh-1.0:test — unit tests (JmhUtilsTest) + integration tests (simple benchmark, parameterised benchmark via fixture comparison)
./gradlew :dd-smoke-tests:jmh:test — end-to-end smoke test: forks a JVM with the agent, runs a JMH benchmark, asserts span tags and benchmark.value > 0

🤖 Generated with Claude Code

Instruments JMH's Runner constructor to wrap its OutputFormat with a DDOutputFormat decorator. The decorator fires once per benchmark method (after all forks and iterations complete) to emit CI Visibility test spans, with zero overhead on the benchmark hot path.

Each benchmark method produces a suite span + test span with benchmark metrics (score, error, unit, percentiles, run config) attached as tags. Parameterised @Param benchmarks follow the same test.parameters convention as JUnit 5 parameterized tests.

Changes:
- New module: dd-java-agent/instrumentation/jmh/jmh-1.0
- Tags.java: add benchmark.* tag constants
- TestFrameworkInstrumentation: add JMH enum value
- TestDecorator: add TEST_TYPE_BENCHMARK constant
- Design spec: docs/design/jmh-ci-visibility.md

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Groovy/Spock integration tests extending CiVisibilityInstrumentationTest that run JMH benchmarks in-process (forks=0) and verify the emitted CI Visibility spans against FTL fixture templates.

Covers:
- Simple (unparameterized) benchmark: suite + test spans with benchmark run config metrics (mode, unit, iterations, forks, threads, time_unit)
- Parameterised benchmark (@Param): two test spans with test.parameters set following the JUnit 5 convention

Also fixes:
- BaseRunner instrumented instead of Runner (JDK 17+ rejects PUTFIELD on a final field of a superclass from advice injected into the subclass)
- JMH annotation processor added to testAnnotationProcessor so that META-INF/BenchmarkList is generated at test compile time
- DD_TRACE_JMH_ENABLED registered in supported-configurations.json

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Java JUnit 5 smoke test that forks a real JVM subprocess with the dd-java-agent attached, runs a JMH benchmark in-process (forks=0) against a MockBackend, and verifies that the expected CI Visibility spans arrive with correct tags:
- test.framework = "jmh"
- test.name, test.suite, test.status
- benchmark.run.mode, benchmark.unit
- benchmark.value > 0 (measured score actually present)

The benchmark class (SmokeTestBenchmark) lives in src/main/java so the JMH annotation processor can generate META-INF/BenchmarkList at compile time, making it available on the classpath that is passed to the subprocess.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

1. splitBenchmarkName returned the full parameterised suffix as the test name (e.g. "myMethod:size=1000") instead of just the method name ("myMethod"). Fix: use baseName (param-stripped) for the method slice.
2. endBenchmark had no null guard: if called without a prior startBenchmark, the handler would receive null keys. Fix: early-return when suiteKey/testKey are null.
3. handler.close() in endRun was not in a finally block, so a crash in close() would swallow delegate.endRun(); and an exception in endBenchmark could bypass close() entirely. Fix: try/finally in both endBenchmark and endRun.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
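For illustration, fixes 1 and 3 could look roughly like the sketch below. `BenchmarkNames` and `RunCloser` are hypothetical names, and the input format (fully qualified method name followed by `:param=value`) is an assumption based on the example in the commit message; the actual code in the PR may be structured differently.

```java
// Hypothetical helpers, not the PR's actual classes.
final class BenchmarkNames {
  // Returns {suiteName, methodName}. Parameters are stripped first so that
  // a dot inside a parameter value (e.g. "size=1.5") cannot be mistaken for
  // the class/method separator.
  static String[] split(String fullName) {
    int colon = fullName.indexOf(':');
    String baseName = colon >= 0 ? fullName.substring(0, colon) : fullName;
    int lastDot = baseName.lastIndexOf('.');
    if (lastDot < 0) {
      return new String[] {"", baseName};
    }
    return new String[] {baseName.substring(0, lastDot), baseName.substring(lastDot + 1)};
  }
}

final class RunCloser {
  // try/finally guarantees the delegate callback still runs when closing
  // the handler throws; the exception then propagates to the caller.
  static void endRun(Runnable closeHandler, Runnable delegateEndRun) {
    try {
      closeHandler.run();
    } finally {
      delegateEndRun.run();
    }
  }
}
```

Here `BenchmarkNames.split("com.example.MyBench.myMethod:size=1000")` gives the suite `com.example.MyBench` and the method `myMethod`, with the parameter suffix left to test.parameters.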

datadog-datadog-prod-us1-2 · 2026-05-29T08:37:37Z

⚠️ Warnings

🚦 12 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-java | check_inst 4/4

CodeNarc rule violations found. See report at file:///go/src/github.com/DataDog/apm-reliability/dd-trace-java/workspace/dd-java-agent/instrumentation/jmh/jmh-1.0/build/reports/codenarc/test.html.

Run system tests | main / End-to-end #1 / play 1

🔄 Retry job. This looks flaky and may succeed on retry.
Error response from Docker daemon: Unable to reach the Docker registry at 'https://registry-1.docker.io/v2/'.

Run system tests | main / End-to-end #10 / spring-boot-openliberty 10

🔄 Retry job. This looks flaky and may succeed on retry.
Error response from daemon: Get "https://registry-1.docker.io/v2/": unknown

Commit SHA: f021388

robertpi · 2026-05-29T09:05:34Z

JmhInstrumentationTest.groovy is a new Groovy file, so the tag: override-groovy-enforcement label has been added as a justified exception.

CiVisibilityInstrumentationTest is a Spock Specification subclass. Its setup()/cleanup() lifecycle hooks are driven by the Spock runner and cannot be triggered by the JUnit 5 engine. A Java subclass was attempted and confirmed non-functional for this reason. All other CI Visibility instrumentation tests in the repo follow this same Groovy pattern for the same reason.

robertpi and others added 4 commits May 28, 2026 15:26

robertpi added the labels type: feature request, comp: ci visibility (Continuous Integration Visibility), and tag: ai generated (Largely based on code generated by an AI or LLM) May 29, 2026

robertpi added the label tag: override-groovy-enforcement (Override the "Enforce Groovy Migration" check) May 29, 2026

robertpi wants to merge 4 commits into master from sdtest-930-jmh-ci-visibility
