Add Prometheus translation strategy support #8346
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Codecov Report
@@             Coverage Diff              @@
##               main    #8346      +/-   ##
============================================
- Coverage     91.04%   90.98%   -0.06%
- Complexity     7822     7898     +76
============================================
  Files           893      894      +1
  Lines         23721    23884    +163
  Branches       2364     2393     +29
============================================
+ Hits          21596    21732    +136
- Misses         1407     1421     +14
- Partials        718      731     +13
psx95 left a comment
Also might wanna look at this relevant spec issue: open-telemetry/opentelemetry-specification#5062
            new LinkedBlockingQueue<>(),
            new DaemonThreadFactory("prometheus-http-server"));
    }
    HTTPServer.Builder httpServerBuilder = HTTPServer.builder();
Question: Based on the "Interaction with Translation Strategy" spec, I was expecting to see some content negotiation, basically reading accept headers somewhere.
let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.
I revisited this against the clarification in open-telemetry/opentelemetry-specification#5134 and added parameterized coverage in PrometheusHttpServerTest to exercise translation strategy vs negotiated escaping. The new cases show the current behavior matches the clarified model: permissive strategies like NO_TRANSLATION still get restricted by the negotiated Accept escaping, while stricter configured translation is not undone by a more permissive request. This is now covered by commit eef409b.
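The interaction described above can be sketched roughly as follows. This is only an illustration, not the exporter's code: `EscapingSketch`, its simplified two-value `Strategy` enum, and both helper methods are hypothetical, and the fallback to `underscores` when no `escaping=` parameter is present is an assumption. Only the `escaping=underscores` and `escaping=allow-utf-8` values and the `NO_TRANSLATION` name come from this thread.

```java
import java.util.Locale;

public final class EscapingSketch {
  /** Simplified, hypothetical stand-in for the configured translation strategy. */
  public enum Strategy { UNDERSCORE_ESCAPING, NO_TRANSLATION }

  /**
   * Reads the escaping parameter from an Accept value that holds a single media type.
   * Falling back to "underscores" when the parameter is absent is an assumption.
   */
  public static String negotiatedEscaping(String acceptHeader) {
    for (String part : acceptHeader.split(";")) {
      String param = part.trim().toLowerCase(Locale.ROOT);
      if (param.startsWith("escaping=")) {
        return param.substring("escaping=".length());
      }
    }
    return "underscores";
  }

  /**
   * UTF-8 names survive only if the strategy is permissive AND the client allows UTF-8.
   * A stricter strategy is never undone by a more permissive request.
   */
  public static String exposedName(String name, Strategy strategy, String acceptHeader) {
    boolean clientAllowsUtf8 = negotiatedEscaping(acceptHeader).equals("allow-utf-8");
    boolean keepUtf8 = strategy == Strategy.NO_TRANSLATION && clientAllowsUtf8;
    return keepUtf8 ? name : name.replaceAll("[^a-zA-Z0-9_:]", "_");
  }
}
```

With this model, a dotted name such as `http.server.duration` is left untouched only for `NO_TRANSLATION` combined with `escaping=allow-utf-8`; every other combination yields `http_server_duration`.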
    assertThat(response.status()).isEqualTo(HttpStatus.OK);
    assertThat(response.headers().get(HttpHeaderNames.CONTENT_TYPE))
        .isEqualTo("application/openmetrics-text; version=1.0.0; charset=utf-8");
Shouldn't the accept header here also have the escaping scheme like shown here?
let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.
Yep — I added explicit escaping= variants to the OpenMetrics negotiation tests so we now cover the spec-relevant request shape directly instead of relying only on the generic application/openmetrics-text header. See the new parameterized cases in PrometheusHttpServerTest from commit eef409b.
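For reference, a request of the shape discussed here could be built like this with the JDK's `java.net.http` API. This is a sketch rather than the test code from the PR: the class name `ScrapeRequestSketch`, the `localhost:9464` address, and the `/metrics` path are assumptions for illustration. Building the request does not open a connection.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public final class ScrapeRequestSketch {
  /** Builds (but does not send) a scrape request that names an escaping scheme. */
  public static HttpRequest openMetricsRequest(String escaping) {
    return HttpRequest.newBuilder(URI.create("http://localhost:9464/metrics")) // assumed endpoint
        .header(
            "Accept",
            "application/openmetrics-text; version=1.0.0; escaping=" + escaping)
        .GET()
        .build();
  }
}
```

Calling `openMetricsRequest("underscores")` or `openMetricsRequest("allow-utf-8")` gives the two request variants mentioned in the thread.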
@@ -239,6 +239,50 @@ void fetchOpenMetrics() {
            + "# EOF\n");
  }
I think there should be more tests that exercise behavior with other possible escaping schemes.
The spec only mentions two in the examples, but I imagine that's because it's meant to be a non-exhaustive list.
let's park this until open-telemetry/opentelemetry-specification#5062 is resolved.
Agreed. I added parameterized coverage for the key combinations from the clarified spec / content-negotiation behavior: permissive strategy + escaping=underscores, permissive strategy + escaping=allow-utf-8, and stricter configured translation + permissive request. That gives us direct coverage for the non-default escaping-scheme behavior we rely on here. Commit eef409b.
jack-berg left a comment
Some small comments and ideas. Thanks!
    throw new IllegalStateException("Unknown strategy: " + translationStrategy);
  }

  private static MetricMetadata convertMetadataEscapedWithSuffixes(MetricData metricData) {
I find these convert methods in the weeds and hard to follow / verify. Going to have to trust that you've done the research.
Could help improve the readability by:
- Computing the MetricMetadata constructor params in a consistent order the same as MetricMetadata accepts (i.e. name, expositionName, originalName, help, unit)
- Using variable names which are the same as the param names of MetricMetadata
- Always using the same constructor overload so it's explicit which things we're setting to null
For example, applied to this method it might look like:
  private static MetricMetadata convertMetadataEscapedWithSuffixes(MetricData metricData) {
    String name = convertLegacyMetricName(metricData.getName());
    name = stripReservedMetricSuffixes(name);
    String expositionBaseName = name;
    String originalName = name;
    String help = metricData.getDescription();
    Unit unit = PrometheusUnitsHelper.convertUnit(metricData.getUnit());
    if (unit != null && !name.endsWith(unit.toString())) {
      name = name + "_" + unit;
    }
    validateNormalizedMetricName(metricData.getName(), name);
    return new MetricMetadata(name, expositionBaseName, originalName, help, unit);
  }
Something to think about to make it easier for humans to reason about / maintain.
Applied to convertMetadataEscapedWithSuffixes (consistent ordering, single 5-arg MetricMetadata ctor) per your example. Left the other three strategy methods as-is for now since they read OK to me; happy to apply the same shape across all four if you'd prefer consistency. Commit 957e2e7.
Hmm... let me think about this more.
Circling back on this one: I went ahead and applied the consistency cleanup across the strategy helpers while keeping the current metadata semantics intact. The methods now compute the metadata pieces in a consistent shape (name, expositionBaseName, originalName, help, unit) before building MetricMetadata, which made them much easier to audit. That landed in commit eef409b.
- Replace IllegalArgumentException control flow in PrometheusUnitsHelper.sanitizeUnitName with @Nullable return; drop the try/catch in unitOrNull.
- Extract doConvert helper so convert is just the IAE try/catch boundary.
- Inline the getMergeKey ternary at the putOrMerge call site.
- Reorder convertMetadataEscapedWithSuffixes for readability and use the explicit 5-arg MetricMetadata constructor.

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Also bumps protobuf-bom 4.34.1 -> 4.35.0 to match the gencode shipped inside prometheus-metrics-exposition-formats 1.7.0 (runtime must be >= gencode version). Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
9e0b450 to 9c20343
## Summary
- Adds `MetricMetadata.builder()` with `name`, `help`, `unit`, `counterSuffix` fields
- Builder appends unit to the base name when absent, and appends `_total` to `expositionBaseName` when `counterSuffix=true`
- Deprecates the 4-arg and 5-arg constructors; internal callers (`MetricMetadataSupport`, `MetricMetadata.escape`) suppress the warning
- Updates `docs/apidiffs/current_vs_latest/prometheus-metrics-model.txt`

## Motivation
The OTel exporter ([opentelemetry/opentelemetry-java#8346](open-telemetry/opentelemetry-java#8346)) needs to express per-strategy counter intent without pre-computing `expositionBaseName` manually. The builder encapsulates that logic and provides a cleaner public API for any downstream adapter that constructs `MetricMetadata` directly.

## Test plan
- [ ] `MetricMetadataTest`: 8 new builder tests covering no unit, unit absent/present, counter suffix, counter + unit, UTF-8 name, non-counter, name-required validation
- [ ] Existing 4-arg/5-arg constructor tests annotated with `@SuppressWarnings("deprecation")`

---------

Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
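The naming rules listed in that summary can be modeled in a few lines. The class below is a hypothetical stand-in and not the `MetricMetadata` builder from the Prometheus client: `MetadataSketch` and its `build` method are invented for illustration, and the `endsWith` check for detecting an existing unit suffix is an assumption borrowed from the reviewer's example earlier in this thread.

```java
public final class MetadataSketch {
  public final String name;
  public final String expositionBaseName;

  private MetadataSketch(String name, String expositionBaseName) {
    this.name = name;
    this.expositionBaseName = expositionBaseName;
  }

  /** Hypothetical model of the described rules; unit may be null. */
  public static MetadataSketch build(String baseName, String unit, boolean counterSuffix) {
    if (baseName == null || baseName.isEmpty()) {
      throw new IllegalArgumentException("name is required");
    }
    String name = baseName;
    // Append the unit only when the name does not already end with it.
    if (unit != null && !name.endsWith(unit)) {
      name = name + "_" + unit;
    }
    // Counters get _total on the exposition base name only.
    String expositionBaseName = counterSuffix ? name + "_total" : name;
    return new MetadataSketch(name, expositionBaseName);
  }
}
```

Under these assumptions, `build("requests", "seconds", true)` yields the name `requests_seconds` and the exposition base name `requests_seconds_total`, while `build("latency_seconds", "seconds", false)` leaves both as `latency_seconds`.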
Will undraft once the new prom client release has been published.
@psx95 please have another look
…slation-strategy # Conflicts: # dependencyManagement/build.gradle.kts
psx95 left a comment
Thanks for following up on the updated spec issue, @zeitlinger!
Overall changes LGTM, I just left a couple of small suggestions and mostly verified the translation logic by looking at the tests.
    }
  }

  private static Stream<Arguments> translationStrategyContentNegotiationArgs() {
Should we add a test case for unrecognized escaping as well?
The prometheus escaping schemes also contain dots and values which are not supported on the OTel side IIUC, but I imagine someone could send an accept header with such escaping as well.
The Prometheus Java library owns content negotiation for the escaping scheme — unrecognized values fall back to its default behavior. Testing that here would be testing the library rather than OTel translation logic, so I'll leave it out for now.
…n tests Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
Summary
- `TranslationStrategy` support to the Prometheus exporter builder and declarative config
- `client_java` to `1.6.1`, which provides the released naming support this needs
- `prometheus/otlptranslator` behavior for invalid characters, repeated underscores, and digit-leading labels

Notes
- `prometheus/otlptranslator` preserves labels normalized to `__...__`; Prometheus Java rejects user labels starting with `__`, so those labels are collapsed to a valid single-underscore form instead.

References
- otlptranslator: https://github.com/prometheus/otlptranslator

Test plan
- ./gradlew :exporters:prometheus:test \
    --tests io.opentelemetry.exporter.prometheus.PrometheusHttpServerTest.fetchOpenMetrics \
    --tests io.opentelemetry.exporter.prometheus.PrometheusHttpServerTest.fetchOpenMetrics_translationStrategyEnablesOm2 \
    --tests io.opentelemetry.exporter.prometheus.PrometheusMetricReaderTest \
    --tests io.opentelemetry.exporter.prometheus.Otel2PrometheusConverterTest \
    --tests io.opentelemetry.exporter.prometheus.internal.PrometheusMetricReaderProviderTest \
    :sdk-extensions:declarative-config:test \
    --tests io.opentelemetry.sdk.autoconfigure.declarativeconfig.MetricReaderFactoryTest.create_PullPrometheusConfigured

Resolves #8195