Add OpenTelemetry tracing across the backbeat pipeline #2733
Codecov Report
6 files with indirect coverage changes.
@@ Coverage Diff @@
## development/9.4 #2733 +/- ##
===================================================
- Coverage 74.73% 74.65% -0.08%
===================================================
Files 199 200 +1
Lines 13650 13724 +74
===================================================
+ Hits 10201 10246 +45
- Misses 3439 3468 +29
Partials 10 10
DarkIsDude
left a comment
No more for me. Code is clean and factorised. No issue found. Congrats
Pin arsenal at the 8.4.6 release tag (shared tracing module + W3C trace-context stamping on MongoDB metadata writes). Drop the SDK-core packages now that arsenal carries them as optionalDependencies, and keep the four instrumentation packages (http, ioredis, mongodb, aws-sdk) here — the consumer owns and configures them. Issue: BB-764
lib/tracing/index.js becomes a thin shim over arsenal's shared module: it carries backbeat's config (serviceName, the http/ioredis/mongodb/aws-sdk instrumentations, outbound-only HTTP via makeHttpInstrumentationConfig + disableIncomingRequestInstrumentation) so the 8 entry points keep calling init() with no args. kafkaTraceContext.js re-exports arsenal's kafka helpers so the existing require sites are unchanged. The trust-boundary filter and SDK bootstrap now live in arsenal. Issue: BB-764
Wire arsenal's tracing into the replication, lifecycle, GC, notification, and oplog-populator pods: init() at each entry point, per-pod spans, and trace-context propagation across the Kafka pipeline (producers stamp traceparent via the kafka helpers; consumers start linked spans from it). Out-of-process Kafka hops use span links, not parent/child, so traces stay bounded. Issue: BB-764
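The propagation described in this commit message can be illustrated with a small, dependency-free sketch. It is not arsenal's implementation: the helper names `stampTraceparent` and `linkFromHeaders` are hypothetical, and the header shape (a list of single-key objects holding Buffers) is an assumption about the Kafka client. Only the `traceparent` layout (`version-traceid-spanid-flags`, lowercase hex) comes from the W3C Trace Context specification.

```javascript
'use strict';

// W3C traceparent, version 00: 32 hex chars of trace ID, 16 of span ID, 2 of flags.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical producer-side helper: returns a new headers list carrying the
// given span context. The header shape is an assumption, not arsenal's API.
function stampTraceparent(headers, spanContext) {
    const flags = (spanContext.traceFlags & 0xff).toString(16).padStart(2, '0');
    const value = `00-${spanContext.traceId}-${spanContext.spanId}-${flags}`;
    return [...(headers || []), { traceparent: Buffer.from(value) }];
}

// Hypothetical consumer-side helper: turns the upstream context into a span
// link rather than a parent. Returns null when the header is missing or invalid.
function linkFromHeaders(headers) {
    const entry = (headers || []).find(h => h.traceparent !== undefined);
    if (!entry) {
        return null;
    }
    const match = TRACEPARENT_RE.exec(entry.traceparent.toString());
    if (!match) {
        return null;
    }
    const [, traceId, spanId, flags] = match;
    // All-zero IDs are invalid per the W3C specification.
    if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) {
        return null;
    }
    return {
        context: { traceId, spanId, traceFlags: parseInt(flags, 16), isRemote: true },
    };
}
```

With the OpenTelemetry API, a link of this shape would typically be passed through the `links` option when starting the consumer span, so the consumer starts its own trace and points back at the producer instead of extending the producer's trace.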
LGTM. Well-structured PR with good test coverage.
Summary
Add OpenTelemetry tracing across the backbeat pipeline, gated behind ENABLE_OTEL=true. When the flag is unset, no @opentelemetry/* package is loaded, so there is zero overhead off the OTEL path.
The SDK bootstrap, trust-boundary host filter, and Kafka trace-context helpers now live in arsenal's shared lib/tracing module (scality/Arsenal#2632, ARSN-586); backbeat consumes it through a thin shim instead of carrying its own copy. Companion to the cloudserver (#6140, CLDSRV-884) and vault (#203, VAULT-708) PRs, so all four services share one implementation.
Commits
chore: depend on arsenal OTEL tracing module
Pin arsenal at the 8.4.6 release tag (shared tracing module + the W3C trace-context stamping on MongoDB metadata writes, ARSN-572, which the Kafka pipeline relies on to continue traces across the oplog boundary). Drop the SDK-core packages now that arsenal carries them as optionalDependencies, and keep the four instrumentation packages backbeat configures itself: instrumentation-http/-ioredis/-mongodb/-aws-sdk.
feat: replace in-tree tracing with arsenal shim
lib/tracing/index.js becomes a thin shim over require('arsenal/build/lib/tracing') carrying backbeat's config in one place: serviceName: 'backbeat', the four instrumentations, and outbound-only HTTP (...makeHttpInstrumentationConfig() for the trust-boundary requestHook, plus disableIncomingRequestInstrumentation: true since backbeat pods serve no application HTTP). lib/tracing/kafkaTraceContext.js re-exports arsenal's kafka helpers so the existing require sites are unchanged. The trust-boundary filter and SDK bootstrap that used to live here are now arsenal's.
feat: instrument backbeat pods and the Kafka pipeline
Wire tracing into the replication, lifecycle, GC, notification, and oplog-populator pods: init() at each of the 8 entry points, per-pod spans, and trace-context propagation across the Kafka pipeline. Producers stamp traceparent onto message headers via the kafka helpers; consumers start a span linked to (not a child of) the upstream span. Out-of-process Kafka hops can fire long after the original request, so links keep traces bounded.
Why a shim (vs cloudserver/vault's direct calls)
backbeat has 8 init() entry points and 6 Kafka-helper require sites. The shim keeps all 14 call sites untouched and the backbeat-specific config in one file; cloudserver and vault have a single entry point each, so they deep-require arsenal directly.
Configuration
OpenTelemetry environment variables are documented in the arsenal module.
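As a rough illustration of the ENABLE_OTEL gate described in the summary, the sketch below shows one way to make sure nothing is loaded while the flag is unset. It is not backbeat's actual shim: `initTracing` and `loadTracing` are hypothetical names, and the `init(config)` call signature is an assumption.

```javascript
'use strict';

// Hypothetical gate, not backbeat's actual shim. `loadTracing` stands in for a
// lazy require() of the tracing module, so no package is loaded while disabled.
function initTracing(env, loadTracing, config) {
    if (env.ENABLE_OTEL !== 'true') {
        return false; // flag unset: the loader is never called
    }
    const tracing = loadTracing();
    tracing.init(config); // assumed signature, for illustration only
    return true;
}

// Possible call at an entry point (module path and serviceName are taken from
// the PR description; the rest is assumed):
// initTracing(process.env, () => require('arsenal/build/lib/tracing'),
//     { serviceName: 'backbeat' });
```

Passing the loader as a function keeps the require() behind the flag check, which is what makes the disabled path free of any tracing module load.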
Out of scope (follow-ups)
The trust-boundary enforcement (read OTEL_TRUSTED_HOSTS, strip traceparent on untrusted outbound requests) lives entirely in arsenal and is wired automatically. Populating that env var per deployment is routine operator config, so no backbeat-side work is needed.
Related tickets
Issue: BB-764