Enable Cloudflare tracing + bind rayId/traceparent/instanceId onto logs #111
neekolas merged 8 commits into xmtplabs:main
Chain `.child({ instanceId: event.instanceId })` on the workflow logger
so every log line emitted from any step carries the instanceId, enabling
Workers Logs correlation across replays. Services (GitHubClient,
CoderService) inherit the binding via closure — no service-signature
changes needed.
Also adds a replay-safe `logger.info("Workflow run started", { type })`
breadcrumb at the top of `run()`. This guarantees at least one
instanceId-tagged line is emitted even when all downstream side-effects
are cached in `step.do` results (Option B from the task plan — needed
because mocked-step tests never let services emit logs themselves).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
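As a rough illustration of the `.child({ instanceId })` chaining described in this commit message, here is a minimal sketch of a JSON logger with child bindings. `createLogger`, the `sink` parameter, and the sample field values are hypothetical and not taken from the PR; the repository's actual logger may be structured differently.

```typescript
// Hypothetical minimal logger; not the PR's real implementation.
type Fields = Record<string, unknown>;

interface Logger {
  info(message: string, fields?: Fields): void;
  child(bindings: Fields): Logger;
}

function createLogger(
  bindings: Fields = {},
  sink: (line: string) => void = console.log,
): Logger {
  return {
    info(message, fields = {}) {
      // Bound fields are merged into every emitted line.
      sink(JSON.stringify({ level: "info", message, ...bindings, ...fields }));
    },
    child(extra) {
      // A child inherits the parent's bindings and adds its own.
      return createLogger({ ...bindings, ...extra }, sink);
    },
  };
}

// Usage: bind instanceId once at the top of run(), then emit the breadcrumb.
const lines: string[] = [];
const base = createLogger({ service: "task-runner" }, (line) => lines.push(line));
const runLogger = base.child({ instanceId: "example-instance-id" });
runLogger.info("Workflow run started", { type: "task_requested" });
```

Because services capture the bound logger by closure, anything they log through `runLogger` would carry `instanceId` without a signature change.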
const HEX_RE = /^[0-9a-f]+$/;
const ZERO_TRACE_ID = "00000000000000000000000000000000";

export function parseTraceparent(
Cloudflare doesn't have some open source library to help with this? Feels like a thing that we shouldn't have to implement from scratch
Looked into this. Summary of what's on the shelf today:
- Cloudflare first-party: nothing for request-path code. When `[observability.traces]` is on, the runtime auto-propagates `traceparent` on outbound `fetch`, but the parsed context is only exposed to Tail Workers (`SpanContext` in workers-types is Tail-only). To put `traceId` / `spanId` into our structured logs in the `fetch` handler, we have to pull it off the incoming header ourselves.
- `@opentelemetry/core`: exports `W3CTraceContextPropagator` publicly, but to use it you build a `Context` + a `TextMapGetter` adapter just to read two hex strings. The `parseTraceParent(header)` helper that would be a clean fit is an internal export, not a public API. Plus `@opentelemetry/api` peer dep + bundle weight for ~15 lines of work.
- `tctx` (maraisr): cleanest API match (`traceparent.parse(header)`), MIT, zero deps. But ~2.5k weekly downloads, single maintainer; supply-chain trade for 20 LOC feels lopsided.
- `traceparent` (elastic): inactive, no release in the last year.
- `@microlabs/otel-cf-workers` / `cloudflare/workers-honeycomb-logger`: the Workers-community references I looked at either hand-roll extraction or go through the full OTel stack. No one's pulling a small dedicated parser.
I left the 20-line inline parser in place and added a comment at src/utils/logger.ts:60 noting the options I evaluated and why, plus a link to W3C Trace Context §3.2 so the validation rules are traceable. Happy to swap to tctx if you'd rather lean on a package — just want to flag the DL/maintainer profile first. If/when Cloudflare exposes the parsed context in the request path, the right move is to delete the helper, not to swap in a library.
Pushed in 4e37842.
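For readers without the diff open, a small inline parser along these lines could look roughly like the sketch below. It follows the validation rules listed in the PR summary (segment count, version != `"ff"`, lengths 2/32/16/2, lowercase hex, non-zero trace-id). Two points are assumptions on my part rather than confirmed behavior of the merged helper: rejecting an all-zero span id, and allowing extra trailing segments only for versions other than `"00"`.

```typescript
const HEX_RE = /^[0-9a-f]+$/;
const ZERO_TRACE_ID = "00000000000000000000000000000000";
const ZERO_SPAN_ID = "0000000000000000"; // assumption: not named in the PR

interface TraceContext {
  traceId: string;
  spanId: string;
}

function parseTraceparent(header: string | null | undefined): TraceContext | null {
  if (!header) return null;

  const parts = header.split("-");
  if (parts.length < 4) return null;
  const [version, traceId, spanId, flags] = parts as [string, string, string, string];

  // Version: two lowercase hex chars, and "ff" is explicitly invalid.
  if (version.length !== 2 || !HEX_RE.test(version) || version === "ff") return null;
  // Assumption: version 00 has exactly four segments; later versions may append more.
  if (version === "00" && parts.length !== 4) return null;

  // Trace id: 32 lowercase hex chars, not all zeros.
  if (traceId.length !== 32 || !HEX_RE.test(traceId) || traceId === ZERO_TRACE_ID) return null;
  // Span id: 16 lowercase hex chars; the all-zero check is an assumption here.
  if (spanId.length !== 16 || !HEX_RE.test(spanId) || spanId === ZERO_SPAN_ID) return null;
  // Flags: two lowercase hex chars.
  if (flags.length !== 2 || !HEX_RE.test(flags)) return null;

  return { traceId, spanId };
}
```

Returning `null` for every failure mode keeps the call site simple: the caller only needs a truthiness check before binding the fields.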
Answers a reviewer question about using a library. Cloudflare does not expose parsed traceparent to request-path user code, @opentelemetry/core is overkill and its parse helper is internal, tctx has <3k weekly DL with one maintainer, and elastic/traceparent is inactive. Comment links the W3C spec so future readers can verify the validation rules. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Will this change pass the tracing context from the main worker that receives the webhook to the workflow run that executes the work?
Answers a reviewer question: before this, only the Worker log line carried trace context — the Workflow logger only had `instanceId`, so querying by `traceId` in Workers Logs missed every step log. Now the Worker attaches the parsed `cf-ray` + `traceparent` fields onto `payload.source.trace` before calling `WORKFLOW.create`, and `TaskRunnerWorkflow.run()` spreads them onto its bound logger. A query by `traceId` now surfaces the webhook line plus every step log in the dispatched workflow run. `EventSource.trace` is optional, so existing fixtures and event construction sites that omit it continue to type-check. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
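The propagation described in this commit message can be sketched as two small functions, one on the Worker side and one on the Workflow side. The type shapes and the names `withTrace` and `workflowBindings` are hypothetical stand-ins; only `EventSource.trace` being optional and the field names `rayId` / `traceId` / `spanId` / `instanceId` come from the PR text.

```typescript
// Hypothetical shapes; the PR's real types may carry more fields.
interface TraceFields {
  rayId?: string;
  traceId?: string;
  spanId?: string;
}

interface EventSource {
  deliveryId: string;
  trace?: TraceFields; // optional, so existing fixtures still type-check
}

interface WorkflowPayload {
  type: string;
  source: EventSource;
}

// Worker side: attach trace fields only when at least one is present.
function withTrace(payload: WorkflowPayload, trace: TraceFields): WorkflowPayload {
  const hasAny = Object.values(trace).some((value) => value !== undefined);
  return hasAny ? { ...payload, source: { ...payload.source, trace } } : payload;
}

// Workflow side: spread any propagated fields next to instanceId.
function workflowBindings(instanceId: string, payload: WorkflowPayload): Record<string, unknown> {
  return { instanceId, ...(payload.source.trace ?? {}) };
}
```

In this sketch the Worker would call `withTrace` just before creating the workflow instance, and `run()` would pass the result of `workflowBindings` to the logger's `.child(...)`.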
Good catch — it didn't before this commit. The Worker was parsing Fixed in e579334:
Tests added in the same commit:
Net result:
Resolves #108
Summary
Enable Cloudflare Workers tracing and extend the structured JSON logger so every log line carries the identifiers needed to join a Worker request to its `TaskRunnerWorkflow` instance and to any upstream W3C trace context.
Changes
- `wrangler.toml`: added `[observability.logs] invocation_logs = true` and `[observability.traces] enabled = true`, and documented `head_sampling_rate = 1` explicitly on both blocks so sampling is reviewable in future diffs.
- `src/utils/logger.ts`: added a pure `parseTraceparent(header)` helper that returns `{ traceId, spanId } | null` after validating W3C `traceparent` format (segment count, version != `"ff"`, lengths 2/32/16/2, lowercase hex, non-zero trace-id).
- `src/main.ts`: per-request logger now binds `rayId` (from `cf-ray`) and `traceId` / `spanId` (from `traceparent`) via conditional spread. Absent or malformed headers are omitted rather than set to `"unknown"` so log queries are unambiguous. Existing `deliveryId` + `eventName` bindings preserved.
- `src/workflows/task-runner-workflow.ts`: `TaskRunnerWorkflow.run()` now chains `.child({ instanceId: event.instanceId })` onto the logger so every service / step emission carries the workflow instance ID. A `"Workflow run started"` breadcrumb is emitted at the top of `run()` so Workers Logs always has an `instanceId`-tagged anchor for the run, even when all step results are cached.
- Tests: `parseTraceparent` (19 cases covering happy path + edge cases), Worker-level tests asserting JSON log shape for present / absent / malformed headers, and a workflow introspection test asserting `instanceId` appears on at least one emitted line.
Log-query impact
Operators can now join logs on a single field:
- `instanceId = "task_requested-<repo>-<n>-<delivery>"` surfaces every step log plus the Worker's `"Webhook processed"` summary.
- `rayId = "8f...-SJC"` / `traceId = "<hex>"` / `spanId = "<hex>"` filter to the Worker request path.
- `deliveryId = "<hex>"` continues to work exactly as before.
Test plan
- `npm run check`: 294/294 tests pass, biome clean, typecheck clean.
- `parseTraceparent` covering empty / whitespace / lone dashes / trailing hyphen / uppercase / too-long / too-short / all-zero / malformed-flags / forward-compat versions.
- `cf-ray` / `traceparent` and parses the emitted JSON log line.
- `instanceId` is on captured `console.log` emissions.
- `wrangler tail`, fire a webhook, confirm the new fields appear in Workers Logs.
Needs Human Input
Two assumptions were made while writing the spec and are called out in the issue comments:
- 0.1 in `wrangler.toml`.
Update the `wrangler.toml` block if either assumption needs changing.
🤖 Generated with Claude Code
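The "conditional spread" binding mentioned for `src/main.ts` in the change list could look roughly like this. `requestBindings`, `HeaderReader`, and the injected `parse` callback are hypothetical names used to keep the sketch self-contained; in the PR the parsing is done by `parseTraceparent` and the binding happens inline in the per-request logger setup.

```typescript
interface ParsedTrace {
  traceId: string;
  spanId: string;
}

interface RequestBindings {
  rayId?: string;
  traceId?: string;
  spanId?: string;
}

// Structural subset of the Fetch API's Headers, so the sketch has no runtime deps.
interface HeaderReader {
  get(name: string): string | null;
}

function requestBindings(
  headers: HeaderReader,
  parse: (header: string) => ParsedTrace | null,
): RequestBindings {
  const rayId = headers.get("cf-ray");
  const raw = headers.get("traceparent");
  const trace = raw ? parse(raw) : null;

  return {
    // Absent or malformed values are left out entirely instead of being set
    // to a placeholder, so a query on the field never matches a dummy value.
    ...(rayId ? { rayId } : {}),
    ...(trace ? { traceId: trace.traceId, spanId: trace.spanId } : {}),
  };
}
```

Omitting the keys (rather than writing `"unknown"`) means an "exists" style filter on `traceId` only matches requests that actually carried a valid header.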
Note
Enable Cloudflare tracing and bind rayId/traceparent/instanceId onto logs
- `parseTraceparent()` to logger.ts to validate and parse W3C traceparent headers into `traceId` / `spanId`.
- `cf-ray` and `traceparent` headers and binds `rayId`, `traceId`, and `spanId` onto the request-scoped logger.
- `source.trace` object is attached to the event payload if any tracing fields are present.
- `TaskRunnerWorkflow` in task-runner-workflow.ts binds `instanceId` and any propagated trace fields onto its logger and emits a "Workflow run started" breadcrumb.
- `observability` and `traces` sampling in wrangler.toml.
Macroscope summarized e579334.