Phase 1: Add Ansi rendering helpers (frame, hint, note, help, code, inline) to @workflow/errors, and a chalk mock for readable snapshot tests.

Phase 2: Add four context-violation error classes to @workflow/core (NotInWorkflowContextError, NotInStepContextError, NotInWorkflowOrStepContextError, UnavailableInWorkflowContextError) and apply them to all twelve user-facing throw sites so errors now include docs links and a structured "what/why/fix" frame.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Tighten phase 1 changeset to a single sentence (per pranaygp review) and switch to double-quoted frontmatter (per Copilot + repo convention).
- Implement `ansifyName` to actually apply dim styling to workflow/ / step/ prefixes; add an `Ansi.dim` helper to `@workflow/errors` so callers don't need to import chalk directly.
- Remove the `void getWorkflowMetadata;` workaround in context-errors.ts by dropping the unused value import (we only needed the type and symbol).
- Render the plain-Error throw in `workflow/get-workflow-metadata.ts` with `Ansi.frame` + docs link so the VM path matches the structured-class styling from the sibling step path (still uses a plain Error to avoid the module-init cycle).
- Guard `buildUnderline` against zero-length markers so a stray empty token can't produce a negative `String.repeat` count.
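The `buildUnderline` guard in the last bullet can be illustrated with a minimal sketch. The signature below (a column offset plus a marker length) is an assumption made for illustration; the real helper in `@workflow/errors` may take different arguments and draw different characters.

```typescript
// Hypothetical signature: the real `buildUnderline` may differ.
// Renders a caret underline such as "  ^~~" beneath a marked token.
function buildUnderline(column: number, length: number): string {
  // A zero-length (or negative) marker has nothing to underline. Returning
  // early avoids "~".repeat(-1), which throws a RangeError.
  if (length <= 0) return "";
  const indent = " ".repeat(Math.max(0, column));
  return indent + "^" + "~".repeat(length - 1);
}

const underline = buildUnderline(2, 3); // two spaces, a caret, two tildes
const emptyUnderline = buildUnderline(2, 0); // empty string instead of a throw
```

Without the early return, a stray empty token would reach `"~".repeat(-1)` and crash while the error itself is being formatted.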
Adds a `.child()` and `.forRun(runId, workflowName)` child-logger API to the structured logger so runtime/step code doesn't have to repeat `workflowRunId`/`workflowName`/`stepId` on every call. Normalizes error metadata to structured `errorName` / `errorMessage` / `errorStack` fields instead of ad-hoc `error: err.message` strings, and adds comments to silent catches that swallow expected idempotency conflicts.

Also folds in the pending changes from #1812 so that PR can be closed:
- Standardize the console prefix to `[workflow-sdk]`.
- Split the replay-timeout log into a warn-while-retrying vs. error-when-giving-up, and surface the underlying error when we can't mark a timed-out run as failed.
- Include the error stack in the "Fatal runtime error during workflow setup" log and in the top-level user-code workflow error log so the stack surfaces in flattened log drains.
- Drop the `[Workflows] "<runId>" - ` prefix from `buildWorkflowSuspensionMessage`; the structured logger now attaches run context.

Supersedes #1812.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
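A minimal sketch of the child-logger idea described above. Only `child`, `forRun`, the `[workflow-sdk]` prefix, and the metadata key names come from the commit message; the `ScopedLogger` class, the `Sink` callback, and `errorMeta` are hypothetical names used for illustration.

```typescript
type Meta = Record<string, unknown>;
type Level = "warn" | "error";
type Sink = (level: Level, message: string, meta: Meta) => void;

// Hypothetical logger: each child carries its parent's metadata forward.
class ScopedLogger {
  private readonly sink: Sink;
  private readonly base: Meta;

  constructor(sink: Sink, base: Meta = {}) {
    this.sink = sink;
    this.base = base;
  }

  // Returns a new logger whose metadata is the parent's plus `meta`.
  child(meta: Meta): ScopedLogger {
    return new ScopedLogger(this.sink, { ...this.base, ...meta });
  }

  // Convenience wrapper for the per-run context.
  forRun(runId: string, workflowName: string): ScopedLogger {
    return this.child({ workflowRunId: runId, workflowName });
  }

  warn(message: string, meta: Meta = {}): void {
    this.sink("warn", `[workflow-sdk] ${message}`, { ...this.base, ...meta });
  }

  error(message: string, meta: Meta = {}): void {
    this.sink("error", `[workflow-sdk] ${message}`, { ...this.base, ...meta });
  }
}

// Normalizes an unknown thrown value into structured fields (hypothetical helper).
function errorMeta(err: unknown): Meta {
  if (err instanceof Error) {
    return { errorName: err.name, errorMessage: err.message, errorStack: err.stack };
  }
  return { errorName: "NonError", errorMessage: String(err) };
}

const calls: Array<{ level: Level; message: string; meta: Meta }> = [];
const rootLogger = new ScopedLogger((level, message, meta) => {
  calls.push({ level, message, meta });
});
const stepLogger = rootLogger.forRun("wrun_123", "simple").child({ stepId: "step_1" });
stepLogger.error("Step failed", errorMeta(new TypeError("boom")));
```

The call site passes only the error fields; the run and step identifiers arrive from the scoped logger, which is the repetition the commit removes.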
Phase 4 of friendlier errors: introduce a `SerializationError` class with
an optional `hint` and a docs link (workflow-sdk.dev/err/serialization-failed),
and adopt it at every user-facing serialization boundary in @workflow/core:
- Locked ReadableStream at a workflow boundary
- Unregistered class / missing `classId` / missing `WORKFLOW_DESERIALIZE`
- Attempting to return step functions to clients or call workflow functions
directly
- Webhook `respondWith()` called outside a step
- `dehydrate*` / `getSerializeStream` failures (workflow args/return, step
args/return, stream chunks)
Internal invariants (format prefix length checks, unknown format bytes,
missing `STREAM_NAME_SYMBOL`, encryption key/size guards, etc.) now throw
`WorkflowRuntimeError` instead of plain `Error` so the classifier and logger
treat them consistently.
`formatSerializationError` now returns `{ message, hint }` so the hint
fragment can be rendered with the standard SerializationError framing
instead of being baked into the message string.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
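To illustrate the `{ message, hint }` split described in this commit, here is a simplified sketch. The class shape, the `{ hint }` options bag, the formatter's parameters, the hint wording, and the `https://` scheme on the docs URL are all assumptions; the real code frames the output with the Ansi helpers instead of plain text.

```typescript
// Docs path taken from the commit message; the scheme is assumed.
const DOCS_URL = "https://workflow-sdk.dev/err/serialization-failed";

// Simplified stand-in for the real SerializationError class.
class SerializationError extends Error {
  readonly hint: string | undefined;

  constructor(message: string, options: { hint?: string } = {}) {
    super(message);
    this.name = "SerializationError";
    this.hint = options.hint;
  }
}

// Hypothetical formatter: returns the hint separately instead of baking it
// into the message string. The hint text is a placeholder.
function formatSerializationError(what: string, cause: unknown): { message: string; hint: string } {
  const reason = cause instanceof Error ? cause.message : String(cause);
  return {
    message: `Failed to serialize ${what}: ${reason}`,
    hint: "(placeholder) pass only serializable values across workflow/step boundaries",
  };
}

// Plain-text rendering of message, hint, and docs line.
function renderSerializationError(err: SerializationError): string {
  const lines = [`${err.name}: ${err.message}`];
  if (err.hint !== undefined) lines.push(`hint: ${err.hint}`);
  lines.push(`docs: ${DOCS_URL}`);
  return lines.join("\n");
}

const formatted = formatSerializationError("step return value", new TypeError("cannot serialize a function"));
const serializationError = new SerializationError(formatted.message, { hint: formatted.hint });
const renderedError = renderSerializationError(serializationError);
```

Keeping the hint as its own field means `.message` stays short for log drains while renderers decide how to present the hint.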
Add describeError() that derives attribution and class-aware hints from existing error classes + RUN_ERROR_CODES, with no event data changes. Wire into step failures, max-delivery exhaustion, run failures, and fatal setup errors so terminal logs include errorAttribution and a hint for known error types.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- `describeError(err, errorCode?)` now accepts an optional precomputed `RunErrorCode`. `classifyRunError(err)` only narrows to USER_ERROR / RUNTIME_ERROR, so the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED branches were previously unreachable from the step / run failure log sites. Callers that know the failure category (runtime.ts for replay timeout and max-deliveries exhaustion) now pass the code in.
- Context-violation checks use `instanceof` against the actual classes from context-errors.ts instead of a name-string set. Type-safe + survives class renames.
- Wire the new hints through to the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED log sites so those branches actually render a hint now.
- 3 new tests cover the reachable code paths + precomputed-code override.
- Changeset frontmatter switched to double quotes per repo convention.
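The precomputed-code override and the `instanceof` checks can be sketched as follows. The four error codes and the user/sdk attributions follow this PR's text; the stand-in classes, the return shape, and every hint string are placeholders, not the SDK's actual values.

```typescript
type RunErrorCode = "USER_ERROR" | "RUNTIME_ERROR" | "REPLAY_TIMEOUT" | "MAX_DELIVERIES_EXCEEDED";

interface ErrorDescription {
  attribution: "user" | "sdk";
  errorCode: RunErrorCode;
  hint?: string;
}

// Stand-ins for the real classes in @workflow/core.
class WorkflowRuntimeError extends Error {}
class NotInStepContextError extends Error {}

// Only ever narrows to USER_ERROR or RUNTIME_ERROR, as the commit notes.
function classifyRunError(err: unknown): RunErrorCode {
  return err instanceof WorkflowRuntimeError ? "RUNTIME_ERROR" : "USER_ERROR";
}

// A caller that already knows the failure category passes `errorCode`, which
// makes the REPLAY_TIMEOUT and MAX_DELIVERIES_EXCEEDED branches reachable.
function describeError(err: unknown, errorCode?: RunErrorCode): ErrorDescription {
  const code = errorCode ?? classifyRunError(err);
  switch (code) {
    case "REPLAY_TIMEOUT":
      return { attribution: "sdk", errorCode: code, hint: "(placeholder) replay-timeout hint" };
    case "MAX_DELIVERIES_EXCEEDED":
      return { attribution: "sdk", errorCode: code, hint: "(placeholder) max-delivery hint" };
    case "RUNTIME_ERROR":
      return { attribution: "sdk", errorCode: code, hint: "(placeholder) runtime hint" };
    case "USER_ERROR":
      // instanceof against the class itself survives renames, unlike a name-string set.
      if (err instanceof NotInStepContextError) {
        return { attribution: "user", errorCode: code, hint: "(placeholder) context hint" };
      }
      return { attribution: "user", errorCode: code };
  }
}

const plainDescription = describeError(new Error("boom"));
const timeoutDescription = describeError(new Error("replay took too long"), "REPLAY_TIMEOUT");
```

Without the second argument, `timeoutDescription` would be classified as a plain user error, which is the unreachable-branch problem the first bullet describes.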
Internal invariants now use WorkflowRuntimeError so describeError attributes them to the SDK: missing startedAt, VM generateKey, closure-vars outside step context, ENOTSUP. defineHook().resume() formats schema validation failures as a readable list instead of a JSON blob.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
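As a rough illustration of the readable validation list, the sketch below turns a list of issues into one line per issue. The `ValidationIssue` shape, the `formatValidationIssues` name, the `- ` bullet marker, and the `(root)` fallback are assumptions; the real formatter in `defineHook().resume()` may render differently.

```typescript
// Hypothetical issue shape, loosely modeled on common schema-validation libraries.
interface ValidationIssue {
  path: ReadonlyArray<string | number>;
  message: string;
}

// One line per issue, in the form: - at "field": message
function formatValidationIssues(issues: ReadonlyArray<ValidationIssue>): string {
  return issues
    .map((issue) => {
      const field = issue.path.length > 0 ? issue.path.join(".") : "(root)";
      return `- at "${field}": ${issue.message}`;
    })
    .join("\n");
}

const formattedIssues = formatValidationIssues([
  { path: ["user", "email"], message: "Invalid email" },
  { path: [], message: "Expected object" },
]);
```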
Observability renderers read persisted run_failed / step_failed event data,
not live Error instances. describeRunError takes { errorCode, errorName }
and returns the same { attribution, hint } shape as describeError, so the
CLI and web UI can derive user-vs-SDK framing from the event log directly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add `WorkflowBuildError` class in `@workflow/errors` with optional `hint` for an actionable next step, and apply it in `@workflow/builders` at user-facing sites: failed esbuild phases, unresolved built-in steps, and empty esbuild output now throw `WorkflowBuildError` with a hint pointing at the likely fix. Runtime invariants remain plain `Error`.
…docs link, redirect stack
- Drop the readonly `functionName` param-property on context-error classes so
util.inspect no longer prints a trailing `{ functionName: 'foo()' }` block.
- Replace the `DocLink` ("label: https://…") shape with a plain `DocsUrl`
template-literal type. Error output now renders a single clean line:
`docs: https://…` (new `Ansi.docs` helper) instead of the noisier
"note: Read more about foo(): https://…".
- Add throw helpers (`throwNotInWorkflowContext`, etc.) that call
`Error.captureStackTrace(err, stackStartFn)` on V8 engines so the top frame
of the thrown error points at the user's call site instead of at the gate
function inside the framework. Callers pass themselves as the boundary.
- Refactor `defineHook()` (both root and `/workflow`) to use named function
closures rather than `this.create`/`this.resume`, since the stack redirect
relies on a stable function identity that survives destructuring.
- Update context-errors.test.ts to snapshot the new `docs:` framing and to
add a regression test asserting the top stack frame is the user call site.
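The stack redirect in the third bullet can be sketched like this. `redirectStackToCaller` is the helper name used elsewhere in this PR, but the body below is an assumption about how such a helper could work, and the `createHook` stub only stands in for the real gate function.

```typescript
type CaptureStackTrace = (target: object, stackStartFn?: Function) => void;

// Feature-detects the V8-only API; on other engines the stack is left untouched.
function redirectStackToCaller(err: Error, stackStartFn: Function): void {
  const capture = (Error as unknown as { captureStackTrace?: CaptureStackTrace }).captureStackTrace;
  if (typeof capture === "function") {
    // Frames above and including `stackStartFn` are omitted from err.stack.
    capture.call(Error, err, stackStartFn);
  }
}

// Stub gate function: a named function, so its identity stays stable even if
// callers destructure it off an object.
function createHook(): never {
  const err = new Error("`createHook()` can only be called inside a workflow function");
  redirectStackToCaller(err, createHook); // the callee passes itself as the boundary
  throw err;
}

// Plays the role of user code calling the API from the wrong context.
function userCallSite(): string {
  let stack = "";
  try {
    createHook();
  } catch (e) {
    stack = e instanceof Error ? e.stack ?? "" : "";
  }
  return stack;
}

const callerStack = userCallSite();
```

On V8 the first `at` frame of `callerStack` should be `userCallSite`, not `createHook`, which is what lets dev overlays point at user code.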
🦋 Changeset detected. Latest commit: 9fd914b. The changes in this PR will be included in the next version bump. This PR includes changesets to release 22 packages.
🧪 E2E Test Results: ❌ Some tests failed

❌ Failed tests: 💻 Local Development (2 failed), vite-stable (2 failed).

Details by category:
- ❌ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ✅ 📋 Other

❌ Some E2E test jobs failed. Check the workflow run for details.
…e path

SerializationError now carries readonly fatal = true. Step-return dehydration is wrapped inside the user-code try/catch so that the resulting error flows through userCodeFailed → step_failed → FatalError.is() short-circuit instead of bubbling up as HTTP 500 and triggering a queue retry loop. Retrying a step that returned a non-POJO is guaranteed to fail the same way, so this saves ~20s and 3 near-identical error blocks per serialization failure.
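A minimal sketch of the widened `FatalError.is()` check that this short-circuit relies on. The classes are simplified stand-ins and `shouldRetry` is a hypothetical helper; the accepted and rejected inputs mirror the cases listed in this PR's manual test plan.

```typescript
// Simplified stand-in for the real FatalError.
class FatalError extends Error {
  readonly fatal = true;

  // Recognizes any Error-shaped value carrying an own `fatal: true` property,
  // not only FatalError instances.
  static is(value: unknown): boolean {
    return (
      value instanceof Error &&
      Object.prototype.hasOwnProperty.call(value, "fatal") &&
      (value as Error & { fatal?: unknown }).fatal === true
    );
  }
}

// Not a FatalError subclass, but marked fatal so the retry loop skips it.
class SerializationError extends Error {
  readonly fatal = true;
}

// Hypothetical retry gate: fatal errors fail the step immediately.
function shouldRetry(err: unknown): boolean {
  return !FatalError.is(err);
}

const retrySerialization = shouldRetry(new SerializationError("non-POJO return")); // false
const retryPlain = shouldRetry(new Error("transient")); // true
```

A bare `{ fatal: true }` object is rejected because it is not an `Error`, so arbitrary thrown values cannot opt out of retries by accident.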
Snapshot tests lock in the exact shape of:
- describeError() payloads (attribution, errorCode, hint) for every classification: plain Error, SerializationError, context-violation, WorkflowRuntimeError, REPLAY_TIMEOUT, MAX_DELIVERIES_EXCEEDED.
- The scoped-logger call signature for the two canonical runtime failure paths (fatal-bubble and hit-max-retries), so refactors of forRun() / child() metadata merging can't silently change what users see in their log drains.

SerializationError now also has a direct test for readonly fatal=true + FatalError.is() recognition.

pr-artifacts/ contains real log-output snapshots from running the nextjs-turbopack workbench against five error scenarios. These are reference material for reviewers and are flagged to be removed before merge.
The step-level fatal-error log used to embed the full stack trace inside an `errorStack` string field in the metadata object, so util.inspect rendered it as a quote-escaped, line-continuation blob when the log hit the terminal — unreadable in practice. Move framing + stack into the log *message* (matching the workflow-level log in runtime.ts) and keep the metadata object compact with only the indexable structured fields (`errorAttribution`, `errorName`, `errorMessage`, `hint`, IDs). Log drains still get the same keys; humans now see a readable stack trace. Also introduce `formatStepName` / `formatWorkflowName` in `@workflow/utils` that render machine names (`step//./workflows/1_simple//add`) as `add (./workflows/1_simple)` in log framings, using the existing `parseStepName` / `parseWorkflowName` parsers. Applied to step-fatal, hit-max-retries, exceeded-max-retries, and workflow-threw log sites. Artifacts in pr-artifacts/ updated to show the new output shape, and renamed .log → .md since they're Markdown and IDE previews are nicer that way.
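The friendly-name rendering can be sketched as below. The input and output strings come from the commit message; `parseMachineName` is a hypothetical stand-in for the existing `parseStepName` / `parseWorkflowName` parsers, whose real return shapes are not shown here.

```typescript
interface ParsedName {
  kind: string;
  path: string;
  shortName: string;
}

// Hypothetical parser: splits "<kind>//<path>//<name>" into its three parts.
function parseMachineName(name: string): ParsedName | null {
  const parts = name.split("//");
  const [kind, path, shortName] = parts;
  if (parts.length !== 3 || kind === undefined || path === undefined || shortName === undefined) {
    return null;
  }
  return { kind, path, shortName };
}

// Renders "add (./workflows/1_simple)"; falls back to the raw name when it
// does not match the expected shape, so no information is dropped.
function formatStepName(name: string): string {
  const parsed = parseMachineName(name);
  return parsed === null ? name : `${parsed.shortName} (${parsed.path})`;
}

const friendlyStepName = formatStepName("step//./workflows/1_simple//add");
```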
Replace util.inspect's default object dump (which quote-escapes multi-line stacks and paragraph hints into a single-line JSON-y blob) with a workflow-aware formatter that composes the entire log line into a single string passed to console.error / console.warn.

Highlights of the new output:
- Per-run / per-step IDs render with their parsed friendly names so users see `wrun_… · simple (./workflows/1_simple)` instead of just the raw `workflowName: 'workflow//./workflows/1_simple//simple'`.
- Color-coded attribution badge (user error red / sdk error magenta) paired with the error class in bold.
- Hints render as a paragraph under `hint:` rather than a backslash-`\n`-escaped string.
- Drops redundant fields (errorStack always; errorMessage when it's already in the parent message) to avoid double-printing.
- Unknown fields fall through as a sorted `key value` tail so we never silently drop log information.

@workflow/errors/ansi gains bold/red/magenta helpers used by the formatter. The web / web-shared packages don't consume stderr (they read structured event payloads from the World event log), so this is presentation-only at the runtime layer.
Summary

Consolidates the eight-PR friendlier-errors stack into a single PR. Inspired by @Schniz's stalled #706. Superseded by this PR: #1831, #1832, #1836, #1837, #1838, #1839, #1840.

What's included

`@workflow/errors`
- `SerializationError`, `WorkflowBuildError` classes (with optional `hint` field).
- `Ansi` rendering helpers (`frame`, `code`, `docs`, `dim`, `inline`), which now live under the `@workflow/errors/ansi` subpath so the main entry doesn't pull `chalk` into every consumer.
- `FatalError.is(err)` widened to recognize any error with a `fatal: true` own property.

`@workflow/core`: context violations
- Four error classes (`NotInWorkflowContextError`, `NotInStepContextError`, `NotInWorkflowOrStepContextError`, `UnavailableInWorkflowContextError`) applied to twelve user-facing throw sites. Each includes a docs link.
- `.message` / `.stack` are plain text: the colored framed form renders lazily via `[util.inspect.custom]` / `toString()`, so structured logs and log drains no longer contain raw `\x1B[...m` bytes.
- All four classes set `fatal = true`, so `createHook()`-from-a-step fails immediately instead of burning three retry attempts.
- Thrown errors redirect their stack to the user's call site via a shared `redirectStackToCaller` helper so terminal overlays (Next.js, Turbopack, VS Code) point at user code.

`@workflow/core`: serialization
- `SerializationError` applied to all user-facing serialization boundaries: stream locking, unregistered classes, missing `WORKFLOW_DESERIALIZE`, step-function / workflow-function misuse, and dehydrate/hydrate failures for workflow args, step args, and return values.

`@workflow/core`: runtime logger
- `.child()` and `.forRun(runId, workflowName)` for stable per-run context, standardized `[workflow-sdk]` prefix, error stacks surfaced in log drains, clarified replay-timeout phrasing (warn while retrying vs. error when giving up).

`@workflow/core`: attribution
- `describeError(err)` and `describeRunError({ errorCode, errorName })` compute user-vs-SDK attribution + class-aware hints, either from a live `Error` instance or from persisted failure-event fields. Exposed under the public `@workflow/core/describe-error` subpath for CLI / web consumption.
- Terminal logs at step-failure, max-retries, run-failure, and fatal-setup sites now include `errorAttribution` metadata and hint text.

`@workflow/core`: consistency
- Internal invariants (missing `startedAt`, VM `crypto.subtle.generateKey`, closure-vars outside a step context, `ENOTSUP`) now throw `WorkflowRuntimeError` so they are attributed to the SDK.
- `defineHook().resume()` formats schema validation failures as a readable bulleted list instead of a raw JSON dump.

`@workflow/builders`: build-time
- Failed esbuild phases, unresolved built-in steps, and empty esbuild output now throw `WorkflowBuildError` with a `hint` pointing at the likely fix.

Fixes from manual testing (`createHook()` inside a step)

Surfaced three issues after running the earlier stack end-to-end and addressed in the final commits:
- ANSI escape bytes ended up in `.message` because chalk resolves at construction time; fixed by storing plain text and rendering pretty lazily (see `context-errors-plain-message` changeset).
- `createHook()` inside a step burned retry attempts before failing; fixed by setting `fatal: true` and widening `FatalError.is()` (see `context-errors-fatal` changeset).
- Duplicated `captureStackTrace` feature-detect in two files; fixed by extracting a shared `redirectStackToCaller` helper (see `capture-stack-shared` changeset).

Follow-up round (latest commits)
Manual testing uncovered two more issues, fixed in the last two commits on this branch:
- `SerializationError` still looped through 4 retries. Root cause: `dehydrateStepReturnValue()` was called outside the step-handler try/catch, so `FatalError.is()` never saw the error. Two-part fix: mark `SerializationError` as `fatal = true`, and move the dehydration call inside the user-code try/catch in `step-handler.ts` so the error routes through `userCodeFailed` → `step_failed` (see `serialization-error-fatal` changeset). Same non-POJO return now fails in ~1.6s / 1 block, not ~21s / 4 blocks.
- Added `toMatchInlineSnapshot`-backed tests for `describeError()` payloads (every attribution path) and the scoped-logger call signature for the two canonical runtime failure sites. Regression gate on the exact field shapes users see in their log drains.

Reviewer reference: `pr-artifacts/`

Five scenario log captures (`01-context-violation…` through `05-retryable-error-max-retries`) show actual runtime output, before/after framing, and any follow-ups noted during testing. Remove before merge: they're under `pr-artifacts/` explicitly so they're easy to `git rm` in the final prep.

Addressed review feedback
- "…`errors` package depending on chalk": `Ansi` is now on a subpath; the main entry has no chalk dep path.
- "…`Error.captureStackTrace` guard/cast": extracted to `packages/core/src/capture-stack.ts`.

Manual test plan
All sections below exercise different parts of the stack. Start `cd workbench/nextjs-turbopack && pnpm dev` unless otherwise noted.

1. Context-violation errors (phase 1 + 2 + followups)

A convenient smoke route at `workbench/nextjs-turbopack/app/api/friendlier-errors-smoke/route.ts`:
- `createHook()` outside workflow: `?which=createHook`. Terminal shows a framed box with title "`createHook()` can only be called inside a workflow function" and a `docs: https://…/workflow/create-hook` line (polished; no longer `note: Read more about…`).
- `sleep()` outside workflow: `?which=sleep`. Same framing, docs URL ends in `.../workflow/sleep`.
- `getStepMetadata()` in a workflow function (not a step): add to a `"use workflow"` file; expect title "can only be called inside a step function".
- `getWorkflowMetadata()` in application code: `?which=getWorkflowMetadata`. Expect "workflow or step function".
- `resumeHook()` inside a workflow: call `resumeHook(token, payload)` from inside a `"use workflow"` function. Title: "`resumeHook()` cannot be called from a workflow context", plus a line `this call was made from the workflow//./src/workflows/example.ts//myWorkflow workflow context.` with the `workflow/` prefix dimmed.
- Stack redirect: the top `at ...` line of `err.stack` should reference your route handler, not a frame inside `@workflow/core`. Same goes for the Next.js dev overlay.
- No `functionName` leak: the JSON response body should NOT contain a `functionName` property on the error object (it used to via the old constructor param-property).
- No ANSI in `.message` / `.stack` (new in this PR): the JSON response body's `message` and `stack` strings must contain no `\x1B[` bytes; they're plain text. In the terminal, `console.error(err)` still renders the pretty framed version via `util.inspect`.
- `createHook()` inside a `"use step"` function: `add(…)`. You should see one `[workflow-sdk]` fatal-error log block (`Step "X" threw a FatalError — bubbling up...`), not four. The step fails immediately.

2. Runtime logger metadata (phase 3)

- Every log line carries the prefix `[workflow-sdk]`. Grep your terminal output; there should be no `[Workflows]` or other prefixes.
- Run-scoped logs carry `workflowRunId` and `workflowName` as metadata fields instead of being baked into the message string.
- Replay timeout while retrying: `warn` level, phrasing includes "took too long — will retry".
- Replay timeout when giving up: `error` level, phrasing includes "gave up".
- Fatal setup and user-code workflow errors include the stack (`errorStack`), not just the one-line error message.

3. `SerializationError` (phase 4)

Cause a user-facing serialization failure:
- Expect a `[workflow-sdk]` log with `errorName: 'SerializationError'`.
- `errorAttribution: 'user'`.
- `hint` field present: "A value passed across a workflow/step boundary could not be serialized…".
- Call `getWritable('x').getWriter()` twice on the same stream; expect a `SerializationError` with docs link.

4. `describeError` attribution (phase 5)

For each of these, inspect the `[workflow-sdk]` log at failure time:
- Plain `Error`: `throw new Error('boom')` from a step → `errorAttribution: 'user'`, no `hint` field.
- `SerializationError` → `errorAttribution: 'user'` + serialization hint.
- Context violation → `errorAttribution: 'user'` + context hint.
- `WorkflowRuntimeError`: `throw new WorkflowRuntimeError('invariant')` from a step → `errorAttribution: 'sdk'` + runtime hint.
- Replay timeout: set `WORKFLOW_REPLAY_TIMEOUT_MS=50`, run a non-trivial workflow → after retries exhaust, `errorAttribution: 'sdk'` + replay-timeout hint.
- Max deliveries exceeded → `errorAttribution: 'sdk'` + max-delivery hint.

5. Consistency pass (phase 6)

- `defineHook().resume()`: call `hook.resume(token, invalidBody)`. Expect a readable bulleted list of validation issues (one per line, `at "field": message`), not a raw JSON dump of `ZodError.issues`.
- `crypto.subtle.generateKey()` inside workflow VM: call it from a `"use workflow"` function. Expect a clear message explaining why it's disabled + "move this into a step function", with `errorAttribution: 'sdk'`.

6. `describeError` subpath (phase 7 foundation)

Create `scratch.ts` at repo root exercising the calls below, then run `pnpm tsx scratch.ts`.
- `describeRunError({ errorCode: 'USER_ERROR', errorName: 'SerializationError' })` → `{ attribution: 'user', errorCode: 'USER_ERROR', hint: 'A value…serialized…' }`.
- `describeRunError({ errorCode: 'RUNTIME_ERROR' })` → `{ attribution: 'sdk', hint: 'This is an internal workflow SDK error…' }`.
- `describeError(new SerializationError('x'))` and `describeRunError({ errorCode: 'USER_ERROR', errorName: 'SerializationError' })` return the same shape and hint string.
- The import from `@workflow/core/describe-error` resolves and `pnpm tsx` runs without module-resolution errors.

7. `WorkflowBuildError` (phase 8)

Exercise the build pipeline, not the runtime. Use `workbench/nextjs-turbopack` and run `pnpm build` (not `pnpm dev`).
- Failed esbuild phase: expect a `WorkflowBuildError` titled "Build failed during workflows bundle" followed by a blank line and `hint: Review the esbuild errors above…`. The original esbuild errors remain printed above (not suppressed).
- Unresolved built-in steps: `mv node_modules/workflow node_modules/workflow-bak` and run `pnpm build`. Expect `WorkflowBuildError: Failed to resolve built-in steps sources.` plus a hint to run `pnpm install workflow`…. Restore afterwards.
- Empty esbuild output: expect `WorkflowBuildError: No output files generated from esbuild` + hint mentioning `"use workflow"` / `"use step"` directives.
- `.is()` discriminator: in a scratch script, `WorkflowBuildError.is(new WorkflowBuildError('x', { hint: 'y' }))` returns `true`; `WorkflowBuildError.is(new Error('x'))` returns `false`.
- Runtime unaffected: run the dev server (`pnpm dev`). Confirm no `WorkflowBuildError` shows up; this class is build-time only.

8. New `@workflow/errors/ansi` subpath (final PR)

- `import { Ansi } from '@workflow/errors'` no longer works (and wasn't intended to; the helpers were always namespaced). Confirm `import * as Ansi from '@workflow/errors/ansi'` resolves.
- The main entry (`import { SerializationError } from '@workflow/errors'`) no longer pulls `chalk` into its bundle. Check a production bundle or `pnpm why chalk` from a dependent context.

9. `FatalError.is()` widening (final PR)

- `FatalError.is(new FatalError('x'))`: `true`.
- `FatalError.is(new NotInWorkflowContextError('createHook()', 'https://…'))`: `true` (via `fatal: true` own property).
- `FatalError.is(new Error('x'))`: `false`.
- `FatalError.is({ fatal: true })`: `false` (must be an `Error`-shaped value).

Unit tests

All packages typecheck clean; relevant test files pass:
- `pnpm --filter @workflow/errors test`: 25 tests (Ansi, SerializationError, WorkflowBuildError, new FatalError widening tests)
- `pnpm --filter @workflow/core exec vitest run src/context-errors.test.ts src/describe-error.test.ts`: 32 tests (incl. new plain-message / lazy-pretty / FatalError-gate cases)
- `pnpm --filter @workflow/builders test`: 129 tests
- `pnpm typecheck`: clean across workspace

🤖 Generated with Claude Code