fix: report snapshot timing diagnostics #798

Merged
thymikee merged 4 commits into callstack:main from pupuking723:fix/snapshot-diagnostics-reporting
Jun 13, 2026

@pupuking723

Contributor

Fixes #596.

This records snapshot capture timing per session and surfaces aggregate diagnostics in the command/run outputs that agents inspect.

What changed:

  • record snapshot capture durations with p50/p95/max/count stats
  • expose snapshotDiagnostics on snapshot command results and serialized client output
  • aggregate snapshot diagnostics through replay actions and agent-device test suite results
  • emit a scoped warning when snapshot p95 crosses a conservative threshold
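As a rough illustration of the first and last bullets, the following is a minimal sketch of how p50/p95/max/count stats could be derived from recorded capture durations. The type name, the field names, the nearest-rank percentile method, and the 2000 ms threshold are all assumptions made for this example; the PR does not state which percentile method or threshold it uses.

```typescript
// Hypothetical summary shape; only p50/p95/max/count are named in the PR text.
type SnapshotDiagnostics = {
  count: number;
  p50Ms: number;
  p95Ms: number;
  maxMs: number;
};

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sortedAscending: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedAscending.length);
  const index = Math.min(sortedAscending.length - 1, Math.max(0, rank - 1));
  return sortedAscending[index] ?? 0;
}

// Builds a bounded summary from raw durations; returns undefined when
// no snapshot was captured, so callers can omit the field entirely.
function summarizeSnapshotDurations(
  durationsMs: number[],
): SnapshotDiagnostics | undefined {
  if (durationsMs.length === 0) return undefined;
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    count: sorted.length,
    p50Ms: percentile(sorted, 50),
    p95Ms: percentile(sorted, 95),
    maxMs: sorted[sorted.length - 1] ?? 0,
  };
}

// Assumed threshold for illustration only.
const SLOW_SNAPSHOT_P95_MS = 2000;

function isSlowSnapshot(
  diagnostics: SnapshotDiagnostics,
  thresholdMs: number = SLOW_SNAPSHOT_P95_MS,
): boolean {
  return diagnostics.p95Ms > thresholdMs;
}
```

With the durations `[400, 100, 5000, 200, 300]`, this sketch reports a count of 5, a p50 of 300 ms, and a p95 and max of 5000 ms, which would cross the assumed threshold.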

Verification:

  • pnpm exec vitest run src/__tests__/snapshot-diagnostics.test.ts src/__tests__/client-shared.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts
  • pnpm format:check
  • pnpm lint
  • pnpm typecheck
  • pnpm check:fallow --base origin/main
  • pnpm build

Note: local Node is v22.17.0 while the repo asks for >=22.19, so pnpm printed an engine warning; the commands above completed successfully.

I used Codex as a coding assistant and manually reviewed the issue, code, tests, and verification output before submitting.

@thymikee

Member

Coordinator note: this PR currently has no checks because it is from a first-time contributor, so I am treating the missing CI as approval-gating rather than as a readiness signal. I have queued a dedicated review pass to inspect the issue context, ADRs, production output route, and agent-friendly diagnostics shape before recommending whether maintainers should approve/run CI. I will follow up here with concrete review findings or validation evidence.

@thymikee thymikee left a comment

Member

Thanks for the contribution. I found two correctness issues that block this from closing #596.

  1. currently aggregates only already present on each replay action response (, then ). That misses the common replay/test slow-snapshot path: interaction/get/is/wait captures call and update , but their command responses do not include , so those timings are omitted from replay/test results. Conversely, explicit actions return the cumulative session summary from , so merging every action response double-counts prior captures (two snapshot actions with cumulative counts 1 then 2 become a replay count of 3). Please aggregate from request/run-scoped raw samples or per-action deltas rather than cumulative response summaries.

  2. appends the slow timing warning into snapshot annotation . Human output renders those warnings into stdout text through -> -> . #596 explicitly asks to keep normal command stdout stable and prefer stderr for non-JSON warnings. Please keep the machine-readable in JSON/client output, but route the timing warning through CLI stderr (for non-JSON) instead of embedding it in the snapshot tree text.

Local validation I ran in :

RUN v4.1.2 /private/tmp/agent-device-pr-798

Test Files 3 passed (3)
Tests 20 passed (20)
Start at 08:37:13
Duration 1.16s (transform 605ms, setup 0ms, import 846ms, tests 55ms, environment 0ms) passed.

  • Found 0 warnings and 0 errors.
    Finished in 116ms on 923 files with 73 rules using 12 threads. passed.
  • Rslib v0.20.1

info build started...
start generating declaration files... (esm)
ready built in 0.27 s
info declaration files prepared in 2.75 s (esm)
ready declaration files bundled successfully: dist/src/index.d.ts in 9.58 s (esm)
ready declaration files bundled successfully: dist/src/io.d.ts in 8.81 s (esm)
ready declaration files bundled successfully: dist/src/artifacts.d.ts in 8.23 s (esm)
ready declaration files bundled successfully: dist/src/batch.d.ts in 7.71 s (esm)
ready declaration files bundled successfully: dist/src/android-adb.d.ts in 5.57 s (esm)
ready declaration files bundled successfully: dist/src/android-snapshot-helper.d.ts in 5.06 s (esm)
ready declaration files bundled successfully: dist/src/contracts.d.ts in 4.51 s (esm)
ready declaration files bundled successfully: dist/src/finders.d.ts in 3.44 s (esm)
ready declaration files bundled successfully: dist/src/internal/bin.d.ts in 2.91 s (esm)
ready declaration files bundled successfully: dist/src/internal/companion-tunnel.d.ts in 2.36 s (esm)
ready declaration files bundled successfully: dist/src/internal/daemon.d.ts in 1.83 s (esm)
ready declaration files bundled successfully: dist/src/selectors.d.ts in 3.97 s (esm)
ready declaration files bundled successfully: dist/src/internal/png-worker.d.ts in 1.13 s (esm)
ready declaration files bundled successfully: dist/src/install-source.d.ts in 6.08 s (esm)
ready declaration files bundled successfully: dist/src/internal/update-check-entry.d.ts in 0.53 s (esm)
ready declaration files bundled successfully: dist/src/remote-config.d.ts in 6.60 s (esm)
ready declaration files bundled successfully: dist/src/metro.d.ts in 7.19 s (esm)

File (esm) Size
dist/src/internal/daemon.js 0.02 kB
dist/src/io.js 0.05 kB
dist/src/artifacts.js 0.06 kB
dist/src/4829.js 0.07 kB
dist/src/remote-config.js 0.07 kB
dist/src/internal/update-check-entry.js 0.15 kB
dist/src/install-source.js 0.16 kB
dist/src/selectors.js 0.17 kB
dist/src/devices1.js 0.20 kB
dist/src/batch.js 0.20 kB
dist/src/index.js 0.21 kB
dist/src/metro.js 0.22 kB
dist/src/devices2.js 0.23 kB
dist/src/finders.js 0.23 kB
dist/src/command-surface.js 0.28 kB
dist/src/3675.js 0.30 kB
dist/src/3267.js 0.33 kB
dist/src/contracts.js 0.35 kB
dist/src/rslib-runtime.js 0.35 kB
dist/src/9671.js 0.42 kB
dist/src/recording-provider.js 0.43 kB
dist/src/android-snapshot-helper.js 0.50 kB
dist/src/lease.js 0.58 kB
dist/src/internal/png-worker.js 0.95 kB
dist/src/notifications.js 0.96 kB
dist/src/internal/bin.js 1.0 kB
dist/src/4778.js 1.1 kB
dist/src/5678.js 1.4 kB
dist/src/7719.js 1.4 kB
dist/src/record-trace.js 1.4 kB
dist/src/6629.js 1.4 kB
dist/src/2301.js 1.6 kB
dist/src/7556.js 2.0 kB
dist/src/linux.js 2.0 kB
dist/src/1010.js 2.1 kB
dist/src/7599.js 2.4 kB
dist/src/8656.js 2.4 kB
dist/src/1352.js 2.6 kB
dist/src/113.js 2.7 kB
dist/src/9471.js 2.8 kB
dist/src/6088.js 3.1 kB
dist/src/android-adb.js 3.1 kB
dist/src/1231.js 3.2 kB
dist/src/208.js 3.3 kB
dist/src/react-native.js 3.4 kB
dist/src/5827.js 3.5 kB
dist/src/7651.js 3.5 kB
dist/src/input-actions~1.js 3.7 kB
dist/src/devices.js 3.7 kB
dist/src/9152.js 3.9 kB
dist/src/9639.js 4.3 kB
dist/src/server.js 4.4 kB
dist/src/8133.js 5.2 kB
dist/src/apple.js 5.8 kB
dist/src/5611.js 5.9 kB
dist/src/4012.js 5.9 kB
dist/src/989.js 6.3 kB
dist/src/9818.js 6.8 kB
dist/src/9010.js 7.4 kB
dist/src/9616.js 7.5 kB
dist/src/generic.js 8.3 kB
dist/src/internal/companion-tunnel.js 8.6 kB
dist/src/find.js 8.8 kB
dist/src/input-actions.js 9.4 kB
dist/src/7847.js 10.4 kB
dist/src/2403.js 11.4 kB
dist/src/interaction.js 13.7 kB
dist/src/snapshot.js 16.3 kB
dist/src/1974.js 17.5 kB
dist/src/android.js 17.9 kB
dist/src/selector-runtime.js 18.3 kB
dist/src/221.js 28.5 kB
dist/src/8806.js 29.2 kB
dist/src/cli-output.js 29.3 kB
dist/src/record-trace-recording.js 30.3 kB
dist/src/apps.js 32.1 kB
dist/src/cli.js 55.0 kB
dist/src/args.js 55.4 kB
dist/src/9542.js 55.5 kB
dist/src/123.js 71.8 kB
dist/src/6133.js 134.6 kB
dist/src/session.js 140.2 kB
dist/src/2415.js 328.8 kB

                             Total:   1260.8 kB passed.
  • passed after rerunning outside the sandbox due to a pnpm registry verification failure.
  • Checking formatting...

All matched files use the correct format.
Finished in 181ms on 894 files using 12 threads. passed after rerunning outside the sandbox for the same pnpm registry verification issue.

Given the missing checks on this first-time contributor PR, I would not approve maintainer-triggered CI yet; these behavior issues should be fixed first.

@thymikee

Member

Correction to my review body above: zsh interpreted inline backticks before gh received the text. The review state is still intentional (CHANGES_REQUESTED), but here are the actionable findings with the code references intact.

  1. src/daemon/handlers/session-replay-runtime.ts:90-159 aggregates only snapshotDiagnostics already present on each replay action response (readReplayActionSnapshotDiagnostics -> mergeSnapshotDiagnostics). That misses the common replay/test slow-snapshot path: interaction/get/is/wait captures call captureSnapshot and update session.snapshotDiagnostics, but their command responses do not include snapshotDiagnostics, so those timings are omitted from replay/test results. Conversely, explicit snapshot actions return the cumulative session summary from src/daemon/snapshot-runtime.ts:288-292, so merging every action response double-counts prior captures. For example, two explicit snapshot actions with cumulative counts 1 then 2 become a replay count of 3. Please aggregate from request/run-scoped raw samples or per-action deltas rather than cumulative response summaries.

  2. src/daemon/handlers/snapshot-capture.ts:317-323 appends the slow timing warning into snapshot annotation warnings. Human snapshot output renders those warnings into stdout text through src/commands/capture/output.ts:14-19 -> formatSnapshotText -> buildSnapshotNotices (src/utils/output.ts:68-69 and src/utils/output.ts:613). Issue #596 (Track and report slow snapshot diagnostics) explicitly asks to keep normal command stdout stable and prefer stderr for non-JSON warnings. Please keep the machine-readable snapshotDiagnostics in JSON/client output, but route the timing warning through CLI stderr for non-JSON output instead of embedding it in the snapshot tree text.
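The double-counting in finding 1 can be reproduced in isolation. The sketch below is not the repository's code: `Session`, `captureSnapshot`, `collectActionSamples`, and `replayCounts` are hypothetical names used only to contrast summing cumulative session counts with collecting per-action deltas.

```typescript
// Hypothetical session model: every capture appends one duration sample.
type Session = { snapshotSamplesMs: number[] };
type Action = (session: Session) => void;

// An action that captures exactly one snapshot of the given duration.
function captureSnapshot(durationMs: number): Action {
  return (session) => {
    session.snapshotSamplesMs.push(durationMs);
  };
}

// Runs one action and returns only the samples that action added,
// regardless of whether its command response would report them.
function collectActionSamples(session: Session, action: Action): number[] {
  const before = session.snapshotSamplesMs.length;
  action(session);
  return session.snapshotSamplesMs.slice(before);
}

function replayCounts(actions: Action[]): {
  cumulativeSum: number;
  deltaCount: number;
} {
  const session: Session = { snapshotSamplesMs: [] };
  const runSamples: number[] = [];
  let cumulativeSum = 0;
  for (const action of actions) {
    runSamples.push(...collectActionSamples(session, action));
    // Flawed approach: add the cumulative session count after each action.
    cumulativeSum += session.snapshotSamplesMs.length;
  }
  return { cumulativeSum, deltaCount: runSamples.length };
}
```

For two snapshot actions, `replayCounts([captureSnapshot(120), captureSnapshot(340)])` yields a cumulative sum of 3 (1 + 2) but a delta count of 2, matching the example in the finding.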

Local validation I ran in /private/tmp/agent-device-pr-798:

  • pnpm exec vitest run src/__tests__/snapshot-diagnostics.test.ts src/__tests__/client-shared.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts passed.
  • pnpm check:quick passed.
  • pnpm build passed.
  • pnpm check:fallow --base origin/main passed after rerunning outside the sandbox due to a pnpm registry verification failure.
  • pnpm format:check passed after rerunning outside the sandbox for the same pnpm registry verification issue.

Given the missing checks on this first-time contributor PR, I would not approve maintainer-triggered CI yet; these behavior issues should be fixed first.

@pupuking723

Contributor Author

Thanks for the review. I pushed db069ca to address both blocking points:

  • replay now aggregates snapshot diagnostics from per-action session timing samples instead of cumulative action response summaries, so normal interaction captures are included and explicit snapshot actions are not double-counted
  • slow snapshot warnings are no longer added to snapshot annotations/stdout text; CLI snapshot output routes the diagnostics warning to stderr while JSON/client output keeps the machine-readable snapshotDiagnostics field
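The stdout/stderr split described in the second bullet might look roughly like the sketch below. The function name, option names, result shape, and warning wording are assumptions for illustration; the writers are injected so that the routing can be checked without touching real streams.

```typescript
type SnapshotDiagnostics = {
  count: number;
  p50Ms: number;
  p95Ms: number;
  maxMs: number;
};

type SnapshotResult = {
  tree: string;
  snapshotDiagnostics?: SnapshotDiagnostics;
};

type Writer = (text: string) => void;

// Hypothetical CLI printer: JSON mode keeps the machine-readable field,
// human mode prints only the tree to stdout and warns on stderr.
function printSnapshotResult(
  result: SnapshotResult,
  options: { json: boolean; slowP95Ms: number },
  stdout: Writer,
  stderr: Writer,
): void {
  if (options.json) {
    stdout(JSON.stringify(result) + "\n");
    return;
  }
  stdout(result.tree + "\n");
  const diagnostics = result.snapshotDiagnostics;
  if (diagnostics && diagnostics.p95Ms > options.slowP95Ms) {
    stderr(
      `warning: snapshot p95 ${diagnostics.p95Ms}ms exceeds ` +
        `${options.slowP95Ms}ms (count ${diagnostics.count}, ` +
        `max ${diagnostics.maxMs}ms)\n`,
    );
  }
}
```

In a Node CLI the two writers would typically wrap `process.stdout.write` and `process.stderr.write`. Keeping the warning off stdout means scripts that parse the snapshot tree see the same text whether or not captures were slow.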

Validation:

  • pnpm check:quick
  • pnpm format:check
  • pnpm exec vitest run src/__tests__/snapshot-diagnostics.test.ts src/__tests__/client-shared.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts src/commands/capture/index.test.ts
  • pnpm exec vitest run src/daemon/handlers/__tests__/session-replay-vars.test.ts -t "reports snapshot diagnostics"

One local note: running the full session-replay-vars file still hits the existing http.post child-process test failure on my machine before the server writes stdout; the new diagnostics test passes in isolation.

@thymikee

Member

Thanks for the quick update. The two original findings are materially improved: replay now reads per-action session samples, and snapshot warnings are no longer injected into the snapshot tree stdout path.

I still see one remaining blocker for #596: failed replay/test runs can still drop the snapshot diagnostics. In runReplayScriptFile, samples are collected before each early failure return, but snapshotDiagnosticsSummary is only built on the success path after the loop. The early returns at src/daemon/handlers/session-replay-runtime.ts:115-147 call withReplayFailureContext, and that helper only copies the original error details/artifacts at src/daemon/handlers/session-replay-runtime.ts:229-255. Then buildReplayTestFailedResult tries to read diagnostics from outcome.finalResponse at src/daemon/handlers/session-test-attempt.ts:315-317, but failed replay responses never carry the summary, so the final test/replay result loses slow snapshot evidence exactly when a flaky failure needs it most.

Please attach the per-run snapshot diagnostics to failure responses too, preferably without putting raw samples in JSON output. A focused regression should make a replay action record snapshot samples and then fail, and assert both replay failure details/result shape and test failed case/final aggregate expose snapshotDiagnostics.

Focused validation I reran in /private/tmp/agent-device-pr-798-rereview passed:

  • pnpm exec vitest run src/__tests__/snapshot-diagnostics.test.ts src/__tests__/client-shared.test.ts src/daemon/handlers/__tests__/session-test-suite.test.ts src/commands/capture/index.test.ts

I would still hold maintainer-triggered CI/approval until that failed-run diagnostics path is covered.

@thymikee thymikee left a comment

Member

I pushed 23f7f6bad to this branch to address the remaining failed-run diagnostics gap.

What changed:

  • Replay failure responses now include the bounded snapshotDiagnostics summary under error.details.snapshotDiagnostics when samples were captured before the failure.
  • agent-device test failed cases now read diagnostics from failed replay response details, so failed tests and the suite aggregate preserve the slow snapshot signal.
  • Added focused regressions for failed replay diagnostics and failed suite aggregation.
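To make the failure-path behavior concrete, here is a minimal sketch under stated assumptions: the response shape, `runReplay`, and `readDiagnostics` are hypothetical stand-ins, and only the `error.details.snapshotDiagnostics` location comes from the description above. The summary is built from samples recorded before the failure, and raw samples are not placed in the response.

```typescript
type Summary = { count: number; p95Ms: number; maxMs: number };

// Hypothetical response union; the real daemon types are not shown in the PR.
type ReplayResponse =
  | { ok: true; data: { snapshotDiagnostics?: Summary } }
  | {
      ok: false;
      error: { message: string; details: { snapshotDiagnostics?: Summary } };
    };

// An action may record snapshot timings and may throw afterwards.
type Action = (record: (durationMs: number) => void) => void;

function summarize(samples: number[]): Summary | undefined {
  if (samples.length === 0) return undefined;
  const sorted = [...samples].sort((a, b) => a - b);
  const p95Index = Math.max(0, Math.ceil(0.95 * sorted.length) - 1);
  return {
    count: sorted.length,
    p95Ms: sorted[p95Index] ?? 0,
    maxMs: sorted[sorted.length - 1] ?? 0,
  };
}

function runReplay(actions: Action[]): ReplayResponse {
  const samples: number[] = [];
  const record = (durationMs: number): void => {
    samples.push(durationMs);
  };
  for (let i = 0; i < actions.length; i++) {
    const action = actions[i];
    if (!action) continue;
    try {
      action(record);
    } catch (cause) {
      const reason = cause instanceof Error ? cause.message : String(cause);
      // Build the summary on the failure path too, not only after the loop.
      const snapshotDiagnostics = summarize(samples);
      return {
        ok: false,
        error: {
          message: `action ${i + 1} failed: ${reason}`,
          details: snapshotDiagnostics ? { snapshotDiagnostics } : {},
        },
      };
    }
  }
  const snapshotDiagnostics = summarize(samples);
  return { ok: true, data: snapshotDiagnostics ? { snapshotDiagnostics } : {} };
}

// Reads diagnostics from either branch, as a test runner might.
function readDiagnostics(response: ReplayResponse): Summary | undefined {
  return response.ok
    ? response.data.snapshotDiagnostics
    : response.error.details.snapshotDiagnostics;
}
```

A regression along the lines requested in review would run one action that records a sample, a second that records a slow sample and then throws, and assert that `readDiagnostics` on the failed response still reports both samples.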

Local validation in /private/tmp/agent-device-pr-798:

  • pnpm exec vitest run src/daemon/handlers/__tests__/session-replay-vars.test.ts -t "snapshot diagnostics" passed.
  • pnpm exec vitest run src/daemon/handlers/__tests__/session-test-suite.test.ts -t "snapshot diagnostics" passed.
  • pnpm check:quick passed.
  • pnpm check:unit passed after rerunning outside the sandbox due to a pnpm registry verification failure.
  • pnpm format passed after the same pnpm registry verification rerun.

The previous blockers are addressed. Ready for CI on the new head.

@thymikee thymikee merged commit 98376b9 into callstack:main Jun 13, 2026
14 checks passed

Development

Successfully merging this pull request may close these issues.

Track and report slow snapshot diagnostics