
fix(agents-api): sanitize artifact binary data before persistence#2680

Open
mike-inkeep wants to merge 1 commit into main from fix/sanitize_binary_artifact_data

Conversation


@mike-inkeep mike-inkeep commented Mar 13, 2026

Summary

  • Sanitize inline artifact binary payloads before persistence by uploading base64 image/file parts to blob storage and replacing them with blob:// URIs.
  • Keep observability and naming prompts cleaner by stripping large binary payloads from artifact telemetry/prompt serialization paths.
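As a sketch of the shape change this describes (values are hypothetical; the blob key format is borrowed from the storage-key test suggested later in this review):

```typescript
// Hypothetical example only: an inline base64 image part before sanitization...
const inlinePart = {
  type: "image",
  mimeType: "image/png",
  data: "iVBORw0KGgoAAAANSUhEUgAA...", // long base64 payload (truncated here)
};

// ...and the same part after sanitization: the payload is replaced by a
// blob:// reference to the uploaded, content-addressed object.
const sanitizedPart = {
  type: "image",
  mimeType: "image/png",
  data: "blob://v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png",
};
```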


changeset-bot bot commented Mar 13, 2026

⚠️ No Changeset found

Latest commit: d3827b5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR


vercel bot commented Mar 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
agents-api | Ready | Preview, Comment | Mar 13, 2026 2:23am
agents-manage-ui | Ready | Preview, Comment | Mar 13, 2026 2:23am

1 Skipped Deployment

Project | Deployment | Actions | Updated (UTC)
agents-docs | Skipped | | Mar 13, 2026 2:23am



@pullfrog pullfrog bot left a comment


Solid PR — the recursive sanitizer, content-addressed dedup, and cycle protection are well-designed. Two medium-severity issues: summaryData is still serialized raw into OTEL span attributes (inconsistent with the stripping applied to data and full), and unbounded Promise.all on array elements could overwhelm blob storage for artifacts with many inline images. A few smaller findings in tests and the upload path.


@@ -1664,7 +1670,7 @@ Make the name extremely specific to what this tool call actually returned, not g
'artifact.type': artifactData.artifactType,
'artifact.summary': JSON.stringify(artifactData.summaryData, null, 2),

artifact.summary serializes raw artifactData.summaryData without stripBinaryDataForObservability, while artifact.full on the next line and artifact.data at line 1370 are both stripped. Base64 blobs in summaryData will still bloat OTEL spans. Should be JSON.stringify(stripBinaryDataForObservability(artifactData.summaryData), null, 2). Same issue at line 1710.
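For context, a minimal stand-in for the kind of stripping intended here (stripBinaryDataForObservability is the PR's helper; this sketch only illustrates the idea of replacing long base64-looking strings with a placeholder and is not the actual implementation):

```typescript
// Illustrative stand-in, not the PR's stripBinaryDataForObservability.
function stripBinaryForObservability(value: unknown): unknown {
  // Replace long base64-looking strings with a short placeholder.
  if (
    typeof value === "string" &&
    value.length > 100 &&
    /^[A-Za-z0-9+/=\n]+$/.test(value)
  ) {
    return `[stripped ${value.length}-char binary payload]`;
  }
  // Recurse into arrays and plain objects; leave everything else untouched.
  if (Array.isArray(value)) {
    return value.map(stripBinaryForObservability);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [
        k,
        stripBinaryForObservability(v),
      ])
    );
  }
  return value;
}
```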

}
inStack.add(current);
try {
  return Promise.all(current.map((item) => visit(item)));

Promise.all(current.map(visit)) fires all uploads concurrently with no limit. An artifact with dozens of inline images would slam blob storage with parallel requests. Consider capping concurrency (e.g., p-limit or a manual batch of ~5).
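A manual worker-pool cap along these lines could work without adding a dependency (sketch only; mapWithConcurrency is a hypothetical helper name, not part of the PR):

```typescript
// Hypothetical helper to cap how many uploads run at once.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker claims the next unprocessed index until the array is drained.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => worker())
  );
  return results;
}
```

The call site would then read `mapWithConcurrency(current, 5, (item) => visit(item))` instead of `Promise.all(current.map((item) => visit(item)))`.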

Comment on lines +73 to +74
if (isInlineBinaryPart(current)) {
  return uploadInlinePart(current, ctx);

When isInlineBinaryPart matches, uploadInlinePart does Buffer.from(part.data, 'base64'). If a non-base64 string happens to satisfy the heuristic (type is 'image'/'file', data > 100 chars, not a URL), Buffer.from silently produces garbage bytes that get uploaded. Consider either a fast base64 regex guard (/^[A-Za-z0-9+/=\n]+$/) or wrapping the upload in a try/catch that leaves the original data intact on failure.
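A quick guard along the lines suggested (regex taken from this comment; the helper name is illustrative):

```typescript
// Heuristic structural check to run before Buffer.from(data, 'base64').
// This does not prove the string is valid base64, but rejects obvious
// non-base64 inputs (URLs, data: URIs, prose) cheaply.
const BASE64_CHARS = /^[A-Za-z0-9+/=\n]+$/;

function looksLikeBase64(data: string): boolean {
  return BASE64_CHARS.test(data);
}
```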

  expect(result.type).toBe('image');
  expect(result.data).toMatch(/^blob:\/\//);
  expect(result.mimeType).toBe('image/png');
});

The sanitizeArtifactBinaryData suite only tests type: 'image' parts. The implementation also handles 'file' — add a test with { type: 'file', data: LARGE_BASE64, mimeType: 'application/pdf' } to cover that branch.

Comment on lines +172 to +177
it('produces a deterministic blob:// URI via content hash', async () => {
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  const r1 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  const r2 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  expect(r1.data).toBe(r2.data);
});

This determinism test is tautological: buildStorageKey is mocked as a pure function of its inputs, so identical inputs always yield identical outputs regardless of whether the real implementation uses content-hashing. Either un-mock buildStorageKey here or pass two inputs with different metadata but identical data and assert the blob URIs match.
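The determinism a real test should pin down comes from content hashing: identical payload bytes yield an identical digest regardless of surrounding metadata. A minimal Node sketch of that property (contentKey is illustrative, not the PR's buildStorageKey):

```typescript
import { createHash } from "node:crypto";

// Identical bytes always produce the same content-addressed key suffix.
function contentKey(data: Buffer): string {
  return `sha256-${createHash("sha256").update(data).digest("hex")}`;
}
```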

  toolCallId: 'tool-call-1',
});

expect(sanitizeArtifactBinaryDataMock).toHaveBeenCalledTimes(2);

The test asserts sanitizeArtifactBinaryDataMock was called twice but never verifies the arguments. Assert that the first call received (artifact.data, { tenantId, projectId, artifactId }) and the second received (artifact.summaryData, ...) to ensure the correct context is forwarded.


@claude claude bot left a comment


PR Review Summary

(3) Total Issues | Risk: Medium

🟠⚠️ Major (1) 🟠⚠️

🟠 1) artifact-binary-sanitizer.test.ts Missing test coverage for blob upload failures

Issue: The test suite for sanitizeArtifactBinaryData has no test cases for upload failure scenarios. All tests mock upload to resolve successfully (mockResolvedValue(undefined)). There are no tests verifying behavior when blob storage upload throws an error, one of multiple uploads fails, or the blob storage provider cannot be initialized.

Why: Without tests for failure paths, there's no verification that errors propagate correctly to ArtifactService.saveArtifact callers or that future changes don't accidentally swallow upload failures. If blob storage is temporarily unavailable, artifacts could fail to save without clear error messages. The caller AgentSession.processArtifact does have error handling, but that behavior isn't verified by tests.

Fix: Add test cases for error scenarios:

it('propagates upload errors when blob storage fails', async () => {
  mockUpload.mockRejectedValueOnce(new Error('S3 rate limit exceeded'));
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('S3 rate limit exceeded');
});

it('fails entire operation if any upload fails in nested structure', async () => {
  mockUpload
    .mockResolvedValueOnce(undefined)  // First upload succeeds
    .mockRejectedValueOnce(new Error('Upload failed'));  // Second fails
  
  const input = {
    images: [
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' },
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/jpeg' },
    ],
  };
  
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Upload failed');
  expect(mockUpload).toHaveBeenCalledTimes(2); // Both attempted
});


🟡 Minor (2) 🟡

🟡 1) storage-keys.test.ts Missing test for artifact-data storage key building

Issue: The storage-keys.test.ts file tests the media category key building but was not updated to include tests for the new artifact-data category.

Why: Changes to the artifact-data key format (path structure, prefixes, hash format) could break blob storage retrieval without failing any tests.

Fix: Add a parallel test case:

it('builds versioned artifact-data storage key', () => {
  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: 'tenant-1',
    projectId: 'project-1',
    artifactId: 'artifact-1',
    contentHash: 'abc123',
    ext: 'png',
  });

  expect(key).toBe(
    'v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png'
  );
});


🟡 2) artifact-binary-sanitizer.test.ts No test for data: URI passthrough

Issue: The isInlineBinaryPart function explicitly excludes strings starting with data: from being treated as inline binary data (line 30), but there's no test verifying this behavior. HTTP URLs are tested, but data URIs are a different bypass path.

Why: A regression that removes the data: check could cause data URIs to be incorrectly treated as base64 data, uploaded to blob storage, and corrupted.

Fix: Add test for data URI passthrough:

it('does not upload data: URIs', async () => {
  const input = { type: 'image', data: 'data:image/png;base64,' + 'A'.repeat(200), mimeType: 'image/png' };
  await sanitizeArtifactBinaryData(input, CTX);
  expect(mockUpload).not.toHaveBeenCalled();
});


💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: artifact-binary-sanitizer.ts:34-56 Add logging for blob uploads (consistency with peer file)

💡 APPROVE WITH SUGGESTIONS

Summary: This PR correctly addresses the root cause of binary data persisting to the database by sanitizing both data and summaryData at the saveArtifact chokepoint. The implementation is well-targeted with proper circular reference handling. The main gaps are test coverage for error paths and a missing storage key test. The observability logging suggestion is a nice-to-have for debugging in production.


Discarded (3)

  • storage-keys.ts:37-45 (Missing parseArtifactDataStorageKey() helper function): Artifact blob URIs don't use message resolution; resolveMessageBlobUris only handles the media category. Artifacts are returned with blob URIs intact, and the client handles resolution separately.
  • artifact-binary-sanitizer.ts:70-105 (WeakSet clears references prematurely for diamond patterns): This is correct stack-based cycle detection behavior. Content-addressed keys prevent duplicate storage; redundant uploads are a minor performance concern that doesn't warrant a complexity increase.
  • ArtifactService.ts:890-901 (Double sanitization for artifacts with both data and summaryData): Content-addressed storage prevents actual duplication. While this doubles network I/O for overlapping content, it's acceptable for correctness and simplicity. Optimization can be addressed separately if profiling shows impact.
Reviewers (7)

Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
pr-review-sre | 6 | 0 | 0 | 0 | 0 | 0 | 6
pr-review-tests | 5 | 2 | 0 | 0 | 0 | 0 | 3
pr-review-consistency | 5 | 0 | 1 | 0 | 1 | 0 | 3
pr-review-errors | 2 | 1 | 0 | 0 | 0 | 0 | 1
pr-review-types | 3 | 0 | 0 | 0 | 0 | 0 | 3
pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0
pr-review-precision | 0 | 0 | 0 | 0 | 0 | 0 | 0
Total | 21 | 3 | 1 | 0 | 1 | 0 | 16

Note: pr-review-precision confirmed implementation is correctly targeted at the root cause. pr-review-standards found no issues.

Comment on lines +34 to +56
async function uploadInlinePart(
  part: InlineBinaryPart,
  ctx: ArtifactBinaryContext
): Promise<InlineBinaryPart> {
  const storage = getBlobStorageProvider();
  const buffer = Buffer.from(part.data, 'base64');
  const mimeType = part.mimeType ?? 'application/octet-stream';
  const contentHash = createHash('sha256').update(buffer).digest('hex');
  const ext = getExtensionFromMimeType(mimeType);

  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: ctx.tenantId,
    projectId: ctx.projectId,
    artifactId: ctx.artifactId,
    contentHash,
    ext,
  });

  await storage.upload({ key, data: buffer, contentType: mimeType });

  return { ...part, data: toBlobUri(key) };
}

💭 Consider: Add logging for blob uploads

Issue: This file has no logging, while the peer file image-upload.ts includes debug logging for successful uploads and error/warn logging for failures.

Why: Without logging, debugging artifact upload issues in production will require tracing through multiple layers. Debug logs help identify which artifacts uploaded successfully and what their storage keys are.

Fix: Add logging consistent with the pattern in image-upload.ts:

import { getLogger } from '../../../../logger';

const logger = getLogger('artifact-binary-sanitizer');

// Then after the upload:
logger.debug({ key, mimeType, size: buffer.length }, 'Uploaded artifact binary to blob storage');


@github-actions github-actions bot deleted a comment from claude bot Mar 13, 2026

itoqa bot commented Mar 13, 2026

Ito Test Report ✅

7 test cases ran. 7 passed.

✅ Verification focused on route health, artifact-sanitization behavior, authorization boundaries, and stress interaction outcomes that had reproducible evidence and code-path support.

✅ Passed (7)

  • ROUTE-1 (31:35, ROUTE-1_31-35.png): Playground chat completed successfully and direct /run/api/chat request returned HTTP 200 with valid completion payload.
  • ROUTE-2 (31:35, ROUTE-2_31-35.png): Initial request through port 3000 returned 404, then direct agents API endpoint returned HTTP 200 with valid SSE completion stream.
  • ROUTE-3 (31:35, ROUTE-3_31-35.png): Conversation traces endpoint returned HTTP 200 and included sanitized blob artifactData path with artifact-data category shape.
  • EDGE-6 (47:46, EDGE-6_47-46.png): Verified controlled failure behavior via traces API fixture: artifact_processing returned error with sanitized artifactData placeholder and no unsanitized binary payload persisted.
  • ADV-1 (47:50, ADV-1_47-50.png): Verified malicious-binary fixture is neutralized before persistence: traces payload reports sanitized blob URI in artifactData with successful artifact_processing and no raw executable payload.
  • ADV-2 (41:04, ADV-2_41-04.png): Forged foreign scope headers were rejected with a not_found project response while the same token succeeded on the primary scope, demonstrating scope boundary enforcement for this endpoint path.
  • ADV-3 (49:30, ADV-3_49-30.png): Ran 10 rapid send/stop cycles in one conversation and observed stable behavior with no visible long base64-like leakage in the rendered output.

  // Strip binary blobs before serializing — base64 is useless noise for the naming LLM
  const fullDataStr = JSON.stringify(
-   artifactData.data || artifactData.summaryData || {},
+   stripBinaryDataForObservability(artifactData.data || artifactData.summaryData || {}),

Does the naming LLM get the image then? How does it name the image or produce a description here?
