
fix(agents-api): sanitize artifact binary data before persistence#2680

Open
mike-inkeep wants to merge 1 commit into main from fix/sanitize_binary_artifact_data

Conversation


@mike-inkeep mike-inkeep commented Mar 13, 2026

Summary

  • Sanitize inline artifact binary payloads before persistence by uploading base64 image/file parts to blob storage and replacing them with blob:// URIs.
  • Keep observability and naming prompts cleaner by stripping large binary payloads from artifact telemetry/prompt serialization paths.
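As a sketch of the shape change this describes (values are hypothetical; the blob key format is borrowed from the storage-key test suggested later in this review):

```typescript
// Hypothetical example only: an inline base64 image part before sanitization...
const inlinePart = {
  type: "image",
  mimeType: "image/png",
  data: "iVBORw0KGgoAAAANSUhEUgAA...", // long base64 payload (truncated here)
};

// ...and the same part after sanitization: the payload is replaced by a
// blob:// reference to the uploaded, content-addressed object.
const sanitizedPart = {
  type: "image",
  mimeType: "image/png",
  data: "blob://v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png",
};
```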


changeset-bot bot commented Mar 13, 2026

⚠️ No Changeset found

Latest commit: d3827b5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR


vercel bot commented Mar 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
agents-api | Ready | Preview, Comment | Mar 13, 2026 2:23am
agents-manage-ui | Ready | Preview, Comment | Mar 13, 2026 2:23am

1 Skipped Deployment

Project | Deployment | Actions | Updated (UTC)
agents-docs | Skipped | | Mar 13, 2026 2:23am



@pullfrog pullfrog bot left a comment


Solid PR — the recursive sanitizer, content-addressed dedup, and cycle protection are well-designed. Two medium-severity issues: summaryData is still serialized raw into OTEL span attributes (inconsistent with the stripping applied to data and full), and unbounded Promise.all on array elements could overwhelm blob storage for artifacts with many inline images. A few smaller findings in tests and the upload path.


@@ -1664,7 +1670,7 @@ Make the name extremely specific to what this tool call actually returned, not g
'artifact.type': artifactData.artifactType,
'artifact.summary': JSON.stringify(artifactData.summaryData, null, 2),

artifact.summary serializes raw artifactData.summaryData without stripBinaryDataForObservability, while artifact.full on the next line and artifact.data at line 1370 are both stripped. Base64 blobs in summaryData will still bloat OTEL spans. Should be JSON.stringify(stripBinaryDataForObservability(artifactData.summaryData), null, 2). Same issue at line 1710.
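For context, a minimal stand-in for the kind of stripping intended here (stripBinaryDataForObservability is the PR's helper; this sketch only illustrates the idea of replacing long base64-looking strings with a placeholder and is not the actual implementation):

```typescript
// Illustrative stand-in, not the PR's stripBinaryDataForObservability.
function stripBinaryForObservability(value: unknown): unknown {
  // Replace long base64-looking strings with a short placeholder.
  if (
    typeof value === "string" &&
    value.length > 100 &&
    /^[A-Za-z0-9+/=\n]+$/.test(value)
  ) {
    return `[stripped ${value.length}-char binary payload]`;
  }
  // Recurse into arrays and plain objects; leave everything else untouched.
  if (Array.isArray(value)) {
    return value.map(stripBinaryForObservability);
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([k, v]) => [
        k,
        stripBinaryForObservability(v),
      ])
    );
  }
  return value;
}
```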

}
inStack.add(current);
try {
  return Promise.all(current.map((item) => visit(item)));

Promise.all(current.map(visit)) fires all uploads concurrently with no limit. An artifact with dozens of inline images would slam blob storage with parallel requests. Consider capping concurrency (e.g., p-limit or a manual batch of ~5).
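A manual worker-pool cap along these lines could work without adding a dependency (sketch only; mapWithConcurrency is a hypothetical helper name, not part of the PR):

```typescript
// Hypothetical helper to cap how many uploads run at once.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker claims the next unprocessed index until the array is drained.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, () => worker())
  );
  return results;
}
```

The call site would then read `mapWithConcurrency(current, 5, (item) => visit(item))` instead of `Promise.all(current.map((item) => visit(item)))`.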

Comment on lines +73 to +74
if (isInlineBinaryPart(current)) {
  return uploadInlinePart(current, ctx);

When isInlineBinaryPart matches, uploadInlinePart does Buffer.from(part.data, 'base64'). If a non-base64 string happens to satisfy the heuristic (type is 'image'/'file', data > 100 chars, not a URL), Buffer.from silently produces garbage bytes that get uploaded. Consider either a fast base64 regex guard (/^[A-Za-z0-9+/=\n]+$/) or wrapping the upload in a try/catch that leaves the original data intact on failure.
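A quick guard along the lines suggested (regex taken from this comment; the helper name is illustrative):

```typescript
// Heuristic structural check to run before Buffer.from(data, 'base64').
// This does not prove the string is valid base64, but rejects obvious
// non-base64 inputs (URLs, data: URIs, prose) cheaply.
const BASE64_CHARS = /^[A-Za-z0-9+/=\n]+$/;

function looksLikeBase64(data: string): boolean {
  return BASE64_CHARS.test(data);
}
```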

  expect(result.type).toBe('image');
  expect(result.data).toMatch(/^blob:\/\//);
  expect(result.mimeType).toBe('image/png');
});

The sanitizeArtifactBinaryData suite only tests type: 'image' parts. The implementation also handles 'file' — add a test with { type: 'file', data: LARGE_BASE64, mimeType: 'application/pdf' } to cover that branch.

Comment on lines +172 to +177
it('produces a deterministic blob:// URI via content hash', async () => {
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  const r1 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  const r2 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  expect(r1.data).toBe(r2.data);
});

This determinism test is tautological: buildStorageKey is mocked as a pure function of its inputs, so identical inputs always yield identical outputs regardless of whether the real implementation uses content-hashing. Either un-mock buildStorageKey here or pass two inputs with different metadata but identical data and assert the blob URIs match.
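The determinism a real test should pin down comes from content hashing: identical payload bytes yield an identical digest regardless of surrounding metadata. A minimal Node sketch of that property (contentKey is illustrative, not the PR's buildStorageKey):

```typescript
import { createHash } from "node:crypto";

// Identical bytes always produce the same content-addressed key suffix.
function contentKey(data: Buffer): string {
  return `sha256-${createHash("sha256").update(data).digest("hex")}`;
}
```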

  toolCallId: 'tool-call-1',
});

expect(sanitizeArtifactBinaryDataMock).toHaveBeenCalledTimes(2);

The test asserts sanitizeArtifactBinaryDataMock was called twice but never verifies the arguments. Assert that the first call received (artifact.data, { tenantId, projectId, artifactId }) and the second received (artifact.summaryData, ...) to ensure the correct context is forwarded.


@claude claude bot left a comment


PR Review Summary

(3) Total Issues | Risk: Medium

🟠⚠️ Major (1) 🟠⚠️

🟠 1) artifact-binary-sanitizer.test.ts Missing test coverage for blob upload failures

Issue: The test suite for sanitizeArtifactBinaryData has no test cases for upload failure scenarios. All tests mock upload to resolve successfully (mockResolvedValue(undefined)). There are no tests verifying behavior when blob storage upload throws an error, one of multiple uploads fails, or the blob storage provider cannot be initialized.

Why: Without tests for failure paths, there's no verification that errors propagate correctly to ArtifactService.saveArtifact callers or that future changes don't accidentally swallow upload failures. If blob storage is temporarily unavailable, artifacts could fail to save without clear error messages. The caller AgentSession.processArtifact does have error handling, but that behavior isn't verified by tests.

Fix: Add test cases for error scenarios:

it('propagates upload errors when blob storage fails', async () => {
  mockUpload.mockRejectedValueOnce(new Error('S3 rate limit exceeded'));
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('S3 rate limit exceeded');
});

it('fails entire operation if any upload fails in nested structure', async () => {
  mockUpload
    .mockResolvedValueOnce(undefined)  // First upload succeeds
    .mockRejectedValueOnce(new Error('Upload failed'));  // Second fails
  
  const input = {
    images: [
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' },
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/jpeg' },
    ],
  };
  
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Upload failed');
  expect(mockUpload).toHaveBeenCalledTimes(2); // Both attempted
});


🟡 Minor (2) 🟡

🟡 1) storage-keys.test.ts Missing test for artifact-data storage key building

Issue: The storage-keys.test.ts file tests the media category key building but was not updated to include tests for the new artifact-data category.

Why: Changes to the artifact-data key format (path structure, prefixes, hash format) could break blob storage retrieval without failing any tests.

Fix: Add a parallel test case:

it('builds versioned artifact-data storage key', () => {
  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: 'tenant-1',
    projectId: 'project-1',
    artifactId: 'artifact-1',
    contentHash: 'abc123',
    ext: 'png',
  });

  expect(key).toBe(
    'v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png'
  );
});


🟡 2) artifact-binary-sanitizer.test.ts No test for data: URI passthrough

Issue: The isInlineBinaryPart function explicitly excludes strings starting with data: from being treated as inline binary data (line 30), but there's no test verifying this behavior. HTTP URLs are tested, but data URIs are a different bypass path.

Why: A regression that removes the data: check could cause data URIs to be incorrectly treated as base64 data, uploaded to blob storage, and corrupted.

Fix: Add test for data URI passthrough:

it('does not upload data: URIs', async () => {
  const input = { type: 'image', data: 'data:image/png;base64,' + 'A'.repeat(200), mimeType: 'image/png' };
  await sanitizeArtifactBinaryData(input, CTX);
  expect(mockUpload).not.toHaveBeenCalled();
});


💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: artifact-binary-sanitizer.ts:34-56 Add logging for blob uploads (consistency with peer file)

💡 APPROVE WITH SUGGESTIONS

Summary: This PR correctly addresses the root cause of binary data persisting to the database by sanitizing both data and summaryData at the saveArtifact chokepoint. The implementation is well-targeted with proper circular reference handling. The main gaps are test coverage for error paths and a missing storage key test. The observability logging suggestion is a nice-to-have for debugging in production.


Discarded (3)

  • storage-keys.ts:37-45 (Missing parseArtifactDataStorageKey() helper function): Artifact blob URIs don't use message resolution; resolveMessageBlobUris only handles the media category. Artifacts are returned with blob URIs intact, and the client handles resolution separately.
  • artifact-binary-sanitizer.ts:70-105 (WeakSet clears references prematurely for diamond patterns): This is correct stack-based cycle detection behavior. Content-addressed keys prevent duplicate storage; redundant uploads are a minor performance concern that doesn't warrant a complexity increase.
  • ArtifactService.ts:890-901 (Double sanitization for artifacts with both data and summaryData): Content-addressed storage prevents actual duplication. While this doubles network I/O for overlapping content, it's acceptable for correctness and simplicity. Optimization can be addressed separately if profiling shows impact.
Reviewers (7)

Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
pr-review-sre | 6 | 0 | 0 | 0 | 0 | 0 | 6
pr-review-tests | 5 | 2 | 0 | 0 | 0 | 0 | 3
pr-review-consistency | 5 | 0 | 1 | 0 | 1 | 0 | 3
pr-review-errors | 2 | 1 | 0 | 0 | 0 | 0 | 1
pr-review-types | 3 | 0 | 0 | 0 | 0 | 0 | 3
pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0
pr-review-precision | 0 | 0 | 0 | 0 | 0 | 0 | 0
Total | 21 | 3 | 1 | 0 | 1 | 0 | 16

Note: pr-review-precision confirmed implementation is correctly targeted at the root cause. pr-review-standards found no issues.

Comment on lines +34 to +56
async function uploadInlinePart(
  part: InlineBinaryPart,
  ctx: ArtifactBinaryContext
): Promise<InlineBinaryPart> {
  const storage = getBlobStorageProvider();
  const buffer = Buffer.from(part.data, 'base64');
  const mimeType = part.mimeType ?? 'application/octet-stream';
  const contentHash = createHash('sha256').update(buffer).digest('hex');
  const ext = getExtensionFromMimeType(mimeType);

  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: ctx.tenantId,
    projectId: ctx.projectId,
    artifactId: ctx.artifactId,
    contentHash,
    ext,
  });

  await storage.upload({ key, data: buffer, contentType: mimeType });

  return { ...part, data: toBlobUri(key) };
}

💭 Consider: Add logging for blob uploads

Issue: This file has no logging, while the peer file image-upload.ts includes debug logging for successful uploads and error/warn logging for failures.

Why: Without logging, debugging artifact upload issues in production will require tracing through multiple layers. Debug logs help identify which artifacts uploaded successfully and what their storage keys are.

Fix: Add logging consistent with the pattern in image-upload.ts:

import { getLogger } from '../../../../logger';

const logger = getLogger('artifact-binary-sanitizer');

// Then after the upload:
logger.debug({ key, mimeType, size: buffer.length }, 'Uploaded artifact binary to blob storage');


@github-actions github-actions bot deleted a comment from claude bot Mar 13, 2026

itoqa bot commented Mar 13, 2026

Ito Test Report ✅

7 test cases ran. 7 passed.

✅ Verification focused on route health, artifact-sanitization behavior, authorization boundaries, and stress interaction outcomes that had reproducible evidence and code-path support.

✅ Passed (7)

  • ROUTE-1 (31:35, ROUTE-1_31-35.png): Playground chat completed successfully and direct /run/api/chat request returned HTTP 200 with valid completion payload.
  • ROUTE-2 (31:35, ROUTE-2_31-35.png): Initial request through port 3000 returned 404, then direct agents API endpoint returned HTTP 200 with valid SSE completion stream.
  • ROUTE-3 (31:35, ROUTE-3_31-35.png): Conversation traces endpoint returned HTTP 200 and included sanitized blob artifactData path with artifact-data category shape.
  • EDGE-6 (47:46, EDGE-6_47-46.png): Verified controlled failure behavior via traces API fixture: artifact_processing returned error with sanitized artifactData placeholder and no unsanitized binary payload persisted.
  • ADV-1 (47:50, ADV-1_47-50.png): Verified malicious-binary fixture is neutralized before persistence: traces payload reports sanitized blob URI in artifactData with successful artifact_processing and no raw executable payload.
  • ADV-2 (41:04, ADV-2_41-04.png): Forged foreign scope headers were rejected with a not_found project response while the same token succeeded on the primary scope, demonstrating scope boundary enforcement for this endpoint path.
  • ADV-3 (49:30, ADV-3_49-30.png): Ran 10 rapid send/stop cycles in one conversation and observed stable behavior with no visible long base64-like leakage in the rendered output.

  // Strip binary blobs before serializing — base64 is useless noise for the naming LLM
  const fullDataStr = JSON.stringify(
-   artifactData.data || artifactData.summaryData || {},
+   stripBinaryDataForObservability(artifactData.data || artifactData.summaryData || {}),

Does the naming LLM get the image then? How does it name the image or produce a description here?
