fix(agents-api): sanitize artifact binary data before persistence #2680
mike-inkeep wants to merge 1 commit into main from
Conversation
Solid PR — the recursive sanitizer, content-addressed dedup, and cycle protection are well-designed. Two medium-severity issues: summaryData is still serialized raw into OTEL span attributes (inconsistent with the stripping applied to data and full), and unbounded Promise.all on array elements could overwhelm blob storage for artifacts with many inline images. A few smaller findings in tests and the upload path.
```diff
@@ -1664,7 +1670,7 @@ Make the name extremely specific to what this tool call actually returned, not g
  'artifact.type': artifactData.artifactType,
  'artifact.summary': JSON.stringify(artifactData.summaryData, null, 2),
```
artifact.summary serializes raw artifactData.summaryData without stripBinaryDataForObservability, while artifact.full on the next line and artifact.data at line 1370 are both stripped. Base64 blobs in summaryData will still bloat OTEL spans. Should be JSON.stringify(stripBinaryDataForObservability(artifactData.summaryData), null, 2). Same issue at line 1710.
```ts
}
inStack.add(current);
try {
  return Promise.all(current.map((item) => visit(item)));
```
Promise.all(current.map(visit)) fires all uploads concurrently with no limit. An artifact with dozens of inline images would slam blob storage with parallel requests. Consider capping concurrency (e.g., p-limit or a manual batch of ~5).
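One dependency-free way to cap the fan-out is a small worker-pool helper. This is a sketch only; `mapWithConcurrency` is a hypothetical name, not part of this PR:

```typescript
// Hypothetical helper (not in the PR): maps over items with at most `limit`
// callbacks in flight at once, preserving input order in the results.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker synchronously claims the next index, so no two workers
  // process the same item; JS single-threading makes next++ race-free.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

The array branch of `visit` could then become `return mapWithConcurrency(current, 5, (item) => visit(item));`, keeping at most five uploads in flight per array.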
```ts
if (isInlineBinaryPart(current)) {
  return uploadInlinePart(current, ctx);
```
When isInlineBinaryPart matches, uploadInlinePart does Buffer.from(part.data, 'base64'). If a non-base64 string happens to satisfy the heuristic (type is 'image'/'file', data > 100 chars, not a URL), Buffer.from silently produces garbage bytes that get uploaded. Consider either a fast base64 regex guard (/^[A-Za-z0-9+/=\n]+$/) or wrapping the upload in a try/catch that leaves the original data intact on failure.
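A structural pre-check along these lines would reject most non-base64 strings before decoding. `isLikelyBase64` is a hypothetical guard sketched here, not code from the PR:

```typescript
// Hypothetical guard (not in the PR): cheap structural check that a string is
// plausibly base64 before handing it to Buffer.from, which does not throw on
// invalid input and instead silently skips the characters it cannot decode.
function isLikelyBase64(s: string): boolean {
  // Tolerate embedded whitespace/newlines, then require the canonical shape:
  // alphabet characters only, optional '=' padding, length a multiple of 4.
  const stripped = s.replace(/\s+/g, '');
  return stripped.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(stripped);
}
```

`uploadInlinePart` could call this before `Buffer.from(part.data, 'base64')` and return the part unchanged when the check fails.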
```ts
expect(result.type).toBe('image');
expect(result.data).toMatch(/^blob:\/\//);
expect(result.mimeType).toBe('image/png');
});
```
The sanitizeArtifactBinaryData suite only tests type: 'image' parts. The implementation also handles 'file' — add a test with { type: 'file', data: LARGE_BASE64, mimeType: 'application/pdf' } to cover that branch.
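For reference, the heuristic as described in this review (type `'image'` or `'file'`, string data over 100 chars, not a URL or `data:` URI) can be illustrated standalone. `looksLikeInlineBinaryPart` is a hypothetical reimplementation for illustration, not the code under review:

```typescript
type MaybePart = { type?: string; data?: unknown; mimeType?: string };

// Sketch of the heuristic as described in this review; the real check lives
// in artifact-binary-sanitizer.ts and may differ in detail.
function looksLikeInlineBinaryPart(part: MaybePart): boolean {
  return (
    (part.type === 'image' || part.type === 'file') &&
    typeof part.data === 'string' &&
    part.data.length > 100 &&
    !part.data.startsWith('http') &&
    !part.data.startsWith('blob://') &&
    !part.data.startsWith('data:')
  );
}
```

Under this sketch, a `{ type: 'file', data: LARGE_BASE64, mimeType: 'application/pdf' }` part matches exactly like an image part does, which is the branch the suite currently leaves untested.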
```ts
it('produces a deterministic blob:// URI via content hash', async () => {
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  const r1 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  const r2 = (await sanitizeArtifactBinaryData(input, CTX)) as any;
  expect(r1.data).toBe(r2.data);
});
```
This determinism test is tautological: buildStorageKey is mocked as a pure function of its inputs, so identical inputs always yield identical outputs regardless of whether the real implementation uses content-hashing. Either un-mock buildStorageKey here or pass two inputs with different metadata but identical data and assert the blob URIs match.
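The property worth asserting is that the content hash depends only on the decoded bytes, never on surrounding metadata. A minimal standalone illustration (`contentHashOf` is a hypothetical name mirroring the hashing step in `uploadInlinePart`):

```typescript
import { createHash } from 'node:crypto';

// The content hash is a function of the decoded bytes only; mimeType and
// artifact IDs never feed into it, so identical data always maps to the
// same content-hash segment of the storage key.
function contentHashOf(base64Data: string): string {
  const buffer = Buffer.from(base64Data, 'base64');
  return createHash('sha256').update(buffer).digest('hex');
}
```

A non-tautological test would hash two parts that differ in metadata but share `data` and assert the resulting blob URIs match.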
```ts
toolCallId: 'tool-call-1',
});

expect(sanitizeArtifactBinaryDataMock).toHaveBeenCalledTimes(2);
```
The test asserts sanitizeArtifactBinaryDataMock was called twice but never verifies the arguments. Assert that the first call received (artifact.data, { tenantId, projectId, artifactId }) and the second received (artifact.summaryData, ...) to ensure the correct context is forwarded.
PR Review Summary
3 Total Issues | Risk: Medium
🟠 Major (1)
🟠 1) artifact-binary-sanitizer.test.ts — Missing test coverage for blob upload failures
Issue: The test suite for sanitizeArtifactBinaryData has no test cases for upload failure scenarios. All tests mock upload to resolve successfully (mockResolvedValue(undefined)). There are no tests verifying behavior when blob storage upload throws an error, one of multiple uploads fails, or the blob storage provider cannot be initialized.
Why: Without tests for failure paths, there's no verification that errors propagate correctly to ArtifactService.saveArtifact callers or that future changes don't accidentally swallow upload failures. If blob storage is temporarily unavailable, artifacts could fail to save without clear error messages. The caller AgentSession.processArtifact does have error handling, but that behavior isn't verified by tests.
Fix: Add test cases for error scenarios:
```ts
it('propagates upload errors when blob storage fails', async () => {
  mockUpload.mockRejectedValueOnce(new Error('S3 rate limit exceeded'));
  const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('S3 rate limit exceeded');
});

it('fails entire operation if any upload fails in nested structure', async () => {
  mockUpload
    .mockResolvedValueOnce(undefined) // First upload succeeds
    .mockRejectedValueOnce(new Error('Upload failed')); // Second fails
  const input = {
    images: [
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' },
      { type: 'image', data: LARGE_BASE64, mimeType: 'image/jpeg' },
    ],
  };
  await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Upload failed');
  expect(mockUpload).toHaveBeenCalledTimes(2); // Both attempted
});
```
Refs:
🟡 Minor (2)
🟡 1) storage-keys.test.ts — Missing test for artifact-data storage key building
Issue: The storage-keys.test.ts file tests the media category key building but was not updated to include tests for the new artifact-data category.
Why: Changes to the artifact-data key format (path structure, prefixes, hash format) could break blob storage retrieval without failing any tests.
Fix: Add a parallel test case:
```ts
it('builds versioned artifact-data storage key', () => {
  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: 'tenant-1',
    projectId: 'project-1',
    artifactId: 'artifact-1',
    contentHash: 'abc123',
    ext: 'png',
  });
  expect(key).toBe(
    'v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png'
  );
});
```
Refs:
🟡 2) artifact-binary-sanitizer.test.ts — No test for data: URI passthrough
Issue: The isInlineBinaryPart function explicitly excludes strings starting with data: from being treated as inline binary data (line 30), but there's no test verifying this behavior. HTTP URLs are tested, but data URIs are a different bypass path.
Why: A regression that removes the data: check could cause data URIs to be incorrectly treated as base64 data, uploaded to blob storage, and corrupted.
Fix: Add a test for data URI passthrough:
```ts
it('does not upload data: URIs', async () => {
  const input = { type: 'image', data: 'data:image/png;base64,' + 'A'.repeat(200), mimeType: 'image/png' };
  await sanitizeArtifactBinaryData(input, CTX);
  expect(mockUpload).not.toHaveBeenCalled();
});
```
Refs:
💭 Consider (1)
Inline Comments:
- 💭 Consider: artifact-binary-sanitizer.ts:34-56 — Add logging for blob uploads (consistency with peer file)
💡 APPROVE WITH SUGGESTIONS
Summary: This PR correctly addresses the root cause of binary data persisting to the database by sanitizing both data and summaryData at the saveArtifact chokepoint. The implementation is well-targeted with proper circular reference handling. The main gaps are test coverage for error paths and a missing storage key test. The observability logging suggestion is a nice-to-have for debugging in production.
Discarded (3)
| Location | Issue | Reason Discarded |
|---|---|---|
| storage-keys.ts:37-45 | Missing parseArtifactDataStorageKey() helper function | Artifact blob URIs don't use message resolution — resolveMessageBlobUris only handles the media category. Artifacts are returned with blob URIs intact; the client handles resolution separately. |
| artifact-binary-sanitizer.ts:70-105 | WeakSet clears references prematurely for diamond patterns | This is correct stack-based cycle detection behavior. Content-addressed keys prevent duplicate storage; redundant uploads are a minor performance concern that doesn't warrant the complexity increase. |
| ArtifactService.ts:890-901 | Double sanitization for artifacts with both data and summaryData | Content-addressed storage prevents actual duplication. While this doubles network I/O for overlapping content, it's acceptable for correctness and simplicity. Optimization can be addressed separately if profiling shows impact. |
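The stack-based cycle detection referenced in the discarded findings can be illustrated in isolation: a node is skipped only while it sits on the current recursion path, so a diamond (the same object reachable via two parents) is traversed twice, while a true cycle is cut. `countVisits` below is a hypothetical sketch, not the PR's sanitizer:

```typescript
// Sketch of path-based cycle detection: a node is skipped only while it is
// on the current recursion path, so diamonds are traversed but cycles stop.
function countVisits(root: unknown): number {
  let visits = 0;
  const inStack = new Set<object>();
  const visit = (node: unknown): void => {
    if (typeof node !== 'object' || node === null) return;
    if (inStack.has(node)) return; // true cycle: node is already on this path
    inStack.add(node);
    visits++;
    try {
      for (const value of Object.values(node)) visit(value);
    } finally {
      inStack.delete(node); // leaving the path re-enables diamond revisits
    }
  };
  visit(root);
  return visits;
}
```

The revisit on a diamond is what makes redundant uploads possible; content-addressed keys keep that a performance concern rather than a correctness one.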
Reviewers (7)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| pr-review-sre | 6 | 0 | 0 | 0 | 0 | 0 | 6 |
| pr-review-tests | 5 | 2 | 0 | 0 | 0 | 0 | 3 |
| pr-review-consistency | 5 | 0 | 1 | 0 | 1 | 0 | 3 |
| pr-review-errors | 2 | 1 | 0 | 0 | 0 | 0 | 1 |
| pr-review-types | 3 | 0 | 0 | 0 | 0 | 0 | 3 |
| pr-review-standards | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| pr-review-precision | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 21 | 3 | 1 | 0 | 1 | 0 | 16 |
Note: pr-review-precision confirmed implementation is correctly targeted at the root cause. pr-review-standards found no issues.
```ts
async function uploadInlinePart(
  part: InlineBinaryPart,
  ctx: ArtifactBinaryContext
): Promise<InlineBinaryPart> {
  const storage = getBlobStorageProvider();
  const buffer = Buffer.from(part.data, 'base64');
  const mimeType = part.mimeType ?? 'application/octet-stream';
  const contentHash = createHash('sha256').update(buffer).digest('hex');
  const ext = getExtensionFromMimeType(mimeType);

  const key = buildStorageKey({
    category: 'artifact-data',
    tenantId: ctx.tenantId,
    projectId: ctx.projectId,
    artifactId: ctx.artifactId,
    contentHash,
    ext,
  });

  await storage.upload({ key, data: buffer, contentType: mimeType });

  return { ...part, data: toBlobUri(key) };
}
```
💭 Consider: Add logging for blob uploads
Issue: This file has no logging, while the peer file image-upload.ts includes debug logging for successful uploads and error/warn logging for failures.
Why: Without logging, debugging artifact upload issues in production will require tracing through multiple layers. Debug logs help identify which artifacts uploaded successfully and what their storage keys are.
Fix: Add logging consistent with the pattern in image-upload.ts:
```ts
import { getLogger } from '../../../../logger';

const logger = getLogger('artifact-binary-sanitizer');

// Then, after the upload:
logger.debug({ key, mimeType, size: buffer.length }, 'Uploaded artifact binary to blob storage');
```
Refs:
```diff
  // Strip binary blobs before serializing — base64 is useless noise for the naming LLM
  const fullDataStr = JSON.stringify(
-   artifactData.data || artifactData.summaryData || {},
+   stripBinaryDataForObservability(artifactData.data || artifactData.summaryData || {}),
```
Does the naming LLM get the image then? How does it name the image or produce a description here?








Summary