feat: [ENG-2243] AutoHarness V2 dual-VM isolation integration test#501
feat: [ENG-2243] AutoHarness V2 dual-VM isolation integration test#501danhdoan merged 2 commits intoproj/autoharness-v2from
Conversation
|
Claude finished @danhdoan's task in 3m 51s —— View job PR Review: feat: [ENG-2243] AutoHarness V2 dual-VM isolation integration testTasks
Overall Assessment: 4.5 / 5This is a high-quality security proof-of-work. The five tests are well-scoped, correctly target the A3 attack vectors, and the in-source documentation is better than most test suites. The analysis below covers correctness, security assertions, code quality, and a few minor improvements. CorrectnessThe pipeline under test is correctly assembled in
Each attack is exercised end-to-end through the real Security Assertion Quality
The cross-realm Attack 5 correctly uses Minor Issues (see inline comments)
What Works Well
Out-of-Scope Observations (not blocking)The following attack vectors are NOT covered here (intentionally, per A3 scope):
These would only affect the VM's own realm per V8's context isolation, so they are consistent omissions for a Phase 3 PR. They would be Phase 6+ concerns if the threat model expands. Verdict: Merge-ready with optional nits. The two inline suggestions (Attack 3 message and |
Phase 3 Task 3.5 — closes brutal-review item A3 with 5 integration
tests covering every cross-VM attack vector. Each test exercises the
full pipeline (real `HarnessStore` + real `HarnessModuleBuilder` +
real `SandboxService.loadHarness`) against a harness fixture
designed to breach the sandbox boundary; the test asserts the breach
is contained and the pipeline stays healthy.
1. **Global pollution** — harness `globalThis.X = 'touched'` stays
in the VM's private `globalThis`; Node's `globalThis[X]` remains
`undefined` (V8 per-context realm).
2. **Closure leak** — harness returns a closure that captures
`ctx.tools` and tries `stolen.curate = hijackFn` when called.
Deep-freeze at the invocation boundary (ENG-2239) + strict-mode
injection means the mutation throws a VM-realm `TypeError` with
message matching `/read.?only/i`.
3. **Mutable parameter** — harness writes `ctx.env.commandType =
'hacked'`. Deep-freeze throws; wrapper normalizes as `curate()
failed: ...`.
4. **Prototype pollution** — harness `Object.prototype.X = 'polluted'`
stays in the VM realm's prototype; Node's `({} as any).X` remains
`undefined`.
5. **Stack-trace escape** — harness throws error with attached
`capturedThis: globalThis` AND `capturedArbitrary: {secret: ...}`.
Wrapper constructs a fresh `new Error('curate() failed: ' + msg)`
— only `message` survives, all attached properties drop.
**Cross-realm `instanceof` (Attack 2)**: TypeErrors from
`vm.createContext`
are instances of the VM realm's TypeError class, not Node's outer
realm. `error instanceof TypeError` fails despite the error being
structurally a TypeError. Fix: check `error.name === 'TypeError'`
(string-valued, realm-agnostic) + `error.message` match.
**`capturedArbitrary` property (Attack 5)**: attack attaches two
properties to the thrown error — a known one (`capturedThis`) and an
arbitrary one (`capturedArbitrary`). Proves the wrapper strips
*all* attached properties by constructing a fresh Error, not just
the ones it explicitly knows about.
Per-attack `expectSandboxHealthy()` now asserts BOTH `2 + 2 === 4`
(JS intact) AND `typeof tools === 'object' && typeof tools.readFile
=== 'function'` (tools namespace wired). A single sandbox snapshot
proves "attack didn't destabilize execution OR tool injection."
Uses `FileKeyStorage({inMemory: true})` instead of the task doc's
suggested `mkdtemp` + disk. Attacks don't need real I/O; in-memory
storage means zero disk footprint, zero tmpdir collision surface, zero
cleanup logic.
Full suite: 6710 passing / 0 failing; 59ms for this file's 5 tests.
|
Claude finished @danhdoan's task in 2m 29s —— View job PR Review: feat: [ENG-2243] AutoHarness V2 dual-VM isolation integration testTasks
Overall Assessment: 4.5 / 5This is a high-quality security proof-of-work PR. The prior review's three nit suggestions are already incorporated in the current file — no outstanding items remain. Review below documents what was verified. Prior Review Nits: StatusAll three suggestions from the prior review cycle are confirmed present:
CorrectnessPipeline wiring in Template string interpolation for sentinel keys is verified correct:
Security Assertion Quality
Cross-realm instanceof idioms are correctly applied throughout:
The distinction is documented correctly: VM-realm errors use What Was Verified (and works)
Out-of-Scope Observation (not blocking)The Verdict: Merge-ready. The three nits from the prior review are incorporated, all five attack vectors are correctly asserted, and the inline documentation is better than the average security test. No new concerns. |
The `waitForPruneToSettle` helper returned on the first two consecutive identical counts with 2ms polling, which under slow CI could catch the pre-prune state twice and declare "settled" before any work started. Require 5 consecutive stable readings at 5ms intervals (25ms of no change) with a 1s max bound — enough to ride out the initial async gap without inflating test runtime.
|
Claude finished @danhdoan's task in 2m 55s —— View job PR Review: feat: [ENG-2243] AutoHarness V2 dual-VM isolation integration testTasks
Overall Assessment: 4.5 / 5This is the third pass over this PR. All three nits from the first review cycle are confirmed incorporated in the current file. No new source-code changes. The analysis below covers the current state of both changed files — Prior Nits: Confirmed Addressed
Pipeline Wiring Verified
Each attack goes through the full Security Assertion Quality
Cross-realm
One subtle implementation detail confirmed: In Attack 3, the harness
|
Summary
SandboxService.loadHarness→ module builder → injected context) survives an adversarial harness.test/integration/agent/harness/isolation.test.ts— 5 tests, one per A3 attack vector, exercised end-to-end against realHarnessStore+HarnessModuleBuilder+SandboxService.loadHarness. Each test asserts the breach doesn't reach Node's scope AND the pipeline stays healthy afterwards.Type of change
Scope (select all touched areas)
Linked issues
HarnessModuleBuilder), ENG-2240 (SandboxService.loadHarness), ENG-2241 (graceful-degradation tests for A4)Root cause (bug fixes only, otherwise write
N/A)Test plan
test/integration/agent/harness/isolation.test.tsglobalThis.X = 'touched'→ Node'sglobalThis[X]isundefined(V8 per-context realm)stolen.curate = hijackFn→ outer invocation throws VM-realm TypeError (frozen strict-mode write)ctx.env.commandType = 'hacked'→ deep-freeze throws; wrapper surfacescurate() failed: ...Object.prototype.X = 'polluted'→ Node's({} as any).Xisundefined(V8 per-context prototype chain)capturedThis: globalThisANDcapturedArbitrary: {secret: 'do-not-leak'}→ wrapper's freshnew Error()strips ALL attached properties; caught error has neitherUser-visible changes
None. Pure test coverage. No consumer of the tested paths changes behavior.
Evidence
Before this PR, the test file didn't exist; the 5 attack vectors had no integration-level coverage. After: all 5 pass in 59ms. Full suite: 6710 passing / 0 failing.
Checklist
npm test) — 5 new tests; full suite 6710 passing / 0 failingnpm run lint) — 0 errors, 226 pre-existing warningsnpm run typecheck) — exit=0npm run build) — exit=0feat: [ENG-2243] ...features/autoharness-v2/tasks/phase_3/task_05-isolation-integration-test.md(research repo) drove the scope; the in-memory-vs-mkdtemp choice + health-check strengthening flagged below for post-merge task-doc tighteningmain— targetsproj/autoharness-v2, notmainRisks and mitigations
Risk: If any isolation invariant ever regresses, the test would immediately fail — which is the whole point — but a regression would also mean the harness sandbox is exploitable from attack code. The test is the gate, not the fix.
afterEachhook defensivelydeletes both sentinel keys (ATTACK_1_GLOBAL_KEYon globalThis,ATTACK_4_PROTO_KEYon Object.prototype). If a failure ever lets the pollution through, other tests in the suite don't inherit it — the failure stays localized to the one broken test.Risk: Cross-realm
instanceoffragility — the patternerror instanceof TypeErroris a common test idiom that silently fails when the error is thrown insidevm.createContext. Future maintainers who copy-paste from this file might hit the same trap elsewhere.error.name === 'TypeError'idiom as the fix. Any future reviewer reading the comment learns the gotcha. The stronger assertion (name + message-regex match) is less brittle than a singleinstanceofcheck anyway.Risk:
capturedArbitraryproperty on Attack 5 — the wrapper's "fresh Error construction" strips ALL attached properties today, but a future optimization that preserves some common properties (e.g.,cause,code) could re-introduce a leak surface.capturedThisand an arbitrarycapturedArbitrary) — the test would catch a regression that preserved EITHER one. If a future optimization needs to preserve specific properties (likeerror.causefor upstream diagnostics), the test should be updated to verify only attacker-attached properties drop, not the allowlisted ones.Notes for reviewers
If this PR ever fails, do not retry, debug. Every passing assertion here proves a structural invariant of V8's
vmmodule or the wrapper we layered on top. A flake would more likely mean the invariant broke (bug) than the test being racy. None of the assertions time-depend; the file runs in 59ms.Cross-realm
instanceofgotcha (Attack 2) was the one interesting discovery during implementation. The TypeError thrown by strict-mode frozen-write comes from the VM's realm with the VM'sTypeError.prototype, distinct from Node's outer-realm class.error instanceof TypeErrorfails silently. Fix:error.name === 'TypeError'(string, realm-agnostic). Comment in-source explains. If another test in the repo ever doesinstanceof TypeErroragainst a VM-thrown error, expect the same surprise.Attack 5's dual-property attachment is deliberate. A single
capturedThis: globalThiswould prove "wrapper doesn't pass through globalThis references." The secondcapturedArbitrary: {secret: 'do-not-leak'}proves the wrapper strips ALL attached properties vianew Error()construction — not a selective whitelist. If a future wrapper optimization preservescauseor similar, this test would catch it and force an explicit decision.expectSandboxHealthy()strengthened vs. task doc. The original task-doc AC said "subsequent raw tools.* calls still work" but didn't specify HOW to verify. Early implementation used2 + 2; reviewers correctly flagged that this only tests JS eval, not tools namespace integrity. Current health check runs a single sandbox expression that asserts both{math: 2 + 2, hasTools: typeof tools === 'object' && typeof tools.readFile === 'function'}match — snapshot proves JS + tools in one assertion. If the attack had corrupted ToolsSDK construction (e.g., by polluting something it depends on at the module level),hasTools: truewould flip.Storage choice: in-memory, not
mkdtemp. Task doc suggestedmkdtemp+ disk cleanup pattern from Phat's Task 2.5. Isolation attacks don't touch the filesystem — the harness runs in a VM context, not a subprocess — so in-memoryFileKeyStorageremoves tmpdir lifecycle concerns entirely. Cleaner setup, same isolation.Phase 3 closes with this PR. All 5 tasks merged:
Brutal-review items A3 + A4 both closed. Phase 5 (mode selector) is unblocked to start.
Related
test/integration/agent/harness/isolation.test.tsSandboxService.loadHarness(ENG-2240),HarnessModuleBuilder(ENG-2239),HarnessStore(ENG-2227 + ENG-2228)HarnessModuleBuilder.wrapInvocation(ENG-2239)features/autoharness-v2/tasks/phase_3/task_05-isolation-integration-test.md(research repo)AgentLLMServicesession-start hook — first real consumer ofloadHarness)