fix(bench): in-box workers use the box-provisioned model credential #271
Merged
The egress proxy 403s foreign router keys inside sandboxes, so every path that injected the external TANGLE_API_KEY into a box (env OPENAI_API_KEY/OPENAI_BASE_URL + backend.model.apiKey) produced silently empty workers. backend.model now pins provider/model/baseUrl only; the platform writes the in-box provider config keyed to OPENCODE_MODEL_API_KEY. Host-side paths (commit0 local, terminal-bench, llmAnalyst, judges, sandbox API clients) keep the external key. rsi.ts gains WORKER_PROVIDER so cheap router models run in-box via openai-compat. Proven live with rsi.ts BACKEND=sandbox BENCH=humaneval N=1 and deepseek-v4-flash: 3/3 in-box rollouts returned real completions, 2/3 resolved by the deterministic judge.
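The split between in-box and host-side model configuration described above could be sketched roughly as follows. This is a minimal illustration, not the repository's code: the type and function names (`InBoxModelConfig`, `HostModelConfig`, `inBoxModel`, `hostModel`) are hypothetical, and the real `backend.model` shape may carry more fields. The only properties taken from the PR text are `provider`, `model`, `baseUrl`, `apiKey`, and the `TANGLE_API_KEY` variable.

```typescript
// Hypothetical types: the real backend.model shape may differ.
interface InBoxModelConfig {
  provider: string;
  model: string;
  baseUrl: string;
  // Deliberately no apiKey: per the PR, the platform keys the in-box
  // provider config to the box's own OPENCODE_MODEL_API_KEY.
}

interface HostModelConfig extends InBoxModelConfig {
  apiKey: string;
}

type Env = Record<string, string | undefined>;

/** Builds the config sent into a box: provider/model/baseUrl only. */
function inBoxModel(provider: string, model: string, baseUrl: string): InBoxModelConfig {
  return { provider, model, baseUrl };
}

/** Builds the config for host-side callers, which keep the external key. */
function hostModel(base: InBoxModelConfig, env: Env): HostModelConfig {
  const apiKey = env["TANGLE_API_KEY"];
  if (!apiKey) {
    // Fail loudly instead of producing silently empty results.
    throw new Error("TANGLE_API_KEY is required for host-side model calls");
  }
  return { ...base, apiKey };
}
```

Keeping `apiKey` out of the in-box type means a path that tries to forward the external key into a box fails at compile time rather than at the egress proxy. The base URL passed in would be whatever the router exposes; none is assumed here.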
tangletools
approved these changes
Jun 12, 2026
tangletools
left a comment
Contributor
✅ Auto-approved PR — ea39dae8
Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-12T02:16:32Z
What
In-box (sandbox-backed) worker paths no longer inject our external router key into boxes, because the egress proxy 403s foreign credentials by design. With backend.model = {provider: 'openai-compat', model, baseUrl} and no apiKey, the platform wires opencode to the box's own OPENCODE_MODEL_API_KEY, which is the sanctioned flow.
Changed: sandboxAgentRun (the central seam feeding rsi/run/cloud-loop/research-loop/fleet/improve-prompt/run-benchmarks/search-bench), worker.ts, commit0-gate.mts (sandbox path only; the host-local path correctly keeps the external key), finsearch-loop.ts, clbench-codebase-gate.mts, worker-cad.ts. rsi.ts gains a WORKER_PROVIDER knob (cheap router models need openai-compat in-box). Host-side and local-docker paths are untouched.
Proof
Live end-to-end through the real kernel:
BENCH=humaneval BACKEND=sandbox N=1 WORKER_MODEL=deepseek-v4-flash WORKER_PROVIDER=openai-compat tsx src/rsi.ts
→ all 3 rollouts returned non-empty completions via router.tangle.tools (2/3 resolved by the deterministic judge). This unblocks the observe→steer efficacy run on real sandboxed workers.
Flagged, not fixed: search-bench/profiles.ts still sends an external key to the search MCP from in-box (sandbox mode will 403; bridge mode unaffected). This is documented at the site and needs a box-side credential flow.
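As an illustration of the two rules this PR relies on (worker knobs come from the environment, and external credentials are never forwarded into a box), a sketch might look like the following. The function names `stripExternalCredentials` and `readWorkerKnobs` are hypothetical and do not appear in the PR. The list of stripped variables is an assumption based on the names mentioned in the description, and the fallback provider is a caller-supplied parameter because the PR does not state what the default is.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: these are the variables that carried the external router
// credential into boxes before this change (names taken from the PR text).
const EXTERNAL_CREDENTIAL_VARS = [
  "OPENAI_API_KEY",
  "OPENAI_BASE_URL",
  "TANGLE_API_KEY",
] as const;

/** Returns a copy of env with external credential variables removed. */
function stripExternalCredentials(env: Env): Env {
  const out: Env = { ...env };
  for (const name of EXTERNAL_CREDENTIAL_VARS) {
    delete out[name];
  }
  return out;
}

/**
 * Reads WORKER_MODEL and WORKER_PROVIDER from env.
 * defaultProvider is supplied by the caller; the PR does not say which
 * provider is used when WORKER_PROVIDER is unset.
 */
function readWorkerKnobs(
  env: Env,
  defaultProvider: string,
): { provider: string; model: string } {
  const model = env["WORKER_MODEL"];
  if (!model) {
    throw new Error("WORKER_MODEL is not set");
  }
  return { provider: env["WORKER_PROVIDER"] ?? defaultProvider, model };
}
```

A sandbox launcher would pass `stripExternalCredentials(process.env)` as the box environment and use `readWorkerKnobs` to fill in the provider and model. A site such as the flagged search MCP path would still need its own box-side credential, which this sketch does not provide.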