fix(bench): in-box workers use the box-provisioned model credential #271

Merged: drewstone merged 1 commit into main from fix/inbox-provisioned-creds on Jun 12, 2026
@drewstone

What

In-box (sandbox-backed) worker paths stop injecting our external router key into boxes, because the egress proxy returns 403 for foreign credentials by design. With backend.model = {provider: 'openai-compat', model, baseUrl} and no apiKey, the platform wires opencode to the box's own OPENCODE_MODEL_API_KEY, which is the sanctioned flow.
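A minimal sketch of the split described above, not the code in this PR. The `ModelConfig` type, the `buildModelConfig` helper, the `Placement` union, and the placeholder base URL are all hypothetical names introduced for illustration; only the `{provider, model, baseUrl}` shape and the omission of `apiKey` for in-box workers come from the description.

```typescript
// Hypothetical shape for illustration; the real type lives in the platform code.
interface ModelConfig {
  provider: string;
  model: string;
  baseUrl: string;
  apiKey?: string;
}

// Hypothetical: where the worker runs.
type Placement = "in-box" | "host";

function buildModelConfig(
  placement: Placement,
  model: string,
  baseUrl: string,
  externalKey?: string,
): ModelConfig {
  const base: ModelConfig = { provider: "openai-compat", model, baseUrl };
  if (placement === "in-box") {
    // No apiKey on purpose: per the PR, the platform wires the box's own
    // OPENCODE_MODEL_API_KEY, and a foreign key would be rejected with a 403.
    return base;
  }
  // Host-side paths keep the external key (assumption: it is required there).
  if (!externalKey) {
    throw new Error("host-side paths need an external key");
  }
  return { ...base, apiKey: externalKey };
}

// Usage: the in-box config ignores the external key even when one is passed.
const inBox = buildModelConfig(
  "in-box",
  "deepseek-v4-flash",
  "https://router.example/v1", // placeholder URL, not the real router endpoint
  "sk-external",
);
console.log(Object.keys(inBox));
```

Deciding on placement in one helper keeps the rule in a single place, which mirrors the intent of changing the central sandboxAgentRun seam rather than each caller.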

Changed: sandboxAgentRun (the central seam feeding rsi/run/cloud-loop/research-loop/fleet/improve-prompt/run-benchmarks/search-bench), worker.ts, commit0-gate.mts (sandbox path only — the host-local path correctly keeps the external key), finsearch-loop.ts, clbench-codebase-gate.mts, worker-cad.ts. rsi.ts gains a WORKER_PROVIDER knob (cheap router models need openai-compat in-box). Host-side and local-docker paths untouched.
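The WORKER_PROVIDER knob could be read along these lines. This is a sketch under assumptions: the `resolveWorkerProvider` helper is hypothetical, and the fallback value is chosen only for illustration, since the description does not state what rsi.ts uses when the variable is unset.

```typescript
// Hypothetical helper; the actual parsing in rsi.ts may differ.
function resolveWorkerProvider(
  env: Record<string, string | undefined>,
  fallback: string = "openai-compat", // assumed default, not confirmed by the PR
): string {
  const raw = env.WORKER_PROVIDER?.trim();
  // Treat unset or blank values as "not provided".
  return raw ? raw : fallback;
}

// Usage with a plain object standing in for process.env.
const provider = resolveWorkerProvider({ WORKER_PROVIDER: "openai-compat" });
console.log(provider);
```

Taking the environment as a parameter rather than reading a global keeps the helper easy to test without touching the real process environment.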

Proof

Live end-to-end through the real kernel: BENCH=humaneval BACKEND=sandbox N=1 WORKER_MODEL=deepseek-v4-flash WORKER_PROVIDER=openai-compat tsx src/rsi.ts → all 3 rollouts returned non-empty completions via router.tangle.tools (2/3 resolved by the deterministic judge). This unblocks the observe→steer efficacy run on real sandboxed workers.

Flagged, not fixed: search-bench/profiles.ts still sends an external key to the search MCP from in-box (sandbox mode will 403; bridge mode unaffected) — documented at the site, needs a box-side credential flow.

The egress proxy 403s foreign router keys inside sandboxes, so every path
that injected the external TANGLE_API_KEY into a box (env OPENAI_API_KEY/
OPENAI_BASE_URL + backend.model.apiKey) produced silently empty workers.
backend.model now pins provider/model/baseUrl only; the platform writes the
in-box provider config keyed to OPENCODE_MODEL_API_KEY. Host-side paths
(commit0 local, terminal-bench, llmAnalyst, judges, sandbox API clients)
keep the external key. rsi.ts gains WORKER_PROVIDER so cheap router models
run in-box via openai-compat.

Proven live: rsi.ts BACKEND=sandbox BENCH=humaneval N=1 with
deepseek-v4-flash — 3/3 in-box rollouts returned real completions, 2/3
resolved by the deterministic judge.

@tangletools tangletools left a comment


✅ Auto-approved PR — ea39dae8

Blanket team auto-approval is enabled for this reviewer service.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: blanket_auto_approve · 2026-06-12T02:16:32Z

@drewstone drewstone merged commit 49033aa into main Jun 12, 2026
1 check passed
