
feat(opencode): cache-aligned compaction to reuse prefix cache#25100

Open
lloydzhou wants to merge 7 commits into anomalyco:dev from lloydzhou:dev

Conversation


@lloydzhou lloydzhou commented Apr 30, 2026

Issue for this PR

Closes #25120

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Compaction currently builds its own LLM request with an empty system prompt, no tools, and a filtered message history. Because the request structure differs from normal chat requests, none of the historical messages hit the provider's prompt cache — they're all charged at full input price.

This PR aligns the compaction request to share the exact same prefix as the main agent loop: same system prompt, same tool definitions, same message serialization. The compaction request becomes indistinguishable from a normal chat request up to the point where the summary instruction is appended. This lets the provider serve [system] + [tools] + [dropped messages] from cache, cutting compaction cost by roughly 90%.

See the design rationale: Cache-Aligned Summarization

Cost example (Claude Sonnet 4 pricing, compacting 45K tokens of history):

| Segment | Tokens | Without cache alignment | With cache alignment |
|---|---|---|---|
| System prompt | ~2K | Full: $0.006 | Cached: $0.0006 |
| Tools | ~3K | Full: $0.009 | Cached: $0.0009 |
| Dropped messages | ~40K | Full: $0.120 | Cached: $0.012 |
| Summary instruction | ~200 | Full: $0.0006 | Full: $0.0006 |
| **Total** | ~45.2K | $0.136 | $0.014 |

~90% cost reduction per compaction. The cached portion (system + tools + dropped messages) drops from $3/MTok to $0.30/MTok — only the short summary instruction pays full price.
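The arithmetic behind those totals can be reproduced with a short sketch. The $3/MTok full-input and $0.30/MTok cache-read rates are the assumed Sonnet 4 prices; everything else follows from the segment sizes:

```typescript
// Sketch only: reproduces the cost table under the assumed Sonnet 4
// input rates of $3/MTok (full) and $0.30/MTok (cache read).
const FULL_PER_MTOK = 3.0;
const CACHED_PER_MTOK = 0.3;

const cost = (tokens: number, rate: number) => (tokens / 1_000_000) * rate;

const segments = [
  { name: "system prompt", tokens: 2_000, cacheable: true },
  { name: "tools", tokens: 3_000, cacheable: true },
  { name: "dropped messages", tokens: 40_000, cacheable: true },
  { name: "summary instruction", tokens: 200, cacheable: false }, // always full price
];

const withoutCache = segments.reduce((sum, s) => sum + cost(s.tokens, FULL_PER_MTOK), 0);
const withCache = segments.reduce(
  (sum, s) => sum + cost(s.tokens, s.cacheable ? CACHED_PER_MTOK : FULL_PER_MTOK),
  0,
);
console.log(withoutCache.toFixed(4), withCache.toFixed(4)); // "0.1356 0.0141"
```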

Conditions for the optimized path (otherwise falls back to original behavior unchanged):

  • Main model and compaction model share the same provider + model ID
  • The original user message is not a json_schema structured output request
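A hypothetical guard for that gate might look like the following; the `ModelRef` shape and field names are illustrative assumptions, not the PR's actual code:

```typescript
// Illustrative sketch of the optimized-path gate; field names are assumptions.
interface ModelRef {
  providerID: string;
  modelID: string;
}

function canAlignCache(main: ModelRef, compaction: ModelRef, isJsonSchemaOutput: boolean): boolean {
  // Prefix reuse only works when both requests hit the same provider's cache
  // for the same model, and the original request isn't structured output.
  return (
    main.providerID === compaction.providerID &&
    main.modelID === compaction.modelID &&
    !isJsonSchemaOutput
  );
}
```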

Changes:

  • prompt.ts: Extract resolveStreamContext (shared system+tools resolution), make processor optional in resolveTools so compaction can call it without a live message handle, compute resolved context before calling compaction
  • compaction.ts: Accept optional resolved context with agent/system/tools/user. When present, skip hidden filtering and stripMedia/toolOutputMaxChars so serialized output matches the main loop exactly. Set toolChoice: "none" to prevent tool execution.
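In spirit, the aligned request looks like the sketch below. The types and the resolved-context shape are assumptions for illustration, not the PR's exact code; the point is that everything before the summary instruction is reused byte-for-byte:

```typescript
// Hedged sketch of the cache-aligned compaction request; real types differ.
type ToolDef = { name: string; description: string; parameters: unknown };
type Message = { role: "user" | "assistant"; content: string };

interface ResolvedContext {
  system: string;      // same system prompt as the main agent loop
  tools: ToolDef[];    // same tool definitions, same order
  messages: Message[]; // same serialization, no hidden filtering or stripMedia
}

function buildCompactionRequest(ctx: ResolvedContext, summaryInstruction: string) {
  return {
    system: ctx.system, // identical prefix, so the provider serves it from cache
    tools: ctx.tools,
    messages: [...ctx.messages, { role: "user" as const, content: summaryInstruction }],
    toolChoice: "none" as const, // summarize only; never execute tools
  };
}
```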

How did you verify your code works?

  • bun typecheck passes in packages/opencode
  • Fallback path (model mismatch / json_schema) is completely unchanged from current behavior
  • Reviewed that serialized [system] + [tools] + [messages] prefix is identical between compaction and main loop when resolved context is used

Screenshots / recordings

N/A — backend-only change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@github-actions
Contributor

The following comment was made by an LLM; it may be inaccurate:

Based on my search results, no duplicate PRs were found. The only PR matching these searches is PR #25100 itself (the current PR being analyzed). While there are related PRs that touch on cache optimization and compaction functionality (such as PR #24842 about caching messages and PR #14743 about Anthropic prompt cache hit rates), they address different aspects and are not duplicates of this specific cache-aligned compaction feature.

No duplicate PRs found

@lloydzhou
Author

Tested in dev mode (`bun run dev`) with mimo-v2.5-pro. Reading the sqlite database records shows cache.read = 14,656 for the last compaction, confirming a cache hit.
Cache hit rate = 14,656 / 17,267 ≈ 84.9%

sqlite3 /Users/xxxx/.local/share/opencode/opencode-local.db "SELECT json_extract(data, '$.modelID') as model, json_extract(data, '$.agent') as agent, json_extract(data, '$.role') as role, json_extract(data, '$.tokens') as tokens, datetime(json_extract(data, '$.time.created')/1000, 'unixepoch', 'localtime') as created FROM message ORDER BY json_extract(data, '$.time.created') DESC LIMIT 20;"

mimo-v2.5-pro|compaction|assistant|{"total":17267,"input":784,"output":513,"reasoning":1314,"cache":{"write":0,"read":14656}}|2026-05-01 19:27:27
|build|user||2026-05-01 19:27:27
mimo-v2.5-pro|build|assistant|{"total":14870,"input":140,"output":15,"reasoning":59,"cache":{"write":0,"read":14656}}|2026-05-01 19:27:17
|build|user||2026-05-01 19:27:17
mimo-v2.5-pro|build|assistant|{"total":14788,"input":13644,"output":13,"reasoning":107,"cache":{"write":0,"read":1024}}|2026-05-01 19:27:00
|build|user||2026-05-01 19:27:00
mimo-v2.5-pro|compaction|assistant|{"total":1478,"input":939,"output":513,"reasoning":26,"cache":{"write":0,"read":0}}|2026-05-01 19:26:14
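The quoted ~84.9% hit rate follows directly from the top compaction row (cache.read = 14,656 of 17,267 total tokens):

```typescript
// Hit rate for the final compaction request, from the sqlite row above.
const cacheRead = 14_656;
const totalTokens = 17_267;
const hitRate = cacheRead / totalTokens;
console.log(`${(hitRate * 100).toFixed(1)}%`); // "84.9%"
```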



Development

Successfully merging this pull request may close these issues.

[FEATURE]: ~90% of compaction cost is avoidable cache miss
