docs(rfd): mid-turn input via session/inject (queue and steer) #1261

Open
kennethsinder wants to merge 9 commits into agentclientprotocol:main from kennethsinder:rfd/session-inject

Conversation

@kennethsinder kennethsinder commented May 19, 2026

Lifts discussion #1220 (session/inject for mid-turn queue + steer) into the formal RFD process, per @benbrandt's offer to put it in the v2 bucket.

Summary

One method (session/inject), two modes (queue, steer), one capability. Built on the v2 prompt lifecycle: agent-owned messageId, user_message echo notification, and state_change. Specifies the protocol shape for queue/steer behavior already present across Cursor, Codex CLI, Claude Code, Windsurf Cascade, Gemini CLI, and others.

Relation to existing work

Looking for

A champion. @benbrandt offered in the discussion thread; tagging here. I can iterate on framing, split scope, or rework around the v2 prompt lifecycle if a different factoring lands better.

Changes in this PR vs. the discussion post

  • Tightened messageId ownership to agent-owned, returned in the inject response, matching the v2 prompt lifecycle's resolution.
  • Added explicit acknowledgment and supersession of PR #484 (docs(rfd): prompt queueing RFD).
  • Replaced $/cancel_request revocation with messageId-based pending operations: session/revoke_inject, optional session/replace_inject, and defined sad paths for already-delivered and unknown messages.
  • Took positions on pending behavior: pending injects survive session/cancel by default; steer is held, not dropped, during session/request_permission.
  • Expanded prior art with Windsurf Cascade, Gemini CLI (including its own open /inject proposal), and the explicit cross-ecosystem "disambiguate queue/steer" discussions.

Process note: drafting used AI for outline and prior-art search; cited links were checked.

Lift discussion agentclientprotocol#1220 into the v2 RFD bucket per @benbrandt's offer.
One method, two modes (queue and steer), one capability. Rides on the
v2 prompt lifecycle (user_message echo, state_change, agent-owned
messageId). Supersedes PR agentclientprotocol#484 with credit to @SteffenDE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kennethsinder kennethsinder requested a review from a team as a code owner May 19, 2026 23:53
Contributor

@SteffenDE SteffenDE left a comment

Some comments from my side. Don't see me as authoritative here though. Happy to close the promptQueueing PR, as it makes more sense to wait for v2 to land.

Comment thread docs/rfds/v2/session-inject.mdx Outdated

The missing piece is not queueing alone. It is the wire shape that distinguishes "deliver soon" from "deliver later," gives each delivery a stable handle for ack and revocation, and echoes the delivery into the session history so multi-client and replay stay coherent. The v2 prompt lifecycle gives us that base. This RFD adds the call sites on top.

PR [#484](https://github.com/agentclientprotocol/agent-client-protocol/pull/484) (open since February 2026) takes a thinner cut: a `promptQueueing` capability plus an `end_turn` early-finish signal, with prompts queued via a parallel `session/prompt` call. It doesn't address steer, doesn't define delivery semantics, and predates the v2 prompt lifecycle's `state_change` and `user_message` notifications. The follow-up thread also asks how clients edit a queued message. This RFD answers that directly: editing is a pending-inject operation before `user_message`, not a transcript rewrite after delivery. The author also flags that the Claude Agent SDK had no public hook for "when was my queued message inserted," which is precisely what `user_message` echoing solves at the protocol level. The proposal below is intended to supersede #484, with credit to [@SteffenDE](https://github.com/SteffenDE) for raising the question first.
Contributor

It doesn't address steer, doesn't define delivery semantics

I'd say this is not a correct representation of that PR.

promptQueueing was always meant to be steering. If you just want to queue for sending after the current turn, this is trivial to implement by ACP clients.

The early end_turn was the signal to the client when the new message was injected into the context.

Author

Fair pushback, that was a sloppier read than I should have given #484. Rewrote the paragraph to call your early-end_turn what it is: a steer-via-yield design where the agent yields and the client's parallel session/prompt becomes the next turn. The honest delta vs. this RFD is that yield-and-re-prompt forces a turn boundary at every steer, so a tool call mid-stream surfaces as completing a turn instead of being absorbed, and the client drives the re-prompt rather than handing the agent a payload. Both reach steer; the trade-offs differ.

}
```

The agent responds when the inject has been accepted for pending delivery, not when the model has processed it. The response carries the agent-assigned `messageId`, matching the v2 prompt lifecycle pattern:
Contributor

Just as a note: this won't work for the Claude Agent SDK, because it will only generate the id at the point of delivery. In practice, this means that ACP message IDs would be an adapter concern, requiring the ACP code to store its own state persistently somewhere in order to get the same IDs for later loads.

We already have similar problems for message IDs in streaming chunks, so at some point this is probably inevitable.

Author

Yeah, this is the part where I think the adapter has to pay the cost and we should just say so out loud. Added an FAQ explicitly addressing it: the wire ID has to come back synchronously, so any adapter bridging to an SDK that doesn't expose a pre-delivery ID needs to mint a wire ID at accept-time and remember the mapping until the underlying SDK delivers. Same shape as the streaming-chunk situation message-id already accepts. The two alternatives (client-provided ID, deferred ID-on-delivery) are worse for different reasons that the FAQ spells out.
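
To make the adapter cost described in this reply concrete, here is a minimal sketch of the mint-and-map approach. It is an illustration under stated assumptions, not the RFD's or any SDK's implementation: the class name `InjectIdMap`, its methods, and the `msg_` ID format are all hypothetical, and the mapping is kept in memory, whereas a real adapter would need to persist it so later session loads see the same IDs.

```python
import uuid
from typing import Dict, Optional


class InjectIdMap:
    """Hypothetical adapter-side table: wire messageId -> SDK message ID."""

    def __init__(self) -> None:
        # A value of None means "accepted, but the SDK has not delivered yet".
        self._sdk_ids: Dict[str, Optional[str]] = {}

    def accept(self) -> str:
        """Mint a wire ID at accept time so the inject response can return it."""
        wire_id = f"msg_{uuid.uuid4().hex}"  # ID format is an assumption
        self._sdk_ids[wire_id] = None
        return wire_id

    def on_delivered(self, wire_id: str, sdk_id: str) -> None:
        """Record the SDK's own ID once the underlying runtime delivers."""
        if wire_id not in self._sdk_ids:
            raise KeyError(f"unknown wire id: {wire_id}")
        self._sdk_ids[wire_id] = sdk_id

    def sdk_id_for(self, wire_id: str) -> Optional[str]:
        """Return the SDK ID, or None while the inject is still pending."""
        return self._sdk_ids[wire_id]
```

The client only ever sees the wire ID; the SDK ID stays an adapter-internal detail that becomes known at delivery.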

Comment thread docs/rfds/v2/session-inject.mdx Outdated

### Semantics

**`queue`** is the simple one. The agent buffers the content. The agent delivers it as a `user_message` notification once `state_change: idle` fires for the current turn. FIFO across multiple queued injects. If a queue lands on an already-idle session, the agent treats it as a normal user message and starts a turn, though clients should prefer `session/prompt` in that case.
Contributor

Maybe worth discussing if this needs to be part of ACP in the first place, as clients need to implement logic for displaying queued messages anyway, so adding the logic to buffer those client-side and do regular prompt on end_turn might be enough and keeps the protocol smaller. No strong opinions on my side there though.

Author

Good question and one I went back and forth on. Added a dedicated FAQ ("Does queue need to be in the protocol at all?") that tries to be honest about it: for one client driving one agent, you're right, client-side buffering covers the user-visible behavior and the protocol stays smaller. The cases I think justify the protocol-level queue are multi-client fan-out (with #533, the agent has to be the ordering authority because client-side queues can't agree on FIFO), session/load reconnect (a queue at the agent survives client restart), and headless agents serving multiple thin surfaces. Open to dropping it if the consensus is that those cases are too speculative for v2.
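
The agent-side queue discussed in this thread can be sketched as a FIFO buffer that is drained when the turn goes idle. This is a rough illustration only: `QueueBuffer`, the `emit` callback (standing in for the `user_message` notification), and the counter-based IDs are hypothetical, and delivering one queued inject per idle transition is an assumption, since the quoted text only specifies FIFO order.

```python
from collections import deque
from typing import Callable, Deque, Tuple


class QueueBuffer:
    """Hypothetical agent-side buffer for `queue`-mode injects."""

    def __init__(self, emit: Callable[[dict], None]) -> None:
        self._pending: Deque[Tuple[str, str]] = deque()  # (messageId, content)
        self._emit = emit  # stands in for sending a user_message notification
        self._counter = 0

    def inject(self, content: str) -> str:
        """Accept an inject for later delivery and return its agent-owned ID."""
        self._counter += 1
        message_id = f"m{self._counter}"
        self._pending.append((message_id, content))
        return message_id

    def on_idle(self) -> bool:
        """Deliver the oldest pending inject; True means a new turn starts."""
        if not self._pending:
            return False
        message_id, content = self._pending.popleft()
        self._emit({"messageId": message_id, "content": content})
        return True
```

With several clients attached, the single buffer at the agent is what gives every client the same delivery order, which is the case the reply argues client-side buffering cannot cover.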

Comment thread docs/rfds/v2/session-inject.mdx Outdated

An agent that only supports buffering at end-of-turn declares `["queue"]`. A tool-loop agent that can break safely between tool calls declares `["queue", "steer"]`. A streaming-only agent that can't do either declares the absence of the capability and clients fall back to `session/cancel`.

Revoke is part of the pending-inject contract. `pending.replace` is optional. Clients should only offer edit-in-place for already-sent pending messages when it is advertised; otherwise they can keep drafts client-side longer or revoke and send a new inject when losing queue position is acceptable.
Contributor

This doesn't seem clear to me. Is revoke required or not? Can an agent omit pending and therefore signal that revoke is not supported?

Author

Good catch, the wording was vague. Tightened it: revoke is mandatory if you support session/inject at all. Any agent that can accept a pending inject can drop one before delivery; worst case is a tombstone flag plus skipping the user_message emit. Replace stays opt-in via pending.replace because content editing is genuinely harder than dropping (the content may already be partially serialized into a prompt envelope by the time the replace lands). The asymmetry is now spelled out instead of implied.
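
A client-side reading of the capability rules in this thread might look like the sketch below. The dictionary layout (`inject.modes`, `inject.pending.replace`) is an assumed shape, since the thread names the fields but does not show the final schema, and the helper names are hypothetical. Falling back to `session/cancel` when the wanted mode is not advertised extends the quoted fallback for a missing capability and is likewise an assumption.

```python
from typing import List


def inject_modes(capabilities: dict) -> List[str]:
    """Return the advertised inject modes, or [] if inject is not supported."""
    inject = capabilities.get("inject")
    if inject is None:
        return []
    modes = inject.get("modes") or []
    if not modes:
        # The thread says `modes` is required and non-empty when inject is present.
        raise ValueError("inject capability present but modes is empty")
    return list(modes)


def choose_method(capabilities: dict, wanted_mode: str) -> str:
    """Pick the method a client would call for mid-turn input."""
    if wanted_mode in inject_modes(capabilities):
        return "session/inject"
    return "session/cancel"  # fallback when the mode is not advertised


def can_replace(capabilities: dict) -> bool:
    """Replace is opt-in; revoke needs no flag once inject is supported."""
    pending = (capabilities.get("inject") or {}).get("pending") or {}
    return bool(pending.get("replace"))
```

A client would offer edit-in-place for a sent pending message only when `can_replace` is true, and otherwise revoke and re-inject.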

Comment thread docs/rfds/v2/session-inject.mdx Outdated

Normal session errors still apply. If the session is unknown, closed, or no longer accepts input, return the same error the agent uses for other session requests.

Delivery is the point where the agent commits the inject to session history and emits, or has irrevocably queued, the matching `user_message`. From that point, the content may already be in the next model input. Revoke must return `already_delivered`, and the client should expect the `user_message` if it has not seen it yet. There is no separate "revoke a delivered message" surface; that is [session rewind (#1214)](https://github.com/agentclientprotocol/agent-client-protocol/pull/1214) territory.
Contributor

or has irrevocably queued

I'd say an agent should always emit the user_message at the point where it belongs in the chat. If a message is only "irrevocably queued", meaning that it can't be edited anymore but belongs to a later point in history, I'd say that should be an error on revoke / replace.

Author

Agreed, that hedge created a state that doesn't actually exist on the wire. Tightened the definition: delivery is exactly the user_message emission. Before that, revoke succeeds. At or after that, revoke returns already_delivered. Any internal commit the agent does before emitting is purely for its own bookkeeping.
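
The tightened rule (revoke succeeds before `user_message` emission and returns `already_delivered` afterwards) can be sketched as a small state holder. The class `PendingInjects`, its method names, and the result dictionary shape are hypothetical; only the reason strings `already_delivered` and `unknown_message_id` come from the thread. Treating a second revoke of the same ID as unknown is an assumption.

```python
from typing import Callable, Dict, Set


class PendingInjects:
    """Hypothetical tracker: delivery is exactly the user_message emission."""

    def __init__(self, emit: Callable[[dict], None]) -> None:
        self._emit = emit  # stands in for the user_message notification
        self._pending: Dict[str, str] = {}  # messageId -> content
        self._delivered: Set[str] = set()

    def accept(self, message_id: str, content: str) -> None:
        self._pending[message_id] = content

    def deliver(self, message_id: str) -> None:
        """Emit user_message; from this point revoke can no longer succeed."""
        content = self._pending.pop(message_id)
        self._delivered.add(message_id)
        self._emit({"messageId": message_id, "content": content})

    def revoke(self, message_id: str) -> dict:
        if message_id in self._delivered:
            return {"revoked": False, "reason": "already_delivered"}
        if message_id not in self._pending:
            return {"revoked": False, "reason": "unknown_message_id"}
        del self._pending[message_id]  # dropped before any user_message was sent
        return {"revoked": True, "reason": None}
```

There is no third state between pending and delivered in this model, which mirrors the removal of the "irrevocably queued" hedge.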

Comment thread docs/rfds/v2/session-inject.mdx Outdated
### Interaction with other in-flight requests

- **`session/cancel` during a pending steer.** Codex [#22815](https://github.com/openai/codex/issues/22815) is a real bug surface here, and we should resolve it in the spec rather than leave it implicit. Recommended behavior: cancel applies to the in-flight turn only. Pending injects (both queue and steer) survive `session/cancel` and deliver as normal once the agent reaches idle. Clients that want to clear everything can call `session/revoke_inject` for pending injects first, then `session/cancel`. Agents should document if they deviate.
- **`session/request_permission` blocking the turn.** While the agent is awaiting a permission decision, the turn is paused but not idle. Queue accumulates as normal. Steer is held until the permission resolves, then delivered at the next break-point (which may be immediately if the permission decision is the break-point). Agents that prefer to drop steers during permission waits may do so, capability-advertised.
Contributor

Agents that prefer to drop steers during permission waits may do so, capability-advertised.

I'd drop that. If an Agent (SDK) can't handle this, the ACP adapter code should buffer rather than just dropping. But maybe I'm misunderstanding the point.

Author

Fair, that carve-out was a cop-out and pushed work onto agents that should sit in the adapter. Removed it. The spec is now uniform: pending steer is held during a permission wait and delivered at the next break-point after the decision resolves. If the underlying runtime can't hold it, the ACP adapter buffers. No capability flag for this.
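
The uniform behavior settled on here (hold a pending steer during a permission wait, then deliver it at the next break-point) could be buffered in an adapter roughly as follows. `SteerGate` and its methods are hypothetical names, and treating the permission decision itself as a break-point is one of the options the quoted text allows, chosen here as an assumption.

```python
from typing import Callable, List


class SteerGate:
    """Hypothetical adapter buffer that holds steers during permission waits."""

    def __init__(self, emit: Callable[[str], None]) -> None:
        self._emit = emit  # stands in for delivering the steer to the runtime
        self._held: List[str] = []
        self.awaiting_permission = False

    def steer(self, content: str) -> None:
        """Accept a steer; it is held, never dropped."""
        self._held.append(content)

    def at_break_point(self) -> int:
        """Deliver held steers unless a permission decision is outstanding."""
        if self.awaiting_permission:
            return 0
        delivered = len(self._held)
        for content in self._held:
            self._emit(content)
        self._held.clear()
        return delivered

    def permission_resolved(self) -> int:
        """Assumption: the decision counts as a break-point, so flush now."""
        self.awaiting_permission = False
        return self.at_break_point()
```

Because the buffer lives in the adapter, the underlying runtime never needs its own notion of a paused-but-not-idle turn.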

kennethsinder and others added 2 commits May 24, 2026 12:34
Resolves PR agentclientprotocol#1261 inline comments from @SteffenDE:

- Reframe PR agentclientprotocol#484 comparison: acknowledge that agentclientprotocol#484's early end_turn is
  a legitimate steer-via-yield design. Spell out the trade-off (forced
  turn boundary at every steer, client drives the re-prompt) rather
  than dismissing it as not addressing steer.
- Make revoke mandatory if session/inject is supported; keep replace
  opt-in via pending.replace. Explain the asymmetry (drop is cheap,
  content edit is harder once partially serialized).
- Drop the "irrevocably queued" hedge. Delivery is defined strictly as
  user_message emission; no phantom committed-but-not-delivered state.
- Drop the agent-may-drop-steers-during-permission-wait carve-out.
  Adapter buffers when the underlying runtime can't.
- Add FAQ on adapter burden for messageId: addresses Claude Agent SDK
  not having a stable ID until delivery, explains why the alternatives
  (client-provided ID, deferred ID) are worse.
- Add FAQ on why queue lives in the protocol rather than client-side:
  multi-client FIFO ordering authority, replay/reconnect, headless
  agents.

Also a prose pass: tighten em-dashes to colons where natural, drop
rule-of-three flourishes, remove "not X. It is Y" constructions,
match the matter-of-fact tone of the v2 prompt lifecycle RFD.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author

kennethsinder commented May 24, 2026

Thanks for the careful review, @SteffenDE! Replied to each thread inline. Quick summary of what changed in 48a629f + c3c9980:

  • Reframed the comparison with PR #484 (docs(rfd): prompt queueing RFD) to acknowledge its early end_turn as a legitimate steer-via-yield design, and articulated the actual trade-off (forced turn boundary at every steer vs. absorb-into-running-turn).
  • Made revoke mandatory; replace stays opt-in via pending.replace. Asymmetry explained.
  • Dropped the "irrevocably queued" hedge. Delivery is now defined strictly as user_message emission; no phantom intermediate state.
  • Dropped the agent-may-drop-steers-during-permission carve-out. Adapter buffers when the underlying runtime can't.
  • New FAQ on the adapter burden for messageId (the Claude Agent SDK case): the wire ID has to come back synchronously, adapter mints + maps, same shape message-id already accepts for streaming chunks.
  • New FAQ on why queue is in the protocol rather than client-side only: multi-client FIFO ordering authority, replay/reconnect, headless agents serving multiple thin surfaces.
  • Prose pass: tightened em-dashes to colons where natural, eased back on rule-of-three flourishes, removed "not X. It is Y" constructions to match the tone of the v2 prompt lifecycle RFD.

Open to more rounds. The biggest open design question I think is the queue itself: whether the multi-client / replay / headless cases justify protocol-level standardization at v2 stabilization time, or whether it's cleaner to ship steer alone and let queue land later once #533 firms up.

kennethsinder and others added 4 commits May 24, 2026 12:52
Cover the convergence that landed in the past quarter:

- Codex app-server `turn/steer` and `turn/interrupt` methods (the
  closest existing analogue to session/inject, by name).
- Cursor 3 "Glass" keybind split (Alt+Enter queues, Cmd+Enter
  interrupts-and-sends).
- Replit Agent 4 "Queue" with explicit queue/steer separation.
- Claude Managed Agents user.message events + idle/running/etc.
  statuses.
- Devin sessions API as the single-mode counterpoint.

Replace the now-superseded Gemini CLI #17197 reference with #18782
(experimental steering hints, system-role variant), and call out the
user-role vs system-role design split in the session/remind FAQ.

Strengthen the adapter FAQ with the concrete bug symptom from
claude-agent-sdk-typescript#67: messages yielded into the async
generator get processed but never appear in the visible transcript,
which is exactly the gap user_message echo + stable wire ID closes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Correct Cursor 1.4 vs Cursor 3 "Glass" conflation: the Alt/Cmd+Enter
  keybind split is from the 1.4 changelog, not the Glass IDE rebrand.
- Fix Replit attribution: drop "Agent 4" and "explicit queue/steer
  framing" — neither is in the source. Update the redirected blog URL.
- Add FAQ comparing to Codex's `turn/steer` / `turn/interrupt`:
  documents the threadId/expectedTurnId/input shape, calls out that
  Codex has no revoke/replace, and articulates why ACP's pending-edit
  surface matters for editor clients.
- Collapse the 2026-05-24 revision history entry to one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Capability shape tightened: `modes` is required and non-empty when
  `inject` is present. Added the `steer_in_stream` capability
  (`["interrupt"]`, `["finish"]`, or both) so clients can know
  whether a mid-stream steer truncates the assistant turn or waits
  for it to finish — closes the previously "agent-defined" interop
  gap that produced visibly different transcripts for the same
  client action.
- Multi-client ordering claim weakened to per-controller FIFO with
  agent-defined cross-controller order observable via `user_message`
  delivery — what the protocol can actually enforce.
- Replace position semantics clarified to "within that mode's
  pending order" now that modes are segregated.
- Error code recommendation moved from `-32602 Invalid params`
  (request wasn't malformed) into the JSON-RPC server-error range
  (`-32000`–`-32099`), with the exact code left to schema-definition
  time and `error.data.reason` as the discriminator.
- Steer on an idle session is now explicitly an error; clients use
  `session/prompt` for input that starts a turn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vetted against ACP's existing ErrorCode union and the patterns in
request-cancellation, next-edit-suggestions, and the multi-client-attach
RFDs. The "exact code TBD" placeholder was duct tape; ACP RFDs assign
specific codes paired with a `data.reason` discriminator string.

Concrete assignments:

- `-32002 "Resource not found"` (already defined): reused for
  `unknown_message_id`, since the semantics match.
- `-32010 "Inject precondition failed"` (new): covers
  `already_delivered`, `no_running_turn`, `replace_not_supported`,
  with `error.data.reason` discriminating. Number chosen to leave
  -32001 through -32009 available for future generic-protocol codes.
- `-32601 "Method not found"` (standard JSON-RPC): an acceptable
  response when `pending.replace` was not advertised, as an
  alternative to returning -32010 with `reason: "replace_not_supported"`.

Updated all four error-response sites and added a schema-additions
entry documenting the ErrorCode extension.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
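
The code assignments in this commit message can be read as a lookup from `error.data.reason` to a JSON-RPC error code. The sketch below uses the codes and reason strings listed above, while the helper name `inject_error` and the choice to include `data.reason` on the `-32002` case too are assumptions rather than something the commit message states.

```python
RESOURCE_NOT_FOUND = (-32002, "Resource not found")
INJECT_PRECONDITION_FAILED = (-32010, "Inject precondition failed")

# Reason strings are taken from the commit message above.
REASON_TO_ERROR = {
    "unknown_message_id": RESOURCE_NOT_FOUND,
    "already_delivered": INJECT_PRECONDITION_FAILED,
    "no_running_turn": INJECT_PRECONDITION_FAILED,
    "replace_not_supported": INJECT_PRECONDITION_FAILED,
}


def inject_error(request_id: int, reason: str) -> dict:
    """Build a JSON-RPC 2.0 error response for a failed inject operation."""
    code, message = REASON_TO_ERROR[reason]
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": code,
            "message": message,
            "data": {"reason": reason},  # discriminator within a shared code
        },
    }
```

A client would branch on `error.data.reason` rather than on the numeric code alone, since three distinct preconditions share `-32010`.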

2 participants