
docs(designs): background tasks#780

Open
gautamsirdeshmukh wants to merge 1 commit into strands-agents:main from gautamsirdeshmukh:design/0008-background-tasks

Conversation

@gautamsirdeshmukh
Contributor

Description

Related Issues

Type of Change

  • New content
  • Content update/revision
  • Structure/organization improvement
  • Typo/formatting fix
  • Bug fix
  • Other (please describe):

Checklist

  • I have read the CONTRIBUTING document
  • My changes follow the project's documentation style
  • I have tested the documentation locally using npm run dev
  • Links in the documentation are valid and working

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions
Contributor

Documentation Preview Ready

Your documentation preview has been successfully deployed!

Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-780/docs/user-guide/quickstart/overview/

Updated at: 2026-04-24T17:04:35.198Z


The SDK now provides a built-in mechanism to "fork" an agent (create an independent copy) and run it alongside the original. No manual cloning, no lock conflicts, no coordination overhead.

**Zero overhead when not used.** Agents that don't configure `backgroundTools` pay no cost. No system prompt augmentation is injected, no management tools are registered, no token or context overhead is added. The decision points check `taskManager.size` and `_backgroundToolNames` and short-circuit immediately — no forks, no queues, no settlement checks. The agent loop behaves identically to today.
Contributor

Does that mean you define it as a tool, not an async subagent task?


### How the Model Sees Background Tasks

Background tools appear in the model's tool definitions identically to foreground tools — same name, same description, same input schema. There is no schema-level async marker. The model learns which tools run asynchronously and how to interact with them solely from the system prompt augmentation described below.
Contributor

why the system prompt, why not just add it to the tool descriptions?

Contributor Author

See Appendix G -- tested out updating tool spec description as well as a few other alternatives before settling on system prompt augmentation

Member

We are augmenting tool results for strands-agents/sdk-python#2162, so I think this does have precedent in the SDK.


### Modified Agent Loop

![Modified Agent Loop](0009-modified-agent-loop.png)
Contributor

Can this background tool make recursive tool calls? What happens in the case where a single tool is a background tool while the others are not?


#### Result Notification

When the background task completes, its result is injected into the conversation as a user text message — not a `tool_result`, since there is no `tool_use` to pair it with. The `toolUseId` from the original dispatch is echoed for correlation:
Contributor

If I am having a conversation with the agent, though, is this a behavior I want?

Or if the agent is in the middle of an event loop writing code, for example.

Contributor

Alternatively, can we use a strategy where we update the tool description to say "There are 2 async task results available for ...", and let it be a bit more model-driven instead of a default notification?

How do the possibly different contexts interact with each other?

| Point | Location | Blocking | Purpose |
|-------|----------|----------|---------|
| **A** | Start of each loop cycle | No | Pop any background tasks that finished (success, error, or cancelled) since the last cycle and inject their results into the conversation as user messages. Proceeds immediately if none have settled. |
| **B** | Per tool, during dispatch | No | For each tool the model calls, check if it's designated as a background tool. If yes: fork the agent (or queue if `maxConcurrentBackgroundTasks` is reached), dispatch the tool on the fork, and return an immediate ACK to the model. If no: execute the tool inline as normal. The agent continues without waiting for background results. |
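The per-tool routing at Decision Point B can be sketched roughly as follows. This is illustrative only: the names `backgroundToolNames`, `dispatched`, `execInline`, and the ACK text are assumptions drawn from the surrounding description, not the SDK's actual API.

```typescript
// Illustrative sketch of Decision Point B: per-tool dispatch routing.
// All names below are assumptions; the real SDK surface may differ.
type ToolCall = { name: string; toolUseId: string };
type ToolResult = { toolUseId: string; content: string };

const backgroundToolNames = new Set(["searchWeb", "analyzeData"]);
const dispatched: ToolCall[] = []; // stands in for fork-and-dispatch bookkeeping

function dispatchTool(call: ToolCall, execInline: (c: ToolCall) => string): ToolResult {
  if (backgroundToolNames.has(call.name)) {
    // In the real design, this is where the fork is created (or the task queued).
    dispatched.push(call);
    // Immediate ACK so the model continues without waiting on the result.
    return { toolUseId: call.toolUseId, content: "Background task dispatched" };
  }
  // Foreground tools execute inline as before.
  return { toolUseId: call.toolUseId, content: execInline(call) };
}
```

The point of the sketch is the non-blocking split: background names get an ACK and a side-channel record, everything else runs inline on the same path it does today.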
Member

@pgrayy pgrayy Apr 24, 2026

You mention forking the agent but dispatching the tool on the fork. Do we need to fork the entire agent? Also, do we need to use forks at all? What if tools were responsible for returning an ack if they dispatch a background process. I think in most scenarios it is going to be the case that a tool calls an api that is running a process on a separate server and thus the fork wouldn't be necessary.


The model can dispatch multiple tools simultaneously, and they all begin executing immediately. As each result arrives, the model can react and adjust its strategy in real time, triggering follow-up work as needed or cancelling tasks that are no longer required. No predefined topology is required – the model's dispatch strategy emerges from its own reasoning.

### A single agent instance can now run concurrent work
Contributor

Does it work for agents as tools too?


The SDK now provides a built-in mechanism to "fork" an agent (create an independent copy) and run it alongside the original. No manual cloning, no lock conflicts, no coordination overhead.

**Zero overhead when not used.** Agents that don't configure `backgroundTools` pay no cost. No system prompt augmentation is injected, no management tools are registered, no token or context overhead is added. The decision points check `taskManager.size` and `_backgroundToolNames` and short-circuit immediately — no forks, no queues, no settlement checks. The agent loop behaves identically to today.
Contributor

@opieter-aws opieter-aws Apr 24, 2026

Can you add backgroundTools to the definitions?

```
[Background Task Result]
tool: <tool_name>
toolUseId: <tool_use_id>
status: success|error|cancelled
```
Contributor

This can't be interrupted, right?


Background tools appear in the model's tool definitions identically to foreground tools — same name, same description, same input schema. There is no schema-level async marker. The model learns which tools run asynchronously and how to interact with them solely from the system prompt augmentation described below.

When any background tools are passed to the agent, the SDK auto-generates and appends the following block to the system prompt:
Contributor

So background tools need to be configured by the user? Why can't the agent make the decision to run a tool in the background or not?

Member

Plus one -- autonomy feels like the main/only gain of background tasks to me, especially given that it was significantly slower than graph in the examples below.


Two alternative approaches to giving the model async dispatch capability:

**1. Meta-tool: `run_in_background(tool_name, args)`**
Member

I'm more for this tool wrapper -- it enables the model-driven approach that is the backbone of Strands.

```typescript
type TaskStatus = 'queued' | 'inProgress' | 'success' | 'error' | 'cancelled'

class BackgroundTask implements PromiseLike<unknown> {
  // ...
}
```
Contributor

How does this correlate to MCP's task definition?


**Prior art: Mastra's dynamic dispatch.** Mastra solves the "same tool, both modes" problem by allowing the model to include a `_background` field in tool call args to override background/foreground per-call. This is a valid approach that adds flexibility, but it adds a hidden parameter to every tool's input schema, requires the model to learn when to use it, and means the developer can't guarantee a tool will always run in a specific mode. We chose static assignment for v1 because it's simpler, predictable, and lets the developer reason about fork safety at construction time. Dynamic dispatch via an opt-in allowlist (developer pre-approves which tools can be dynamically backgrounded, model decides per-call) is the natural extension path if static assignment proves too restrictive.

**3. Task management tool: `manage_tasks({ action: "create" | "status" | "stop" | "get_result", ... })`**
Member

@lizradway lizradway Apr 24, 2026

This seems vaguely coupled to strands-agents/tools#389, which is the context-management side of self-managed tools (whereas this is the lifecycle side). I would say they are very intertwined, though, and worth scoping as we look more into meta tools for context.


Task state is tracked via an internal discriminated union rather than promise state, because the agent loop must check settlement without blocking — a raw Promise offers no synchronous status inspection. The union carries the associated data for each state (result value, error, cancellation reason), eliminating the need for separate flags. When `cancel()` is called, status transitions to `'cancelled'` immediately regardless of current state (queued or inProgress). No-op if the task has already settled. See [Cancellation](#cancellation) for the full API.
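A minimal sketch of such a discriminated union, with assumed field names rather than the SDK's actual types:

```typescript
// Illustrative task-state union; field names are assumptions, not the SDK's types.
// Each variant carries exactly the data that state needs -- no separate flags.
type TaskState =
  | { status: "queued" }
  | { status: "inProgress" }
  | { status: "success"; result: unknown }
  | { status: "error"; error: Error }
  | { status: "cancelled"; reason: string };

// Settlement can be checked synchronously -- no awaiting a Promise required.
function isSettled(s: TaskState): boolean {
  return s.status === "success" || s.status === "error" || s.status === "cancelled";
}
```

Because the union is inspected synchronously, the loop at Decision Point A can poll settlement without blocking, which a raw `Promise` cannot offer.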

#### TaskManager
Contributor

Nomenclature nit: should this be BackgroundTaskManager if it manages BackgroundTasks?


**Zero overhead when not used.** Agents that don't configure `backgroundTools` pay no cost. No system prompt augmentation is injected, no management tools are registered, no token or context overhead is added. The decision points check `taskManager.size` and `_backgroundToolNames` and short-circuit immediately — no forks, no queues, no settlement checks. The agent loop behaves identically to today.

## How It Works
Contributor

For this to work, do we need to keep the process/runtime open?


#### Result Notification

When the background task completes, its result is injected into the conversation as a user text message — not a `tool_result`, since there is no `tool_use` to pair it with. The `toolUseId` from the original dispatch is echoed for correlation:
Contributor

will this bring side effects across different models?

| Sonnet 4.6 | 5 | 86.5s | 31.4s | **2.80x** | ±0.40 | 4.4% | 110/110 |
| Haiku 4.5 | 5 | 79.6s | 28.2s | **2.89x** | ±0.40 | 18.2% | 110/110 |

Context growth (avg messages): 6-7 standard vs 8-10 background. The additional messages are injected background task results — expected behavior, not overhead. Input tokens increase 11-40% due to the model seeing injected results across multiple turns rather than in a single batch. See [Context Management](#context-management).
Member

A 40% increase in tokens is highly noteworthy. That is a huge cost.

Contributor Author

Ah, good callout here - this particular figure is from earlier benchmarks; more recent testing after tightening up various aspects of the mechanism showed far less bloat. Will update this.

Comment on lines +320 to +321
tools: [calculateMetrics, formatReport],
backgroundTools: [searchWeb, analyzeData, researcher],
Contributor

Is there a different way we can define this interface that makes it clear that tools is the superset of tools and backgroundTools?

Member

Plus one. If the same tool is included in both tools and backgroundTools, what does that mean?


This is the structural asymmetry at the heart of background tasks: the ACK uses native `tool_result` pairing, but the real result arrives as a plain text message. This is why the system prompt augmentation is necessary — see [Appendix G](#appendix-g-system-prompt-augmentation-rationale).

#### Model-Driven Task Management
Member

I worry this could lead to more token usage. Does the model need to be the one to poll background tasks? Is that something we can do in the agent loop? Or is it something users could do.

Member

I'm thinking of the following scenario:

  1. Model emits a tool use.
  2. Tool runs in background.
  3. No other tool is executing and so no other work is being done in the loop.
  4. We exit out of the agent loop with a handler so the user can decide when to reinvoke.
    1. One advantage here is that if no work is being done locally, the user can save on compute and shut down their entire process.
  5. Once ready, the user reinvokes the agent.
  6. Agent returns back to the tool call to retrieve the result.
  7. The result is sent to the model.
  8. The model isn't aware that the tool was executed in the background.

Your approach however allows for the model to continue processing concurrent results and follow up on background results. I can see then why passing those as a plain user message rather than a tool result could be helpful. Still curious though if there is a way to still enable passing back as a tool result.

- **`enqueue()`** — the entry point for background dispatch. Creates a `BackgroundTask`, either starts it immediately or queues it based on concurrency, and returns the task handle with the appropriate ACK.
- **`popCompleted()`** — returns and removes all settled tasks from the registry. Triggers queue drain if slots opened. This is the only way tasks leave the registry, ensuring no task is accidentally lost or processed twice.
- **`cancel(id)`** — cancel a specific task by its internal ID. Works on both queued and running tasks. Cancelling a queued task removes it from the queue without ever forking.
- **`cancelByToolUseId(toolUseId)`** — cancel a specific task by its `toolUseId`. This is the primary path for model-driven cancellation, since the model knows `toolUseId` from its own `tool_use` blocks in conversation history.
Contributor

Should this be a task manager method, though? I think it should be part of the tool, and it should just translate to cancel(id).

| **Execution** (invocation lock, cancellation, metrics) | Fresh. Each fork can be invoked and cancelled independently. |
| **Task management** | Fresh `TaskManager`, same config. Fork manages its own background tasks. |

Messages are deep-copied by default. For background tool forks specifically, the SDK automatically passes `messages: []` at Decision Point B — see [Context Management](#context-management) for why and how to opt out via `inheritMessages`.
Contributor

This confuses me. They are copied by default, but not by default for background tool forks? Can you clarify what the different scenarios are?


#### backgroundTools Config

`backgroundTools` accepts the same types as `tools` — `Tool`, `McpClient`, `Agent`, `Graph`, `Swarm`, or nested arrays. Anything that can be a tool can be a background tool.
Contributor

Isn't this more of a tool concern, though? I'd expect something more like @tool(background=True) 🤔

Member

I like this a lot better as well.

Member

Ideally we'd default background to false and allow agents to run tasks in the background unless explicitly configured otherwise, I'd imagine?

| **Fork** | An independent copy of the agent created via `fork()`. Has its own conversation, execution lock, and task manager, but shares the parent's model client and tool registry. The isolation primitive that makes concurrent execution safe. |
| **Decision Point** | One of three locations in the modified agent loop where background task logic is injected. **A** (top of cycle): pop settled results. **B** (per tool): fork and dispatch or execute inline. **C** (end of turn): wait if tasks are pending. |
| **ACK** | The immediate `tool_result` returned to the model when a background tool is dispatched. Contains "Background task dispatched" or "Background task queued." Not a real result — the actual output arrives later via injection. |
| **Injection** | The mechanism by which background task results enter the parent's conversation. Results are appended as user text messages with a `[Background Task Result]` prefix and `toolUseId` for correlation. |
Contributor

Will the model hallucinate the "late" tool results?


#### fork()

`fork()` creates an independent copy of the agent that can be invoked concurrently with the original. It is both the isolation primitive that makes background tasks possible (each dispatch at Decision Point B creates a fork) and a standalone capability for developers who want to parallelize work using the same agent configuration.
Contributor

Might need more justification here?


##### Fork depth guard

A configurable depth limit (default: 20, set via `maxForkDepth` on `AgentConfig`) prevents infinite recursive forking — for example, a background tool that itself dispatches background tools. Throws if exceeded.
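A sketch of that guard, assuming depth is threaded through `fork()` as a simple counter. The `maxForkDepth` name and default come from the doc; the class and function shapes here are illustrative assumptions.

```typescript
// Illustrative fork-depth guard. `maxForkDepth` (default 20) is from the design;
// the error class and function are assumptions for this sketch.
class ForkDepthError extends Error {}

function nextForkDepth(currentDepth: number, maxForkDepth = 20): number {
  // A fork at the limit would produce depth maxForkDepth + 1 -- refuse it.
  if (currentDepth + 1 > maxForkDepth) {
    throw new ForkDepthError(`maxForkDepth (${maxForkDepth}) exceeded`);
  }
  return currentDepth + 1;
}
```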
Contributor

Is 20 not a bit excessive here? With 10 forks, all 20 forks deep, that's 200 agents, if the fork is only allowed to spin up 1 fork (can't find the default here). How do we keep cost managed?

Contributor

How do we ensure multiple forks aren't converging to all doing the same task accidentally?


#### TaskManager

`TaskManager` is the lifecycle manager for `BackgroundTask` instances created during the `backgroundTools` dispatch path ([Decision Point B](#three-decision-points)). It owns settlement detection, cancellation, and cleanup. Each agent instance holds its own `TaskManager`; forks get fresh instances with the same config. This isolation ensures a fork's background tasks are the fork's responsibility — the parent only sees the fork itself as one task, never the sub-tasks the fork may spawn internally.
Member

Do we have to manage our own task manager? This seems heavy and task management has already been solved.


#### Events

Three new hookable events:
Contributor

Who hooks into these? What are the use cases?

..."
```

The model must not assume results arrive in dispatch order. `toolUseId` correlates each result to its original `tool_use` block.
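Assuming the `[Background Task Result]` format shown earlier in the design, the injected user message might be assembled like this (the function name and signature are illustrative, not SDK API):

```typescript
// Builds an injected user message in the [Background Task Result] format
// from the design doc. The function itself is an illustrative assumption.
function formatResultMessage(
  tool: string,
  toolUseId: string,
  status: "success" | "error" | "cancelled",
  body: string,
): string {
  return [
    "[Background Task Result]",
    `tool: ${tool}`,
    `toolUseId: ${toolUseId}`, // echoes the original dispatch for correlation
    `status: ${status}`,
    "",
    body,
  ].join("\n");
}
```

Echoing `toolUseId` in the header is what lets the model (or any downstream consumer) match an out-of-order result back to its originating `tool_use` block.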
Contributor

We will implement this logic, right? Instead of letting the model find and map by itself.


`fork()` creates an independent copy of the agent that can be invoked concurrently with the original. It is both the isolation primitive that makes background tasks possible (each dispatch at Decision Point B creates a fork) and a standalone capability for developers who want to parallelize work using the same agent configuration.

Why it's needed: `invoke()` on the same agent instance acquires an `_isInvoking` lock. A second concurrent call throws `ConcurrentInvocationError`. This is a deliberate safety rail — concurrent writes to the same `messages` array would corrupt conversation state. `fork()` gives each concurrent invocation its own messages, state, and lock:
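A toy model of that lock and the `fork()` escape hatch. The real `Agent`, `_isInvoking`, and `ConcurrentInvocationError` behave as described above; this mock only demonstrates the interaction and is not the SDK implementation.

```typescript
// Toy illustration of the invocation lock and why fork() is needed.
class ConcurrentInvocationError extends Error {}

class ToyAgent {
  private isInvoking = false;
  constructor(public messages: string[] = []) {}

  async invoke(prompt: string): Promise<string> {
    if (this.isInvoking) throw new ConcurrentInvocationError("agent already running");
    this.isInvoking = true;
    try {
      this.messages.push(prompt); // concurrent writes here would corrupt state
      await Promise.resolve();    // stand-in for model round trips
      return `echo: ${prompt}`;
    } finally {
      this.isInvoking = false;
    }
  }

  // fork() deep-copies messages so each copy has its own state and lock.
  fork(): ToyAgent { return new ToyAgent([...this.messages]); }
}
```

With this shape, `Promise.all([agent.invoke("a"), agent.fork().invoke("b")])` runs both invocations concurrently, while a second `invoke()` on the same instance rejects with `ConcurrentInvocationError`.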
Contributor

Why are we invoking the same agent, though? Why not just use use_agent like a sub-agent?

|----------|-----------|-------------|---------------|
| Standard | 98.1s | baseline | 4,006 chars |
| Background | 66.8s | **1.47x faster** | 3,371 chars |
| Graph | 34.8s | **2.82x faster** | 3,866 chars |
Member

Graph is a lot faster; I feel like swarm would be as well, and would address the same use case. Why could we not add async support to swarm?

}
```

Cancellation follows the same pattern as `fetch` with an aborted `AbortSignal` — intentional cancellation is a rejection, not a silent resolve. The synchronous getters (`status`, `result`, `error`) enable inspection without awaiting.
Contributor

This means that we need to thread through the signal into the tool. If possible I'd avoid handling more than one abort signal in addition to the existing top level agent signal


```typescript
interface TaskManagerConfig {
heartbeatMs?: number // How often to emit BackgroundTaskPendingEvent while waiting (default: 5000ms)
// ...
}
```
Contributor

maybe timeBetweenPendingEventMs or something else closer to the function


![Current Agent Loop](0009-current-agent-loop.png)

### Modified Agent Loop
Contributor

@notowen333 notowen333 Apr 24, 2026

High level from the review meeting: I would be interested to see if we could first develop some constructs around async non-blocking tools/work/agents without deeply integrating the feature into the core event loop.

Proving out this async lifecycle management (very likely re-using similar approaches presented in the doc) in a more additive/compatible way could help to iron out the topic and prove out use cases.


In both cases, the agent blocks until the next background task settles, injects the result into the conversation, and re-enters the loop. See [Cancellation](#cancellation) for the safety bounds that prevent indefinite waits.

### How the Model Sees Background Tasks
Member

One quick call out, if we return a tool result later to the model, it may not know what to do with it because context could be lost from other messages added while waiting. Or it could be the case the conversation history was summarized while waiting.
