design: docs automation #804
Conversation
> In this proposal, we split the problem space into two distinct domains with corresponding workflows:

> 1. source change → docs. Event based. A developer has merged a diff and the docs need to reflect it.

Do we need event-driven, or would a cron job do?

**Documentation Preview Ready**
Your documentation preview has been successfully deployed!
Preview URL: https://d3ehv1nix5p99z.cloudfront.net/pr-cms-804/docs/user-guide/quickstart/overview/
Updated at: 2026-05-01T17:08:28.014Z
> # Strands Documentation Agent/s Review

> In short, these tools allow the main agent to spawn individual Agents, but don't feel like a purpose-built orchestrator-worker protocol/abstraction.

> Upcoming async/background tools would be a necessary piece to provide such constructs. With background agents-as-tools, a construct

> Just a single Agent alone with the same tool set also had a high baseline latency of ~5-12 minutes.
> In terms of signing off on the docs agent, perhaps the current 7-15 minutes is acceptable. In any

This is 100% acceptable to me; imho anything with no impact on mcm release is acceptable. Scale of days, not minutes.

> When I started this work I naively assumed that a documentation automation was going to be really simple in its implementation and the open questions were going to center around distribution and runtime choices. This did not turn out to be the case. Balancing correctness and latency has turned out to be really tricky. At the time of writing, the implementation does not solve for latency.

How important is latency for this use case, given the existing latency of making doc updates manually?
> Since releasing TypeScript side-by-side with Python in the docs, the workflow for creating effective documentation has become more complicated. TypeScript code samples go in `.ts` snippets while Python gets inlined in markdown. `<Tabs>` blocks present each language's flavor of a feature,

Tangential: let's get rid of tabs. They don't add anything, they make the page crowded, they complicate docs writing, and they cause structural problems.

Ryan is (as am I) aligned on this, especially now that we have the language picker. We're ultimately thinking that we'd have entirely different pages per language.
> When I started this work I naively assumed that a documentation automation was going to be really simple in its implementation and the open questions were going to center around distribution and runtime choices. This did not turn out to be the case. Balancing correctness and latency has turned out to be really tricky. At the time of writing, the implementation does not solve for latency.

> Balancing correctness and latency has turned out to be really tricky

Do we need to solve for latency? Do we care? It's not a sync workflow, so as far as I care, it can take a day.
> Across all methods attempted, the comprehensiveness of the explore phase was the most important factor in effectiveness. There are two considerations we might take for useful dedicated vended tools:

> 1. A grep tool (find instances within files)

Why not just a shell tool with a sandbox? (I guess we don't have sandboxes yet :P) But the overall point stands.
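To make the vended-tool idea concrete, here is a stdlib-only sketch of what a grep-style "find instances within files" tool could do. The name `grep_files` and its signature are hypothetical, not an existing Strands tool; a sandboxed shell running `grep -rn` would cover the same ground.

```python
import re
from pathlib import Path

def grep_files(pattern: str, root: str, glob: str = "**/*",
               max_results: int = 100) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every line matching pattern.

    Hypothetical sketch of a vended grep tool for agent exploration.
    """
    rx = re.compile(pattern)
    matches: list[tuple[str, int, str]] = []
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                matches.append((str(path), lineno, line.strip()))
                if len(matches) >= max_results:
                    return matches
    return matches
```

Capping `max_results` matters for agents: an unbounded match list from a large repo would blow up the context window during the explore phase.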
> ## Proposed Docs Agent Pipeline

> To handle the different contexts where the docs agent needs to run (pr, issue, revision) we add contextualize skills as the first step in the

Does the docs agent need to run in different contexts? Why not just run it at 9am each day, on the PRs that were merged yesterday?

The design of the agent is 1:1 with the PR. It's true that we could set up something similar: a daily run batching over merged PRs. That would generate the same runs. What we'd lose with that approach is the context window of the dev. They wouldn't be able to merge a change → see the generated PR → review it and approve or give comments → close the issue out. So kicking off runs live at merge is about aligning with the human in the loop.
> ### Limitations: Latency and Cost

> While experimenting, I optimized for correctness which was unfortunately paired with high latency between 10–20 minutes for

Is this a graph-specific thing, or did it apply to everything?

The graph ran 20+ minutes, with 20 minutes being both the average and the floor. Alternatives were able to get the minimum and average run times significantly reduced.

> latency for docs generation. When comparing to other code generation tools like CC/Codex, it's troubling that the role-style design experimented with was so slow.
> Just a single Agent alone with the same tool set also had a high baseline latency of ~5-12 minutes.

What did it do in those 5-12 minutes? What's the latency breakdown?

> With the understanding that we're expecting to iterate, building out the proposed docs agent and audit agent would have rough edges, but should deliver immediate value.
> If we align on moving forward with implementation, the main open question is whether to start with the reusable `/strands docs` runner now, or wait for the monorepo to avoid short-lived wiring.

My assumption is that we will need a human-led docs overhaul for the v2a releases, so I am pro-waiting for the monorepo for a concrete implementation.

> So, for now starting with the re-usable `/strands docs` runner is a no-regret choice.
> ### Why not run this locally with a SKILL.md

Eww. I want async; why do I need to bother asking an agent to write docs? It should just happen.

I mean, I'm not against a shared skill, but we shouldn't limit it to that.

```
INPUT TYPE: issue    → {contextualize-issue} skill
INPUT TYPE: revision → {contextualize-comments} skill + S3SessionManager
        │
        ▼
```
How could we leverage After Tool Call / After Invocation event hooks here? I'm specifically thinking about the doc-writer → [refiner, validator, audit] → doc-writer loop at the heart of this. I imagine this could help enforce some structured schema/templates per section of the docs repo a bit more deterministically and efficiently than putting everything on the refiner/validator/auditor.

TLDR: constraining the doc-writer output with deterministic guardrails/templates instead of the process being fully agent-driven.

> How could we leverage After Tool Call event hooks here?

What do you mean by this? What are you visualizing? @gautamsirdeshmukh

That's a nice idea. I'd need to think more about the exact logic we'd want on that determinism, but it's definitely an avenue to increase speed. Maybe something like the npm run commands could be leveraged.
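A minimal stdlib-only sketch of that deterministic guardrail idea, assuming a per-section heading template and a `doc_writer` tool name for illustration (this is plain Python, not the actual Strands hook API):

```python
import re

# Hypothetical per-section schema: headings every docs page must contain.
SECTION_TEMPLATE = ["## Overview", "## Usage", "## Reference"]

def validate_section(markdown: str, template=SECTION_TEMPLATE) -> list[str]:
    """Deterministic check a hook could run after each doc-writer tool call.

    Returns the template headings missing from the draft; empty means it passes.
    """
    headings = re.findall(r"^## .+$", markdown, flags=re.MULTILINE)
    return [h for h in template if h not in headings]

def after_tool_call(tool_name: str, output: str) -> dict:
    """Sketch of an after-tool-call hook: bounce template violations straight
    back to the doc-writer instead of routing every structural issue through
    the refiner/validator/auditor agents."""
    if tool_name == "doc_writer":
        missing = validate_section(output)
        if missing:
            return {"retry": True, "feedback": f"missing sections: {missing}"}
    return {"retry": False}
```

Because the check is deterministic, it costs no tokens and gives the doc-writer precise, structured feedback on each retry.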
> In this proposal, we split the problem space into two distinct domains with corresponding workflows:

> 1. source change → docs. Event based. A developer has merged a diff and the docs need to reflect it.
> 2. docs → source as SOT. Proactive sentinel. As a cron-style async job, the docs-auditor agent checks back-and-forth between the state of the docs and the source code. Inconsistencies are raised as issues and then patched by invoking the docs agent from (1)

> As a cron-style async job

I like this idea in general, provided the quality is high enough; even just "randomly look at something and check if it's in sync" would probably catch drift/issues.
> ### Limitations: Latency and Cost

> While experimenting, I optimized for correctness which was unfortunately paired with high latency between 10–20 minutes for medium to large diffs. Similarly, token inputs grew very quickly towards 1-5M input tokens per run.

Our planned work for an improved context management system will reduce the number of tokens used. Archival/long-term memory (which we are currently thinking about) for a docs agent could both reduce latency and increase quality, since it would allow the agent to do semantic search on both the code base and the docs.
> I first reached for a graph because it fit my mental model of the necessary flow.

> `explore -> doc_writer -> refiner -> validator -> language_parity -> ui_tester`

Is that in an isolated runtime/sandbox? The main idea is to achieve better validation/eval results; an agent should not do self-review.
> 👀 → neutral / needs review

> We can also apply the same approach to our existing `/impl` and `/review` workflows. The biggest downside of the

I've seen some folks do this. How would we collect the data and improve the system? Just ask kiro/cc/codex/whatever?
> By setting up a GH action workflow, we can automatically run the docs agent on PR merge. We can re-use existing work like the tools and utilities defined in `devtools`.
> ## Experimentations and Learnings

Can the docs agent handle conflicts? Two PRs might change the same pages and the agent has to deal with that, but I believe it is possible? Also, I'm expecting the author to approve, and an external contributor's approval wouldn't count, right?

Each PR would automatically generate its own run. When there's dependent interplay and those PRs are irreconcilable, a fresh issue citing both PRs could be raised. From what I can tell this would be an uncommon case. And yes, we'd need a maintainer approval on every run as usual.
> ─ ─ ─ = findings from audit/ui-tester re-enter doc-writer, which re-runs

> ### Limitations: Latency and Cost

I guess we should also think about accuracy and making the content human-readable, easy to understand, and user-friendly?
> In our GitHub environment a feedback comment automation would be a simple solution.

> Using the available GitHub emojis:

I like this as a feedback mechanism!
> ### Limitations: Latency and Cost

> While experimenting, I optimized for correctness which was unfortunately paired with high latency between 10–20 minutes for

For the snippets and examples we present in the docs, I would like the agent to run testing. That is currently what I have done. It adds to the latency of course, but it has been worth it.
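A snippet-testing step like the one described could look like this stdlib-only sketch. The extraction regex and runner are assumptions about how the agent does it; this only covers inlined Python blocks, and the `.ts` snippets would need a separate `tsc`/`node` runner.

```python
import re
import subprocess
import sys
import tempfile

def extract_python_snippets(markdown: str) -> list[str]:
    """Pull the bodies of fenced ```python blocks out of a docs page."""
    return re.findall(r"```python\n(.*?)```", markdown, flags=re.DOTALL)

def snippet_passes(code: str, timeout: int = 30) -> bool:
    """Run one snippet in a fresh interpreter; non-zero exit means it's broken."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, timeout=timeout)
    return result.returncode == 0
```

Running each snippet in a subprocess keeps a broken example from taking down the validator, at the cost of one interpreter start-up per snippet, which is part of the latency trade-off mentioned above.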
> Approaches like these could be attempted by using the `use_agent` tool in the community tools repo or alternatively by using the experimental agent-as-config approach. However, neither option is ideal for a clean orchestrator–worker implementation: they only support sequential, blocking invocations.

Is this related to agent teams? What's the blocking sequential limitation? Technically an LLM can call multiple tools at once; we'd need to batch the ends (but that's also what graph does today).

@gautamsirdeshmukh can answer for agent teams. In this section, I'm trying to note that we have tools that feel adjacent but are not a perfect fit for an orchestrator-worker pool design. Sequential is workable but it's not ideal.
> Rate `/strands docs agent` output on this PR:

> 👍 → good / acceptable

We can/should just start collecting all feedback on our agent comments as a way to improve; our existing agents could use the same data TBH.
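Collecting that feedback could start from the GitHub reactions API (`GET /repos/{owner}/{repo}/issues/comments/{comment_id}/reactions`). The verdict mapping below follows the proposal's emoji scheme; the aggregation helper itself is an illustrative sketch.

```python
from collections import Counter

# GitHub reaction names mapped to the proposal's emoji scheme:
# "+1" is 👍 (good/acceptable), "-1" is 👎, "eyes" is 👀 (neutral/needs review).
VERDICTS = {"+1": "good", "-1": "bad", "eyes": "needs_review"}

def tally_feedback(reactions: list[dict]) -> dict[str, int]:
    """Aggregate reaction payloads (shaped like the GitHub reactions API
    response items) into verdict counts for one agent comment."""
    return dict(Counter(VERDICTS.get(r["content"], "other") for r in reactions))
```

Run across all agent comments over time, this would give a per-workflow quality signal that `/strands docs`, `/impl`, and `/review` could all share.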
> ### Coordinating Concurrent Agents

> A small tool-as-class `SharedLedger` can be used to accommodate many Agents interacting with the same file. Each Agent has a `write_ledger`

Who are these conflicting agents? Why do we have multiple agents trying to write to the same filesystem project at the same time? Why not split those?
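A minimal sketch of such a ledger, keeping the `write_ledger` name from the proposal. The conflict semantics and JSON storage are illustrative, and the in-process lock would need to become OS-level file locking if the agents run as separate processes.

```python
import json
import threading
from pathlib import Path

class SharedLedger:
    """Tool-as-class sketch: agents record a claim before writing a file,
    so concurrent workers detect overlap instead of clobbering each other."""

    def __init__(self, path: str):
        self._path = Path(path)
        self._lock = threading.Lock()  # in-process only; see note above
        if not self._path.exists():
            self._path.write_text("[]")

    def write_ledger(self, agent: str, file: str, intent: str) -> list[dict]:
        """Append this agent's claim and return any conflicting entries
        (claims on the same file by a different agent)."""
        with self._lock:
            entries = json.loads(self._path.read_text())
            conflicts = [e for e in entries
                         if e["file"] == file and e["agent"] != agent]
            entries.append({"agent": agent, "file": file, "intent": intent})
            self._path.write_text(json.dumps(entries))
        return conflicts
```

An agent that gets a non-empty conflict list back could either wait, pick a different file, or surface the overlap to the orchestrator.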
> We'd lose two things by taking that choice:

> 1) The opportunity to dogfood concurrent agent coordination tooling

We could dogfood the new context stuff and sandboxes :)
> The docs audit agent has the potential to be very annoying. If it flags issues which are not definitive, issues which were already flagged, or produces any other erroneous output, we will be tempted to turn it off.

Did you consider the agent having one long-running ticket that it always reports to? Checking it once a week and approving or discarding suggested updates could be part of our oncall.
> ## Recommendation

> Start with a reusable `/strands docs` GitHub Actions runner using the proposed main-agent + fresh audit-subagent pipeline. Treat the current latency as acceptable for initial dogfooding, but track it explicitly. Defer fully automatic PR-merge kickoff until the monorepo removes cross-repo permission issues.

We should get to the monorepo relatively soon.
> With the understanding that we're expecting to iterate, building out the proposed docs agent and audit agent would have rough edges, but should deliver immediate value.

> If we align on moving forward with implementation, the main open question is whether to start with the reusable `/strands docs` runner now, or wait for the monorepo to avoid short-lived wiring.

What would be the short-lived wiring? Don't we add some stuff to devtools/strands-command anyways? What would need to change, invocations?
> Either way, the experiment surfaced useful Strands follow-up areas: file exploration tools, fresh-context audit patterns, workflow feedback collection, and multi-agent coordination.

> We also might look to convert some of the learnings around "how do I model my multi-step workflow in Strands" into a page in our docs.

What if the docs agent could propose blog posts based on newly merged features?
> ## Background

> Writing documentation for Strands features and capabilities is the important final step

I think this needs more of a product alignment decision, but if we want to automate docs changes 100% (or honestly, even if we continue manually), we should have some sort of common structure / ethos / guidance / tenets for our docs site content. I would be interested in seeing a proposal for that in the future. Right now, we developers follow the model of "adding whatever we happen to think is needed to understand the feature" (the bar is really low for shipping internal and community docs changes imo), but I think some sort of aligned requirements for documentation should be introduced in the v2 docs overhaul, which this agent can then abide by in its autonomous execution. Our docs site is already highly scattered and has scaled in cluttered and non-uniform ways; information is buried under layers and layers of headers. I think automating docs changes is a must for Strands to scale, but I worry that if we do not have some structural integrity/tenets behind these changes, we'll eventually just end up with docs slop.

Agreed on the need for common structure / ethos / guidance / tenets guidelines. But I'd strongly push back against 100% anything.

> v2 docs overhaul, which this agent can then abide by in its autonomous execution.

I'm also guessing we're not going to have any big v2 overhaul at this point; everything is going to be incremental AFAICT, so we should start documenting this now with that in mind.

That said, @ryanycoleman is going to be working towards adding some pre-defined skills/guidelines that he's been developing to the docs site.
> Since we're already working in GitHub and have existing GH Actions devtools, we can follow the same pattern as `/strands impl` and `/strands review` and use GitHub Action runners.

> By setting up a GH action workflow, we can automatically run the docs agent on PR merge. We can re-use existing work like the tools and utilities defined in `devtools`.

Triggering the agent on merge makes sense from the standpoint of streamlining our development efforts, but it does put the decision of "does this change require a doc update" in the hands of the agent instead of the human (i.e. making the trigger the `/strands docs` command). This isn't a breaking issue, but it would be a cause of extra churn if that first decision step isn't nailed down.
> ─ ─ ─ = findings from audit/ui-tester re-enter doc-writer, which re-runs

> ### Limitations: Latency and Cost

What if we had an agent develop doc templates across the pages? Going forward, the docs agent would fill in these templates, which might speed up delivery. It could potentially help reduce context and ensure better consistency. Generally speaking, maybe there is some preliminary scaffolding we need to set up to get our docs agents running more effectively.
Description
Documentation Automations. Docs agent and docs audit agent.
Reference for source code with agent implementations: https://github.com/notowen333/strands-docs-agents
Type of Change