Systematic exploration of how the agentic-workflows custom agent responds to workflow creation requests from 4 software worker personas. Tested 3 of 4 scenarios successfully (QA scenario timed out). Findings are based on agent responses to Backend Engineer, DevOps Engineer, and Product Manager scenarios.
Key Findings
✅ The agent consistently produces well-structured, production-quality YAML frontmatter with appropriate engine, permissions, and tool selections
✅ Minimal permissions (e.g. contents: read + only necessary write permissions) are applied by default — good security hygiene
✅ The agent correctly separates MCP-only patterns (no network: allowed entry is needed when everything goes through GitHub MCP) from patterns that need external HTTP access
✅ Concurrency groups and path filters were proactively suggested for the PR-triggered scenario — demonstrates awareness of run efficiency
⚠️ The agent doesn't always use the latest on: schedule shorthand (e.g., schedule: weekly-monday) — it sometimes falls back to cron syntax, which is valid but less readable
Top Patterns Observed
Engine: copilot recommended for all scenarios — appropriate for structured analysis and report generation
Triggers: pull_request (with paths: filter) for PR automation; workflow_run for deployment monitoring; schedule for periodic digests
Tools: github MCP with scoped toolsets — never over-provisioned (e.g., only [pull_requests, repos] for a PR reviewer)
Security: checkout: false suggested when diff is accessible via MCP (avoids unnecessary repo clone)
Safe outputs: safe-outputs: create-issue: true used for incident creation; correct discussions: write permission for digest posting
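The toolset-scoping pattern above can be checked mechanically. The following is a minimal sketch, assuming the workflow frontmatter has already been parsed into a Python dict shaped like `{"tools": {"github": {"toolsets": [...]}}}`. That shape, the function name `extra_toolsets`, and the `EXPECTED_TOOLSETS` table are illustrative assumptions and are not part of the agentic-workflows tooling.

```python
from typing import Dict, List, Set

# Hypothetical baseline: which toolsets each kind of workflow is expected to need.
# The PR-reviewer entry mirrors the example in the findings above.
EXPECTED_TOOLSETS: Dict[str, Set[str]] = {
    "pr_reviewer": {"pull_requests", "repos"},
}


def extra_toolsets(frontmatter: dict, kind: str) -> List[str]:
    """Return toolsets requested beyond the expected baseline for this kind of workflow."""
    requested = frontmatter.get("tools", {}).get("github", {}).get("toolsets", [])
    allowed = EXPECTED_TOOLSETS.get(kind, set())
    return sorted(t for t in requested if t not in allowed)


scoped = {"tools": {"github": {"toolsets": ["pull_requests", "repos"]}}}
broad = {"tools": {"github": {"toolsets": ["pull_requests", "repos", "actions", "issues"]}}}

print(extra_toolsets(scoped, "pr_reviewer"))  # []
print(extra_toolsets(broad, "pr_reviewer"))   # ['actions', 'issues']
```

An empty result means the workflow requests nothing beyond the baseline; any names returned are candidates for removal or for an explicit justification.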
View Scenario Scores
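Scenario 1 — Backend Engineer: DB Migration PR Reviewer
Trigger: pull_request + paths: filter is exactly right
Tools: pull_requests + repos toolsets, no extras
Security: checkout: false, minimal permissions, concurrency cancel
Fork safety: mentioned (roles:) but noted as optional add-on
Scenario 2 — DevOps Engineer: Deployment Failure Monitor
Trigger: workflow_run with named deployment workflows is correct
Tools: actions + issues toolsets; safe-outputs: create-issue
Scenario 4 — Product Manager: Weekly Digest
Trigger: schedule: weekly-monday + workflow_dispatch fallback
Tools: default + discussions toolsets; discussions: write permission
Security: Clean, no network exposure; min-integrity: approved noted
Prompt clarity: 5. Excellent tone guidance for non-technical stakeholders
Completeness: 4. Quiet-week handling explicitly addressed
Average: 4.6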
View Areas for Improvement
Fork safety not default: The DB migration reviewer response noted roles: [write, maintainer, admin] only as an optional add-on. For PR automation, fork safety should be the default recommendation with an opt-out note, since external PRs can trigger workflows with write permissions.
Prompt injection awareness: The deployment monitor scenario correctly used safe-outputs but didn't explicitly flag that workflow log content is untrusted input. A reminder to avoid echoing raw log lines into issue titles would strengthen security guidance.
Toolset availability assumptions: The weekly digest assumed a discussions toolset exists without checking whether it's enabled in the repo. A validation step in the prompt (or a note in documentation) would prevent silent failures.
QA scenario timeout: The coverage analysis scenario caused the agent to exceed its token budget, which suggests that complex multi-artifact scenarios may need explicit scope-limiting guidance in the prompt template.
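To illustrate the point about not echoing raw log lines into issue titles, here is a minimal sketch of one way to reduce an untrusted log line to a short, plain excerpt behind a fixed prefix. The function name `safe_issue_title`, the 80-character limit, and the allowed character set are assumptions chosen for illustration. This only strips markup, mentions, and control characters; it does not make the remaining text trustworthy.

```python
import re

MAX_TITLE_LEN = 80  # assumed excerpt limit, for illustration only


def safe_issue_title(raw_log_line: str, prefix: str = "Deployment failure") -> str:
    """Build an issue title from an untrusted log line.

    Keeps a fixed, trusted prefix and reduces the log text to a short,
    plain excerpt without markup, mentions, or control characters.
    """
    # Replace control characters (including newlines) with spaces.
    text = re.sub(r"[\x00-\x1f\x7f]", " ", raw_log_line)
    # Keep only a conservative character set; everything else becomes a space.
    text = re.sub(r"[^A-Za-z0-9 .,:_/\-]", " ", text)
    # Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    excerpt = text[:MAX_TITLE_LEN]
    return f"{prefix}: {excerpt}" if excerpt else prefix


title = safe_issue_title("ERROR: deploy failed\n@admin <b>now</b>")
print(title)
```

The fixed prefix keeps the start of the title under the workflow author's control, and the length cap keeps a hostile log line from dominating it.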
Recommendations
Strengthen fork safety defaults in PR automation guidance — Update .github/aw/create-agentic-workflow.md to recommend roles: [write, maintainer, admin] by default for pull_request triggered workflows, with a note on how to opt out for open-source repos that want to include external contributors.
Add untrusted-input callout for log/issue content — Enhance .github/aw/github-agentic-workflows.md with a short security note: when workflow prompts consume external content (issue bodies, log output, PR descriptions), remind authors to sanitize or scope what gets written to GitHub (use safe-outputs for writes based on that content).
Document toolset availability checks — Add a pattern to the workflow authoring guide showing how to gracefully handle missing GitHub Discussion categories (e.g., fallback from announcements → general), since this is a common first-run failure point.
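The category fallback in the last recommendation can be sketched as a small selection step. This is a minimal sketch, assuming the list of Discussion categories enabled in the repository has already been retrieved by some other means; the function name `pick_category` and the case-insensitive matching are illustrative assumptions.

```python
from typing import Iterable, Optional, Sequence


def pick_category(
    available: Iterable[str],
    preferred: Sequence[str] = ("announcements", "general"),
) -> Optional[str]:
    """Return the first preferred category that exists, matched case-insensitively."""
    by_lower = {name.lower(): name for name in available}
    for candidate in preferred:
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return None


print(pick_category(["General", "Ideas"]))  # General
print(pick_category(["Q&A"]))               # None
```

Returning `None` rather than raising lets the caller decide how to handle a repository with no suitable category, for example by skipping the post and reporting why.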