feat(audit): add `ado-aw audit <build-id-or-url>` command by jamesadevine · Pull Request #691 · githubnext/ado-aw

jamesadevine · 2026-05-21T15:01:47Z

`feat(audit)`: add `ado-aw audit <build-id-or-url>` command

ADO-side counterpart to gh aw audit. Single-run audit only in this MVP: download a build's artifacts, run every analyzer (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, missing tools/data/noops), and emit a Markdown or JSON report.

What ships

New src/audit/ module tree:

File	Role
`model.rs`	`AuditData` — top-level JSON contract. Drift-compatible with gh-aw's shape; adds ADO-specific `detection_analysis`, `safe_output_execution`, `rejected_safe_outputs` sections.
`url.rs`	Parses bare build IDs, `dev.azure.com` URLs, legacy `*.visualstudio.com` URLs, on-prem Azure DevOps Server URLs (with optional `&j=`/`&t=`/`&s=` job/step anchors).
`cache.rs`	CLI-version-keyed `<output>/build-<id>/run-summary.json` with atomic temp-file + rename writes.
`analyzers/firewall.rs`	AWF Squid proxy logs → per-domain stats (allowed/denied/mixed).
`analyzers/policy.rs`	AWF `policy-manifest.json` + `audit.jsonl` → rule hit counts.
`analyzers/mcp.rs`	MCPG NDJSON → tool usage, server health (`unreliable` flagging), failures.
`analyzers/otel.rs`	Copilot OTel + `aw_info.json` → metrics + engine config.
`analyzers/safe_outputs.rs`	Joins proposals + detection verdict + execution log keyed on `context`.
`analyzers/detection.rs`	`threat-analysis.json` → DetectionAnalysis.
`analyzers/missing.rs`	Missing-tool / missing-data / noop NDJSON entries.
`analyzers/jobs.rs`	ADO `/timeline` REST → `JobData[]`.
`findings.rs`	8 heuristic rules emitting severity-rated findings + recommendations.
`render/console.rs`	Markdown-style terminal renderer (section ordering mirrors gh-aw).
`render/json.rs`	Stable JSON contract for tooling.
`cli.rs`	Orchestration: URL parse → auth resolve → metadata fetch → artifact download → analyzers → findings → cache → render.

Pipeline-side runtime additions (so ado-aw audit of an existing build has the inputs it needs):

All four src/data/*-base.yml templates emit staging/aw_info.json at runtime (engine, model, agent name, source, target, version, build context). Generated by an extension to AdoAwMarkerExtension.
src/execute.rs writes per-item safe-outputs-executed.ndjson in <output-dir> so the audit can trace proposed → detection → executed per safe output.

CLI surface

ado-aw audit <build-id-or-url>
  -o, --output <dir>         # default ./logs (matches gh-aw operator muscle memory)
  --json                     # emit AuditData as JSON to stdout
  --org <url>                # ADO context overrides; auto-detected from git remote
  --project <name>
  --pat <token>              # also reads AZURE_DEVOPS_EXT_PAT
  --artifacts <agent,detection,safe-outputs>
  --no-cache

Unified rejection trace

When the aggregate THREAT_DETECTION_RESULT has any threat flag set, every proposed safe output lands in safe_output_execution[*].status = not_processed_due_to_aggregate_gate, carries the aggregate reasons[] (annotated applies_to_whole_batch: true), and exactly one severity-high KeyFinding is emitted summarizing which threat flags fired and how many proposals were dropped. A top-level rejected_safe_outputs rollup mirrors the same info for --json consumers.

The threat-analysis prompt itself is unchanged — it's identical to gh-aw's today, and per-item verdicts will be coordinated upstream rather than forked.

Dependencies

zip — unpack downloaded ADO PipelineArtifacts.
wiremock (dev only) — fake ADO REST server for the integration tests.

Tests

80 new audit unit tests across model, url, cache, analyzers, findings, renderers.
3 new integration tests (tests/audit_it.rs) against a fake REST server: happy path, permission-denied, cache hit.
Existing test suite untouched. 1740 tests pass total.

Docs

New docs/audit.md — accepted URL formats, flag table, output layout, AuditData shape, cache behavior, permission-failure UX, out-of-scope follow-ups.
docs/cli.md — new audit subcommand block.
README.md — one-line CLI entry.
AGENTS.md index — pointer to docs/audit.md under "Compiler internals & operations".
prompts/debug-ado-agentic-workflow.md — Step 1 first-move callout, new Step 2a-prime (run ado-aw audit --json before raw MCP timeline/log calls), AuditData top-level-key reference table, jq-diff fallback note. create-/update- prompts intentionally untouched (post-run inspection is debug-flavored).

Validation

cargo build ✓
cargo test ✓ (1740 passed, 0 failed)
cargo clippy --all-targets --all-features ✓ (warnings only, all non-blocking style nits; no new errors)

Explicitly out of scope (recorded as follow-ups)

Diff mode (ado-aw audit <a> <b>)
Cross-run trends (ado-aw audit --last N)
--parse log.md / firewall.md renderers (Rust-native, no JS bundle)
Job/step-anchored audit (anchors are parsed but normalised to the parent build in this MVP)
MCP-exposed audit (agentic-pipelines MCP tool for in-pipeline self-audit)
Per-item detection verdict NDJSON (coordinated upstream with gh-aw)
Partial-approval gating that consumes it
AWF policy-manifest plumbing
AWF firewall token-usage.jsonl opt-in
audit-manifest.json build inventory

Each is recorded in the session plan under "Out-of-scope follow-ups" so they're not lost.

github-actions · 2026-05-21T15:07:25Z

🔍 Rust PR Review

Summary: Well-structured feature addition with good error handling overall, but three actionable issues worth fixing before merge.

Findings

🐛 Bugs / Logic Issues

src/audit/analyzers/detection.rs – find_verdict_path lexicographic sort is wrong for multi-digit run IDs

The function picks the "lexicographically last" analyzed_outputs_* directory using plain string comparison:
```
match &latest_dir {
    Some((current_name, _)) if name <= *current_name => {}
    _ => latest_dir = Some((name, path)),
}
```
"analyzed_outputs_9" sorts after "analyzed_outputs_10" because '9' > '1'. In any build where both exist (e.g. a retry where ADO incremented the numeric suffix), the wrong — older — verdict would be read and the run could be mis-classified as safe.

Fix: parse the trailing integer and compare numerically, or zero-pad during comparison:
```
fn suffix_number(name: &str) -> u64 {
    name.rsplit('_').next().and_then(|s| s.parse().ok()).unwrap_or(0)
}
// then compare suffix_number(&name) vs suffix_number(current_name)
```
The same pattern is used in safe_outputs.rs and jobs.rs (all rely on matches.sort() / lexicographic-last on directory names). All three need the same fix.

🔒 Security Concerns

src/ado/mod.rs – ADO_AW_TEST_ORG_URL override is always active in production

apply_test_org_url_override is a #[doc(hidden)] function, but it is called unconditionally from resolve_ado_context — not gated by #[cfg(test)]. Any user or CI environment that happens to have ADO_AW_TEST_ORG_URL set (e.g. a leftover env var from a previous debugging session) will silently redirect all ADO API calls — build metadata fetches, artifact lists, pipeline updates — to an attacker-controlled URL.

Either gate it:
```
#[cfg(test)]
fn apply_test_org_url_override(ctx: &mut AdoContext) { ... }
#[cfg(not(test))]
fn apply_test_org_url_override(_: &mut AdoContext) {}
```
Or document the env var in docs/cli.md as a supported override so operators can reason about it. The current state (always active, hidden from docs) is the worst of both worlds.

⚠️ Suggestions

src/audit/analyzers/safe_outputs.rs – synchronous std::fs::read_dir called from async context

top_level_dirs_with_prefix and collect_named_files use std::fs::read_dir (blocking), but they are called from async fn analyze_safe_outputs. On a Tokio multi-thread runtime this blocks an executor thread for the duration of the directory walk. The rest of the codebase (and the other analyzers in this PR) consistently use tokio::fs. These two helpers should be converted to async fn with tokio::fs::read_dir, or wrapped with tokio::task::spawn_blocking.

✅ What Looks Good

Zip extraction in ado/mod.rs correctly uses entry.enclosed_name() to reject path-traversal entries — this is the right approach and is well-tested.
append_execution_record swallows its own errors (logged at warn!) so audit-log failures never abort Stage 3 execution — correct design for a diagnostic side-channel.
cache.rs atomic temp-file + rename pattern is correct; the version-keyed cache invalidation is clean.
Error messages on 401/403 include the exact PAT scopes needed and a manual az pipelines fallback command — great UX.
80 new unit tests + 3 integration tests against a fake REST server is solid coverage for a feature this size.

Generated by Rust PR Reviewer for issue #691 · ● 10.3M · ◷

Three issues raised by the Rust PR Reviewer on #691: 1. **Lexicographic sort wrong for multi-digit run IDs.** Previously `find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix` picked the "lexicographically last" `<prefix>_<id>` directory, which sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the older verdict would be read and the run could be mis-classified as safe. New `crate::audit::cmp_numeric_suffix` extracts the trailing token after the final `_`, parses it as `u64`, and compares numerically with a lexicographic tie-breaker for non-numeric suffixes. All three call sites now use it. Regression tests added in mod.rs, detection.rs, and cli.rs. 2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.** The override was `#[doc(hidden)]` but not gated by build mode, so a stray env var (debugging leftover, hostile CI environment) could silently redirect ADO REST calls to an attacker-controlled URL in a release binary. Gated on `cfg(debug_assertions)`: debug builds (`cargo test`, `cargo run`) keep the override AND emit a loud `warn!` on every invocation; release builds (all published artifacts via `cargo build --release`) replace the body with a no-op so a stray env var has no effect. The integration test in `tests/audit_it.rs` continues to work because `cargo test` builds in debug mode. 3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs` had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`) using sync I/O from inside `async fn analyze_safe_outputs`. On a Tokio multi-thread runtime this blocks an executor thread for the duration of the directory walk. Both helpers converted to `async fn` using `tokio::fs::read_dir`. The recursive `collect_named_files` uses `Box::pin` to satisfy the async-recursion shape (consistent with the existing pattern in `crate::detect::scan_directory`). Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5 new regression tests for the numeric-suffix bug). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-05-21T16:55:46Z

🔍 Rust PR Review

Summary: Looks good overall — well-structured module, solid test coverage, good error handling patterns throughout. A few specific concerns worth addressing before merge.

Findings

🐛 Bugs / Logic Issues

src/execute.rs:312-316 — is_budget_exhausted relies on brittle string matching
```
fn is_budget_exhausted(result: &ExecutionResult) -> bool {
    !result.success
        && result.message.starts_with("Skipped")
        && result.message.contains("maximum ")
        && result.message.contains("already reached")
}
```
This function drives whether the audit NDJSON gets status: "budget_exhausted" vs status: "failed". It cross-checks the human-readable message generated in enforce_budget (line 616: "Skipped{}: maximum {} count ({}) already reached..."). If that message is ever rephrased — e.g. changing "maximum" to "max", adding a prefix — budget-exhausted entries silently become "failed" in every audit log with no compile-time warning.

The right fix is to add a structural flag to ExecutionResult (e.g. budget_exhausted: bool) set in enforce_budget, and key off that in execution_record_status. This makes the intent explicit and refactor-safe.
src/execute.rs:320 — execution_record_status maps is_warning() → "skipped"
```
} else if result.is_warning() {
    "skipped"
}
```
is_warning() is true for tools like noop and missing-tool that succeed with a warning (no ADO credentials). These show up as status: "skipped" in the audit log, which is semantically misleading — they ran successfully, they just couldn't persist a result. "warning" or "no_op" would be more accurate and avoids confusion with the actual budget-skip case.

⚠️ Suggestions

src/audit/url.rs:1 — #![allow(dead_code)] on the whole file

The file-level attribute silences all unused-item warnings. If this was added to suppress specific MVP-incomplete items (e.g. job_id/step_id on ParsedBuildRef which are parsed but not yet consumed), prefer targeted #[allow(dead_code)] on just those fields so future dead code in the file doesn't go undetected.
src/audit/cache.rs — fs::rename atomicity on Windows

The atomic temp-write + rename pattern is correct on Linux. Worth a comment noting this is intentionally Linux-only — fs::rename is not atomic on Windows when the destination file exists. Not a bug for ADO hosted agents, just a future-proofing note.
src/compile/extensions/ado_aw_marker.rs — unwrap() in marker_json / aw_info_json
```
fn aw_info_json(&self) -> String {
    serde_json::to_string(&serde_json::json!({...})).unwrap()
}
```
These are provably safe (serializing a json!({}) literal never fails), but per the project style they deserve a // infallible: serializing a static json! literal never fails comment to make that explicit.

✅ What Looks Good

Atomic cache writes (temp → rename → cleanup on failure) correctly implemented in cache.rs.
The heredoc in ado_aw_marker.rs uses a single-quoted delimiter (<<'AW_INFO_EOF') preventing ADO variable expansion in the JSON content. The 4-space YAML indentation strips cleanly to leave the terminator at column 0 after YAML processing — correct.
append_execution_record correctly swallows I/O errors with warn! rather than propagating — Stage 3 audit logging must never abort the actual execution.
Detection-gate rejection path in analyzers/safe_outputs.rs uses expect() only behind the gate_fired guard, making the invariant clear.
wiremock-based integration tests in tests/audit_it.rs are the right approach — no real network calls in CI.
EXECUTED_NDJSON_FILENAME shared via ndjson:: module rather than duplicated as string literals across execute.rs and the audit analyzers.

Generated by Rust PR Reviewer for issue #691 · ● 11.3M · ◷

Single-run audit: download a build's artifacts, run every analyzer (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, missing tools/data/noops), and emit a Markdown or JSON report. ADO-side counterpart to `gh aw audit`. New module tree under `src/audit/`: - `model.rs` — `AuditData` (drift-compatible with gh-aw's top-level contract; adds ADO-specific `detection_analysis`, `safe_output_execution`, `rejected_safe_outputs` sections). - `url.rs` — parses bare IDs, dev.azure.com URLs, legacy visualstudio.com URLs, and on-prem Azure DevOps Server URLs (with optional `&j=`/`&t=`/`&s=` job/step anchors). - `cache.rs` — CLI-version-keyed `run-summary.json` with atomic writes. - `analyzers/{firewall,policy,mcp,otel,safe_outputs,detection,missing,jobs}.rs` — eight defensive NDJSON/REST analyzers. - `findings.rs` — eight heuristic rules emitting severity-rated findings + recommendations. - `render/{console,json}.rs` — two renderers; JSON shape is the public contract. - `cli.rs` — orchestration: URL parse → auth → metadata fetch → artifact download → analyzers → findings → cache → render. Unified rejection trace: when the aggregate `THREAT_DETECTION_RESULT` has any threat flag set, every proposal lands in `not_processed_due_to_aggregate_gate` carrying the aggregate `reasons[]`, exactly one severity-`high` `KeyFinding` is emitted, and a `rejected_safe_outputs` rollup appears at the top level. Pipeline-side runtime additions (so an `ado-aw audit` of an existing build has the data it needs): - `src/data/*-base.yml` (via `AdoAwMarkerExtension`): emits `staging/aw_info.json` at runtime with engine, model, agent name, source path, target, compiler version, and ADO build context. - `src/execute.rs`: writes a per-item `safe-outputs-executed.ndjson` in `<output-dir>` so the audit can show the proposed → detection → executed trace. CLI surface: ado-aw audit <build-id-or-url> -o, --output <dir> # default ./logs --json --org / --project / --pat --artifacts <agent,detection,safe-outputs> --no-cache New dependencies: `zip` (artifact unpack), `wiremock` (dev only — integration test mock server). Tests: 80 new audit unit tests + 3 integration tests against a fake ADO REST server (happy path, permission-denied, cache hit) using a thin `ADO_AW_TEST_ORG_URL` test seam. 1740 total tests pass. Docs: new `docs/audit.md`; updates to `docs/cli.md`, `README.md`, `AGENTS.md` index, and `prompts/debug-ado-agentic-workflow.md` (Step 1 first-move + new Step 2a-prime + `AuditData` reference + jq-diff fallback). Out of scope (explicit follow-ups): diff mode, cross-run trends, `--parse` log.md/firewall.md, job/step-anchored audit, MCP-exposed audit, per-item detection verdict (upstream coordination with gh-aw), partial-approval gating, AWF policy-manifest plumbing, AWF token-usage.jsonl, `audit-manifest.json` build inventory. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Three issues raised by the Rust PR Reviewer on #691: 1. **Lexicographic sort wrong for multi-digit run IDs.** Previously `find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix` picked the "lexicographically last" `<prefix>_<id>` directory, which sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the older verdict would be read and the run could be mis-classified as safe. New `crate::audit::cmp_numeric_suffix` extracts the trailing token after the final `_`, parses it as `u64`, and compares numerically with a lexicographic tie-breaker for non-numeric suffixes. All three call sites now use it. Regression tests added in mod.rs, detection.rs, and cli.rs. 2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.** The override was `#[doc(hidden)]` but not gated by build mode, so a stray env var (debugging leftover, hostile CI environment) could silently redirect ADO REST calls to an attacker-controlled URL in a release binary. Gated on `cfg(debug_assertions)`: debug builds (`cargo test`, `cargo run`) keep the override AND emit a loud `warn!` on every invocation; release builds (all published artifacts via `cargo build --release`) replace the body with a no-op so a stray env var has no effect. The integration test in `tests/audit_it.rs` continues to work because `cargo test` builds in debug mode. 3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs` had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`) using sync I/O from inside `async fn analyze_safe_outputs`. On a Tokio multi-thread runtime this blocks an executor thread for the duration of the directory walk. Both helpers converted to `async fn` using `tokio::fs::read_dir`. The recursive `collect_named_files` uses `Box::pin` to satisfy the async-recursion shape (consistent with the existing pattern in `crate::detect::scan_directory`). Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5 new regression tests for the numeric-suffix bug). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Two issues in src/execute.rs that made the executed-NDJSON manifest silently mis-classify entries: 1. is_budget_exhausted parsed the human-readable message ("Skipped...maximum...already reached"). Any phrasing tweak to check_budget would have silently downgraded budget-exhausted records to status: "failed" in every audit log, with no compile-time signal. 2. is_warning() entries (e.g. noop / missing-tool that succeeded without ADO credentials) were emitted as status: "skipped", conflating successful-but-no-op runs with the rejection bucket used by the audit rollup. Fixes: - Add ExecutionResult.budget_exhausted: bool with budget_exhausted() constructor and is_budget_exhausted() accessor. check_budget now emits a structurally-tagged result; execution_record_status keys off the flag. Refactor-safe. - Map warning results to status: "warning" (distinct from "skipped"). Add SafeOutputStatus::Warning; counted toward executed_count, never toward rejected_by_execution_count or the rejection rollup. - Update affected unit tests; add coverage for budget_exhausted serialization round-trip and warning-status analyzer mapping. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-01T21:25:15Z

🔍 Rust PR Review

Summary: Looks good overall — the audit module is well-structured, error handling is solid, and security-sensitive paths (zip extraction, cache writes) are handled correctly. Three issues worth addressing before merge.

Findings

🐛 Bugs / Logic Issues

src/ado/mod.rs:462 — artifact.name from the ADO REST response is used directly as a path component with no sanitisation:
```
let artifact_dir = dest_dir.join(&artifact.name);
```
If a server (or a compromised build) returned an artifact whose name field contains .. or an absolute path, the subsequent remove_dir_all + create_dir_all calls would operate outside dest_dir. The zip entries themselves are correctly guarded with enclosed_name() (good catch), but the artifact directory is not. A simple fix: check the name contains no path separator or leading / before joining, or use Path::new(&artifact.name).file_name() to reject anything with a parent component.

🔒 Security Concerns

src/ado/mod.rs:483-494 — auth is accepted as a parameter to download_build_artifact but the actual download request (client.get(download_url).send()) does not call auth.apply(). The only reference to auth is let _ = auth; inside the error branch to suppress the unused-variable warning. If the intent is that download_url is always a pre-signed/SAS URL and auth is intentionally omitted, a one-line comment would clarify this and prevent a future contributor from "fixing" it by adding auth.apply() to a signed URL unnecessarily. If it was an accidental omission and the URL is not pre-signed, this is a silent auth bypass on every download.

⚠️ Suggestions

✅ What Looks Good

Zip path traversal is handled correctly: entry.enclosed_name() rejects ../ components before constructing output_path.
Atomic cache writes via temp-file + rename in cache.rs prevent partial/corrupt reads.
apply_test_org_url_override being #[cfg(debug_assertions)]-only is a clean way to keep the test hook out of release builds.
Soft analyzer failures throughout cli.rs — individual analyzer errors are warned and recorded without aborting the full audit run.
load_run_summary version-keying transparently invalidates stale caches across ado-aw upgrades without requiring explicit migration code.

Generated by Rust PR Reviewer for issue #691 · sonnet46 2M · ◷

…xfil Two issues surfaced by the adversarial review: 1. download_build_artifact discarded its �uth argument src/ado/mod.rs::download_build_artifact took &AdoAuth but called client.get(download_url).send() with no �uth.apply(...) wrapper, silenced by let _ = auth;. For ADO artifact resource types whose downloadUrl is not pre-signed (legacy/Container artifacts and many on-prem ADO Server topologies) the request returned 401/403, and the 401 branch then misleadingly told the user to check their PAT scopes — which had never been sent. Fix: wrap with �uth.apply(...) to match list_build_artifacts / get_build. Pre-signed Artifact Services URLs continue to work because they ignore the Authorization header in favor of their sig= query string. 2. �udit <URL> sent the local PAT to any host in the URL src/audit/url.rs accepted on-prem URLs with an arbitrary host; apply_parsed_context_overrides plumbed that host straight into ctx.org_url; and AdoAuth::apply then attached the PAT via HTTP Basic Auth — to whatever host the URL named. A user social-engineered into running �do-aw audit https://attacker.example.com/Coll/Proj/_build/results?buildId=1 would silently exfiltrate their PAT. Fix: validate_audit_url_host() now runs before any auth-bearing request. Microsoft-managed cloud hosts (dev.azure.com, *.visualstudio.com) are always trusted. Any other host must match the host in the trusted ADO context, which is itself resolved from --org or the local git remote — both explicit, locally controlled trust anchors. The check is host-suffix anchored (rejects .visualstudio.com bare-suffix and �isualstudio.com.attacker.example lookalikes) and case-insensitive. Adds 11 unit tests covering: cloud-host allowlist (positive and case-insensitive), bare-suffix / lookalike rejection, trusted-context match, trusted-context mismatch, no-trusted-context fallback, and host extraction edge cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Follow-up to ea9c032. The previous fix derived the trusted host from resolve_ado_context(), which requires BOTH --org AND --project when running outside a git repo. That created a UX regression: when a user ran �do-aw audit https://onprem.example.com/coll/proj/_build/results?buildId=42 from an arbitrary folder with --org https://onprem.example.com/coll, the trust anchor still came back as None (because --project wasn't passed), and validate_audit_url_host then rejected the URL host telling the user to "pass --org" — which they already had. Fix: introduce resolve_trusted_host(cwd, org_flag) that derives a host from --org if provided (any form normalize_org_url accepts), else from the git remote of cwd, else None. validate_audit_url_host now takes Option<&str> directly. Full-context resolution still runs afterward for its original purpose (supplying defaults the URL overrides supersede). Adds 4 regression tests, including one for the original failure mode: running in an arbitrary folder with --org alone (no --project) must yield a usable trust anchor. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-01T21:43:23Z

🔍 Rust PR Review

Summary: Looks good overall — well-structured feature with solid error handling and test coverage. One correctness concern worth addressing before this pattern is widely deployed.

Findings

🐛 Bugs / Logic Issues

src/compile/extensions/ado_aw_marker.rs — ADO macro values in aw_info.json bypass JSON escaping

aw_info_json() embeds ADO runtime macros ($(Build.SourceBranch), $(Build.SourceVersion), $(Build.BuildId), $(System.DefinitionId)) as raw literal strings inside a serde_json::json!({}) call. Those strings are emitted verbatim into the JSON that becomes the heredoc body. ADO expands them before bash runs — but that expansion is a raw text substitution with no JSON escaping. If $(Build.SourceBranch) ever expands to a value containing " or \ (e.g. a ref on an on-prem server that permits backslashes), the written aw_info.json would be malformed and the otel analyzer would silently fail to parse it.

For $(Build.BuildId) (numeric) and $(Build.SourceVersion) (hex SHA) this is zero risk in practice. For $(Build.SourceBranch) it is low but non-zero. One fix: drop the four ADO-macro fields from the compile-time JSON and append them at bash runtime using the environment variables ADO sets ($BUILD_BUILDID, $BUILD_SOURCEVERSION, $BUILD_SOURCEBRANCH, $SYSTEM_DEFINITIONID), e.g. via jq or a printf approach, so the values are properly JSON-escaped before writing.

⚠️ Suggestions

src/audit/url.rs — file-level #[allow(dead_code)] suppresses warnings for the whole module. A blanket allow hides future drift; prefer per-item annotations or rely on pub visibility to drive the lint.
src/execute.rs — error field populated for warnings — append_execution_record_impl writes result.message into the error field for warning-status records (e.g. noop, missing-tool). The audit-side map_execution_status correctly classifies these as SafeOutputStatus::Warning, but raw NDJSON consumers see a populated error for a non-error outcome. Consider only setting error when status is not succeeded and not warning.

✅ What Looks Good

Atomic cache writes in cache.rs (temp-file + rename) are correct and prevent partial reads.
Error swallowing in append_execution_record is intentional and well-documented — audit logging must never break Stage 3 execution.
build_execution_items matching uses a FIFO VecDeque keyed on (name, context) with an index-based fallback for context-free proposals — handles the common cases cleanly.
budget_exhausted structural flag on ExecutionResult is the right call; avoids fragile message-string parsing in the audit path.
Error handling throughout uses anyhow with .context() chains consistently — no naked unwrap() on user-facing paths.

Generated by Rust PR Reviewer for issue #691 · sonnet46 2.3M · ◷

jamesadevine and others added 2 commits June 1, 2026 22:08

jamesadevine force-pushed the feat/audit-command branch from 42e5af1 to c07ea29 Compare June 1, 2026 21:11

jamesadevine and others added 2 commits June 1, 2026 22:31

jamesadevine merged commit 5c33e40 into main Jun 1, 2026
3 checks passed

jamesadevine deleted the feat/audit-command branch June 1, 2026 21:36

github-actions Bot mentioned this pull request Jun 1, 2026

chore(main): release 0.32.0 #824

Open

github-actions Bot mentioned this pull request Jun 2, 2026

docs(site): add ado-aw audit reference page #826

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(audit): add `ado-aw audit <build-id-or-url>` command#691

feat(audit): add `ado-aw audit <build-id-or-url>` command#691
jamesadevine merged 5 commits into
mainfrom
feat/audit-command

jamesadevine commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

github-actions Bot commented Jun 1, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jamesadevine commented May 21, 2026

feat(audit): add ado-aw audit <build-id-or-url> command

What ships

CLI surface

Unified rejection trace

Dependencies

Tests

Docs

Validation

Explicitly out of scope (recorded as follow-ups)

Uh oh!

github-actions Bot commented May 21, 2026

🔍 Rust PR Review

Findings

🐛 Bugs / Logic Issues

🔒 Security Concerns

⚠️ Suggestions

✅ What Looks Good

Uh oh!

github-actions Bot commented May 21, 2026

🔍 Rust PR Review

Findings

🐛 Bugs / Logic Issues

⚠️ Suggestions

✅ What Looks Good

Uh oh!

github-actions Bot commented Jun 1, 2026

🔍 Rust PR Review

Findings

🐛 Bugs / Logic Issues

🔒 Security Concerns

⚠️ Suggestions

✅ What Looks Good

Uh oh!

Uh oh!

github-actions Bot commented Jun 1, 2026

🔍 Rust PR Review

Findings

🐛 Bugs / Logic Issues

⚠️ Suggestions

✅ What Looks Good

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`feat(audit)`: add `ado-aw audit <build-id-or-url>` command