Add Phase 5.5: verify mismatches via agent discussion #71

Open
hua7450 wants to merge 1 commit into PolicyEngine:add-audit-state-tax-command from hua7450:add-mismatch-verification-discussion

hua7450 (Collaborator) commented Feb 16, 2026

Summary

  • Adds Phase 5.5 to the audit-state-tax command that verifies reported mismatches via agent-to-agent discussion before including them in the final report

Motivation

During a full audit of Iowa's 2025 income tax PR (#7389 in policyengine-us), an audit agent reported a false positive: it flagged the QBI fraction parameter as wrong for 2023+ (0.75 instead of 1.0). Tracing the code showed that ia_qbi_deduction, the variable that uses the fraction, is only called in the pre-2023 indiv/joint path. The 2023+ consolidated path starts directly from federal taxable income, which already includes the QBI deduction. The parameter was correct; the agent had checked its value without tracing the code path that uses it.

What Phase 5.5 Does

For each MISMATCH reported by an audit agent:

  1. Creates a verification team (TeamCreate)
  2. Spawns a verifier agent that greps for parameter usages and traces the call chain from the top-level tax variable down to the flagged parameter
  3. Resumes the original audit agent as a teammate so it can explain its reasoning
  4. Lets them discuss (up to 3-4 round-trips via SendMessage)
  5. Collects a verdict: CONFIRMED, REJECTED, or INCONCLUSIVE
    • CONFIRMED → proceeds to Phase 6 (600 DPI visual verification)
    • REJECTED → excluded from final report (noted as "investigated and cleared")
    • INCONCLUSIVE → proceeds to Phase 6 for manual verification
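
The verdict handling in step 5 can be sketched as a small routing function. This is only an illustration of the rule described above, not the command's actual implementation: the `Verdict` enum, the `Mismatch` dataclass, and `route_mismatches` are hypothetical names introduced here, and the real flow runs through agent messages rather than Python objects.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "CONFIRMED"
    REJECTED = "REJECTED"
    INCONCLUSIVE = "INCONCLUSIVE"


@dataclass
class Mismatch:
    parameter: str    # hypothetical label for the flagged parameter
    verdict: Verdict  # outcome of the Phase 5.5 discussion


def route_mismatches(mismatches):
    """Split verified mismatches into Phase 6 candidates and cleared items.

    CONFIRMED and INCONCLUSIVE both go on to Phase 6; REJECTED items are
    kept out of the final report and only noted as investigated and cleared.
    """
    to_phase_6 = []
    cleared = []
    for mismatch in mismatches:
        if mismatch.verdict is Verdict.REJECTED:
            cleared.append(mismatch)
        else:
            to_phase_6.append(mismatch)
    return to_phase_6, cleared
```

Treating INCONCLUSIVE the same as CONFIRMED at this stage means that only an explicit REJECTED verdict can remove a mismatch from the report.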

Other Changes

  • Updated Key Rules to require both code-path verification AND visual confirmation
  • Added Rule 8: "Trace code paths" — a parameter mismatch is only real if the parameter is reachable in the target tax year
  • Updated Pre-Flight Checklist
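
Rule 8 amounts to a reachability check: a flagged parameter only matters if some call path from the top-level tax variable reaches it in the target year. The sketch below shows the idea on a toy call graph loosely modeled on the Iowa case. Apart from `ia_qbi_deduction`, every name in it (`ia_income_tax`, `ia_income_tax_indiv_joint`, `ia_income_tax_consolidated`, `qbi_fraction_parameter`, `reachable_nodes`) is hypothetical, and the verifier agent does this by grepping and reading code, not by running this function.

```python
def reachable_nodes(call_graph, root, year):
    """Return every node reachable from `root` in the given tax year.

    `call_graph` maps a name to a list of (dependency, first_year, last_year)
    edges; a bound of None means the edge is open-ended on that side.
    """
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for dependency, first_year, last_year in call_graph.get(node, []):
            starts_in_time = first_year is None or year >= first_year
            ends_in_time = last_year is None or year <= last_year
            if starts_in_time and ends_in_time:
                stack.append(dependency)
    return seen


# Toy graph: the indiv/joint path applies through 2022, the consolidated
# path from 2023 on. All names except ia_qbi_deduction are made up.
TOY_GRAPH = {
    "ia_income_tax": [
        ("ia_income_tax_indiv_joint", None, 2022),
        ("ia_income_tax_consolidated", 2023, None),
    ],
    "ia_income_tax_indiv_joint": [("ia_qbi_deduction", None, None)],
    "ia_income_tax_consolidated": [("federal_taxable_income", None, None)],
    "ia_qbi_deduction": [("qbi_fraction_parameter", None, None)],
}

# The fraction is reachable in 2022 but not in 2025, so a 2025 "mismatch"
# on it would be rejected under Rule 8.
in_2022 = "qbi_fraction_parameter" in reachable_nodes(TOY_GRAPH, "ia_income_tax", 2022)
in_2025 = "qbi_fraction_parameter" in reachable_nodes(TOY_GRAPH, "ia_income_tax", 2025)
```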

Test plan

  • Verified the edit applies cleanly on top of add-audit-state-tax-command
  • Run an actual audit with mismatches to test the team discussion flow

🤖 Generated with Claude Code

Audit agents can produce false positives when they check parameter values
in isolation without tracing whether the parameter is actually used in the
target tax year's code path. This adds a verification step where a code-path
verifier agent discusses each reported mismatch with the original audit
agent before including it in the final report.

The verification team uses TeamCreate + SendMessage for back-and-forth
discussion, and the original audit agent is resumed to preserve its full
context. Verdicts are CONFIRMED, REJECTED, or INCONCLUSIVE.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>