Proposal: pre-execution guard layer for wallet actions via a wrapper WalletProvider (prompt injection → transaction, key leak, malicious address)

### Language Implementation

- [ ] Python
- [x] TypeScript

### Feature Type

- [ ] Action Provider Template
- [x] Wallet Provider Template
- [ ] Framework Extension
- [ ] Core Requirements
- [ ] Other

### 🚀 The feature, motivation and pitch

Disclosing affiliation upfront: I work on AgentGuard at GoPlus Security (https://github.com/GoPlusSecurity/agentguard, MIT, `@goplus/agentguard`). I'm filing this because AgentKit agents hold real funds, and I'd like to contribute a guarantee-level safety layer and do the work myself.

**The problem.** Today the model is the only thing standing between a prompt-injected agent and a signed transaction. The failure mode is specific to AgentKit's shape: an agent ingests untrusted content — a web page, a Farcaster cast, a tool output, an XMTP message — containing injected instructions, and the next action is `nativeTransfer` or `sendTransaction` to an attacker address. Text-layer safety in the LLM can't reliably catch this because the malicious step looks like a perfectly well-formed wallet action. An opt-in safety *action* (a tool the agent may call) doesn't close it either: an agent already following injected instructions won't volunteer to scan itself.

Concrete failure modes:

- Injected instructions in fetched/social content pivot directly into a transfer or approval to an attacker address.
- A tool call or agent output carries the wallet's private key or a CDP API secret in its arguments.
- An unlimited ERC-20 approval, or a swap into a honeypot token the agent never thought to look up.
- Data the agent has touched gets routed out through an outbound action (post, message, webhook).
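
To make the key-leak case concrete, here is a rough sketch of scanning tool-call arguments for strings shaped like a raw 32-byte private key. It is an illustration only, not AgentGuard's detection logic: `findKeyShapedStrings` and `HEX_32_BYTES` are hypothetical names, and because transaction hashes have the same shape, a real check would need more context than this heuristic uses.

```typescript
// Hypothetical helper names; a coarse heuristic, not AgentGuard's actual rule.
// Matches a standalone 32-byte hex value (optionally 0x-prefixed), which is the
// shape of a raw secp256k1 private key. Transaction hashes look identical.
const HEX_32_BYTES = /\b(?:0x)?[0-9a-fA-F]{64}\b/;

// Walks an arbitrary argument tree and returns the paths of suspicious strings.
function findKeyShapedStrings(value: unknown, path: string = "$"): string[] {
  if (typeof value === "string") {
    return HEX_32_BYTES.test(value) ? [path] : [];
  }
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findKeyShapedStrings(item, `${path}[${i}]`));
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value).flatMap(([key, item]) =>
      findKeyShapedStrings(item, `${path}.${key}`),
    );
  }
  return [];
}
```

Addresses (20 bytes) and longer calldata do not match, because the pattern requires exactly 64 hex characters between word boundaries.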

**Proposed feature.** A `GuardedWalletProvider` in `typescript/agentkit/src/wallet-providers/`: it wraps any provider extending `EvmWalletProvider` (CDP, Privy, viem) and intercepts the signing/sending paths — `sendTransaction`, `signTransaction`, `signMessage`, `signTypedData`. Each intercepted call runs through AgentGuard's decision engine, including web3 transaction simulation, and returns `allow | block | require_user_confirm` with a machine-readable reason the agent loop can surface. Because it wraps the wallet interface rather than adding a tool, it's deterministic: the model can't decline to invoke it.

```ts
const wallet = await CdpWalletProvider.configureWithWallet(config);
const guarded = new GuardedWalletProvider(wallet, {
  mode: "block",            // or "confirm" | "log"
  maxValuePerTx: "0.1",     // optional policy knobs
});
const agentkit = await AgentKit.from({ walletProvider: guarded, actionProviders });
```
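
The interception pattern behind this can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the proposed implementation: `SendCapable`, `TxRequest`, `Decision`, `guardSender`, and `capPolicy` are hypothetical stand-ins, the real `EvmWalletProvider` has more methods, and only the blocking path is shown.

```typescript
// Hypothetical, simplified types; not the real AgentKit or AgentGuard API.
type Verdict = "allow" | "block" | "require_user_confirm";

interface Decision {
  verdict: Verdict;
  reason?: string; // machine-readable reason for non-allow verdicts
}

interface TxRequest {
  to: string;
  value: bigint; // amount in wei
  data?: string;
}

interface SendCapable {
  sendTransaction(tx: TxRequest): Promise<string>;
}

type Policy = (tx: TxRequest) => Decision;

class GuardBlockedError extends Error {
  readonly decision: Decision;
  constructor(decision: Decision) {
    super(decision.reason ?? decision.verdict);
    this.decision = decision;
  }
}

// Wraps the inner wallet so every send is checked before it reaches the signer.
// The agent only ever sees the wrapper, so it cannot skip the check.
function guardSender(inner: SendCapable, policy: Policy): SendCapable {
  return {
    async sendTransaction(tx) {
      const decision = policy(tx);
      if (decision.verdict !== "allow") {
        throw new GuardBlockedError(decision);
      }
      return inner.sendTransaction(tx);
    },
  };
}

// Example policy: cap native value per transaction at 0.1 ETH (10^17 wei).
const MAX_VALUE_WEI = 10n ** 17n;
const capPolicy: Policy = (tx) =>
  tx.value > MAX_VALUE_WEI
    ? { verdict: "block", reason: "VALUE_OVER_CAP" }
    : { verdict: "allow" };
```

Throwing a typed error keeps the reason available to the agent loop, which can surface it or route a `require_user_confirm` verdict to a human.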

Properties that matter for this repo: it runs locally and deterministically, with no API key, no hosted dependency, and no per-call vendor cost. The decision engine averages ~0.13 ms per call on an open 84-sample benchmark (https://github.com/GoPlusSecurity/agentguard), so it's viable in the critical path of every wallet call, where an LLM-judge round-trip would not be.

It scans against 24 detection rules across 10 threat categories. Eight of those rules are web3-specific and map directly onto wallet-action risk: `WALLET_DRAINING`, `UNLIMITED_APPROVAL`, `HIDDEN_TRANSFER`, `SIGNATURE_REPLAY`, `DANGEROUS_SELFDESTRUCT`, `PROXY_UPGRADE`, `REENTRANCY_PATTERN`, `FLASH_LOAN_RISK`. For a `GuardedWalletProvider` the first four are the core checks: drain-shaped transfers and approvals on `sendTransaction`, unbounded ERC-20 approvals, transfers hidden inside unrelated calldata, and replay-prone payloads on `signMessage`/`signTypedData`. The remaining four matter when an agent deploys or interacts with contracts.

Optionally, the wrapper can consult GoPlus address/token threat intelligence for known-malicious counterparties. This is strictly opt-in; the default path stays local and deterministic with no network dependency.
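
As an illustration of what a local, deterministic check for an unbounded ERC-20 approval can look like, the sketch below decodes `approve(address,uint256)` calldata and flags the maximum `uint256` amount. It is an assumption about the general technique, not AgentGuard's `UNLIMITED_APPROVAL` rule: `decodeApproval` and `ApprovalInfo` are hypothetical names, and a production rule would likely also cover very large non-maximal amounts and other allowance paths.

```typescript
// Hypothetical helper; illustrates the technique, not AgentGuard's rule logic.
// 0x095ea7b3 is the 4-byte selector of the ERC-20 approve(address,uint256) function.
const APPROVE_SELECTOR = "095ea7b3";
// 2^256 - 1, the conventional "unlimited" allowance, as 32 bytes of hex.
const MAX_UINT256_HEX = "f".repeat(64);

interface ApprovalInfo {
  spender: string; // 0x-prefixed 20-byte address
  amountHex: string; // 32-byte amount word, lowercase hex without 0x
  unlimited: boolean;
}

// Returns null when the calldata is not a well-formed approve() call.
function decodeApproval(data: string): ApprovalInfo | null {
  const hex = data.toLowerCase().replace(/^0x/, "");
  // Layout: 4-byte selector + 32-byte spender word + 32-byte amount word.
  if (hex.length !== 8 + 64 + 64 || !/^[0-9a-f]+$/.test(hex)) return null;
  if (hex.slice(0, 8) !== APPROVE_SELECTOR) return null;
  // The address occupies the last 20 bytes of the left-padded first word.
  const spender = "0x" + hex.slice(8 + 24, 8 + 64);
  const amountHex = hex.slice(8 + 64);
  return { spender, amountHex, unlimited: amountHex === MAX_UINT256_HEX };
}
```

A guard could call this on the `data` field of every outgoing transaction and escalate to `block` or `require_user_confirm` when `unlimited` is true.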

The same decision engine already runs in production behind two other agent platforms on a shared adapter abstraction — Claude Code (`PreToolUse`/`PostToolUse` hooks) and OpenClaw (`before_tool_call`/`after_tool_call`). An AgentKit adapter is the third instance of an existing pattern, not a new design.

**Relationship to existing work.** This is complementary to #1258's tokenSafety action: that answers "is this token safe?" when the agent asks; this layer guarantees every wallet action gets checked whether or not the agent asks. It also pairs naturally with the spend-permissions item on the WISHLIST — spend permissions cap *how much* an agent can move; this checks *where and why* it's moving.

**Scope I'm proposing.** TypeScript first: one wrapper class + unit tests + a docs page + an example under `typescript/examples/` showing a blocked injection attempt on base-sepolia. Python port as a follow-up PR if there's appetite. No changes to existing wallet providers, action providers, or core.

**The one question I need answered:** is a wrapper WalletProvider the shape you'd accept — or would you rather see this as framework-extension-level middleware? I'll bring a runnable demo and the PR to whichever answer fits.


### Alternatives

1. **An `agentguard` ActionProvider** (a scan tool the agent can call). Rejected as the primary shape: opt-in safety fails exactly when it's needed — a prompt-injected agent won't call its own scanner. Could still be a useful follow-up for agent-invoked scans (skill scanning, registry lookups), but it can't be the guarantee layer.
2. **Per-action safety lookups** (the #1258 tokenSafety approach). Valuable, but coverage depends on the agent choosing to ask, and it covers tokens, not transfers/approvals/signatures generally. Complementary rather than alternative.
3. **LLM-judge on each wallet call.** Adds hundreds of ms and a second model dependency to every transaction; a deterministic local engine at ~0.13 ms avoids both.
4. **Spend permissions / session keys** (WISHLIST). Caps amount at risk but is content-blind — a capped transfer to an attacker address still goes through. Works best combined with this layer.

### Additional context

- Verified against `evmWalletProvider.ts` on main: the abstract base class exposes all the methods the wrapper needs (`sign`, `signMessage`, `signTypedData`, `signTransaction`, `sendTransaction`, `waitForTransactionReceipt`), so this requires zero core changes.
- AgentGuard is MIT-licensed, npm `@goplus/agentguard`. Benchmark methodology and corpus are public and reproducible in the repo.
- I can have a runnable demo (LangChain + AgentKit chatbot on base-sepolia, poisoned tool output triggering a blocked transfer) attached to the eventual PR.
