feat: wizard policies by sarahxsanders · Pull Request #305 · PostHog/wizard

sarahxsanders · 2026-02-28T18:58:00Z

hackathon project wip

Why the wizard needs deterministic guardrails

Honestly, so we can sleep better at night. And also because security really does matter when this becomes the default way people install PostHog :)

The wizard today has two security layers:

L0: Commandments: Prompt rules in context-mill (do this, not that). Prompt injection can override these.
L1: canUseTool(): TypeScript allowlist that blocks dangerous operators, restricts commands, and blocks .env file access. It is deterministic, but it only checks pre-execution inputs.
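
To make the L1 idea concrete, here is a minimal sketch of a pre-execution allowlist. It is not the wizard's actual `canUseTool()`: the `ToolCall` shape, the function name `canUseToolSketch`, and the operator and command lists are all assumptions chosen for illustration.

```typescript
// Hypothetical shapes: the wizard's real canUseTool() signature may differ.
type ToolCall =
  | { tool: "Bash"; command: string }
  | { tool: "Read" | "Write" | "Edit"; path: string };

type Decision = { allowed: true } | { allowed: false; reason: string };

// Illustrative lists only, not the wizard's real allowlist.
const DANGEROUS_OPERATORS = [";", "&&", "||", "|", "`", "$(", ">", "<"];
const ALLOWED_COMMANDS = ["npm", "pnpm", "yarn", "npx"];

// True when the final path segment looks like .env, .env.local, and so on.
function isEnvFile(path: string): boolean {
  const name = path.split("/").pop() ?? "";
  return name.startsWith(".env");
}

function canUseToolSketch(call: ToolCall): Decision {
  if (call.tool === "Bash") {
    const op = DANGEROUS_OPERATORS.find((o) => call.command.includes(o));
    if (op !== undefined) {
      return { allowed: false, reason: `operator ${op} is not allowed` };
    }
    const binary = call.command.trim().split(/\s+/)[0] ?? "";
    if (!ALLOWED_COMMANDS.includes(binary)) {
      return { allowed: false, reason: `command ${binary} is not on the allowlist` };
    }
    return { allowed: true };
  }
  // Read / Write / Edit: only the path is inspected, never the content.
  if (isEnvFile(call.path)) {
    return { allowed: false, reason: ".env files are off limits" };
  }
  return { allowed: true };
}
```

Note that a `Write` to an ordinary source path passes this check regardless of what is being written, because only the inputs to the call are inspected.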

This leaves two gaps.

Gaps

Commandments are advisory. Every commandment is a prompt instruction. We trust that the agent follows them because we asked nicely. But if a project file contains text that tells the agent to ignore them, the agent might comply. There's no hard enforcement here.

We don't have post-execution verification. Our allowlist runs before the tool executes. It checks what the agent wants to do, but doesn't check what the agent actually wrote. If the agent writes posthog.capture('user_signed_up', { email: user.email }) into a file, nothing catches it. The PII violation is in the output, and we have no output scanning today.

Solution

We add two enforcement layers:

L2: Cedar policies: Structural rules (paths, commands, URLs), run pre-execution
L3: YARA signatures: Content pattern matching (PII, hardcoded keys, prompt injection), run pre- and post-execution

With this architecture:

Commandments (L0) guide the agent.
Our allowlist, Cedar policies, and YARA signatures enforce hard boundaries the agent can't bypass regardless of prompt manipulation.

How it works

Agent wants to call a tool
        |
        v
  canUseTool()  ---- blocked? ---- tool rejected (existing, unchanged)
        | allowed
        v
  Cedar policy check (L2)
        |
        +-- "rm -rf /" → BLOCKED (structural rule on command pattern)
        +-- WebFetch to unknown domain → BLOCKED (URL allowlist)
        +-- Write to .env → BLOCKED (path rule, defense-in-depth with L1)
        |
        v allowed
  YARA pre-scan (L3)
        |
        +-- "curl $API_KEY" → BLOCKED (secret exfiltration pattern)
        |
        v allowed
  Tool executes
        |
        v
  YARA post-scan (L3)      ← THIS IS NEW — we now check outputs
        |
        +-- posthog.capture() with email field → BLOCKED, agent told to revert
        +-- Hardcoded phc_ key in written file → BLOCKED, agent told to revert
        +-- Prompt injection in file content → ABORT (agent context is poisoned)
        |
        v allowed
  Agent continues
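
The flow above can be sketched as an ordered chain of checks around tool execution. This is only an illustration of the control flow: in the PR, L2 and L3 run in the external harness, while the checks below are in-process regex stand-ins, and every name here (`runGuarded`, `Verdict`, the `*Stub` functions) is hypothetical.

```typescript
type Verdict =
  | { kind: "allow" }
  | { kind: "block"; layer: string; reason: string }
  | { kind: "abort"; layer: string; reason: string };

type Check = (text: string) => Verdict;

const ALLOW: Verdict = { kind: "allow" };

// Run pre-checks in order, execute only if all allow, then scan the output.
function runGuarded(
  input: string,
  preChecks: Check[],
  execute: (input: string) => string,
  postChecks: Check[],
): { verdict: Verdict; executed: boolean } {
  for (const check of preChecks) {
    const v = check(input);
    if (v.kind !== "allow") return { verdict: v, executed: false };
  }
  const output = execute(input);
  for (const check of postChecks) {
    const v = check(output);
    if (v.kind !== "allow") return { verdict: v, executed: true };
  }
  return { verdict: ALLOW, executed: true };
}

// Stand-in checks (assumptions, far simpler than real Cedar or YARA rules).
const cedarStub: Check = (cmd) =>
  /\brm\s+-rf\b/.test(cmd)
    ? { kind: "block", layer: "L2", reason: "destructive command" }
    : ALLOW;

const yaraPreStub: Check = (cmd) =>
  /curl\s+\$\w+/.test(cmd)
    ? { kind: "block", layer: "L3", reason: "secret exfiltration pattern" }
    : ALLOW;

const yaraPostStub: Check = (out) => {
  if (/ignore previous instructions/i.test(out)) {
    return { kind: "abort", layer: "L3", reason: "prompt injection in content" };
  }
  if (/\bphc_[A-Za-z0-9]+/.test(out)) {
    return { kind: "block", layer: "L3", reason: "hardcoded key in output" };
  }
  return ALLOW;
};
```

The distinction between `block` and `abort` mirrors the diagram: a blocked write can be reverted and the run continues, whereas an abort ends the run because the agent's context can no longer be trusted.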

The harness

L2 and L3 run in a separate Rust daemon (sondera-harness-server), not inside the wizard's Node.js process. It starts once when the wizard launches, communicates over a Unix socket, and shuts down when the wizard exits.

Why use an external harness?

Isolation: Even if the agent somehow influenced the wizard runtime, it can't touch the policy engine
Right tool for the job: Cedar and YARA-X are Rust-native. A Unix socket RPC is simpler than FFI or WASM bindings
Monorepo-friendly: One harness instance serves all concurrent sub-agents. No per-agent startup cost
Graceful degradation: If the binary isn't available (unsupported platform, install issue), the wizard falls back to L0 + L1 only. No crash, no user-facing error
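
The graceful degradation point could look roughly like the sketch below. The transport is deliberately abstracted away: `HarnessClient`, `Connect`, and `createGuard` are hypothetical names, and the real client would speak to `sondera-harness-server` over the Unix socket rather than take an injected function.

```typescript
// Hypothetical interface standing in for the RPC client that talks to the harness.
interface HarnessClient {
  evaluate(toolCall: string): Promise<"allow" | "deny">;
}

// Resolves with a client, or rejects if the binary or socket is unavailable.
type Connect = () => Promise<HarnessClient>;

interface Guard {
  layers: string[];
  check(toolCall: string): Promise<"allow" | "deny">;
}

async function createGuard(
  connect: Connect,
  localCheck: (toolCall: string) => "allow" | "deny",
): Promise<Guard> {
  let harness: HarnessClient | null = null;
  try {
    harness = await connect();
  } catch {
    // Unsupported platform or install issue: continue with L0 + L1 only.
    harness = null;
  }
  const client = harness;
  return {
    layers: client ? ["L0", "L1", "L2", "L3"] : ["L0", "L1"],
    async check(toolCall: string): Promise<"allow" | "deny"> {
      // The in-process allowlist (L1) always runs first.
      if (localCheck(toolCall) === "deny") return "deny";
      // The harness (L2 + L3) only runs when it started successfully.
      return client ? client.evaluate(toolCall) : "allow";
    },
  };
}
```

Resolving the fallback once at startup keeps every later tool call on a single code path, so a missing binary never surfaces as a per-call error.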

Will this slow down wizard runs?

I'll have to do some testing, but it shouldn't. It might add around 5-10ms. A typical run takes 6-9 minutes, and nearly all of that is LLM inference. Tool execution is a small fraction of the total. The guardrail evaluation happens between "agent decides to use a tool" and "tool executes".
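
A back-of-the-envelope calculation shows why the overhead should be negligible. It rests on two assumptions: the 5-10ms estimate applies per tool call, and a run makes about 100 tool calls. The second number is purely hypothetical, since no call count has been measured.

```typescript
// Guardrail overhead as a fraction of total run time.
function overheadFraction(toolCalls: number, perCallMs: number, runMinutes: number): number {
  const overheadMs = toolCalls * perCallMs;
  const runMs = runMinutes * 60_000;
  return overheadMs / runMs;
}

// Assumption: 100 tool calls per run (hypothetical, not measured).
// Worst case: slowest check (10ms) on the shortest run (6 minutes).
const worstCase = overheadFraction(100, 10, 6);
console.log(`${(worstCase * 100).toFixed(2)}% of the run`); // prints "0.28% of the run"
```

Even if the real call count were several times higher, the total would stay at a few seconds against a run measured in minutes.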

Enforcement layers

These are just some examples.

Cedar policies

Rule	What it blocks
`forbid-rm-rf`	`rm -rf` in any bash command
`forbid-git-reset-hard`	`git reset --hard`
`forbid-git-push-force`	`git push --force` / `git push -f`
`forbid-git-clean`	`git clean -f`
Network allowlist	WebFetch to anything outside `*.posthog.com`, `github.com/PostHog`, `localhost`
`.env` blocking	Read/write/edit any `.env*` file (defense-in-depth with L1)
YARA + policy model gates	Block when upstream signature/policy scanners fire
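
For readers who don't know Cedar, the sketch below approximates two of these structural rules in TypeScript. The real rules are Cedar policies evaluated by the harness. The regexes, the function names, and the choice to treat bare `posthog.com` as covered by `*.posthog.com` are assumptions made for illustration.

```typescript
// Rule names come from the table above; the patterns are illustrative.
const FORBIDDEN_COMMANDS: Array<{ rule: string; pattern: RegExp }> = [
  { rule: "forbid-rm-rf", pattern: /\brm\s+-rf\b/ },
  { rule: "forbid-git-reset-hard", pattern: /\bgit\s+reset\s+--hard\b/ },
  { rule: "forbid-git-push-force", pattern: /\bgit\s+push\b.*\s(--force|-f)\b/ },
  { rule: "forbid-git-clean", pattern: /\bgit\s+clean\s+-f/ },
];

// Returns the name of the first rule the command violates, or null.
function firstViolatedRule(command: string): string | null {
  const hit = FORBIDDEN_COMMANDS.find((entry) => entry.pattern.test(command));
  return hit ? hit.rule : null;
}

// Network allowlist: *.posthog.com, github.com/PostHog, localhost.
function isAllowedUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // unparseable URLs are rejected
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost") return true;
  // Match on the parsed hostname with a leading dot, so that
  // "evilposthog.com" and "posthog.com.evil.io" do not pass.
  if (host === "posthog.com" || host.endsWith(".posthog.com")) return true;
  if (host === "github.com") {
    return url.pathname === "/PostHog" || url.pathname.startsWith("/PostHog/");
  }
  return false;
}
```

Matching on the parsed hostname rather than the raw string is the important design choice: a substring check for `posthog.com` would accept lookalike domains.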

YARA signatures

Rule	What it catches	When
`pii_in_capture_call`	Email, phone, name, SSN, DOB, IP in `posthog.capture()` or `.identify()`	Post-execution (output scan)
`hardcoded_posthog_key`	`phc_` or `phx_` keys written into source files	Post-execution (output scan)
`autocapture_disabled`	Agent writing `autocapture: false`	Post-execution (output scan)
`prompt_injection_wizard_override`	"ignore previous instructions", "you are now", "skip posthog" in project files	Post-execution (file read scan)
`secret_exfiltration_via_command`	`curl $SECRET`, `base64 | curl`, piping to `nc`	Pre-execution
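
To illustrate what content pattern matching means here, the sketch below approximates four of these signatures with regular expressions in TypeScript. The actual rules are YARA signatures evaluated by YARA-X in the harness. The patterns, the minimum key length, and the `scan` function are assumptions and are much cruder than real signatures would be.

```typescript
interface Signature {
  rule: string;
  pattern: RegExp;
}

// Rule names come from the table above; the patterns are illustrative.
const SIGNATURES: Signature[] = [
  {
    rule: "pii_in_capture_call",
    // A PII-looking property name inside a capture() or identify() call.
    pattern: /posthog\.(capture|identify)\([^)]*\b(email|phone|ssn)\b/i,
  },
  {
    rule: "hardcoded_posthog_key",
    // phc_ or phx_ followed by at least 8 key characters (length is assumed).
    pattern: /\bph[cx]_[A-Za-z0-9]{8,}/,
  },
  {
    rule: "autocapture_disabled",
    pattern: /autocapture\s*:\s*false/,
  },
  {
    rule: "prompt_injection_wizard_override",
    pattern: /ignore previous instructions|you are now|skip posthog/i,
  },
];

// Returns the names of all signatures that match, in table order.
function scan(content: string): string[] {
  return SIGNATURES.filter((s) => s.pattern.test(content)).map((s) => s.rule);
}

// The example from the Gaps section is caught by the output scan.
console.log(scan("posthog.capture('user_signed_up', { email: user.email })"));
```

Because the scan looks at content rather than at the tool call's arguments, the same function can run on a command before execution and on written or read file content afterwards.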

Commandments to enforcement mapping

Every soft rule that can be expressed deterministically now has a hard enforcement counterpart:

Commandment (L0)	Hard enforcement (L2/L3)
"NEVER send PII in capture()"	YARA `pii_in_capture_call`, post-execution output scan
"Use env vars, don't hardcode keys"	YARA `hardcoded_posthog_key`, post-execution output scan
"Don't disable autocapture"	YARA `autocapture_disabled`, post-execution output scan
".env access via wizard-tools only"	Cedar `.env*` path block, pre-execution
"Only modify files in project dir"	Cedar workspace fence, pre-execution
"No destructive commands"	Cedar `rm -rf`, `git reset --hard`, `git push --force` blocks, pre-execution
(implicit: no exfiltration)	Cedar URL allowlist + YARA `secret_exfiltration_via_command`, pre-execution
(defense: prompt injection)	YARA `prompt_injection_wizard_override`, post-execution file read scan

github-actions · 2026-02-28T18:58:12Z

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

/wizard-ci all

Test all apps in a directory:

/wizard-ci android
/wizard-ci angular
/wizard-ci astro
/wizard-ci django
/wizard-ci fastapi
/wizard-ci flask
/wizard-ci javascript-node
/wizard-ci javascript-web
/wizard-ci laravel
/wizard-ci next-js
/wizard-ci nuxt
/wizard-ci python
/wizard-ci rails
/wizard-ci react-native
/wizard-ci react-router
/wizard-ci sveltekit
/wizard-ci swift
/wizard-ci tanstack-router
/wizard-ci tanstack-start
/wizard-ci vue

Test an individual app:

/wizard-ci android/Jetchat
/wizard-ci angular/angular-saas
/wizard-ci astro/astro-hybrid-marketing

/wizard-ci astro/astro-ssr-docs
/wizard-ci astro/astro-static-marketing
/wizard-ci astro/astro-view-transitions-marketing
/wizard-ci django/django3-saas
/wizard-ci fastapi/fastapi3-ai-saas
/wizard-ci flask/flask3-social-media
/wizard-ci javascript-node/express-todo
/wizard-ci javascript-node/fastify-blog
/wizard-ci javascript-node/hono-links
/wizard-ci javascript-node/koa-notes
/wizard-ci javascript-node/native-http-contacts
/wizard-ci javascript-web/saas-dashboard
/wizard-ci laravel/laravel12-saas
/wizard-ci next-js/15-app-router-saas
/wizard-ci next-js/15-app-router-todo
/wizard-ci next-js/15-pages-router-saas
/wizard-ci next-js/15-pages-router-todo
/wizard-ci nuxt/movies-nuxt-3-6
/wizard-ci nuxt/movies-nuxt-4
/wizard-ci python/meeting-summarizer
/wizard-ci rails/fizzy
/wizard-ci react-native/expo-react-native-hacker-news
/wizard-ci react-native/react-native-saas
/wizard-ci react-router/react-router-v7-project
/wizard-ci react-router/rrv7-starter
/wizard-ci react-router/saas-template
/wizard-ci react-router/shopper
/wizard-ci sveltekit/CMSaasStarter
/wizard-ci swift/hackers-ios
/wizard-ci tanstack-router/tanstack-router-code-based-saas
/wizard-ci tanstack-router/tanstack-router-file-based-saas
/wizard-ci tanstack-start/tanstack-start-saas
/wizard-ci vue/movies

Results will be posted here when complete.

sarahxsanders closed this Feb 28, 2026

sarahxsanders force-pushed the hackathon branch from ad2901b to 8821e8a · February 28, 2026 18:58

add file

92475ff

sarahxsanders reopened this Feb 28, 2026

add base policies

f8d4404

sarahxsanders force-pushed the hackathon branch from 90d3fbe to f8d4404 · March 2, 2026 15:27

sarahxsanders wants to merge 2 commits into main from hackathon
