dgenio · Jun 5, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -47,6 +47,7 @@ jobs:
           python examples/http_driver_demo.py
           python examples/tutorial.py
           python examples/readme_quickstart.py
+          python examples/trace_export_demo.py
 
   conformance_stub:
     name: "Weaver Spec Conformance Stub (v0.1.0)"

diff --git a/AGENTS.md b/AGENTS.md
@@ -138,6 +138,7 @@ See [docs/agent-context/review-checklist.md](docs/agent-context/review-checklist
 | Driver integration patterns | [docs/integrations.md](docs/integrations.md) |
 | Capability design conventions | [docs/capabilities.md](docs/capabilities.md) |
 | Context firewall details | [docs/context_firewall.md](docs/context_firewall.md) |
+| Action trace export contract | [docs/trace_export.md](docs/trace_export.md) |
 
 ## Update policy
 

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -20,6 +20,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   a settings rename to `weaver-kernel` is the optional final step.
 
 ### Added
+- **Action trace export contract (#94).** New `export_action_trace` /
+  `export_action_traces` produce a stable, versioned, JSON-serialisable shape
+  for `ActionTrace` records so downstream tools (e.g. LessonWeaver-style lesson
+  extraction) can consume the audit trail without depending on internals. The
+  export is derived only from already-redaction-safe trace fields — `args`
+  (memory payloads stripped) and `result_summary` (post-firewall counts/flags)
+  — so it cannot widen the I-01 boundary. `ActionTrace` now carries the invoked
+  capability's `sensitivity`, and downstream human-correction metadata can be
+  attached at export time. New [`docs/trace_export.md`](docs/trace_export.md)
+  (including how it differs from the OpenTelemetry export) and
+  [`examples/trace_export_demo.py`](examples/trace_export_demo.py), wired into
+  `make ci`.
+- **Property-based invariant tests (#99).** New `tests/test_policy_properties.py`
+  uses Hypothesis to assert authorization invariants across generated
+  principals, capabilities, scopes, constraints, handles, and tokens: every
+  decision carries a stable reason code, `max_rows` never exceeds the policy
+  cap, handle expansion never exceeds the original grant (indirect-use
+  scenario), tokens never verify outside their scope and tampered/expired
+  tokens are always rejected, policy traces never leak raw scope values, and
+  the trace export is always JSON-serialisable. Adds `hypothesis` as a dev
+  dependency.
 - README repositioned to lead with the unique **capability-token + tamper-evident
   audit** value, with explicit boundary framing for the policy engine (vs
   `AgentFence`, #111) and the context firewall (vs `contextweaver`, #110) so a

diff --git a/Makefile b/Makefile
@@ -25,5 +25,6 @@ example:
 	python examples/repository_safety_check.py
 	python examples/chainweaver_flow.py
 	python examples/evaluation_artifact_policy.py
+	python examples/trace_export_demo.py
 
 ci: fmt-check lint type test example
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -154,7 +154,9 @@ Transforms `RawResult → Frame`. Never exposes raw output to the LLM.
 Stores full results by opaque handle ID with TTL. `expand()` supports pagination, field selection, and basic equality filtering.
 
 ### TraceStore
-Records every `ActionTrace`. `explain(action_id)` returns the full audit record. On a successful invocation the trace also carries a `result_summary` — a redaction-safe dict of counts/flags (`fact_count`, `row_count`, `warning_count`, `has_handle`) derived from the firewalled `Frame`, never from raw driver data — so an invocation's outcome is auditable directly (e.g. a repository safety check passed iff `result_summary["row_count"] == 0`). Failed runs have `result_summary == None`.
+Records every `ActionTrace`. `explain(action_id)` returns the full audit record. On a successful invocation the trace also carries a `result_summary` — a redaction-safe dict of counts/flags (`fact_count`, `row_count`, `warning_count`, `has_handle`) derived from the firewalled `Frame`, never from raw driver data — so an invocation's outcome is auditable directly (e.g. a repository safety check passed iff `result_summary["row_count"] == 0`). Failed runs have `result_summary == None`. Each trace also records the invoked capability's `sensitivity` (`NONE`/`PII`/`PCI`/`SECRETS`/`MEMORY`).
+
+`export_action_trace` / `export_action_traces` serialise traces into a stable, versioned, JSON-serialisable shape for downstream analysis tools (distinct from the OpenTelemetry observability export). See [trace_export.md](trace_export.md).
 
 ### Adapters (`weaver_kernel.adapters`)
 Vendor-specific tool-format adapters that translate between `Capability` objects

diff --git a/docs/trace_export.md b/docs/trace_export.md
@@ -0,0 +1,143 @@
+# Action Trace Export
+
+agent-kernel records an [`ActionTrace`](architecture.md) for every invocation.
+The **trace export contract** turns those records into a stable,
+JSON-serialisable shape that an external tool can consume — for example a
+[LessonWeaver](https://github.com/dgenio/weaver-spec)-style lesson-extraction
+layer that learns from past actions, policies, denials, corrections, and
+outcomes.
+
+```python
+from weaver_kernel import export_action_traces
+
+envelope = export_action_traces(kernel._traces.list_all())
+```
+
+Runnable companion: [`examples/trace_export_demo.py`](../examples/trace_export_demo.py).
+
+## How this differs from OpenTelemetry export
+
+agent-kernel also ships an OpenTelemetry integration
+([`weaver_kernel.otel`](architecture.md), `pip install "weaver-kernel[otel]"`).
+The two serve different consumers and do **not** compete:
+
+| | OpenTelemetry (`instrument_kernel`) | Trace export (`export_action_traces`) |
+|---|---|---|
+| Consumer | Live observability backends (traces/metrics) | Offline analysis / learning tools |
+| Shape | OTel spans + metrics, defined by the OTel spec | Stable JSON envelope defined here |
+| Timing | Emitted during execution | Pulled after the fact from the `TraceStore` |
+| Stability | Tracks OTel semantic conventions | Versioned by `TRACE_EXPORT_VERSION` |
+
+Use OTel for dashboards and alerting; use the export contract when another
+program needs a durable, replayable record of what the agent did.
+
+## Privacy
+
+The export is derived **only** from fields the `ActionTrace` already holds,
+all of which are redaction-safe by construction:
+
+- `args` has memory payloads stripped at record time (keys like `payload`,
+  `content`, `value`, `memory`, `text`, `body` for `memory.*` capabilities
+  become `"[REDACTED]"`).
+- `result_summary` carries counts and flags taken from the **post-firewall**
+  `Frame` — never raw driver data.
+
+The contract adds no kernel-sourced field the trace did not already carry
+(`status` derives from `error`; `correction` is caller-supplied), so exporting
+cannot widen the I-01 firewall boundary or leak kernel-held payloads. A *denied*
+request never produces an `ActionTrace` because policy gates before invocation
+(I-02), so the export covers only authorised invocations; denials surface via `PolicyDenied` / `Kernel.explain_denial`.
+
+## Envelope shape
+
+`export_action_traces(...)` returns a versioned envelope:
+
+```json
+{
+  "schema": "weaver_kernel.action_trace_export",
+  "version": "1",
+  "traces": [ /* one object per ActionTrace */ ]
+}
+```
+
+Each trace object:
+
+| Field | Type | Notes |
+|-------|------|-------|
+| `action_id` | string | Unique id; matches `Kernel.explain(action_id)`. |
+| `capability_id` | string | The capability (tool) that was invoked. |
+| `principal_id` | string | Who invoked it. |
+| `token_id` | string | The capability token used. |
+| `invoked_at` | string | ISO 8601 timestamp. |
+| `response_mode` | string | `summary` / `table` / `handle_only` / `raw`. |
+| `driver_id` | string | Driver that served the call (`""` on failure). |
+| `handle_id` | string \| null | Handle for the full dataset, if one was minted. |
+| `sensitivity` | string | `NONE` / `PII` / `PCI` / `SECRETS` / `MEMORY`. |
+| `status` | string | `succeeded` or `failed` (derived from `error`). |
+| `error` | string \| null | Failure reason; `null` on success. |
+| `args` | object | Redacted invocation arguments. |
+| `result_summary` | object \| null | Post-firewall counts/flags; `null` on failure. |
+| `correction` | object \| null | Optional human-correction metadata (see below). |
+
+### Human corrections
+
+agent-kernel does not record human corrections itself. A downstream tool can
+attach them at export time by passing a mapping of `action_id` → metadata:
+
+```python
+envelope = export_action_traces(
+    traces,
+    corrections={"act-123": {"corrected_by": "reviewer", "note": "wrong customer"}},
+)
+```
+
+## Example output
+
+```json
+{
+  "schema": "weaver_kernel.action_trace_export",
+  "version": "1",
+  "traces": [
+    {
+      "action_id": "0a1b...",
+      "capability_id": "billing.list_invoices",
+      "principal_id": "agent-007",
+      "token_id": "f3c2...",
+      "invoked_at": "2026-06-05T12:00:00+00:00",
+      "response_mode": "summary",
+      "driver_id": "billing",
+      "handle_id": "9d7e...",
+      "sensitivity": "PII",
+      "status": "succeeded",
+      "error": null,
+      "args": {"operation": "list_invoices", "status": "paid"},
+      "result_summary": {"fact_count": 4, "row_count": 0, "warning_count": 1, "has_handle": true},
+      "correction": null
+    },
+    {
+      "action_id": "5e6f...",
+      "capability_id": "billing.flaky_report",
+      "principal_id": "agent-007",
+      "token_id": "11aa...",
+      "invoked_at": "2026-06-05T12:00:01+00:00",
+      "response_mode": "summary",
+      "driver_id": "",
+      "handle_id": null,
+      "sensitivity": "NONE",
+      "status": "failed",
+      "error": "Handler for operation='flaky_report' raised: reporting backend is unavailable",
+      "args": {"operation": "flaky_report"},
+      "result_summary": null,
+      "correction": {"corrected_by": "on-call", "note": "known outage; retried later"}
+    }
+  ]
+}
+```
+
+## Stability
+
+`TRACE_EXPORT_VERSION` is bumped only on a **breaking** change to the field
+shape. New optional fields may be added without a bump, so consumers should
+ignore unknown keys. Assert on `status`, `sensitivity`, and the presence of
+`error` rather than on human-readable strings (the `error` text itself may
+evolve).
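A consumer that follows the Stability guidance above might look like the sketch below. It is not part of agent-kernel: `load_traces` and `failed_action_ids` are hypothetical helper names, and the expected schema and version strings are copied from the example envelope rather than imported from the package. The sketch checks the envelope header, tolerates unknown keys, and branches on `status` instead of matching the `error` text.

```python
import json
from typing import Any

# Assumed values, copied from the example envelope rather than imported from
# weaver_kernel, so this sketch runs without the package installed.
EXPECTED_SCHEMA = "weaver_kernel.action_trace_export"
SUPPORTED_VERSIONS = {"1"}

# Only the fields this hypothetical consumer relies on; other keys are ignored.
REQUIRED_FIELDS = ("action_id", "capability_id", "status", "sensitivity", "error")


def load_traces(raw: str) -> list[dict[str, Any]]:
    """Parse an exported envelope and return its trace objects."""
    envelope = json.loads(raw)
    if envelope.get("schema") != EXPECTED_SCHEMA:
        raise ValueError(f"unexpected schema: {envelope.get('schema')!r}")
    if envelope.get("version") not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {envelope.get('version')!r}")
    traces = envelope.get("traces", [])
    for trace in traces:
        missing = [name for name in REQUIRED_FIELDS if name not in trace]
        if missing:
            raise ValueError(f"trace is missing fields: {missing}")
    return traces


def failed_action_ids(traces: list[dict[str, Any]]) -> list[str]:
    """Select failures by the stable `status` field, not by the error text."""
    return [t["action_id"] for t in traces if t["status"] == "failed"]


# A trimmed envelope; `future_field` stands in for an optional key added later.
SAMPLE = json.dumps(
    {
        "schema": "weaver_kernel.action_trace_export",
        "version": "1",
        "traces": [
            {
                "action_id": "0a1b...",
                "capability_id": "billing.list_invoices",
                "status": "succeeded",
                "sensitivity": "PII",
                "error": None,
                "future_field": "ignored by this consumer",
            },
            {
                "action_id": "5e6f...",
                "capability_id": "billing.flaky_report",
                "status": "failed",
                "sensitivity": "NONE",
                "error": "reporting backend is unavailable",
            },
        ],
    }
)

if __name__ == "__main__":
    print(failed_action_ids(load_traces(SAMPLE)))
```

Rejecting an unsupported `version` outright is one design choice; a consumer that can degrade gracefully might log and skip instead.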
diff --git a/examples/trace_export_demo.py b/examples/trace_export_demo.py
@@ -0,0 +1,133 @@
+"""trace_export_demo.py — export action traces for downstream analysis (#94).
+
+The written contract lives in ``docs/trace_export.md``. This script is the
+runnable companion. It shows how to turn the kernel's audit trail into a
+stable, redaction-safe JSON shape that an external tool (for example a
+LessonWeaver-style lesson-extraction layer) can consume without depending on
+agent-kernel internals.
+
+The demo records two invocations so the export covers both outcomes the
+contract distinguishes:
+
+  1. ``billing.list_invoices`` — a normal READ that **succeeds**.
+  2. ``billing.flaky_report`` — a READ whose driver **fails**, producing a
+     ``status: "failed"`` trace (a *denied* request never reaches invoke, so
+     it never produces a trace; denials surface via ``explain_denial``).
+
+It then prints the versioned export envelope, attaching optional human
+correction metadata to one trace. Everything is offline and deterministic.
+
+Run with: ``python examples/trace_export_demo.py``
+"""
+
+from __future__ import annotations
+
+import asyncio
+import json
+
+from weaver_kernel import (
+    Capability,
+    CapabilityRegistry,
+    DriverError,
+    HMACTokenProvider,
+    Kernel,
+    Principal,
+    SafetyClass,
+    SensitivityTag,
+    StaticRouter,
+    export_action_traces,
+    make_billing_driver,
+)
+from weaver_kernel.drivers.base import ExecutionContext
+from weaver_kernel.models import CapabilityRequest, ImplementationRef
+
+_SECRET = "example-secret-do-not-use-in-prod"
+
+
+def _build_kernel() -> Kernel:
+    capabilities = [
+        Capability(
+            capability_id="billing.list_invoices",
+            name="List Invoices",
+            description="List invoices for a customer",
+            safety_class=SafetyClass.READ,
+            sensitivity=SensitivityTag.PII,
+            allowed_fields=["id", "amount", "currency", "status", "date"],
+            impl=ImplementationRef(driver_id="billing", operation="list_invoices"),
+        ),
+        Capability(
+            capability_id="billing.flaky_report",
+            name="Flaky Report",
+            description="A report whose backing service is currently failing",
+            safety_class=SafetyClass.READ,
+            impl=ImplementationRef(driver_id="billing", operation="flaky_report"),
+        ),
+    ]
+    registry = CapabilityRegistry()
+    registry.register_many(capabilities)
+
+    driver = make_billing_driver()
+
+    def flaky_report(ctx: ExecutionContext) -> object:
+        raise DriverError("reporting backend is unavailable")
+
+    driver.register_handler("flaky_report", flaky_report)
+
+    router = StaticRouter(
+        routes={
+            "billing.list_invoices": ["billing"],
+            "billing.flaky_report": ["billing"],
+        }
+    )
+    kernel = Kernel(
+        registry=registry,
+        token_provider=HMACTokenProvider(secret=_SECRET),
+        router=router,
+    )
+    kernel.register_driver(driver)
+    return kernel
+
+
+async def main() -> None:
+    kernel = _build_kernel()
+    principal = Principal(
+        principal_id="agent-007",
+        roles=["reader"],
+        attributes={"tenant": "acme"},
+    )
+
+    # 1. A successful READ — produces a status="succeeded" trace.
+    list_req = CapabilityRequest(capability_id="billing.list_invoices", goal="list invoices")
+    list_token = kernel.get_token(list_req, principal, justification="")
+    ok_frame = await kernel.invoke(
+        list_token,
+        principal=principal,
+        args={"operation": "list_invoices", "status": "paid"},
+    )
+    print(f"succeeded: action_id={ok_frame.action_id} facts={len(ok_frame.facts)}")
+
+    # 2. A failing READ — produces a status="failed" trace.
+    flaky_req = CapabilityRequest(capability_id="billing.flaky_report", goal="run report")
+    flaky_token = kernel.get_token(flaky_req, principal, justification="")
+    failed_action_id = ""
+    try:
+        await kernel.invoke(flaky_token, principal=principal, args={"operation": "flaky_report"})
+    except DriverError as exc:
+        print(f"failed:    {exc}")
+        # The failure was still recorded; grab the most recent trace's id.
+        failed_action_id = kernel._traces.list_all()[-1].action_id
+
+    # Export everything. Attach an optional human correction to the failed run.
+    corrections = (
+        {failed_action_id: {"corrected_by": "on-call", "note": "known outage; retried later"}}
+        if failed_action_id
+        else None
+    )
+    envelope = export_action_traces(kernel._traces.list_all(), corrections=corrections)
+
+    print("\nExported trace envelope:")
+    print(json.dumps(envelope, indent=2))
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
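The demo above only prints the envelope. For the durable, replayable record the docs describe, a downstream tool could persist it. The sketch below is one possible layout and not part of the export contract: it assumes a JSON Lines file with a header line followed by one trace per line, and `write_envelope_jsonl` / `read_envelope_jsonl` are hypothetical helper names. It relies only on the envelope being plain JSON-serialisable data.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def write_envelope_jsonl(envelope: dict[str, Any], path: Path) -> None:
    """Write the envelope header on line 1, then one trace object per line."""
    header = {key: value for key, value in envelope.items() if key != "traces"}
    with path.open("w", encoding="utf-8") as fh:
        fh.write(json.dumps(header, sort_keys=True) + "\n")
        for trace in envelope["traces"]:
            fh.write(json.dumps(trace, sort_keys=True) + "\n")


def read_envelope_jsonl(path: Path) -> dict[str, Any]:
    """Rebuild the envelope dict from a file written by write_envelope_jsonl."""
    with path.open(encoding="utf-8") as fh:
        header = json.loads(fh.readline())
        traces = [json.loads(line) for line in fh if line.strip()]
    return {**header, "traces": traces}


if __name__ == "__main__":
    # Stand-in for the dict returned by export_action_traces(...).
    envelope = {
        "schema": "weaver_kernel.action_trace_export",
        "version": "1",
        "traces": [
            {"action_id": "0a1b...", "status": "succeeded", "error": None},
            {"action_id": "5e6f...", "status": "failed", "error": "backend down"},
        ],
    }
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "traces.jsonl"
        write_envelope_jsonl(envelope, target)
        assert read_envelope_jsonl(target) == envelope
```

One trace per line keeps the file appendable and easy to stream, at the cost of no longer being a single JSON document.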
diff --git a/pyproject.toml b/pyproject.toml
@@ -51,6 +51,7 @@ dev = [
     "pytest>=8.0",
     "pytest-cov>=5.0",
     "pytest-asyncio>=0.23",
+    "hypothesis>=6.100",
     "ruff>=0.4",
     "mypy>=1.10",
     "httpx>=0.27",

diff --git a/src/weaver_kernel/__init__.py b/src/weaver_kernel/__init__.py
@@ -26,6 +26,7 @@
 Handles & traces::
 
     from weaver_kernel import HandleStore, TraceStore
+    from weaver_kernel import export_action_trace, export_action_traces
 
 LLM tool-format adapters::
 
@@ -140,7 +141,13 @@
 from .registry import CapabilityRegistry
 from .router import StaticRouter
 from .tokens import CapabilityToken, HMACTokenProvider
-from .trace import TraceStore
+from .trace import (
+    TRACE_EXPORT_SCHEMA,
+    TRACE_EXPORT_VERSION,
+    TraceStore,
+    export_action_trace,
+    export_action_traces,
+)
 
 # Single source of truth: read the version from the installed distribution
 # metadata (the PyPI dist name is ``weaver-kernel``, distinct from the import
@@ -250,6 +257,11 @@
     # stores
     "HandleStore",
     "TraceStore",
+    # trace export (issue #94)
+    "TRACE_EXPORT_SCHEMA",
+    "TRACE_EXPORT_VERSION",
+    "export_action_trace",
+    "export_action_traces",
     # adapters
     "AnthropicMiddleware",
     "OpenAIMiddleware",

diff --git a/src/weaver_kernel/kernel/__init__.py b/src/weaver_kernel/kernel/__init__.py
@@ -234,6 +234,7 @@ async def invoke(
             args=args,
             response_mode=response_mode,
             plan=plan,
+            capability=capability,
         )
 
     async def invoke_stream(