From 77b94c06a9c684b89805089b998d5fbffcf80fc5 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Fri, 5 Jun 2026 18:25:03 +0000
Subject: [PATCH 1/2] feat: stable ActionTrace export contract (#94) +
 property-based invariant tests (#99)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

#94 — Action trace export:
- Add export_action_trace / export_action_traces: a stable, versioned,
  JSON-serialisable shape for ActionTrace records so downstream tools (e.g.
  LessonWeaver-style lesson extraction) can consume the audit trail without
  depending on internals. Derived only from already-redaction-safe trace
  fields (redacted args, post-firewall result_summary), so it cannot widen
  the I-01 boundary. Optional human-correction metadata attaches at export
  time; denied requests never produce a trace (policy gates before invoke).
- Record the invoked capability's sensitivity on ActionTrace.
- New docs/trace_export.md (incl. how it differs from the OTel export) and
  examples/trace_export_demo.py (one succeeded + one failed action), wired
  into make ci.

#99 — Property-based tests (tests/test_policy_properties.py, Hypothesis):
- Stable reason code on every allow/deny; max_rows never exceeds the policy
  cap; handle expansion never exceeds the original grant (indirect-use
  scenario); tokens never verify outside scope and tampered/expired tokens
  are rejected; policy traces never leak raw scope values; trace export is
  always JSON-serialisable. Adds hypothesis as a dev dependency.

Validated: ruff check, ruff format --check, mypy (41 files), pytest
(581 passed, 1 skipped: test_mcp_driver, because mcp is not installable
locally), and the example scripts are all green.
---
 .github/workflows/ci.yml             |   1 +
 AGENTS.md                            |   1 +
 CHANGELOG.md                         |  21 ++
 Makefile                             |   1 +
 docs/architecture.md                 |   4 +-
 docs/trace_export.md                 | 142 ++++++++++
 examples/trace_export_demo.py        | 133 ++++++++++
 pyproject.toml                       |   1 +
 src/weaver_kernel/__init__.py        |  14 +-
 src/weaver_kernel/kernel/__init__.py |   1 +
 src/weaver_kernel/kernel/_invoke.py  |  12 +
 src/weaver_kernel/kernel/_stream.py  |   2 +-
 src/weaver_kernel/models.py          |   8 +
 src/weaver_kernel/trace.py           | 105 +++++++-
 tests/test_policy_properties.py      | 370 +++++++++++++++++++++++++++
 tests/test_trace.py                  | 130 +++++++++-
 16 files changed, 939 insertions(+), 7 deletions(-)
 create mode 100644 docs/trace_export.md
 create mode 100644 examples/trace_export_demo.py
 create mode 100644 tests/test_policy_properties.py
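
Note (illustrative only, not part of the patch): a minimal sketch of how a
downstream consumer might read the envelope that docs/trace_export.md
describes. The helper name `summarise_envelope` and the sample data are
hypothetical; only the `schema`, `version`, `traces`, `status`, `sensitivity`,
and `error` keys come from the contract in this patch.

```python
import json
from collections import Counter
from typing import Any

# Values taken from the export contract introduced by this patch.
EXPECTED_SCHEMA = "weaver_kernel.action_trace_export"
SUPPORTED_VERSIONS = {"1"}


def summarise_envelope(envelope: dict[str, Any]) -> dict[str, int]:
    """Hypothetical consumer helper: count exported traces per status."""
    if envelope.get("schema") != EXPECTED_SCHEMA:
        raise ValueError(f"unexpected schema: {envelope.get('schema')!r}")
    if envelope.get("version") not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {envelope.get('version')!r}")
    # Read only the keys we need. Unknown keys are ignored, because the
    # contract allows new optional fields without a version bump.
    counts = Counter(trace["status"] for trace in envelope["traces"])
    return dict(counts)


# Made-up sample in the documented shape, round-tripped through JSON the way
# a real consumer would receive it. "future_field" stands in for an optional
# field added later.
sample = json.loads(json.dumps({
    "schema": EXPECTED_SCHEMA,
    "version": "1",
    "traces": [
        {"action_id": "a1", "status": "succeeded", "sensitivity": "PII", "error": None},
        {"action_id": "a2", "status": "failed", "sensitivity": "NONE",
         "error": "backend unavailable", "future_field": 123},
    ],
}))

summary = summarise_envelope(sample)
print(summary)
```

Checking `schema` and `version` up front lets a consumer fail fast on a
breaking change while still tolerating additive ones.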

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index cdeb98b..7f726cc 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -47,6 +47,7 @@ jobs:
           python examples/http_driver_demo.py
           python examples/tutorial.py
           python examples/readme_quickstart.py
+          python examples/trace_export_demo.py
 
   conformance_stub:
     name: "Weaver Spec Conformance Stub (v0.1.0)"
diff --git a/AGENTS.md b/AGENTS.md
index 3948e15..de88aed 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -138,6 +138,7 @@ See [docs/agent-context/review-checklist.md](docs/agent-context/review-checklist
 | Driver integration patterns | [docs/integrations.md](docs/integrations.md) |
 | Capability design conventions | [docs/capabilities.md](docs/capabilities.md) |
 | Context firewall details | [docs/context_firewall.md](docs/context_firewall.md) |
+| Action trace export contract | [docs/trace_export.md](docs/trace_export.md) |
 
 ## Update policy
 
diff --git a/CHANGELOG.md b/CHANGELOG.md
index a2862ce..e8fe8ff 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -20,6 +20,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   a settings rename to `weaver-kernel` is the optional final step.
 
 ### Added
+- **Action trace export contract (#94).** New `export_action_trace` /
+  `export_action_traces` produce a stable, versioned, JSON-serialisable shape
+  for `ActionTrace` records so downstream tools (e.g. LessonWeaver-style lesson
+  extraction) can consume the audit trail without depending on internals. The
+  export is derived only from already-redaction-safe trace fields — `args`
+  (memory payloads stripped) and `result_summary` (post-firewall counts/flags)
+  — so it cannot widen the I-01 boundary. `ActionTrace` now carries the invoked
+  capability's `sensitivity`, and downstream human-correction metadata can be
+  attached at export time. New [`docs/trace_export.md`](docs/trace_export.md)
+  (including how it differs from the OpenTelemetry export) and
+  [`examples/trace_export_demo.py`](examples/trace_export_demo.py), wired into
+  `make ci`.
+- **Property-based invariant tests (#99).** New `tests/test_policy_properties.py`
+  uses Hypothesis to assert authorization invariants across generated
+  principals, capabilities, scopes, constraints, handles, and tokens: every
+  decision carries a stable reason code, `max_rows` never exceeds the policy
+  cap, handle expansion never exceeds the original grant (indirect-use
+  scenario), tokens never verify outside their scope and tampered/expired
+  tokens are always rejected, policy traces never leak raw scope values, and
+  the trace export is always JSON-serialisable. Adds `hypothesis` as a dev
+  dependency.
 - README repositioned to lead with the unique **capability-token + tamper-evident
   audit** value, with explicit boundary framing for the policy engine (vs
   `AgentFence`, #111) and the context firewall (vs `contextweaver`, #110) so a
diff --git a/Makefile b/Makefile
index 4550d82..0f9d441 100644
--- a/Makefile
+++ b/Makefile
@@ -25,5 +25,6 @@ example:
 	python examples/repository_safety_check.py
 	python examples/chainweaver_flow.py
 	python examples/evaluation_artifact_policy.py
+	python examples/trace_export_demo.py
 
 ci: fmt-check lint type test example
diff --git a/docs/architecture.md b/docs/architecture.md
index 9e2fc9c..d4fe3fb 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -154,7 +154,9 @@ Transforms `RawResult → Frame`. Never exposes raw output to the LLM.
 Stores full results by opaque handle ID with TTL. `expand()` supports pagination, field selection, and basic equality filtering.
 
 ### TraceStore
-Records every `ActionTrace`. `explain(action_id)` returns the full audit record. On a successful invocation the trace also carries a `result_summary` — a redaction-safe dict of counts/flags (`fact_count`, `row_count`, `warning_count`, `has_handle`) derived from the firewalled `Frame`, never from raw driver data — so an invocation's outcome is auditable directly (e.g. a repository safety check passed iff `result_summary["row_count"] == 0`). Failed runs have `result_summary == None`.
+Records every `ActionTrace`. `explain(action_id)` returns the full audit record. On a successful invocation the trace also carries a `result_summary` — a redaction-safe dict of counts/flags (`fact_count`, `row_count`, `warning_count`, `has_handle`) derived from the firewalled `Frame`, never from raw driver data — so an invocation's outcome is auditable directly (e.g. a repository safety check passed iff `result_summary["row_count"] == 0`). Failed runs have `result_summary == None`. Each trace also records the invoked capability's `sensitivity` (`NONE`/`PII`/`PCI`/`SECRETS`/`MEMORY`).
+
+`export_action_trace` / `export_action_traces` serialise traces into a stable, versioned, JSON-serialisable shape for downstream analysis tools (distinct from the OpenTelemetry observability export). See [trace_export.md](trace_export.md).
 
 ### Adapters (`weaver_kernel.adapters`)
 Vendor-specific tool-format adapters that translate between `Capability` objects
diff --git a/docs/trace_export.md b/docs/trace_export.md
new file mode 100644
index 0000000..3b80d1e
--- /dev/null
+++ b/docs/trace_export.md
@@ -0,0 +1,142 @@
+# Action Trace Export
+
+agent-kernel records an [`ActionTrace`](architecture.md) for every invocation.
+The **trace export contract** turns those records into a stable,
+JSON-serialisable shape that an external tool can consume — for example a
+[LessonWeaver](https://github.com/dgenio/weaver-spec)-style lesson-extraction
+layer that learns from past actions, policies, denials, corrections, and
+outcomes.
+
+```python
+from weaver_kernel import export_action_traces
+
+envelope = export_action_traces(kernel._traces.list_all())
+```
+
+Runnable companion: [`examples/trace_export_demo.py`](../examples/trace_export_demo.py).
+
+## How this differs from OpenTelemetry export
+
+agent-kernel also ships an OpenTelemetry integration
+([`weaver_kernel.otel`](architecture.md), `pip install weaver-kernel[otel]`).
+The two serve different consumers and do **not** compete:
+
+| | OpenTelemetry (`instrument_kernel`) | Trace export (`export_action_traces`) |
+|---|---|---|
+| Consumer | Live observability backends (traces/metrics) | Offline analysis / learning tools |
+| Shape | OTel spans + metrics, vendor-defined | Stable JSON envelope defined here |
+| Timing | Emitted during execution | Pulled after the fact from the `TraceStore` |
+| Stability | Tracks OTel semantic conventions | Versioned by `TRACE_EXPORT_VERSION` |
+
+Use OTel for dashboards and alerting; use the export contract when another
+program needs a durable, replayable record of what the agent did.
+
+## Privacy
+
+The export is derived **only** from fields the `ActionTrace` already holds,
+all of which are redaction-safe by construction:
+
+- `args` has memory payloads stripped at record time (keys like `payload`,
+  `content`, `value`, `memory`, `text`, `body` for `memory.*` capabilities
+  become `"[REDACTED]"`).
+- `result_summary` carries counts and flags taken from the **post-firewall**
+  `Frame` — never raw driver data.
+
+The contract adds only `status` (derived from `error`) and the caller-supplied
+`correction`, so the kernel-derived fields cannot widen the I-01 firewall
+boundary or leak sensitive payloads. A *denied* request never produces an
+`ActionTrace` (policy gates before invocation, I-02), so the export covers only
+authorised invocations; see `PolicyDenied` / `Kernel.explain_denial` for denials.
+
+## Envelope shape
+
+`export_action_traces(...)` returns a versioned envelope:
+
+```json
+{
+  "schema": "weaver_kernel.action_trace_export",
+  "version": "1",
+  "traces": [ /* one object per ActionTrace */ ]
+}
+```
+
+Each trace object:
+
+| Field | Type | Notes |
+|-------|------|-------|
+| `action_id` | string | Unique id; matches `Kernel.explain(action_id)`. |
+| `capability_id` | string | The capability (tool) that was invoked. |
+| `principal_id` | string | Who invoked it. |
+| `token_id` | string | The capability token used. |
+| `invoked_at` | string | ISO 8601 timestamp. |
+| `response_mode` | string | `summary` / `table` / `handle_only` / `raw`. |
+| `driver_id` | string | Driver that served the call (`""` on failure). |
+| `handle_id` | string \| null | Handle for the full dataset, if one was minted. |
+| `sensitivity` | string | `NONE` / `PII` / `PCI` / `SECRETS` / `MEMORY`. |
+| `status` | string | `succeeded` or `failed` (derived from `error`). |
+| `error` | string \| null | Failure reason; `null` on success. |
+| `args` | object | Redacted invocation arguments. |
+| `result_summary` | object \| null | Post-firewall counts/flags; `null` on failure. |
+| `correction` | object \| null | Optional human-correction metadata (see below). |
+
+### Human corrections
+
+agent-kernel does not record human corrections itself. A downstream tool can
+attach them at export time by passing a mapping of `action_id` → metadata:
+
+```python
+envelope = export_action_traces(
+    traces,
+    corrections={"act-123": {"corrected_by": "reviewer", "note": "wrong customer"}},
+)
+```
+
+## Example output
+
+```json
+{
+  "schema": "weaver_kernel.action_trace_export",
+  "version": "1",
+  "traces": [
+    {
+      "action_id": "0a1b...",
+      "capability_id": "billing.list_invoices",
+      "principal_id": "agent-007",
+      "token_id": "f3c2...",
+      "invoked_at": "2026-06-05T12:00:00+00:00",
+      "response_mode": "summary",
+      "driver_id": "billing",
+      "handle_id": "9d7e...",
+      "sensitivity": "PII",
+      "status": "succeeded",
+      "error": null,
+      "args": {"operation": "list_invoices", "status": "paid"},
+      "result_summary": {"fact_count": 4, "row_count": 0, "warning_count": 1, "has_handle": true},
+      "correction": null
+    },
+    {
+      "action_id": "5e6f...",
+      "capability_id": "billing.flaky_report",
+      "principal_id": "agent-007",
+      "token_id": "11aa...",
+      "invoked_at": "2026-06-05T12:00:01+00:00",
+      "response_mode": "summary",
+      "driver_id": "",
+      "handle_id": null,
+      "sensitivity": "NONE",
+      "status": "failed",
+      "error": "Handler for operation='flaky_report' raised: reporting backend is unavailable",
+      "args": {"operation": "flaky_report"},
+      "result_summary": null,
+      "correction": {"corrected_by": "on-call", "note": "known outage; retried later"}
+    }
+  ]
+}
+```
+
+## Stability
+
+`TRACE_EXPORT_VERSION` is bumped only on a **breaking** change to the field
+shape. New optional fields may be added without a bump, so consumers should
+ignore unknown keys. Assert on `status` and `sensitivity` rather than on the
+human-readable `error` string, whose wording may evolve.
diff --git a/examples/trace_export_demo.py b/examples/trace_export_demo.py
new file mode 100644
index 0000000..68ff3c5
--- /dev/null
+++ b/examples/trace_export_demo.py
@@ -0,0 +1,133 @@
+"""trace_export_demo.py — export action traces for downstream analysis (#94).
+
+The written contract lives in ``docs/trace_export.md``. This script is the
+runnable companion. It shows how to turn the kernel's audit trail into a
+stable, redaction-safe JSON shape that an external tool (for example a
+LessonWeaver-style lesson-extraction layer) can consume without depending on
+agent-kernel internals.
+
+The demo records two invocations so the export covers both outcomes the
+contract distinguishes:
+
+  1. ``billing.list_invoices`` — a normal READ that **succeeds**.
+  2. ``billing.flaky_report`` — a READ whose driver **fails**, producing a
+     ``status: "failed"`` trace (a *denied* request never reaches invoke, so
+     it never produces a trace; denials surface via ``explain_denial``).
+
+It then prints the versioned export envelope, attaching optional human
+correction metadata to one trace. Everything is offline and deterministic.
+
+Run with: ``python examples/trace_export_demo.py``
+"""
+
+from __future__ import annotations
+
+import asyncio
+import json
+
+from weaver_kernel import (
+    Capability,
+    CapabilityRegistry,
+    DriverError,
+    HMACTokenProvider,
+    Kernel,
+    Principal,
+    SafetyClass,
+    SensitivityTag,
+    StaticRouter,
+    export_action_traces,
+    make_billing_driver,
+)
+from weaver_kernel.drivers.base import ExecutionContext
+from weaver_kernel.models import CapabilityRequest, ImplementationRef
+
+_SECRET = "example-secret-do-not-use-in-prod"
+
+
+def _build_kernel() -> Kernel:
+    capabilities = [
+        Capability(
+            capability_id="billing.list_invoices",
+            name="List Invoices",
+            description="List invoices for a customer",
+            safety_class=SafetyClass.READ,
+            sensitivity=SensitivityTag.PII,
+            allowed_fields=["id", "amount", "currency", "status", "date"],
+            impl=ImplementationRef(driver_id="billing", operation="list_invoices"),
+        ),
+        Capability(
+            capability_id="billing.flaky_report",
+            name="Flaky Report",
+            description="A report whose backing service is currently failing",
+            safety_class=SafetyClass.READ,
+            impl=ImplementationRef(driver_id="billing", operation="flaky_report"),
+        ),
+    ]
+    registry = CapabilityRegistry()
+    registry.register_many(capabilities)
+
+    driver = make_billing_driver()
+
+    def flaky_report(ctx: ExecutionContext) -> object:
+        raise DriverError("reporting backend is unavailable")
+
+    driver.register_handler("flaky_report", flaky_report)
+
+    router = StaticRouter(
+        routes={
+            "billing.list_invoices": ["billing"],
+            "billing.flaky_report": ["billing"],
+        }
+    )
+    kernel = Kernel(
+        registry=registry,
+        token_provider=HMACTokenProvider(secret=_SECRET),
+        router=router,
+    )
+    kernel.register_driver(driver)
+    return kernel
+
+
+async def main() -> None:
+    kernel = _build_kernel()
+    principal = Principal(
+        principal_id="agent-007",
+        roles=["reader"],
+        attributes={"tenant": "acme"},
+    )
+
+    # 1. A successful READ — produces a status="succeeded" trace.
+    list_req = CapabilityRequest(capability_id="billing.list_invoices", goal="list invoices")
+    list_token = kernel.get_token(list_req, principal, justification="")
+    ok_frame = await kernel.invoke(
+        list_token,
+        principal=principal,
+        args={"operation": "list_invoices", "status": "paid"},
+    )
+    print(f"succeeded: action_id={ok_frame.action_id} facts={len(ok_frame.facts)}")
+
+    # 2. A failing READ — produces a status="failed" trace.
+    flaky_req = CapabilityRequest(capability_id="billing.flaky_report", goal="run report")
+    flaky_token = kernel.get_token(flaky_req, principal, justification="")
+    failed_action_id = ""
+    try:
+        await kernel.invoke(flaky_token, principal=principal, args={"operation": "flaky_report"})
+    except DriverError as exc:
+        print(f"failed:    {exc}")
+        # The failure was still recorded; grab the most recent trace's id.
+        failed_action_id = kernel._traces.list_all()[-1].action_id
+
+    # Export everything. Attach an optional human correction to the failed run.
+    corrections = (
+        {failed_action_id: {"corrected_by": "on-call", "note": "known outage; retried later"}}
+        if failed_action_id
+        else None
+    )
+    envelope = export_action_traces(kernel._traces.list_all(), corrections=corrections)
+
+    print("\nExported trace envelope:")
+    print(json.dumps(envelope, indent=2))
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/pyproject.toml b/pyproject.toml
index 5b0cbaa..404d7bf 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -51,6 +51,7 @@ dev = [
     "pytest>=8.0",
     "pytest-cov>=5.0",
     "pytest-asyncio>=0.23",
+    "hypothesis>=6.100",
     "ruff>=0.4",
     "mypy>=1.10",
     "httpx>=0.27",
diff --git a/src/weaver_kernel/__init__.py b/src/weaver_kernel/__init__.py
index e324f3f..9f358f2 100644
--- a/src/weaver_kernel/__init__.py
+++ b/src/weaver_kernel/__init__.py
@@ -26,6 +26,7 @@
 Handles & traces::
 
     from weaver_kernel import HandleStore, TraceStore
+    from weaver_kernel import export_action_trace, export_action_traces
 
 LLM tool-format adapters::
 
@@ -140,7 +141,13 @@
 from .registry import CapabilityRegistry
 from .router import StaticRouter
 from .tokens import CapabilityToken, HMACTokenProvider
-from .trace import TraceStore
+from .trace import (
+    TRACE_EXPORT_SCHEMA,
+    TRACE_EXPORT_VERSION,
+    TraceStore,
+    export_action_trace,
+    export_action_traces,
+)
 
 # Single source of truth: read the version from the installed distribution
 # metadata (the PyPI dist name is ``weaver-kernel``, distinct from the import
@@ -250,6 +257,11 @@
     # stores
     "HandleStore",
     "TraceStore",
+    # trace export (issue #94)
+    "TRACE_EXPORT_SCHEMA",
+    "TRACE_EXPORT_VERSION",
+    "export_action_trace",
+    "export_action_traces",
     # adapters
     "AnthropicMiddleware",
     "OpenAIMiddleware",
diff --git a/src/weaver_kernel/kernel/__init__.py b/src/weaver_kernel/kernel/__init__.py
index 07bf004..0fb7353 100644
--- a/src/weaver_kernel/kernel/__init__.py
+++ b/src/weaver_kernel/kernel/__init__.py
@@ -234,6 +234,7 @@ async def invoke(
             args=args,
             response_mode=response_mode,
             plan=plan,
+            capability=capability,
         )
 
     async def invoke_stream(
diff --git a/src/weaver_kernel/kernel/_invoke.py b/src/weaver_kernel/kernel/_invoke.py
index b8d22ca..dd546e8 100644
--- a/src/weaver_kernel/kernel/_invoke.py
+++ b/src/weaver_kernel/kernel/_invoke.py
@@ -22,10 +22,12 @@
 from typing import TYPE_CHECKING, Any
 
 from ..drivers.base import Driver, ExecutionContext
+from ..enums import SensitivityTag
 from ..errors import DriverError
 from ..firewall.budget_manager import BudgetManager
 from ..models import (
     ActionTrace,
+    Capability,
     Frame,
     Handle,
     Principal,
@@ -150,6 +152,7 @@ def record_failure_trace(
     response_mode: ResponseMode,
     error_message: str,
     trace_store: TraceStore,
+    sensitivity: SensitivityTag = SensitivityTag.NONE,
 ) -> None:
     """Persist an :class:`ActionTrace` for a run where no driver succeeded."""
     trace_store.record(
@@ -162,6 +165,7 @@ def record_failure_trace(
             args=_redact_args_for_trace(capability_id, args),
             response_mode=response_mode,
             driver_id="",
+            sensitivity=sensitivity,
             error=error_message,
         )
     )
@@ -179,6 +183,7 @@ def record_success_trace(
     handle_id: str | None,
     result_summary: dict[str, Any] | None,
     trace_store: TraceStore,
+    sensitivity: SensitivityTag = SensitivityTag.NONE,
 ) -> None:
     """Persist an :class:`ActionTrace` for a successful invocation."""
     trace_store.record(
@@ -191,6 +196,7 @@ def record_success_trace(
             args=_redact_args_for_trace(capability_id, args),
             response_mode=response_mode,
             driver_id=driver_id,
+            sensitivity=sensitivity,
             handle_id=handle_id,
             result_summary=result_summary,
         )
@@ -205,6 +211,7 @@ async def perform_invoke(
     args: dict[str, Any],
     response_mode: ResponseMode,
     plan: RoutePlan,
+    capability: Capability,
 ) -> Frame:
     """Run the non-dry-run invocation pipeline end-to-end.
 
@@ -221,6 +228,9 @@ async def perform_invoke(
         args: Driver arguments.
         response_mode: The caller-requested response mode.
         plan: The router-resolved :class:`RoutePlan` for *token*.
+        capability: The resolved :class:`Capability`; its
+            :attr:`~weaver_kernel.models.Capability.sensitivity` is copied onto
+            the recorded :class:`ActionTrace`.
     """
     action_id = str(uuid.uuid4())
     effective_mode = resolve_effective_mode(
@@ -272,6 +282,7 @@ async def perform_invoke(
             response_mode=response_mode,
             error_message=err_msg,
             trace_store=kernel._traces,
+            sensitivity=capability.sensitivity,
         )
         raise DriverError(
             f"All drivers failed for capability '{token.capability_id}'. Last error: {err_msg}"
@@ -316,6 +327,7 @@ async def perform_invoke(
         handle_id=handle.handle_id if handle else None,
         result_summary=_frame_result_summary(frame),
         trace_store=kernel._traces,
+        sensitivity=capability.sensitivity,
     )
     logger.info(
         "invoke_success",
diff --git a/src/weaver_kernel/kernel/_stream.py b/src/weaver_kernel/kernel/_stream.py
index 9691b00..94c3242 100644
--- a/src/weaver_kernel/kernel/_stream.py
+++ b/src/weaver_kernel/kernel/_stream.py
@@ -53,7 +53,6 @@ async def invoke_stream_impl(
     response_mode: ResponseMode,
 ) -> AsyncIterator[Frame]:
     """Stream Frames for one capability invocation."""
-    del capability  # currently unused; kept in signature for future hooks.
     action_id = str(uuid.uuid4())
     initial_mode = resolve_effective_mode(
         response_mode=response_mode,
@@ -150,6 +149,7 @@ async def invoke_stream_impl(
                 args=_redact_args_for_trace(token.capability_id, args),
                 response_mode=(last_frame.response_mode if last_frame else initial_mode),
                 driver_id=fallback_driver_id,
+                sensitivity=capability.sensitivity,
                 handle_id=handle.handle_id if handle else None,
                 result_summary=(_frame_result_summary(last_frame) if last_frame else None),
                 error=None if yielded_any else "stream produced no chunks",
diff --git a/src/weaver_kernel/models.py b/src/weaver_kernel/models.py
index d511eed..3a4dcca 100644
--- a/src/weaver_kernel/models.py
+++ b/src/weaver_kernel/models.py
@@ -413,6 +413,14 @@ class ActionTrace:
     args: dict[str, Any]
     response_mode: ResponseMode
     driver_id: str
+    sensitivity: SensitivityTag = SensitivityTag.NONE
+    """Sensitivity tag of the invoked capability, copied at record time.
+
+    Lets the audit trail (and the :mod:`~weaver_kernel.trace` export contract)
+    flag which invocations touched PII/PCI/SECRETS/MEMORY data without a
+    second registry lookup. Defaults to :attr:`SensitivityTag.NONE` for traces
+    constructed directly (e.g. in tests) or for non-sensitive capabilities.
+    """
     handle_id: str | None = None
     error: str | None = None
     result_summary: dict[str, Any] | None = None
diff --git a/src/weaver_kernel/trace.py b/src/weaver_kernel/trace.py
index cd94d96..1650c40 100644
--- a/src/weaver_kernel/trace.py
+++ b/src/weaver_kernel/trace.py
@@ -1,10 +1,113 @@
-"""TraceStore: in-memory audit trail for kernel invocations."""
+"""TraceStore: in-memory audit trail for kernel invocations.
+
+This module also defines the **stable export contract** for
+:class:`~weaver_kernel.models.ActionTrace` records
+(:func:`export_action_trace` / :func:`export_action_traces`) so downstream
+analysis tools — for example a LessonWeaver-style lesson-extraction layer —
+can consume traces without depending on agent-kernel internals.
+
+The export is intentionally distinct from the OpenTelemetry observability
+export in :mod:`weaver_kernel.otel`: OTel emits live spans and metrics for
+monitoring, whereas this contract produces a stable, JSON-serialisable audit
+record per invocation for offline analysis. See ``docs/trace_export.md``.
+
+Privacy: the export is derived **only** from already-redaction-safe
+:class:`ActionTrace` fields. ``args`` has memory payloads stripped at record
+time and ``result_summary`` carries counts/flags only, never raw driver data,
+so exporting cannot widen the I-01 firewall boundary or leak sensitive
+payloads. Only ``status`` (derived) and ``correction`` (caller-supplied) are new.
+"""
 
 from __future__ import annotations
 
+from collections.abc import Iterable
+from typing import Any
+
 from .errors import AgentKernelError
 from .models import ActionTrace
 
+# ── Export contract ─────────────────────────────────────────────────────────
+
+TRACE_EXPORT_SCHEMA = "weaver_kernel.action_trace_export"
+"""Stable schema identifier embedded in every exported envelope."""
+
+TRACE_EXPORT_VERSION = "1"
+"""Schema version of the export envelope. Bumped only on a breaking change to
+the field shape; new optional fields may be added without a bump."""
+
+
+def export_action_trace(
+    trace: ActionTrace,
+    *,
+    correction: dict[str, Any] | None = None,
+) -> dict[str, Any]:
+    """Serialise a single :class:`ActionTrace` to the stable export shape.
+
+    The returned dict is JSON-serialisable as long as ``trace.args``,
+    ``trace.result_summary`` and *correction* hold JSON-compatible values (the
+    first two do for traces the kernel records). No raw driver output is added:
+    fields come from the already-redaction-safe trace, plus a derived ``status``.
+
+    Args:
+        trace: The recorded action trace to export.
+        correction: Optional human-correction metadata to attach (e.g.
+            ``{"corrected_by": "reviewer", "note": "..."}``). agent-kernel does
+            not record corrections itself; a downstream tool supplies them at
+            export time. ``None`` when no correction is available.
+
+    Returns:
+        A dict with the stable export fields. ``status`` is ``"failed"`` when
+        the invocation recorded an ``error`` and ``"succeeded"`` otherwise.
+        (A *denied* request never produces an :class:`ActionTrace` — policy
+        gates before invocation, per I-02 — so the export only ever describes
+        authorised invocations; denials surface via
+        :class:`~weaver_kernel.PolicyDenied` / ``explain_denial``.)
+    """
+    return {
+        "action_id": trace.action_id,
+        "capability_id": trace.capability_id,
+        "principal_id": trace.principal_id,
+        "token_id": trace.token_id,
+        "invoked_at": trace.invoked_at.isoformat(),
+        "response_mode": trace.response_mode,
+        "driver_id": trace.driver_id,
+        "handle_id": trace.handle_id,
+        "sensitivity": trace.sensitivity.value,
+        "status": "failed" if trace.error is not None else "succeeded",
+        "error": trace.error,
+        "args": trace.args,
+        "result_summary": trace.result_summary,
+        "correction": correction,
+    }
+
+
+def export_action_traces(
+    traces: Iterable[ActionTrace],
+    *,
+    corrections: dict[str, dict[str, Any]] | None = None,
+) -> dict[str, Any]:
+    """Export an iterable of traces as a versioned, JSON-serialisable envelope.
+
+    Args:
+        traces: The action traces to export (e.g. ``TraceStore.list_all()``).
+        corrections: Optional mapping of ``action_id`` → human-correction
+            metadata, applied per trace. Entries with no matching trace are
+            ignored; traces with no entry get ``correction=None``.
+
+    Returns:
+        A dict ``{"schema", "version", "traces": [...]}`` where each entry is
+        the result of :func:`export_action_trace`.
+    """
+    corrections = corrections or {}
+    return {
+        "schema": TRACE_EXPORT_SCHEMA,
+        "version": TRACE_EXPORT_VERSION,
+        "traces": [
+            export_action_trace(trace, correction=corrections.get(trace.action_id))
+            for trace in traces
+        ],
+    }
+
 
 class TraceStore:
     """Stores :class:`ActionTrace` records indexed by ``action_id``.
diff --git a/tests/test_policy_properties.py b/tests/test_policy_properties.py
new file mode 100644
index 0000000..3a26a57
--- /dev/null
+++ b/tests/test_policy_properties.py
@@ -0,0 +1,370 @@
+"""Property-based invariant tests for the authorization surface (issue #99).
+
+These tests use Hypothesis to generate varied principals, capabilities,
+requests, scopes, constraints, handles, and tokens, then assert that the
+core security/audit invariants always hold — the failure modes that
+example-based unit tests miss (token-scope confusion, handle expansion
+outside the original grant, policy traces leaking raw argument values, etc.).
+
+Invariants under test (see ``docs/agent-context/invariants.md`` and AGENTS.md):
+
+* **I-02 — every decision is stable and auditable**
+    - :func:`test_decision_always_carries_a_stable_reason_code` — an allow
+      returns ``allowed=True`` with a stable :class:`AllowReason`; a deny
+      *raises* :class:`PolicyDenied` with a stable :class:`DenialReason`
+      (so a denied capability never silently yields a grant/token/frame).
+* **Constraint integrity**
+    - :func:`test_max_rows_never_exceeds_policy_cap`
+    - :func:`test_handle_expand_never_exceeds_grant` — the indirect /
+      handle-expansion scenario: an expand query never returns more rows or
+      wider fields than the original grant authorised.
+* **I-06 — tokens bind principal + capability + expiry**
+    - :func:`test_token_never_verifies_outside_its_scope`
+    - :func:`test_tampered_token_is_always_rejected`
+* **Redaction safety (feeds the #94 export contract)**
+    - :func:`test_policy_trace_never_leaks_raw_scope_values`
+    - :func:`test_trace_export_is_always_json_serialisable`
+
+Reproducing failures: on failure Hypothesis prints a minimal falsifying
+example plus a ``@reproduce_failure(...)`` decorator, and persists the case in
+its example database so the next run replays it first. The rate limiter is
+effectively disabled in :func:`_engine` (see ``_NO_RATE_LIMIT``) so generated
+examples are not spuriously denied by the sliding window.
+"""
+
+from __future__ import annotations
+
+import datetime
+import json
+import string
+from dataclasses import asdict
+
+import pytest
+from hypothesis import given, settings
+from hypothesis import strategies as st
+
+from weaver_kernel import (
+    ActionTrace,
+    AllowReason,
+    Capability,
+    CapabilityRequest,
+    DefaultPolicyEngine,
+    DenialReason,
+    HandleConstraintViolation,
+    HandleStore,
+    HMACTokenProvider,
+    PolicyDenied,
+    Principal,
+    SafetyClass,
+    SensitivityTag,
+    TokenExpired,
+    TokenInvalid,
+    TokenScopeError,
+    export_action_traces,
+)
+
+# ── Shared strategies & helpers ─────────────────────────────────────────────
+
+_ROLE_POOL = [
+    "reader",
+    "writer",
+    "admin",
+    "service",
+    "pii_reader",
+    "secrets_reader",
+    "memory_writer",
+    "memory_reader_sensitive",
+]
+_ID_ALPHABET = string.ascii_letters + string.digits + "-_"
+_DENIAL_CODES = {code.value for code in DenialReason}
+_ALLOW_CODES = {code.value for code in AllowReason}
+
+# Effectively unlimited rate limits: Hypothesis runs many examples against a
+# fresh engine, and the sliding-window limiter would otherwise deny later
+# examples for reasons unrelated to the property under test.
+_NO_RATE_LIMIT = {sc: (1_000_000, 3600.0) for sc in SafetyClass}
+
+_ids = st.text(alphabet=_ID_ALPHABET, min_size=1, max_size=16)
+_roles = st.lists(st.sampled_from(_ROLE_POOL), unique=True, max_size=5)
+_attributes = st.dictionaries(
+    st.sampled_from(["tenant", "region", "team"]),
+    st.text(alphabet=string.ascii_letters, min_size=1, max_size=8),
+    max_size=3,
+)
+_justifications = st.text(max_size=40)
+_ROW_FIELDS = ["id", "email", "amount", "status"]
+
+
+def _engine() -> DefaultPolicyEngine:
+    """A fresh engine with rate limiting disabled (see module docstring)."""
+    return DefaultPolicyEngine(rate_limits=_NO_RATE_LIMIT)
+
+
+def _read_capability() -> Capability:
+    """A READ capability with no sensitivity — always reaches the allow path."""
+    return Capability(
+        capability_id="cap.read",
+        name="read",
+        description="generated read capability",
+        safety_class=SafetyClass.READ,
+        sensitivity=SensitivityTag.NONE,
+    )
+
+
+@st.composite
+def _principals(draw: st.DrawFn) -> Principal:
+    return Principal(
+        principal_id=draw(_ids),
+        roles=draw(_roles),
+        attributes=draw(_attributes),
+    )
+
+
+@st.composite
+def _capabilities(draw: st.DrawFn) -> Capability:
+    cap_id = draw(_ids)
+    return Capability(
+        capability_id=cap_id,
+        name=cap_id,
+        description="generated capability",
+        safety_class=draw(st.sampled_from(list(SafetyClass))),
+        sensitivity=draw(st.sampled_from(list(SensitivityTag))),
+    )
+
+
+@st.composite
+def _rows(draw: st.DrawFn) -> list[dict[str, object]]:
+    count = draw(st.integers(min_value=0, max_value=12))
+    return [
+        {
+            "id": f"R-{i}",
+            "email": draw(st.text(alphabet=string.ascii_lowercase, min_size=1, max_size=6)),
+            "amount": draw(st.integers(min_value=0, max_value=1000)),
+            "status": draw(st.sampled_from(["paid", "unpaid", "overdue"])),
+        }
+        for i in range(count)
+    ]
+
+
+@st.composite
+def _action_traces(draw: st.DrawFn) -> ActionTrace:
+    has_error = draw(st.booleans())
+    return ActionTrace(
+        action_id=draw(_ids),
+        capability_id=draw(_ids),
+        principal_id=draw(_ids),
+        token_id=draw(_ids),
+        invoked_at=datetime.datetime.now(tz=datetime.timezone.utc),
+        args=draw(st.dictionaries(st.text(min_size=1, max_size=8), st.integers(), max_size=4)),
+        response_mode=draw(st.sampled_from(["summary", "table", "handle_only", "raw"])),
+        driver_id=draw(_ids),
+        sensitivity=draw(st.sampled_from(list(SensitivityTag))),
+        error=draw(st.text(max_size=20)) if has_error else None,
+        result_summary=(
+            None if has_error else {"row_count": draw(st.integers(min_value=0, max_value=100))}
+        ),
+    )
+
+
+# ── I-02: every decision is stable and auditable ────────────────────────────
+
+
+@settings(deadline=None, max_examples=200)
+@given(principal=_principals(), capability=_capabilities(), justification=_justifications)
+def test_decision_always_carries_a_stable_reason_code(
+    principal: Principal, capability: Capability, justification: str
+) -> None:
+    engine = _engine()
+    request = CapabilityRequest(capability_id=capability.capability_id, goal="generated goal")
+    try:
+        decision = engine.evaluate(request, capability, principal, justification=justification)
+    except PolicyDenied as exc:
+        # A denial raises before any token is issued, so a denied capability
+        # can never produce a usable grant or frame. The code must be stable.
+        assert str(exc.reason_code) in _DENIAL_CODES
+        return
+    assert decision.allowed is True
+    assert decision.reason_code is not None
+    assert str(decision.reason_code) in _ALLOW_CODES
+
+
+# ── Constraint integrity ────────────────────────────────────────────────────
+
+
+@settings(deadline=None, max_examples=200)
+@given(
+    principal=_principals(),
+    requested_max_rows=st.one_of(st.none(), st.integers(min_value=-10, max_value=10_000)),
+)
+def test_max_rows_never_exceeds_policy_cap(
+    principal: Principal, requested_max_rows: int | None
+) -> None:
+    engine = _engine()
+    capability = _read_capability()
+    constraints = {} if requested_max_rows is None else {"max_rows": requested_max_rows}
+    request = CapabilityRequest(
+        capability_id=capability.capability_id, goal="g", constraints=constraints
+    )
+    decision = engine.evaluate(request, capability, principal, justification="")
+    cap_limit = 500 if "service" in principal.roles else 50
+    capped = decision.constraints["max_rows"]
+    assert 0 <= capped <= cap_limit
+    if requested_max_rows is not None and requested_max_rows >= 0:
+        assert capped <= requested_max_rows
+
+
+@settings(deadline=None, max_examples=200)
+@given(
+    rows=_rows(),
+    granted_max_rows=st.one_of(st.none(), st.integers(min_value=0, max_value=20)),
+    granted_fields=st.lists(st.sampled_from(_ROW_FIELDS), unique=True, max_size=4),
+    query_limit=st.one_of(st.none(), st.integers(min_value=-5, max_value=30)),
+    query_offset=st.integers(min_value=0, max_value=10),
+    query_fields=st.lists(st.sampled_from(_ROW_FIELDS), unique=True, max_size=4),
+)
+def test_handle_expand_never_exceeds_grant(
+    rows: list[dict[str, object]],
+    granted_max_rows: int | None,
+    granted_fields: list[str],
+    query_limit: int | None,
+    query_offset: int,
+    query_fields: list[str],
+) -> None:
+    store = HandleStore()
+    constraints: dict[str, object] = {}
+    if granted_max_rows is not None:
+        constraints["max_rows"] = granted_max_rows
+    if granted_fields:
+        constraints["allowed_fields"] = granted_fields
+    handle = store.store("cap.read", rows, principal_id="p1", constraints=constraints)
+
+    query: dict[str, object] = {"offset": query_offset}
+    if query_limit is not None:
+        query["limit"] = query_limit
+    if query_fields:
+        query["fields"] = query_fields
+
+    try:
+        frame = store.expand(handle, query=query, principal_id="p1")
+    except HandleConstraintViolation:
+        return  # rejecting the over-broad request is the safe outcome
+
+    preview = frame.table_preview
+    if granted_max_rows is not None:
+        assert len(preview) <= granted_max_rows
+    if granted_fields:
+        for row in preview:
+            assert set(row).issubset(set(granted_fields))
+
+
+# ── I-06: tokens bind principal + capability + expiry ───────────────────────
+
+
+@settings(deadline=None, max_examples=200)
+@given(
+    capability_id=_ids,
+    principal_id=_ids,
+    other_principal_id=_ids,
+    other_capability_id=_ids,
+    ttl=st.one_of(
+        st.integers(min_value=-86_400, max_value=-1),
+        st.integers(min_value=120, max_value=86_400),
+    ),
+)
+def test_token_never_verifies_outside_its_scope(
+    capability_id: str,
+    principal_id: str,
+    other_principal_id: str,
+    other_capability_id: str,
+    ttl: int,
+) -> None:
+    provider = HMACTokenProvider(secret="prop-test-secret")
+    token = provider.issue(capability_id, principal_id, ttl_seconds=ttl)
+
+    if ttl <= 0:
+        with pytest.raises(TokenExpired):
+            provider.verify(
+                token,
+                expected_principal_id=principal_id,
+                expected_capability_id=capability_id,
+            )
+        return
+
+    # In-scope verification of a live token succeeds.
+    provider.verify(
+        token, expected_principal_id=principal_id, expected_capability_id=capability_id
+    )
+    if other_principal_id != principal_id:
+        with pytest.raises(TokenScopeError):
+            provider.verify(
+                token,
+                expected_principal_id=other_principal_id,
+                expected_capability_id=capability_id,
+            )
+    if other_capability_id != capability_id:
+        with pytest.raises(TokenScopeError):
+            provider.verify(
+                token,
+                expected_principal_id=principal_id,
+                expected_capability_id=other_capability_id,
+            )
+
+
+@settings(deadline=None, max_examples=100)
+@given(
+    capability_id=_ids,
+    principal_id=_ids,
+    flip_index=st.integers(min_value=0, max_value=63),
+)
+def test_tampered_token_is_always_rejected(
+    capability_id: str, principal_id: str, flip_index: int
+) -> None:
+    provider = HMACTokenProvider(secret="prop-test-secret")
+    token = provider.issue(capability_id, principal_id, ttl_seconds=3600)
+    sig = token.signature
+    idx = flip_index % len(sig)
+    replacement = "0" if sig[idx] != "0" else "1"
+    token.signature = sig[:idx] + replacement + sig[idx + 1 :]
+    with pytest.raises(TokenInvalid):
+        provider.verify(
+            token, expected_principal_id=principal_id, expected_capability_id=capability_id
+        )
+
+
+# ── Redaction safety (feeds the #94 export contract) ────────────────────────
+
+
+@settings(deadline=None, max_examples=200)
+@given(
+    principal=_principals(),
+    scope_value=st.text(alphabet=string.ascii_letters + string.digits, min_size=4, max_size=16),
+)
+def test_policy_trace_never_leaks_raw_scope_values(principal: Principal, scope_value: str) -> None:
+    engine = _engine()
+    capability = _read_capability()
+    sentinel = f"SCOPEVAL{scope_value}SCOPEVAL"
+    request = CapabilityRequest(
+        capability_id=capability.capability_id,
+        goal="g",
+        scope={"customer_id": sentinel},
+    )
+    decision = engine.evaluate(request, capability, principal, justification="")
+    trace = decision.trace
+    assert trace is not None
+    # The scope *key* is recorded for audit, but its raw *value* must not be.
+    assert "customer_id" in trace.scope_keys
+    serialised = json.dumps(asdict(trace))
+    assert sentinel not in serialised
+
+
+@settings(deadline=None, max_examples=200)
+@given(traces=st.lists(_action_traces(), max_size=6))
+def test_trace_export_is_always_json_serialisable(traces: list[ActionTrace]) -> None:
+    envelope = export_action_traces(traces)
+    assert envelope["version"] == "1"
+    assert len(envelope["traces"]) == len(traces)
+    blob = json.dumps(envelope)  # must not raise
+    assert isinstance(blob, str)
+    for exported, trace in zip(envelope["traces"], traces, strict=True):
+        assert exported["status"] == ("failed" if trace.error is not None else "succeeded")
+        assert exported["sensitivity"] == trace.sensitivity.value
diff --git a/tests/test_trace.py b/tests/test_trace.py
index 2c8a469..803f2ac 100644
--- a/tests/test_trace.py
+++ b/tests/test_trace.py
@@ -1,14 +1,28 @@
-"""Tests for TraceStore."""
+"""Tests for TraceStore and the ActionTrace export contract (issue #94)."""
 
 from __future__ import annotations
 
 import datetime
+import json
 
 import pytest
 
-from weaver_kernel import TraceStore
+from weaver_kernel import (
+    Capability,
+    CapabilityRegistry,
+    HMACTokenProvider,
+    InMemoryDriver,
+    Kernel,
+    Principal,
+    SafetyClass,
+    SensitivityTag,
+    StaticRouter,
+    TraceStore,
+    export_action_trace,
+    export_action_traces,
+)
 from weaver_kernel.errors import AgentKernelError
-from weaver_kernel.models import ActionTrace
+from weaver_kernel.models import ActionTrace, CapabilityRequest
 
 
 def _trace(action_id: str = "act-1") -> ActionTrace:
@@ -63,3 +77,113 @@ def test_result_summary_defaults_none() -> None:
     # (e.g. failure traces, or callers constructing ActionTrace directly) keep
     # it unset rather than fabricating a summary.
     assert _trace("act-default").result_summary is None
+
+
+def test_sensitivity_defaults_none() -> None:
+    assert _trace("act-default").sensitivity is SensitivityTag.NONE
+
+
+# ── Export contract (issue #94) ─────────────────────────────────────────────
+
+
+def test_export_action_trace_success_shape() -> None:
+    trace = ActionTrace(
+        action_id="act-ok",
+        capability_id="billing.list_invoices",
+        principal_id="u1",
+        token_id="tok-1",
+        invoked_at=datetime.datetime(2026, 1, 2, 3, 4, 5, tzinfo=datetime.timezone.utc),
+        args={"operation": "billing.list_invoices"},
+        response_mode="summary",
+        driver_id="billing",
+        sensitivity=SensitivityTag.PII,
+        handle_id="h-1",
+        result_summary={"row_count": 3, "fact_count": 1, "warning_count": 0, "has_handle": True},
+    )
+    exported = export_action_trace(trace)
+    assert exported["action_id"] == "act-ok"
+    assert exported["capability_id"] == "billing.list_invoices"
+    assert exported["invoked_at"] == "2026-01-02T03:04:05+00:00"
+    assert exported["sensitivity"] == "PII"
+    assert exported["status"] == "succeeded"
+    assert exported["error"] is None
+    assert exported["result_summary"]["row_count"] == 3
+    assert exported["correction"] is None
+
+
+def test_export_action_trace_failure_status() -> None:
+    trace = ActionTrace(
+        action_id="act-fail",
+        capability_id="cap.x",
+        principal_id="u1",
+        token_id="tok-1",
+        invoked_at=datetime.datetime.now(tz=datetime.timezone.utc),
+        args={},
+        response_mode="summary",
+        driver_id="",
+        error="All drivers failed",
+    )
+    exported = export_action_trace(trace)
+    assert exported["status"] == "failed"
+    assert exported["error"] == "All drivers failed"
+    assert exported["result_summary"] is None
+
+
+def test_export_action_trace_attaches_correction() -> None:
+    correction = {"corrected_by": "reviewer", "note": "wrong customer"}
+    exported = export_action_trace(_trace("act-corr"), correction=correction)
+    assert exported["correction"] == correction
+
+
+def test_export_envelope_version_and_corrections() -> None:
+    traces = [_trace("act-0"), _trace("act-1")]
+    envelope = export_action_traces(traces, corrections={"act-1": {"note": "flagged"}})
+    assert envelope["schema"] == "weaver_kernel.action_trace_export"
+    assert envelope["version"] == "1"
+    assert [t["action_id"] for t in envelope["traces"]] == ["act-0", "act-1"]
+    assert envelope["traces"][0]["correction"] is None
+    assert envelope["traces"][1]["correction"] == {"note": "flagged"}
+    # Corrections are matched by action_id: act-0 has none, so it exports None.
+    json.dumps(envelope)  # must be JSON-serialisable
+
+
+@pytest.mark.asyncio
+async def test_export_redacts_memory_payload_end_to_end() -> None:
+    """A memory payload redacted at record time stays redacted in the export."""
+    cap = Capability(
+        capability_id="memory.read_notes",
+        name="read notes",
+        description="read durable notes",
+        safety_class=SafetyClass.READ,
+        sensitivity=SensitivityTag.MEMORY,
+    )
+    registry = CapabilityRegistry()
+    registry.register(cap)
+    driver = InMemoryDriver(driver_id="mem")
+    driver.register_handler("memory.read_notes", lambda ctx: [{"note": "n1"}])
+    kernel = Kernel(
+        registry=registry,
+        token_provider=HMACTokenProvider(secret="test-secret-do-not-use-in-prod"),
+        router=StaticRouter(routes={"memory.read_notes": ["mem"]}),
+    )
+    kernel.register_driver(driver)
+
+    principal = Principal(principal_id="u1", roles=["reader"])
+    req = CapabilityRequest(capability_id="memory.read_notes", goal="read notes")
+    token = kernel.get_token(req, principal, justification="")
+    secret = "topsecret-PAYLOAD-123"
+    frame = await kernel.invoke(
+        token,
+        principal=principal,
+        args={"operation": "memory.read_notes", "payload": secret},
+    )
+
+    trace = kernel.explain(frame.action_id)
+    assert trace.sensitivity is SensitivityTag.MEMORY
+    assert trace.args["payload"] == "[REDACTED]"
+
+    envelope = export_action_traces(kernel._traces.list_all())
+    exported = envelope["traces"][0]
+    assert exported["sensitivity"] == "MEMORY"
+    assert exported["status"] == "succeeded"
+    assert secret not in json.dumps(envelope)

From b2be0e0ce6fe9ec0efb43fc83403f7fb730839b0 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Fri, 5 Jun 2026 18:31:08 +0000
Subject: [PATCH 2/2] fix: address review feedback on PR #115

- docs/trace_export.md: drop the stale `reason` field reference in the
  Stability section (the export shape only has `error`).
- models.py: declare ActionTrace.sensitivity last so adding it does not shift
  the positional __init__ order of pre-existing public fields.
---
 docs/trace_export.md        |  5 +++--
 src/weaver_kernel/models.py | 20 ++++++++++++--------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/docs/trace_export.md b/docs/trace_export.md
index 3b80d1e..6dc8a47 100644
--- a/docs/trace_export.md
+++ b/docs/trace_export.md
@@ -138,5 +138,6 @@ envelope = export_action_traces(
 
 `TRACE_EXPORT_VERSION` is bumped only on a **breaking** change to the field
 shape. New optional fields may be added without a bump, so consumers should
-ignore unknown keys. Assert on `status`, `sensitivity`, and `reason`/`error`
-rather than on human-readable strings, which may evolve.
+ignore unknown keys. Assert on `status`, `sensitivity`, and the presence of
+`error` rather than on human-readable strings (the `error` text itself may
+evolve).
diff --git a/src/weaver_kernel/models.py b/src/weaver_kernel/models.py
index 3a4dcca..453ba7d 100644
--- a/src/weaver_kernel/models.py
+++ b/src/weaver_kernel/models.py
@@ -413,14 +413,6 @@ class ActionTrace:
     args: dict[str, Any]
     response_mode: ResponseMode
     driver_id: str
-    sensitivity: SensitivityTag = SensitivityTag.NONE
-    """Sensitivity tag of the invoked capability, copied at record time.
-
-    Lets the audit trail (and the :mod:`~weaver_kernel.trace` export contract)
-    flag which invocations touched PII/PCI/SECRETS/MEMORY data without a
-    second registry lookup. Defaults to :attr:`SensitivityTag.NONE` for traces
-    constructed directly (e.g. in tests) or for non-sensitive capabilities.
-    """
     handle_id: str | None = None
     error: str | None = None
     result_summary: dict[str, Any] | None = None
@@ -435,6 +427,18 @@ class ActionTrace:
     == 0``.
     """
 
+    sensitivity: SensitivityTag = SensitivityTag.NONE
+    """Sensitivity tag of the invoked capability, copied at record time.
+
+    Lets the audit trail (and the :mod:`~weaver_kernel.trace` export contract)
+    flag which invocations touched PII/PCI/SECRETS/MEMORY data without a second
+    registry lookup. Defaults to :attr:`SensitivityTag.NONE` for traces
+    constructed directly (e.g. in tests) or for non-sensitive capabilities.
+
+    Declared last so adding it does not shift the positional ``__init__`` order
+    of the pre-existing fields (``ActionTrace`` is part of the public API).
+    """
+
 
 # ── Policy explanation ────────────────────────────────────────────────────────