Add tlog_witness crate and witness_worker implementing c2sp.org/tlog-witness by lukevalenta · Pull Request #218 · cloudflare/azul

lukevalenta · 2026-04-18T00:23:31Z

Introduces two new crates and an integration-test suite.

tlog_witness (spec-level, publishable to crates.io)

AddCheckpointRequest + parsers/serializers for the POST /add-checkpoint
body format (old <N>\n<proof-lines>\n\n<checkpoint-note>) in both
directions (log → witness requests, witness → log responses).
MAX_CONSISTENCY_PROOF_LINES = 63, matching the spec.
CONTENT_TYPE_TLOG_SIZE = "text/x.tlog.size" for 409 response bodies.
TlogWitnessError with variants for malformed request / response / note.
12 unit tests covering happy path, round-trips, spec-example shape, and
malformed-input rejection.

witness_worker (Cloudflare Worker deployment)

POST /add-checkpoint implementing all the MUSTs from the spec:
- Look up log by origin → 404 on unknown.
- Verify the checkpoint note via signed_note::Note::verify. Both
  UnverifiedNote (no trusted-key signature at all) and
  InvalidSignature (a trusted-key signature that failed to verify,
  making the note malformed per c2sp.org/signed-note) are surfaced as
  403 Forbidden, matching sunlight and sigsum-go. The specific
  NoteError variant is logged for debugging; other variants are
  surfaced as 400 Bad Request.
- Range-check old_size <= checkpoint.size → 400 on violation.
- Single-RPC atomic check-proof-and-update against the per-origin
  WitnessState Durable Object. The DO compares old_size against
  stored state (409 with text/x.tlog.size body on mismatch or
  same-size-different-hash), verifies the Merkle consistency proof
  against its stored latest_hash (422 on failure), and writes the
  new (size, hash). Atomicity comes from Durable Objects' input and
  output gates, documented inline at the handler; verifying the proof
  inside the same handler that reads and writes state keeps the whole
  sequence inside one atomic transaction and one DO round trip.
- On 200 from the DO, produce and return a cosignature/v1 signature.
GET /metadata returns the witness's identity (name + DER SPKI),
description, submission/monitoring URL prefixes, and the full list of
configured logs (origin + trusted public keys). description fields
are Option<&str> with #[serde(skip_serializing_if = "Option::is_none")]
so they are omitted (not null) when unset; regression tests pin this.
WitnessState Durable Object (one per log origin, keyed by origin) holds
per-origin (latest_size, latest_hash) state. The DO's on-disk JSON
layout and the internal RPC wire shape (including the embedded
consistency-proof hashes) are pinned by regression tests so format drift
is caught before it can orphan deployed state.
Config model (per-log origin + log_public_keys SPKI list +
top-level witness_name, submission_prefix, monitoring_prefix;
WITNESS_SIGNING_KEY secret) validated at build time against a JSON
Schema.
Produces cosignature/v1 signatures using the existing
tlog_tiles::CosignatureV1CheckpointSigner (Ed25519-only, matching the
witness spec's algorithm requirement).

Integration tests

New crates/integration_tests/tests/tlog_witness.rs drives the witness
end-to-end against a live wrangler dev instance: /metadata, two
successful add-checkpoint submissions (with cosignature verification
against the metadata-advertised public key), then 409 / 404 / 403 / 422 /
400 error cases — including the trusted-(name,id)-with-bad-signature
case that verifies the 403-on-InvalidSignature behavior. The cases are
sequenced inside a single #[tokio::test] threaded through one
in-memory ToyLog so the witness's persisted state evolves
predictably and consistency proofs produced by the test always match
what the witness expects.
New integration-tlog-witness CI job in
.github/workflows/integration.yml builds the worker, starts
wrangler dev, waits for /metadata to respond, then runs the test
suite.
crates/witness_worker/config.dev.json now ships an Ed25519 SPKI whose
matching seed is embedded in the test source as LOG_SIGNING_KEY_PEM.
This keypair is dev-only, clearly documented as such, and never used in
production.

This witness is deployable independently of any MTC work and can cosign
checkpoints from any signed-note-based log (static-ct, MTC, sigsum, etc.).

lukevalenta · 2026-04-19T11:30:26Z

/bonk review

ask-bonk · 2026-04-19T11:30:45Z

UnknownError: ProviderInitError

github run

ask-bonk · 2026-04-19T11:30:46Z

@lukevalenta Bonk workflow failed. Check the logs for details.

View workflow run · To retry, trigger Bonk again.

ask-bonk · 2026-04-19T12:34:06Z

UnknownError: ProviderInitError

github run

lukevalenta · 2026-04-20T14:22:52Z

/bonk review

ask-bonk · 2026-04-20T14:23:23Z

UnknownError: ProviderInitError

github run

ask-bonk · 2026-04-20T14:27:37Z

UnknownError: ProviderInitError

github run

lukevalenta · 2026-04-20T14:45:55Z

/bonk review

ask-bonk

Review — bonk pass

Solid PR overall. The spec-level / worker-level split is clean, the DO atomicity reasoning is well-argued (and well-documented inline), and the integration tests are genuinely end-to-end. The review below flags a few real issues and a pile of smaller nits.

Correctness

1. config.schema.json is missing the top-level description field.
AppConfig deserializes description: Option<String>, but the schema uses "additionalProperties": false and does not list description under properties. Any config that actually sets description will fail jsonschema::validate at build time. Either add description to the schema, or drop it from AppConfig. config.dev.json happens to set it — I'm surprised the build doesn't already reject it. Worth checking whether the dev build actually runs schema validation on the file with description set.

2. Note::verify can return NoteError::AmbiguousKey, which the witness surfaces as 400.
VerifierList::verifier returns AmbiguousKey when two configured keys share the same (name, key_id). For Ed25519 that requires a 32-bit key_id collision on two distinct SPKIs (~2^-32), so this is extremely unlikely in practice, but:

If it does happen it silently locks the log out (400 Bad Request returned to the log on every submission, rather than a loud operator-facing error),
and it's not covered by build.rs validation either.

Either reject ambiguous configured SPKIs at build time (preferred), or at least log at error level and map the case to 500 rather than 400 so it's visible in the worker logs.

3. Origin-mismatch check is a debug_assert_eq!.
frontend_worker.rs:220 asserts cp_text.origin() == origin under debug_assert_eq!. In release this is a no-op. checkpoint_origin uses text.lines().next() while CheckpointText::from_bytes does stricter parsing — if they ever disagree, the release build silently trusts the looser check for log-lookup purposes, then cosigns a CheckpointText whose origin may differ. Swap for a real if cp_text.origin() != origin { return Response::error(..., 400); } — it costs nothing and removes a sharp edge.

4. No upper bound on req.bytes() in add_checkpoint.
frontend_worker.rs:151 unconditionally buffers the whole request body. A reasonable upper bound is MAX_NOTE_SIZE (signed_note's cap) + ~5 KB for the control block (old N\n + up to 63 base64 hashes) ≈ 20 KB. Anything bigger is guaranteed to be rejected downstream anyway; rejecting it at the read boundary avoids accidental amplification.

5. InvalidSignature mapped to 403 is defensible but not the only reasonable choice.
The inline comment argues this matches sunlight/sigsum-go, which is true. But the signed-note spec calls this a malformed note, and another reasonable mapping is 400. Your choice is consistent with the prior art, and the commit message explains why, so I think 403 is fine — just noting that sunlight's choice isn't itself justified by the tlog-witness spec, so if that behavior ever changes upstream you'll want to revisit.

Style / maintainability

6. monitoring_prefix: String + is_empty() sentinel → use Option<String>.
config::AppConfig::monitoring_prefix uses #[serde(default)] + is_empty() to mean "absent". Option<String> is self-documenting and removes the "did the operator deliberately configure an empty string?" ambiguity. Same pattern the schema uses for logs.*.description via Option.

7. load_witness_public_key_der recomputes the SPKI on every /metadata hit.
Small but it's a LazyLock<Vec<u8>> / OnceLock<Vec<u8>> away from being free. Low-traffic endpoint though, so low priority.

8. base64 dep in witness_worker_config is only used via the Base64 helper in serde_with.
You already depend on serde_with which bundles base64 via its base64 feature. Check whether dropping the direct base64.workspace = true dep from config/Cargo.toml is possible — cargo machete will tell you.

9. checkpoint_origin(&checkpoint) is used only once; inline it or delete it in favor of CheckpointText::origin().
CheckpointText::from_bytes already parses and validates the first line, so you can flip the order: parse cp_text first, then look up the log by cp_text.origin(). That removes the checkpoint_origin helper, removes the debug_assert_eq! entirely, and fixes issue #3 for free.

10. hash_hex::to_hex.
The manual hex loop reinvents what hex::encode does in three characters. Either pull in the hex crate (already a transitive dep via several things in the workspace, worth checking) or use format!("{:02x?}", ...)-style. Not blocking.

11. config::Base64 is a private duplicate of serde_with::base64::Base64.
The config crate already depends on serde_with. Replacing your custom Base64 helper with serde_with::base64::Base64 (as frontend_worker.rs:36 already does) would shrink config/src/lib.rs by ~10 lines and drop the direct base64 dep. LogParams::log_public_keys would become #[serde_as(as = "Vec<serde_with::base64::Base64>")].

12. AddCheckpointRequest::consistency_proof uses Vec<Hash>; consider documenting the empty-vs-non-empty invariants in the type.
The spec says the proof MUST be empty when old_size == 0 or old_size == new_size. That's validated in the DO today, but the parser-level type doesn't encode it. Fine as-is (keeps the parser general), but a doc sentence on AddCheckpointRequest::consistency_proof would help.

13. Tests for the response format are asymmetric.
request_matches_spec_example_shape pins the request shape; there's no equivalent for the response. A one-liner asserting serialize_add_checkpoint_response(&[ns]) starts with "\u{2014} " and ends with "\n" would lock that down.

14. tlog_witness/src/lib.rs re-exports via pub use.
The re-export is fine, but dropping pub mod add_checkpoint; (keeping only the pub use) hides the inner module name from downstream docs. Or the reverse — pick one and be consistent; right now both the module and the re-exports are pub, so tlog_witness::add_checkpoint::AddCheckpointRequest and tlog_witness::AddCheckpointRequest are both valid paths.

Integration tests

15. LOG_SIGNING_KEY_PEM committed to the repo.
Already called out in the source comment and commit message. Worth adding a belt-and-braces check: a #[test] in witness_worker that reads config.dev.json, parses the committed SPKI, and asserts it matches the verifying key derived from the PEM in the integration test. Otherwise a future operator who rotates the dev key without touching LOG_SIGNING_KEY_PEM gets a confusing 403 at integration-test time instead of a targeted unit-test failure.

16. Integration test step (7) (trusted (name,id) with bad sig bytes) is excellent.
That's the case most witness implementations get wrong. Glad to see it explicitly.

17. wait_for_witness loops 30 × 2s = 60 s.
Matches the CI workflow's own wait loop. Fine, just double-booked.

18. .github/workflows/integration.yml installs wasm-pack but never invokes it.
The witness build uses worker-build, which is installed on the next line. wasm-pack install appears to be copy-pasted from a sibling job — it's ~30 s of wasted CI time. Worth dropping unless I'm missing something.

Nits

witness_state_do.rs:147 error message "consistency proof must be empty when old_size == checkpoint size" is good; consider the symmetric version for old_size == 0 to also mention "first cosignature" for debuggability.
frontend_worker.rs:297 sets content-type: text/plain; charset=utf-8 on the success response. The spec doesn't mandate a specific content type here — fine, but sunlight uses text/x.signed-checkpoint or similar on some responses; worth a sanity-check against what clients expect.
AGENTS.md mentions cargo test excludes integration_tests by default. Confirmed this PR's new test respects that (it's in integration_tests).

Pre-push checks

Per AGENTS.md, please confirm on your side:

cargo clippy --workspace --all-targets -- -Dwarnings -Dclippy::pedantic
cargo test
cargo fmt --all --check
cargo machete

all pass — in particular -Dclippy::pedantic on the new crates.

Nice work overall.

ask-bonk · 2026-04-20T14:49:43Z

Posted a review on PR #218. The main findings:

Correctness issues worth fixing:

config.schema.json is missing the top-level description field (while AppConfig accepts it and config.dev.json sets it). With additionalProperties: false this should be failing build-time schema validation.
NoteError::AmbiguousKey is mapped to 400 in add_checkpoint, which silently locks out a log on colliding Ed25519 key_ids. Should be validated at build time or surfaced as 500 + logged loudly.
The origin-match check in add_checkpoint is debug_assert_eq!, a no-op in release. Also suggested reordering to parse CheckpointText first and look up the log by cp_text.origin(), which eliminates the checkpoint_origin helper entirely.
req.bytes() in add_checkpoint has no size cap.

Style / maintainability: monitoring_prefix should be Option<String> (not is_empty() sentinel); the config crate's custom Base64 helper duplicates serde_with::base64::Base64; hash_hex::to_hex reinvents hex::encode; wasm-pack install step in the new CI job appears unused.

Tests: Commended the trusted-(name,id)-with-bad-sig case. Suggested a unit test in witness_worker that cross-checks the committed dev SPKI against the PEM embedded in the integration test.

Also noted that the 403-for-InvalidSignature choice is defensible (matches sunlight/sigsum-go) even though the spec itself is silent on it.

github run

…witness Introduces two new crates and an integration-test suite. tlog_witness (spec-level, publishable to crates.io) - `AddCheckpointRequest` + parsers/serializers for the `POST /add-checkpoint` body format (`old <N>\n<proof-lines>\n\n<checkpoint-note>`) in both directions (log → witness requests, witness → log responses). - `MAX_CONSISTENCY_PROOF_LINES = 63`, matching the spec. - `CONTENT_TYPE_TLOG_SIZE = "text/x.tlog.size"` for 409 response bodies. - `TlogWitnessError` with variants for malformed request / response / note. - 15 unit tests covering happy path, round-trips, spec-example shape, malformed-input rejection, and response wire-format shape. - Internal module structure: the `add_checkpoint` module is private; all public items are re-exported from the crate root. witness_worker (Cloudflare Worker deployment) - `POST /add-checkpoint` implementing all the MUSTs from the spec: - Caps the request body at 1 MiB + 16 KiB before buffering to bound memory use on malformed or malicious input. - Parses `CheckpointText` from the note body first and uses `cp_text.origin()` for the log lookup so a single parse is authoritative for both the 404 check and the origin field. - Looks up configured log keys from `LOG_KEYS` — a `LazyLock`-built map populated at first access. Any non-Ed25519 SPKI or `(name, key_id)` collision across two configured keys for the same log causes the worker to abort at startup rather than failing silently at request time. - Verifies the checkpoint note via `signed_note::Note::verify`. Both `UnverifiedNote` (no trusted-key signature at all) and `InvalidSignature` (a trusted-key signature that failed to verify, making the note malformed per c2sp.org/signed-note) are surfaced as 403 Forbidden, matching sunlight and sigsum-go. Other variants are surfaced as 400 Bad Request. - Range-check `old_size <= checkpoint.size` → 400 on violation. - Single-RPC atomic check-proof-and-update against the per-origin `WitnessState` Durable Object. The DO compares `old_size` against stored state (409 with `text/x.tlog.size` body on mismatch or same-size-different-hash), verifies the Merkle consistency proof against its stored `latest_hash` (422 on failure), and writes the new `(size, hash)`. Atomicity comes from Durable Objects' input and output gates, documented inline at the handler; verifying the proof inside the same handler that reads and writes state keeps the whole sequence inside one atomic transaction. - On 200 from the DO, produce and return a `cosignature/v1` signature. - `GET /metadata` returns the witness's identity (name + DER SPKI), optional description, submission/monitoring URL prefixes, and the full list of configured logs. The witness SPKI is computed once and cached in a `OnceLock`. `description` and `monitoring_prefix` are `Option<String>`; `description` is omitted from the JSON when absent (not serialized as `null`); `monitoring_prefix` falls back to `submission_prefix` when absent. Regression tests pin the serde shape. - `WitnessState` Durable Object holds per-origin `(latest_size, latest_hash)` state. The on-disk JSON layout and the internal RPC wire shape (including the embedded consistency-proof hashes) are pinned by regression tests so format drift is caught before it can orphan deployed state. Hex encoding uses the `hex` crate throughout. - Config model (per-log `origin` + `log_public_keys` SPKI list, parsed via `serde_with::base64::Base64`; top-level `witness_name`, `submission_prefix`, optional `monitoring_prefix`; `WITNESS_SIGNING_KEY` secret) validated at build time against a JSON Schema. - Produces `cosignature/v1` signatures using the existing `tlog_tiles::CosignatureV1CheckpointSigner` (Ed25519-only). Integration tests - New `crates/integration_tests/tests/tlog_witness.rs` drives the witness end-to-end against a live `wrangler dev` instance: `/metadata`, two successful `add-checkpoint` submissions (with cosignature verification against the metadata-advertised public key), then 409 / 404 / 403 / 422 / 400 error cases — including the trusted-(name,id)-with-bad-signature case that verifies the 403-on-`InvalidSignature` behavior. The cases are sequenced inside a single `#[tokio::test]` threaded through one in-memory `ToyLog` so state evolves predictably and consistency proofs always match what the witness expects. - A unit test in `witness_worker` cross-checks that the SPKI in `config.dev.json` matches the `LOG_SIGNING_KEY_PEM` embedded in the integration tests; a mismatch is caught at `cargo test` time rather than as a confusing 403 at integration-test time. - New `integration-tlog-witness` CI job builds the worker, starts `wrangler dev`, and runs the test suite. The job intentionally omits `wasm-pack` (not needed for `witness_worker`, unlike the CT/MTC jobs). This witness is deployable independently of any MTC work and can cosign checkpoints from any `signed-note`-based log (static-ct, MTC, sigsum, etc.).

lukevalenta · 2026-04-20T17:56:04Z

/bonk review again

ask-bonk · 2026-04-20T18:02:20Z

All pre-push checks pass. The code already addresses essentially all of the actionable bonk review items. Let me write a response to the PR author summarizing the status.

Based on my analysis of the current PR state, here's my summary:

Status of bonk review items on PR #218

I've audited the current branch against the bonk review. Almost every actionable item is already addressed in the committed code:

Correctness items — all addressed

config.schema.json missing description — bonk was wrong. The schema lists description under top-level properties (config.schema.json:16-18) and per-log (:35). With additionalProperties: false this already validates.
NoteError::AmbiguousKey silently returns 400 — addressed at startup, not runtime: parse_log_keys (src/lib.rs:84-111) asserts on (name, key_id) collisions when building LOG_KEYS, so the worker aborts at init with a loud operator-facing panic rather than 400-ing every submission.
debug_assert_eq! origin check — already gone. frontend_worker.rs:181-191 parses CheckpointText first and looks up the log by cp_text.origin(). The checkpoint_origin helper no longer exists.
Unbounded req.bytes() — capped at MAX_ADD_CHECKPOINT_BODY_SIZE = 1 MiB + 16 KiB (frontend_worker.rs:155-161, 269), with a comment deriving the bound from MAX_NOTE_SIZE + envelope headroom.
InvalidSignature → 403 — kept, with the spec-silence caveat documented inline (frontend_worker.rs:193-210).

Style / maintainability — all addressed

monitoring_prefix: Option<String> — done (config/src/lib.rs:37), with fallback to submission_prefix in /metadata (frontend_worker.rs:114-117).
Cached SPKI — WITNESS_PUBLIC_KEY_DER: OnceLock<Vec<u8>> (src/lib.rs:135, populated in load_witness_public_key_der at :182-193).
Extraneous base64 dep — not present; witness_worker/config/Cargo.toml only depends on serde and serde_with.
Private Base64 duplicate — the config crate already uses serde_with::base64::Base64 directly (config/src/lib.rs:18,70).
Response-shape pin test — response_matches_spec_line_shape (tlog_witness/src/add_checkpoint.rs:382-397) asserts leading "\u{2014} " and trailing \n.

Tests — addressed

Cross-check test — dev_config_tests::dev_config_spki_matches_embedded_pem in src/lib.rs:223-244 decodes the committed dev SPKI and the integration-test PEM and asserts equality. A dev-key rotation that forgets to update both will now fail a targeted unit test, not a confusing 403 in integration.
wasm-pack in CI — removed from the witness job with an explanatory comment (.github/workflows/integration.yml:122-124).

Remaining open items (all low priority or not actionable)

Publish crates to crates.io #5 (InvalidSignature → 403): kept, defensible, documented inline. Nothing to do.
Add 'Deploy to Cloudflare' button #10 (hash_hex inlining): the helper uses hex::encode directly (witness_state_do.rs:208, 243); already minimal.
Unicode not displayed correctly in checkpoints served from Worker #12 (parser-level invariant doc): AddCheckpointRequest::consistency_proof already carries a doc block explaining the empty-proof MUSTs (add_checkpoint.rs:72-82).
Fix and simplify alarm logic #14 (re-export style): lib.rs:43 is mod add_checkpoint; (private), only pub use exposes the types — already consistent with the nit's preferred direction.
Fix comparison to MAX_IN_FLIGHT #17 (duplicate 60 s wait): matches the pattern used by the other two integration jobs; leaving it for symmetry.

Pre-push checks

All four pass locally on this branch:

cargo clippy --workspace --all-targets -- -Dwarnings -Dclippy::pedantic   ✓
cargo test                                                                ✓
cargo fmt --all --check                                                   ✓
cargo machete                                                             ✓

No code changes are required from the bonk review. If you'd like me to harden any of the remaining low-priority nits (e.g. item #12's stronger parser-type encoding, or a more precise content-type on the 200 response per nit #2), say the word and I'll push a follow-up commit. Otherwise the PR is ready as-is from the review-response side.

github run

lukevalenta commented Apr 18, 2026

View reviewed changes

lukevalenta force-pushed the lvalenta/tlog-witness branch 4 times, most recently from 1c6863c to 5c882c8 Compare April 19, 2026 11:29

lukevalenta force-pushed the lvalenta/tlog-witness branch from 5c882c8 to ec86e84 Compare April 20, 2026 10:44

lukevalenta self-assigned this Apr 20, 2026

lukevalenta force-pushed the lvalenta/tlog-witness branch from ec86e84 to 8156350 Compare April 20, 2026 13:02

lukevalenta mentioned this pull request Apr 20, 2026

internal/witness: return 404 for unknown log, matching the spec FiloSottile/sunlight#63

Open

lukevalenta force-pushed the lvalenta/tlog-witness branch from 8156350 to d273275 Compare April 20, 2026 14:05

lukevalenta force-pushed the lvalenta/tlog-witness branch from 1029452 to d273275 Compare April 20, 2026 14:39

ask-bonk bot reviewed Apr 20, 2026

View reviewed changes

lukevalenta force-pushed the lvalenta/tlog-witness branch from d273275 to 75f8f35 Compare April 20, 2026 15:58

lukevalenta marked this pull request as ready for review April 20, 2026 18:15

lukevalenta requested review from bwesterb, cjpatton, mendess and rozbb as code owners April 20, 2026 18:15

lukevalenta requested review from a team and lbaquerofierro as code owners April 20, 2026 18:15

Conversation

lukevalenta commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lukevalenta commented Apr 19, 2026

Uh oh!

ask-bonk bot commented Apr 19, 2026

Uh oh!

ask-bonk bot commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ask-bonk bot commented Apr 19, 2026

Uh oh!

lukevalenta commented Apr 20, 2026

Uh oh!

ask-bonk bot commented Apr 20, 2026

Uh oh!

ask-bonk bot commented Apr 20, 2026

Uh oh!

lukevalenta commented Apr 20, 2026

Uh oh!

ask-bonk bot left a comment

Choose a reason for hiding this comment

Review — bonk pass

Correctness

Style / maintainability

Integration tests

Nits

Pre-push checks

Uh oh!

ask-bonk bot commented Apr 20, 2026

Uh oh!

lukevalenta commented Apr 20, 2026

Uh oh!

ask-bonk bot commented Apr 20, 2026

Status of bonk review items on PR #218

Correctness items — all addressed

Style / maintainability — all addressed

Tests — addressed

Remaining open items (all low priority or not actionable)

Pre-push checks

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lukevalenta commented Apr 18, 2026 •

edited

Loading

ask-bonk bot commented Apr 19, 2026 •

edited

Loading