feat: add pallet-block-forwarder for offchain HTTP indexer forwarding #185

Open
rishitesh-snt wants to merge 13 commits into main from feat/block-forwarder-ocw-rebased

rishitesh-snt (Contributor) commented Apr 20, 2026

Why

sxt-node already emits SchemaUpdated, TablesCreatedWithCommitments, TableDropped, and QuorumReached events. Today there's no first-party path for indexer-paired nodes to replay those events off-chain. Operators hook that up themselves with ad-hoc scripts. This pallet closes that gap with:

  • A substrate-native producer/consumer split (on_finalize captures, offchain_worker drains).
  • A wire contract (/v1/create_table, /v1/drop_table, /v1/put_batches, /v1/checkpoint, /v1/get_last_checkpoint) that's simple enough that any indexer service (not just prb-service) can implement it.
  • A checkpoint-as-source-of-truth model that keeps the node's forwarding progress aligned with the server's ingestion progress even across node restarts.

Architecture

  • Producer (on_finalize). Runs during block execution, including historical sync. Reads frame_system::Events, filters the four target variants, SCALE-encodes a BlockIndex payload, writes it to the offchain DB
    keyed by block number. Running during sync means fresh nodes backfill their indexer automatically.
  • Consumer (offchain_worker). Fires at chain tip only. Asks the server for last_checkpoint (single source of truth — no local cursor, so node and server cannot drift). Walks from cursor+1 to tip (capped at
    MAX_BLOCKS_PER_INVOCATION = 100 per OCW tick → ~100× realtime catch-up). Forwards each block's events in deposit order, checkpoints the block, deletes the consumed offchain-DB entry.
  • Wire format. DDL for schemas (CreateTableRequest.arrow_schema carries raw CREATE TABLE … bytes). For row data, the postcard-encoded OnChainTable bytes carried in QuorumReached.data; the forwarder relays them
    verbatim (Arrow IPC exists only on the pallet_indexing::submit_data input side). No new host functions; the forwarder does no decoding of its own.
  • Dynamic pallet-index resolution. Rather than hard-coding the construct_runtime! order of pallet_indexing, pallet-block-forwarder::Config takes IndexingPallet: PalletInfoAccess and reads its index at runtime.
    The runtime wires type IndexingPallet = Indexing. The variant byte for QuorumReached is a Get<u8> that the runtime provides as ConstU8<1>: constant for now, to be swapped for real decoding only when the enum becomes volatile.
  • Dedup key. Every forwarded table is registered with META_ROW_NUMBER as the dedup column. Any table schema without that column will be rejected by the server at create_table time — documented in the pallet's
    module docs.
  • Requires --enable-offchain-indexing=true on the node. Without it, sp_io::offchain_index::set is a silent no-op and nothing forwards. Worth calling out because the failure mode is "silent success" on the
    producer side.
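
The consumer flow described above can be sketched in plain Rust. This is a minimal, std-only illustration rather than the pallet's code: the `Indexer` trait, `InMemoryIndexer`, `drain_once`, and the `HashMap` standing in for the offchain DB are hypothetical names, and only `MAX_BLOCKS_PER_INVOCATION = 100` is taken from the description.

```rust
use std::collections::HashMap;

/// Cap on blocks drained per offchain-worker tick (value from the PR description).
const MAX_BLOCKS_PER_INVOCATION: u64 = 100;

/// Hypothetical stand-in for the HTTP indexer behind the /v1/* endpoints.
trait Indexer {
    fn last_checkpoint(&self) -> u64;
    fn forward(&mut self, block: u64, events: &[String]);
    fn checkpoint(&mut self, block: u64);
}

/// In-memory indexer used only to exercise the sketch.
#[derive(Default)]
struct InMemoryIndexer {
    checkpoint: u64,
    received: Vec<(u64, String)>,
}

impl Indexer for InMemoryIndexer {
    fn last_checkpoint(&self) -> u64 {
        self.checkpoint
    }
    fn forward(&mut self, block: u64, events: &[String]) {
        for event in events {
            self.received.push((block, event.clone()));
        }
    }
    fn checkpoint(&mut self, block: u64) {
        self.checkpoint = block;
    }
}

/// Walks from `cursor + 1` to `tip`, at most MAX_BLOCKS_PER_INVOCATION blocks
/// per call. The cursor comes from the server, never from local state.
/// Returns the last block checkpointed.
fn drain_once(queue: &mut HashMap<u64, Vec<String>>, indexer: &mut impl Indexer, tip: u64) -> u64 {
    let cursor = indexer.last_checkpoint();
    let end = tip.min(cursor.saturating_add(MAX_BLOCKS_PER_INVOCATION));
    for block in (cursor + 1)..=end {
        // A block with no queued entry is still checkpointed.
        let events = queue.get(&block).cloned().unwrap_or_default();
        indexer.forward(block, &events);
        indexer.checkpoint(block);
        // Delete the consumed entry only after the checkpoint succeeded.
        queue.remove(&block);
    }
    indexer.last_checkpoint()
}
```

Calling `drain_once` repeatedly with a growing `tip` mimics successive OCW ticks; a node that is 250 blocks behind needs three calls to catch up.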

Test coverage

  • 7 unit/integration tests in pallets/block_forwarder/src/tests.rs:
    • ocw_skips_when_not_configured — OCW no-op without URL.
    • ocw_forwards_and_deletes_offchain_entry — full create-table path.
    • ocw_checkpoints_empty_blocks — checkpoint advances even with zero events.
    • ocw_resumes_from_server_checkpoint — server is source of truth.
    • ocw_processes_multiple_blocks_in_order — deposit ordering preserved.
    • 2 construct_runtime! integrity tests.
  • A mock HTTP server under mock-server/ (separate workspace member) that mirrors the five /v1/* endpoints, for humans running the node locally against a fake indexer.
  • Helper scripts: scripts/configure-ocw.sh (SCALE-encode + RPC the URL into a running node), scripts/run-local-demo.sh (tmux-orchestrated node + mock-server + OCW setup).

Runtime integration surface

impl pallet_block_forwarder::Config for Runtime {
    type RuntimeEvent = RuntimeEvent;
    type IndexingPallet = Indexing;
    type QuorumReachedVariantIndex = ConstU8<1>;
}

#[runtime::pallet_index(111)]
pub type BlockForwarder = pallet_block_forwarder;

Workspace-level: pallet-block-forwarder and pallet-block-forwarder/mock-server added as members; pallet-block-forwarder added to runtime/Cargo.toml with the /std feature.

Offchain worker that forwards table lifecycle and data events
(SchemaUpdated, TablesCreatedWithCommitments, TableDropped, QuorumReached)
to an external HTTP indexer service via protobuf-over-HTTP.

Architecture:
- Producer (on_finalize): extracts events during block execution, writes
  block-indexed entries to the offchain DB. Runs during both live blocks
  and historical sync, enabling full backfill for new nodes.
- Consumer (offchain_worker): drains the offchain DB queue in deposit order,
  POSTs to the HTTP server, checkpoints, deletes consumed entries.
- Server checkpoint is the sole source of truth (no local cursor).
- No new host functions; raw DDL and postcard bytes shipped as-is.
- Processes up to 100 blocks per OCW invocation for catch-up.
- Dynamic indexing pallet detection: IndexingPallet (PalletInfoAccess)
  resolves the pallet index from construct_runtime!; QuorumReachedVariantIndex
  is a Get<u8> the runtime provides (currently ConstU8<1>).

Includes the proto definition, an HTTP+protobuf client, the producer/consumer
pipeline, an axum-based mock server for local testing, helper scripts, and
seven unit/integration tests covering the full pipeline.
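
As a rough illustration of the producer half of this pipeline, the sketch below filters a block's events down to the four target variants and stores them under the block number. It is std-only and every type here is a simplified, hypothetical stand-in: the real events carry richer payloads, the real store is the offchain DB, and whether the producer writes an entry for a block with no matching events is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the runtime's event enum (hypothetical shapes).
#[derive(Debug, Clone, PartialEq)]
enum RuntimeEvent {
    SchemaUpdated(String),
    TablesCreatedWithCommitments(Vec<String>),
    TableDropped(String),
    QuorumReached { table: String, data: Vec<u8> },
    /// Any event the forwarder does not care about.
    Other,
}

/// Returns true for the four variants named in the commit message.
fn is_forwarded(event: &RuntimeEvent) -> bool {
    !matches!(event, RuntimeEvent::Other)
}

/// Captures one block: keeps the target events in deposit order and stores
/// them keyed by block number. The BTreeMap stands in for the offchain DB.
/// Assumption: an entry is written even when no event matches.
fn capture_block(db: &mut BTreeMap<u64, Vec<RuntimeEvent>>, block: u64, events: &[RuntimeEvent]) {
    let kept: Vec<RuntimeEvent> = events.iter().filter(|e| is_forwarded(e)).cloned().collect();
    db.insert(block, kept);
}
```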
rishitesh-snt requested review from a team as code owners April 20, 2026 16:53
github-actions Bot commented Apr 20, 2026

1.69.0

Bug Fixes

  • add basic documentation to runtime crate (30c26e7)
  • add missing docs for runtime module items (52a2905)
  • add missing docs to chain-utils (c8235c6)
  • address warnings in on-chain-table (095a856)
  • address warnings in pallet-rewards mock module (886bc36)
  • allow deprecated CurrencyAdapter usage (cfe0517)
  • allow missing docs in private items in test_end_row_limits binary (7700cd6)
  • allow missing_docs_in_private_items on event forwarder contract (b2d6f8b)
  • allow unused private_key variables in pallet-keystore benchmarks (c218f0d)
  • declare anonymous lifetime in commitment-sql proptest helper (63b3fc0)
  • declare anonymous lifetimes in canaries parse functions (f81e9de)
  • document all items in watcher main module (2fe60db)
  • document memory_commitment_map module in commitment-map (32a38e3)
  • document missing items in canaries (72c5f33)
  • document missing items in event-forwarder (5408075)
  • document modules in chain-utils (3eb6545)
  • document modules in rpc crate (19cf1af)
  • don't hide elided lifetimes in backwards compatibility test case generator (4e6968e)
  • don't import frame_benchmarking if runtime-benchmarks is disabled in node (32cfca3)
  • expect AttestationInfo::block_number to be dead code in canaries (7753c44)
  • expect some dead code in NewFullBase in node (fe06f5e)
  • ignore unused error variable in watcher main module (d330fe5)
  • ignore unused variable in attestation-tree test (ce8adf3)
  • ignore unused variable in pallet-attestation benchmarks (b0ef4a7)
  • implement benchmark configurations for runtime outside runtime api implementation (d1a3db8)
  • only define migrations with runtime-benchmarks disabled (26e8804)
  • privatize deposit_event in pallet-rewards (6c1caaf)
  • privatize deposit_event in pallet-smartcontracts (fe43a19)
  • privatize deposit_event in pallet-system-tables (8bf7c21)
  • privatize deposit_event in pallet-tables (da951a5)
  • privatize pallet-keystore deposit_event (e878e37)
  • privatize pallet-permissions deposit_event (099a7ba)
  • privatize system-contracts deposit_event (305dfc9)
  • privatize zkpay deposit_event (9d5993c)
  • propagate tui error in update_ui in watcher (fe943eb)
  • remove dead code in chain-utils (0070600)
  • remove dead code in node (6958ec6)
  • remove unused block_number parameter from verify_attestations in watcher (6225426)
  • remove unused block_number parameter from verify_signature in watcher (08f3bff)
  • remove unused const in pallet-tables test (e7bf9f5)
  • remove unused const in runtime test (c7e0add)
  • remove unused dependencies from sxt-core (361f8a0)
  • remove unused destructure variables in verify_attestation in watcher (dc04d24)
  • remove unused import in memory_commitment_map test (4c2300a)
  • remove unused import in pallet-attestation benchmarks (5fa662e)
  • remove unused import in pallet-commitments migrations (ecf6a05)
  • remove unused imports and format imports in chain-utils (5cd269b)
  • remove unused imports in attestation-tree (91f5d61)
  • remove unused imports in node (1f4c7a8)
  • remove unused imports in pallet-smartcontracts (b7fa723)
  • remove unused imports in pallet-tables (a08ccfd)
  • remove unused imports in runtime (614a1e3)
  • remove unused imports in test_end_row_limits (0328143)
  • remove unused imports in watcher (340d27b)
  • remove unused substrate key path from watcher client (2bd4d02)
  • switch hex to being a dev dependency in rpc crate (7b01208)
  • warn unused_crate_dependencies in sxt-core (62c85b7)

Features

  • add pallet-block-forwarder for offchain HTTP indexer forwarding (604db65)
  • allow SCI namespace creation without special permissions (092ba68)
  • block-forwarder: surface --enable-offchain-indexing requirement (1a9b83b)
  • node: --indexer-url flag seeds block_forwarder OCW storage (dd873ed)
  • runtime: resolve pallet_indexing::QuorumReached variant by name (a19ae60)

Performance Improvements

  • runtime: cache DynamicQuorumReachedIndex lookup via lazy_static (c8e5c2d)

The pallet ships QuorumReached.data bytes (which pallet_indexing
validates as Arrow IPC single-batch stream via record_batch_bytes_dimensions)
through the HTTP adapter verbatim. Block-forwarder never decodes them.

Updates the stale doc comment in lib.rs and the mock-server log field
from 'postcard OnChainTable' to 'Arrow IPC stream'. No runtime behavior
change.
Replaces the hard-coded `QuorumReachedVariantIndex = ConstU8<1>` wiring
with a `DynamicQuorumReachedIndex` struct that looks up the variant's
index on `pallet_indexing::Event` by name via `scale_info::TypeInfo`
metadata that FRAME already derives on every pallet event.

Motivation: the hard-coded `1` was a silent-drift hazard. If anyone
reorders `pallet_indexing::Event` (e.g., prepends a new variant), the
old index becomes wrong and the block-forwarder silently stops matching
QuorumReached events — data quietly stops flowing to indexers. Looking
it up by name survives reorderings and fails loudly at node boot with a
clear message if the variant is ever renamed or removed.

The lookup lives entirely in the runtime crate. The pallet's `Config`
contract is unchanged (`QuorumReachedVariantIndex: Get<u8>`); the pallet
and its mock stay exactly as they were. No new deps; scale-info is
already a direct runtime dep.

Adds a `dynamic_quorum_reached_index_resolves` unit test so a future
rename of `QuorumReached` fails at `cargo test` time rather than node
boot.
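
The drift hazard and the by-name fix can be shown with a small std-only sketch. `VariantMeta` is a hypothetical, flattened view of the per-variant metadata (name plus index byte); the real lookup walks `scale_info` type metadata, and the variant names other than `QuorumReached` are made up for illustration.

```rust
/// Hypothetical, simplified per-variant metadata: a name and its index byte.
struct VariantMeta {
    name: &'static str,
    index: u8,
}

/// Finds a variant's index by name. Returns None if the variant was renamed
/// or removed; a caller may choose to panic on None so the failure is loud.
fn find_variant_index(variants: &[VariantMeta], name: &str) -> Option<u8> {
    variants.iter().find(|v| v.name == name).map(|v| v.index)
}

/// Event layout before a reordering (illustrative variant names).
fn layout_before() -> Vec<VariantMeta> {
    vec![
        VariantMeta { name: "DataSubmitted", index: 0 },
        VariantMeta { name: "QuorumReached", index: 1 },
    ]
}

/// Event layout after someone prepends a new variant.
fn layout_after() -> Vec<VariantMeta> {
    vec![
        VariantMeta { name: "NewVariant", index: 0 },
        VariantMeta { name: "DataSubmitted", index: 1 },
        VariantMeta { name: "QuorumReached", index: 2 },
    ]
}
```

A hard-coded `1` would match `DataSubmitted` in the second layout, while the lookup by name keeps following `QuorumReached`.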
Adds an `--indexer-url <URL>` CLI flag to sxt-node. When set, the node
writes the URL into the block-forwarder OCW's persistent local storage
(key: `block_forwarder::indexer_url`) during startup, as a
SCALE-encoded `Vec<u8>` under the standard offchain STORAGE_PREFIX.

Equivalent to running `pallets/block_forwarder/scripts/configure-ocw.sh`
via the offchain_localStorageSet RPC, but:
  - no --rpc-methods=unsafe required;
  - no second process / second terminal;
  - seeds before the first block is authored, so no events are missed.

If omitted, behaviour is unchanged: OCW is a no-op until the URL is
written by some other means (RPC, manual offchain_localStorageSet, etc.).

Touches node/src/cli.rs (new CLI arg), node/src/service.rs
(`configure_indexer_url` helper invoked from new_full_base).
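
For readers unfamiliar with the stored value's shape, a SCALE-encoded `Vec<u8>` is a compact-encoded length followed by the raw bytes. The std-only sketch below reproduces that layout by hand for lengths below 2^30; the node itself would use the codec crate, and the function names and example URL here are hypothetical.

```rust
/// Encodes a length as a SCALE compact integer. Only the single-byte,
/// two-byte, and four-byte modes are handled (lengths below 2^30).
fn compact_len(len: usize) -> Vec<u8> {
    match len {
        0..=0x3f => vec![(len as u8) << 2],
        0x40..=0x3fff => (((len as u16) << 2) | 0b01).to_le_bytes().to_vec(),
        0x4000..=0x3fff_ffff => (((len as u32) << 2) | 0b10).to_le_bytes().to_vec(),
        _ => panic!("length too large for this sketch"),
    }
}

/// SCALE-encodes a byte vector: compact length prefix, then the bytes.
fn scale_encode_bytes(bytes: &[u8]) -> Vec<u8> {
    let mut out = compact_len(bytes.len());
    out.extend_from_slice(bytes);
    out
}
```

For a 21-byte URL such as `http://localhost:8080`, the prefix is the single byte `21 << 2 = 84`, so the stored value is 22 bytes long.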
The block-forwarder producer writes events via sp_io::offchain_index::set,
which is a silent no-op unless --enable-offchain-indexing=true is passed.
Until now, forgetting that flag caused forwarding to look healthy on the
producer side while nothing arrived at the indexer — a hard-to-debug
silent-failure mode.

Two changes to turn silent failure into loud failure:

1. node/src/service.rs (new_full_base):
   - --indexer-url set + --enable-offchain-indexing=false → hard startup
     error. Unambiguous misconfiguration; boot refuses to proceed.
   - --indexer-url unset + --enable-offchain-indexing=false → stderr
     warning at boot. Forwarding will never work even if the URL is
     written via RPC later; worth surfacing even if the operator hasn't
     opted into the CLI flag path.

2. pallets/block_forwarder/README.md:
   New README explaining the three node flags that must be set for
   forwarding to work (--enable-offchain-indexing, --offchain-worker,
   --indexer-url), the runtime wiring, the wire data format, the dedup
   key contract, and the testing options. One-command dev setup example.

No behavior change for correctly-configured nodes. Tests unchanged:
pallet-block-forwarder 7/7, sxt-runtime 4/4.
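
The boot-time decision described in change 1 amounts to a small truth table. The sketch below is a std-only illustration under assumed names (`StartupCheck`, `check_forwarding_flags`); the message strings are placeholders, not the node's actual output.

```rust
/// Outcome of the boot-time flag check (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum StartupCheck {
    Ok,
    Warn(&'static str),
    Error(&'static str),
}

/// Maps the two flags onto the behaviour described in the commit message.
fn check_forwarding_flags(indexer_url: Option<&str>, offchain_indexing_enabled: bool) -> StartupCheck {
    match (indexer_url, offchain_indexing_enabled) {
        // Indexing enabled: nothing to complain about, with or without a URL.
        (_, true) => StartupCheck::Ok,
        // URL given but indexing disabled: unambiguous misconfiguration.
        (Some(_), false) => {
            StartupCheck::Error("--indexer-url requires --enable-offchain-indexing=true")
        }
        // No URL and indexing disabled: forwarding can never work, so warn.
        (None, false) => {
            StartupCheck::Warn("offchain indexing is disabled; block forwarding cannot work")
        }
    }
}
```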
Comment thread pallets/block_forwarder/scripts/configure-ocw.sh Outdated
};
index.events.push(BlockEvent::Data(DataEntry {
table: quorum.table,
data: data.to_vec(),
Contributor

BTW I'm sure you've noticed, but the data in the QuorumReached event isn't a record batch; it's a native table type we made for no-std compatibility. The source batch for it doesn't have the META_ROW_NUMBER column, so we can't just use that. I think you mentioned this last week, actually. We might unfortunately need to add a native interface to convert it back to a record batch, as you suggested, so the offchain workers can transform the processed data into a format that the PRB can read. Or I guess the PRB could be made to read OnChainTables, but that seems wrong.

Contributor Author

Yes, right now it reads the OnChainTable. The interface is not very generic, but it is optimised for our use case and keeps the OCW dumb. If the PRB ever evolves into more than an ingestion component, we can add interfaces separately.

Comment thread pallets/prover_db_indexer/src/lib.rs Outdated
}

fn try_extract_indexing_event(
event: &<T as polkadot_sdk::frame_system::Config>::RuntimeEvent,
Contributor

BTW what's the benefit of processing events in this pallet as opposed to just writing to local storage w/ off-chain-indexing as part of the pallet-tables/pallet-indexing extrinsics? It seems like there's a little bit of headache to decoding the events in this separate place. Not saying it's wrong, just wanna consider both options

Contributor

Still curious what your thoughts are on this.

Contributor Author

You're right that it would be less code, but the cost shows up across the pallet-indexing and pallet-tables call paths instead, and it couples their internals to block_forwarder's storage. I preferred to pay the decode tax in one place rather than spread writes across three pallets' extrinsics.

Real integration path is now:
- prb-service with --features indexer implements the HTTP surface.
- --indexer-url flag on sxt-node seeds the OCW storage at startup.
- sxt-int-harness (standalone crate) drives chain actions via TOML.

Removing the now-redundant local-testing artifacts:
- pallets/block_forwarder/mock-server/ (axum stub HTTP server that
  logged calls; superseded by running the real prb-service).
- pallets/block_forwarder/scripts/configure-ocw.sh (SCALE-encoded the
  URL and called offchain_localStorageSet via RPC; superseded by the
  --indexer-url CLI flag).
- pallets/block_forwarder/scripts/run-local-demo.sh (tmux orchestrator
  that chained the mock server + node + configure-ocw.sh; no longer
  has a job).
- Workspace-member entry "pallets/block_forwarder/mock-server".

README updated to describe the new integration path.
Comments in node/src/cli.rs and node/src/service.rs no longer
reference configure-ocw.sh.

pallet-block-forwarder tests unchanged: 7/7 pass — they use
sp_core::offchain::testing::TestOffchainExt, not the mock-server.
`DynamicQuorumReachedIndex::get()` is called once per pallet_indexing
event in block-forwarder's per-event filter (`try_extract_indexing_event`),
which is the hot path during on_finalize. Before this change each call
re-ran `scale_info::TypeInfo::type_info()`, which builds a fresh `Type`
struct with heap-allocated children on every invocation.

Wraps the one-time lookup in `lazy_static!` so subsequent calls are a
single atomic load. Lookup still runs on first access (panics on
rename/remove, as intended) — just doesn't run 10+ times per block.

Adds `lazy_static = { workspace = true }` to runtime's deps. The
workspace already pins it with `spin_no_std` feature, so this works in
both std (native) and no_std (WASM) builds.

No behavior change. sxt-runtime 4/4 tests pass, including
`dynamic_quorum_reached_index_resolves`.
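
The caching pattern can be sketched with the standard library. Note the substitution: the commit uses `lazy_static!` with the `spin_no_std` feature so that it also works in the WASM build, whereas this illustration uses `std::sync::OnceLock`, which is std-only. `expensive_lookup`, `quorum_reached_index`, and the `LOOKUPS` counter are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive lookup actually runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the metadata walk. A real lookup would panic here if the
/// variant had been renamed or removed.
fn expensive_lookup() -> u8 {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    1
}

/// Returns the cached index, computing it on first access only.
fn quorum_reached_index() -> u8 {
    static CACHE: OnceLock<u8> = OnceLock::new();
    *CACHE.get_or_init(expensive_lookup)
}
```

Calling `quorum_reached_index()` once per event in a hot loop then costs a cached read instead of a fresh metadata walk.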
…foAccess

The filter had two different mechanisms for resolving a pallet's
construct_runtime! index:

- pallet_indexing → `type IndexingPallet: PalletInfoAccess;`
  (introduced earlier this session, FRAME-idiomatic, clean).
- pallet_tables → `fn tables_pallet_index()` that fabricated a dummy
  `TableDropped(None, Community, empty_ident, Source::default())`,
  routed it through the `From<pallet_tables::Event>` for RuntimeEvent
  impl, SCALE-encoded the result, and peeked at byte 0. Brittle —
  every time `TableDropped`'s variant shape changes (it gained a 4th
  Source field recently), the dummy constructor has to track it.

Now both use `PalletInfoAccess`. Adds `TablesPallet` to Config parallel
to `IndexingPallet`; runtime wires `type TablesPallet = Tables;` and the
mock wires `type TablesPallet = Tables;` as well. The `tables_pallet_index`
function is deleted entirely.

Separately, this removes a typo the pallet picked up during an earlier
edit: `TableTzype::Community` should have read `TableType::Community`.
The line is gone along with the dummy constructor, so the deletion
resolves the typo implicitly.

No behavior change. Tests: pallet-block-forwarder 7/7 pass; sxt-runtime
checks clean.
…row IPC

Earlier this session I incorrectly updated these docs to say the
forwarded row-data bytes were Arrow IPC. Re-tracing the pallet_indexing
flow shows that's not what QuorumReached.data carries:

  indexer → submit_data.data = Arrow IPC bytes
  chain:
    validate_data: parse Arrow IPC header (weight accounting, check only)
    host fn record_batch_to_onchain: Arrow IPC → RecordBatch → OnChainTable
    process_insert_and_update_commitments: attach meta columns
    postcard::to_allocvec(&insert_with_meta_columns)  ← POSTCARD from here
    QuorumReached { data: <postcard bytes> }
  block-forwarder: opaque relay of postcard bytes to /v1/put_batches

The Arrow IPC format only lives on the indexer-side input; the on-chain
event data and everything downstream is postcard-encoded OnChainTable.
Module header and README data-format section now reflect that.

No code change — this is purely a documentation correction. The
companion fix on the sxtdb side restores the prb-service decoder to
postcard.
- Allow dead_code/missing_docs on the auto-generated pallet module
  and the prost-generated proto submodule.
- Allow enum_variant_names on http_client::Error's IoError variant.
- Drop unused TableType import in the OCW tests.
- Attach the existing result_large_err expectation to
  configure_indexer_url so CI's -Dclippy::all doesn't regress on it.
- cargo f (imports_granularity=Module, group_imports=StdExternalCrate,
  imports_layout=HorizontalVertical) across our PR files.
Comment thread pallets/prover_db_indexer/src/lib.rs
}

Ok(response.body().collect())
}
Contributor

I would appreciate this PR being split into several smaller ones. Perhaps one PR for configuration, one for the HTTP client, one for the indexing, and one for the OCW flushing the indexed items to the HTTP client. It's nice to see it all put together here, but we do usually strive for smaller PRs in this repo.

Contributor Author

Sure, let me try that. First I would like to get general feedback on the approach.

Comment thread node/src/cli.rs Outdated
Comment thread node/src/cli.rs Outdated
Comment thread node/src/service.rs Outdated
Comment thread pallets/block_forwarder/src/lib.rs Outdated
Comment thread pallets/prover_db_indexer/src/lib.rs
Comment thread pallets/prover_db_indexer/src/lib.rs Outdated
Address review comments on the offchain HTTP forwarder pallet:

- Rename pallet from block-forwarder to prover-db-indexer; the old
  name was too generic for what is specifically forwarding to a
  prover-db backend (directory, crate, runtime wiring, storage-key
  prefixes, log targets, mock/test idents, README all moved).
- Rename proto file: indexer.proto -> prover-db.proto.
- Rename CLI flag and OCW storage key: --indexer-url -> --prover-db-url
  (PROVER_DB_URL_KEY const + "prover_db_indexer::prover_db_url" key).
- Strengthen --prover-db-url type from String to url::Url so an invalid
  value is rejected at clap-parse time rather than failing on the OCW's
  first HTTP request.
- Drop debug eprintln on successful storage seed (the Result already
  propagates errors; success is silent).
Address PR review: instead of SCALE-encoding each runtime event and
peeking pallet/variant tag bytes, supertrait the source pallets and
let `construct_runtime!`'s generated `TryInto<pallet_X::Event>`
conversions do typed downcasts. Variant-rename safety is now
compile-time, not a `lazy_static!` panic at startup.

- Make pallet instanced (`Pallet<T, I = ()>`, `Config<I>`,
  `Event<T, I>`) so we can supertrait
  `pallet_indexing::Config<I>` (which is itself instanced).
- Drop `TablesPallet`, `IndexingPallet`, `QuorumReachedVariantIndex`
  associated types. Drop runtime's `DynamicQuorumReachedIndex`,
  `lazy_static!` cache, `Get<u8>` impl, and `find_event_variant_index`
  helper. Drop `lazy_static` dep from runtime.
- Add `native_pallet` aliasing module so `construct_runtime!` can
  refer to `Pallet<Runtime>` instead of carrying the instance type
  parameter.
- Bridge `frame_system::Config::RuntimeEvent` -> our
  `Config<I>::RuntimeEvent` via explicit `From::from(...)`; same
  underlying value, distinct types to the type system, joined by
  `IsType`.
- Mock expands from 78 to ~225 lines (mirrors pallet_indexing's
  mock) — pure trait-impl boilerplate to satisfy the supertrait
  chain. No test exercised the producer side either before or after
  this change.
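
The difference between peeking at tag bytes and a typed downcast can be illustrated without FRAME. In the std-only sketch below, all enums are hypothetical reductions of the real event types, and the `TryFrom` impl is written by hand where `construct_runtime!` would generate the conversion.

```rust
use std::convert::TryFrom;

/// Hypothetical pallet event enums, reduced to what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum IndexingEvent {
    QuorumReached { table: String, data: Vec<u8> },
}

#[derive(Debug, Clone, PartialEq)]
enum TablesEvent {
    TableDropped(String),
}

/// Stand-in for the aggregate enum that `construct_runtime!` generates.
#[derive(Debug, Clone, PartialEq)]
enum RuntimeEvent {
    Indexing(IndexingEvent),
    Tables(TablesEvent),
}

/// Typed downcast: succeeds only when the runtime event wraps an indexing event.
impl TryFrom<RuntimeEvent> for IndexingEvent {
    type Error = ();
    fn try_from(event: RuntimeEvent) -> Result<Self, Self::Error> {
        match event {
            RuntimeEvent::Indexing(inner) => Ok(inner),
            _ => Err(()),
        }
    }
}

/// Extracts row data without inspecting any encoded bytes. Renaming the
/// `QuorumReached` variant would make this function fail to compile.
fn quorum_data(event: RuntimeEvent) -> Option<Vec<u8>> {
    match IndexingEvent::try_from(event) {
        Ok(IndexingEvent::QuorumReached { data, .. }) => Some(data),
        Err(()) => None,
    }
}
```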
Address PR review: prefer immutability. The two extraction helpers
no longer take `&mut BlockIndex` — they return what would be
appended (`Vec<BlockEvent>` for tables since one event can yield
N creates, `Option<BlockEvent>` for indexing since it's at most
one). `extract_block_index` flattens them into a single `collect`,
and `BlockIndex` is constructed once from the result rather than
mutated in a loop.
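
The shape of that refactor might look like the following std-only sketch. `BlockEvent` and `SourceEvent` are simplified, hypothetical stand-ins (the pallet's `BlockEvent` carries richer entries), and `extract_tables` / `extract_indexing` are illustrative names for the two helpers; only `extract_block_index` is named in the commit message.

```rust
/// Simplified forwarded-event type (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum BlockEvent {
    CreateTable(String),
    Data(Vec<u8>),
}

/// Simplified input events (hypothetical shapes).
enum SourceEvent {
    TablesCreated(Vec<String>),
    QuorumReached(Vec<u8>),
    Unrelated,
}

/// One tables event can yield several creates, so this returns a Vec.
fn extract_tables(event: &SourceEvent) -> Vec<BlockEvent> {
    match event {
        SourceEvent::TablesCreated(names) => {
            names.iter().cloned().map(BlockEvent::CreateTable).collect()
        }
        _ => Vec::new(),
    }
}

/// An indexing event yields at most one entry, so this returns an Option.
fn extract_indexing(event: &SourceEvent) -> Option<BlockEvent> {
    match event {
        SourceEvent::QuorumReached(data) => Some(BlockEvent::Data(data.clone())),
        _ => None,
    }
}

/// Flattens both helpers into a single collect, preserving deposit order.
fn extract_block_index(events: &[SourceEvent]) -> Vec<BlockEvent> {
    events
        .iter()
        .flat_map(|e| extract_tables(e).into_iter().chain(extract_indexing(e)))
        .collect()
}
```

Because neither helper takes a mutable accumulator, each can be tested on a single event in isolation.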
2 participants