
Conversation

akaladarshi (Collaborator) commented on Dec 3, 2025

Summary of changes

Changes introduced in this pull request:

  • Store FVM events in the blockstore.
  • Remove the indices store column from the DB; the events root -> tipsetKey mapping is no longer stored.
  • Serve the ChainGetEvents API directly from the blockstore; if the events are not available, return an error, matching Lotus.

Reference issue to close (if applicable)

Closes #6271

Other information and links

The issue was: a message generated two identical events. Forest always recomputed the state to obtain the events when serving a ChainGetEvents API request, so it always returned the two identical events, whereas Lotus served them directly from the blockstore and returned a single event.

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • Bug Fixes

    • Events now preserve exact order and duplicates; event-root/receipt-root validation tightened. RPC returns correct events.
  • Breaking Changes

    • Chain store constructor simplified — fewer database handles required.
  • New Features

    • State-events type and event-AMT bit-width exposed; ChainGetEvents RPC now serves events directly from storage.
  • Chores

    • Database migration to v0.31.0 removes legacy indices column.
  • Tests

    • Expanded coverage for event storage, ordering, duplicates, and cross-version compatibility.


coderabbitai bot (Contributor) commented on Dec 3, 2025

Walkthrough

Removed the IndicesStore and related column/impls, consolidated event retrieval to AMT-backed StampedEvent::get_events (preserving order and duplicates), updated ChainStore::new to drop the indices/db-writer parameter across call sites, added StateEvents and EVENTS_AMT_BITWIDTH, and added a v0.31.0 DB migration removing the Indices column.

Changes

Cohort / File(s) Summary
ChainStore & call sites
src/chain/store/chain_store.rs, src/chain_sync/chain_follower.rs, src/rpc/methods/sync.rs, src/libp2p/chain_exchange/provider.rs, src/tool/..., src/daemon/context.rs
Removed IndicesStore dependency/field and index-related methods (put_index, get_tipset_key); updated ChainStore::new(...) signature and all call sites to omit one DB argument.
DB traits & implementations removed
src/db/mod.rs, src/db/memory.rs, src/db/parity_db.rs, src/db/car/many.rs, src/tool/subcommands/api_cmd/generate_test_snapshot.rs
Deleted IndicesStore and IndicesStoreExt traits and all impls; removed indices field/column support and associated read/write/exists helpers.
Event retrieval & RPC
src/rpc/methods/chain.rs, src/rpc/methods/eth/filter/mod.rs
Added ChainGetEvents RPC that delegates to get_events_by_event_root; replaced tipset-based collection with direct AMT-based retrieval via StampedEvent::get_events, preserving order/duplicates.
StampedEvent & AMT handling
src/shim/executor.rs
Added #[serde(untagged)] to StampedEvent and pub fn get_events<DB: Blockstore>(db, events_root) -> anyhow::Result<Vec<StampedEvent>> which loads events AMT (v4 → fallback v3) preserving stored order.
State manager & event storage
src/state_manager/mod.rs, src/state_manager/tests.rs
Added pub const EVENTS_AMT_BITWIDTH: u32 = 5 and StateEvents struct; updated event AMT construction/storage (using Amtv0/Amt::new_from_iter_with_bit_width) and added validation with ensure!; expanded tests for V3/V4, ordering, duplicates, and CID equivalence.
Daemon & tooling updates
src/daemon/db_util.rs, src/tool/.../test_snapshot.rs, src/tool/.../api_compare_tests.rs, src/tool/.../generate_test_snapshot.rs, src/tool/subcommands/index_cmd.rs, src/tool/subcommands/state_compute_cmd.rs, src/tool/offline_server/server.rs
Removed indices backfill/copy paths; adjusted ChainStore construction call sites to match new signature; some constructors now return Result and use ?.
DB migration & mapping
src/db/migration/mod.rs, src/db/migration/migration_map.rs, src/db/migration/v0_31_0.rs
Added migration Migration0_26_0_0_31_0 to migrate DB to v0.31.0 schema (removes Indices column); added per-version paritydb helpers and migration logic that skips Indices.
Tests, snapshots & scripts
src/tool/subcommands/api_cmd/test_snapshots.txt, src/state_manager/tests.rs, scripts/tests/api_compare/filter-list*
Updated RPC snapshots for ChainGetEvents; added/adapted tests for event AMT behavior; removed ChainGetEvents from ignore/filter lists.
Version bump
Cargo.toml
Bumped package version 0.30.5 → 0.31.0.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant RPC as Filecoin.ChainGetEvents
    participant EthHandler as EthEventHandler / RPC handler
    participant Blockstore as Blockstore / AMT
    participant Stamped as StampedEvent

    Client->>RPC: request(rootCid)
    RPC->>EthHandler: get_events_by_event_root(rootCid)
    EthHandler->>Blockstore: load events AMT at rootCid (try v4, fallback v3)
    Blockstore-->>Stamped: return ordered stamped entries (may include duplicates)
    Stamped-->>EthHandler: convert to Vec<Event>
    EthHandler-->>RPC: return Vec<Event>
    RPC-->>Client: respond(events)
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • hanabi1224
  • LesnyRumcajs
  • sudo-shashank
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped, since CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'fix: remove unique events logic from the ChainGetEvents API' directly describes the main change: removing deduplication/unique-events logic and reworking event serving.
  • Linked Issues Check: ✅ Passed. The PR addresses the core objective from #6271 by removing faulty deduplication logic and serving events directly from the blockstore, matching Lotus behavior.
  • Out of Scope Changes Check: ✅ Passed. While the PR includes comprehensive infrastructure changes (removing IndicesStore, updating ChainStore signatures, adding database migrations), all changes are tightly coupled to and necessary for serving events directly from the blockstore.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.


@akaladarshi force-pushed the akaladarshi/fix-get-events-api branch 3 times, most recently from dda3266 to a15e4e9, on December 4, 2025 16:29
@akaladarshi added the "RPC" label (requires calibnet RPC checks to run on CI) on Dec 4, 2025
@akaladarshi marked this pull request as ready for review on December 4, 2025 16:30
@akaladarshi requested a review from a team as a code owner on December 4, 2025 16:30
@akaladarshi requested reviews from LesnyRumcajs and hanabi1224 and removed the review request for a team on December 4, 2025 16:30
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/shim/executor.rs (1)

340-415: StampedEvent AMT loader looks good; watch serde wire format & small perf nit

The new StampedEvent::get_events helper correctly handles both v4-first and v3-fallback AMT loading and preserves storage order, which aligns with the new state-manager tests. Two minor follow‑ups to consider:

  • #[serde(untagged)] on StampedEvent changes its serialized shape (no variant tag). Please double‑check that nothing persists or exchanges the old tagged representation; if there is such usage, a compatibility note or migration test would help.
  • If you expect large event AMTs, you can avoid reallocations by reserving events capacity from the AMT’s element count before for_each.
src/state_manager/tests.rs (1)

341-406: Cache tests are correct; consider aligning events_root construction with production

The receipt/events cache tests behave correctly: update_cache_with_state_output only uses the provided events vector and treats events_root as an opaque identifier, so building the events root with Amtv0 is safe here. For future clarity (especially now that production writes events into a specific-bitwidth AMT), it may be worth switching this test’s events_root construction to the same AMT flavor/bitwidth used in the main code, or adding a comment noting that the root isn’t dereferenced in this test.

src/state_manager/mod.rs (1)

99-105: StateEvents and cache update logic look coherent; consider asserting vector invariants

Making StateEvents public and wiring update_cache_with_state_output to insert both events and their roots into the receipt/event cache handler matches how tipset_state_events and the RPC layer consume this data. One implicit assumption is that events.len() and events_roots.len() always stay equal; given other code zips these two vectors, it may be worth adding a debug/assert (either when creating StateEvents or before caching) to catch any future mismatch early.

Also applies to: 511-528

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07cbdaf and a15e4e9.

📒 Files selected for processing (7)
  • src/chain/store/chain_store.rs (1 hunks)
  • src/rpc/methods/chain.rs (1 hunks)
  • src/rpc/methods/eth/filter/mod.rs (1 hunks)
  • src/shim/executor.rs (3 hunks)
  • src/state_manager/mod.rs (4 hunks)
  • src/state_manager/tests.rs (4 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots.txt (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-09-02T14:23:53.808Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 5923
File: src/tool/subcommands/api_cmd/test_snapshots.txt:206-252
Timestamp: 2025-09-02T14:23:53.808Z
Learning: In the Forest project, .rpcsnap.json.zst snapshot files listed in src/tool/subcommands/api_cmd/test_snapshots.txt are not stored in the repository itself but are downloaded from a DigitalOcean (DO) bucket when needed. The manifest file serves as an index/catalog of available snapshots.

Applied to files:

  • src/tool/subcommands/api_cmd/test_snapshots.txt
📚 Learning: 2025-08-25T14:11:10.790Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5978
File: .github/workflows/unit-tests.yml:86-96
Timestamp: 2025-08-25T14:11:10.790Z
Learning: hanabi1224's RPC snapshot tests include a checksum validation mechanism that compares against HTTP HEAD headers, automatically re-downloading snapshots if the cached version doesn't match, providing protection against corrupted cached files.

Applied to files:

  • src/tool/subcommands/api_cmd/test_snapshots.txt
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

Applied to files:

  • src/shim/executor.rs
📚 Learning: 2025-08-08T12:11:55.266Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

  • src/rpc/methods/chain.rs
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_manager/tests.rs
  • src/state_manager/mod.rs
🧬 Code graph analysis (2)
src/chain/store/chain_store.rs (2)
src/shim/executor.rs (1)
  • key (305-310)
src/blocks/tipset.rs (2)
  • key (337-340)
  • key (512-515)
src/state_manager/tests.rs (1)
src/shim/executor.rs (4)
  • events_root (180-186)
  • event (383-388)
  • events (118-124)
  • get_events (392-414)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
🔇 Additional comments (7)
src/tool/subcommands/api_cmd/test_snapshots.txt (1)

7-7: The snapshot file filecoin_chaingetevents_1764864316078100.rpcsnap.json.zst is correctly listed in the manifest and is already available in the DigitalOcean bucket (https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/). The change is valid and the test suite will be able to download this snapshot at runtime.

src/rpc/methods/eth/filter/mod.rs (1)

379-395: Preserving AMT event duplicates and order is correct

The updated collect_chain_events flow (filter by root, then flatten without dedup) now faithfully reflects the AMT contents, including duplicates and order, which is consistent with the new events tests and the API goal of returning all emitted events.

src/chain/store/chain_store.rs (1)

201-208: Renamed tipset-key lookup clarifies events-root usage

The new get_tipset_key_by_events_root wrapper keeps the previous behavior while making the intent (lookup by events root) explicit, which matches its use from ChainGetEvents.

src/rpc/methods/chain.rs (1)

254-263: ChainGetEvents now correctly resolves via events-root index

Switching to get_tipset_key_by_events_root and updating the error message to reference the tipset for the given events root aligns the RPC with the new indexing logic; the subsequent tipset load and event collection remain consistent.

src/state_manager/mod.rs (3)

471-595: Event-root indexing and blockstore fallback path are sensible; confirm all relevant tipsets pass through here

The new logic in tipset_state_output and tipset_state_events gives a clear pipeline:

  • When computing state, you index each non‑None events_root to its TipsetKey.
  • When serving events, you first try the cache, then load from the events AMT in the blockstore (when an events_root is provided), and only fall back to recomputing state if that fails.

This is a good layering and matches how ChainGetEvents now resolves root → tipset. One thing to verify is that, for all tipsets whose events you expect clients to query by root, there is at least one code path that calls tipset_state_output (and thus writes these indices). If some validation or background paths only ever call apply_block_messages directly without going through tipset_state_output, their events_root mappings won’t be written and ChainGetEvents for those particular roots will correctly, but perhaps unexpectedly, return the “can't find tipset for events root …” error.


1898-2034: Receipt and events AMT construction/verification in apply_block_messages is well-structured

Within apply_block_messages, constructing the receipts root from the VM‑returned receipts and then, separately, materializing each non‑empty events AMT and asserting that its derived root matches the VM‑provided root gives you:

  • A receipts root consistent with what Receipt::get_receipts expects to read.
  • Events AMTs that are actually persisted, with an explicit equality check guarding against any divergence between how the VM computes roots and how the node rebuilds them (including bitwidth configuration).

The error path on an events‑root mismatch is clear and will surface early during tipset execution, which is appropriate for a consensus‑sensitive invariant.


1769-1819: validate_tipsets now bails cleanly on state mismatch

Switching validate_tipsets to use bail!("state mismatch") when the expected and actual state/receipt roots differ simplifies the error path without changing behavior, and the surrounding logging still records all the relevant details for debugging.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/state_manager/tests.rs (1)

437-625: Event AMT tests comprehensively cover duplicates, order, empties, and v3/v4 compatibility

The new helpers and tests look solid:

  • create_raw_event_v3/v4 generate deterministic RAW IPLD events, which is ideal for CID equality checks.
  • test_events_store_and_retrieve_basic, test_events_entries_are_preserved_when_duplicates_are_stored, and test_events_preserve_order directly exercise StampedEvent::get_events and confirm the “no dedup, preserve insertion order” behavior that this PR is aiming for.
  • test_events_same_content_same_cid and test_identical_events_produce_same_root together validate that identical event sequences yield identical AMT roots and remain readable through get_events.
  • test_events_empty_list and test_events_v3_store_and_retrieve nicely cover the empty‑root and v3‑compat paths.

One small, purely optional nit: test_events_same_content_same_cid and test_identical_events_produce_same_root overlap quite a bit; you could fold them into a single test that asserts both CID equality and successful retrieval to reduce duplication, but it's not required.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15e4e9 and b50f882.

📒 Files selected for processing (3)
  • scripts/tests/api_compare/filter-list (0 hunks)
  • scripts/tests/api_compare/filter-list-offline (0 hunks)
  • src/state_manager/tests.rs (4 hunks)
💤 Files with no reviewable changes (2)
  • scripts/tests/api_compare/filter-list-offline
  • scripts/tests/api_compare/filter-list
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_manager/tests.rs
🧬 Code graph analysis (1)
src/state_manager/tests.rs (1)
src/shim/executor.rs (4)
  • events_root (180-186)
  • event (383-388)
  • events (118-124)
  • get_events (392-414)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: All lint checks
🔇 Additional comments (2)
src/state_manager/tests.rs (2)

14-14: Importing both Amt and Amtv0 keeps tests aligned with runtime AMT variants

Using Amt for the new EVENTS_AMT_BITWIDTH‑backed event AMTs and Amtv0 for legacy roots (receipts / earlier tests) makes the intent explicit and avoids alias confusion; usage below is consistent with this split. No changes needed here, just noting the clarity.


351-352: Amtv0::new_from_iter correctly mirrors v0‑formatted roots in cache tests

Switching both the mock events_root and receipt_root construction to Amtv0::new_from_iter ensures these tests only pass when the blockstore actually contains a valid v0 AMT, matching how roots are stored/loaded in the state manager. This gives the cache tests more realistic coverage without overcomplicating them.

Also applies to: 388-389

@akaladarshi marked this pull request as draft on December 4, 2025 17:04
@akaladarshi force-pushed the akaladarshi/fix-get-events-api branch from b1ae8f4 to ad77eee on December 5, 2025 12:21
@akaladarshi marked this pull request as ready for review on December 5, 2025 12:48
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/state_manager/mod.rs (2)

511-528: tipset_state_events fast path currently poisons the events cache with partial data

When events_root is provided, the new logic:

  • Checks the cache via get_events_or_else.
  • Inside the supplier closure, tries to load only the events for that root via StampedEvent::get_events, and if found, returns a StateEvents containing a single Vec<StampedEvent> and a single Some(root).

Because this StateEvents is then cached under the tipset key, later callers that expect a full tipset view (e.g., collect_events with events_root = None) will:

  • Pull this truncated StateEvents from the cache.
  • Hit the new length check in collect_events (messages.len() == events.len()), and fail with a length mismatch.

This can manifest whenever Filecoin.ChainGetEvents is called for a tipset before any filter-based event queries.

A safer pattern would be:

  • Use the blockstore fast path outside of get_events_or_else and do not insert that partial StateEvents into the cache, or
  • Restrict the cache to always store the full per-tipset events/roots produced by compute_tipset_state, regardless of whether an events_root hint is present.

For example, you could restructure tipset_state_events roughly as:

@@
-        let key = tipset.key();
-        let ts = tipset.clone();
-        let this = Arc::clone(self);
-        let cids = tipset.cids();
-        let events_root = events_root.cloned();
-        self.receipt_event_cache_handler
-            .get_events_or_else(
-                key,
-                Box::new(move || {
-                    Box::pin(async move {
-                        // Try to load events directly from the blockstore
-                        if let Some(stamped_events) = events_root
-                            .as_ref()
-                            .and_then(|root| StampedEvent::get_events(this.blockstore(), root).ok())
-                            .filter(|events| !events.is_empty())
-                        {
-                            return Ok(StateEvents {
-                                events: vec![stamped_events],
-                                roots: vec![events_root],
-                            });
-                        }
-
-                        // Fallback: compute the tipset state if events not found in the blockstore
-                        let state_out = this
-                            .compute_tipset_state(ts, NO_CALLBACK, VMTrace::NotTraced)
-                            .await?;
-                        trace!("Completed tipset state calculation {:?}", cids);
-                        Ok(StateEvents {
-                            events: state_out.events,
-                            roots: state_out.events_roots,
-                        })
-                    })
-                }),
-            )
-            .await
+        let events_root = events_root.cloned();
+
+        // Fast path: if a specific events root is requested and the AMT exists,
+        // serve it directly without mutating the per‑tipset cache.
+        if let Some(root) = events_root.as_ref() {
+            if let Ok(stamped_events) = StampedEvent::get_events(self.blockstore(), root) {
+                if !stamped_events.is_empty() {
+                    return Ok(StateEvents {
+                        events: vec![stamped_events],
+                        roots: vec![events_root],
+                    });
+                }
+            }
+        }
+
+        let key = tipset.key();
+        let ts = tipset.clone();
+        let this = Arc::clone(self);
+        let cids = tipset.cids();
+        self.receipt_event_cache_handler
+            .get_events_or_else(
+                key,
+                Box::new(move || {
+                    Box::pin(async move {
+                        let state_out = this
+                            .compute_tipset_state(ts, NO_CALLBACK, VMTrace::NotTraced)
+                            .await?;
+                        trace!("Completed tipset state calculation {:?}", cids);
+                        Ok(StateEvents {
+                            events: state_out.events,
+                            roots: state_out.events_roots,
+                        })
+                    })
+                }),
+            )
+            .await

This keeps the cache invariant (“cached StateEvents represent the full tipset”) while still leveraging the AMT-backed fast path for ChainGetEvents.

Also applies to: 555-595


199-205: Verify that event index insertion is not bypassed in future state computation paths

tipset_state_output indexes events roots via put_index after computing state, but this indexing only occurs in two paths:

  • Main RPC path via tipset_state_output (line 497)
  • Backfill utility via process_ts in db_util.rs

Direct calls to compute_tipset_state or apply_block_messages outside these two paths—such as in execution_trace, validation, or tool commands—do not trigger indexing. While current production paths (sync via tipset_syncer, RPC via tipset_state) route through tipset_state_output properly, future code that calls these functions directly risks creating persisted state without index entries, breaking ChainGetEvents resolution.

Consider either consolidating all state computations through tipset_state_output, or moving the put_index call to the point where events roots are produced (e.g., in apply_block_messages after vm.flush()), to guarantee indexing regardless of caller.

🧹 Nitpick comments (3)
src/rpc/methods/eth/filter/mod.rs (1)

281-291: Stricter message/events length check is appropriate

The added ensure! with concrete lengths will fail fast instead of silently zipping mismatched messages and events, which is much safer for debugging and prevents misaligned event attribution.

src/state_manager/tests.rs (1)

437-625: Event AMT tests give good coverage of storage, order, duplicates, and v3/v4 handling

The new helpers (create_raw_event_v3/v4) and tests comprehensively cover:

  • Basic store/retrieve via EVENTS_AMT_BITWIDTH and StampedEvent::get_events.
  • Duplicate events (consecutive and non‑consecutive) preserving count and position.
  • Strict order preservation across multiple entries.
  • Identical content producing identical root CIDs.
  • Empty-event lists round‑tripping correctly.
  • V3-encoded events still being readable through StampedEvent::get_events.

The slight overlap between test_events_same_content_same_cid and test_identical_events_produce_same_root is acceptable and keeps intent clear.

src/state_manager/mod.rs (1)

1899-2035: Receipt and events AMT construction in apply_block_messages is consistent and well-guarded

The updated apply_block_messages pipeline:

  • Uses Amtv0::new_from_iter to build receipt_root from the receipts vector, aligning with the tests that now construct receipts using Amtv0.
  • Iterates (events, events_roots) and, when an events_root is present, reconstructs the AMT via Amt::new_from_iter_with_bit_width and EVENTS_AMT_BITWIDTH, then ensure!s that the derived root matches the FVM-provided root.

That ensure! gives a clear, early failure if the bitwidth or serialization diverge from what the VM produced, which is valuable for catching subtle incompatibilities at state-application time.

As long as EVENTS_AMT_BITWIDTH stays coordinated with the bitwidth used when events roots are produced, this looks sound.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b50f882 and ad77eee.

📒 Files selected for processing (7)
  • src/chain/store/chain_store.rs (1 hunks)
  • src/rpc/methods/chain.rs (1 hunks)
  • src/rpc/methods/eth/filter/mod.rs (3 hunks)
  • src/shim/executor.rs (3 hunks)
  • src/state_manager/mod.rs (4 hunks)
  • src/state_manager/tests.rs (4 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/tool/subcommands/api_cmd/test_snapshots.txt
  • src/shim/executor.rs
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 5923
File: src/rpc/registry/actors/miner.rs:221-223
Timestamp: 2025-09-02T10:05:34.350Z
Learning: For miner actor ChangeOwnerAddress and ChangeOwnerAddressExported methods: versions 8-10 use bare Address as parameter type, while versions 11+ use ChangeOwnerAddressParams. This reflects the evolution of the Filecoin miner actor parameter structures across versions.
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_manager/tests.rs
  • src/state_manager/mod.rs
📚 Learning: 2025-08-08T12:11:55.266Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

  • src/rpc/methods/chain.rs
🧬 Code graph analysis (2)
src/chain/store/chain_store.rs (2)
src/shim/executor.rs (1)
  • key (305-310)
src/blocks/tipset.rs (2)
  • key (337-340)
  • key (512-515)
src/state_manager/mod.rs (2)
src/shim/executor.rs (5)
  • events_root (180-186)
  • key (305-310)
  • new (254-266)
  • get_events (392-414)
  • events (118-124)
src/state_manager/cache.rs (3)
  • get_events (152-152)
  • get_events (206-208)
  • get_events (263-265)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
🔇 Additional comments (5)
src/chain/store/chain_store.rs (1)

201-208: Method rename to clarify events-root index usage looks good

Renaming to get_tipset_key_by_events_root better reflects how the indices store is used here; behavior remains a straight passthrough to indices.read_obj and is consistent with the ChainGetEvents call site.

src/rpc/methods/chain.rs (1)

248-265: ChainGetEvents now resolves tipset explicitly via events-root index

Switching to get_tipset_key_by_events_root with the updated error context accurately describes the failure mode while keeping the rest of the flow (loading the tipset and collecting events) unchanged.

src/rpc/methods/eth/filter/mod.rs (1)

369-403: ChainGetEvents now preserves duplicate events and order as stored in AMT

The new ensure! on state_events.roots.len() == state_events.events.len() plus removal of any deduplication ensures:

  • You only proceed when roots and event vectors are structurally consistent.
  • All events under the selected root (including intentional duplicates) are returned in AMT order.

This aligns with the stated goal of matching Lotus’ behavior for ChainGetEvents.
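
The structural check described above can be illustrated with a minimal, std-only Rust sketch (illustrative only, not Forest code; `pair_checked` is a hypothetical helper): `zip` silently stops at the shorter iterator, so a length mismatch must be rejected explicitly before pairing.

```rust
// `zip` yields pairs only up to the shorter side, so mismatched inputs
// pass silently unless checked up front.
fn pair_checked<A, B>(xs: Vec<A>, ys: Vec<B>) -> Result<Vec<(A, B)>, String> {
    if xs.len() != ys.len() {
        return Err(format!(
            "events count ({}) and events_roots count ({}) mismatch",
            xs.len(),
            ys.len()
        ));
    }
    Ok(xs.into_iter().zip(ys).collect())
}

fn main() {
    // Silent truncation: zipping 3 items with 2 yields only 2 pairs.
    let truncated: Vec<_> = [1, 2, 3].into_iter().zip(["a", "b"]).collect();
    assert_eq!(truncated.len(), 2);

    // The explicit check surfaces the same mismatch as an error instead.
    assert!(pair_checked(vec![1, 2, 3], vec!["a", "b"]).is_err());
    assert!(pair_checked(vec![1, 2], vec!["a", "b"]).is_ok());
    println!("ok");
}
```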

src/state_manager/tests.rs (1)

340-407: Receipt/events cache tests align with new AMTv0 usage

Using Amtv0::new_from_iter to build receipt_root and the mock events_root matches the production change to use Amtv0 for receipts while treating event roots as opaque Cids in the cache. The tests correctly assert that empty events do not populate the cache and that a valid AMT-backed receipt_root yields one cached receipt.

src/state_manager/mod.rs (1)

93-105: StateEvents layout and sizing helpers look consistent

Making StateEvents public with events: Vec<Vec<StampedEvent>> and roots: Vec<Option<Cid>>, plus the GetSize annotations, matches how it’s consumed by the cache layer and RPC code. The explicit heap-size helpers for nested vectors should keep cache accounting accurate.

@akaladarshi akaladarshi marked this pull request as draft December 5, 2025 14:32
@akaladarshi akaladarshi force-pushed the akaladarshi/fix-get-events-api branch 2 times, most recently from 51357fe to 35ac348 on December 9, 2025 at 14:35
@akaladarshi
Collaborator Author

akaladarshi commented Dec 9, 2025

The DB migration check in CI is failing because I've removed the Indices column from the DB, so I need to create a new migration, or maybe revert to the old schema; I'm not sure yet.

I was thinking it would be helpful to get some comments on the PR to finalise it before the migration 🙏 @hanabi1224 @LesnyRumcajs

@akaladarshi akaladarshi marked this pull request as ready for review December 9, 2025 15:13
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/state_manager/mod.rs (1)

1985-2006: Consider adding a length check before the zip operation.

The zip at line 1989 will silently truncate if events and events_roots have different lengths. While apply_block_messages should return consistent lengths, adding an explicit check would catch bugs early.

         // step 5: construct receipt root from receipts
         let receipt_root = Amtv0::new_from_iter(chain_index.db(), receipts)?;

         // step 6: store events AMTs in the blockstore
+        ensure!(
+            events.len() == events_roots.len(),
+            "Events count ({}) and events_roots count ({}) mismatch",
+            events.len(),
+            events_roots.len()
+        );
         for (msg_events, events_root) in events.iter().zip(events_roots.iter()) {
src/rpc/methods/eth/filter/mod.rs (1)

368-383: LGTM on get_events_by_event_root implementation.

The function correctly loads events via StampedEvent::get_events from the blockstore without deduplication, aligning with Lotus behavior. The error message at line 377 provides useful context.

Consider simplifying the error handling pattern:

-    let state_events =
-        match StampedEvent::get_events(ctx.chain_store().blockstore(), events_root) {
-            Ok(e) => e,
-            Err(e) => {
-                return Err(anyhow::anyhow!("load events amt: {}", e));
-            }
-        };
+    let state_events = StampedEvent::get_events(ctx.chain_store().blockstore(), events_root)
+        .context("load events amt")?;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad77eee and 35ac348.

📒 Files selected for processing (22)
  • src/benchmark_private/tipset_validation.rs (0 hunks)
  • src/chain/store/chain_store.rs (3 hunks)
  • src/chain_sync/chain_follower.rs (0 hunks)
  • src/daemon/context.rs (0 hunks)
  • src/daemon/db_util.rs (1 hunks)
  • src/db/car/many.rs (1 hunks)
  • src/db/memory.rs (1 hunks)
  • src/db/mod.rs (0 hunks)
  • src/db/parity_db.rs (1 hunks)
  • src/libp2p/chain_exchange/provider.rs (0 hunks)
  • src/rpc/methods/chain.rs (1 hunks)
  • src/rpc/methods/eth/filter/mod.rs (3 hunks)
  • src/rpc/methods/sync.rs (1 hunks)
  • src/state_manager/mod.rs (4 hunks)
  • src/state_manager/tests.rs (4 hunks)
  • src/tool/offline_server/server.rs (0 hunks)
  • src/tool/subcommands/api_cmd/api_compare_tests.rs (0 hunks)
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1 hunks)
  • src/tool/subcommands/api_cmd/test_snapshot.rs (2 hunks)
  • src/tool/subcommands/api_cmd/test_snapshots.txt (1 hunks)
  • src/tool/subcommands/index_cmd.rs (0 hunks)
  • src/tool/subcommands/state_compute_cmd.rs (0 hunks)
💤 Files with no reviewable changes (9)
  • src/tool/offline_server/server.rs
  • src/libp2p/chain_exchange/provider.rs
  • src/tool/subcommands/state_compute_cmd.rs
  • src/daemon/context.rs
  • src/db/mod.rs
  • src/tool/subcommands/index_cmd.rs
  • src/benchmark_private/tipset_validation.rs
  • src/tool/subcommands/api_cmd/api_compare_tests.rs
  • src/chain_sync/chain_follower.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/tool/subcommands/api_cmd/test_snapshots.txt
  • src/state_manager/tests.rs
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-10-17T14:24:47.046Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.

Applied to files:

  • src/db/memory.rs
  • src/daemon/db_util.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
  • src/tool/subcommands/api_cmd/test_snapshot.rs
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_manager/mod.rs
  • src/db/car/many.rs
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs
📚 Learning: 2025-08-08T12:11:55.266Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

  • src/rpc/methods/chain.rs
  • src/daemon/db_util.rs
  • src/rpc/methods/eth/filter/mod.rs
  • src/tool/subcommands/api_cmd/test_snapshot.rs
📚 Learning: 2025-08-25T13:35:24.230Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5969
File: src/tool/subcommands/snapshot_cmd.rs:412-412
Timestamp: 2025-08-25T13:35:24.230Z
Learning: In src/tool/subcommands/snapshot_cmd.rs, the +1 in `last_epoch = ts.epoch() - epochs as i64 + 1` fixes an off-by-1 bug where specifying --check-stateroots=N would validate N+1 epochs instead of N epochs, causing out-of-bounds errors when the snapshot contains only N recent state roots.

Applied to files:

  • src/tool/subcommands/api_cmd/test_snapshot.rs
📚 Learning: 2025-08-11T14:00:47.060Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5886
File: src/daemon/db_util.rs:236-259
Timestamp: 2025-08-11T14:00:47.060Z
Learning: In Forest's snapshot import (`src/daemon/db_util.rs`), when F3 CID is present in snapshot metadata but the actual F3 data is missing, this should cause a hard error as it indicates snapshot corruption. Only errors during the F3 import process itself (after data is successfully retrieved) should be non-fatal and logged.

Applied to files:

  • src/tool/subcommands/api_cmd/test_snapshot.rs
🧬 Code graph analysis (8)
src/db/memory.rs (3)
src/chain/store/chain_store.rs (1)
  • db (503-503)
src/chain/store/index.rs (1)
  • db (48-50)
src/rpc/methods/state.rs (1)
  • db (1443-1444)
src/rpc/methods/chain.rs (3)
src/rpc/methods/eth/filter/mod.rs (1)
  • get_events_by_event_root (369-383)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
  • ctx (99-151)
src/tool/subcommands/api_cmd/test_snapshot.rs (1)
  • ctx (122-169)
src/db/car/many.rs (3)
src/db/memory.rs (2)
  • SettingsStoreExt (32-32)
  • SettingsStoreExt (156-156)
src/db/parity_db.rs (1)
  • SettingsStoreExt (177-177)
src/chain/mod.rs (1)
  • SettingsStoreExt (45-45)
src/rpc/methods/sync.rs (1)
src/chain/store/chain_store.rs (4)
  • new (117-142)
  • new (567-571)
  • db (503-503)
  • chain_config (247-249)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (5)
src/db/car/many.rs (1)
  • SettingsStoreExt (256-256)
src/db/memory.rs (2)
  • SettingsStoreExt (32-32)
  • SettingsStoreExt (156-156)
src/db/parity_db.rs (1)
  • SettingsStoreExt (177-177)
src/chain/mod.rs (1)
  • SettingsStoreExt (45-45)
src/db/mod.rs (1)
  • open_db (283-285)
src/chain/store/chain_store.rs (3)
src/chain/store/index.rs (2)
  • db (48-50)
  • new (42-46)
src/state_manager/mod.rs (2)
  • new (181-183)
  • chain_config (359-361)
src/daemon/context.rs (1)
  • chain_config (65-67)
src/rpc/methods/eth/filter/mod.rs (1)
src/shim/executor.rs (3)
  • events (118-124)
  • events_root (180-186)
  • get_events (392-414)
src/tool/subcommands/api_cmd/test_snapshot.rs (1)
src/chain/store/chain_store.rs (6)
  • db (503-503)
  • eth_mappings (180-182)
  • eth_mappings (196-198)
  • new (117-142)
  • new (567-571)
  • chain_config (247-249)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
🔇 Additional comments (25)
src/db/car/many.rs (1)

14-15: LGTM - Import cleanup aligns with IndicesStore removal.

The consolidated import statement correctly removes IndicesStore while retaining all other necessary traits for ManyCar functionality.

src/db/memory.rs (1)

6-6: LGTM - Clean removal of IndicesStore import.

The import change correctly reflects the removal of the IndicesStore trait implementation from MemoryDB.

src/chain/store/chain_store.rs (3)

9-9: LGTM - Import cleanup consistent with IndicesStore removal.

The import correctly removes IndicesStore and IndicesStoreExt while retaining EthMappingsStore and EthMappingsStoreExt which are still used for Ethereum mapping functionality.


743-744: Test correctly updated for new ChainStore::new signature.

The test now passes 5 arguments matching the updated constructor: (db, heaviest_tipset_key_provider, eth_mappings, chain_config, genesis_block_header).


758-758: LGTM - Consistent with other test updates.

src/rpc/methods/sync.rs (1)

189-191: LGTM - Test correctly updated for new ChainStore::new signature.

The ctx() function now constructs ChainStore with the updated 5-argument signature, consistent with the removal of the IndicesStore parameter.

src/tool/subcommands/api_cmd/test_snapshot.rs (2)

55-64: Simplified backfill logic using let-chains.

The refactored function cleanly chains the conditionals. Note that try_write() will silently skip the backfill if the lock is already held elsewhere. This is likely acceptable for test scenarios but worth being aware of.


134-140: Improved error handling with ? operator.

Good change - using ? instead of unwrap() properly propagates errors through the anyhow::Result return type.

src/daemon/db_util.rs (1)

332-334: Event root indexing removed from tipset processing.

The compute_tipset_state result is no longer unpacked to extract and index events_roots. This aligns with the PR objective of serving events directly from the blockstore rather than maintaining a separate index. Events are now stored as part of state computation and retrieved via StampedEvent::get_events when needed.

src/rpc/methods/chain.rs (2)

254-256: Simplified event retrieval aligned with Lotus behavior.

The handler now directly fetches events from the blockstore via EthEventHandler::get_events_by_event_root. This removes the previous tipset-resolution flow and duplicate-event logic that caused issue #6271. If events aren't present in the blockstore, the call will error out with an AMT load error, matching Lotus's behavior.


1610-1618: Test helper correctly updated for new ChainStore signature.

The _load helper now constructs ChainStore with the updated 5-argument signature, using separate MemoryDB instances for heaviest_tipset_key_provider and eth_mappings.

src/tool/subcommands/api_cmd/generate_test_snapshot.rs (3)

13-14: LGTM on import updates.

The imports are correctly updated to remove IndicesStore while retaining the necessary EthMappingsStore, SettingsStore, and related types for the tracking store functionality.


83-97: LGTM on simplified build_index function.

The function now only builds the eth_mappings index from the tracker, which aligns with the removal of the indices store column. The logic correctly returns None when no mappings are present.


112-121: Verify ChainStore::new signature consistency.

The constructor takes 5 parameters: db (Arc), heaviest_tipset_key_provider (Arc<dyn HeaviestTipsetKeyProvider + Sync + Send>), eth_mappings (Arc<dyn EthMappingsStore + Sync + Send>), chain_config, and genesis_block_header. All call sites consistently pass the same db instance for the first three parameters, which is valid since the concrete db type implements all required traits.

src/db/parity_db.rs (3)

4-4: LGTM on import simplification.

The IndicesStore import is correctly removed, leaving only the necessary store traits.


23-42: LGTM on DbColumn enum update.

The Indices variant is cleanly removed. The remaining columns (GraphDagCborBlake2b256, GraphFull, Settings, EthMappings, PersistentGraph) are properly ordered and documented.


44-81: LGTM on create_column_options update.

The match arms correctly handle all remaining column variants with appropriate options (preimage, btree_index, compression settings).

src/state_manager/mod.rs (6)

64-64: LGTM on import update.

Adding ensure from anyhow is appropriate for the new validation logic in step 6.


99-105: LGTM on StateEvents struct.

The struct appropriately holds paired events and roots vectors with proper derives for debugging, cloning, and memory tracking.


550-576: LGTM on tipset_state_events fallback logic.

The method now properly falls back to computing the tipset state when events are not found in cache, returning a StateEvents struct with both events and roots.


1992-2003: Good defensive check for AMT root consistency.

The ensure! macro correctly validates that the derived event root matches the FVM-computed root. This is essential for detecting any discrepancies in event serialization or AMT construction.


94-94: EVENTS_AMT_BITWIDTH value is verified against FVM at runtime.

The bitwidth of 5 is already validated through the equality check at lines 2000-2001, where the derived event root is compared against the FVM-computed root. If this constant were misaligned with FVM's event AMT construction, the ensure!() check would fail. No additional verification needed.
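
The recompute-and-compare pattern behind that ensure! can be sketched in std-only Rust (a toy stand-in, not Forest code: a std hasher replaces the real AMT root computation, and `derive_root`/`check_root` are hypothetical names).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Rebuild a commitment from the raw parts; stands in for deriving the
// events AMT root from the stored events.
fn derive_root(events: &[&str]) -> u64 {
    let mut h = DefaultHasher::new();
    for e in events {
        e.hash(&mut h);
    }
    h.finish()
}

// Require the derived value to equal what the other component reported;
// a mismatch indicates a serialization or construction discrepancy.
fn check_root(events: &[&str], reported: u64) -> Result<(), String> {
    let derived = derive_root(events);
    if derived != reported {
        return Err(format!(
            "event root mismatch: derived {derived}, reported {reported}"
        ));
    }
    Ok(())
}

fn main() {
    let events = ["evt-a", "evt-b", "evt-b"]; // duplicates participate too
    let reported = derive_root(&events);
    assert!(check_root(&events, reported).is_ok());
    assert!(check_root(&["evt-a"], reported).is_err());
    println!("ok");
}
```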


71-71: Confirm the intentionality of using Amtv0 for receipt roots and Amt for event roots.

The code uses Amtv0::new_from_iter for receipt construction and Amt::new_from_iter_with_bit_width for events. While Amtv0 is documented as the legacy AMT encoding (for backward compatibility) and Amt is the current encoding, the codebase lacks explicit documentation explaining why receipts specifically require Amtv0. Consider adding a comment referencing the Lotus behavior or actor specification that mandates this version distinction.

src/rpc/methods/eth/filter/mod.rs (2)

40-40: LGTM on StampedEvent import.

The import is correctly added to support the new get_events_by_event_root implementation that loads events directly via StampedEvent::get_events.


283-290: LGTM on improved error messages.

The error message now includes the concrete lengths for better debugging when there's a mismatch between messages and events.

@LesnyRumcajs
Member

The DB migration check in CI is failing because I've removed the Indices column from the DB, so I need to create a new migration, or maybe revert to the old schema; I'm not sure yet.

I was thinking it would be helpful to get some comments on the PR to finalise it before the migration 🙏 @hanabi1224 @LesnyRumcajs

Before moving forward without the indices column, I'd like to first see some numbers showing the performance impact for affected methods.

match StampedEvent::get_events(ctx.chain_store().blockstore(), events_root) {
Ok(e) => e,
Err(e) => {
return Err(anyhow::anyhow!("load events amt: {}", e));
Member

nit

Suggested change
return Err(anyhow::anyhow!("load events amt: {}", e));
bail!("load events amt: {e}");

Member

Actually, probably let state_events = bla.context("failed getting events")?; would make more sense.

Collaborator Author

@akaladarshi akaladarshi Dec 10, 2025

The current error doesn't make much sense, but I was trying to match the Lotus API-level error, hence this.

) -> anyhow::Result<Vec<StampedEvent>> {
let mut events = Vec::new();

// Try StampedEvent_v4 first (StampedEvent_v4 and StampedEvent_v3 are identical, use v4 here)
Member

are there other versions as well?

Collaborator Author

No, for events there are only two versions, V3 and V4:

pub enum StampedEvent {


// Try StampedEvent_v4 first (StampedEvent_v4 and StampedEvent_v3 are identical, use v4 here)
if let Ok(amt) = Amt::<StampedEvent_v4, _>::load(events_root, db) {
amt.for_each(|_, event| {
Member

for_each_cacheless is a better option yielding faster iteration if the AMT is not iterated multiple times.

Collaborator Author

Done.

// step 5: construct receipt root from receipts and flush the state-tree
let receipt_root = Amt::new_from_iter(chain_index.db(), receipts)?;
// step 5: construct receipt root from receipts
let receipt_root = Amtv0::new_from_iter(chain_index.db(), receipts)?;
Member

why v0?

Collaborator Author

Earlier it was V0 only, but we renamed it to Amt. It was being used like this:

use fil_actors_shared::fvm_ipld_amt::Amtv0 as Amt;

// step 5: construct receipt root from receipts
let receipt_root = Amtv0::new_from_iter(chain_index.db(), receipts)?;

// step 6: store events AMTs in the blockstore
Member

is there corresponding Lotus logic?

Collaborator Author

Yeah, Lotus has similar logic here.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/shim/executor.rs (1)

390-414: get_events implementation matches goals (v4‑first, v3 fallback, ordered, no dedup)

The new StampedEvent::get_events correctly:

  • Loads from the events AMT root via Amt<StampedEvent_v4> first, then cleanly falls back to Amt<StampedEvent_v3> on error.
  • Uses for_each_cacheless, which is appropriate for single‑pass iteration and preserves AMT index order, so events are returned in the stored order with duplicates intact (aligning with the new ChainGetEvents behavior).

If you expect very large event sets, you could optionally pre‑allocate events with events.reserve(amt.count() as usize) before iteration for a minor perf win, but that’s not required for correctness.
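
As a rough illustration of the version-fallback shape (std-only; `decode_v4`/`decode_v3` are hypothetical stand-ins for the two AMT loads, not the actual StampedEvent code):

```rust
// Decode a "v4" payload: a leading version byte of 4, then the events.
fn decode_v4(raw: &[u8]) -> Result<Vec<u32>, String> {
    if raw.first() == Some(&4) {
        Ok(raw[1..].iter().map(|b| *b as u32).collect())
    } else {
        Err("not a v4 payload".into())
    }
}

// Decode the legacy "v3" payload: version byte of 3, then the events.
fn decode_v3(raw: &[u8]) -> Result<Vec<u32>, String> {
    if raw.first() == Some(&3) {
        Ok(raw[1..].iter().map(|b| *b as u32).collect())
    } else {
        Err("not a v3 payload".into())
    }
}

// Try the current decoding first; fall back to the legacy one on error.
// Order and duplicates are preserved as stored.
fn get_events(raw: &[u8]) -> Result<Vec<u32>, String> {
    decode_v4(raw).or_else(|_| decode_v3(raw))
}

fn main() {
    assert_eq!(get_events(&[4, 7, 7, 9]), Ok(vec![7, 7, 9])); // duplicates kept
    assert_eq!(get_events(&[3, 1, 2]), Ok(vec![1, 2])); // fallback path
    assert!(get_events(&[9]).is_err()); // neither version matches
    println!("ok");
}
```

The fallback via `or_else` keeps the first successful decoding, mirroring the v4-first, v3-second behavior described above.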

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35ac348 and 1dd279f.

📒 Files selected for processing (1)
  • src/shim/executor.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-17T15:21:40.753Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.

Applied to files:

  • src/shim/executor.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: tests-release
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: tests
  • GitHub Check: All lint checks
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: Build MacOS
🔇 Additional comments (2)
src/shim/executor.rs (2)

10-10: Amt import aligns with new events AMT usage

Pulling in Amt alongside Amtv0 is consistent with using the new AMT API for events while keeping receipts on Amtv0. Please just double‑check that the version of Amt here matches how events are actually stored and how Lotus reads them, so ChainGetEvents parity is guaranteed end‑to‑end.


340-346: Untagged StampedEvent serialization looks appropriate for RPC parity

Adding #[serde(untagged)] to StampedEvent matches how Receipt is handled and should make the serialized shape match the underlying FVM event types (and Lotus) instead of introducing an enum wrapper. Since we only derive Serialize, this shouldn’t introduce deserialization ambiguity. Please confirm that no external clients relied on the previously tagged JSON shape of StampedEvent (if it was ever exposed directly).

@akaladarshi akaladarshi force-pushed the akaladarshi/fix-get-events-api branch from 1dd279f to d3f0c50 on January 12, 2026 at 13:37
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/daemon/db_util.rs (1)

371-378: Stale documentation: Events root → TipsetKey mapping no longer exists.

The doc comment mentions "Events root [Cid] -> [TipsetKey]" mapping, but this PR removes that functionality. The comment should be updated to reflect that only the Ethereum RPC API mappings are populated.

📝 Suggested fix
-/// To support the Event RPC API, a new column has been added to parity-db to handle the mapping:
-/// - Events root [`Cid`] -> [`TipsetKey`].
-///
-/// Similarly, to support the Ethereum RPC API, another column has been introduced to map:
+/// To support the Ethereum RPC API, columns exist to map:
 /// - [`struct@EthHash`] -> [`TipsetKey`],
 /// - [`struct@EthHash`] -> Delegated message [`Cid`].
 ///
 /// This function traverses the chain store and populates these columns accordingly.
🤖 Fix all issues with AI agents
In @src/db/migration/v0_31_0.rs, around lines 112-144: the EthMappings migration (DbColumn::EthMappings) stores the raw TipsetKey bytes (val.value) as the value instead of the expected serialized (Cid, u64) tuple, which corrupts data and breaks get_mapping and get_message_cids. To fix:

  • Decode the TipsetKey with fvm_ipld_encoding::from_slice and extract the Cid via TipsetKey::cid().
  • Convert the Cid to an EthHash for the key.
  • Serialize the (cid, timestamp_u64) tuple with fvm_ipld_encoding::to_vec and pass those serialized bytes into new_db.db.commit_changes (via Db::set_operation) instead of val.value, preserving the existing error-handling flow around tsk/cid/res and bubbling errors.
  • Add a unit test that runs the migration on a test DB containing an old EthMappings entry and asserts that the migrated value deserializes to (Cid, u64) and that get_mapping and get_message_cids return the expected results.
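
The shape of that fix, re-encoding each value into the form readers expect rather than copying stored bytes through verbatim, can be sketched in std-only Rust (toy text encodings stand in for fvm_ipld_encoding and the (Cid, u64) tuple; all names here are hypothetical):

```rust
// Decode the old stored form ("<id>|<timestamp>") into its parts.
fn decode_old(value: &str) -> Result<(String, u64), String> {
    let (id, ts) = value.split_once('|').ok_or("malformed old value")?;
    let ts = ts.parse::<u64>().map_err(|e| e.to_string())?;
    Ok((id.to_string(), ts))
}

// Re-serialize into the shape the readers expect.
fn encode_new(id: &str, ts: u64) -> String {
    format!("{{\"cid\":\"{id}\",\"ts\":{ts}}}")
}

// Migrate one entry: decode, then re-encode, bubbling errors instead of
// silently writing the old bytes through.
fn migrate_value(old: &str) -> Result<String, String> {
    let (id, ts) = decode_old(old)?;
    Ok(encode_new(&id, ts))
}

fn main() {
    assert_eq!(
        migrate_value("bafy|42").unwrap(),
        "{\"cid\":\"bafy\",\"ts\":42}"
    );
    assert!(migrate_value("no-separator").is_err());
    println!("ok");
}
```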
🧹 Nitpick comments (2)
src/db/migration/v0_31_0.rs (2)

163-252: Consider extracting shared column options logic to reduce duplication.

The paritydb_0_26_0 and paritydb_0_31_0 modules have nearly identical create_column_options implementations. While duplication in migration code is often acceptable for stability (avoiding changes to old migration logic), consider whether a shared helper could reduce maintenance burden for future migrations.

This is a minor observation — keeping migrations self-contained is also a valid design choice for reproducibility.

Also applies to: 254-336


30-161: Missing unit tests for migration logic.

This migration file lacks test coverage. Given that database migrations are critical operations that can lead to data loss if incorrect, consider adding tests that:

  1. Create a mock v0.26.0 database with sample data
  2. Run the migration
  3. Verify data integrity in the resulting v0.31.0 database

The existing migration tests in migration_map.rs test the migration chain logic but not the actual data transformation.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1dd279f and d3f0c50.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • Cargo.toml
  • src/chain/store/chain_store.rs
  • src/chain_sync/chain_follower.rs
  • src/daemon/context.rs
  • src/daemon/db_util.rs
  • src/db/car/many.rs
  • src/db/memory.rs
  • src/db/migration/migration_map.rs
  • src/db/migration/mod.rs
  • src/db/migration/v0_31_0.rs
  • src/db/mod.rs
  • src/db/parity_db.rs
  • src/libp2p/chain_exchange/provider.rs
💤 Files with no reviewable changes (4)
  • src/daemon/context.rs
  • src/chain_sync/chain_follower.rs
  • src/db/mod.rs
  • src/libp2p/chain_exchange/provider.rs
✅ Files skipped from review due to trivial changes (1)
  • Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/db/memory.rs
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/db/car/many.rs
📚 Learning: 2026-01-05T12:54:40.850Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Applied to files:

  • src/db/car/many.rs
  • src/daemon/db_util.rs
  • src/db/migration/mod.rs
  • src/db/migration/migration_map.rs
  • src/chain/store/chain_store.rs
  • src/db/parity_db.rs
  • src/db/migration/v0_31_0.rs
📚 Learning: 2026-01-05T12:56:13.802Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Applied to files:

  • src/db/car/many.rs
  • src/daemon/db_util.rs
  • src/db/migration/mod.rs
  • src/db/migration/migration_map.rs
  • src/chain/store/chain_store.rs
  • src/db/parity_db.rs
  • src/db/migration/v0_31_0.rs
📚 Learning: 2025-10-17T14:24:47.046Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6167
File: src/tool/subcommands/state_compute_cmd.rs:89-91
Timestamp: 2025-10-17T14:24:47.046Z
Learning: In `src/tool/subcommands/state_compute_cmd.rs`, when using `ReadOpsTrackingStore` to generate minimal snapshots, `HEAD_KEY` should be written to `db.tracker` (not `db` itself) before calling `export_forest_car()`, because the export reads from the tracker MemoryDB which accumulates only the accessed data during computation.

Applied to files:

  • src/daemon/db_util.rs
📚 Learning: 2025-08-08T12:11:55.266Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

  • src/daemon/db_util.rs
🧬 Code graph analysis (2)
src/db/car/many.rs (3)
src/db/memory.rs (2)
  • SettingsStoreExt (32-32)
  • SettingsStoreExt (156-156)
src/db/parity_db.rs (1)
  • SettingsStoreExt (177-177)
src/chain/mod.rs (1)
  • SettingsStoreExt (45-45)
src/db/migration/v0_31_0.rs (2)
src/db/migration/migration_map.rs (12)
  • new (26-28)
  • new (321-326)
  • new (491-496)
  • from (30-30)
  • from (328-330)
  • from (498-500)
  • to (32-32)
  • to (332-334)
  • to (502-504)
  • migrate_core (60-60)
  • migrate_core (309-315)
  • migrate_core (481-489)
src/db/parity_db.rs (1)
  • set_operation (365-367)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: rubocop
  • GitHub Check: All lint checks
  • GitHub Check: Coverage
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
🔇 Additional comments (8)
src/chain/store/chain_store.rs (3)

9-9: LGTM!

The import correctly removes IndicesStore dependency, aligning with the PR objective to eliminate the events root → tipset key mapping from the database.


743-744: LGTM!

Test correctly updated to use the new ChainStore::new signature. The MemoryDB implements all required traits (Blockstore, HeaviestTipsetKeyProvider, EthMappingsStore), making this a valid test setup.


758-758: LGTM!

Consistent with the other test update.

src/daemon/db_util.rs (1)

342-344: LGTM!

The change correctly removes events_roots indexing. Events are now stored directly in the blockstore during FVM execution, eliminating the need for a separate index.

src/db/car/many.rs (1)

14-16: LGTM!

The import correctly removes IndicesStore from the module. The SettingsStore and EthMappingsStore implementations remain intact, and this change aligns with the PR objective to remove the indices store column from the database.

src/db/migration/migration_map.rs (1)

13-13: LGTM! Migration entry follows established patterns.

The import and migration mapping are correctly added. The comment clarifies that v0.30.0 uses the same schema as v0.26.0, which explains why Migration0_26_0_0_31_0 is reused for the 0.30.0 → 0.31.0 path.

Also applies to: 159-160

src/db/migration/mod.rs (1)

8-8: LGTM!

Module declaration follows the established pattern and ordering.

src/db/parity_db.rs (1)

4-4: LGTM! Indices column correctly removed from schema.

The Indices column was at position 5 (the last column), so removing it doesn't affect the indices of the remaining columns. The import and enum changes are consistent with the migration.

Also applies to: 21-42

Comment on lines +91 to +111
DbColumn::GraphDagCborBlake2b256 | DbColumn::PersistentGraph => {
db.db.iter_column_while(col as u8, |val| {
let hash = MultihashCode::Blake2b256.digest(&val.value);
let cid = Cid::new_v1(DAG_CBOR, hash);
res = new_db
.db
.commit_changes([Db::set_operation(
col as u8,
cid.to_bytes(),
val.value,
)])
.context("failed to commit");

if res.is_err() {
return false;
}

true
})?;
res?;
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for PersistentGraph usage and how data is stored in this column
rg -n "PersistentGraph" --type rust -A 3 -B 3 | head -100

Repository: ChainSafe/forest

Length of output: 5798


🏁 Script executed:

#!/bin/bash
# Look for where data is written to PersistentGraph column to understand the format
rg -n "put.*[Pp]ersistent" --type rust -A 2 | head -80

Repository: ChainSafe/forest

Length of output: 2537


🏁 Script executed:

#!/bin/bash
# Check how GraphDagCborBlake2b256 is populated to compare patterns
rg -n "GraphDagCborBlake2b256" --type rust -A 3 -B 3 | head -100

Repository: ChainSafe/forest

Length of output: 6403


🏁 Script executed:

#!/bin/bash
# Check what data types are stored in PersistentGraph - specifically manifests
rg -n "manifest|Manifest" --type rust src/daemon/bundle.rs -B 5 -A 5 | head -100

Repository: ChainSafe/forest

Length of output: 1558


🏁 Script executed:

#!/bin/bash
# Check the actual CID construction for data stored in PersistentGraph
rg -n "put_keyed_persistent|cid\|block" --type rust src/daemon/bundle.rs -B 5 -A 2 | head -80

Repository: ChainSafe/forest

Length of output: 694


🏁 Script executed:

#!/bin/bash
# Look at the snapshot import logic for any clues about CID format for persistent data
rg -n "F3|manifest" --type rust src/daemon/ -l | head -20

Repository: ChainSafe/forest

Length of output: 120


🏁 Script executed:

#!/bin/bash
# Check ActorBundleInfo definition to see the CID codec/hash for manifests
rg -n "ActorBundleInfo" --type rust -A 10 | head -60

Repository: ChainSafe/forest

Length of output: 4368


🏁 Script executed:

#!/bin/bash
# Look for how CIDs are created when writing to PersistentGraph - check CAR stream handling
rg -n "CarBlock|CarStream" --type rust src/daemon/bundle.rs -B 3 -A 3

Repository: ChainSafe/forest

Length of output: 1123


Migration assumption about PersistentGraph CID format requires validation.

The migration reconstructs CIDs for PersistentGraph data using hardcoded DAG_CBOR codec and Blake2b256 hash. However, PersistentGraph accepts arbitrary IPLD data via pre-computed CIDs from external sources (actor bundle CAR files). While actor bundles almost certainly use DAG_CBOR + Blake2b256, the code doesn't enforce or validate this assumption. If any data with a different codec or hash was stored in PersistentGraph, the migration would silently reconstruct incorrect CID keys, corrupting the data. Consider adding validation during migration to detect and reject data that doesn't match the expected format, rather than silently reconstructing potentially wrong CIDs.
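The validate-before-migrate idea can be sketched with a mock digest. This is a hypothetical illustration only: std's `DefaultHasher` stands in for Blake2b256, and `u64` keys stand in for CID bytes; Forest's actual types and key layout are not reproduced here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for `MultihashCode::Blake2b256.digest()`: any deterministic digest
// works to illustrate the validation pattern.
fn digest(value: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

// Recompute the key from the value and reject records whose stored key does
// not match, instead of silently overwriting it with a reconstructed one.
fn validate_record(stored_key: u64, value: &[u8]) -> Result<u64, String> {
    let recomputed = digest(value);
    if recomputed == stored_key {
        Ok(recomputed)
    } else {
        Err(format!(
            "key mismatch: stored {stored_key:#x}, recomputed {recomputed:#x}"
        ))
    }
}

fn main() {
    let value = b"block-bytes";
    let good_key = digest(value);
    assert!(validate_record(good_key, value).is_ok());
    assert!(validate_record(good_key ^ 1, value).is_err());
    println!("validation sketch ok");
}
```

A real implementation would compare the record's original key against the freshly computed CID bytes and abort (or log and skip) on mismatch, rather than committing a possibly wrong key.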

Comment on lines +112 to +144
DbColumn::EthMappings => {
db.db.iter_column_while(col as u8, |val| {
let tsk: Result<TipsetKey, fvm_ipld_encoding::Error> =
fvm_ipld_encoding::from_slice(&val.value);
if tsk.is_err() {
res = Err(tsk.context("serde error").unwrap_err());
return false;
}
let cid = tsk.unwrap().cid();

if cid.is_err() {
res = Err(cid.context("serde error").unwrap_err());
return false;
}

let hash: EthHash = cid.unwrap().into();
res = new_db
.db
.commit_changes([Db::set_operation(
col as u8,
hash.0.as_bytes().to_vec(),
val.value,
)])
.context("failed to commit");

if res.is_err() {
return false;
}

true
})?;
res?;
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the schema definitions in the migration file
fd v0_31_0.rs --type f -x cat -n {} | head -300

Repository: ChainSafe/forest

Length of output: 13093


🏁 Script executed:

# Get the rest of the v0_31_0.rs file to see the complete new schema
fd v0_31_0.rs --type f -x tail -n +300 {}

Repository: ChainSafe/forest

Length of output: 1163


🏁 Script executed:

# Search for EthMappings usage in the codebase to understand what data it stores
rg -n "EthMappings" --type rust -B 3 -A 3 | head -150

Repository: ChainSafe/forest

Length of output: 9223


🏁 Script executed:

# Search for EthMappingsStore write operations to understand what data is actually stored
rg -n "write.*EthMappings|EthMappings.*write" --type rust -B 2 -A 2 | head -200

Repository: ChainSafe/forest

Length of output: 2651


🏁 Script executed:

# Search for message_cids references to understand the (Cid, u64) format
rg -n "message_cids|MessageCid" --type rust -B 2 -A 2 | head -150

Repository: ChainSafe/forest

Length of output: 7314


🏁 Script executed:

# Check git history of parity_db.rs around EthMappings to see if schema changed
git log --oneline --all -- src/db/parity_db.rs | head -20

Repository: ChainSafe/forest

Length of output: 1128


🏁 Script executed:

# Look for old implementations of EthMappings in migration files
rg -n "EthMappings" src/db/migration/ --type rust -B 5 -A 5

Repository: ChainSafe/forest

Length of output: 11339


🏁 Script executed:

# Check the v0_26_0 migration file to see how EthMappings was handled there
fd v0_26_0.rs --type f -x cat -n {} | grep -A 30 "DbColumn::EthMappings"

Repository: ChainSafe/forest

Length of output: 4061


🏁 Script executed:

# Let's search for when the EthMappings schema changed from TipsetKey to (Cid, u64)
git log --all -p -- src/db/parity_db.rs | grep -B 10 -A 10 "Cid, u64" | head -100

Repository: ChainSafe/forest

Length of output: 1793


🏁 Script executed:

# Look for get_message_cids implementation history  
git log --oneline --all -- src/db/parity_db.rs src/db/migration/

Repository: ChainSafe/forest

Length of output: 2763


🏁 Script executed:

# Search for where EthMappings values are actually written to understand the current schema
rg -n "write.*EthHash|write_bin.*eth" --type rust -B 5 -A 5 | head -150

Repository: ChainSafe/forest

Length of output: 6176


🏁 Script executed:

# Search for where (Cid, u64) values are written to EthMappingsStore
rg -n "write_obj.*u64|write_bin" src/rpc/methods/eth.rs -B 5 -A 5 | head -200

Repository: ChainSafe/forest

Length of output: 42


🏁 Script executed:

# Look for EthGetMessageCidByTransactionHash implementation
rg -n "EthGetMessageCidByTransactionHash" src/rpc/methods/eth.rs -B 2 -A 30 | head -150

Repository: ChainSafe/forest

Length of output: 1606


🏁 Script executed:

# Search for where eth mappings are populated with message CIDs
rg -n "eth.*mapping|EthMappingsStore.*write" src/ --type rust -B 3 -A 3 | head -200

Repository: ChainSafe/forest

Length of output: 12145


🏁 Script executed:

# Search for get_mapping implementation to see what it returns
rg -n "fn get_mapping" src/ --type rust -B 2 -A 10

Repository: ChainSafe/forest

Length of output: 1019


🏁 Script executed:

# Search for EthMappingCollector to understand what it's supposed to write
rg -n "EthMappingCollector" src/ --type rust -B 2 -A 15 | head -150

Repository: ChainSafe/forest

Length of output: 4277


🏁 Script executed:

# Check if there are any tests for the migration
rg -n "test.*migration|#\[test\]" src/db/migration/v0_31_0.rs --type rust

Repository: ChainSafe/forest

Length of output: 42


The EthMappings migration stores the wrong data type and will corrupt migrated data.

The migration correctly transforms the key from the old format to EthHash, but it stores TipsetKey bytes as the value instead of serialized (Cid, u64) tuples. This causes:

  1. get_mapping() (chain_store.rs:198) expects to deserialize values as (Cid, u64) but will get TipsetKey bytes, returning None silently
  2. get_message_cids() (parity_db.rs:207) uses if let Ok(...) which silently ignores deserialization failures
  3. All Ethereum mappings from migrated databases will be effectively lost

The migration should extract the CID from the TipsetKey and serialize it with a timestamp as (Cid, u64) before storing:

Expected fix pattern (similar to GraphDagCborBlake2b256 migration)
DbColumn::EthMappings => {
    db.db.iter_column_while(col as u8, |val| {
        let tsk: Result<TipsetKey, _> = fvm_ipld_encoding::from_slice(&val.value);
        if tsk.is_err() { /* handle error */ }
        let cid = tsk.unwrap().cid();
        if cid.is_err() { /* handle error */ }
        
        let hash: EthHash = cid.unwrap().into();
        // Serialize (Cid, u64) instead of storing val.value
        let value = fvm_ipld_encoding::to_vec(&(cid.unwrap(), timestamp))?;
        res = new_db.db.commit_changes([Db::set_operation(col as u8, hash.0.as_bytes().to_vec(), value)])...
    })
}

Additionally, no unit tests exist for this migration, allowing this corruption to go undetected.

🤖 Prompt for AI Agents
In @src/db/migration/v0_31_0.rs around lines 112 - 144, The EthMappings
migration (DbColumn::EthMappings) is storing TipsetKey bytes (val.value) as the
value instead of the expected serialized (Cid, u64) tuple, which corrupts data
and breaks get_mapping and get_message_cids; fix by decoding TipsetKey with
fvm_ipld_encoding::from_slice, extracting the Cid via TipsetKey::cid(), convert
Cid -> EthHash for the key, then serialize the tuple (cid, timestamp_u64) with
fvm_ipld_encoding::to_vec and pass that serialized bytes into
new_db.db.commit_changes (via Db::set_operation) instead of val.value,
preserving the existing error handling flow around tsk/cid/res and bubbling
errors; add a unit test that runs the migration on a test DB containing an old
EthMappings entry and asserts the migrated value deserializes to (Cid, u64) and
that get_mapping/get_message_cids return the expected results.

@akaladarshi akaladarshi marked this pull request as draft January 12, 2026 13:42
@codecov

codecov bot commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 17.13147% with 208 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.35%. Comparing base (2c87fa1) to head (a3fa8f1).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/db/migration/v0_31_0.rs | 2.57% | 189 Missing ⚠️ |
| src/rpc/methods/eth/filter/mod.rs | 45.45% | 5 Missing and 1 partial ⚠️ |
| src/shim/executor.rs | 64.70% | 5 Missing and 1 partial ⚠️ |
| src/state_manager/mod.rs | 72.72% | 1 Missing and 2 partials ⚠️ |
| src/tool/subcommands/api_cmd/test_snapshot.rs | 83.33% | 1 Missing and 1 partial ⚠️ |
| src/daemon/db_util.rs | 0.00% | 1 Missing ⚠️ |
| src/rpc/methods/chain.rs | 0.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
| Files with missing lines | Coverage Δ |
|---|---|
| src/chain/store/chain_store.rs | 67.70% <100.00%> (-1.02%) ⬇️ |
| src/chain_sync/chain_follower.rs | 35.71% <ø> (-0.10%) ⬇️ |
| src/daemon/context.rs | 64.97% <ø> (-0.17%) ⬇️ |
| src/db/car/many.rs | 67.24% <ø> (+0.02%) ⬆️ |
| src/db/memory.rs | 86.40% <ø> (+0.99%) ⬆️ |
| src/db/migration/migration_map.rs | 88.95% <ø> (ø) |
| src/db/mod.rs | 41.66% <ø> (+2.72%) ⬆️ |
| src/db/parity_db.rs | 69.75% <ø> (+1.73%) ⬆️ |
| src/libp2p/chain_exchange/provider.rs | 83.45% <ø> (-0.12%) ⬇️ |
| src/rpc/methods/sync.rs | 71.83% <100.00%> (-1.33%) ⬇️ |
... and 12 more

... and 4 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c87fa1...a3fa8f1. Read the comment docs.


@akaladarshi akaladarshi marked this pull request as ready for review January 12, 2026 17:08

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/db/migration/v0_31_0.rs:
- Around line 144-152: The default branch currently commits each record
individually using new_db.db.commit_changes([Db::set_operation(col as u8, key,
value)]), causing the same per-record performance issue; change it to collect a
batch of Db::set_operation entries (same batch size and batching logic used in
the other branches), then call new_db.db.commit_changes once per batch (and once
for any remaining ops) while iterating the iterator (iter) so commits are
batched instead of per-record.
- Around line 91-109: The current loop in iter_column_while calls
new_db.db.commit_changes for every record (inside iter_column_while), causing
extreme slowdown; change iter_column_while usage to collect Db::set_operation
items into a Vec (e.g., ops) and call new_db.db.commit_changes(ops_slice) only
when the batch size threshold (e.g., 10_000) is reached and once more after the
loop to flush remaining ops; ensure you propagate errors by mapping
commit_changes().context("failed to commit") to res and return false from the
iterator on error (so the ? outside still works), clear the ops Vec after each
successful commit, and keep using MultihashCode::Blake2b256.digest, Cid::new_v1
and cid.to_bytes() to build the Db::set_operation entries.
🧹 Nitpick comments (3)
src/db/migration/v0_31_0.rs (3)

111-143: Simplify error handling with ? operator or pattern matching.

The is_err() + unwrap() pattern is convoluted and error-prone. Consider restructuring to use idiomatic Rust error handling.

♻️ Suggested refactor for cleaner error handling
 DbColumn::EthMappings => {
     db.db.iter_column_while(col as u8, |val| {
-        let tsk: Result<TipsetKey, fvm_ipld_encoding::Error> =
-            fvm_ipld_encoding::from_slice(&val.value);
-        if tsk.is_err() {
-            res = Err(tsk.context("serde error").unwrap_err());
-            return false;
-        }
-        let cid = tsk.unwrap().cid();
-
-        if cid.is_err() {
-            res = Err(cid.context("serde error").unwrap_err());
-            return false;
-        }
-
-        let hash: EthHash = cid.unwrap().into();
+        let result = (|| -> anyhow::Result<()> {
+            let tsk: TipsetKey = fvm_ipld_encoding::from_slice(&val.value)
+                .context("failed to deserialize TipsetKey")?;
+            let cid = tsk.cid().context("failed to get CID from TipsetKey")?;
+            let hash: EthHash = cid.into();
+            new_db.db.commit_changes([Db::set_operation(
+                col as u8,
+                hash.0.as_bytes().to_vec(),
+                val.value,
+            )]).context("failed to commit")?;
+            Ok(())
+        })();
+        
+        match result {
+            Ok(()) => true,
+            Err(e) => {
+                res = Err(e);
+                false
+            }
+        }
-        res = new_db
-            .db
-            .commit_changes([Db::set_operation(
-                col as u8,
-                hash.0.as_bytes().to_vec(),
-                val.value,
-            )])
-            .context("failed to commit");
-
-        if res.is_err() {
-            return false;
-        }
-
-        true
     })?;
     res?;
 }

162-251: LGTM!

The old schema module correctly defines the v0.26.0 database structure with the Indices column.

Minor: Line 201 has a typo "othewise" → "otherwise" (also present in the v0.31.0 module at line 292).


80-85: Add explicit documentation or assertion to safeguard column index compatibility in migration logic.

The code correctly handles the migration by skipping the Indices column (which is removed) and copying columns 0-4 using their enum indices via col as u8. This works because these columns maintain the same order and meaning in both schemas. However, the implicit coupling between the two DbColumn enum definitions (old paritydb_0_26_0 and new paritydb_0_31_0) creates a maintenance risk: reordering columns in either version would silently break the migration.

Consider adding a compile-time assertion or a detailed comment (near lines 80-85) explicitly documenting that columns 0-4 must remain in the same order across both schema versions for the migration to function correctly.
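One way to express the column-order invariant is a `const` assertion over the two enums. The enum definitions below are illustrative miniatures of the old and new schema modules (the middle variant names are assumptions, not Forest's exact ones); the point is that a reorder in either version fails to compile.

```rust
// Hypothetical miniatures of the two schema enums; names mirror the migration
// modules but the definitions here are illustrative only.
#[allow(dead_code)]
mod paritydb_old {
    #[derive(Clone, Copy)]
    pub enum DbColumn {
        GraphDagCborBlake2b256,
        GraphFull,
        Settings,
        EthMappings,
        PersistentGraph,
        Indices, // removed in the new schema
    }
}

#[allow(dead_code)]
mod paritydb_new {
    #[derive(Clone, Copy)]
    pub enum DbColumn {
        GraphDagCborBlake2b256,
        GraphFull,
        Settings,
        EthMappings,
        PersistentGraph,
    }
}

// Compile-time check: the shared columns must keep identical discriminants,
// otherwise `col as u8` would copy data into the wrong column.
const _: () = {
    assert!(
        paritydb_old::DbColumn::EthMappings as u8 == paritydb_new::DbColumn::EthMappings as u8
    );
    assert!(
        paritydb_old::DbColumn::PersistentGraph as u8
            == paritydb_new::DbColumn::PersistentGraph as u8
    );
};

fn main() {
    println!("column layouts agree");
}
```

Since `assert!` in a `const` block panics at compile time, a schema reorder would be caught before the migration binary is ever built.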

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3f0c50 and 2acca34.

📒 Files selected for processing (1)
  • src/db/migration/v0_31_0.rs
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-11T14:00:47.060Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5886
File: src/daemon/db_util.rs:236-259
Timestamp: 2025-08-11T14:00:47.060Z
Learning: In Forest's snapshot import (`src/daemon/db_util.rs`), when F3 CID is present in snapshot metadata but the actual F3 data is missing, this should cause a hard error as it indicates snapshot corruption. Only errors during the F3 import process itself (after data is successfully retrieved) should be non-fatal and logged.

Applied to files:

  • src/db/migration/v0_31_0.rs
📚 Learning: 2026-01-05T12:54:40.850Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Applied to files:

  • src/db/migration/v0_31_0.rs
📚 Learning: 2026-01-05T12:56:13.802Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Applied to files:

  • src/db/migration/v0_31_0.rs
🧬 Code graph analysis (1)
src/db/migration/v0_31_0.rs (2)
src/db/migration/migration_map.rs (12)
  • new (26-28)
  • new (321-326)
  • new (491-496)
  • from (30-30)
  • from (328-330)
  • from (498-500)
  • to (32-32)
  • to (332-334)
  • to (502-504)
  • migrate_core (60-60)
  • migrate_core (309-315)
  • migrate_core (481-489)
src/db/parity_db.rs (1)
  • set_operation (365-367)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: Coverage
🔇 Additional comments (4)
src/db/migration/v0_31_0.rs (4)

1-28: LGTM!

Imports are appropriate for the migration functionality. The module documentation clearly explains the purpose of the migration.


29-50: LGTM!

The migration struct correctly implements the MigrationOperation trait methods.


55-72: LGTM!

The CAR database copy logic correctly handles the case where the directory may not exist in older versions.


253-335: LGTM!

The new schema module correctly removes the Indices column while preserving column indices 0-4 for the remaining columns.

Comment on lines +91 to +109
db.db.iter_column_while(col as u8, |val| {
let hash = MultihashCode::Blake2b256.digest(&val.value);
let cid = Cid::new_v1(DAG_CBOR, hash);
res = new_db
.db
.commit_changes([Db::set_operation(
col as u8,
cid.to_bytes(),
val.value,
)])
.context("failed to commit");

if res.is_err() {
return false;
}

true
})?;
res?;

⚠️ Potential issue | 🟠 Major

Performance: Per-record commits will be extremely slow for large databases.

Each record triggers a separate commit_changes call, which involves disk I/O synchronization. For databases with millions of records, this will make the migration prohibitively slow.

Consider batching commits (e.g., every 10,000 records) to significantly improve performance.

♻️ Suggested approach for batching commits
+const BATCH_SIZE: usize = 10_000;
+
 match col {
     DbColumn::GraphDagCborBlake2b256 | DbColumn::PersistentGraph => {
+        let mut batch = Vec::with_capacity(BATCH_SIZE);
         db.db.iter_column_while(col as u8, |val| {
             let hash = MultihashCode::Blake2b256.digest(&val.value);
             let cid = Cid::new_v1(DAG_CBOR, hash);
-            res = new_db
-                .db
-                .commit_changes([Db::set_operation(
-                    col as u8,
-                    cid.to_bytes(),
-                    val.value,
-                )])
-                .context("failed to commit");
-
-            if res.is_err() {
-                return false;
+            batch.push(Db::set_operation(col as u8, cid.to_bytes(), val.value));
+            if batch.len() >= BATCH_SIZE {
+                res = new_db.db.commit_changes(batch.drain(..)).context("failed to commit");
+                if res.is_err() {
+                    return false;
+                }
             }
-
             true
         })?;
+        // Commit remaining items
+        if !batch.is_empty() {
+            new_db.db.commit_changes(batch).context("failed to commit")?;
+        }
         res?;
     }
🤖 Prompt for AI Agents
In @src/db/migration/v0_31_0.rs around lines 91 - 109, The current loop in
iter_column_while calls new_db.db.commit_changes for every record (inside
iter_column_while), causing extreme slowdown; change iter_column_while usage to
collect Db::set_operation items into a Vec (e.g., ops) and call
new_db.db.commit_changes(ops_slice) only when the batch size threshold (e.g.,
10_000) is reached and once more after the loop to flush remaining ops; ensure
you propagate errors by mapping commit_changes().context("failed to commit") to
res and return false from the iterator on error (so the ? outside still works),
clear the ops Vec after each successful commit, and keep using
MultihashCode::Blake2b256.digest, Cid::new_v1 and cid.to_bytes() to build the
Db::set_operation entries.

Comment on lines +144 to +152
_ => {
let mut iter = db.db.iter(col as u8)?;
while let Some((key, value)) = iter.next()? {
new_db
.db
.commit_changes([Db::set_operation(col as u8, key, value)])
.context("failed to commit")?;
}
}

⚠️ Potential issue | 🟠 Major

Same per-record commit performance issue applies here.

Apply the same batching approach for the default column handling path.

🤖 Prompt for AI Agents
In @src/db/migration/v0_31_0.rs around lines 144 - 152, The default branch
currently commits each record individually using
new_db.db.commit_changes([Db::set_operation(col as u8, key, value)]), causing
the same per-record performance issue; change it to collect a batch of
Db::set_operation entries (same batch size and batching logic used in the other
branches), then call new_db.db.commit_changes once per batch (and once for any
remaining ops) while iterating the iterator (iter) so commits are batched
instead of per-record.

@LesnyRumcajs

@akaladarshi could you please mark @coderabbitai's comments as irrelevant or incorrect, or address them if they make sense?

@coderabbitai

coderabbitai bot commented Jan 13, 2026


@LesnyRumcajs, I'm here to help! I'm happy to clarify any of my review comments if needed, or provide additional context. @akaladarshi, feel free to let me know if you have questions about any of my suggestions or if you'd like me to re-evaluate any specific comments based on new information or context I may have missed.


🧠 Learnings used
Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 5907
File: src/rpc/methods/state.rs:523-570
Timestamp: 2025-08-06T15:44:33.467Z
Learning: LesnyRumcajs prefers to rely on BufWriter's Drop implementation for automatic flushing rather than explicit flush() calls in Forest codebase.

@akaladarshi

@akaladarshi could you please mark @coderabbitai comments as irrelevant, incorrect or address them if they make sense?

I was thinking we could address this in a separate PR, since the current migration code is taken from the previous one, so we would also have to update those earlier migrations as well. This PR is already getting quite big. WDYT?

@coderabbitai

coderabbitai bot commented Jan 13, 2026


@akaladarshi, splitting a large PR into smaller, manageable pieces can improve clarity and facilitate more thorough reviews, so your proposal makes sense. Given that this PR already touches many files across the codebase to remove the Indices column and add the migration logic, keeping the scope focused on that core functionality is reasonable.

Since the migration code patterns are inherited from previous migrations and would require updating multiple migration files to be consistent, deferring that cleanup to a separate PR is a pragmatic approach. This way:

  • The current PR stays focused on removing Indices and providing a working migration path
  • Reviewers can concentrate on verifying the core logic works correctly
  • Migration code improvements can be addressed systematically across all migration files in a dedicated follow-up

Would you like me to help create a GitHub issue to track the migration code improvements as a follow-up task? That way, the work won't be forgotten and can be prioritized appropriately by the team.

@akaladarshi

@coderabbitai Yes, create a new issue for this.

@coderabbitai

coderabbitai bot commented Jan 13, 2026


🧩 Analysis chain

@akaladarshi, I'll create a GitHub issue to track the migration code improvements.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for migration code improvements

gh issue create \
  --repo ChainSafe/forest \
  --title "Refactor and improve database migration code patterns" \
  --assignee akaladarshi \
  --label "technical-debt" \
  --body "## Context

As discussed in PR #6286 (https://github.com/ChainSafe/forest/pull/6286), the current migration code for v0.31.0 follows patterns inherited from previous migrations (e.g., v0.26.0). While this approach works, there are opportunities to improve the migration code quality and consistency across all migration files.

## Objectives

- Review and refactor migration code patterns across all existing migration files
- Improve error handling, logging, and progress reporting in migrations
- Ensure consistency in code style and structure across migration modules
- Update helper functions and utilities used in migrations
- Add or improve documentation for migration patterns

## Scope

This is a follow-up to PR #6286 which focused on removing the Indices column and providing a working migration path. The improvements should be addressed systematically across all migration files in \`src/db/migration/\`.

## References

- Original PR: #6286
- Comment thread: https://github.com/ChainSafe/forest/pull/6286#discussion_r
- Requestor: @akaladarshi

## Notes

This work was deferred from PR #6286 to keep that PR focused on its core functionality (removing Indices column and adding migration logic) and to allow for a more thorough, systematic improvement of all migration code in a dedicated PR."

Length of output: 47


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for migration code improvements without label

gh issue create \
  --repo ChainSafe/forest \
  --title "Refactor and improve database migration code patterns" \
  --assignee akaladarshi \
  --body "## Context

As discussed in PR #6286 (https://github.com/ChainSafe/forest/pull/6286), the current migration code for v0.31.0 follows patterns inherited from previous migrations (e.g., v0.26.0). While this approach works, there are opportunities to improve the migration code quality and consistency across all migration files.

## Objectives

- Review and refactor migration code patterns across all existing migration files
- Improve error handling, logging, and progress reporting in migrations
- Ensure consistency in code style and structure across migration modules
- Update helper functions and utilities used in migrations
- Add or improve documentation for migration patterns

## Scope

This is a follow-up to PR #6286 which focused on removing the Indices column and providing a working migration path. The improvements should be addressed systematically across all migration files in \`src/db/migration/\`.

## References

- Original PR: #6286
- Comment thread: https://github.com/ChainSafe/forest/pull/6286
- Requestor: @akaladarshi

## Notes

This work was deferred from PR #6286 to keep that PR focused on its core functionality (removing Indices column and adding migration logic) and to allow for a more thorough, systematic improvement of all migration code in a dedicated PR."

Length of output: 47


@akaladarshi, I've successfully created issue #6413 to track the migration code improvements.

The issue has been assigned to you and includes the context, objectives, and scope described above.

This will ensure the migration code improvements aren't forgotten and can be prioritized appropriately by the team.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.md (1)

44-55: Changelog: consider referencing issue #6271 and keep link formatting consistent.

Since this change closes an issue, consider using the issue reference (or including both the issue and the PR link), and add a colon after the link for consistency with nearby entries.

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2acca34 and 488e5c3.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • scripts/tests/api_compare/filter-list
  • scripts/tests/api_compare/filter-list-offline
  • src/rpc/methods/chain.rs
💤 Files with no reviewable changes (2)
  • scripts/tests/api_compare/filter-list
  • scripts/tests/api_compare/filter-list-offline
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-05T12:54:40.850Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Applied to files:

  • src/rpc/methods/chain.rs
📚 Learning: 2026-01-05T12:56:13.802Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Applied to files:

  • src/rpc/methods/chain.rs
🧬 Code graph analysis (1)
src/rpc/methods/chain.rs (3)
src/rpc/methods/eth/filter/mod.rs (1)
  • get_events_by_event_root (369-383)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (1)
  • ctx (99-151)
src/tool/subcommands/api_cmd/test_snapshot.rs (1)
  • ctx (122-169)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: All lint checks
  • GitHub Check: Coverage
  • GitHub Check: rubocop
🔇 Additional comments (1)
src/rpc/methods/chain.rs (1)

242-261: Add root CID to error context (helps parity debugging & ops).

Right now the error path depends on downstream formatting and may omit root_cid. Adding local context makes failures much easier to diagnose.

Proposed tweak
 async fn handle(
     ctx: Ctx<impl Blockstore + Send + Sync + 'static>,
     (root_cid,): Self::Params,
 ) -> Result<Self::Ok, ServerError> {
-    let events = EthEventHandler::get_events_by_event_root(&ctx, &root_cid)?;
+    let events = EthEventHandler::get_events_by_event_root(&ctx, &root_cid)
+        .with_context(|| format!("failed to load events for events_root={root_cid}"))?;
     Ok(events)
 }


@LesnyRumcajs
Member

Please ensure all @coderabbitai comments are closed.

@coderabbitai
Contributor

coderabbitai bot commented Jan 13, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@LesnyRumcajs, I'll check the status of all my review comments on this PR and help close any that have been addressed or are no longer relevant.


🏁 Script executed:

#!/bin/bash
# List all review comments made by coderabbitai on PR #6286
gh pr view 6286 --repo ChainSafe/forest --json reviews,comments | jq '
  {
    "review_comments": [
      .reviews[]? | 
      select(.author.login == "coderabbitai" or .author.login == "coderabbitai[bot]") | 
      {
        id: .id,
        state: .state,
        body: .body,
        createdAt: .createdAt
      }
    ],
    "comments": [
      .comments[]? | 
      select(.author.login == "coderabbitai" or .author.login == "coderabbitai[bot]") | 
      {
        id: .id,
        body: .body,
        createdAt: .createdAt
      }
    ]
  }
'

# Also get review threads to see unresolved comments
gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/ChainSafe/forest/pulls/6286/comments \
  --jq '.[] | select(.user.login == "coderabbitai" or .user.login == "coderabbitai[bot]") | {id: .id, path: .path, line: .line, body: .body[0:200], created_at: .created_at, updated_at: .updated_at, in_reply_to_id: .in_reply_to_id}'

Length of output: 122208


@LesnyRumcajs, I'll check the resolution status of the review comments to give you a complete picture.


🏁 Script executed:

#!/bin/bash
# Get detailed review thread information including resolution status
gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/ChainSafe/forest/pulls/6286/comments \
  --jq '.[] | select(.user.login == "coderabbitai" or .user.login == "coderabbitai[bot]") | {
    id: .id,
    path: .path,
    line: .line,
    created_at: .created_at,
    body_preview: .body[0:100],
    in_reply_to_id: .in_reply_to_id
  }' | jq -s 'group_by(.path) | map({
    path: .[0].path,
    comments: . | length,
    thread_starts: [.[] | select(.in_reply_to_id == null)] | length
  })'

# Also check for any unresolved conversation threads
gh pr view 6286 --repo ChainSafe/forest --json reviewThreads --jq '
  .reviewThreads | 
  map(select(.isResolved == false)) | 
  map({
    path: .path,
    line: .line,
    isResolved: .isResolved,
    comments: .comments | length
  })
'

Length of output: 864


🏁 Script executed:

#!/bin/bash
# Get more detailed information about each comment thread
echo "=== CodeRabbit Review Comments on PR #6286 ==="
echo ""

# Get all review comments with their IDs and check for replies
gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/ChainSafe/forest/pulls/6286/comments \
  --jq '.[] | select(.user.login == "coderabbitai" or .user.login == "coderabbitai[bot]")' | jq -s '. | sort_by(.created_at) | .[] | {
    id: .id,
    path: .path,
    line: .line,
    created: .created_at,
    updated: .updated_at,
    in_reply_to: .in_reply_to_id,
    severity: (if (.body | contains("🔴 Critical")) then "Critical" elif (.body | contains("🟠 Major")) then "Major" elif (.body | contains("🟡 Minor")) then "Minor" else "Info" end),
    first_100_chars: .body[0:100]
  }'

echo ""
echo "=== Checking for replies to these comments ==="
# Get all comments and see if any are replies to coderabbitai comments
gh api \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  /repos/ChainSafe/forest/pulls/6286/comments \
  --jq '[.[] | {id: .id, in_reply_to: .in_reply_to_id, user: .user.login}] | 
        map(select(.in_reply_to != null)) | 
        group_by(.in_reply_to) | 
        map({parent_id: .[0].in_reply_to, reply_count: length, repliers: [.[] | .user] | unique})'

Length of output: 2105


@LesnyRumcajs, I found 5 unresolved CodeRabbit review comments on this PR:

Summary by Severity:

🔴 Critical (2 comments) - src/db/migration/v0_31_0.rs

  • Line 110: Created Jan 12, 2026
  • Line 143: Created Jan 12, 2026

🟠 Major (2 comments) - src/db/migration/v0_31_0.rs

  • Line 109: "Performance: Per-record commits will be extremely slow for large..." (Created Jan 12, 2026)
  • Line 152: "Same per-record commit performance issue applies here..." (Created Jan 12, 2026)

🟡 Minor (1 comment) - src/state_manager/tests.rs

  • "Event AMT tests are solid; fix minor test name/comment inconsist..." (Created Dec 4, 2025)

Status:

None of these comments have received any replies, meaning they are all still open and unaddressed.

Recommendation:

Since 4 of the 5 comments are flagged as Critical or Major severity issues in the migration code (specifically around per-record commit performance), I recommend reviewing these before closing them. The performance issues in particular could cause significant problems during database migration.
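The per-record commit concern can be illustrated with a minimal sketch. `FakeStore`, `put`, and `commit` below are hypothetical stand-ins, not Forest's actual database API; the assumption is only that the commit/flush call is the expensive operation a migration should batch:

```rust
// Sketch of the batching recommendation. `FakeStore`, `put`, and `commit`
// are hypothetical stand-ins, not Forest's database API; `commit` is the
// call assumed to be expensive.
struct FakeStore {
    commits: usize,
    pending: Vec<(u64, Vec<u8>)>,
}

impl FakeStore {
    fn new() -> Self {
        FakeStore { commits: 0, pending: Vec::new() }
    }

    // Stage a record in memory; cheap.
    fn put(&mut self, key: u64, value: Vec<u8>) {
        self.pending.push((key, value));
    }

    // Flush staged records; the expensive call a migration should batch.
    fn commit(&mut self) {
        self.pending.clear();
        self.commits += 1;
    }
}

// Per-record commits: one expensive flush for every record migrated.
fn migrate_per_record(store: &mut FakeStore, records: &[(u64, Vec<u8>)]) {
    for (k, v) in records {
        store.put(*k, v.clone());
        store.commit();
    }
}

// Batched commits: flush every `batch_size` records, plus a final flush
// for any remainder.
fn migrate_batched(store: &mut FakeStore, records: &[(u64, Vec<u8>)], batch_size: usize) {
    for (i, (k, v)) in records.iter().enumerate() {
        store.put(*k, v.clone());
        if (i + 1) % batch_size == 0 {
            store.commit();
        }
    }
    if !store.pending.is_empty() {
        store.commit();
    }
}
```

On 1,000 records with a batch size of 100, the per-record path performs 1,000 commits while the batched path performs 10, which is the shape of the slowdown the Major comments describe.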

Would you like me to:

  1. Fetch the full content of these comments so you can review them?
  2. Analyze whether these are valid concerns that need addressing, or if they can be safely dismissed?
  3. Something else?

🧠 Learnings used
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6074
File: src/rpc/methods/chain.rs:55-56
Timestamp: 2025-09-17T11:32:44.185Z
Learning: In the Forest codebase, hanabi1224 prefers that CodeRabbit should not warn about potential compilation issues (such as Send bounds, async/await compatibility, etc.) since they are already guarded by CI tests. Focus should be on higher-level design and logic issues instead.

Learnt from: LesnyRumcajs
Repo: ChainSafe/forest PR: 5907
File: src/rpc/methods/state.rs:523-570
Timestamp: 2025-08-06T15:44:33.467Z
Learning: LesnyRumcajs prefers to rely on BufWriter's Drop implementation for automatic flushing rather than explicit flush() calls in Forest codebase.

LesnyRumcajs previously approved these changes Jan 13, 2026
@LesnyRumcajs LesnyRumcajs dismissed their stale review January 13, 2026 16:56

ooops, wrong PR

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
src/rpc/methods/eth/filter/mod.rs (1)

368-383: Consider simplifying the error handling pattern.

The function correctly loads events from the blockstore using StampedEvent::get_events, which aligns with the PR objective to serve events directly from the blockstore. The conversion to Vec<Event> via Into::into is clean.

However, the error handling can be simplified using the ? operator with context:

♻️ Suggested simplification
     pub fn get_events_by_event_root<DB: Blockstore + Send + Sync + 'static>(
         ctx: &Ctx<DB>,
         events_root: &Cid,
     ) -> anyhow::Result<Vec<Event>> {
-        let state_events =
-            match StampedEvent::get_events(ctx.chain_store().blockstore(), events_root) {
-                Ok(e) => e,
-                Err(e) => {
-                    return Err(anyhow::anyhow!("load events amt: {}", e));
-                }
-            };
+        let state_events = StampedEvent::get_events(ctx.chain_store().blockstore(), events_root)
+            .with_context(|| format!("load events amt for root {events_root}"))?;
 
         let chain_events: Vec<Event> = state_events.into_iter().map(Into::into).collect();
         Ok(chain_events)
     }

This provides consistent error handling with the rest of the codebase and includes the events root CID in the error context for easier debugging.
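The same error-wrapping pattern can be sketched without the `anyhow` crate. `FakeCid` and `load_events` below are hypothetical stand-ins for the real CID type and `StampedEvent::get_events`, with `map_err` playing the role of `with_context`:

```rust
use std::fmt;

// Hypothetical stand-in for a CID; the real type comes from the `cid` crate.
#[derive(Debug, Clone, Copy)]
struct FakeCid(u64);

impl fmt::Display for FakeCid {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "bafy-{:x}", self.0)
    }
}

// Hypothetical loader standing in for StampedEvent::get_events: fails when
// the root is unknown.
fn load_events(root: FakeCid) -> Result<Vec<String>, String> {
    if root.0 == 0 {
        Err("not found".to_string())
    } else {
        Ok(vec!["event".to_string()])
    }
}

// Mirrors the `with_context` suggestion using only std: the events root is
// folded into the error message so failures identify which root was missing.
fn get_events_by_root(root: FakeCid) -> Result<Vec<String>, String> {
    load_events(root).map_err(|e| format!("load events amt for root {root}: {e}"))
}
```

The point is that the caller's error now carries both the underlying failure and the root it was asked for, which is what makes blockstore misses diagnosable.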

📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 488e5c3 and a3fa8f1.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • CHANGELOG.md
  • Cargo.toml
  • src/rpc/methods/eth/filter/mod.rs
  • src/state_manager/mod.rs
  • src/tool/subcommands/api_cmd/test_snapshots.txt
  • src/tool/subcommands/index_cmd.rs
💤 Files with no reviewable changes (1)
  • src/tool/subcommands/index_cmd.rs
🚧 Files skipped from review as they are similar to previous changes (3)
  • Cargo.toml
  • CHANGELOG.md
  • src/tool/subcommands/api_cmd/test_snapshots.txt
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-09-11T16:03:14.328Z
Learnt from: akaladarshi
Repo: ChainSafe/forest PR: 6000
File: src/tool/subcommands/api_cmd/state_decode_params_tests/miner.rs:4-9
Timestamp: 2025-09-11T16:03:14.328Z
Learning: In the Forest codebase's state_decode_params_tests module, common imports like `to_vec`, `TokenAmount`, `Cid`, etc. are centralized in the mod.rs file and made available to submodules through `use super::*;`, eliminating the need for duplicate imports in individual test files.

Applied to files:

  • src/state_manager/mod.rs
📚 Learning: 2026-01-05T13:02:14.604Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/shim/actors/builtin/cron/mod.rs:47-47
Timestamp: 2026-01-05T13:02:14.604Z
Learning: The Entry enum in src/shim/actors/builtin/cron/mod.rs cannot derive Eq because it wraps external fil_actor_cron_state::vX::Entry types (v8-v17) that don't implement Eq. Only PartialEq can be derived for this enum due to this external dependency constraint.

Applied to files:

  • src/state_manager/mod.rs
  • src/rpc/methods/eth/filter/mod.rs
📚 Learning: 2026-01-05T12:54:40.850Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/cron_state.rs:8-8
Timestamp: 2026-01-05T12:54:40.850Z
Learning: In Rust code reviews, do not derive Eq for a struct if any field does not implement Eq (e.g., types from external dependencies). If a type like CronStateLotusJson includes fields wrapping external dependencies that lack Eq, derive PartialEq (or implement PartialEq manually) but avoid deriving Eq. This ensures comparisons compile and reflect actual equivalence semantics. When needed, consider implementing custom PartialEq (and possibly Eq) only after ensuring all fields (or wrappers) implement Eq, or keep PartialEq-only if full equality semantics cannot be expressed.

Applied to files:

  • src/state_manager/mod.rs
  • src/rpc/methods/eth/filter/mod.rs
📚 Learning: 2026-01-05T12:56:13.802Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 6381
File: src/lotus_json/actors/states/evm_state.rs:41-44
Timestamp: 2026-01-05T12:56:13.802Z
Learning: In Rust codebases (e.g., Forest), do not add #[cfg(test)] to functions already annotated with #[test]. The #[test] attribute ensures the function is compiled only for tests, so a separate #[cfg(test)] is redundant and can be removed if present. Apply this check to all Rust files that contain #[test] functions.

Applied to files:

  • src/state_manager/mod.rs
  • src/rpc/methods/eth/filter/mod.rs
📚 Learning: 2025-08-08T12:11:55.266Z
Learnt from: hanabi1224
Repo: ChainSafe/forest PR: 5867
File: src/ipld/util.rs:461-487
Timestamp: 2025-08-08T12:11:55.266Z
Learning: Forest (src/ipld/util.rs, Rust): In UnorderedChainStream::poll_next, dropping `extract_sender` (when no more tipsets and the extract queue is empty) is the intended shutdown signal for workers. Any subsequent attempt to enqueue work after this drop is a logic error and should be treated as an error; do not change `send()` to ignore a missing sender.

Applied to files:

  • src/rpc/methods/eth/filter/mod.rs
🧬 Code graph analysis (1)
src/rpc/methods/eth/filter/mod.rs (1)
src/shim/executor.rs (3)
  • events (118-124)
  • events_root (180-186)
  • get_events (392-414)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: Check
  • GitHub Check: rubocop
  • GitHub Check: tests-release
  • GitHub Check: All lint checks
  • GitHub Check: Coverage
🔇 Additional comments (8)
src/state_manager/mod.rs (6)

64-64: LGTM!

The ensure! macro import is correctly added to support the new verification check in apply_block_messages.


71-71: LGTM!

The import of both Amt and Amtv0 is correct. Amtv0 is used for receipts (legacy format) and Amt is used for events AMT with the configurable bit width.


93-93: LGTM!

The EVENTS_AMT_BITWIDTH constant with value 5 aligns with the FVM's standard bit width for events AMT. This ensures consistency with Lotus behavior.


98-104: LGTM!

The StateEvents struct cleanly encapsulates events data with appropriate derives and custom size calculation helpers.


562-570: LGTM!

The fallback path correctly computes the tipset state when events are not found in the blockstore, aligning with the PR objective to serve events directly from the blockstore.


2014-2034: Error handling and verification logic are sound—no changes needed.

The events AMT storage code correctly handles both failure scenarios. The EVENTS_AMT_BITWIDTH = 5 constant is already well-defined (src/state_manager/mod.rs:93) and validated consistently across tests and the codebase—the bitwidth verification request is unnecessary. The .map_err() provides descriptive context for storage failures, and the ensure!() check properly validates the derived root matches the FVM-computed root. The error messages are clear and sufficient for debugging. While adding error! level logging before the ensure check could serve as defensive instrumentation, it is optional given the explicit error message already provided.
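The shape of that ensure!() guard can be sketched with a toy root function. The real code derives a CID from an AMT built with EVENTS_AMT_BITWIDTH = 5; the stand-in below just hashes every event in order, and `compute_root` and `store_events` are hypothetical names:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy "root" computation standing in for building the events AMT: the real
// code uses EVENTS_AMT_BITWIDTH = 5 and compares CIDs, while this just
// hashes every event in order.
fn compute_root(events: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for event in events {
        event.hash(&mut hasher);
    }
    hasher.finish()
}

// Mirrors the ensure!() guard: storing events must reproduce the root the
// FVM reported, otherwise we error out instead of persisting bad data.
fn store_events(events: &[&str], expected_root: u64) -> Result<u64, String> {
    let derived = compute_root(events);
    if derived != expected_root {
        return Err(format!(
            "events root mismatch: derived {derived}, expected {expected_root}"
        ));
    }
    Ok(derived)
}
```

Because every entry contributes to the root in sequence, both duplicates and ordering affect it, which lines up with the requirement that stored events preserve exact order and duplicates.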

src/rpc/methods/eth/filter/mod.rs (2)

40-40: LGTM!

The StampedEvent import is correctly added to support the new get_events_by_event_root function.


283-290: LGTM!

Good improvement to the error message - including the actual lengths of messages and events helps with debugging when a mismatch occurs. The tipset_state_events call is correctly updated to match the new signature.


Labels

RPC: requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Faulty deduplication Filecoin.ChainGetEvents

3 participants