fix(consensus/XDPoS): fix results ordering and false unknown ancestor error in VerifyHeaders, XFN-12#2139
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR fixes a result-ordering bug in XDPoS.VerifyHeaders where splitting headers into v1/v2 buckets and running two concurrent goroutines caused nondeterministic result-to-header mapping around the v1→v2 switch boundary. This led to misleading "BAD BLOCK" log messages (e.g., a v1-style error attributed to a v2-height header) and sync failures (issue #2138).
Changes:
consensus/XDPoS/XDPoS.go: Replaced the two-bucket/two-goroutineVerifyHeadersimplementation with a single goroutine that iterates headers in input order, dispatching each to the appropriate engine version.consensus/tests/engine_v2_tests/adaptor_test.go: AddedTestAdaptorVerifyHeadersKeepsInputOrderAcrossConsensusSwitchto assert that results arrive in the same order as the input slice across the consensus switch.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
consensus/XDPoS/XDPoS.go |
Rewrites VerifyHeaders to a single sequential goroutine preserving input order |
consensus/tests/engine_v2_tests/adaptor_test.go |
New regression test for result ordering across the v1/v2 switch |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
16837f0 to
41a76ec
Compare
41a76ec to
5e4ff8a
Compare
5e4ff8a to
8bb288c
Compare
8bb288c to
b04895c
Compare
a14f15c to
eea3308
Compare
eea3308 to
ec982d4
Compare
b543a79 to
776237d
Compare
48b5c76 to
d0a24c7
Compare
d0a24c7 to
958aef2
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
unknown ancestor error and mixed v1/v2 VerifyHeaders result ordering, XFN-12
unknown ancestor error and mixed v1/v2 VerifyHeaders result ordering, XFN-12unknown ancestor error and VerifyHeaders result race, XFN-12
19e32b4 to
831dc3c
Compare
83741ef to
987b1cf
Compare
…XFN-12 Problem: When a VerifyHeaders batch crosses the v1/v2 boundary, EngineV1 and EngineV2 can write to the same results channel concurrently. The emitted order then depends on goroutine scheduling instead of a deterministic order expected by callers. Solution: Split v1 and v2 verification outputs into separate buffered channels, then forward them to the public results channel in a fixed sequence (all v1 results first, then all v2 results). Keep the single-engine fast paths unchanged. Impact: Mixed-version batches now produce stable and predictable result ordering, removing race-driven ordering nondeterminism. Behavior and performance characteristics for pure v1 or pure v2 batches remain unchanged. Validation: Added and passed mixed-boundary regression coverage to verify deterministic result ordering and to ensure no scheduling-dependent output order.
Problem: In mixed v1/v2 VerifyHeaders batches, the first v2 header can fail with unknown ancestor when its v1 parent exists only in the same input batch (not yet persisted), or when hash-based parent lookup is masked. Solution: Wrap mixed-path verification with verifyChainReader. The wrapper resolves GetHeader, GetHeaderByHash, GetHeaderByNumber, and GetBlock from in-batch headers first, then falls back to the underlying chain reader. Impact: Mixed-version verification now has deterministic ancestor visibility across the v1->v2 boundary, eliminating false unknown ancestor failures. Pure v1 and pure v2 verification paths remain unchanged. Validation: Added and passed regression coverage for batch shadowing, in-batch parent resolution, nil-chain safety, and mixed VerifyHeaders result flow.
987b1cf to
77bc576
Compare
…tch sync Problem: Batch header verification could return false unknown-ancestor errors when parent headers were present in the same in-memory batch but not yet persisted. This could propagate to downloader sync as invalid-chain failures and trigger peer drops. Solution: - Use verifyChainReader for all XDPoS VerifyHeaders paths (v1-only, v2-only, and mixed) so in-batch ancestors are visible consistently. - Add/extend regression coverage in engine_v2 tests for pure-v2 epoch-switch batch verification where parent headers are not yet written to DB. - Add downloader black-box regression tests for both LightSync and FastSync to verify: - ancestor error path is classified as errInvalidChain and drops the peer - control path succeeds and keeps the peer. - Rename paired downloader tests to short, symmetric names. Validation: - go test ./consensus/tests/engine_v2_tests -run 'TestShouldVerifyPureV2EpochSwitchHeadersEvenIfParentNotYetWrittenIntoDB' -count=1 - go test ./eth/downloader -run 'TestSyncBatchAncestorErrDropPeer|TestSyncBatchNoAncestorErrKeepPeer|TestBlockHeaderAttackerDropping64' -count=1
unknown ancestor error and VerifyHeaders result race, XFN-12VerifyHeaders result race and false unknown ancestor error, XFN-12
VerifyHeaders result race and false unknown ancestor error, XFN-12unknown ancestor error in VerifyHeaders, XFN-12
unknown ancestor error in VerifyHeaders, XFN-12unknown ancestor error in VerifyHeaders, XFN-12
Proposed changes
fix:
VerifyHeadersunknown ancestorerror during syncProblems:
Solution:
Types of changes
What types of changes does your code introduce to XDC network?
Put an
✅in the boxes that applyImpacted Components
Which parts of the codebase does this PR touch?
Put an
✅in the boxes that applyChecklist
Put an
✅in the boxes once you have confirmed below actions (or provide reasons on not doing so) that