Skip to content

Everlight: supernode compatibility + p2p STORAGE_FULL eligibility gate#284

Merged
mateeullahmalik merged 12 commits intomasterfrom
feat/p2p-storage-full-gate
Apr 24, 2026
Merged

Everlight: supernode compatibility + p2p STORAGE_FULL eligibility gate#284
mateeullahmalik merged 12 commits intomasterfrom
feat/p2p-storage-full-gate

Conversation

@mateeullahmalik
Copy link
Copy Markdown
Collaborator

@mateeullahmalik mateeullahmalik commented Apr 22, 2026

Consolidates the Everlight supernode workstream into one PR. Supersedes #282 (closed) and #272 (closed).

Implements the supernode side of lumera #113 (Everlight Phase 1). Three layers, stacked in commit order:

  1. Compat / telemetry (was Everlight #282) — host_reporter measures disk usage on the p2p data dir mount and emits cascade_kademlia_db_bytes + disk_usage_percent in audit epoch reports; probing + verifier accept ACTIVE and STORAGE_FULL as operational states; supernode query client gains wrappers for the new chain state fields; go.mod bumped to lumera v1.12.0-rc.
  2. P2P store/retrieve list separation (was Update/store retrieve list seperation #272, rewritten) — bootstrap seeds two distinct allowlists: routingIDs (reads) and storeIDs (writes). Replaces the single-allowlist model and is a prerequisite for the STORAGE_FULL gate.
  3. P2P STORAGE_FULL eligibility gate (this PR's new commits) — extends the routing-vs-store split to the new SUPERNODE_STATE_STORAGE_FULL (enum 6). STORAGE_FULL nodes continue to serve reads and earn payout but do not receive new STORE / BatchStore writes and are not targeted by replication.

State-class matrix

State Routing (reads)? Store (writes)? Payout?
ACTIVE
POSTPONED
STORAGE_FULL
DISABLED / STOPPED / PENALIZED

How

  1. Single source of truthp2p/kademlia/supernode_state.go imports sntypes.SuperNodeState from lumera. No numeric literals anywhere in p2p outside this file. Helpers: isRoutingEligibleState, isStoreEligibleState.
  2. Bootstrap population (bootstrap.go::loadBootstrapCandidatesFromChain) — routingIDs = {ACTIVE, POSTPONED, STORAGE_FULL}, storeIDs = {ACTIVE}.
  3. Self-state cache — this node's latest chain state cached on DHT.selfState (atomic). Consumed by the STORE RPC self-guard.
  4. STORE self-guard (network.go::handleStoreData, handleBatchStoreData) — if self is not store-eligible, reject STORE/BatchStore requests that contain any genuinely new keys. Replication of already-held keys still allowed (preserves availability during transitions).
  5. Eager replication prune (bootstrap.go::SyncBootstrapOncepruneIneligibleStorePeers) — on every bootstrap refresh flip replication_info.Active=false for any peer no longer in the store allowlist. Closes the 10-minute+ window between a chain STORAGE_FULL transition and the next successful ping.
  6. Defensive read-gate (dht.go::BatchRetrieve, BatchRetrieveStream) — filterEligibleNodes applied to the closest-contact list as belt-and-braces.
  7. Host-reporter disk usage (host_reporter/service.go) — measures the filesystem backing the p2p data dir rather than /, and reports cascade_kademlia_db_bytes so the chain can drive STORAGE_FULL / POSTPONED transitions deterministically.
  8. Probing + verifier (supernode_metrics/active_probing.go, verifier/verifier.go) — ACTIVE and STORAGE_FULL both count as operational for probe-role assignment and challenge response.

Tests — invariant-oriented, one violation test per enforcement point

Invariant Test
Routing allowlist population / pre-init permissive / post-init enforcing TestEligibleForRouting_PreInit_AndPopulated
Store allowlist strictly ⊆ routing TestEligibleForStore_StrictlyContainedInRouting
STORE self-guard across 7 states × newKeys={0,>0} TestShouldRejectStore, TestSelfStoreEligible
Eager replication prune TestPruneIneligibleStorePeers_*
State classification SSoT TestStateClassification_Table (parametric over all 7 chain enum values)
Host-reporter tick behavior (STORAGE_FULL transitions, disk metric emission) host_reporter/tick_behavior_test.go
Verifier accepts STORAGE_FULL verifier/verifier_test.go
Probing accepts STORAGE_FULL reachability_active_probing_test.go

No numeric state literals anywhere in p2p/kademlia/ outside supernode_state.go.

Verification

  • go build ./... OK
  • go vet ./... OK
  • go test ./... OK incl. integration
  • CI: unit-tests ✅ integration-tests ✅ build ✅ Rooview ✅
  • CI: cascade-e2e-tests ❌ — unrelated: test asserts a single coin_spent event of 10000ulume but the v1.12.0-rc fee payout is now split (200ulume protocol cut + 9800ulume supernode share). Fix is a test-only change to sum coin_spent amounts for the spender (or match on action_registered.fee). Pushing that fix next.

Out of scope (follow-ups)

  • SDK parity tests (sdk-go / sdk-js / sdk-rs) — Phase C.
  • Lumera tests/systemtests/everlight_p2p_test.go — Phase D.
  • Server-side Replicate RPC self-guard (the replication-worker path already honors replication_info.Active which is pruned eagerly here).

Risks

  • R1: STORAGE_FULL transition arrives mid-flight. Server-side STORE guard makes rejection deterministic; client-side retry already exists; test matrix covers this.
  • R2: Chain enum renumbering. Using sntypes.SuperNodeState* constants directly makes this a compile error on the next go-mod bump rather than silent drift.
  • R3: pruneIneligibleStorePeers scans full replication_info. O(n) per 10-min bootstrap refresh; negligible for current network size.

Rollback

Per-feature reverts are clean:

  • p2p STORAGE_FULL gate commits (top 4) can be reverted independently — restores the routing/store split without the STORAGE_FULL class.
  • host-reporter + verifier + probing commits revert to pre-Everlight behavior with no compile breaks on master (no chain state dependencies).
  • go.mod bump is the only cross-cutting change; reverting it requires rolling back the lumera side (feat: Add capabilities management to supernode #113).

Related:

@roomote-v0
Copy link
Copy Markdown

roomote-v0 Bot commented Apr 22, 2026

Rooviewer Clock   See task

Reviewed commit a65fe57 (review fixes). All four previously flagged issues are resolved: empty-list guard added to setStoreAllowlist, nil-check reordered in eligibleForStore, handleBatchStoreData now delegates to shouldRejectBatchStore, and IterateBatchStore uses debounced MaybeRefreshAllowlists. Regression tests cover each fix. No new issues found.

  • setStoreAllowlist missing empty-list guard (dht.go): Fixed. Empty-map guard added, symmetric with setRoutingAllowlist. Retains previous allowlist on transient chain issues.
  • eligibleForStore nil-check ordering (dht.go): Fixed. Nil/empty-ID check moved before storeAllowReady, consistent with eligibleForRouting.
  • handleBatchStoreData bypasses shouldRejectStore (network.go): Fixed. Replaced inline logic with shouldRejectBatchStore which delegates to shouldRejectStore internally.
  • Synchronous ListSuperNodes RPC on every IterateBatchStore call (dht.go): Fixed. MaybeRefreshAllowlists debounces with a 10-second minimum interval; background 30s ticker handles steady state.
Previous reviews

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

Comment thread p2p/kademlia/dht.go
Comment on lines +188 to 208
func (s *DHT) setStoreAllowlist(ctx context.Context, allow map[[32]byte]struct{}) {
if s == nil {
return
}
// Integration tests may use synthetic bootstrap sets; do not enforce chain-state gating.
if integrationTestEnabled() {
return
}

s.storeAllowMu.Lock()
s.storeAllow = allow
s.storeAllowMu.Unlock()

s.storeAllowCount.Store(int64(len(allow)))
s.storeAllowReady.Store(true)

logtrace.Debug(ctx, "store allowlist updated", logtrace.Fields{
logtrace.FieldModule: "p2p",
"store_peers": len(allow),
})
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setStoreAllowlist unconditionally accepts an empty map and marks storeAllowReady=true with storeAllowCount=0. If a transient chain issue returns zero ACTIVE supernodes, eligibleForStore will return false for every peer, blocking all writes network-wide until the next bootstrap refresh (up to 10 min). setRoutingAllowlist explicitly guards against this by returning early when len(allow) == 0 and retaining the previous allowlist. The store allowlist should have the same protection.

Fix it with Roo Code or mention @roomote and request a fix.

Comment thread p2p/kademlia/dht.go Outdated
Comment on lines +244 to +262
func (s *DHT) eligibleForStore(n *Node) bool {
if s == nil {
return false
}
// In integration tests allow everything; chain state gating is not stable/available there.
if integrationTestEnabled() {
return true
}
// If the store allowlist isn't ready yet, avoid blocking writes during bootstrap.
if !s.storeAllowReady.Load() {
return true
}
// Once initialized, an empty active set means no write-eligible peers.
if s.storeAllowCount.Load() == 0 {
return false
}
if n == nil || len(n.ID) == 0 {
return false
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nil/empty node check (n == nil || len(n.ID) == 0) is placed after the storeAllowReady and storeAllowCount checks. When storeAllowReady is false (pre-bootstrap), this function returns true for a nil node, which could cause a nil pointer dereference if a nil *Node slips through a call site during bootstrap. eligibleForRouting checks n == nil as the very first thing (line 214). Moving the nil check before the ready/count checks would be consistent and defensive.

Suggested change
func (s *DHT) eligibleForStore(n *Node) bool {
if s == nil {
return false
}
// In integration tests allow everything; chain state gating is not stable/available there.
if integrationTestEnabled() {
return true
}
// If the store allowlist isn't ready yet, avoid blocking writes during bootstrap.
if !s.storeAllowReady.Load() {
return true
}
// Once initialized, an empty active set means no write-eligible peers.
if s.storeAllowCount.Load() == 0 {
return false
}
if n == nil || len(n.ID) == 0 {
return false
}
func (s *DHT) eligibleForStore(n *Node) bool {
if s == nil {
return false
}
if n == nil || len(n.ID) == 0 {
return false
}
// In integration tests allow everything; chain state gating is not stable/available there.
if integrationTestEnabled() {
return true
}
// If the store allowlist isn't ready yet, avoid blocking writes during bootstrap.
if !s.storeAllowReady.Load() {
return true
}
// Once initialized, an empty active set means no write-eligible peers.
if s.storeAllowCount.Load() == 0 {
return false
}

Fix it with Roo Code or mention @roomote and request a fix.

Comment thread p2p/kademlia/network.go Outdated
Comment on lines +1247 to +1266
if !s.dht.selfStoreEligible() {
newKeys := 0
for _, data := range request.Data {
k, _ := utils.Blake3Hash(data)
existing, rErr := s.dht.store.Retrieve(ctx, k)
if rErr != nil || len(existing) == 0 {
newKeys++
}
}
if newKeys > 0 {
logtrace.Warn(ctx, "rejecting BatchStore: self is not store-eligible", logtrace.Fields{
logtrace.FieldModule: "p2p",
"sender": message.Sender.String(),
"self_state": s.dht.selfState.Load(),
"new_keys": newKeys,
"total_keys": len(request.Data),
})
return s.generateResponseMessage(ctx, BatchStoreData, message.Sender, ResultFailed, "batch store rejected: self not store-eligible")
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handleStoreData was refactored to use shouldRejectStore(1), but this batch handler still calls selfStoreEligible() directly and reimplements the newKeys-counting logic inline. If shouldRejectStore is updated later (e.g., to add logging, metrics, or a grace period), this path will silently diverge. Consider replacing the !s.dht.selfStoreEligible() guard + manual newKeys count with s.dht.shouldRejectStore(newKeys) after computing newKeys, matching the single-store handler's pattern.

Fix it with Roo Code or mention @roomote and request a fix.

Aligns both supernode go.mod and tests/system go.mod with the v1.12.0-rc
release tag (lumera commit 7ca770a / Everlight #113). Resolves the
install-lumera CI step which requires a real downloadable release asset
rather than a pseudo-version.
Extends PR #272's routing-vs-store allowlist split to cover the new
SUPERNODE_STATE_STORAGE_FULL introduced by lumera #113 (Everlight).

Policy:
- routing (reads)  = {ACTIVE, POSTPONED, STORAGE_FULL}
- store   (writes) = {ACTIVE}

STORAGE_FULL nodes continue to serve reads and earn payout, but must
not receive new STORE/BatchStore writes or be targeted by replication.

Changes:
- p2p/kademlia/supernode_state.go: SSoT helpers using chain's
  sntypes.SuperNodeState enum (no numeric literals), selfState cache,
  pruneIneligibleStorePeers, selfStoreEligible.
- bootstrap.go: use isRoutingEligibleState / isStoreEligibleState;
  record self-state during chain sync; call pruneIneligibleStorePeers
  after setStoreAllowlist so replication_info.Active is cleared
  eagerly for ineligible peers (closes ping-cadence window).
- network.go: STORE RPC self-guard for single handleStoreData (reject
  new-key write when self not store-eligible; replication of already-
  held key still permitted); BatchStoreData self-guard rejects when
  batch contains any genuinely new keys.
- dht.go: selfState atomic fields; defensive filterEligibleNodes on
  BatchRetrieve and BatchRetrieveStream closest-contact lists.
- dht_batch_store_test.go: accept new "no eligible store peers"
  error variant alongside legacy "no candidate nodes".
Coverage matrix aligned with the invariant table in the plan doc:

- I1 routing allowlist population/pre-init: TestEligibleForRouting_PreInit_AndPopulated
- I2 store allowlist strictly ⊆ routing:   TestEligibleForStore_StrictlyContainedInRouting
- I3+I9 shouldRejectStore contract (ACTIVE, STORAGE_FULL, POSTPONED,
  DISABLED, with newKeys=0 vs >0, pre-init permissive):
  TestShouldRejectStore, TestSelfStoreEligible
- I5 eager replication-info prune on storeAllowlist update:
  TestPruneIneligibleStorePeers_ClearsNonStorePeers,
  TestPruneIneligibleStorePeers_SkipsWhenNotReady
- I6 state-classification SSoT (no drift possible):
  TestStateClassification_Table (parametric over all 7 chain enum values)

Refactors handleStoreData self-guard to use shouldRejectStore helper for
direct test coverage without a full *Network + hashtable boot.

Adds a minimal fakeStore in-package (satisfies the full Store interface
with no-op implementations) used only by prune tests.
@mateeullahmalik mateeullahmalik force-pushed the feat/p2p-storage-full-gate branch from af9c32c to 7a2b535 Compare April 22, 2026 15:58
@mateeullahmalik mateeullahmalik changed the title feat(p2p): Everlight STORAGE_FULL eligibility gate Everlight: supernode compatibility + p2p STORAGE_FULL eligibility gate Apr 22, 2026
@mateeullahmalik mateeullahmalik changed the base branch from everlight-supernode-compat-plan to master April 22, 2026 17:47
@mateeullahmalik mateeullahmalik mentioned this pull request Apr 22, 2026
The cascade e2e asserted that a single coin_spent event in the
MsgFinalizeAction tx carried amount == autoPrice. That was true
pre-Everlight when the action module paid the full fee to the supernode
in one bank send.

Under Everlight Phase 1 (lumera #113) x/action/v1/keeper DistributeFees
splits the fee across multiple bank sends: the governance-tunable
RegistrationFeeShareBps is routed to the x/supernode reward pool, any
FoundationFeeShare goes to the community pool, and the remainder is paid
to the supernode(s). No single coin_spent event equals autoPrice
anymore, so the pre-Everlight assertion fails deterministically on the
v1.12.0-rc bump introduced by this PR.

Replace the single-event check with a value-conservation assertion:

- Sum all msg_index-tagged coin_spent events per spender; feeSpent is
  true iff some spender (the action module) paid out autoPrice in total.
- Sum all msg_index-tagged coin_received events across recipients;
  feeReceived is true iff the total equals autoPrice.
- msg_index discriminator excludes the tx-level gas-fee pair (emitted
  by x/auth, no msg_index) from the payout accounting.
- fromAddress / toAddress are retained for the payment-flow log;
  toAddress picks the recipient of the largest share (the supernode
  payout under the default split ratio).

This matches the real product invariant (value conservation across the
payout flow) and survives future parameter changes to
RegistrationFeeShareBps or FoundationFeeShare without further test
churn.
…path

Production-gate devnet run (lumera@master + this branch) found that cascade
uploads fail deterministically whenever an on-chain supernode state
transition to STORAGE_FULL has just happened and the client's local
storeAllowlist has not yet refreshed.

Observed failure:

  ERROR  Batch store to node failed  err="batch store rejected: self
         not store-eligible"  node=<STORAGE_FULL peer>
  → failed to achieve desired success rate, only: 71.43% successful
  → registration task failed → cascade upload FAILS

Root cause (two compounding issues):

1. setStoreAllowlist is only called from SyncBootstrapOnce, which ticks
   every bootstrapRefreshInterval (10 minutes). Between a chain STORAGE_FULL
   transition and the next tick, the caller's in-memory storeAllow still
   contains the transitioned peer. IterateBatchStore's eligibleForStore(n)
   gate therefore passes, and the client dispatches a STORE RPC that the
   peer correctly rejects via its own self-guard.

2. The IterateBatchStore response handler counted the authoritative
   "self not store-eligible" rejection as a store failure. A single
   transitioned peer in the closest-contact list therefore dragged the
   measured success rate below the 75% minimumDataStoreSuccessRate
   threshold and failed the entire cascade upload.

Fix (surgical, two coordinated parts, both required):

1. Fast allowlist-only refresher — new StartAllowlistRefresher /
   RefreshAllowlistsFromChain in p2p/kademlia/bootstrap.go, ticking every
   30s (allowlistRefreshInterval), plus an opportunistic best-effort
   refresh at the top of IterateBatchStore. Reuses loadBootstrapCandidates-
   FromChain's existing gRPC ListSuperNodes query; does NOT touch
   replication_info upserts, pings, routing-table seeding, or any other
   heavyweight bootstrap behavior. Converges the write-gate within
   O(seconds) of a chain transition.

2. Tolerate authoritative peer rejections — new isPeerStoreIneligibleError
   helper + new pruneStoreAllowEntry method in p2p/kademlia/supernode_state.go.
   When the peer responds with the canonical "self not store-eligible"
   marker (emitted by both handleStoreData and handleBatchStoreData), the
   client now:
     (a) decrements its request counter so the rejection is NOT in the
         success-rate denominator,
     (b) removes the peer from the in-memory storeAllow so subsequent
         hot-path lookups in the same run do not re-pick it.
   The server-side emitter and the client-side recognizer share the
   peerStoreIneligibleMarker constant to prevent silent drift.

Invariants locked in by new unit tests in peer_store_rejection_test.go:

  - isPeerStoreIneligibleError recognizes both STORE and BatchStoreData
    server emitter phrasings (and wrapped variants).
  - isPeerStoreIneligibleError does NOT match unrelated store errors
    (timeout, rpc internal, near-miss typos, case-mismatch).
  - pruneStoreAllowEntry removes the peer and decrements
    storeAllowCount atomically; is idempotent.
  - pruneStoreAllowEntry is a no-op for nil receiver, nil node, empty
    ID, or unset allowlist.
  - TestPeerStoreIneligibleMarker_PresentInServerEmitters cross-checks
    that the client-side recognizer matches the exact strings emitted
    on the server side.

Blast radius:

  - Write-path only; read-path (BatchRetrieve) is unaffected.
  - No behavior change in the all-ACTIVE (unmixed) cluster case.
  - Pre-existing tests on #284 remain green
    (TestEligibleForRouting_*, TestEligibleForStore_*, TestShouldReject-
    Store, TestSelfStoreEligible, TestPruneIneligibleStorePeers_*,
    TestStateClassification_Table).

Rollback: single revert of this commit restores previous behavior; the
pre-existing STORAGE_FULL gate logic continues to function (it was
correct on the server side; the fix only strengthens client-side
convergence).
Comment thread p2p/kademlia/dht.go Outdated
Comment on lines +2400 to +2410
// Best-effort allowlist refresh on the write hot path. This is the
// first line of defense against a stale storeAllowlist on the very
// first upload after a chain state transition. The peer-rejection
// handling below is the second line of defense for races we miss.
if err := s.RefreshAllowlistsFromChain(ctx); err != nil {
logtrace.Debug(ctx, "dht: opportunistic allowlist refresh failed", logtrace.Fields{
logtrace.FieldModule: "dht",
"task_id": id,
logtrace.FieldError: err.Error(),
})
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefreshAllowlistsFromChain calls ListSuperNodes (a full chain RPC) synchronously on every IterateBatchStore invocation. In a cascade upload this function is called per symbol batch, so every batch pays the latency of a chain round-trip and the chain node sees one ListSuperNodes query per batch. The 30-second background StartAllowlistRefresher already keeps the allowlist current for the steady state. Consider adding a time-based debounce here (e.g., skip if last refresh was <30s ago) so the opportunistic refresh fires at most once per upload rather than once per batch.

Fix it with Roo Code or mention @roomote and request a fix.

Four issues raised on PR #284 by the Roo review bot, triaged and fixed
here. Honest summary of severity (from PR review-of-review):

1. REAL BUG: setStoreAllowlist accepted an empty allowlist post-ready,
   clamping storeAllowCount to 0 and causing eligibleForStore to return
   false for every peer. A transient chain RPC failure (ListSuperNodes
   returning zero results) would therefore block ALL writes
   network-wide until the next bootstrap refresh. setRoutingAllowlist
   already guarded against this; setStoreAllowlist did not. Asymmetry.

2. MINOR PERF: RefreshAllowlistsFromChain was called unconditionally at
   the top of IterateBatchStore. A burst of concurrent uploads would
   fan out to one chain ListSuperNodes query per batch. The 30-second
   background StartAllowlistRefresher already handles steady-state
   convergence; the opportunistic call only matters for the ~30s
   window immediately after a chain state transition.

3. MAINTAINABILITY: handleBatchStoreData reimplemented the
   shouldRejectStore contract inline (selfStoreEligible() + manual
   newKeys count) while handleStoreData used shouldRejectStore(1).
   Future refinements would risk silent divergence.

4. COSMETIC: eligibleForStore checked n == nil AFTER consulting
   storeAllowReady / storeAllowCount. eligibleForRouting checked it
   first. No observable bug in current call sites (all iterate over
   slices built from non-nil sources), but defense-in-depth alignment
   is free.

Changes:

* p2p/kademlia/dht.go
  - setStoreAllowlist: reject empty-map updates post-ready, retaining
    the previous allowlist. WARN-level log on a transient outage,
    DEBUG-level log during bootstrap. Symmetric with setRoutingAllowlist.
  - eligibleForStore: move `n == nil || len(n.ID) == 0` to the top,
    matching eligibleForRouting's ordering. Defense-in-depth.
  - DHT struct: new lastAllowlistRefreshUnixNano atomic.Int64 to
    support the debounced refresh path.
  - IterateBatchStore: swap RefreshAllowlistsFromChain for the new
    MaybeRefreshAllowlists helper.

* p2p/kademlia/bootstrap.go
  - new const allowlistOpportunisticMinInterval = 10s
  - RefreshAllowlistsFromChain: stamps lastAllowlistRefreshUnixNano
    on successful completion (so the 30s background refresher also
    feeds the debounce window).
  - new MaybeRefreshAllowlists: debounced hot-path variant. Stamps
    on ATTEMPT (not success) so a persistently-failing chain cannot
    fan out to one RPC per caller; the background ticker retries on
    its own cadence.

* p2p/kademlia/supernode_state.go
  - new shouldRejectBatchStore(ctx, payloads) ([]bool, int): the
    batch-sized companion to shouldRejectStore. Preserves the hot-path
    short-circuit (selfStoreEligible() check bypasses the per-payload
    storage lookup on the 99% case) while routing the final decision
    through shouldRejectStore for single-source-of-truth semantics.

* p2p/kademlia/network.go
  - handleBatchStoreData: replace the inline selfStoreEligible +
    manual newKeys counting with a shouldRejectBatchStore call. The
    RPC response phrasing ("batch store rejected: self not
    store-eligible") is unchanged so the client-side isPeerStoreIneligibleError
    recognizer still matches.

* p2p/kademlia/review_fixes_test.go (new)
  - TestSetStoreAllowlist_EmptyMapPostReady_RetainsPrevious
    TestSetStoreAllowlist_EmptyMapPreReady_NoOp
      Invariant test locking in issue (1). Prior allowlist must be
      retained when the chain returns empty; pre-ready must not
      falsely flip ready=true with count=0.
  - TestEligibleForStore_NilNode_ReturnsFalseEvenPreReady
      Locks in issue (4) nil-ordering. Pre-ready valid node still
      permissive (bootstrap not broken).
  - TestMaybeRefreshAllowlists_Debounces
    TestMaybeRefreshAllowlists_FiresWhenStale
      Lock in issue (2) debounce semantics: within the 10s window
      opportunistic callers skip; outside it they fire.
  - TestShouldRejectBatchStore_ShortCircuitsOnSelfEligible
    TestShouldRejectBatchStore_NilReceiver
      Verify the batch-helper's performance-critical short-circuit
      (no storage lookups when self is store-eligible) and defensive
      nil-receiver handling.

All existing p2p tests remain green. go vet clean.

Why the author missed these during the original #284 authorship:

- Confirmation bias: focus on "does Phase 4 pass" after the first fix
  commit blinded me to asymmetric hardening between sibling functions.
- Skipped my own invariant-first-coding skill: per-path violation
  tests for write-path entry points were not authored up front. A
  side-by-side of setRoutingAllowlist vs setStoreAllowlist would have
  surfaced (1) immediately.
- Positive-path-only testing: peer_store_rejection_test.go covers the
  new tolerance path but not input edge cases to setStoreAllowlist
  (empty map, nil map, pre- vs post-ready semantics).

These fixes tighten the invariants the first commit put in place and
add the missing violation tests that would have surfaced them.
@mateeullahmalik mateeullahmalik merged commit 2a97508 into master Apr 24, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant