
fix(kv): bound verifyLeaderEngine ReadIndex with 5s deadline #745

Merged

bootjp merged 3 commits into main from fix/verify-leader-deadline on May 8, 2026

Conversation

@bootjp
Owner

@bootjp bootjp commented May 8, 2026

Summary

verifyLeaderEngine() called engine.VerifyLeader with context.Background(), so callers without an upstream context blocked indefinitely on a ReadIndex round-trip. A single transient stall accumulated callers permanently. This caps the no-context path at 5s.
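
The shape of the change, as a minimal sketch (the LeaderView here is narrowed to the one method the sketch needs; the real call sits behind the full raftengine.LeaderView and carries a longer rationale comment):

```go
package kv

import (
	"context"
	"time"
)

// LeaderView is narrowed to the one method this sketch needs; the real
// raftengine.LeaderView interface is wider.
type LeaderView interface {
	VerifyLeader(ctx context.Context) error
}

// verifyLeaderTimeout matches leaderForwardTimeout: a verify slower than a
// single forward RPC is useless as a freshness check.
const verifyLeaderTimeout = 5 * time.Second

func verifyLeaderEngine(engine LeaderView) error {
	// Previously engine.VerifyLeader(context.Background()): an unbounded wait.
	ctx, cancel := context.WithTimeout(context.Background(), verifyLeaderTimeout)
	defer cancel()
	return engine.VerifyLeader(ctx)
}
```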

Production incident — 2026-05-08

Follower 192.168.0.214 lost its network route (no route to host, ARP INCOMPLETE). The leader's ReadIndex completion stalled intermittently and verify-callers piled up at ~9/sec without bound.

After ~37 minutes the leader (192.168.0.212) showed:

  • 20,560 goroutines, 20,478 of them in etcd.(*Engine).submitRead [select, 35-39 minutes]
  • CPU 1870% (Engine.run Ready loop walks pendingReads O(N) per tick → queue feeds back on itself)
  • Host MemAvailable trending toward 0 → OOM
  • Each new leader after failover re-entered the same death spiral

Mitigation: docker restart elastickv on 212 dropped it to 74% CPU / 163 MiB. 214 was hardware-rebooted and is REACHABLE again. This PR prevents the next leader from re-entering the spiral.

Affected callers

All use the no-context verifyLeaderEngine variant:

  • kv/leader_proxy.go — LeaderProxy.Commit / .Abort (every Redis write)
  • kv/coordinator.go — Coordinate.VerifyLeader
  • kv/sharded_coordinator.go — ShardedCoordinator.VerifyLeader / VerifyLeaderForKey
  • adapter/s3.go — isVerifiedS3Leader / inline VerifyLeader at line 2291 (healthz)
  • adapter/sqs.go — isVerifiedSQSLeader (healthz)
  • main_admin.go — LeaderProbe callback for /admin/healthz/leader

Failure mode on timeout

context.DeadlineExceeded surfaces to the caller. LeaderProxy falls back to forwardWithRetry (the existing path for any verify failure). Healthz handlers report 503 not-leader. Background loops (lock resolver, HLC lease) skip this tick.

No new infinite loop: even when this node is the leader, a verify-failure → forward path already exists in LeaderProxy.Commit; that path is bounded by leaderProxyRetryBudget = 5s and maxForwardRetries = 3.
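
Continuing the sketch above, the caller shape on timeout (hedged: the func fields are stand-ins for the real kv/leader_proxy.go internals, which this sketch does not reproduce):

```go
// Stand-in for kv/leader_proxy.go; forward and apply model forwardWithRetry
// and the local commit path.
type LeaderProxy struct {
	engine  LeaderView
	forward func() error
	apply   func() error
}

func (p *LeaderProxy) Commit() error {
	if err := verifyLeaderEngine(p.engine); err != nil {
		// context.DeadlineExceeded from the new bound lands here and takes
		// the same soft-failure route as any other verify failure.
		return p.forward()
	}
	return p.apply()
}
```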

Self-review (5 lenses)

  1. Data loss — none. The fix only shortens a never-returning wait. verifyLeaderEngine is a freshness check, not a write path. Already-committed proposals are unaffected.
  2. Concurrency — the new ctx is local to each call (defer cancel), no shared state, no lock changes. Engine-side blocking semantics unchanged; we just stop waiting forever.
  3. Performance — net positive. Removes the unbounded goroutine pile-up and the O(N) pendingReads walk it caused. No new allocations on the success path beyond the WithTimeout context.
  4. Data consistency — ReadIndex still completes when quorum heartbeats land within 5s. A timeout means the caller could not confirm leadership freshness, which the existing "fall through to forward" path already treats as a soft failure.
  5. Test coverage — kv/raft_engine_test.go::TestVerifyLeaderEngine_BoundsBlockingReadIndex pins the regression: a blockingLeaderView that holds VerifyLeader on its ctx must surface DeadlineExceeded within 2 * verifyLeaderTimeout (sketched below).
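
A sketch of that regression test, continuing the types above (imports: context, errors, testing, time; stdlib errors here, which a later review round swaps for cockroachdb/errors and augments with a testing.Short() guard and a lower-bound assertion):

```go
// blockingLeaderView simulates a ReadIndex that never completes: it holds
// VerifyLeader until the caller's context expires.
type blockingLeaderView struct{}

func (blockingLeaderView) VerifyLeader(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

func TestVerifyLeaderEngine_BoundsBlockingReadIndex(t *testing.T) {
	t.Parallel()
	start := time.Now()
	err := verifyLeaderEngine(blockingLeaderView{})
	elapsed := time.Since(start)

	if !errors.Is(err, context.DeadlineExceeded) {
		t.Fatalf("want context.DeadlineExceeded, got %v", err)
	}
	if elapsed > 2*verifyLeaderTimeout {
		t.Fatalf("returned after %s; want <= 2x verifyLeaderTimeout (%s)",
			elapsed, verifyLeaderTimeout)
	}
}
```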

Test plan

  • go test -race -count=1 ./kv — 9.3s, all green
  • New regression test TestVerifyLeaderEngine_BoundsBlockingReadIndex covers the blocking case
  • Roll out to 192.168.0.x cluster after merge, watch CPU/Mem panel for the next 4-6h to confirm no more OOM cascade

Future work (separate PRs)

Plumb real request contexts through LeaderProxy.Commit/Abort and the healthz handlers so client-side deadlines cascade naturally instead of relying on this fixed bound. Today the Redis adapter's per-command deadline doesn't reach LeaderProxy; the proxy interface takes []*pb.Request only.
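
The intended split, sketched under the assumption that the ctx-taking variant simply passes the caller's context through (the follow-up later in this thread confirms the verifyLeaderEngineCtx name; this exact body is an assumption):

```go
// verifyLeaderEngineCtx honours the caller's own deadline.
func verifyLeaderEngineCtx(ctx context.Context, engine LeaderView) error {
	return engine.VerifyLeader(ctx)
}

// The no-arg verifyLeaderEngine then reduces to a wrapper that keeps the
// 5s bound as defense-in-depth for callers with no upstream context:
//
//	ctx, cancel := context.WithTimeout(context.Background(), verifyLeaderTimeout)
//	defer cancel()
//	return verifyLeaderEngineCtx(ctx, engine)
```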

Summary by CodeRabbit

  • Bug Fixes

    • Implemented timeout bounds for leader verification operations with a 5-second limit to prevent indefinite blocking.
  • Tests

    • Added test to verify leader verification properly handles timeout scenarios and completes within the expected timeframe under stalled conditions.

verifyLeaderEngine() called engine.VerifyLeader with context.Background(),
so any caller without an upstream context blocked indefinitely on a
ReadIndex round-trip. A single transient stall accumulated callers
permanently because they never timed out and never returned.

Production hit this on 2026-05-08: follower 192.168.0.214 lost its
network route (no route to host, ARP INCOMPLETE), the leader's
ReadIndex completion stalled intermittently, and verify-callers piled
up at roughly 9/sec without bound. After ~37 minutes the leader
(192.168.0.212) held 20,560 goroutines (20,478 in submitRead select,
oldest 39 minutes), CPU pinned at 1870% (Engine.run Ready loop walks
pendingReads O(N) per tick, so the queue feeds back on itself), and
host MemAvailable trended toward 0 until OOM. Each new leader after
failover re-entered the same death spiral.

Affected callers (all use the no-context variant):
- LeaderProxy.Commit / .Abort -- every Redis write hits this
- Coordinate.VerifyLeader / ShardedCoordinator.VerifyLeader[ForKey]
- adapter S3/SQS /healthz/leader handlers (Caddy probes)
- main_admin.go LeaderProbe (admin dashboard /admin/healthz/leader)
- adapter/sqs.go isVerifiedSQSLeader, adapter/s3.go isVerifiedS3Leader

Fix: cap the no-context path at 5s (matching leaderForwardTimeout). On
timeout, callers see context.DeadlineExceeded -- LeaderProxy falls back
to forwardWithRetry as it already does for any verify failure, healthz
handlers report not-leader, and the lock resolver skips this tick.

Self-review (5 lenses):
1. Data loss -- none. The fix only shortens a never-returning wait.
   verifyLeaderEngine is a freshness check, not a write path.
2. Concurrency -- the new ctx is local to each call (defer cancel),
   no shared state, no lock changes. Engine-side blocking semantics
   unchanged; we just stop waiting forever.
3. Performance -- positive. Removes the unbounded goroutine pile-up
   and the O(N) pendingReads walk it caused. No new allocations on
   the success path beyond the WithTimeout context.
4. Data consistency -- ReadIndex still completes when quorum heartbeats
   land within 5s. A timeout means the caller could not confirm
   leadership freshness, which the existing "fall through to forward"
   path already treats as a soft failure.
5. Test coverage -- kv/raft_engine_test.go pins the regression: a
   blockingLeaderView that holds VerifyLeader on its ctx must surface
   DeadlineExceeded within 2x verifyLeaderTimeout.

Test: go test -race -count=1 ./kv -- 9.3s, all green.

Future work (separate PRs): plumb real request contexts through
LeaderProxy.Commit/Abort and the healthz handlers so a client-side
deadline cascades naturally instead of relying on this fixed bound.
@coderabbitai

coderabbitai Bot commented May 8, 2026


Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 26 minutes and 55 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ce81b9c5-3e55-4c87-9298-a4efb6e044c9

📥 Commits

Reviewing files that changed from the base of the PR and between ad924ad and a2ab245.

📒 Files selected for processing (2)
  • kv/raft_engine.go
  • kv/raft_engine_test.go
📝 Walkthrough

This PR adds a 5-second timeout guard to the verifyLeaderEngine function to prevent unbounded blocking on stalled leader verification operations. The implementation wraps the underlying call with a timeout context and includes a test that verifies the deadline is enforced even when the leader read operation blocks indefinitely.

Changes

Leader Verification Timeout

| Layer / File(s) | Summary |
|---|---|
| Timeout Constant & Rationale (kv/raft_engine.go) | Introduces the verifyLeaderTimeout constant (5 seconds) with extensive comments documenting the prior unbounded behavior and operational failure mode. |
| Implementation Change (kv/raft_engine.go) | verifyLeaderEngine now wraps the leader verification call in context.WithTimeout(context.Background(), verifyLeaderTimeout) and defers cancellation, replacing the prior unbounded call. |
| Test Support & Deadline Verification (kv/raft_engine_test.go) | Adds a blockingLeaderView mock and a TestVerifyLeaderEngine_BoundsBlockingReadIndex test verifying that the timeout is enforced and the function returns context.DeadlineExceeded within the deadline window. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A timeout, so wise and so keen,
Five seconds to bound the in-between,
No more shall the leader read stall,
The deadline will answer the call,
A test hops along to ensure all is right! 🕒

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and specifically describes the main change: bounding the verifyLeaderEngine function's ReadIndex operation with a 5-second deadline. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@bootjp
Owner Author

bootjp commented May 8, 2026

@claude review

@claude

claude Bot commented May 8, 2026

Claude finished @bootjp's task in 2m 44s


Code Review

  • Gather context and read changed files
  • Review kv/raft_engine.go changes
  • Review kv/raft_engine_test.go changes
  • Cross-check callers and failure modes
  • Post final review

The fix is correct and minimal. The root cause analysis is solid — verifyLeaderEngine calling engine.VerifyLeader with context.Background() is a classic goroutine leak pattern, and the O(N) pendingReads walk compounding the pile-up explains the CPU death spiral. The 5-second cap is the right level of fix here.

Three observations, ordered by severity:


1. Incident narrative in source code (style)

kv/raft_engine.go:12–35 — the comment on verifyLeaderTimeout carries IP addresses, goroutine counts, CPU figures, and timestamps. This is post-mortem/commit-message material, not source code. The specific details will become stale (or misleading if those nodes change roles or leave the cluster), and future readers who encounter 192.168.0.214 will have no way to verify the claim.

The rationale is exactly right and should stay — "why 5s," "why this constant exists," the O(N) pendingReads feedback loop. Everything from "Production hit this on 2026-05-08..." down to "...until OOM" can be collapsed to a one-liner:

// 5s matches leaderForwardTimeout: a verify that takes longer than a
// single forward RPC is useless as a freshness check, and the proxy's
// verify-then-forward path stays within its 5s retry budget.
//
// See incident 2026-05-08 (PR #745) for the goroutine-pile production failure
// this prevents.
const verifyLeaderTimeout = 5 * time.Second



2. Test uses stdlib errors.Is — inconsistent with project conventions

kv/raft_engine_test.go:45:

if !stderrors.Is(err, context.DeadlineExceeded) {

This works correctly (cockroachdb/errors.WithStack implements Unwrap() so stdlib errors.Is traverses the chain), but the project convention from CLAUDE.md is to use github.com/cockroachdb/errors at all boundaries. The errors.Is check here should use the cockroachdb import for consistency, and avoids any future footgun if the wrapping ever uses a mark-based equivalence instead of a direct chain:

import "github.com/cockroachdb/errors"
...
if !errors.Is(err, context.DeadlineExceeded) {



3. Slow test — add testing.Short() guard

kv/raft_engine_test.go:35 — TestVerifyLeaderEngine_BoundsBlockingReadIndex blocks for verifyLeaderTimeout = 5 seconds on every run, including make test. Go convention for slow tests is to gate on testing.Short():

func TestVerifyLeaderEngine_BoundsBlockingReadIndex(t *testing.T) {
    t.Parallel()
    if testing.Short() {
        t.Skip("skipping: blocks for verifyLeaderTimeout (5s)")
    }
    ...

This keeps go test -short ./kv/... fast without removing regression coverage from the default make test run.



No issues found

  • defer cancel() placement is correct — no leak on either success or timeout path.
  • context.DeadlineExceeded is not classified as isTransientLeaderError (it's not ErrNotLeader, ErrLeadershipLost, or ErrLeadershipTransferInProgress), so the LeaderProxy forward path behaves correctly.
  • verifyLeaderTimeout = 5s aligning with leaderForwardTimeout is well-reasoned and clearly documented.
  • blockingLeaderView satisfies the full raftengine.LeaderView interface (State, Leader, VerifyLeader, LinearizableRead). ✓
  • The forwardWithRetry fallback on timeout is bounded by leaderProxyRetryBudget = 5s, so worst-case the partition path is still bounded (verify 5s + forward budget 5s = 10s max). ✓

The first observation is worth a quick fixup commit; the second and third are minor. The core fix is sound.

Contributor

@gemini-code-assist Bot left a comment

Code Review

This pull request introduces a 5-second timeout to the verifyLeaderEngine function to prevent indefinite blocking and potential resource exhaustion during ReadIndex stalls, addressing a production issue that previously led to goroutine leaks and OOM errors. A regression test was added to verify that the function now returns a deadline exceeded error within a reasonable timeframe. The review feedback suggests enhancing the test by asserting a minimum elapsed time to ensure the timeout logic is correctly exercised and not returning early for other reasons.

Comment thread: kv/raft_engine_test.go, lines +50 to +52

if elapsed > 2*verifyLeaderTimeout {
	t.Fatalf("verifyLeaderEngine(blocking) returned after %s; want <= 2x verifyLeaderTimeout (%s)", elapsed, verifyLeaderTimeout)
}
Severity: medium

To ensure that the test is actually exercising the timeout logic and not returning early for some other reason, consider asserting a minimum elapsed time. This confirms that the blockingLeaderView mock correctly held the call until the context deadline was reached.

	if elapsed < verifyLeaderTimeout {
		t.Fatalf("verifyLeaderEngine(blocking) returned too early after %s; want >= %s", elapsed, verifyLeaderTimeout)
	}
	if elapsed > 2*verifyLeaderTimeout {
		t.Fatalf("verifyLeaderEngine(blocking) returned after %s; want <= 2x verifyLeaderTimeout (%s)", elapsed, verifyLeaderTimeout)
	}

Round-1 review on commit ad924ad:

- gemini medium: assert minimum elapsed time, not just upper bound.
  Without a lower bound a future regression returning DeadlineExceeded
  before actually waiting would silently pass. Added a lower-bound
  check with 200ms slack so a slow CI scheduler that trips ctx.Done()
  a hair before the wall clock catches up does not flake.

- claude bot (style): trim incident narrative from the
  verifyLeaderTimeout doc comment. Specific IPs, goroutine counts,
  and timestamps belong in the commit message and PR body, not the
  source — they go stale and add no design value at the call site.
  Kept the rationale (why 5s, the O(N) pendingReads feedback loop,
  match leaderForwardTimeout) and added a "see PR #745" reference
  for readers who want the full incident detail.

- claude bot: switch the regression test from stdlib
  `stderrors "errors"` to `github.com/cockroachdb/errors` for
  consistency with the project convention (CLAUDE.md mandates
  cockroachdb/errors at boundaries; the test is a boundary). Also
  drops the import alias which made gci complain about ordering.

- claude bot: add `if testing.Short() { t.Skip(...) }` guard so
  `go test -short ./kv/...` does not pay the verifyLeaderTimeout
  (5s) wait. Default `make test` still runs it.

- reviewdog/golangci (gci): import order fixed by reordering
  third-party (cockroachdb) above local (bootjp) per the project's
  default gci sections (standard, default, prefix(...)).

No behavior change. Test still passes:
  go test -race -count=1 -run TestVerifyLeaderEngine ./kv -- 6.06s
  go test -race -count=1 -short ./kv -- 5.55s (test skipped under -short)

No semantic change to the production fix; this commit is test +
comment polish only, so no caller audit needed.
@bootjp
Owner Author

bootjp commented May 8, 2026

@claude review

@claude

claude Bot commented May 8, 2026

Claude finished @bootjp's task in 1m 43s


Round-2 Code Review

  • Gather context and read changed files
  • Verify round-1 issues resolved
  • Check remaining linter warning
  • Post final review

All three round-1 issues are resolved, and the Gemini lower-bound suggestion has been incorporated. One item remains open.


Remaining issue: gci import ordering in test file

kv/raft_engine_test.go:8–11 — the golangci gci formatter is still flagging the test file. The import block has two non-stdlib groups separated by a blank line:

import (
    "context"
    "testing"
    "time"

    "github.com/cockroachdb/errors"          // ← its own group

    "github.com/bootjp/elastickv/internal/raftengine"  // ← separate group
)

The project convention (see kv/coordinator.go:3–16) puts all non-stdlib imports in a single group with no blank line between third-party and local packages:

import (
    "context"
    "testing"
    "time"

    "github.com/bootjp/elastickv/internal/raftengine"
    "github.com/cockroachdb/errors"
)

gofmt won't catch this — gci specifically enforces the grouping, which is why the reviewdog bot flagged it.



Minor nit: comment references review tooling

kv/raft_engine_test.go:61 — the inline comment says "Pulled in from gemini's PR #745 round-1 review". This is process metadata that will rot once the PR is merged and the review link is no longer in context. Consider trimming it to just explain the design reason:

// Lower bound: confirm the engine actually held the call until the
// deadline fired, not that some other error path returned immediately.

The PR description already credits the source; the source file only needs the rationale.


Everything else looks correct

  • Comment on verifyLeaderTimeout ✓ — incident narrative removed; design rationale (O(N) pendingReads, leaderForwardTimeout alignment) and PR reference kept.
  • errors.Is ✓ — uses github.com/cockroachdb/errors correctly.
  • testing.Short() guard ✓ — correctly placed after t.Parallel().
  • Lower bound (elapsed+slack < verifyLeaderTimeout) ✓ — catches early-return regressions; 200ms slack is appropriate for CI schedulers.
  • Upper bound (elapsed > 2*verifyLeaderTimeout) ✓ — proves the call returned at all.
  • defer cancel() placement ✓ — no leak on either success or timeout path.
  • blockingLeaderView satisfies the full raftengine.LeaderView interface ✓.

The gci fix is one import-block tweak; everything else is shippable.

Round-2 review on commit 0be9fc7:

- reviewdog/golangci (gci): test file still failed gci because the
  project convention puts ALL non-stdlib imports in a single group
  (no blank line between third-party and local). The round-1 fixup's
  "cockroachdb above bootjp with a blank line between" guess was
  wrong; merging into one alphabetical group (bootjp before
  cockroachdb) matches kv/coordinator.go and resolves the warning.

- claude bot (nit): drop "Pulled in from gemini's PR #745 round-1
  review" from the lower-bound rationale comment. Process metadata
  rots once the PR is merged; the design reason is what the source
  needs.

No behavior change. Test still passes 6.05s.
@bootjp
Owner Author

bootjp commented May 8, 2026

@claude review

@claude

claude Bot commented May 8, 2026

Claude finished @bootjp's task in 1m 5s


Round-3 Code Review

  • Gather context from previous review rounds
  • Read current state of kv/raft_engine.go
  • Read current state of kv/raft_engine_test.go
  • Verify round-2 issues resolved
  • Post final review

Both round-2 issues are resolved. This is clean.

Round-2 issues — verified closed

gci import ordering (kv/raft_engine_test.go:8–9) ✓
Both non-stdlib packages are now in the same group with no blank line between them, alphabetical order (bootjp before cockroachdb), matching the kv/coordinator.go convention exactly. The golangci gci warning is gone.

import (
    "context"
    "testing"
    "time"

    "github.com/bootjp/elastickv/internal/raftengine"
    "github.com/cockroachdb/errors"
)

Comment referencing review tooling (kv/raft_engine_test.go:55–67) ✓
The comment now explains only the design rationale (lower-bound slack, why the check prevents silent early-return regressions). No reference to Gemini or round-1 review.


Everything checks out

  • verifyLeaderTimeout comment — clean. Rationale (O(N) pendingReads, alignment with leaderForwardTimeout), PR reference, no incident narrative. ✓
  • verifyLeaderEngine — context.WithTimeout + defer cancel(), no leak on any path. ✓
  • blockingLeaderView — satisfies all four methods of raftengine.LeaderView (State, Leader, VerifyLeader, LinearizableRead). ✓
  • Test structure — t.Parallel() → testing.Short() guard → lower bound (with 200ms slack) → upper bound (2×). ✓
  • errors.Is — uses github.com/cockroachdb/errors per project convention. ✓

No remaining issues. This PR is ready to merge.

@bootjp bootjp merged commit 5f12d5d into main May 8, 2026
9 checks passed
@bootjp bootjp deleted the fix/verify-leader-deadline branch May 8, 2026 08:51
bootjp added a commit that referenced this pull request May 9, 2026
Follow-up to #745 (verifyLeaderEngine 5s deadline incident hotfix).
PR #745 capped the no-context verify path at 5s as defense-in-depth;
this PR plumbs the caller's context.Context end-to-end through:

  Dispatch path:
    ShardedCoordinator.Dispatch(ctx) → dispatchTxn(ctx) →
      dispatchSingleShardTxn(ctx) /
      commitPrimaryTxn(ctx) /
      commitSecondaryTxns(ctx) /
      abortPreparedTxn(ctx) /
      broadcastToAllGroups(ctx) /
      dispatchDelPrefixBroadcast(ctx)
    → Transactional.Commit/Abort(ctx, reqs)
    → leaseRefreshingTxn / LeaderProxy / TransactionManager / ShardRouter
    → applyRequests(ctx) → proposer.Propose(ctx)
    → verifyLeaderEngineCtx(ctx)

  Lock-resolver / shard-store path:
    LockResolver.resolveExpiredLock(ctx) → applyTxnResolution(ctx) /
      ShardStore.applyScanLockResolutions(ctx) /
      ShardStore.tryAbortExpiredPrimary(ctx)
    → g.Txn.Commit(ctx)

  Leader-probe / healthz path:
    Coordinator.VerifyLeader(ctx) /
    Coordinator.VerifyLeaderForKey(ctx, key)
    LeaderProbe.IsVerifiedLeader(ctx)
    adapter S3/SQS/DynamoDB /healthz/leader handlers feed r.Context()

  Adapter healthz helpers:
    isVerifiedSQSLeader(ctx, coordinator)
    isVerifiedDynamoLeader(ctx, coordinator)
    S3Server.isVerifiedS3Leader(ctx)
    DistributionServer.verifyCatalogLeader(ctx)

  Admin LeaderProbe wiring:
    main_admin.go newAdminLeaderProbe → coordinate.VerifyLeader(ctx)

After: a Redis BLPOP timeout=2s whose dispatch lands on a slow
ReadIndex now fails after 2s (its own deadline), not after the 5s
verifyLeaderTimeout safety bound. A Caddy active health probe with a
1s budget likewise fails after 1s. The 5s bound stays in place as
defense-in-depth for the no-arg verifyLeaderEngine() — still hit by
LockResolver background loops, HLC lease ticks, and any future
internal caller that genuinely cannot inherit a deadline.

Also fixes proposer.Propose at kv/transaction.go:152 which used
context.Background() inline; same shape as the original
verifyLeaderEngine bug, just on the propose path. Now plumbed.

Batched commit caveat: TransactionManager.commitRaw merges many
callers into a single propose via a separate goroutine, so no
single ctx can bound the underlying applyRequests. The wait site
in commitRaw selects between item.done and ctx.Done so per-caller
cancellation still works (the propose continues, the abandoned
caller exits early); the goroutine's applyRequests call uses
context.Background by design and is documented as such.

Test stub updates (3 ripples through the Transactional interface,
6 ripples through Coordinator, 2 through LeaderProbe):
  kv/coordinator_txn_test.go::stubTransactional
  kv/coordinator_retry_test.go::scriptedTransactional
  kv/sharded_coordinator_txn_test.go::recordingTransactional
  kv/sharded_coordinator_abort_test.go::failingTransactional
  kv/sharded_lease_test.go::fixedTransactional
  kv/shard_router_test.go::fakeTM
  kv/shard_router_partition_test.go::fakeTxn
  kv/leader_routed_store_test.go::stubLeaderCoordinator
  adapter/redis_retry_test.go::retryOnceCoordinator
  adapter/redis_info_test.go::infoTestCoordinator
  adapter/redis_hello_test.go::helloTestCoordinator
  adapter/redis_keys_pattern_test.go::stubAdapterCoordinator
  adapter/distribution_server_test.go::distributionCoordinatorStub
  adapter/dynamodb_test.go::testCoordinatorWrapper
  adapter/s3_test.go::routeAwareS3Coordinator + followerS3Coordinator
  internal/admin/router_test.go::LeaderProbeFunc literal forms

Self-review (5 lenses):
1. Data loss — none. ctx is value-passed; no persisted state changes.
   verifyLeaderEngineCtx already existed and is what we now route to;
   only the caller of the no-arg variant changes (background loops
   only).
2. Concurrency — ctx flows through the existing locking unchanged.
   commitRaw goroutine intentionally uses Background to avoid mixing
   per-caller deadlines into a shared batched propose; documented.
3. Performance — same call count, no extra allocations beyond the
   ctx value passed by reference. context.WithTimeout wrapping in the
   no-ctx fallback is unchanged.
4. Data consistency — verify is a freshness check, not a write path.
   Shorter caller deadlines surface ErrLeaderNotFound earlier instead
   of waiting out the 5s safety bound.
5. Test coverage — every Transactional / Coordinator / LeaderProbe
   stub gains the ctx parameter (currently `_ context.Context`,
   ready for future tests asserting cancel propagation).

Test:
  go test -race -count=1 -short ./kv ./adapter ./internal/admin
  -- 9.6m total, all green (adapter is the long pole on Jepsen-shaped
  tests; kv 6.6s, admin 1.4s).

Design doc: docs/design/2026_05_10_proposed_kv_ctx_plumbing.md.
bootjp added a commit that referenced this pull request May 10, 2026
#749)

## Summary

Follow-up to #745. PR #745 capped `verifyLeaderEngine`'s no-context call
site at 5s as an incident hotfix; this PR plumbs the caller's
`context.Context` end-to-end through the kv write path and the
verify-leader path so callers that hold a request deadline see THEIR
deadline honoured, not the 5s safety bound.

The 5s bound stays in place as defense-in-depth for callers without an
upstream context (LockResolver, HLC lease tick).

## What changed

**Interface signatures (kv-internal):**

```go
type Transactional interface {
    Commit(ctx context.Context, reqs []*pb.Request) (*TransactionResponse, error)
    Abort(ctx context.Context, reqs []*pb.Request) (*TransactionResponse, error)
}

type Coordinator interface {
    // …
    VerifyLeader(ctx context.Context) error
    VerifyLeaderForKey(ctx context.Context, key []byte) error
}

type LeaderProbe interface {
    IsVerifiedLeader(ctx context.Context) bool
}
```

**Plumbing chains** (every step now carries ctx):

```
Dispatch path:
  ShardedCoordinator.Dispatch(ctx)
    → dispatchTxn(ctx) / dispatchSingleShardTxn(ctx) /
      commitPrimaryTxn(ctx) / commitSecondaryTxns(ctx) /
      abortPreparedTxn(ctx) / broadcastToAllGroups(ctx) /
      dispatchDelPrefixBroadcast(ctx)
    → Transactional.Commit/Abort(ctx)
    → leaseRefreshingTxn / LeaderProxy / TransactionManager / ShardRouter
    → applyRequests(ctx) → proposer.Propose(ctx)
    → verifyLeaderEngineCtx(ctx)

Lock-resolver / shard-store path:
  LockResolver.resolveExpiredLock(ctx)
    → applyTxnResolution(ctx) / applyScanLockResolutions(ctx) /
      tryAbortExpiredPrimary(ctx)
    → g.Txn.Commit(ctx)

Healthz:
  S3 / SQS / DynamoDB /healthz/leader handlers
    → isVerifiedSQSLeader(ctx, …) / isVerifiedDynamoLeader(ctx, …) /
      S3Server.isVerifiedS3Leader(ctx) /
      DistributionServer.verifyCatalogLeader(ctx)
    → coordinator.VerifyLeader(ctx) / VerifyLeaderForKey(ctx, key)

Admin LeaderProbe:
  /admin/healthz/leader handler → LeaderProbe.IsVerifiedLeader(r.Context())
    → coordinate.VerifyLeader(ctx)
```

Also fixes the inline `context.Background()` at `kv/transaction.go:152`
(`proposer.Propose`) — same shape as the original `verifyLeaderEngine`
bug, on the propose path.

## Behaviour

For callers with upstream deadlines:

| caller | before | after |
|---|---|---|
| Redis BLPOP `timeout=2s` on slow ReadIndex | fails after 5s (verifyLeaderTimeout) | fails after 2s (caller's deadline) |
| Caddy active probe `timeout=1s` | fails after 5s | fails after 1s |
| gRPC `Dispatch(ctx)` with explicit deadline | dropped to 5s | honours ctx deadline |

For background callers without upstream ctx:

- LockResolver tick, HLC lease renewer, etc. continue to hit
`verifyLeaderEngine()` (no-arg variant) which still wraps with
`context.WithTimeout(context.Background(), verifyLeaderTimeout)`. PR
#745's 5s bound is unchanged for them.

## Batched-commit caveat

`TransactionManager.commitRaw` merges many callers into a single propose
via a separate goroutine, so no single ctx can bound the underlying
`applyRequests`. The fix:

- Wait site in `commitRaw` selects between `item.done` and `ctx.Done()`
so per-caller cancellation works (caller exits early; the propose still
completes; other waiters in the same batch get their results normally).
- Goroutine's `applyRequests` call uses `context.Background` by design,
documented in the code.
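
A hedged sketch of that wait site (`done` stands in for `item.done`; the batching goroutine and result plumbing are elided):

```go
func waitForBatchResult(ctx context.Context, done <-chan error) error {
	select {
	case err := <-done:
		// The shared propose (running on context.Background) completed;
		// every waiter in the batch receives its result.
		return err
	case <-ctx.Done():
		// This caller stops waiting; the propose continues and the other
		// waiters in the same batch are unaffected.
		return ctx.Err()
	}
}
```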

## Self-review (5 lenses)

1. **Data loss** — none. ctx is value-passed; no persisted state
changes. `verifyLeaderEngineCtx` is the existing path; only the caller
of the no-arg variant changes.
2. **Concurrency** — ctx flows through the existing locking unchanged.
`commitRaw` goroutine intentionally uses Background to avoid mixing
per-caller deadlines into a shared batched propose; documented.
3. **Performance** — same call count, no extra allocations beyond the
ctx value passed by reference. `context.WithTimeout` wrapping in the
no-ctx fallback is unchanged.
4. **Data consistency** — verify is a freshness check, not a write path.
Shorter caller deadlines surface `ErrLeaderNotFound` earlier instead of
waiting out the 5s safety bound.
5. **Test coverage** — every Transactional / Coordinator / LeaderProbe
stub gains the ctx parameter (currently `_ context.Context`, ready for
future tests asserting cancel propagation). 16 stubs / signatures
updated; the stub shape is sketched after this list.
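
Hypothetical stub shape after the signature change (names and return values are stand-ins; the real stubs live in the test files listed in the commits above):

```go
type stubTransactional struct{ resp *TransactionResponse }

// ctx is accepted but deliberately unused for now, leaving room for
// future tests that assert cancel propagation.
func (s *stubTransactional) Commit(_ context.Context, _ []*pb.Request) (*TransactionResponse, error) {
	return s.resp, nil
}

func (s *stubTransactional) Abort(_ context.Context, _ []*pb.Request) (*TransactionResponse, error) {
	return s.resp, nil
}
```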

## Test plan

- [x] `go test -race -count=1 -short ./kv ./adapter ./internal/admin` —
all green
  - kv: 6.6s
  - adapter: 565s (long Jepsen-shaped tests)
  - internal/admin: 1.4s
- [x] No semantic change beyond the ctx propagation; all existing tests
pass without modification of their assertions.

## Follow-up scope (not in this PR)

The audit also surfaced ~210 `context.Background()` usages in tree,
mostly concentrated in:

| file | count | nature |
|---|---|---|
| adapter/redis_compat_commands.go | 66 | dispatch helpers, internal commands |
| adapter/redis_lua_context.go | 25 | Lua script context loaders |
| adapter/redis.go | 25 | handler context fallback paths |
| proxy/* | 28 | proxy package, separate concern |

These are "easy win" candidates where a `ctx` is in scope but
`Background()` was used instead. Not bundled here to keep the PR
reviewable; a follow-up "audit pass" PR targeting Redis adapter helpers
is the natural next step.

The single `context.TODO()` (kv/fsm.go) is structural — raft `Apply()`
does not provide a ctx — and is left as is.

## Design doc

`docs/design/2026_05_10_proposed_kv_ctx_plumbing.md` — included in this
PR.


## Summary by CodeRabbit

* **Refactor**
* Updated internal APIs to propagate request context through leader
verification and transaction operations, enabling request deadlines to
be respected instead of using fixed internal timeouts.

* **Documentation**
* Added design document outlining context plumbing through KV write
paths and leader-probe verification across adapters.
