
refactor(redis): bound rangeList / readValueAt VerifyLeaderForKey to per-call ctx #752

Open
bootjp wants to merge 1 commit into main from fix/redis-handler-ctx-bound

Conversation

@bootjp
Owner

@bootjp bootjp commented May 10, 2026

Summary

  • Redis adapter follow-up to PR #749 (refactor(kv): plumb caller context through write + verify-leader paths). rangeList and readValueAt now accept a ctx context.Context parameter and forward it to VerifyLeaderForKey instead of using the long-lived r.handlerContext().
  • lrange wraps r.handlerContext() with redisDispatchTimeout and passes it down. Inside MULTI/EXEC, txnContext carries the EXEC-scoped dispatch ctx (set at construction in runTransaction), so load() → readValueAt() respects the caller's deadline.
  • Not a correctness fix: the verifyLeaderEngineCtx default-deadline guard introduced in PR #749 r1 already bounded these paths at 5 s. This change brings them under the same per-command dispatch budget as the rest of the adapter, so a slow leader probe can't outlive the user-visible command timeout via the 5 s fallback.

Background

PR #749 review (gemini/Codex) flagged that keys() and FLUSHDB had escaped the dispatch deadline by feeding r.handlerContext() (server baseCtx, no deadline) into VerifyLeader*. Those were fixed inline. Two remaining call sites — rangeList (LRANGE leader path) and readValueAt (single-key reads from EXEC bodies and string GETs) — were called out as a non-blocking follow-up because the new verifyLeaderEngineCtx 5 s default already protected against the unbounded-wait regression. This PR tightens them to match.

Why txnContext.ctx

txnContext is the value type that wraps one in-flight EXEC body. load() is called repeatedly during EXEC and may transitively call readValueAt(). Plumbing ctx through each method would mean threading it through every working[…] access — the struct is exactly the right scope to carry the EXEC dispatch ctx, so it gets a ctx field with a //nolint:containedctx annotation explaining the rationale.

Test plan

  • go test -race -count=1 -short ./adapter — 531s, all green
  • go vet ./... — clean
  • CI green

Summary by CodeRabbit

  • Bug Fixes
    • Transactions now properly respect per-call timeout deadlines during execution, ensuring operations complete within expected time constraints.
    • Command execution timeout behavior is now consistent across operations, with better deadline handling for data lookups and leader verification.

Review Change Stack

…per-call ctx

Follow-up to PR #749 (Redis adapter audit).

Before: rangeList and readValueAt called
VerifyLeaderForKey(r.handlerContext(), key). handlerContext is
the long-lived server baseCtx with no deadline; a stalled
ReadIndex would block the call until the
verifyLeaderEngineCtx defense-in-depth tripped its 5 s default
deadline.

After: both functions accept a ctx parameter and forward it to
VerifyLeaderForKey, same shape as the keys() and FLUSHDB fixes
in PR #749 r1. Callers wrap r.handlerContext() with
redisDispatchTimeout (LRANGE) or thread the EXEC dispatch ctx
through txnContext (load → readValueAt inside a transaction).

txnContext gains a ctx field carrying the EXEC-scoped dispatch
ctx so reads invoked inside the transaction respect the
caller's deadline. The struct holds it across the EXEC body
(documented in-place with the //nolint:containedctx rationale —
EXEC is a value-typed wrapper around a single in-flight client
command, ctx must travel with it).

This is a tightening, not a correctness fix: the
verifyLeaderEngineCtx default-deadline guard already bounded
these calls at 5 s. The change brings them under the same
per-command dispatch budget as the rest of the Redis adapter so
a slow leader probe can't blow past the user-visible command
timeout the way a 5 s fallback could.

Test:
  go test -race -count=1 -short ./adapter -- 531s, all green.
@coderabbitai

coderabbitai Bot commented May 10, 2026

📝 Walkthrough

This PR threads per-request dispatch contexts through transaction execution and command handlers so that reads and leader verification operations honor the caller's redisDispatchTimeout deadline instead of relying on the server's long-lived handler context.

Changes

Context Threading for Deadline-Aware Operations

| Layer | File(s) | Summary |
| --- | --- | --- |
| Function signatures | adapter/redis.go | txnContext gains a ctx field; readValueAt accepts a ctx context.Context parameter; rangeList accepts a ctx parameter for leader verification. |
| Transaction context threading | adapter/redis.go | runTransaction initializes txnContext with ctx: dispatchCtx; txnContext.load calls readValueAt(t.ctx, ...) for deadline-aware reads. |
| LRANGE command handler | adapter/redis.go | The lrange handler creates a redisDispatchTimeout-bounded context and passes it to rangeList for leader verification. |
| Test updates | adapter/redis_retry_test.go | TestRedisEvalRetriesWriteConflict updated to pass context.Background() to readValueAt. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • bootjp/elastickv#745: Both PRs address context bounding for leader verification; this PR threads per-request contexts into VerifyLeaderForKey, while the related PR bounds the no-context path with a 5s deadline.

Poem

🐰 Context flows like a gentle stream,
Through transactions and handlers, a deadline dream,
No more fallback to endless time,
Each request gets its bounded line,
Leader checks now whisper true!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring coverage | ⚠️ Warning | Docstring coverage is 0.00%; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | Check skipped — CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: binding VerifyLeaderForKey timeouts to per-call context in rangeList and readValueAt, the core refactoring across both modified files. |
| Linked issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out-of-scope changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements context propagation for Redis command execution, specifically updating rangeList and readValueAt to honor caller deadlines. The review feedback recommends consistently using the provided context for all internal operations within these methods, such as store.GetAt and keyTypeAt, rather than falling back to context.Background(). Additionally, the feedback emphasizes acquiring MVCC timestamps after the linearizable read fence to maintain consistency.

Comment thread adapter/redis.go
```diff
 }
 
-func (r *RedisServer) rangeList(key []byte, startRaw, endRaw []byte) ([]string, error) {
+func (r *RedisServer) rangeList(ctx context.Context, key []byte, startRaw, endRaw []byte) ([]string, error) {
```
Contributor


medium

While ctx is now passed to rangeList, it is only used for the VerifyLeaderForKey call. Other context-aware operations in this method, such as keyTypeAt (line 3260), resolveListMeta (line 3279), and fetchListRange (line 3292), still use context.Background(). To fully adhere to the goal of bounding the command execution and ensuring prompt cancellation, these calls should also use the provided ctx. Additionally, per the rules for linearizable reads in MVCC, ensure the timestamp for the read is acquired after the linearizable read fence (VerifyLeaderForKey) to guarantee a consistent snapshot and prevent stale data reads.

References
  1. When designing interfaces, use context.Context for managing deadlines and cancellation instead of separate timeout parameters.
  2. When implementing linearizable reads in an MVCC system, ensure that the timestamp used for the MVCC read is acquired after the linearizable read fence has completed to guarantee a consistent snapshot view, especially for rangeList.

Comment thread adapter/redis.go
```diff
 }
 
-func (r *RedisServer) readValueAt(key []byte, readTS uint64) ([]byte, error) {
+func (r *RedisServer) readValueAt(ctx context.Context, key []byte, readTS uint64) ([]byte, error) {
```
Contributor


medium

Similar to rangeList, readValueAt now receives a ctx but continues to use context.Background() for hasExpired (line 3522) and store.GetAt (line 3538). These should be updated to use the passed ctx to ensure the entire read path respects the dispatch deadline. Furthermore, ensure that the timestamp used for the MVCC read is acquired after the linearizable read fence has completed to guarantee a consistent snapshot view and prevent stale data reads, as required for readValueAt.

References
  1. When designing interfaces, use context.Context for managing deadlines and cancellation instead of separate timeout parameters.
  2. When implementing linearizable reads in an MVCC system, ensure that the timestamp used for the MVCC read is acquired after the linearizable read fence has completed to guarantee a consistent snapshot view, especially for readValueAt.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@adapter/redis.go`:
- Around line 1764-1770: The load() method should defensively handle a nil
txnContext.ctx to avoid panics in tests that construct txnContext literals
without ctx; inside load() (the caller used by stageKeyDeletion() which loops
and calls readValueAt()), check if t.ctx is nil and substitute
context.Background() (or another appropriate base context) before passing it
into readValueAt(); this mirrors the pattern used by
streamDeletions/loadListState and ensures mocks that call context.WithTimeout
won't panic.


📥 Commits

Reviewing files that changed from the base of the PR and between 9303e52 and c1cbd8d.

📒 Files selected for processing (2)
  • adapter/redis.go
  • adapter/redis_retry_test.go

Comment thread adapter/redis.go
Comment on lines +1764 to +1770
```go
	server *RedisServer
	// ctx is the per-EXEC dispatch context (redisDispatchTimeout-bounded
	// at the call site in runTransaction). Plumbed through so reads
	// inside the EXEC such as load() → readValueAt() respect the
	// caller's deadline rather than falling back to handlerContext +
	// the verifyLeaderEngineCtx safety net.
	ctx context.Context //nolint:containedctx // EXEC is a long-lived value type that wraps a single client command, ctx must travel with it.
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Scripts executed (consolidated; each ran against bootjp/elastickv):

```shell
#!/bin/bash
# Find every txnContext composite literal in the repo and dump 12 lines after
# so we can see which fields are initialised.
rg -nPU --type=go -A 12 '(?s)&?txnContext\s*\{' adapter
echo '---'
# Cross-check call sites of stageKeyDeletion and load(internalKey) reachable
# from those fixtures.
rg -n --type=go '\bstageKeyDeletion\s*\(|\.load\s*\(' adapter

# Check if loadListState calls into stageKeyDeletion or load.
rg -n --type=go -A 15 'func \(t \*txnContext\) loadListState' adapter/redis.go | head -30

# Inspect stageKeyDeletion to see what it does.
rg -n --type=go -A 20 'func \(t \*txnContext\) stageKeyDeletion' adapter/redis.go | head -40
rg -n --type=go -A 50 'func \(t \*txnContext\) stageKeyDeletion' adapter/redis.go | head -60

# Inspect load() and how it forwards ctx into readValueAt.
rg -n --type=go -A 8 'func \(t \*txnContext\) load\(key \[\]byte\)' adapter/redis.go | head -20
rg -n --type=go -A 25 'func \(t \*txnContext\) load\(key \[\]byte\)' adapter/redis.go | grep -A 25 "1847:"
rg -n --type=go -A 40 'func \(t \*txnContext\) load\(key \[\]byte\)' adapter/redis.go | tail -20

# Locate readValueAt and dump the lines around the forwarding call.
rg -n --type=go 'func.*readValueAt' adapter/redis.go
sed -n '1877,1883p' adapter/redis.go | cat -n
```


Add a nil-safe fallback for t.ctx in load() to support test fixtures that omit the field.

Test fixtures in redis_txn_test.go (lines 37, 73, 108, 146) construct txnContext literals without ctx. When stageKeyDeletion() is called—for example, during key deletion in a transaction—it invokes t.load(internalKey) in a loop (lines 2380–2392), which passes t.ctx to readValueAt() at line 1879. If ctx is nil, any coordinator implementation or mock that uses context.WithTimeout(ctx, …) would panic.

Currently, test fixtures only call loadListState(), which uses context.Background() internally, so the issue is latent. However, this matches the existing pattern used for streamDeletions (line 2399): "test fixtures that build a minimal txnContext literal without this field still work." A one-line defensive fallback in load() preserves that design principle and adds robustness:

Suggested fix
```diff
 	} else {
 		var err error
-		val, err = t.server.readValueAt(t.ctx, storageKey, t.startTS)
+		ctx := t.ctx
+		if ctx == nil {
+			ctx = context.Background()
+		}
+		val, err = t.server.readValueAt(ctx, storageKey, t.startTS)
 		if err != nil && !errors.Is(err, store.ErrKeyNotFound) {
 			return nil, errors.WithStack(err)
 		}
 	}
```

