feat(sqs): tombstone reaper enumerates partitioned keyspace (Phase 3.D PR 6a) #735
Conversation
…D PR 6a)

Addresses the Codex P2 review on PR #732 round 0 ("Reap partitioned dedup records to prevent key growth"), deferred at the time as a "PR 6 concern". With PR 5b-3 lifting the dormancy gate, partitioned FIFO queues can now land in production — but until this PR the tombstone reaper only swept the legacy keyspace, so partitioned data / vis / byage / dedup / group rows leaked permanently after DeleteQueue or PurgeQueue.

Scope (the focused first piece of §11 PR 6 from docs/design/2026_04_26_proposed_sqs_split_queue_fifo.md):

- Tombstone schema: tombstone value now encodes PartitionCount so the reaper can drive partition iteration off the tombstone alone (the meta record is gone by the time the reaper observes the tombstone). Encoding is 8 bytes big-endian uint64. Backward compat: PartitionCount<=1 keeps the byte-identical legacy []byte{1} sentinel; any non-canonical value (empty, 1-byte, 4-byte, out-of-range 8-byte) decodes to PartitionCount=1 so a partial rollback or future encoding revision degrades safely to legacy single-partition reap behaviour. (A codec sketch follows this message.)
- DeleteQueue (sqs_catalog.go) and PurgeQueue (sqs_purge.go) write the encoded value, reading PartitionCount from the pre-delete / pre-purge meta. PartitionCount is immutable across SetQueueAttributes / PurgeQueue (§3.2) so the post-purge meta and the pre-purge tombstone always agree on the count.
- Tombstone reaper: reapTombstonedGeneration takes the decoded PartitionCount, sweeps the legacy keyspace (always — covers pre-HT-FIFO queues plus a defensive "did we accidentally write legacy?" pass) and, on PartitionCount > 1, ALSO sweeps every partition's byage / dedup / group prefix family via the new reapPartitionedGeneration helper. allDone gating delays tombstone deletion until every partition is fully drained, so a wide-fanout queue may need multiple reaper ticks but cannot leave the tombstone-deleted-but-data-lingering inconsistency.
- Per-record dispatch: reapOneRecord and buildReapOps now take *sqsQueueMeta + partition uint32 instead of hardcoding nil + 0. Legacy callers pass nil + 0 (byte-identical to the pre-PR-5b layout); the partitioned twin reapOneRecordPartitioned synthesises a meta carrying PartitionCount > 1 so the dispatch helpers route to sqsPartitionedMsg{Data,Vis,Group}Key.
- New parsed-key helpers in sqs_keys.go:
  - sqsPartitionedMsgByAgePrefixForPartition(queue, partition, gen)
  - sqsPartitionedMsgDedupKeyPrefix(queue, partition, gen)
  - sqsPartitionedMsgGroupKeyPrefix(queue, partition, gen)
  Each returns the (queue, partition, gen)-bound scan prefix the reaper feeds deleteAllPrefix / reapDeadByAgePartition.

Out of scope (follow-up PR 6b):

- Live-queue retention reap (reapQueue) still iterates only the legacy byage prefix; partitioned-byage live retention is the next slice. The tombstone-cohort path this PR addresses is the immediate Codex P2 concern (DeleteQueue + reaper leaving partitioned dedup records behind).
- PurgeQueue / DeleteQueue do not iterate partition keyspace synchronously; the tombstone-driven async sweep is sufficient for the AWS-shaped semantics today, and the design doc does not require synchronous per-partition deletion.

Caller audit:

- encodeQueueTombstoneValue: two production callers (DeleteQueue, PurgeQueue), both pass the queue's PartitionCount.
- decodeQueueTombstoneValue: one production caller (reaper); legacy / out-of-range fallback to 1 documented and tested.
- reapOneRecord (signature changed): three production callers — reapDeadByAgePage (legacy), reapDeadByAgePartitionPage (via reapOneRecordPartitioned, partitioned), reapQueue's over-retention sweep (legacy). All updated; partitioned path uses the synthetic meta to flip the dispatch helper branch.
- reapPartitionedGeneration / reapDeadByAgePartition / reapDeadByAgePartitionPage: each has exactly one caller in the new tombstone-reap path.

Self-review (CLAUDE.md):

1. Data loss — Closes the Codex P2 leak; no path now leaves partitioned records orphaned after DeleteQueue/PurgeQueue. Legacy queues unchanged: tombstone value, reaper sweep prefixes, and dispatch helpers all keep the byte-identical pre-PR-6a behaviour for PartitionCount<=1.
2. Concurrency / distributed failures — Reaper still runs only on the leader (existing reaperCtx). Tombstone read is at the reaper's snapshot readTS; every individual reap is its own OCC dispatch with ErrWriteConflict treated as success (message touched by another path). Per-partition iteration is sequential; no new shared state.
3. Performance — Per-tick cost grows from O(legacy prefix scans) to O((partition_count + 1) * legacy prefix scans). Capped by sqsReaperPerQueueBudget per partition. PartitionCount=1 queues see no extra cost (the partitioned branch short-circuits on partitionCount<=1).
4. Data consistency — Tombstone deletion gated on full-drain across all partition + legacy prefix families. Partial drains leave the tombstone in place so the next reaper tick can resume. PartitionCount immutability invariant means the tombstone-encoded count and the reaper's iteration bound can never disagree with the live queue.
5. Test coverage — 4 unit tests on the tombstone-value codec (legacy roundtrip, partitioned encode, fallback for non-canonical values, out-of-range clamp). 1 wire-level integration test that creates a partitioned queue, sends to 6 distinct groups, calls DeleteQueue, hand-runs the reaper, and asserts every partitioned data / vis / byage / dedup / group prefix is empty + the tombstone itself is deleted.
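A minimal sketch of the codec described in the tombstone-schema bullet, in its round-0 shape (the decode is later refactored in review). The constant value `htfifoMaxPartitions = 32` is taken from the review thread below; the literal 8 and exact function bodies are assumptions, not the repository's actual code:

```go
package adapter

import "encoding/binary"

const htfifoMaxPartitions = 32 // value quoted in the review thread

// encodeQueueTombstoneValue: PartitionCount<=1 keeps the byte-identical
// legacy []byte{1} sentinel; otherwise 8 bytes big-endian uint64.
func encodeQueueTombstoneValue(partitionCount uint32) []byte {
	if partitionCount <= 1 {
		return []byte{1}
	}
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, uint64(partitionCount))
	return buf
}

// decodeQueueTombstoneValue degrades every non-canonical value (empty,
// 1-byte, 4-byte, out-of-range 8-byte) to 1, i.e. legacy reap behaviour.
func decodeQueueTombstoneValue(value []byte) uint32 {
	if len(value) != 8 {
		return 1
	}
	pc := binary.BigEndian.Uint64(value)
	if pc == 0 || pc > uint64(htfifoMaxPartitions) {
		return 1
	}
	return uint32(pc) // bounded by htfifoMaxPartitions just above
}
```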
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
@claude review |
Claude finished @bootjp's task in 5m 23s — Code Review — PR #735
📝 Walkthrough

This PR implements partition-count-encoded tombstones and partition-aware reaping for HT-FIFO partitioned queues. Queue deletions and purges now encode the queue's partition count into the tombstone value; the reaper decodes this count and performs targeted partition-scoped cleanup of byage, dedup, and group-lock key families before removing the tombstone.

Changes

Partition-Count Tombstones and Partition-Aware Reaping
Sequence Diagram

```mermaid
sequenceDiagram
participant Client as Client
participant Adapter as SQS Adapter
participant Store as KV Store
participant Reaper as Tombstone Reaper
Client->>Adapter: DeleteQueue(queue)
Note over Adapter: Load queue meta (with PartitionCount)
Adapter->>Adapter: Encode tombstone value<br/>encodeQueueTombstoneValue(PartitionCount)
Adapter->>Store: Put tombstone key with encoded value
Note over Reaper: Background reap tick
Reaper->>Store: Scan tombstone keys
Reaper->>Store: Fetch tombstone value
Reaper->>Reaper: Decode PartitionCount<br/>from tombstone value
alt PartitionCount > 1
loop For each partition [0, PartitionCount)
Reaper->>Store: Delete byage keys (partitioned prefix)
Reaper->>Store: Delete dedup keys (partitioned prefix)
Reaper->>Store: Delete group-lock keys (partitioned prefix)
end
end
Note over Reaper: Legacy sweep (if needed)
Reaper->>Store: Delete legacy byage/dedup/group keys
Reaper->>Store: Delete tombstone key
Reaper-->>Client: Queue fully reaped
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 5 passed
Code Review
This pull request enables the SQS tombstone reaper to drain partitioned queues by encoding the partition count into tombstone values and introducing partition-aware cleanup logic. Feedback identifies a potential resource exhaustion issue where the work budget is not correctly shared across partitions and suggests refactoring tombstone deletion to use more descriptive helpers.
```go
// Partitioned sweep: one (byage, dedup, group) triple per
// partition. Each triple shares the per-queue budget with the
// legacy sweep, so a wide-fanout queue may need multiple reaper
// ticks to fully drain — same contract as the live-queue reap.
```
The comment states that each triple 'shares the per-queue budget with the legacy sweep', but the implementation of reapPartitionedGeneration and its helpers each initialize their own processed counter. This results in a total work per tick of O(partitionCount * budget). To prevent potential resource exhaustion and ensure bounds are strictly enforced, please update the implementation to share the budget or correct the documentation to reflect the actual behavior.
References
- To prevent unbounded memory growth and potential OOM issues, apply a fixed bound to collections that can grow from external requests. Reject new requests when the bound is reached.
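To make the distinction in this comment concrete, here is a hedged sketch of the budget-sharing shape it asks for. All names (`sweepPrefix`, `perQueueBudget`) are hypothetical stand-ins, not the PR's actual helpers, and the sweep body is a placeholder:

```go
package reaper

const perQueueBudget = 1024 // hypothetical per-queue cap

// sweepPrefix drains up to *budget records under one prefix family,
// decrementing the SHARED counter; returns true once the prefix is empty.
func sweepPrefix(prefix string, budget *int) bool {
	// ... scan + delete records under prefix, decrementing *budget per
	// record; return true only when the prefix is fully drained ...
	if *budget == 0 {
		return false
	}
	*budget--
	return true
}

// One budget threads through the legacy sweep and every partition, so
// total work per tick is O(budget), not O(partitionCount * budget).
func reapTombstonedGeneration(partitionCount uint32) (allDone bool) {
	budget := perQueueBudget
	allDone = sweepPrefix("legacy", &budget)
	for p := uint32(0); p < partitionCount && budget > 0; p++ {
		for _, fam := range []string{"byage", "dedup", "group"} {
			allDone = sweepPrefix(fam, &budget) && allDone
		}
	}
	return allDone // false leaves the tombstone for the next tick
}
```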
```go
	allDone = allDone && partDone
}
if allDone {
	_ = s.dispatchDedupDelete(ctx, tombstoneKey, readTS)
```
Using dispatchDedupDelete to delete the tombstone key is slightly confusing as the tombstone is not a deduplication record. To improve maintainability and follow the practice of using appropriate centralized helpers, consider using a more generic helper or calling coordinator.Dispatch directly with a kv.Del operation.
References
- Centralize common complex logic, such as retry/backoff mechanisms, into shared helper functions to avoid code duplication and improve maintainability.
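Sketched, the suggestion reads roughly as below. `coordinator.Dispatch` and the `kv.Del` operation are named in the comment itself; the helper name and request shape here are hypothetical:

```go
// Hypothetical shape of the suggested change in sqs_reaper.go: a
// purpose-named delete through the generic dispatch path instead of
// reusing the dedup-specific helper for a non-dedup key.
if allDone {
	_ = s.dispatchReapDelete(ctx, tombstoneKey, readTS) // or coordinator.Dispatch with a kv.Del op
}
```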
Actionable comments posted: 1
🧹 Nitpick comments (1)
adapter/sqs_keys.go (1)
169-179: ⚡ Quick win — Drop the new `//nolint:gosec` by decoding into `uint32` directly.

This branch is already bounded to `htfifoMaxPartitions`, so the suppression is avoidable. Splitting the 8-byte payload into high/low `uint32`s keeps the same wire format and removes the new lint escape hatch.

♻️ Proposed refactor

```diff
 func decodeQueueTombstoneValue(value []byte) uint32 {
 	if len(value) != sqsGenerationSuffixLen {
 		return 1
 	}
-	pc := binary.BigEndian.Uint64(value)
-	if pc == 0 || pc > uint64(htfifoMaxPartitions) {
+	hi := binary.BigEndian.Uint32(value[:4])
+	pc := binary.BigEndian.Uint32(value[4:])
+	if hi != 0 || pc == 0 || pc > htfifoMaxPartitions {
 		return 1
 	}
-	// Bounded by htfifoMaxPartitions (=32) immediately above, so
-	// the uint32 narrow cannot overflow.
-	return uint32(pc) //nolint:gosec // bounded by htfifoMaxPartitions
+	return pc
 }
```

As per coding guidelines, "Use `gofmt` formatting and pass linters configured in `.golangci.yaml` (gocritic, gocyclo, gosec, wrapcheck, mnd, etc.). Avoid `//nolint` annotations — refactor instead."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@adapter/sqs_keys.go` around lines 169 - 179, The function decodeQueueTombstoneValue currently reads an 8-byte uint64 and uses //nolint:gosec; instead, decode directly into a uint32 to avoid the lint suppression: in decodeQueueTombstoneValue keep the length check against sqsGenerationSuffixLen, then extract the lower 4 bytes (using binary.BigEndian.Uint32 on value[4:]) (or the appropriate 4-byte slice preserving the wire format) into a uint32, validate that it is non-zero and <= htfifoMaxPartitions, and return that uint32; remove the //nolint:gosec comment and the uint64 path entirely so no narrowing lint is needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@adapter/sqs_partitioned_dispatch_test.go`:
- Around line 621-649: Modify countPartitionedRows to accept a partitions int
parameter (replace the hard-coded 8) and include scans for the data and vis
keyspaces in addition to byage/dedup/group. Keep the existing readTS logic
(node.sqsServer.nextTxnReadTS) and prefixScanEnd usage, add the family-level
all-partitions prefixes for data/vis (e.g.
sqsPartitionedMsgDataPrefixForQueueAllPartitions(queueName) and
sqsPartitionedMsgVisPrefixForQueueAllPartitions(queueName)) to prefixes, and
iterate partition from 0 to partitions-1 to append
sqsPartitionedMsgDedupKeyPrefix(queueName, partition, gen) and
sqsPartitionedMsgGroupKeyPrefix(queueName, partition, gen). Update the two call
sites in this test to pass the actual partitions value. Ensure require.NoError
remains around each ScanAt call and total sums len(rows) as before.
---
Nitpick comments:
In `@adapter/sqs_keys.go`:
- Around line 169-179: The function decodeQueueTombstoneValue currently reads an
8-byte uint64 and uses //nolint:gosec; instead, decode directly into a uint32 to
avoid the lint suppression: in decodeQueueTombstoneValue keep the length check
against sqsGenerationSuffixLen, then extract the lower 4 bytes (using
binary.BigEndian.Uint32 on value[4:]) (or the appropriate 4-byte slice
preserving the wire format) into a uint32, validate that it is non-zero and <=
htfifoMaxPartitions, and return that uint32; remove the //nolint:gosec comment
and the uint64 path entirely so no narrowing lint is needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 76940a69-4c8d-4c68-b80d-0101a17c1008
📒 Files selected for processing (6)
- adapter/sqs_catalog.go
- adapter/sqs_keys.go
- adapter/sqs_partitioned_dispatch_test.go
- adapter/sqs_purge.go
- adapter/sqs_reaper.go
- adapter/sqs_tombstone_value_test.go
```go
// countPartitionedRows sums the rows under every partitioned
// data / vis / byage / dedup / group prefix for a (queue, gen)
// cohort. Used by the tombstone-reaper integration test to assert
// the cohort is fully drained without enumerating individual keys.
func countPartitionedRows(t *testing.T, node Node, queueName string, gen uint64) int {
	t.Helper()
	ctx := context.Background()
	readTS := node.sqsServer.nextTxnReadTS(ctx)
	prefixes := [][]byte{
		// Match every partition by using the family-level "all
		// partitions" prefix where one exists; for dedup / group
		// the family-level prefix isn't pre-built, so iterate
		// partitions [0, 4) explicitly.
		sqsPartitionedMsgByAgePrefixForQueueAllPartitions(queueName),
	}
	for partition := uint32(0); partition < 8; partition++ {
		prefixes = append(prefixes,
			sqsPartitionedMsgDedupKeyPrefix(queueName, partition, gen),
			sqsPartitionedMsgGroupKeyPrefix(queueName, partition, gen),
		)
	}
	total := 0
	for _, prefix := range prefixes {
		rows, err := node.sqsServer.store.ScanAt(ctx, prefix, prefixScanEnd(prefix), 1024, readTS)
		require.NoError(t, err)
		total += len(rows)
	}
	return total
}
```
Expand this helper to scan the actual data/vis keyspace.
countPartitionedRows currently ignores partitioned data and vis rows, so this test can still pass if the reaper drops byage/dedup/group but leaks message records. The hard-coded 8 also makes the helper undercount future callers with a different partition count.
💡 Tighten the helper to cover the full cohort

```diff
-func countPartitionedRows(t *testing.T, node Node, queueName string, gen uint64) int {
+func countPartitionedRows(t *testing.T, node Node, queueName string, gen uint64, partitionCount uint32) int {
 	t.Helper()
 	ctx := context.Background()
 	readTS := node.sqsServer.nextTxnReadTS(ctx)
-	prefixes := [][]byte{
-		// Match every partition by using the family-level "all
-		// partitions" prefix where one exists; for dedup / group
-		// the family-level prefix isn't pre-built, so iterate
-		// partitions [0, 4) explicitly.
-		sqsPartitionedMsgByAgePrefixForQueueAllPartitions(queueName),
-	}
-	for partition := uint32(0); partition < 8; partition++ {
+	prefixes := make([][]byte, 0, int(partitionCount)*5)
+	for partition := uint32(0); partition < partitionCount; partition++ {
 		prefixes = append(prefixes,
+			sqsPartitionedMsgDataKey(queueName, partition, gen, ""),
+			sqsPartitionedMsgVisPrefixForQueue(queueName, partition, gen),
+			sqsPartitionedMsgByAgePrefixForPartition(queueName, partition, gen),
 			sqsPartitionedMsgDedupKeyPrefix(queueName, partition, gen),
 			sqsPartitionedMsgGroupKeyPrefix(queueName, partition, gen),
 		)
 	}
 	total := 0
```

Please also pass `partitions` at the two call sites in this test.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@adapter/sqs_partitioned_dispatch_test.go` around lines 621 - 649, Modify
countPartitionedRows to accept a partitions int parameter (replace the
hard-coded 8) and include scans for the data and vis keyspaces in addition to
byage/dedup/group. Keep the existing readTS logic (node.sqsServer.nextTxnReadTS)
and prefixScanEnd usage, add the family-level all-partitions prefixes for
data/vis (e.g. sqsPartitionedMsgDataPrefixForQueueAllPartitions(queueName) and
sqsPartitionedMsgVisPrefixForQueueAllPartitions(queueName)) to prefixes, and
iterate partition from 0 to partitions-1 to append
sqsPartitionedMsgDedupKeyPrefix(queueName, partition, gen) and
sqsPartitionedMsgGroupKeyPrefix(queueName, partition, gen). Update the two call
sites in this test to pass the actual partitions value. Ensure require.NoError
remains around each ScanAt call and total sums len(rows) as before.
Addresses Claude's 4 review items on the round-0 commit. None
behaviour-changing; all narrow code-quality / clarity fixes.
1. Remove the //nolint:gosec on the uint64→uint32 narrow in
decodeQueueTombstoneValue (Claude Issue 1). CLAUDE.md is
explicit: "Avoid //nolint — refactor instead." Refactored to
read the 8-byte tombstone value as two big-endian uint32s
instead of a single uint64, then check the high half is zero
(must be — encoder only writes PartitionCount<=htfifoMaxPartitions
into the low 32 bits) and the low half is in
[1, htfifoMaxPartitions]. No narrowing conversion → no gosec
flag → no nolint annotation.
2. reapOneRecordPartitioned's synthetic meta uses
PartitionCount=2 instead of htfifoMaxPartitions (=32) as the
dispatch sentinel (Claude Issue 2). The dispatch helpers only
branch on PartitionCount>1; using the maximum implies the
synthetic meta carries the queue's real count, which it does
not. A future caller that read meta.PartitionCount for any
other purpose would silently get the wrong answer.
3. Stale comment fix in countPartitionedRows test helper
(Claude Issue 3): "[0, 4)" → "[0, 8)" to match the actual
loop bound and the test queue's PartitionCount=8.
4. New sqsTombstoneValueLen=8 constant for the tombstone value
size (Claude Issue 4). Was reusing sqsGenerationSuffixLen
which is a key-layout constant — incidentally also 8 today
but conceptually unrelated. Dedicated constant makes the
intent self-documenting and decouples the two: a future
tombstone-value encoding revision can change the value size
without touching key parsers.
Below threshold, intentionally not addressed:
- Gemini medium "shared budget" comment vs. per-partition
budget implementation (sqs_reaper.go:182). The PR description
explicitly states "O((partition_count + 1) × legacy prefix
scans)" — the per-partition budget IS the design. Comment was
already correct in describing the intent. This is a wording
nit at most; not P0/P1/high/critical/major.
- Gemini medium "use generic Dispatch instead of dispatchDedupDelete
for tombstone deletion" (sqs_reaper.go:193). Style suggestion;
the current call works correctly. Below threshold.
Verified: build clean, golangci-lint 0 issues, targeted test
sweep (-race) all green.
@claude review Round 1 — addressed Claude's 4 review items from round 0. None behaviour-changing.
Below threshold (intentionally not addressed):
Build clean,
Claude finished @bootjp's task in 4m 19s — Code Review — PR #735 Round 1
All four Round 0 issues are correctly addressed. No new issues introduced. Fix 1 —
## Summary
Phase 3.D PR 7a — adds the
`elastickv_sqs_partition_messages_total{queue, partition, action}`
Prometheus counter so dashboards and alerts can spot uneven
`MessageGroupId` distributions across partitioned-FIFO queues.
Metrics-only: the Jepsen HT-FIFO workload (PR 7b) ships separately so
the two review loops do not cross.
- `monitoring/sqs.go` (new): `SQSMetrics` with cardinality cap
(`sqsMaxTrackedQueues = 512`, overflow collapses to `_other`) mirroring
`DynamoDBMetrics`. Drops empty queue names and unknown action labels so
a future call-site bug cannot pollute the series space dashboards have
to learn about. Nil-receiver-safe so adapter call sites do not need to
nil-guard (see the sketch after this list).
- `adapter/sqs.go`: `SQSPartitionObserver` interface +
`WithSQSPartitionObserver` option. Re-declared in `adapter` so it
doesn't import `monitoring` at the package boundary (matches the
DynamoDB/Redis observer pattern). Action constants
(`send`/`receive`/`delete`) re-declared on the adapter side and
validated at runtime by the monitoring side — drift between the two
surfaces as a dropped observation, not a wedge.
- `adapter/sqs_fifo.go`, `adapter/sqs_messages.go`: emit the counter on
the **partitioned** commit branch only (`PartitionCount > 1`) for send /
receive / delete. Legacy single-partition queues stay off the metric
since partition is always 0 and the cardinality cost would buy nothing.
- `monitoring/registry.go`, `main_sqs.go`, `main.go`: wire the
registry's `SQSPartitionObserver()` into `startSQSServer` so the SQS
server picks up the production observer on cluster boot. Test fixtures
and CLI tools that build `SQSServer` without a registry pass `nil` and
the metric stays at zero.
## Tests
`monitoring/sqs_test.go` (new, 6 cases):
- `TestSQSMetrics_ObservePartitionMessage_IncrementsByLabelTriple` — pin
the `(queue, partition, action)` counter contract.
- `TestSQSMetrics_ObservePartitionMessage_DropsInvalidAction` — pin the
typo guard against future drift between adapter and monitoring
constants.
- `TestSQSMetrics_ObservePartitionMessage_DropsEmptyQueue` — pin that an
empty queue name does not collapse with valid observations onto a shared
series.
- `TestSQSMetrics_NilReceiverIsSafe` — pin the nil-receiver
short-circuit the adapter relies on.
- `TestSQSMetrics_QueueLabelOverflow` — pin the cap-and-collapse so a
misbehaving caller cannot exhaust the Prometheus series budget.
- `TestSQSMetrics_RegistryWiring` — pin that the public `Registry`
exposes the metric under the documented name.
## Self-review (5 lenses)
1. **Data loss** — N/A; metrics-only, no storage / Raft / FSM touch.
2. **Concurrency** — counter increments are atomic via Prometheus; the
`trackedQueues` map is only consulted from the dispatch-success path
under the SQS server's existing concurrency model. No new locks.
3. **Performance** — one map lookup + one `CounterVec` lookup per
partitioned send/receive/delete on the success branch. Legacy queues
skip the call entirely. Cardinality bounded at 512 queue × 32 partition
(`htfifoMaxPartitions`) × 3 action ≈ 49k series worst case; in practice
a 32-partition queue yields 96 series, so the budget is plenty for the
SLO panels.
4. **Data consistency** — the metric is observed AFTER OCC dispatch
succeeds, so the counter reflects committed state. Receive/delete
branches that return on retryable errors deliberately do not increment
(the retry path will observe on the eventual success).
5. **Test coverage** — 6 unit tests in `monitoring/`, plus the
adapter-side nil-observer path is exercised by all existing
partitioned-FIFO tests in `adapter/sqs_partitioned_dispatch_test.go`
(they pass `nil` observer through the test fixture).
## Test plan
- [x] `go test -race -count=1 ./monitoring/...`
- [x] `go test -race -count=1 -run 'TestSQS' ./adapter/...`
- [x] `go test -race -count=1 ./...` (full suite)
- [x] `golangci-lint --config=.golangci.yaml run ./...` (full repo)
- [ ] Jepsen HT-FIFO workload — deferred to PR 7b
## Refs
- `docs/design/2026_05_01_partial_split_queue_fifo.md` §11 PR 7
- Builds on PR 5b-3 (#734) capability gate, PR 6a (#735) tombstone
reaper, PR 6b (#736 in flight) live-queue reaper.
….D PR 6b) (#736)

## Summary
Phase 3.D PR 6b: live-queue reaper enumerates partitioned keyspace. Follow-up to PR #735 (6a) — the tombstone-driven sweep already walks partitioned data on DeleteQueue / PurgeQueue, but the live-queue retention reaper still only saw the legacy keyspace, so:

- retention-expired messages on partitioned queues leaked their data / vis / byage / group rows forever (`reapQueue` walked `sqsMsgByAgePrefixAllGenerations` only),
- expired dedup records on partitioned FIFO queues leaked forever (`reapExpiredDedup` scanned `SqsMsgDedupPrefix` only — empty for partitioned queues since `sqsMsgDedupKeyDispatch` routes their writes under `SqsPartitionedMsgDedupPrefix`).

Closes the live-queue half of the Codex P2 from PR #732 round 0; PR 6a covered the tombstoned-cohort half.

## What changes
- **`reapQueue`**: legacy byage walk extracted as `reapQueueLegacy` (byte-identical to pre-PR-6b for non-partitioned queues). Adds `reapQueuePartition` step that runs once per partition for `PartitionCount > 1` queues. Per-partition budget per the §6 design ("partitions × budget per cycle"); 30s tick interval comfortably absorbs. (Control flow sketched after this message.)
- **`reapPartitionedPage`**: partitioned twin of `reapPage`. Same live-vs-orphan classification, but parses each entry with `parseSqsPartitionedMsgByAgeKey` and routes the dispatch through `reapOneRecordPartitioned`.
- **`classifyPartitionedByAgeEntry`**: helper extracted from `reapPartitionedPage` so the loop body stays under the cyclop ceiling. Returns `(parsedKey, reapable bool)`.
- **`reapExpiredDedup`** (signature changed): now takes `*sqsQueueMeta` and routes by `PartitionCount`. Legacy meta → `reapExpiredDedupLegacy` (byte-identical). Partitioned meta → `reapExpiredDedupPartitioned` (NEW), iterates each partition's dedup prefix under its own per-partition budget.

## Caller audit
- `reapQueue` — one production caller (`reapAllQueues`); signature unchanged. Non-partitioned queues byte-identical; partitioned get the extra per-partition pass.
- `reapExpiredDedup` — signature changed to take `*sqsQueueMeta`; one production caller (`reapAllQueues`), updated. No tests called it directly.
- New helpers (`reapQueueLegacy` / `reapQueuePartition` / `reapPartitionedPage` / `reapExpiredDedupLegacy` / `reapExpiredDedupPartitioned` / `classifyPartitionedByAgeEntry`) each have exactly one production caller in the new live-queue reap path.
- `reapOneRecordPartitioned` (existing PR 6a helper): previously called from `reapDeadByAgePartitionPage` (tombstone path); now also from `reapPartitionedPage` (live-queue path). Same dispatch semantics.

## Tests
- New `TestSQSServer_PartitionedFIFO_LiveQueueDedupReaperPartitions`: 4-partition queue, send across 6 distinct groups, backdate every partitioned dedup record's `ExpiresAtMillis`, run `reapAllQueues`, assert every partitioned dedup row across `[0, 4)` is gone. Pre-PR-6b reaper would leave every row in place.

## Self-review (CLAUDE.md)
1. **Data loss** — Closes the live-queue dedup leak + partitioned retention-expired-message leak. Legacy queues unchanged.
2. **Concurrency / distributed failures** — Reaper still runs only on the leader. Per-partition pass is sequential; per-partition budget bounds the pass. OCC semantics on each record reap unchanged.
3. **Performance** — Per-tick partitioned-queue cost grows from O(1 walk) to O(partition_count walks) on byage AND dedup. Each partition bounded by `sqsReaperPerQueueBudget`. 30s tick interval comfortably absorbs 32-partition × per-queue budget per design.
4. **Data consistency** — Live-vs-orphan classification on partitioned byage mirrors the legacy branch exactly (`reapPage` / `reapPartitionedPage` share the rules through `classifyPartitionedByAgeEntry`). `PartitionCount` immutability means the meta-driven iteration bound matches the on-disk keys.
5. **Test coverage** — One new wire-level integration test for the partitioned dedup walk; the partitioned byage walk reuses parsing / dispatch helpers already covered by PR 6a's tombstone-reap integration test.

## Test plan
- [x] `make lint` — 0 issues
- [x] Targeted reaper / retention / dedup / HTFIFO / PartitionedFIFO suites (-race, clean)
- [x] Wider regression on Send/Receive/Delete + CreateQueue/DeleteQueue/PurgeQueue (-race, clean)
- [ ] CI: full Jepsen + race

## Summary by CodeRabbit
- **Bug Fixes**
  - Fixed deduplication record cleanup in partitioned FIFO queues to properly remove expired records across all partitions.
  - Enhanced cleanup mechanism to correctly handle partition-aware behavior and prevent memory leaks in partitioned queue scenarios.
- **Tests**
  - Added comprehensive test for deduplication record cleanup across partitioned FIFO queue partitions.
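The `reapQueue` control flow described under "What changes", sketched. Helper names (`reapQueueLegacy`, `reapQueuePartition`) come from the PR text, but the receiver, signatures, and error handling here are assumptions:

```go
// reapQueue keeps the legacy walk byte-identical and adds one bounded
// pass per partition for PartitionCount > 1 queues ("partitions × budget
// per cycle" per the §6 design).
func (s *sqsServer) reapQueue(ctx context.Context, meta *sqsQueueMeta, readTS uint64) error {
	if err := s.reapQueueLegacy(ctx, meta, readTS); err != nil {
		return err
	}
	if meta.PartitionCount <= 1 {
		return nil // non-partitioned queues: pre-PR-6b behaviour exactly
	}
	for partition := uint32(0); partition < meta.PartitionCount; partition++ {
		// Each partition gets its own sqsReaperPerQueueBudget-bounded pass.
		if err := s.reapQueuePartition(ctx, meta, partition, readTS); err != nil {
			return err
		}
	}
	return nil
}
```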
Summary

Phase 3.D PR 6a: tombstone reaper enumerates partitioned keyspace. Closes the Codex P2 from PR #732 round 0 ("Reap partitioned dedup records to prevent key growth"), deferred at the time as a "PR 6 concern". With PR #734 (5b-3) lifting the dormancy gate, partitioned FIFO queues can now land in production — but until this PR the tombstone reaper only swept the legacy keyspace, so partitioned data / vis / byage / dedup / group rows leaked permanently after DeleteQueue / PurgeQueue.

What changes

- Tombstone schema: tombstone value now encodes `PartitionCount` so the reaper can drive partition iteration off the tombstone alone (the meta record is gone by the time the reaper observes the tombstone). 8 bytes big-endian uint64. Backward compat: `PartitionCount<=1` keeps the byte-identical legacy `[]byte{1}` sentinel; non-canonical values (empty, 1-byte, 4-byte, out-of-range 8-byte) decode to `PartitionCount=1` so a partial rollback or future encoding revision degrades safely to the legacy single-partition reap.
- DeleteQueue + PurgeQueue write the encoded value, reading `PartitionCount` from the pre-delete / pre-purge meta (immutable across SetQueueAttributes / PurgeQueue per §3.2).
- Tombstone reaper: `reapTombstonedGeneration` takes the decoded count, sweeps the legacy keyspace (always — covers pre-HT-FIFO queues plus a defensive pass) and on `PartitionCount > 1` ALSO sweeps every partition's byage / dedup / group prefix family via the new `reapPartitionedGeneration` helper. `allDone` gating delays tombstone deletion until every partition is fully drained.
- Per-record dispatch: `reapOneRecord` and `buildReapOps` now take `*sqsQueueMeta` + `partition uint32` instead of hardcoding `nil + 0`. Legacy callers pass `nil + 0` (byte-identical to pre-PR-5b layout); the partitioned twin `reapOneRecordPartitioned` synthesises a meta carrying `PartitionCount > 1` so the dispatch helpers route to `sqsPartitionedMsg{Data,Vis,Group}Key`.
- New parsed-key helpers in `sqs_keys.go`: `sqsPartitionedMsgByAgePrefixForPartition(queue, partition, gen)`, `sqsPartitionedMsgDedupKeyPrefix(queue, partition, gen)`, `sqsPartitionedMsgGroupKeyPrefix(queue, partition, gen)`.

Out of scope (follow-up PR 6b)

- Live-queue retention reap (`reapQueue`) still iterates only the legacy byage prefix; partitioned-byage live retention is the next slice.
- `PurgeQueue` / `DeleteQueue` do not iterate partition keyspace synchronously; the tombstone-driven async sweep is sufficient for AWS-shaped semantics.

Caller audit

- `encodeQueueTombstoneValue` — 2 production callers (DeleteQueue, PurgeQueue), both pass the queue's `PartitionCount`.
- `decodeQueueTombstoneValue` — 1 production caller (reaper); legacy / out-of-range fallback to 1 documented and tested.
- `reapOneRecord` (signature changed) — 3 production callers: `reapDeadByAgePage` (legacy), `reapDeadByAgePartitionPage` (via `reapOneRecordPartitioned`, partitioned), `reapQueue`'s over-retention sweep (legacy). All updated; partitioned path uses the synthetic meta to flip the dispatch helper branch.
- New helpers (`reapPartitionedGeneration`, `reapDeadByAgePartition`, `reapDeadByAgePartitionPage`) each have exactly one caller in the new tombstone-reap path.

Self-review (CLAUDE.md)

1. Data loss — closes the partitioned-record leak; legacy queues keep byte-identical behaviour for `PartitionCount<=1`.
2. Concurrency — reaper runs only on the leader (existing `reaperCtx`). Tombstone read is at the reaper's snapshot `readTS`; every individual reap is its own OCC dispatch with `ErrWriteConflict` treated as success. Per-partition iteration is sequential; no new shared state.
3. Performance — capped by `sqsReaperPerQueueBudget` per partition. `PartitionCount=1` queues see no extra cost (partitioned branch short-circuits).
4. Data consistency — `PartitionCount` immutability invariant means the tombstone-encoded count and the reaper's iteration bound can never disagree with the live queue.
5. Test coverage — tombstone-value codec unit tests plus a wire-level integration test that creates a partitioned queue, calls DeleteQueue, hand-runs `reapTombstonedQueues`, and asserts every partitioned data / vis / byage / dedup / group prefix is empty + the tombstone itself is deleted.

Test plan

- `make lint` — golangci-lint 0 issues

Summary by CodeRabbit
Release Notes
Bug Fixes
Tests