Test/hub failover auth fast fail by josealekhine · Pull Request #98 · ActiveMemory/ctx

josealekhine · 2026-05-23T15:53:37Z

No description provided.

The behavior was already implemented in 8bcb620 (the original master-follower replication feature): failover.go checks authErr(callErr) on every peer attempt and returns immediately when the gRPC status code is Unauthenticated or PermissionDenied. The reasoning is sound — the same bearer token would fail on every peer in the list, so cycling is pointless and just wastes time. But the test suite never asserted the invariant. The three existing failover tests cover: - TestFailoverClient_FirstPeerWorks: happy path - TestFailoverClient_SkipsBadPeer: connection-error skip - TestFailoverClient_AllBad: all-unreachable error None of them exercise the "auth fails → walk stops" contract. A future refactor could silently delete the authErr(callErr) branch and all three would still pass. New TestFailoverClient_FailsFastOnAuthError: - Real server as the first peer; bogus bearer token will trigger Unauthenticated. - Unrouted 127.0.0.1:1 as the second peer — would surface Unavailable if the walk continued past auth. - Asserts returned gRPC code is Unauthenticated or PermissionDenied; if it's Unavailable, the walk cycled past auth into the unrouted peer, which is the exact regression to catch. Closes line 375 of TASKS.md (open since 2026-04-08). Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>

Capture the triage finding from 22cffc2 (hub failover auth-fast-fail test). The task at TASKS.md line 375 had been open since 2026-04-08; on triage, the code already existed in failover.go:61-63 and err_check.go:22-30, but no test asserted the invariant. The task closed by writing the regression test, not by writing the feature. Application beyond this one task: behavior-named TASKS.md items older than a few weeks should get an implementation-status sweep before being scoped as new work. The pattern fits hub/connect/replication-adjacent tasks particularly well — several still-open items (lines 380 file locking, 385 fanout entry loss) may warrant the same triage. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>

Triaged the file-locking task: a "lock" exists in state.go:42-50 but it's a check-then-write TOCTOU race, not real locking. Filed #93 with the analysis + two proposed fixes (atomic O_CREATE|O_EXCL or syscall.Flock) + the three regression tests the fix should add. Task stays open in TASKS.md (the bug is real and unfixed); the cross-reference points readers at the filed issue so the analysis isn't duplicated. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>

The triage of TASKS.md line 393 (use crypto/subtle.ConstantTimeCompare + replace O(n) scan with map) showed the implementation was already correct: store.go:174-189 uses an O(1) map lookup on s.tokenIdx plus a defensive CTC against the stored token. The task description ("currently uses ==") was stale. But the test surface didn't pin the contract. The existing TestStoreRegisterAndValidate covers valid vs. "invalid" (a completely different string), which a naive prefix matcher or non-constant comparison would also pass. A future "simplification" PR could collapse the careful chain to a single == or strings.HasPrefix and the test suite would stay green. New TestStoreValidateToken_RejectsNearMissTokens exercises 9 near-miss cases that probe the boundary the CTC defends: - empty token - last byte changed (would pass a HasPrefix(stored, X)) - first byte changed - middle byte changed - prefix only (would pass HasPrefix(X, stored)) - extra suffix appended (would pass HasPrefix(X, stored)) - case-shifted (would pass a strings.EqualFold) - whitespace-padded - all-X same length (random-token false-positive sanity) All must return nil. If a future refactor weakens the comparison, these start failing immediately. The CTC itself is technically redundant once the map lookup hits — Go map keys match exact-byte by definition — but the CTC is the explicit signal of intent. The test pins the *behavior* the CTC enables, not the nanosecond-level timing (which can't be tested reliably in Go anyway). Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>

… record gateway-coupling decision Three coordinated context updates: 1. **New task: ctx-architecture-next** (TASKS.md, architecture pipeline section). Fourth step in the pipeline: map → enrich → hunt → prescribe. Consumes the three prior artifacts (ARCHITECTURE.md, enriched verifications, DANGER-ZONES.md) and produces NEXT-ACTIONS.md — a ranked, sequenced fix plan that maps each danger zone to a concrete next move (refactor, test, doc, escalate, accept) with effort estimates. Distinct from the previously-skipped ctx-architecture-extend: extend was about "where features grow" (overlapped with DETAILED_DESIGN and enrich's registration sites); next is pure synthesis, no overlap. Pipeline grows from 3 to 4 because each step now has a distinct output document. The synthesis layer is gateway-free by design: reads markdown, writes markdown, no MCP dependency at the skill's own layer. 2. **Skip: pluggable graph tool interface** (TASKS.md line 1350-ish, `#added:2026-03-25-120000`). Marked `[-]` with rationale: pluggable abstraction implies multiple candidate graph tools, which implies ctx vouches for the interface contract across implementations — exactly the ownership coupling the new decision rules out. Spec sketch in ideas/spec-mcp-warm-up-ceremony.md stays available as the starting point if the trade-off ever shifts. 3. **New decision: MCP gateway not worth the coupling cost** (DECISIONS.md). Builds on the existing 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" with a stronger reason: ownership coupling. A gateway would make ctx vouch for the gatewayed tool's install/uninstall/upgrade/error surface — coupling we don't want for tools we don't ship. Pluggable abstraction is the same trap; both are skipped as a direct consequence. Doctor-style "is the peer available?" checks remain valid; that's not proxying. Spec: ideas/spec-companion-intelligence.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>

josealekhine added 5 commits May 23, 2026 05:14

josealekhine self-assigned this May 23, 2026

josealekhine merged commit 79c5a01 into main May 23, 2026
16 checks passed

josealekhine deleted the test/hub-failover-auth-fast-fail branch May 23, 2026 15:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test/hub failover auth fast fail#98

Test/hub failover auth fast fail#98
josealekhine merged 5 commits into
mainfrom
test/hub-failover-auth-fast-fail

josealekhine commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

josealekhine commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant