Test/hub failover auth fast fail#98
Merged
Merged
Conversation
The behavior was already implemented in 8bcb620 (the original master-follower replication feature): failover.go checks authErr(callErr) on every peer attempt and returns immediately when the gRPC status code is Unauthenticated or PermissionDenied. The reasoning is sound — the same bearer token would fail on every peer in the list, so cycling is pointless and just wastes time. But the test suite never asserted the invariant. The three existing failover tests cover: - TestFailoverClient_FirstPeerWorks: happy path - TestFailoverClient_SkipsBadPeer: connection-error skip - TestFailoverClient_AllBad: all-unreachable error None of them exercise the "auth fails → walk stops" contract. A future refactor could silently delete the authErr(callErr) branch and all three would still pass. New TestFailoverClient_FailsFastOnAuthError: - Real server as the first peer; bogus bearer token will trigger Unauthenticated. - Unrouted 127.0.0.1:1 as the second peer — would surface Unavailable if the walk continued past auth. - Asserts returned gRPC code is Unauthenticated or PermissionDenied; if it's Unavailable, the walk cycled past auth into the unrouted peer, which is the exact regression to catch. Closes line 375 of TASKS.md (open since 2026-04-08). Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>
Capture the triage finding from 22cffc2 (hub failover auth-fast-fail test). The task at TASKS.md line 375 had been open since 2026-04-08; on triage, the code already existed in failover.go:61-63 and err_check.go:22-30, but no test asserted the invariant. The task closed by writing the regression test, not by writing the feature. Application beyond this one task: behavior-named TASKS.md items older than a few weeks should get an implementation-status sweep before being scoped as new work. The pattern fits hub/connect/replication-adjacent tasks particularly well — several still-open items (lines 380 file locking, 385 fanout entry loss) may warrant the same triage. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>
Triaged the file-locking task: a "lock" exists in state.go:42-50 but it's a check-then-write TOCTOU race, not real locking. Filed #93 with the analysis + two proposed fixes (atomic O_CREATE|O_EXCL or syscall.Flock) + the three regression tests the fix should add. Task stays open in TASKS.md (the bug is real and unfixed); the cross-reference points readers at the filed issue so the analysis isn't duplicated. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>
The triage of TASKS.md line 393 (use crypto/subtle.ConstantTimeCompare
+ replace O(n) scan with map) showed the implementation
was already correct: store.go:174-189 uses an O(1) map
lookup on s.tokenIdx plus a defensive CTC against the
stored token. The task description ("currently uses ==")
was stale.
But the test surface didn't pin the contract. The
existing TestStoreRegisterAndValidate covers valid vs.
"invalid" (a completely different string), which a
naive prefix matcher or non-constant comparison would
also pass. A future "simplification" PR could collapse
the careful chain to a single == or strings.HasPrefix
and the test suite would stay green.
New TestStoreValidateToken_RejectsNearMissTokens
exercises 9 near-miss cases that probe the boundary the
CTC defends:
- empty token
- last byte changed (would pass a HasPrefix(stored, X))
- first byte changed
- middle byte changed
- prefix only (would pass HasPrefix(X, stored))
- extra suffix appended (would pass HasPrefix(X, stored))
- case-shifted (would pass a strings.EqualFold)
- whitespace-padded
- all-X same length (random-token false-positive sanity)
All must return nil. If a future refactor weakens the
comparison, these start failing immediately.
The CTC itself is technically redundant once the map
lookup hits — Go map keys match exact-byte by definition
— but the CTC is the explicit signal of intent. The
test pins the *behavior* the CTC enables, not the
nanosecond-level timing (which can't be tested reliably
in Go anyway).
Spec: specs/hub-security-audit.md
Signed-off-by: Jose Alekhinne <jose@ctx.ist>
… record gateway-coupling decision Three coordinated context updates: 1. **New task: ctx-architecture-next** (TASKS.md, architecture pipeline section). Fourth step in the pipeline: map → enrich → hunt → prescribe. Consumes the three prior artifacts (ARCHITECTURE.md, enriched verifications, DANGER-ZONES.md) and produces NEXT-ACTIONS.md — a ranked, sequenced fix plan that maps each danger zone to a concrete next move (refactor, test, doc, escalate, accept) with effort estimates. Distinct from the previously-skipped ctx-architecture-extend: extend was about "where features grow" (overlapped with DETAILED_DESIGN and enrich's registration sites); next is pure synthesis, no overlap. Pipeline grows from 3 to 4 because each step now has a distinct output document. The synthesis layer is gateway-free by design: reads markdown, writes markdown, no MCP dependency at the skill's own layer. 2. **Skip: pluggable graph tool interface** (TASKS.md line 1350-ish, `#added:2026-03-25-120000`). Marked `[-]` with rationale: pluggable abstraction implies multiple candidate graph tools, which implies ctx vouches for the interface contract across implementations — exactly the ownership coupling the new decision rules out. Spec sketch in ideas/spec-mcp-warm-up-ceremony.md stays available as the starting point if the trade-off ever shifts. 3. **New decision: MCP gateway not worth the coupling cost** (DECISIONS.md). Builds on the existing 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" with a stronger reason: ownership coupling. A gateway would make ctx vouch for the gatewayed tool's install/uninstall/upgrade/error surface — coupling we don't want for tools we don't ship. Pluggable abstraction is the same trap; both are skipped as a direct consequence. Doctor-style "is the peer available?" checks remain valid; that's not proxying. Spec: ideas/spec-companion-intelligence.md Signed-off-by: Jose Alekhinne <jose@ctx.ist>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.