test: pin error codes, result-type keys, and schema strictness boundaries#2258
test: pin error codes, result-type keys, and schema strictness boundaries#2258felixweinberger wants to merge 3 commits into
Conversation
Downstream code matches SDK errors by literal code values, error.name, and message substrings rather than enum members or instanceof (which breaks across package boundaries). None of these were pinned, so a renumber or reword would pass the entire suite silently. Lock them in: - ErrorCode: exact-equality table of all 8 members' numeric values (also locks membership in both directions) - McpError: error.name === 'McpError' set explicitly in the constructor - close rejection: literal code -32000 plus a message containing 'Connection closed', not just the enum member - streamable HTTP GET failure: StreamableHTTPError with the HTTP status as .code and the 'Failed to open SSE stream' message marker - per-session hosting: the 'No valid session ID' rejection surfaces through the client error with .code 400 and the body text embedded - OAuth discovery failures: 'trying to load' + 'metadata' substrings - startAuthorization/registerClient: anchored 'Incompatible auth server' message head - parseErrorResponse fallback: anchored 'HTTP <status>' message head
The mutual-assignability checks tolerate renames or removals of optional members on passthrough result types: the old key is absorbed by the catchall index signature and the renamed key is optional, so assignment still compiles in both directions. Renaming CallToolResult's isError to isFailure passed typecheck and every existing test. Add per-type key lists (checked against keyof with index signatures stripped) for the result members consumers read by name (isError, structuredContent, _meta, nextCursor, instructions, capability flags), plus a runtime check that CallToolResultSchema.parse preserves isError.
Pin the accept/strip/reject line each Zod schema draws at the wire: - EmptyResult acks are strict: an extra non-_meta key on a ping ack rejects client-side with the raw validator error - JSON-RPC envelopes are strict: an unknown top-level sibling field is rejected at the deserialization boundary (HTTP 400/-32700; stdio onerror + drop + connection survives) - Typed request params strip unknown siblings instead of rejecting them - Typed results pass unknown top-level siblings through to the caller (tools/call and completion/complete) - A result failing the consumer-supplied schema rejects with the raw issues-bearing validator error, not a wrapped McpError, for both request() and connect()
|
commit: |
| verifies('typescript:consumer:session-expiry-message', async (_args: TestArgs) => { | ||
| // The per-session host rejects unrecognized session ids with HTTP 400 and a body containing | ||
| // 'No valid session ID' (the documented hosting pattern). Consumers regex-match that string | ||
| // on the client-side error message and read the HTTP status off .code to detect session | ||
| // expiry, so the client must surface both through the rejection. | ||
| const host = hostPerSession(() => echoServer()); | ||
| const url = new URL('http://in-process/mcp'); | ||
| const fetch = (u: URL | string, init?: RequestInit) => host.handleRequest(new Request(u, init)); | ||
|
|
||
| const client = newClient(); | ||
| try { | ||
| await client.connect(new StreamableHTTPClientTransport(url, { fetch })); | ||
| await client.listTools(); | ||
|
|
||
| // Drop every server-side session while the client still holds its session id. | ||
| await host.close(); | ||
|
|
||
| const err = await client.listTools().catch((e: unknown) => e); | ||
| expect(err).toBeInstanceOf(StreamableHTTPError); | ||
| expect(err).toMatchObject({ code: 400 }); | ||
| expect((err as Error).message).toMatch(/No valid session ID/i); | ||
| } finally { | ||
| await client.close(); | ||
| await host.close(); | ||
| } |
There was a problem hiding this comment.
🟡 The 'No valid session ID' message asserted at line 217 is emitted by the test's own helper (test/e2e/helpers/index.ts:172, a hand-written mirror of the documented hosting pattern), not by the SDK example servers that carry the documented pattern's actual copies of the string — so rewording the message in every src/examples/server file leaves this test green even though the requirement describes the message text as pinned. Consider sharing the rejection string via an exported constant used by both the helper and the examples (or asserting the literal against one example source); the client-side half of the pin (.code = HTTP status, raw body embedded in the message) is genuinely enforced and valuable as-is.
Extended reasoning...
What the test pins vs. what the requirement says it pins. The new requirement typescript:consumer:session-expiry-message (test/e2e/requirements.ts:1560-1566) is worded around the consumer workflow: the documented per-session hosting pattern rejects an unrecognized session id with HTTP 400 and a JSON-RPC error body whose message contains 'No valid session ID', and consumers detect session expiry by regex-matching that message and reading .code. The test body drives the client against hostPerSession() from test/e2e/helpers/index.ts, and the 400 + 'Bad Request: No valid session ID provided' rejection asserted at hosting-session.test.ts:217 is produced by the helper itself at test/e2e/helpers/index.ts:169-176 — a hand-written copy whose comment says it merely 'mirrors the SDK's documented hosting pattern'.
Where the documented pattern's string actually lives. The literal appears independently in src/examples/server/simpleStreamableHttp.ts:751, jsonResponseStreamableHttp.ts:136, sseAndStreamableHttpCompatibleServer.ts:142, standaloneSseWithGetStreamableHttp.ts:86, elicitationUrlExample.ts:655, elicitationFormExample.ts:401, plus a variant in simpleTaskInteractive.ts:678. There is no shared constant, and nothing in test/e2e imports or reads any of those files. No other test asserts their message text.
Concrete trigger. Reword the message in every example file (say, to 'Session not found or expired'). The suite stays fully green: the helper still emits the old string, so line 217's /No valid session ID/i still matches. Meanwhile consumers who built servers from the documented examples — the population the requirement describes — would see their session-expiry regexes break. That is exactly the silent-reword regression class the PR description says this series turns red ('renaming or renumbering these surfaces turns CI red instead of landing silently'). For the server-message half of the contract, the pin is self-referential: the harness is asserted against its own copy of the string.
What IS genuinely enforced — and a fair counterpoint. A reasonable reading (raised during verification) is that the test's real subject is the client-side surfacing: StreamableHTTPError with .code = HTTP status and the raw body text embedded in the message (src/client/streamableHttp.ts), with the server-side 400 + message acting only as a fixture. Under that reading the test is correct and covers previously-untested client behavior — which is true, and is why this is filed as a non-blocking suggestion rather than a defect. But the requirement prose goes further than that reading: it names the specific message text consumers regex-match and declares it part of the pinned surface. As written, the suite cannot detect drift in the only copies of that text that consumers ever see (the example servers), so the requirement over-claims relative to what the test enforces.
Step-by-step proof. (1) client.connect + listTools establish a session via hostPerSession. (2) host.close() clears the session map while the client retains its session id. (3) The next listTools POST carries the stale mcp-session-id; the helper's sid !== undefined branch (helpers/index.ts:166) returns 400 with { error: { code: -32000, message: 'Bad Request: No valid session ID provided' } }. (4) The client wraps the body into a StreamableHTTPError with .code 400, and line 217 matches the regex — against the helper's literal, never touching src/examples. (5) Apply sed -i \"s/No valid session ID provided/Session expired/\" src/examples/server/*.ts and re-run: every test in this suite still passes.
Suggested fix. Export the rejection message (and ideally the whole error body) from a small shared module that both the example servers and the e2e helper import, so the helper provably emits the documented string; or, more cheaply, add an assertion that reads one example source (e.g. simpleStreamableHttp.ts) and expects it to contain the literal, tying the test's fixture to the documented pattern it claims to represent. Either keeps the valuable client-side assertions unchanged while making the message half of the requirement actually enforceable.
Pins three parts of the public surface that consumers depend on but no test previously locked: literal error codes and matched message text, the named optional members of result types, and the strict/strip/passthrough line each wire schema draws. Tests only.
Motivation and Context
Tightening behavioral coverage ahead of the next protocol revision. Consumers match SDK errors by numeric code,
error.name, and message substrings, and read result members likeisErrorby name. Until now a renumber, reword, or optional-member rename passed the suite silently — assignability checks absorb renamed optional keys through the passthrough index signature.What changed:
ErrorCodemembers;McpError.name; the-32000close rejection with its "Connection closed" text; Streamable HTTP and OAuth message heads.isError,structuredContent,_meta,nextCursor,instructions, capability flags), plus a runtime check thatCallToolResultSchema.parsepreservesisError.EmptyResultacks and JSON-RPC envelopes, unknown-sibling stripping on typed params, passthrough on typed results, raw validator errors on schema mismatch forrequest()andconnect().How Has This Been Tested?
First of a 6-PR series; full matrix at the series tip: e2e 1268/1268 (~22 s), unit 1625/1625, typecheck and lint green. The suite remains deterministic across repeated runs.
Breaking Changes
None — tests only.
Types of changes
Checklist
Additional context
After merge, renaming or renumbering these surfaces turns CI red instead of landing silently.