Skip to content

test: pin error codes, result-type keys, and schema strictness boundaries#2258

Open
felixweinberger wants to merge 3 commits into
v1.xfrom
fweinberger/e2e-behavior-pins
Open

test: pin error codes, result-type keys, and schema strictness boundaries#2258
felixweinberger wants to merge 3 commits into
v1.xfrom
fweinberger/e2e-behavior-pins

Conversation

@felixweinberger
Copy link
Copy Markdown
Contributor

Pins three parts of the public surface that consumers depend on but no test previously locked: literal error codes and matched message text, the named optional members of result types, and the strict/strip/passthrough line each wire schema draws. Tests only.

Motivation and Context

Tightening behavioral coverage ahead of the next protocol revision. Consumers match SDK errors by numeric code, error.name, and message substrings, and read result members like isError by name. Until now a renumber, reword, or optional-member rename passed the suite silently — assignability checks absorb renamed optional keys through the passthrough index signature.

What changed:

  • Exact-value table for all 8 ErrorCode members; McpError.name; the -32000 close rejection with its "Connection closed" text; Streamable HTTP and OAuth message heads.
  • Key-existence checks for the result members consumers read by name (isError, structuredContent, _meta, nextCursor, instructions, capability flags), plus a runtime check that CallToolResultSchema.parse preserves isError.
  • Strictness boundaries: strict EmptyResult acks and JSON-RPC envelopes, unknown-sibling stripping on typed params, passthrough on typed results, raw validator errors on schema mismatch for request() and connect().

How Has This Been Tested?

First of a 6-PR series; full matrix at the series tip: e2e 1268/1268 (~22 s), unit 1625/1625, typecheck and lint green. The suite remains deterministic across repeated runs.

Breaking Changes

None — tests only.

Types of changes

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

After merge, renaming or renumbering these surfaces turns CI red instead of landing silently.

Downstream code matches SDK errors by literal code values, error.name,
and message substrings rather than enum members or instanceof (which
breaks across package boundaries). None of these were pinned, so a
renumber or reword would pass the entire suite silently. Lock them in:

- ErrorCode: exact-equality table of all 8 members' numeric values
  (also locks membership in both directions)
- McpError: error.name === 'McpError' set explicitly in the constructor
- close rejection: literal code -32000 plus a message containing
  'Connection closed', not just the enum member
- streamable HTTP GET failure: StreamableHTTPError with the HTTP status
  as .code and the 'Failed to open SSE stream' message marker
- per-session hosting: the 'No valid session ID' rejection surfaces
  through the client error with .code 400 and the body text embedded
- OAuth discovery failures: 'trying to load' + 'metadata' substrings
- startAuthorization/registerClient: anchored 'Incompatible auth
  server' message head
- parseErrorResponse fallback: anchored 'HTTP <status>' message head
The mutual-assignability checks tolerate renames or removals of optional
members on passthrough result types: the old key is absorbed by the
catchall index signature and the renamed key is optional, so assignment
still compiles in both directions. Renaming CallToolResult's isError to
isFailure passed typecheck and every existing test.

Add per-type key lists (checked against keyof with index signatures
stripped) for the result members consumers read by name (isError,
structuredContent, _meta, nextCursor, instructions, capability flags),
plus a runtime check that CallToolResultSchema.parse preserves isError.
Pin the accept/strip/reject line each Zod schema draws at the wire:

- EmptyResult acks are strict: an extra non-_meta key on a ping ack
  rejects client-side with the raw validator error
- JSON-RPC envelopes are strict: an unknown top-level sibling field is
  rejected at the deserialization boundary (HTTP 400/-32700; stdio
  onerror + drop + connection survives)
- Typed request params strip unknown siblings instead of rejecting them
- Typed results pass unknown top-level siblings through to the caller
  (tools/call and completion/complete)
- A result failing the consumer-supplied schema rejects with the raw
  issues-bearing validator error, not a wrapped McpError, for both
  request() and connect()
@felixweinberger felixweinberger requested a review from a team as a code owner June 7, 2026 12:18
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Jun 7, 2026

⚠️ No Changeset found

Latest commit: 85c86fa

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Jun 7, 2026

Open in StackBlitz

npm i https://pkg.pr.new/@modelcontextprotocol/sdk@2258

commit: 85c86fa

Comment on lines +197 to +221
verifies('typescript:consumer:session-expiry-message', async (_args: TestArgs) => {
// The per-session host rejects unrecognized session ids with HTTP 400 and a body containing
// 'No valid session ID' (the documented hosting pattern). Consumers regex-match that string
// on the client-side error message and read the HTTP status off .code to detect session
// expiry, so the client must surface both through the rejection.
const host = hostPerSession(() => echoServer());
const url = new URL('http://in-process/mcp');
const fetch = (u: URL | string, init?: RequestInit) => host.handleRequest(new Request(u, init));

const client = newClient();
try {
await client.connect(new StreamableHTTPClientTransport(url, { fetch }));
await client.listTools();

// Drop every server-side session while the client still holds its session id.
await host.close();

const err = await client.listTools().catch((e: unknown) => e);
expect(err).toBeInstanceOf(StreamableHTTPError);
expect(err).toMatchObject({ code: 400 });
expect((err as Error).message).toMatch(/No valid session ID/i);
} finally {
await client.close();
await host.close();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The 'No valid session ID' message asserted at line 217 is emitted by the test's own helper (test/e2e/helpers/index.ts:172, a hand-written mirror of the documented hosting pattern), not by the SDK example servers that carry the documented pattern's actual copies of the string — so rewording the message in every src/examples/server file leaves this test green even though the requirement describes the message text as pinned. Consider sharing the rejection string via an exported constant used by both the helper and the examples (or asserting the literal against one example source); the client-side half of the pin (.code = HTTP status, raw body embedded in the message) is genuinely enforced and valuable as-is.

Extended reasoning...

What the test pins vs. what the requirement says it pins. The new requirement typescript:consumer:session-expiry-message (test/e2e/requirements.ts:1560-1566) is worded around the consumer workflow: the documented per-session hosting pattern rejects an unrecognized session id with HTTP 400 and a JSON-RPC error body whose message contains 'No valid session ID', and consumers detect session expiry by regex-matching that message and reading .code. The test body drives the client against hostPerSession() from test/e2e/helpers/index.ts, and the 400 + 'Bad Request: No valid session ID provided' rejection asserted at hosting-session.test.ts:217 is produced by the helper itself at test/e2e/helpers/index.ts:169-176 — a hand-written copy whose comment says it merely 'mirrors the SDK's documented hosting pattern'.

Where the documented pattern's string actually lives. The literal appears independently in src/examples/server/simpleStreamableHttp.ts:751, jsonResponseStreamableHttp.ts:136, sseAndStreamableHttpCompatibleServer.ts:142, standaloneSseWithGetStreamableHttp.ts:86, elicitationUrlExample.ts:655, elicitationFormExample.ts:401, plus a variant in simpleTaskInteractive.ts:678. There is no shared constant, and nothing in test/e2e imports or reads any of those files. No other test asserts their message text.

Concrete trigger. Reword the message in every example file (say, to 'Session not found or expired'). The suite stays fully green: the helper still emits the old string, so line 217's /No valid session ID/i still matches. Meanwhile consumers who built servers from the documented examples — the population the requirement describes — would see their session-expiry regexes break. That is exactly the silent-reword regression class the PR description says this series turns red ('renaming or renumbering these surfaces turns CI red instead of landing silently'). For the server-message half of the contract, the pin is self-referential: the harness is asserted against its own copy of the string.

What IS genuinely enforced — and a fair counterpoint. A reasonable reading (raised during verification) is that the test's real subject is the client-side surfacing: StreamableHTTPError with .code = HTTP status and the raw body text embedded in the message (src/client/streamableHttp.ts), with the server-side 400 + message acting only as a fixture. Under that reading the test is correct and covers previously-untested client behavior — which is true, and is why this is filed as a non-blocking suggestion rather than a defect. But the requirement prose goes further than that reading: it names the specific message text consumers regex-match and declares it part of the pinned surface. As written, the suite cannot detect drift in the only copies of that text that consumers ever see (the example servers), so the requirement over-claims relative to what the test enforces.

Step-by-step proof. (1) client.connect + listTools establish a session via hostPerSession. (2) host.close() clears the session map while the client retains its session id. (3) The next listTools POST carries the stale mcp-session-id; the helper's sid !== undefined branch (helpers/index.ts:166) returns 400 with { error: { code: -32000, message: 'Bad Request: No valid session ID provided' } }. (4) The client wraps the body into a StreamableHTTPError with .code 400, and line 217 matches the regex — against the helper's literal, never touching src/examples. (5) Apply sed -i \"s/No valid session ID provided/Session expired/\" src/examples/server/*.ts and re-run: every test in this suite still passes.

Suggested fix. Export the rejection message (and ideally the whole error body) from a small shared module that both the example servers and the e2e helper import, so the helper provably emits the documented string; or, more cheaply, add an assertion that reads one example source (e.g. simpleStreamableHttp.ts) and expects it to contain the literal, tying the test's fixture to the documented pattern it claims to represent. Either keeps the valuable client-side assertions unchanged while making the message half of the requirement actually enforceable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant