@TheMailmans TheMailmans commented Dec 12, 2025

Summary

This PR primarily fixes #572 by enabling graceful shutdown without consuming self. While implementing this, I noticed delete_session() is spawned as a background task, which means close() may return before HTTP session cleanup completes. Since this is part of the same shutdown lifecycle and can cause resource leaks/races, I'm including a small, localized fix to ensure cleanup is completed before close() returns. If maintainers prefer, I can split the cleanup timing change into a follow-up PR.

Changes

Service Layer (service.rs)

  • close(&mut self) - Gracefully shuts down the connection and waits for cleanup to complete without consuming the RunningService. This was the main ask in #572 ("Client connection is not closed on drop, and no way to gracefully stop").
  • close_with_timeout(&mut self, Duration) - Same as close() but with a bounded wait time. Useful for apps that need deterministic shutdown.
  • is_closed(&self) - Checks whether the service has already been closed or cancelled.
  • Drop impl - Logs a debug message if the service is dropped without an explicit close() call. The DropGuard still handles async cleanup, but the log helps developers catch missing cleanup calls.
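
To illustrate the shape of these methods, here is a minimal sketch of the "close without consuming self" pattern. It is not the code in this PR: it swaps tokio tasks for std threads so it has no dependencies, and `Service`, `start`, and the field names are hypothetical stand-ins rather than rmcp types. The key idea is keeping the worker handle in an `Option` so that `close(&mut self)` can take it.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a running service (not rmcp's RunningService).
struct Service {
    cancel: Arc<AtomicBool>,
    // Stored in an Option so close() can take it through &mut self.
    worker: Option<JoinHandle<()>>,
    // The worker signals here once its cleanup has finished.
    done: Receiver<()>,
}

impl Service {
    fn start() -> Self {
        let cancel = Arc::new(AtomicBool::new(false));
        let (done_tx, done) = mpsc::channel();
        let flag = Arc::clone(&cancel);
        let worker = thread::spawn(move || {
            // Main loop: run until cancellation is requested.
            while !flag.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
            // Cleanup would run here, before signalling completion.
            let _ = done_tx.send(());
        });
        Service { cancel, worker: Some(worker), done }
    }

    /// True once the worker handle has been taken by a close call.
    fn is_closed(&self) -> bool {
        self.worker.is_none()
    }

    /// Requests shutdown and waits for the worker to finish.
    /// Returns false if the service was already closed.
    fn close(&mut self) -> bool {
        self.cancel.store(true, Ordering::Release);
        match self.worker.take() {
            Some(handle) => {
                let _ = handle.join();
                true
            }
            None => false,
        }
    }

    /// Like close(), but gives up after `limit`.
    /// Returns true if the service is closed when the call returns.
    fn close_with_timeout(&mut self, limit: Duration) -> bool {
        self.cancel.store(true, Ordering::Release);
        if self.worker.is_none() {
            return true; // already closed, nothing to wait for
        }
        match self.done.recv_timeout(limit) {
            // Disconnected means the worker exited without signalling.
            Ok(()) | Err(RecvTimeoutError::Disconnected) => {
                if let Some(handle) = self.worker.take() {
                    let _ = handle.join();
                }
                true
            }
            Err(RecvTimeoutError::Timeout) => false,
        }
    }
}

impl Drop for Service {
    fn drop(&mut self) {
        if self.worker.is_some() {
            // Mirrors the debug log described above; the worker is still told to stop.
            eprintln!("debug: service dropped without explicit close()");
            self.cancel.store(true, Ordering::Release);
        }
    }
}
```

Because the handle is taken on the first call, a second `close()` finds `None` and returns immediately, which is what makes calling it twice safe in this sketch.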

HTTP Transport (streamable_http_client.rs)

  • Moved delete_session() from a fire-and-forget tokio::spawn to inline cleanup at the end of the worker's run() method
  • Added a 5-second timeout on the cleanup HTTP request to prevent close() from hanging indefinitely if the server is unresponsive
  • This ensures the HTTP DELETE request completes (or times out) before close() returns, preventing orphaned sessions on authenticated MCP servers
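
The difference between fire-and-forget and bounded inline cleanup can be sketched as follows. This is an illustration under stated assumptions, not the transport code in this PR: it uses std threads and a channel in place of tokio, and `cleanup_with_timeout`, `CleanupOutcome`, and the closure standing in for `delete_session()` are hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical result type for the cleanup step.
#[derive(Debug, PartialEq)]
enum CleanupOutcome {
    Completed,
    Failed(String),
    TimedOut,
}

/// Runs `delete_session` and waits at most `limit` for it to finish.
/// The caller only proceeds once cleanup has completed, failed, or timed out.
fn cleanup_with_timeout<F>(delete_session: F, limit: Duration) -> CleanupOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the waiter already gave up; ignore that.
        let _ = tx.send(delete_session());
    });
    match rx.recv_timeout(limit) {
        Ok(Ok(())) => CleanupOutcome::Completed,
        Ok(Err(e)) => CleanupOutcome::Failed(e),
        Err(RecvTimeoutError::Timeout) => {
            // Log and continue rather than block shutdown indefinitely.
            // In this thread-based sketch the request itself keeps running.
            eprintln!("warn: session cleanup timed out after {:?}", limit);
            CleanupOutcome::TimedOut
        }
        Err(RecvTimeoutError::Disconnected) => {
            CleanupOutcome::Failed("cleanup thread exited without a result".to_string())
        }
    }
}
```

With the 5-second limit described above, a call would look like `cleanup_with_timeout(|| Ok(()), Duration::from_secs(5))`, where the closure is a placeholder for the real HTTP DELETE.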

Why both changes together?

The original issue mentions that "for authenticated MCP connections, the sessions on the remote would still be active." The service-layer close() method alone doesn't fully solve this because the HTTP transport's session deletion was spawned as a background task. By moving it inline with a timeout, close() now guarantees bounded, deterministic cleanup.

Design Decisions

Q: Can close() hang forever?
No. The HTTP session cleanup has a 5-second timeout. If the server doesn't respond, we log a warning and continue. close_with_timeout() provides an additional layer of control at the service level.

Q: What happens if cancelled mid-cleanup?
The cleanup runs after the main loop exits, so the cancellation token has already fired. The timeout ensures we don't block indefinitely regardless of server behavior.

Manual Verification

# Build
cargo check -p rmcp --all-features

# Run all tests
cargo test -p rmcp --all-features

# Run the new close() tests specifically  
cargo test --test test_close_connection --features "client server"

# Clippy
cargo clippy -p rmcp --all-features

All tests pass, clippy is clean.

Test Plan

  • test_close_method - Verifies close() works and can be called twice safely
  • test_close_with_timeout - Verifies timeout-bounded shutdown
  • test_cancel_method - Confirms existing cancel() API still works (backward compat)
  • test_drop_without_close - Verifies drop behavior doesn't panic and async cleanup still happens

Backward Compatibility

  • Existing cancel() method still works (now delegates to close() internally)
  • Existing waiting() method still works
  • Dropping without close() still triggers cleanup via DropGuard - just with a debug log now
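
As a rough sketch of how a consuming cancel() can delegate to a non-consuming close(), assuming a simplified hypothetical `Handle` type rather than the real RunningService:

```rust
/// Hypothetical handle; the real type holds a task handle and cancellation token.
struct Handle {
    closed: bool,
}

impl Handle {
    fn new() -> Self {
        Handle { closed: false }
    }

    /// Non-consuming shutdown. Returns false if already closed.
    fn close(&mut self) -> bool {
        if self.closed {
            return false;
        }
        self.closed = true;
        true
    }

    /// Consuming API kept for existing callers; it forwards to close().
    fn cancel(mut self) -> bool {
        self.close()
    }
}
```

Existing callers of cancel() keep compiling unchanged, while new callers can use close() and still inspect the handle afterwards.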

Fixes #572

@github-actions github-actions bot added T-test Testing related changes T-core Core library changes T-transport Transport layer changes labels Dec 12, 2025
@TheMailmans TheMailmans force-pushed the fix/graceful-connection-close branch from cefbbd7 to 89db1bf Compare December 12, 2025 10:16