Harden tray<->gateway keep-alive and reconnect lifecycle #627
Codex review: needs real behavior proof before merge.
Reviewed June 4, 2026, 7:40 PM ET / 23:40 UTC.

Summary
Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.
Review metrics: none identified.

Merge readiness
Overall readiness follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Review details
Best possible solution: Retry the Codex review after fixing the execution failure.
Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path.
Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.
AGENTS.md: unclear because the file could not be read completely.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 99efc50cbc22.

What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

WebSocketClientBase:
- Add _disposing flag distinct from _disposed; set before OnDisposing, promote to _disposed after, so teardown callbacks see a stable state.
- Gate ConnectAsync, ReconnectWithBackoffAsync, listener-finally (CAS/event/reconnect-kickoff), and reconnect kickoff sites on (!_disposed && !_disposing) to prevent post-Dispose work.
- Re-check disposal after OnConnectedAsync and before spawning the listener Task so a Dispose racing the connect path does not leak a background listener.
- CAS-clear + Abort() + Dispose() any ClientWebSocket installed into _webSocket if disposal wins the race during ConnectAsync; mirror in the orphan-clean else-if branch.
- Abort() before Dispose() on owned sockets so peers see a clean RST instead of an unsent CLOSE.
- Volatile reads on shutdown flags in HeartbeatLoopAsync to avoid cache staleness across cores.
- Catch ObjectDisposedException narrowly in SendRawAsync and CloseWebSocketAsync so torn-down sockets do not surface as errors.
- Guard OnDisposing with try/catch so a throwing subclass cannot skip later cleanup steps.
- Per-event try/catch wrappers around status/error event raises so a throwing subscriber cannot block teardown.

OpenClawGatewayClient:
- Apply matching reconnect/teardown hygiene around the keep-alive and heartbeat paths so connection state stays consistent across forced disconnects.

Tests:
- Relax reconnect-backoff log assertion to tolerate jitter in the delay value (still asserts attempt number).

Validation:
- ./build.ps1 clean (0/0)
- Shared.Tests: 2045 passed / 29 skipped
- Tray.Tests: 877 passed
- Manual: tray launched from net10.0-windows10.0.22621.0 build

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
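
For readers following the disposal changes, here is a minimal sketch of the two-phase flag idea described in the commit message: a "disposing" flag set before the teardown callback, a "disposed" flag promoted afterwards, a guard that refuses new work once either is set, and a try/catch so a throwing OnDisposing cannot skip cleanup. It is not the PR's code. TwoPhaseDisposable, DemoClient, TryStartWork, CanStartWork, and ReleaseResources are hypothetical names chosen for illustration, and no real socket is involved.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the PR's WebSocketClientBase; only the
// two-phase disposal flags are modeled here.
public abstract class TwoPhaseDisposable : IDisposable
{
    private int _disposing; // 1 once teardown has started
    private int _disposed;  // 1 once teardown has finished

    // New work may start only while neither flag is set.
    protected bool CanStartWork =>
        Volatile.Read(ref _disposed) == 0 && Volatile.Read(ref _disposing) == 0;

    public bool IsDisposed => Volatile.Read(ref _disposed) == 1;

    // Returns false instead of running the work once disposal has begun.
    public bool TryStartWork(Action work)
    {
        if (!CanStartWork) return false;
        work();
        return true;
    }

    protected virtual void OnDisposing() { }
    protected virtual void ReleaseResources() { }

    public void Dispose()
    {
        // Only the first caller runs teardown; later calls are no-ops.
        if (Interlocked.CompareExchange(ref _disposing, 1, 0) != 0) return;
        try
        {
            OnDisposing();
        }
        catch (Exception)
        {
            // A throwing subclass must not skip the cleanup below.
        }
        finally
        {
            ReleaseResources();
            // Promote to "disposed" only after the callback has run.
            Volatile.Write(ref _disposed, 1);
        }
    }
}

// Hypothetical subclass whose teardown callback throws.
public sealed class DemoClient : TwoPhaseDisposable
{
    public bool SawDisposedInCallback { get; private set; }
    public bool Released { get; private set; }

    protected override void OnDisposing()
    {
        // Record what the callback observes, then fail on purpose.
        SawDisposedInCallback = IsDisposed;
        throw new InvalidOperationException("simulated subclass failure");
    }

    protected override void ReleaseResources() => Released = true;
}
```

In this sketch, calling Dispose() on a DemoClient still runs ReleaseResources() even though OnDisposing() throws, and TryStartWork returns false afterwards. The real change also has to deal with sockets, listener tasks, and reconnect loops, which this sketch deliberately leaves out.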

b42fb20 to 1f48af3

No code changes; previous push 1f48af3 addressed the Dispose() race finding. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>