Stop drain_proxy on subscriber arrival to prevent relay queue leak#233
Stop drain_proxy on subscriber arrival to prevent relay queue leak#233
Conversation
blackhole.stop() alone doesn't remove the drain_proxy from relay.__proxies. Relay continues putting frames into the dead proxy's queue, causing unbounded memory growth. Now add_track_subscriber calls drain_proxy.stop() (removes from relay) + blackhole.stop() (cancels internal recv task) + drain_task.cancel() (safety net).
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughRefactors SubscriberPeerConnection's incoming-video draining: Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/rtc/test_subscriber_drain.py (1)
25-42: Test uses mocks, which the coding guidelines discourage.The guidelines state: "Do not use mocks or mock objects in tests unless directly requested." However, this test is verifying internal cleanup mechanics (
drain_proxy.stop()being called) that would be difficult to test via integration testing without setting up full WebRTC connections.If mock usage was an intentional exception for this specific scenario, please disregard. Otherwise, consider whether an integration-style test could verify the memory leak fix by checking proxy counts on the relay.
As per coding guidelines: "Do not use mocks or mock objects in tests unless directly requested"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/rtc/test_subscriber_drain.py` around lines 25 - 42, The test TestAddTrackSubscriberStopsDrain uses Mock/AsyncMock for internal objects (_video_drains entries and blackhole/drain_proxy) which violates the "no mocks in tests" guideline; either convert the test to an integration-style assertion that exercises add_track_subscriber and verifies cleanup via observable behavior (e.g., create a real MediaRelay and real Track-like objects and assert the relay/proxy counts decrease or that the track_id is removed from subscriber_pc._video_drains after add_track_subscriber), or explicitly mark this test as an allowed exception with a short comment explaining why mocks are necessary; update references to add_track_subscriber, MediaRelay, subscriber_pc._video_drains, blackhole.stop and drain_proxy.stop to use real implementations or document the exception.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/rtc/test_subscriber_drain.py`:
- Around line 25-42: The test TestAddTrackSubscriberStopsDrain uses
Mock/AsyncMock for internal objects (_video_drains entries and
blackhole/drain_proxy) which violates the "no mocks in tests" guideline; either
convert the test to an integration-style assertion that exercises
add_track_subscriber and verifies cleanup via observable behavior (e.g., create
a real MediaRelay and real Track-like objects and assert the relay/proxy counts
decrease or that the track_id is removed from subscriber_pc._video_drains after
add_track_subscriber), or explicitly mark this test as an allowed exception with
a short comment explaining why mocks are necessary; update references to
add_track_subscriber, MediaRelay, subscriber_pc._video_drains, blackhole.stop
and drain_proxy.stop to use real implementations or document the exception.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 851c8717-fc5a-4e42-8561-3654685272d5
📒 Files selected for processing (2)
getstream/video/rtc/pc.pytests/rtc/test_subscriber_drain.py
Single None check on the tuple instead of checking each element.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@getstream/video/rtc/pc.py`:
- Around line 211-217: handle_track_ended() currently clears track_map and
video_frame_trackers but does not clean up entries in _video_drains, leaking
(blackhole, drain_task, drain_proxy) for tracks that end before
add_track_subscriber() is called; update handle_track_ended() to mirror the
cleanup in the other branch by popping the track_id from _video_drains, and if
present call blackhole.stop() (schedule with asyncio.create_task if needed),
call drain_proxy.stop(), and cancel drain_task to fully release resources
associated with that track. Ensure you reference the same tuple semantics used
where video_drain is unpacked into blackhole, drain_task, drain_proxy so
behavior matches add_track_subscriber() cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
If a participant leaves before anyone subscribes to their video, the drain entry stays pinned in _video_drains for the lifetime of the peer connection.
Why
After #232,
add_track_subscribercallsblackhole.stop()to clean up the drain. Butblackhole.stop()only cancels the internal recv task - it doesn't remove thedrain_proxyfromrelay.__proxies. The relay worker continues putting frames into the dead proxy's unbounded queue, causing OOM for agents with video processors.Changes
drain_proxyalongside blackhole and drain_task in_video_drainstupledrain_proxy.stop()inadd_track_subscriberto remove proxy from relay__proxiesblackhole.stop()to cancel internal task blocked onqueue.get()drain_task.cancel()as safety netdrain_proxy.stop()is calledHow the cleanup works
Companion PR
Summary by CodeRabbit
Bug Fixes
Tests