[RHIDP-11647] Add Interrupt POST Endpoint for /v1/streaming_query #1176

Open
Jdubrick wants to merge 10 commits into lightspeed-core:main from Jdubrick:query-interrupt

Conversation

@Jdubrick (Contributor) commented Feb 18, 2026

Description

  • Adds /streaming_query/interrupt so that users can stop an in-flight streaming request
  • Each streaming query now gets a generated request_id attached to it; the owning user can use it to trigger the interrupt

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude Opus 4.6
  • Generated by:

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • POST /v1/streaming_query/interrupt to stop in-progress streams.
    • Streaming start SSE events and responses now include request_id.
    • Interrupt responses include request_id, interrupted (bool), and a human-readable message.
  • Bug Fixes / Behavioral

    • Cancelled streams emit interruption SSE, are deregistered, and skip post-stream side effects.
  • Documentation & Tests

    • OpenAPI updated and new integration/unit tests cover interrupt lifecycle.
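The request/response shapes summarized above can be sketched as plain payloads. This is an illustration only: the field names come from the summary (request_id, interrupted, message), while the authoritative Pydantic schemas are StreamingInterruptRequest and StreamingInterruptResponse in src/models/requests.py and src/models/responses.py.

```python
import json
import uuid

# Hypothetical payloads based on the fields listed in the PR summary.
request_id = str(uuid.uuid4())  # streams are keyed by a generated request_id

# Body a client would POST to /v1/streaming_query/interrupt.
interrupt_request = {"request_id": request_id}

# Shape of the 200 response described in the summary.
interrupt_response = {
    "request_id": request_id,
    "interrupted": True,                        # whether a live stream was cancelled
    "message": "Streaming query interrupted",   # human-readable status
}

# Round-trip through JSON as a client/server pair would.
decoded = json.loads(json.dumps(interrupt_response))
```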


coderabbitai bot commented Feb 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds an in-memory interrupt registry, a new POST /v1/streaming_query/interrupt endpoint with request/response models, and wiring in the streaming flow to register per-request asyncio tasks, allow cancellation by request_id/user, emit interrupted SSE events, and update OpenAPI, routers, models, and tests.

Changes

Cohort / File(s) Summary
Interrupt Registry
src/utils/stream_interrupts.py
New thread-safe StreamInterruptRegistry with ActiveStream dataclass, CancelStreamResult enum, and register/cancel/deregister/get methods plus DI getter.
Interrupt Endpoint
src/app/endpoints/stream_interrupt.py
New APIRouter and POST /streaming_query/interrupt handler using auth + registry DI; validates ownership, cancels registered task, and returns StreamingInterruptResponse or appropriate error responses.
Streaming Integration
src/app/endpoints/streaming_query.py
Per-request request_id generation; register/deregister asyncio task with registry; stream_start_event signature updated to include request_id; added stream_interrupted_event; handles asyncio.CancelledError to emit interrupted SSE and skip post-stream side effects.
Models — Requests & Responses
src/models/requests.py, src/models/responses.py
Added StreamingInterruptRequest (validated request_id) and StreamingInterruptResponse (request_id, interrupted, message); updated SSE examples and added a NotFoundResponse example for streaming requests.
Context Model
src/models/context.py
Added request_id: str field to ResponseGeneratorContext.
Routing & OpenAPI
src/app/routers.py, docs/openapi.json
Registered new router under /v1; OpenAPI updated with POST /v1/streaming_query/interrupt, new schemas, SSE start event example now includes request_id; A2A operationId tweaks.
Tests — Unit & Integration
tests/unit/.../test_stream_interrupt.py, tests/unit/.../test_streaming_query.py, tests/integration/.../test_stream_interrupt_integration.py, tests/integration/test_openapi_json.py, tests/unit/app/test_routers.py
Added unit/integration tests for interrupt lifecycle, ownership/cancellation cases; updated streaming tests to propagate request_id and isolate registry per test.
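The registry described in the table above can be reduced to a short stdlib-only sketch. This is not the actual src/utils/stream_interrupts.py: the real module also has a CancelStreamResult enum, a get_stream method, and a DI getter, which are omitted here, and method names are assumed from the file summary.

```python
import asyncio
import threading
from dataclasses import dataclass


@dataclass
class ActiveStream:
    """One in-flight streaming request: its owner and its asyncio task."""
    user_id: str
    task: asyncio.Task


class StreamInterruptRegistry:
    """Thread-safe in-memory map of request_id -> ActiveStream."""

    def __init__(self) -> None:
        self._streams: dict[str, ActiveStream] = {}
        self._lock = threading.Lock()

    def register_stream(self, request_id: str, user_id: str, task: asyncio.Task) -> None:
        with self._lock:
            self._streams[request_id] = ActiveStream(user_id=user_id, task=task)

    def cancel_stream(self, request_id: str, user_id: str) -> bool:
        """Cancel the stream only if it exists and belongs to user_id."""
        with self._lock:
            stream = self._streams.get(request_id)
            if stream is None or stream.user_id != user_id:
                return False  # unknown id, already finished, or wrong owner
            stream.task.cancel()
            return True

    def deregister_stream(self, request_id: str) -> None:
        with self._lock:
            self._streams.pop(request_id, None)
```

The ownership check inside the lock is what turns a wrong-user interrupt into the same 404 path as an unknown request_id, which matches the interrupt endpoint's error behavior described above.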

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant StreamEndpoint as "Streaming Endpoint\n(/v1/streaming_query)"
    participant Registry as "Interrupt Registry\n(StreamInterruptRegistry)"
    participant StreamTask as "Active Stream Task\n(asyncio.Task)"
    participant InterruptEndpoint as "Interrupt Endpoint\n(/v1/streaming_query/interrupt)"

    Client->>StreamEndpoint: POST /v1/streaming_query
    activate StreamEndpoint
    StreamEndpoint->>StreamEndpoint: generate request_id, start task
    StreamEndpoint->>Registry: register_stream(request_id, user_id, task)
    StreamEndpoint-->>Client: SSE start event (includes request_id)
    StreamTask->>Client: stream events...

    Client->>InterruptEndpoint: POST /v1/streaming_query/interrupt (request_id)
    activate InterruptEndpoint
    InterruptEndpoint->>Registry: cancel_stream(request_id, user_id)
    Registry->>StreamTask: cancel task
    StreamTask-->>StreamEndpoint: raises CancelledError
    StreamEndpoint-->>Client: SSE interrupted event
    StreamEndpoint->>Registry: deregister_stream(request_id)
    InterruptEndpoint-->>Client: 200 {interrupted:true, request_id, message}
    deactivate InterruptEndpoint
    deactivate StreamEndpoint

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
  • are-ces
🚥 Pre-merge checks: ✅ 3 passed

  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically summarizes the main feature: adding a new interrupt POST endpoint for streaming queries. It is concise, directly related to the primary change, and avoids vague terminology.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 87.18%, above the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/app/endpoints/streaming_query.py (1)

828-829: ⚠️ Potential issue | 🟡 Minor

Stale docstring in shield_violation_generator — it no longer yields start or end events.

The function body yields only a single token event. The start event is now emitted by generate_response before iterating any generator, and the end event is emitted by generate_response after completion.

🔧 Proposed fix
-    Yields start, token, and end events immediately for shield violations.
-    This function creates a minimal streaming response without going through
-    the Llama Stack response format.
+    Yields a single token event containing the violation message.
+    Start and end events are emitted by the caller (``generate_response``).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 828 - 829, The docstring
for shield_violation_generator is stale: it claims the generator yields start,
token, and end events but the implementation yields only a single token event;
update the docstring in the shield_violation_generator function to reflect the
current behavior (it yields a single token/event representing the violation, not
start or end events) and mention that generate_response now emits the start and
end events; reference the function name shield_violation_generator and update
wording to describe the single token payload and any fields it contains (e.g.,
token content and reason).
docs/openapi.json (1)

4414-4440: ⚠️ Potential issue | 🟠 Major

Make operationId unique between GET and POST.

Both /a2a GET and POST now share handle_a2a_jsonrpc_a2a_get, which violates OpenAPI uniqueness and breaks client generation. Use distinct IDs per method (e.g., restore _post for POST).

✅ Suggested fix
-                "operationId": "handle_a2a_jsonrpc_a2a_get",
+                "operationId": "handle_a2a_jsonrpc_a2a_post",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/openapi.json` around lines 4414 - 4440, The OpenAPI spec currently
reuses the same operationId "handle_a2a_jsonrpc_a2a_get" for both the GET and
POST A2A endpoints; change the POST operationId to a unique identifier (for
example "handle_a2a_jsonrpc_a2a_post") so GET and POST have distinct
operationIds and client generation won't fail—update the "operationId" field
under the POST operation accordingly.
tests/unit/app/test_routers.py (1)

133-168: ⚠️ Potential issue | 🟡 Minor

Update the docstring count to match the new assertions.
The docstring still says 16 routers while the test now asserts 20.

📝 Suggested docstring fix
-    Asserts that 16 routers are registered on a MockFastAPI instance and that
+    Asserts that 20 routers are registered on a MockFastAPI instance and that
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/test_routers.py` around lines 133 - 168, The
test_check_prefixes docstring is out of date (it says 16 routers) — update the
docstring in the test_check_prefixes function to reflect the current assertions
(20 routers) and, if present, adjust any explanatory text about which routers
have which prefixes so it matches the actual checks performed for
include_routers against the MockFastAPI instance (references:
test_check_prefixes, include_routers, MockFastAPI, and the router symbols like
conversations_v2.router/conversations_v1.router).
🧹 Nitpick comments (7)
src/utils/stream_interrupts.py (2)

17-17: asyncio.Task is missing a type parameter.

asyncio.Task is generic since Python 3.9; prefer asyncio.Task[Any] for completeness.

🔧 Proposed fix
+from typing import Any

 @dataclass
 class ActiveStream:
     user_id: str
-    task: asyncio.Task
+    task: asyncio.Task[Any]

Apply the same change to the register_stream signature:

-    def register_stream(self, request_id: str, user_id: str, task: asyncio.Task) -> None:
+    def register_stream(self, request_id: str, user_id: str, task: asyncio.Task[Any]) -> None:

As per coding guidelines: "Use complete type annotations for all function parameters and return types."

Also applies to: 29-29

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` at line 17, The task type annotations use the
bare generic asyncio.Task; update them to asyncio.Task[Any] and add Any to
imports (from typing import Any) so type checks are complete; specifically
change the parameter/type annotations for the task variable and the
register_stream function signature in stream_interrupts.py from asyncio.Task to
asyncio.Task[Any], and ensure the top-level imports include Any so the code
compiles with the updated annotations.

28-69: Method docstrings are missing Google-style Args sections.

None of the four public methods document their parameters.

As per coding guidelines: "Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` around lines 28 - 69, Add Google-style
docstring parameter sections to the four public methods in this class:
register_stream, cancel_stream, deregister_stream, and get_stream. For each
method include an "Args:" block that documents each parameter (request_id: str,
user_id: str where present, task: asyncio.Task where present) and for
cancel_stream and get_stream also add a "Returns:" block describing the return
value (bool for cancel_stream, Optional[ActiveStream] for get_stream). Keep
wording brief and follow the existing one-line summary then the Args/Returns
headers per Google Python docstring conventions.
src/app/endpoints/streaming_query.py (1)

87-88: stream_interrupt_registry singleton used directly in generate_response, breaking test overrides.

The interrupt endpoint correctly injects the registry via get_stream_interrupt_registry() (allowing app.dependency_overrides in tests), but generate_response imports and calls the module-level singleton directly. Integration tests for the streaming path cannot substitute a mock registry without patching the module.

♻️ Proposed refactor — thread the registry through
 async def generate_response(
     generator: AsyncIterator[str],
     context: ResponseGeneratorContext,
     responses_params: ResponsesApiParams,
     turn_summary: TurnSummary,
     request_id: str,
+    registry: StreamInterruptRegistry | None = None,
 ) -> AsyncIterator[str]:
     ...
+    if registry is None:
+        from utils.stream_interrupts import get_stream_interrupt_registry
+        registry = get_stream_interrupt_registry()
     current_task = asyncio.current_task()
     if current_task is not None:
-        stream_interrupt_registry.register_stream(...)
+        registry.register_stream(...)
     ...
     finally:
-        stream_interrupt_registry.deregister_stream(request_id)
+        registry.deregister_stream(request_id)

Then pass registry from the endpoint, obtained via Depends(get_stream_interrupt_registry).

Also applies to: 321-326, 347-355, 390-391

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 87 - 88, generate_response
currently uses the module-level stream_interrupt_registry singleton (imported
from utils.stream_interrupts), which prevents test overrides; change
generate_response to accept a registry parameter (e.g., registry) and stop
referencing the module-level stream_interrupt_registry, then update the
streaming endpoint to inject the registry via
Depends(get_stream_interrupt_registry) and pass it into generate_response. Also
propagate this parameter through other call sites that use the singleton in this
file (the other invocations noted around the earlier blocks) so they call
generate_response (or the helper that invokes it) with the injected registry
instead of the module singleton.
tests/integration/test_openapi_json.py (1)

220-231: Consider adding 422 for consistency with other POST endpoints in this suite.
Both /v1/query and /v1/streaming_query include 422 in expected codes; keeping the interrupt endpoint consistent improves coverage expectations.

🔧 Suggested update
-            {"200", "401", "403", "404"},
+            {"200", "401", "403", "404", "422"},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_openapi_json.py` around lines 220 - 231, The
"/v1/streaming_query/interrupt" POST expected status set is missing 422; update
the test expectations by adding "422" to the status code set for the route tuple
(" /v1/streaming_query/interrupt", "post", {...}) so it matches the other POST
endpoints (e.g., "/v1/query" and "/v1/streaming_query") and maintains consistent
coverage.
src/app/endpoints/stream_interrupt.py (3)

5-5: Add status to the FastAPI import per guidelines.

Request and status are prescribed in the import template. While Request is not used here, status should be imported and used explicitly for the status_code on @router.post.

🔧 Proposed fix
-from fastapi import APIRouter, Depends, HTTPException
+from fastapi import APIRouter, Depends, HTTPException, status

Then on the @router.post decorator:

 @router.post(
     "/streaming_query/interrupt",
+    status_code=status.HTTP_200_OK,
     responses=stream_interrupt_responses,
     summary="Streaming Query Interrupt Endpoint Handler",
 )

As per coding guidelines: "Import FastAPI components using: from fastapi import APIRouter, HTTPException, Request, status, Depends".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/stream_interrupt.py` at line 5, Import the FastAPI status
helper and use it as the explicit status_code in the route decorator: update the
import line to include status (from fastapi import APIRouter, Depends,
HTTPException, status) and modify the `@router.post` decorator in this module (the
APIRouter route for stream interruption) to include
status_code=status.HTTP_200_OK (or the appropriate status constant) instead of a
raw integer or implicit default.

48-48: Expand the docstring to follow Google Python conventions.

The one-liner is missing the mandatory Parameters, Returns, and Raises sections.

📝 Proposed fix
-    """Interrupt an in-progress streaming query by request identifier."""
+    """Interrupt an in-progress streaming query by request identifier.
+
+    Parameters:
+        interrupt_request (StreamingInterruptRequest): Payload containing the
+            request_id of the stream to cancel.
+        auth (AuthTuple): Resolved authentication tuple; the first element is
+            used as the owning user_id.
+        registry (StreamInterruptRegistry): Active stream registry used to
+            cancel the targeted task.
+
+    Returns:
+        StreamingInterruptResponse: Confirmation payload with interrupted=True.
+
+    Raises:
+        HTTPException: 404 when no active stream matching the request_id and
+            user_id is found (stream absent, already complete, or owned by
+            another user).
+    """

As per coding guidelines: "All functions must have complete docstrings with brief descriptions" and "Follow Google Python docstring conventions for modules, classes, and functions with Parameters, Returns, Raises, and Attributes sections."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/stream_interrupt.py` at line 48, The one-line docstring in
stream_interrupt.py should be expanded to a full Google-style docstring for the
endpoint function (e.g., interrupt_stream or the function that "Interrupt an
in-progress streaming query by request identifier")—add a brief description plus
a Parameters section describing the request identifier arg (type and meaning), a
Returns section describing the response (type and structure), and a Raises
section listing possible exceptions (e.g., ValueError/KeyError or HTTPError) and
when they are raised; place this docstring immediately above the endpoint
handler function definition and follow Google Python conventions for formatting.

3-22: Add a module-level logger — required by guidelines and needed for observability.

The file has no logger = get_logger(__name__) and the handler emits no log entries. The registry already logs the wrong-user warning, but the endpoint layer logs nothing: successful cancellations and all 404 paths are entirely silent in endpoint traces.

🔧 Proposed fix
+from log import get_logger
 from models.config import Action
 from models.requests import StreamingInterruptRequest
 ...

+logger = get_logger(__name__)

 router = APIRouter(tags=["streaming_query_interrupt"])

Then inside the handler:

     interrupted = registry.cancel_stream(request_id, user_id)
     if not interrupted:
+        logger.debug("Stream interrupt not fulfilled for request_id=%s", request_id)
         response = NotFoundResponse(
             resource="streaming request",
             resource_id=request_id,
         )
         raise HTTPException(**response.model_dump())
+    logger.info("Stream interrupted: request_id=%s user_id=%s", request_id, user_id)
     return StreamingInterruptResponse(...)

As per coding guidelines: "Use logger = get_logger(__name__) from log.py for module logging" and "Use logger.debug() for detailed diagnostic information, logger.info() for general execution info."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/stream_interrupt.py` around lines 3 - 22, Add a
module-level logger by importing get_logger from log.py and defining logger =
get_logger(__name__) at top of src/app/endpoints/stream_interrupt.py, then
update the request handler that uses StreamingInterruptRequest,
StreamInterruptRegistry/get_stream_interrupt_registry to emit logs: use
logger.info() when a stream interruption is successfully cancelled (include
stream id and user from the request), and use logger.debug() or logger.info()
when returning a 404 path (e.g., interrupt not found) to record the attempted
stream id and caller; keep existing registry warnings intact. Ensure you use the
module-level logger variable (logger) rather than print or other loggers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/streaming_query.py`:
- Around line 349-355: When asyncio.current_task() returns None the stream isn't
registered and interrupt requests silently fail; update the block around
current_task and stream_interrupt_registry.register_stream to log a warning when
current_task is None (include request_id and user_id in the message) so failures
are diagnosable, i.e., call logger.warning(...) describing that no current task
was found for the given request_id/user_id and that interrupt registration was
skipped.
- Around line 387-394: The except asyncio.CancelledError handler currently
swallows cancellation and only logs the event; update it to call task.uncancel()
on the current task and then re-raise the CancelledError after running cleanup
so asyncio cancellation semantics are preserved. Specifically, in the except
block around the streaming loop that logs "Streaming request %s interrupted by
user" and yields stream_interrupted_event(request_id), call
asyncio.current_task().uncancel() (or the appropriate task reference) before
performing any non-exceptional cleanup, ensure
stream_interrupt_registry.deregister_stream(request_id) still runs in finally,
and then re-raise the CancelledError so that outer timeouts/TaskGroup
cancellation behave correctly while keeping the existing if not
stream_completed: return behavior to skip post-stream side effects.

In `@src/models/requests.py`:
- Around line 270-293: Update the docstrings for the StreamingInterruptRequest
class and its validator check_request_id to follow Google style: add an
"Attributes" section to the class docstring describing request_id (type: str,
description: the active streaming request ID) and include example format; add a
full docstring for check_request_id that contains "Parameters" (value: str — the
request ID to validate), "Returns" (str — the validated request ID), and
"Raises" (ValueError — if suid.check_suid(value) returns False), referencing the
validator's call to suid.check_suid to validate format; ensure wording is brief
and consistent with existing project docstring style.

In `@src/utils/stream_interrupts.py`:
- Line 7: Replace all legacy Optional[ActiveStream] annotations with the modern
union syntax ActiveStream | None; specifically update the import/annotation at
the top that currently reads "from typing import Optional" and any function,
variable, or return type annotations referencing Optional[ActiveStream] (e.g.,
the ActiveStream-typed parameters/returns around the ActiveStream class and
usages on lines referenced) to use "ActiveStream | None" instead; ensure any
other typing hints in the same file (notably the occurrences mentioned around
lines 66-69) are similarly converted and remove the now-unused Optional import
if no longer required.
- Around line 4-9: Replace the stdlib logger instantiation with the project's
centralized logger: remove the `import logging` and `logger =
logging.getLogger(__name__)` usage and instead call `get_logger(__name__)` from
`log.py`; update the top of the module to import `get_logger` and set `logger =
get_logger(__name__)` (this affects the module-level `logger` symbol in this
file so any existing references remain unchanged).
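The cancellation-semantics point in the streaming_query.py comment above (emit the interrupted event, keep the finally-block deregistration, then re-raise) can be demonstrated with a minimal, self-contained sketch. Names like "token" and the tuple events are placeholders for the real SSE helpers; the reviewer's additional asyncio.Task.uncancel() refinement (Python 3.11+) is noted but omitted so the sketch runs on older interpreters.

```python
import asyncio

async def generate_response(chunks, emitted):
    """Sketch of a streaming generator body: emit chunks; on cancellation,
    emit an 'interrupted' event, deregister in finally, then re-raise."""
    try:
        for chunk in chunks:
            emitted.append(("token", chunk))
            await asyncio.sleep(0)  # stand-in for awaiting the next SSE chunk
    except asyncio.CancelledError:
        # Stand-in for yielding stream_interrupted_event(request_id).
        emitted.append(("interrupted", None))
        # Re-raising preserves asyncio cancellation semantics so outer
        # timeouts / TaskGroups see the task as cancelled.
        raise
    finally:
        # Stand-in for stream_interrupt_registry.deregister_stream(request_id);
        # runs on both normal completion and cancellation.
        emitted.append(("deregistered", None))
```

Because the CancelledError propagates out, the task finishes in the cancelled state while the cleanup in finally still runs, which is exactly the behavior the review asks to preserve.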

---

Outside diff comments:
In `@docs/openapi.json`:
- Around line 4414-4440: The OpenAPI spec currently reuses the same operationId
"handle_a2a_jsonrpc_a2a_get" for both the GET and POST A2A endpoints; change the
POST operationId to a unique identifier (for example
"handle_a2a_jsonrpc_a2a_post") so GET and POST have distinct operationIds and
client generation won't fail—update the "operationId" field under the POST
operation accordingly.

In `@src/app/endpoints/streaming_query.py`:
- Around line 828-829: The docstring for shield_violation_generator is stale: it
claims the generator yields start, token, and end events but the implementation
yields only a single token event; update the docstring in the
shield_violation_generator function to reflect the current behavior (it yields a
single token/event representing the violation, not start or end events) and
mention that generate_response now emits the start and end events; reference the
function name shield_violation_generator and update wording to describe the
single token payload and any fields it contains (e.g., token content and
reason).

In `@tests/unit/app/test_routers.py`:
- Around line 133-168: The test_check_prefixes docstring is out of date (it says
16 routers) — update the docstring in the test_check_prefixes function to
reflect the current assertions (20 routers) and, if present, adjust any
explanatory text about which routers have which prefixes so it matches the
actual checks performed for include_routers against the MockFastAPI instance
(references: test_check_prefixes, include_routers, MockFastAPI, and the router
symbols like conversations_v2.router/conversations_v1.router).

---

Duplicate comments:
In `@tests/integration/test_openapi_json.py`:
- Around line 307-318: The HTTP response code sets for the OpenAPI path entries
(e.g., "/v1/query", "/v1/streaming_query", "/v1/streaming_query/interrupt",
"/v1/config") are inconsistent with the rest of the test; normalize them to
match the project's canonical format (same ordering and representation for
status codes) — e.g., use a consistently ordered tuple/list or a sorted set
literal across all entries or the same formatting used elsewhere in
test_openapi_json.py so these four lines follow the same style as the other path
entries.

---

Nitpick comments:
In `@src/app/endpoints/stream_interrupt.py`:
- Line 5: Import the FastAPI status helper and use it as the explicit
status_code in the route decorator: update the import line to include status
(from fastapi import APIRouter, Depends, HTTPException, status) and modify the
`@router.post` decorator in this module (the APIRouter route for stream
interruption) to include status_code=status.HTTP_200_OK (or the appropriate
status constant) instead of a raw integer or implicit default.
- Line 48: The one-line docstring in stream_interrupt.py should be expanded to a
full Google-style docstring for the endpoint function (e.g., interrupt_stream or
the function that "Interrupt an in-progress streaming query by request
identifier")—add a brief description plus a Parameters section describing the
request identifier arg (type and meaning), a Returns section describing the
response (type and structure), and a Raises section listing possible exceptions
(e.g., ValueError/KeyError or HTTPError) and when they are raised; place this
docstring immediately above the endpoint handler function definition and follow
Google Python conventions for formatting.
- Around line 3-22: Add a module-level logger by importing get_logger from
log.py and defining logger = get_logger(__name__) at top of
src/app/endpoints/stream_interrupt.py, then update the request handler that uses
StreamingInterruptRequest, StreamInterruptRegistry/get_stream_interrupt_registry
to emit logs: use logger.info() when a stream interruption is successfully
cancelled (include stream id and user from the request), and use logger.debug()
or logger.info() when returning a 404 path (e.g., interrupt not found) to record
the attempted stream id and caller; keep existing registry warnings intact.
Ensure you use the module-level logger variable (logger) rather than print or
other loggers.

In `@src/app/endpoints/streaming_query.py`:
- Around line 87-88: generate_response currently uses the module-level
stream_interrupt_registry singleton (imported from utils.stream_interrupts),
which prevents test overrides; change generate_response to accept a registry
parameter (e.g., registry) and stop referencing the module-level
stream_interrupt_registry, then update the streaming endpoint to inject the
registry via Depends(get_stream_interrupt_registry) and pass it into
generate_response. Also propagate this parameter through other call sites that
use the singleton in this file (the other invocations noted around the earlier
blocks) so they call generate_response (or the helper that invokes it) with the
injected registry instead of the module singleton.

In `@src/utils/stream_interrupts.py`:
- Line 17: The task type annotations use the bare generic asyncio.Task; update
them to asyncio.Task[Any] and add Any to imports (from typing import Any) so
type checks are complete; specifically change the parameter/type annotations for
the task variable and the register_stream function signature in
stream_interrupts.py from asyncio.Task to asyncio.Task[Any], and ensure the
top-level imports include Any so the code compiles with the updated annotations.
- Around line 28-69: Add Google-style docstring parameter sections to the four
public methods in this class: register_stream, cancel_stream, deregister_stream,
and get_stream. For each method include an "Args:" block that documents each
parameter (request_id: str, user_id: str where present, task: asyncio.Task where
present) and for cancel_stream and get_stream also add a "Returns:" block
describing the return value (bool for cancel_stream, Optional[ActiveStream] for
get_stream). Keep wording brief and follow the existing one-line summary then
the Args/Returns headers per Google Python docstring conventions.

In `@tests/integration/test_openapi_json.py`:
- Around line 220-231: The "/v1/streaming_query/interrupt" POST expected status
set is missing 422; update the test expectations by adding "422" to the status
code set for the route tuple (" /v1/streaming_query/interrupt", "post", {...})
so it matches the other POST endpoints (e.g., "/v1/query" and
"/v1/streaming_query") and maintains consistent coverage.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/app/endpoints/stream_interrupt.py (1)

1-21: Add module-level logger per coding guidelines.

The module has no logger defined. The coding guidelines require logger = get_logger(__name__) for all source modules. Since a success path (e.g., confirming a successful interrupt) and the @authorize dependency both benefit from audit-level logging, this is also useful for observability.

♻️ Proposed addition
 """Endpoint for interrupting in-progress streaming query requests."""
 
 from typing import Annotated, Any
 
 from fastapi import APIRouter, Depends, HTTPException
 
 from authentication import get_auth_dependency
 from authentication.interface import AuthTuple
 from authorization.middleware import authorize
+from log import get_logger
 from models.config import Action
 ...
+logger = get_logger(__name__)

As per coding guidelines: "Use logger = get_logger(__name__) from log.py for module logging."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/stream_interrupt.py` around lines 1 - 21, Add a
module-level logger by importing get_logger and defining logger =
get_logger(__name__) at top of this module (so it’s available to the APIRouter
handlers, the authorize dependency usage, and functions interacting with
StreamInterruptRegistry / get_stream_interrupt_registry); then replace or
augment existing audit/success logging sites (e.g., where a
StreamingInterruptResponse is returned or where `@authorize` triggers) to use this
logger for audit-level messages. Ensure the import comes from the same logging
utility used elsewhere (get_logger) and that the logger variable name is exactly
logger to match coding guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/stream_interrupt.py`:
- Around line 25-32: The 404 example label "streaming request" used in
stream_interrupt_responses does not exist on NotFoundResponse, so OpenAPI will
show no examples; fix by either changing the examples argument in
stream_interrupt_responses to an existing label such as "conversation" (i.e.,
replace "streaming request" with "conversation") or add a new labeled example
"streaming request" to
NotFoundResponse.model_config["json_schema_extra"]["examples"] (update the
NotFoundResponse class to include that key/value) so the
NotFoundResponse.openapi_response(examples=[...]) call can find it.

In `@src/utils/stream_interrupts.py`:
- Around line 3-7: The import block violates import ordering and includes an
unused symbol: move all standard-library imports (asyncio,
dataclasses.dataclass, threading.Lock, and typing if used) before the
first-party get_logger import and remove the unused typing.Any import entirely;
specifically, reorder so dataclass and Lock imports come before get_logger (or
import Lock via threading) and delete the "from typing import Any" line to
resolve the C0411 and F401/W0611 linter errors in this module (look for
get_logger, dataclass, and Lock to locate the relevant imports).

---

Duplicate comments:
In `@src/app/endpoints/streaming_query.py`:
- Around line 392-396: The except block currently swallows
asyncio.CancelledError; modify it so after yielding
stream_interrupted_event(request_id) and after finally calling
stream_interrupt_registry.deregister_stream(request_id) you re-raise the
CancelledError to preserve asyncio semantics and outer cancel propagation, and
before re-raising call asyncio.current_task().uncancel() (if available) to clear
the internal cancellation counter; also change the logger message in the except
from "interrupted by user" to a neutral "interrupted" (or include a source hint
if available) so it doesn't misleadingly imply user-initiated interrupts. Ensure
you update the except handler that references request_id,
stream_interrupted_event, and stream_interrupt_registry.deregister_stream
accordingly.

---

Nitpick comments:
In `@src/app/endpoints/stream_interrupt.py`:
- Around line 1-21: Add a module-level logger by importing get_logger and
defining logger = get_logger(__name__) at top of this module (so it’s available
to the APIRouter handlers, the authorize dependency usage, and functions
interacting with StreamInterruptRegistry / get_stream_interrupt_registry); then
replace or augment existing audit/success logging sites (e.g., where a
StreamingInterruptResponse is returned or where `@authorize` triggers) to use this
logger for audit-level messages. Ensure the import comes from the same logging
utility used elsewhere (get_logger) and that the logger variable name is exactly
logger to match coding guidelines.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/utils/stream_interrupts.py (2)

59-73: Consider asyncio.Lock over threading.Lock for async-idiomatic behaviour.

The threading.Lock is acquired for the duration of the task.done() check and task.cancel() call (lines 59-73). Because task.cancel() itself only schedules a CancelledError on the event loop and returns immediately, the lock hold-time is negligible and blocking the loop is not a practical concern. However, using asyncio.Lock (and making the method async) would:

  • be idiomatic for code living in an async service;
  • avoid the subtle footgun of calling a blocking primitive from a coroutine if the critical section ever grows.
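Under that suggestion the registry would look roughly like this (a sketch, assuming the ActiveStream record described elsewhere in the review; names are illustrative):

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ActiveStream:
    """One in-flight streaming request: its owner and its asyncio task."""

    user_id: str
    task: "asyncio.Task"


class AsyncStreamInterruptRegistry:
    """Variant of the registry guarded by an asyncio.Lock."""

    def __init__(self) -> None:
        self._streams: dict[str, ActiveStream] = {}
        self._lock = asyncio.Lock()  # acquired without blocking the event loop

    def register_stream(
        self, request_id: str, user_id: str, task: "asyncio.Task"
    ) -> None:
        # A plain dict assignment suffices for registration; the lock
        # matters for the check-then-cancel sequence below.
        self._streams[request_id] = ActiveStream(user_id=user_id, task=task)

    async def cancel_stream(self, request_id: str, user_id: str) -> bool:
        # Callers must now `await registry.cancel_stream(...)`.
        async with self._lock:
            stream = self._streams.get(request_id)
            if stream is None or stream.user_id != user_id:
                return False
            if stream.task.done():
                return False
            return stream.task.cancel()
```

The trade-off noted above stands: all call sites become async, or need a thin sync wrapper that schedules the coroutine.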
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` around lines 59 - 73, The current code uses a
threading.Lock (self._lock) inside the method that checks stream.task.done() and
calls stream.task.cancel(); switch to an asyncio.Lock to be async-idiomatic:
initialize self._lock = asyncio.Lock() in the class, change the method to async
and replace the blocking with "async with self._lock:" around the critical
section (the stream lookup, user_id check, task.done() and task.cancel() calls),
and update all callers to await this method (or add a thin sync wrapper that
schedules it) so the lock is acquired without blocking the event loop.

97-106: In-memory singleton is isolated per process — multi-worker deployments will silently miss interrupts.

stream_interrupt_registry is a module-level object: each OS process running the application owns an independent copy. Under a --workers N uvicorn/gunicorn configuration, a POST /v1/streaming_query/interrupt request may be dispatched to a worker that did not register the target stream, causing a spurious 404.

Consider documenting this constraint prominently or, for production scale-out, replacing the in-memory store with a shared backend (Redis, a database table, or a shared-memory mechanism) so interrupt requests can reach the owning worker regardless of routing.
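One way to make it pluggable, as suggested, is a small factory keyed off configuration (a sketch; the environment variable name and the Redis branch are assumptions, and no shared backend is implemented here):

```python
import os


class StreamInterruptRegistry:
    """In-memory registry; state is visible only within one worker process."""

    def __init__(self) -> None:
        self._streams: dict = {}


def create_stream_interrupt_registry() -> StreamInterruptRegistry:
    """Build the registry backend selected by configuration."""
    backend = os.environ.get("STREAM_INTERRUPT_BACKEND", "memory")
    if backend == "redis":
        # Hypothetical shared backend so an interrupt reaches the owning
        # worker regardless of which process handles the HTTP request.
        raise NotImplementedError("Redis-backed registry not implemented")
    return StreamInterruptRegistry()


_registry = create_stream_interrupt_registry()


def get_stream_interrupt_registry() -> StreamInterruptRegistry:
    """Dependency-injection hook; returns the factory-built instance."""
    return _registry
```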

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` around lines 97 - 106, The module-level
in-memory singleton stream_interrupt_registry (returned by
get_stream_interrupt_registry and typed as StreamInterruptRegistry) will be
isolated per OS process causing missed interrupts in multi-worker deployments;
fix by making the registry pluggable: add a factory (e.g.,
create_stream_interrupt_registry) that reads configuration/ENV and returns
either the current in-memory StreamInterruptRegistry or a
RedisBackedStreamInterruptRegistry/DB-backed implementation, replace the
module-level instantiation with a call to that factory, and update
get_stream_interrupt_registry to return the factory-created instance; if you
prefer not to implement a shared backend now, add a prominent module-level
docstring and README note explaining the multi-worker limitation and
recommending Redis/DB for production.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/stream_interrupts.py`:
- Around line 20-21: The Task field is using a bare asyncio.Task which triggers
generic-type warnings; update the annotation for the streaming task to
asyncio.Task[None] in the relevant dataclass/structure (the task field in
src/utils/stream_interrupts.py) so the generic parameter is explicit—i.e.,
change the type of task to asyncio.Task[None] wherever it's declared (and adjust
imports if needed).
- Around line 45-73: The cancel_stream method conflates three distinct failure
modes into a single False return, so update cancel_stream to return a granular
result (an enum or typed tuple) instead of bool—e.g., CancelResult = Enum
{NOT_FOUND, FORBIDDEN, ALREADY_DONE, CANCELLED}—and adjust logic in
cancel_stream (use self._lock, self._streams.get(request_id), compare
stream.user_id, check stream.task.done(), call stream.task.cancel()) to return
the appropriate CancelResult in each branch; then update the caller in the
stream interrupt endpoint to map CancelResult.NOT_FOUND -> 404, .FORBIDDEN ->
404 (if you keep that security choice) or 403, .ALREADY_DONE -> 409, and
.CANCELLED -> 200/202 so the HTTP status codes reflect the precise outcome.

---

Nitpick comments:
In `@src/utils/stream_interrupts.py`:
- Around line 59-73: The current code uses a threading.Lock (self._lock) inside
the method that checks stream.task.done() and calls stream.task.cancel(); switch
to an asyncio.Lock to be async-idiomatic: initialize self._lock = asyncio.Lock()
in the class, change the method to async and replace the blocking with "async
with self._lock:" around the critical section (the stream lookup, user_id check,
task.done() and task.cancel() calls), and update all callers to await this
method (or add a thin sync wrapper that schedules it) so the lock is acquired
without blocking the event loop.
- Around line 97-106: The module-level in-memory singleton
stream_interrupt_registry (returned by get_stream_interrupt_registry and typed
as StreamInterruptRegistry) will be isolated per OS process causing missed
interrupts in multi-worker deployments; fix by making the registry pluggable:
add a factory (e.g., create_stream_interrupt_registry) that reads
configuration/ENV and returns either the current in-memory
StreamInterruptRegistry or a RedisBackedStreamInterruptRegistry/DB-backed
implementation, replace the module-level instantiation with a call to that
factory, and update get_stream_interrupt_registry to return the factory-created
instance; if you prefer not to implement a shared backend now, add a prominent
module-level docstring and README note explaining the multi-worker limitation
and recommending Redis/DB for production.

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/openapi.json (1)

4434-4460: ⚠️ Potential issue | 🟠 Major

Fix duplicate operationId for /a2a POST.

Line 4458 repeats the GET operationId, which makes operationIds non-unique and can break client code generation.

🔧 Proposed fix
-                "operationId": "handle_a2a_jsonrpc_a2a_get",
+                "operationId": "handle_a2a_jsonrpc_a2a_post",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/openapi.json` around lines 4434 - 4460, The POST entry under the /a2a
endpoint has a duplicated operationId ("handle_a2a_jsonrpc_a2a_get") which must
be unique; update the POST operationId to a distinct value (for example
"handle_a2a_jsonrpc_a2a_post") in the OpenAPI JSON so GET and POST use different
operationIds and client code generation won't break—modify the "post" ->
"operationId" field accordingly.
🧹 Nitpick comments (3)
src/utils/stream_interrupts.py (1)

33-33: asyncio.Task parameter is missing its generic type argument.

register_stream's task parameter uses bare asyncio.Task while the ActiveStream.task field correctly uses asyncio.Task[None]. Strict type-checkers will flag the parameter annotation.

🔧 Proposed fix
-    def register_stream(
-        self, request_id: str, user_id: str, task: asyncio.Task
-    ) -> None:
+    def register_stream(
+        self, request_id: str, user_id: str, task: asyncio.Task[None]
+    ) -> None:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` at line 33, The register_stream parameter
annotation uses a bare asyncio.Task which is inconsistent with
ActiveStream.task; update the register_stream signature to annotate task as
asyncio.Task[None] (matching ActiveStream.task) so strict type-checkers won't
complain; locate the register_stream definition and replace the bare
asyncio.Task type with asyncio.Task[None].
tests/unit/app/endpoints/test_streaming_query.py (1)

1360-1368: mock_context is missing vector_store_ids and rag_id_mapping compared to all other generate_response tests.

Every other test in TestGenerateResponse sets:

mock_context.vector_store_ids = []
mock_context.rag_id_mapping = {}

Because mocker.Mock(spec=ResponseGeneratorContext) auto-creates Mock objects for missing attributes, access to these fields won't raise AttributeError, but may behave silently differently from [] / {} if generate_response iterates them during the cancellation path.

🛠 Proposed fix
 mock_context = mocker.Mock(spec=ResponseGeneratorContext)
 mock_context.conversation_id = "conv_123"
 mock_context.user_id = "user_123"
+mock_context.vector_store_ids = []
+mock_context.rag_id_mapping = {}
 mock_context.query_request = QueryRequest(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/endpoints/test_streaming_query.py` around lines 1360 - 1368,
The mock_context in this test lacks explicit vector_store_ids and rag_id_mapping
which other TestGenerateResponse cases set; update the test to assign
mock_context.vector_store_ids = [] and mock_context.rag_id_mapping = {} so the
mocked ResponseGeneratorContext behaves like real instances when
generate_response iterates or checks those fields (locate the mock_context
created with mocker.Mock(spec=ResponseGeneratorContext) in this test and add the
two assignments).
tests/unit/app/endpoints/test_stream_interrupt.py (1)

70-103: Consider adding an edge-case test for interrupting an already-completed task.

task.cancel() returns False when the task has already finished. If the endpoint handler calls cancel() on a done task without checking the return value, it could return interrupted=True even though no actual cancellation occurred — a misleading response to the caller. There is currently no test covering this race condition.

✅ Suggested additional test
`@pytest.mark.asyncio`
async def test_stream_interrupt_endpoint_already_completed(
    registry: StreamInterruptRegistry,
) -> None:
    """Interrupt endpoint handles an already-completed task gracefully."""
    request_id = "123e4567-e89b-12d3-a456-426614174003"
    user_id = "00000001-0001-0001-0001-000000000001"

    async def instant_stream() -> None:
        pass  # completes immediately

    task = asyncio.create_task(instant_stream())
    await task  # let the task finish naturally
    registry.register_stream(request_id, user_id, task)

    response = await stream_interrupt_endpoint_handler(
        interrupt_request=StreamingInterruptRequest(request_id=request_id),
        auth=(user_id, "mock_username", False, "mock_token"),
        registry=registry,
    )

    # Verify the response accurately reflects that cancel() had no effect
    assert isinstance(response, StreamingInterruptResponse)
    assert response.request_id == request_id
    # interrupted should be False (or the handler should document its semantics here)
    assert response.interrupted is False

Do you want me to open a new issue to track this test case?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/endpoints/test_stream_interrupt.py` around lines 70 - 103, Add
a unit test to cover the race where a stream task has already completed before
interruption: create an async test named
test_stream_interrupt_endpoint_already_completed that schedules an
instant-completing coroutine, awaits it so the Task is done, registers it with
StreamInterruptRegistry via register_stream(request_id, user_id, task), then
calls stream_interrupt_endpoint_handler with a StreamingInterruptRequest and the
correct auth tuple and assert the returned StreamingInterruptResponse has the
same request_id and that interrupted is False (verifying the handler does not
report a successful cancel when task.cancel() had no effect).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/test_openapi_json.py`:
- Around line 226-230: The test's expected response codes for the
"/v1/streaming_query/interrupt" POST endpoint omit "422", but FastAPI adds a 422
for endpoints with a Pydantic body (StreamingInterruptRequest) and
_check_paths_and_responses_exist performs subset checks; update the tuple for
"/v1/streaming_query/interrupt" to include "422" in its set of expected
responses and apply the same change to the other URL-based parametrization entry
for the interrupt endpoint so both parametrized lists include "422".

In `@tests/unit/app/endpoints/test_streaming_query.py`:
- Line 54: The tests import and use the global singleton
stream_interrupt_registry which creates fragile inter-test coupling; update
tests in test_streaming_query.py to either mock stream_interrupt_registry with
pytest-mock for each test or add a fixture that resets/clears the registry
before/after each test so generate_response's registration/deregistration
(including its finally cleanup) cannot leak between tests; reference
stream_interrupt_registry, generate_response, and the shared request_id value
when implementing the mock/fixture so each async test gets an isolated or
cleaned registry state.

---

Outside diff comments:
In `@docs/openapi.json`:
- Around line 4434-4460: The POST entry under the /a2a endpoint has a duplicated
operationId ("handle_a2a_jsonrpc_a2a_get") which must be unique; update the POST
operationId to a distinct value (for example "handle_a2a_jsonrpc_a2a_post") in
the OpenAPI JSON so GET and POST use different operationIds and client code
generation won't break—modify the "post" -> "operationId" field accordingly.

---

Duplicate comments:
In `@src/app/endpoints/stream_interrupt.py`:
- Around line 25-32: The 404 example label "streaming request" used in
stream_interrupt_responses is not defined in NotFoundResponse, so OpenAPI drops
the examples; either change the label in stream_interrupt_responses to one of
the existing NotFoundResponse example keys (e.g., "conversation", "provider",
"model", or "rag") when calling NotFoundResponse.openapi_response(...) or add a
new "streaming request" entry into NotFoundResponse's
model_config["json_schema_extra"]["examples"] (update the NotFoundResponse class
definition to include that key and an appropriate example payload) so
NotFoundResponse.openapi_response(examples=["streaming request"]) will render
correctly.

In `@src/app/endpoints/streaming_query.py`:
- Around line 392-399: The CancelledError handler in the except block for
asyncio.CancelledError (around the use of request_id, stream_interrupted_event,
and stream_interrupt_registry) swallows cancellations without clearing the
task's cancelled state; update the handler to (1) log a neutral message like
"Streaming request %s cancelled" instead of "interrupted by user", and (2) after
yielding stream_interrupted_event(request_id) call uncancel on the current task
to clear the cancellation flag (use asyncio.current_task().uncancel() when
available—guard for older Python versions or feature presence). Ensure
stream_interrupt_registry.deregister_stream(request_id) still runs in finally
and preserve behavior around stream_completed.

In `@src/utils/stream_interrupts.py`:
- Around line 45-73: Change cancel_stream to return a richer result enum instead
of a plain bool so callers can distinguish the three failure modes; introduce a
CancelResult enum (e.g., NOT_FOUND, FORBIDDEN, ALREADY_DONE, CANCELLED) and
update the cancel_stream signature and implementation to return the appropriate
enum value for each branch (stream missing -> NOT_FOUND, wrong owner ->
FORBIDDEN, task.done() -> ALREADY_DONE, task.cancel() -> CANCELLED); update any
callers (HTTP endpoint) to map CancelResult to the correct HTTP status (404,
403, 409, 200/202) accordingly.

---

Nitpick comments:
In `@src/utils/stream_interrupts.py`:
- Line 33: The register_stream parameter annotation uses a bare asyncio.Task
which is inconsistent with ActiveStream.task; update the register_stream
signature to annotate task as asyncio.Task[None] (matching ActiveStream.task) so
strict type-checkers won't complain; locate the register_stream definition and
replace the bare asyncio.Task type with asyncio.Task[None].

In `@tests/unit/app/endpoints/test_stream_interrupt.py`:
- Around line 70-103: Add a unit test to cover the race where a stream task has
already completed before interruption: create an async test named
test_stream_interrupt_endpoint_already_completed that schedules an
instant-completing coroutine, awaits it so the Task is done, registers it with
StreamInterruptRegistry via register_stream(request_id, user_id, task), then
calls stream_interrupt_endpoint_handler with a StreamingInterruptRequest and the
correct auth tuple and assert the returned StreamingInterruptResponse has the
same request_id and that interrupted is False (verifying the handler does not
report a successful cancel when task.cancel() had no effect).

In `@tests/unit/app/endpoints/test_streaming_query.py`:
- Around line 1360-1368: The mock_context in this test lacks explicit
vector_store_ids and rag_id_mapping which other TestGenerateResponse cases set;
update the test to assign mock_context.vector_store_ids = [] and
mock_context.rag_id_mapping = {} so the mocked ResponseGeneratorContext behaves
like real instances when generate_response iterates or checks those fields
(locate the mock_context created with mocker.Mock(spec=ResponseGeneratorContext)
in this test and add the two assignments).

Comment on lines +226 to +230
        (
            "/v1/streaming_query/interrupt",
            "post",
            {"200", "401", "403", "404"},
        ),

⚠️ Potential issue | 🟡 Minor

Missing "422" in expected response codes for the interrupt endpoint.

Every other POST endpoint with a Pydantic body (e.g. /v1/query, /v1/streaming_query) includes "422" in its expected set. FastAPI unconditionally generates a 422 Unprocessable Entity entry in the OpenAPI spec for any endpoint that accepts a validated request body (StreamingInterruptRequest). Because _check_paths_and_responses_exist performs a subset check, the test still passes, but the omission means the 422 response is never validated.

🛠 Proposed fix
-        (
-            "/v1/streaming_query/interrupt",
-            "post",
-            {"200", "401", "403", "404"},
-        ),
+        (
+            "/v1/streaming_query/interrupt",
+            "post",
+            {"200", "401", "403", "404", "422"},
+        ),

Apply the same change in the URL-based parametrize list (lines 313-317).

Also applies to: 313-317

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/test_openapi_json.py` around lines 226 - 230, The test's
expected response codes for the "/v1/streaming_query/interrupt" POST endpoint
omit "422", but FastAPI adds a 422 for endpoints with a Pydantic body
(StreamingInterruptRequest) and _check_paths_and_responses_exist performs subset
checks; update the tuple for "/v1/streaming_query/interrupt" to include "422" in
its set of expected responses and apply the same change to the other URL-based
parametrization entry for the interrupt endpoint so both parametrized lists
include "422".

@Jdubrick
Contributor Author

@tisnik PTAL

        return self._streams.get(request_id)


stream_interrupt_registry = StreamInterruptRegistry()

Consider using Singleton approach here (src/utils/types.Singleton).


I agree, this is a much better approach.
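The exact shape of src/utils/types.Singleton is not shown in this diff; a common metaclass implementation of the pattern looks like this (illustrative only):

```python
class Singleton(type):
    """Metaclass caching one shared instance per class."""

    _instances: dict = {}

    def __call__(cls, *args, **kwargs):
        # Construct the instance once, then hand back the cached object.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class StreamInterruptRegistry(metaclass=Singleton):
    """Registry whose every construction yields the same object."""

    def __init__(self) -> None:
        self._streams: dict = {}
```

With this, `StreamInterruptRegistry()` could replace the module-level `stream_interrupt_registry = StreamInterruptRegistry()` assignment while still guaranteeing a single shared instance.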

        context=context,
        responses_params=responses_params,
        turn_summary=turn_summary,
        request_id=request_id,

It would be better to extend context rather than adding new parameters.

@tisnik tisnik left a comment

LGTM overall. There is just one problem (the global scope, commented on by CodeRabbit) plus the nitpick about the context (see Andrej's comment).

After these two fixes it can be merged.

        return self._streams.get(request_id)


stream_interrupt_registry = StreamInterruptRegistry()

I agree, this is a much better approach.

        self._streams[request_id] = ActiveStream(user_id=user_id, task=task)

    def cancel_stream(self, request_id: str, user_id: str) -> bool:
        """Cancel an active stream owned by user.

I agree with CodeRabbit here. All 3 "unhappy" behavior types result in the same 404 error, which is IMO not the best solution.
I would recommend:

L62 -> 404, because the stream was not found
L69 -> 403, because you are accessing a resource that does not belong to you
L71 -> probably 200, perhaps with a False flag in the model and a proper description (worth discussing)
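Combining CodeRabbit's enum suggestion with these status codes gives roughly this mapping (a sketch; the names and codes are the reviewers' proposal, not merged code):

```python
from enum import Enum, auto


class CancelResult(Enum):
    """Granular outcome of a cancel attempt, as suggested in review."""

    NOT_FOUND = auto()      # no stream registered under the request_id
    FORBIDDEN = auto()      # stream belongs to a different user
    ALREADY_DONE = auto()   # task finished before the interrupt arrived
    CANCELLED = auto()      # cancellation was actually requested


# HTTP mapping proposed above; ALREADY_DONE -> 200 with interrupted=False
# is the variant flagged as worth discussing.
STATUS_FOR_RESULT = {
    CancelResult.NOT_FOUND: 404,
    CancelResult.FORBIDDEN: 403,
    CancelResult.ALREADY_DONE: 200,
    CancelResult.CANCELLED: 200,
}
```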

            examples=["missing header", "missing token"]
        ),
        403: ForbiddenResponse.openapi_response(examples=["endpoint"]),
        404: NotFoundResponse.openapi_response(examples=["streaming request"]),

You have to extend the examples of this response model to make it work properly (see models/responses.NotFoundResponse).

@asimurka asimurka left a comment

Overall it looks good, I commented on a few nits and also suggested an improvement to distinguish different types of errors.

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/openapi.json (1)

4450-4468: ⚠️ Potential issue | 🟠 Major

Fix duplicate operationId for /a2a POST endpoint.

Both GET and POST operations share the operationId handle_a2a_jsonrpc_a2a_get, violating OpenAPI's uniqueness requirement. This causes collisions in generated clients and breaks spec compliance.

🔧 Suggested fix
-                "operationId": "handle_a2a_jsonrpc_a2a_get",
+                "operationId": "handle_a2a_jsonrpc_a2a_post",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/openapi.json` around lines 4450 - 4468, The POST operation under the
/a2a path reuses operationId "handle_a2a_jsonrpc_a2a_get", causing a duplicate;
change the POST operationId to a unique identifier (for example
"handle_a2a_jsonrpc_a2a_post" or similar) so it no longer collides with the GET
operationId, and update any consumers or references that rely on the old POST
operationId accordingly.
♻️ Duplicate comments (2)
src/app/endpoints/streaming_query.py (1)

389-393: ⚠️ Potential issue | 🟠 Major

asyncio.CancelledError suppressed without re-raising — asyncio semantics violated.

Catching CancelledError and not re-raising after cleanup means the asyncio task appears to complete normally. Outer contexts (Starlette timeouts, TaskGroup, server shutdown) that relied on cancellation propagation do not see it, and the internal cancellation counter on the task stays non-zero, which can cause asyncio.timeout() scopes higher in the stack to misbehave.

The graceful handling (logging, emitting the interrupted SSE event, deregistering) is preserved when re-raising:

🛠️ Proposed fix
     except asyncio.CancelledError:
         logger.info("Streaming request %s interrupted by user", context.request_id)
         yield stream_interrupted_event(context.request_id)
+        raise
     finally:
         get_stream_interrupt_registry().deregister_stream(context.request_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 389 - 393, The except
block catching asyncio.CancelledError in the streaming generator should perform
the existing cleanup (log via logger.info, yield
stream_interrupted_event(context.request_id), deregister via
get_stream_interrupt_registry().deregister_stream(context.request_id)) and then
re-raise the CancelledError so asyncio cancellation semantics propagate; update
the except asyncio.CancelledError handler in the streaming generator to re-raise
after emitting the interrupted event and before/after deregistration as
appropriate to ensure outer TaskGroup/timeout/shutdown handlers observe the
cancellation.
src/app/endpoints/stream_interrupt.py (1)

31-32: ⚠️ Potential issue | 🟡 Minor

"streaming request" is not a declared label in NotFoundResponse — OpenAPI 404 examples will be empty.

NotFoundResponse.openapi_response(examples=["streaming request"]) filters by label, and the only labels defined in NotFoundResponse.model_config["json_schema_extra"]["examples"] are "conversation", "provider", "model", and "rag". An unlabelled example results in an empty examples block in the generated OpenAPI spec.

Either add a "streaming request" entry to NotFoundResponse.model_config["json_schema_extra"]["examples"] in src/models/responses.py, or substitute an existing label (e.g. "conversation") as a temporary placeholder.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/stream_interrupt.py` around lines 31 - 32, The 404 OpenAPI
example label "streaming request" passed to
NotFoundResponse.openapi_response(...) doesn't exist in
NotFoundResponse.model_config["json_schema_extra"]["examples"], so the generated
examples are empty; fix by either adding a "streaming request" entry to
NotFoundResponse.model_config["json_schema_extra"]["examples"] in the responses
model (provide a representative example payload under that key) or change the
call to NotFoundResponse.openapi_response(examples=[...]) in stream_interrupt.py
to use one of the existing labels (e.g., "conversation") so the example is
included.
🧹 Nitpick comments (3)
src/models/responses.py (1)

565-585: Consider documenting the interrupted=False outcome in field and model examples.

The interrupted field can legitimately be False (e.g., the streaming task finished before the interrupt was processed), but Field(examples=[True]) and the model_config example only show the True case. API consumers reading the schema would benefit from seeing both outcomes.

✏️ Suggested addition
     interrupted: bool = Field(
         description="Whether an in-progress stream was interrupted",
-        examples=[True],
+        examples=[True, False],
     )
 
     message: str = Field(
         description="Human-readable interruption status message",
-        examples=["Streaming request interrupted"],
+        examples=["Streaming request interrupted", "Streaming request was not active"],
     )
 
     model_config = {
         "json_schema_extra": {
             "examples": [
                 {
                     "request_id": "123e4567-e89b-12d3-a456-426614174000",
                     "interrupted": True,
                     "message": "Streaming request interrupted",
                 }
             ]
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/responses.py` around lines 565 - 585, The schema only shows the
interrupted=True case; update the interrupted Field and model_config examples to
include the non-interrupted outcome: add False to the interrupted Field examples
(e.g., Field(..., examples=[True, False])) and add a second example object in
model_config["json_schema_extra"]["examples"] that demonstrates interrupted:
False and an appropriate message (e.g., "Streaming completed") so both outcomes
are documented; modify the interrupted field, the message field example if
desired, and the model_config examples to include this non-interrupted example.
tests/unit/app/endpoints/test_streaming_query.py (1)

1360-1413: Add an assertion that register_stream is called in the cancellation test.

The test thoroughly verifies the teardown path (deregister_stream, skipped side effects, interrupted event) but doesn't assert that register_stream was invoked before the cancellation. In a pytest-asyncio context asyncio.current_task() returns a real task, so the register branch should execute — asserting it makes the test a complete contract for the registration/deregistration lifecycle.

♻️ Proposed addition
+        isolate_stream_interrupt_registry.register_stream.assert_called_once_with(
+            request_id=test_request_id,
+            user_id=mock_context.user_id,
+            task=unittest.mock.ANY,
+        )
         isolate_stream_interrupt_registry.deregister_stream.assert_called_once_with(
             test_request_id
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/endpoints/test_streaming_query.py` around lines 1360 - 1413,
The test test_generate_response_cancelled_skips_side_effects is missing an
assertion that the stream was registered before cancellation; add an assertion
that isolate_stream_interrupt_registry.register_stream was called (e.g.
register_stream.assert_called_once_with(test_request_id)) after invoking
generate_response (or immediately after starting the async iteration) to
validate the registration/deregistration lifecycle alongside the existing
deregister_stream assertion; refer to the mock symbol
isolate_stream_interrupt_registry.register_stream and the test_request_id
variable to locate and implement this assertion.
src/app/endpoints/streaming_query.py (1)

346-393: Store get_stream_interrupt_registry() result once to avoid two singleton look-ups.

get_stream_interrupt_registry() is called at registration (line 348) and again in the finally block (line 393). Storing the result once makes the pairing explicit and avoids the implicit assumption that the callable always returns the same object.

♻️ Proposed refactor
+    stream_registry = get_stream_interrupt_registry()
     current_task = asyncio.current_task()
     if current_task is not None:
-        get_stream_interrupt_registry().register_stream(
+        stream_registry.register_stream(
             request_id=context.request_id,
             user_id=user_id,
             task=current_task,
         )
     ...
     finally:
-        get_stream_interrupt_registry().deregister_stream(context.request_id)
+        stream_registry.deregister_stream(context.request_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 346 - 393, Store the
result of get_stream_interrupt_registry() in a local variable before calling
asyncio.current_task(), then use that variable for both register_stream(...) and
deregister_stream(...); specifically, assign registry =
get_stream_interrupt_registry() before the current_task check and replace calls
to get_stream_interrupt_registry().register_stream(...) and
get_stream_interrupt_registry().deregister_stream(...) with
registry.register_stream(...) and registry.deregister_stream(...) so the same
singleton instance is used and available in the finally block.
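
Taken together, the register/interrupt/deregister lifecycle discussed above can be sketched as follows — a minimal illustrative registry, not the PR's `StreamInterruptRegistry` (which additionally uses a Singleton metaclass and maps the errors to HTTP responses):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class ActiveStream:
    user_id: str
    task: asyncio.Task[None]

class InterruptRegistry:
    """Minimal request_id -> active-stream map with ownership checks."""

    def __init__(self) -> None:
        self._streams: dict[str, ActiveStream] = {}

    def register_stream(self, request_id: str, user_id: str,
                        task: asyncio.Task[None]) -> None:
        self._streams[request_id] = ActiveStream(user_id, task)

    def deregister_stream(self, request_id: str) -> None:
        self._streams.pop(request_id, None)  # idempotent cleanup from finally:

    def interrupt(self, request_id: str, user_id: str) -> bool:
        entry = self._streams.get(request_id)
        if entry is None:
            raise KeyError(request_id)         # endpoint maps this to 404
        if entry.user_id != user_id:
            raise PermissionError(request_id)  # endpoint maps this to 403
        return entry.task.cancel()             # False if the task already finished

async def demo() -> bool:
    async def pending_stream() -> None:
        await asyncio.sleep(10)

    registry = InterruptRegistry()
    task = asyncio.create_task(pending_stream())
    registry.register_stream("req-1", "alice", task)
    interrupted = registry.interrupt("req-1", "alice")
    registry.deregister_stream("req-1")
    return interrupted

print(asyncio.run(demo()))  # True
```

Holding the registry in one local, as the refactor suggests, keeps the `register_stream`/`deregister_stream` pairing on a single object rather than relying on the accessor returning the same instance twice.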
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/stream_interrupts.py`:
- Around line 43-45: The register_stream parameter type for task is inconsistent
with ActiveStream.task: change the register_stream signature to use the same
generic task type (asyncio.Task[None]) so strict type-checkers agree; update the
parameter in def register_stream(self, request_id: str, user_id: str, task:
asyncio.Task[None]) -> None and ensure any references that create or pass tasks
match that generic annotation (refer to ActiveStream.task and register_stream).

In `@tests/integration/endpoints/test_stream_interrupt_integration.py`:
- Around line 13-16: The fixture registry_fixture currently returns the global
singleton because StreamInterruptRegistry uses the Singleton metaclass, so tests
share state; modify the fixture to reset the singleton state before yielding to
ensure isolation: import collections.abc.Generator, create the fixture as a
generator that clears or reinitializes StreamInterruptRegistry's internal state
(e.g., reset whatever storage attribute the class uses or call a provided
clear/reset method on StreamInterruptRegistry or its Singleton base) before
yielding the instance and optionally clear it again after the test so no entries
leak; reference StreamInterruptRegistry and the Singleton metaclass from
src/utils/types.py when locating the reset point.

---

Outside diff comments:
In `@docs/openapi.json`:
- Around line 4450-4468: The POST operation under the /a2a path reuses
operationId "handle_a2a_jsonrpc_a2a_get", causing a duplicate; change the POST
operationId to a unique identifier (for example "handle_a2a_jsonrpc_a2a_post" or
similar) so it no longer collides with the GET operationId, and update any
consumers or references that rely on the old POST operationId accordingly.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71ac9a1 and 1c47a01.

📒 Files selected for processing (10)
  • docs/openapi.json
  • src/app/endpoints/stream_interrupt.py
  • src/app/endpoints/streaming_query.py
  • src/models/context.py
  • src/models/responses.py
  • src/utils/stream_interrupts.py
  • tests/integration/endpoints/test_stream_interrupt_integration.py
  • tests/unit/app/endpoints/test_stream_interrupt.py
  • tests/unit/app/endpoints/test_streaming_query.py
  • tests/unit/models/responses/test_error_responses.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/app/endpoints/test_stream_interrupt.py

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
@Jdubrick
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Feb 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
tests/integration/endpoints/test_stream_interrupt_integration.py (1)

26-66: Consider splitting the mega-test and adding a FORBIDDEN coverage path.

Two optional improvements worth considering:

  1. Missing FORBIDDEN path — The test exercises CANCELLED, ALREADY_DONE, and NOT_FOUND, but never calls the handler with a different user_id to verify the 403 FORBIDDEN branch in stream_interrupt_endpoint_handler. Adding this makes the integration surface complete.

  2. Monolithic test — Packing four lifecycle phases into one function means an early failure (e.g. the assertion at line 45) silently skips the later phases. The NOT_FOUND phase (lines 60–66) is fully independent of the cancel outcome and could be split into its own test function without needing the preceding interrupt. Example decomposition:

♻️ Suggested test decomposition
+@pytest.mark.asyncio
+async def test_stream_interrupt_forbidden(
+    registry: StreamInterruptRegistry,
+) -> None:
+    """Attempting to interrupt a stream owned by another user raises 403."""
+    async def pending_stream() -> None:
+        await asyncio.sleep(10)
+
+    task = asyncio.create_task(pending_stream())
+    registry.register_stream(TEST_REQUEST_ID, OWNER_USER_ID, task)
+    try:
+        with pytest.raises(HTTPException) as exc_info:
+            await stream_interrupt_endpoint_handler(
+                interrupt_request=StreamingInterruptRequest(request_id=TEST_REQUEST_ID),
+                auth=("different-user-id", "other_user", False, "mock_token"),
+                registry=registry,
+            )
+        assert exc_info.value.status_code == 403
+    finally:
+        task.cancel()
+
+
+@pytest.mark.asyncio
+async def test_stream_interrupt_not_found(
+    registry: StreamInterruptRegistry,
+) -> None:
+    """Interrupting a non-existent request_id raises 404."""
+    with pytest.raises(HTTPException) as exc_info:
+        await stream_interrupt_endpoint_handler(
+            interrupt_request=StreamingInterruptRequest(request_id=TEST_REQUEST_ID),
+            auth=(OWNER_USER_ID, "mock_username", False, "mock_token"),
+            registry=registry,
+        )
+    assert exc_info.value.status_code == 404

Note: the FORBIDDEN test needs a distinct user_id string that is not OWNER_USER_ID; any other valid UUID-like string works. The fixture's cleanup handles TEST_REQUEST_ID for all cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/endpoints/test_stream_interrupt_integration.py` around
lines 26 - 66, The test test_stream_interrupt_full_round_trip is too monolithic
and misses exercising the 403 FORBIDDEN branch; split the single mega-test into
at least two tests (e.g., one for interrupt lifecycle/cancel and one for
NOT_FOUND) and add a new test that calls stream_interrupt_endpoint_handler with
the same TEST_REQUEST_ID but a different user id (not OWNER_USER_ID) to assert
response raises HTTPException with status_code 403; use the existing
StreamingInterruptRequest, StreamInterruptRegistry, OWNER_USER_ID, and
TEST_REQUEST_ID symbols to locate registration/deregistration and ensure the
forbidden case uses a distinct UUID-like string for auth.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c47a01 and 15b82a1.

📒 Files selected for processing (3)
  • src/utils/stream_interrupts.py
  • tests/integration/endpoints/test_stream_interrupt_integration.py
  • tests/unit/app/endpoints/test_stream_interrupt.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/utils/stream_interrupts.py
  • tests/unit/app/endpoints/test_stream_interrupt.py
