[https://nvbugs/6120535][fix] Only call _verify_ctx_response and _get_gen_request when generation will actuall… #13701
…st completes

When a context-only request finishes during the context phase (finish_reason='stop'), the C++ executor does not populate ctx_request_id in disaggregated_params because no KV cache transfer is needed. The _verify_ctx_response validation was called unconditionally, causing a ValueError for these completed requests.

Only verify disaggregated params and prepare the generation request when generation is actually needed (finish_reason is 'length' or 'not_finished'), matching the _need_gen logic that already handles this case correctly downstream.

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
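The gating rule in the description above can be sketched as a small predicate. This is a minimal illustration, not the code in the PR: the helper name `needs_generation` and the constant `GEN_NEEDED_REASONS` are hypothetical, and only the finish_reason values ('stop', 'length', 'not_finished') come from the description.

```python
from typing import Optional

# finish_reason values for which the description says generation is still needed.
GEN_NEEDED_REASONS = frozenset({"length", "not_finished"})


def needs_generation(finish_reason: Optional[str]) -> bool:
    """Hypothetical helper: True when the context phase left work for generation.

    Any other value (for example 'stop', or None) is treated as "request already
    complete" in this sketch, mirroring a plain membership test.
    """
    return finish_reason in GEN_NEEDED_REASONS


if __name__ == "__main__":
    for reason in ("stop", "length", "not_finished"):
        print(reason, needs_generation(reason))
```

Under this rule a request that ends with 'stop' during the context phase skips both the disaggregated-params verification and the generation-request preparation.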
📝 Walkthrough
The PR modifies context-response handling in …
Changes: Context Response Handling
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/serve/openai_disagg_service.py`:
- Around line 148-153: The code dereferences ctx_response.choices[0] before
validating structure, risking IndexError; change the logic to validate the
response structure first by calling self._verify_ctx_response(ctx_response) (or
explicitly guard that ctx_response and ctx_response.choices exist and are
non-empty) before accessing choices[0] or checking its finish_reason, then
proceed to call self._get_gen_request(request, ctx_response, disagg_request_id)
only after validation passes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 61e69492-507f-4b02-a4f2-079e7efbbec9
📒 Files selected for processing (1)
tensorrt_llm/serve/openai_disagg_service.py
        if not ctx_response or ctx_response.choices[0].finish_reason in (
            "length",
            "not_finished",
        ):
            await self._verify_ctx_response(ctx_response)
            gen_req = self._get_gen_request(request, ctx_response, disagg_request_id)
Guard choices[0] access before finish-reason check.
Line 148 dereferences ctx_response.choices[0] before structural validation. If the ctx payload is malformed (e.g., zero choices), this throws IndexError instead of the controlled ValueError from _verify_ctx_response.
Proposed fix
         ctx_response = await self._ctx_client.send_request(
             ctx_req, server=ctx_server, hooks=hooks
         )
-        if not ctx_response or ctx_response.choices[0].finish_reason in (
+        if ctx_response and len(ctx_response.choices) != 1:
+            await self._verify_ctx_response(ctx_response)
+
+        if not ctx_response or ctx_response.choices[0].finish_reason in (
             "length",
             "not_finished",
         ):
             await self._verify_ctx_response(ctx_response)
             gen_req = self._get_gen_request(request, ctx_response, disagg_request_id)
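The ordering the review asks for (validate structure first, then inspect finish_reason) can be sketched in isolation as below. This is an illustrative stand-in under stated assumptions, not the service code: `Choice`, `CtxResponse`, and `should_run_generation` are hypothetical names, and the real `_verify_ctx_response` may check more than the number of choices.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Choice:  # hypothetical stand-in for one response choice
    finish_reason: Optional[str] = None


@dataclass
class CtxResponse:  # hypothetical stand-in for the context-server response
    choices: List[Choice] = field(default_factory=list)


def should_run_generation(ctx_response: Optional[CtxResponse]) -> bool:
    """Decide whether the generation phase is needed, without risking IndexError."""
    if ctx_response is None:
        # Assumption: a missing response falls into the generation branch,
        # as the `not ctx_response` half of the reviewed condition does.
        return True
    # Structural guard comes before any choices[0] access, so a malformed
    # payload raises a controlled ValueError rather than an IndexError.
    if len(ctx_response.choices) != 1:
        raise ValueError(
            f"expected exactly one choice, got {len(ctx_response.choices)}"
        )
    return ctx_response.choices[0].finish_reason in ("length", "not_finished")
```

With this shape, a response with zero choices fails with a ValueError, while a well-formed response that finished with 'stop' simply returns False and skips generation.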