
[https://nvbugs/6115039][fix] Override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers #13663

Open
tensorrt-cicd wants to merge 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6115039

Conversation

Collaborator

@tensorrt-cicd tensorrt-cicd commented Apr 30, 2026

Summary

  • Root cause: the Qwen3HybridConfig.set_values_if_none validator calls load_pretrained_config(self.name), where self.name is the bench identifier ("qwen3.5_9b_hf") rather than a valid model path, causing an OSError during trtllm-bench model type resolution.
  • Fix: override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers and num_linear_attention_layers from the already-loaded pretrained config, so the validator's reload path is never triggered.
  • Automated fix generated by repair-bot
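A minimal sketch of the fix described above, based only on this PR summary; the real method lives in tensorrt_llm/bench/build/dataclasses.py, and the field names, helper signature, and layer_types values here are assumptions, not the merged code:

```python
# Sketch of the from_hf override, assuming the fields named in the PR
# description. load_pretrained_config below is a hypothetical stand-in
# for the repo helper of the same name.
from types import SimpleNamespace
from typing import Optional


def load_pretrained_config(model_path: str) -> SimpleNamespace:
    # Stand-in: fabricates a config exposing a layer_types list, as Qwen3
    # hybrid checkpoints do. The real helper loads an HF config from disk.
    return SimpleNamespace(layer_types=[
        "linear_attention", "linear_attention", "full_attention",
        "linear_attention", "full_attention",
    ])


class Qwen3HybridConfig:
    # Only the fields relevant to this fix are modeled.
    def __init__(self, name: str, num_attention_layers: int,
                 num_linear_attention_layers: int) -> None:
        self.name = name
        self.num_attention_layers = num_attention_layers
        self.num_linear_attention_layers = num_linear_attention_layers

    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: Optional[str]) -> "Qwen3HybridConfig":
        # Load the HF config once from the real checkpoint path, instead of
        # letting a validator later re-resolve self.name -- the bench
        # identifier "qwen3.5_9b_hf", which is not a valid model path.
        pretrained = load_pretrained_config(hf_model_path or model_hf_name)
        layer_types = pretrained.layer_types
        return cls(
            name=model_hf_name,
            num_attention_layers=layer_types.count("full_attention"),
            num_linear_attention_layers=layer_types.count("linear_attention"),
        )


cfg = Qwen3HybridConfig.from_hf("qwen3.5_9b_hf", "/path/to/checkpoint")
print(cfg.num_attention_layers, cfg.num_linear_attention_layers)  # 2 3
```

With the counts pre-computed at construction time, nothing downstream needs to reload the config from the bench identifier.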

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

  • New Features
    • Qwen3Hybrid configurations can now be constructed directly from Hugging Face model names, automatically loading pretrained settings and computing necessary model parameters.

…f to avoid invalid model name lookup

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
Contributor

coderabbitai Bot commented Apr 30, 2026

📝 Walkthrough

Walkthrough

A classmethod from_hf is added to Qwen3HybridConfig to construct the config from a Hugging Face model by loading the pretrained config, computing the parameter count, and counting the attention layer types. Additionally, set_values_if_none is refactored to use list.count() instead of a generator-based sum().

Changes

Cohort / File(s): Qwen3HybridConfig Enhancement — tensorrt_llm/bench/build/dataclasses.py
Summary: Added a from_hf classmethod to build the config from HuggingFace models by loading the pretrained config, computing param_count, and counting full_attention and linear_attention layer types. Refactored set_values_if_none to use list.count() for deriving layer counts instead of a generator-based sum().
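The set_values_if_none refactor mentioned above — replacing a generator-based sum() with list.count() — is behaviorally equivalent for exact string matches; a minimal illustration with a made-up layer_types list:

```python
# Illustrative layer_types list; real values come from the HF config.
layer_types = ["full_attention", "linear_attention", "full_attention",
               "linear_attention", "linear_attention"]

# Before: generator-based sum over a predicate.
n_attn_before = sum(1 for t in layer_types if t == "full_attention")

# After: list.count(), shorter and equivalent for equality tests.
n_attn_after = layer_types.count("full_attention")

print(n_attn_before, n_attn_after)  # 2 2
```

list.count() only works for exact equality, so this substitution is safe here because the layer types are fixed strings.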

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 50.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)
  • Title check — ✅ Passed: the title is fully related to the main change — overriding from_hf in Qwen3HybridConfig to pre-compute layer counts, which directly addresses the root cause of the bug described in the PR.
  • Description check — ✅ Passed: the PR description includes all critical required sections: a clear summary of the root cause and fix, a test plan with verification steps, and a link to the bug. The description template structure is followed adequately despite minor formatting differences.
  • Linked Issues check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: check skipped because no linked issues were found for this pull request.


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tensorrt_llm/bench/build/dataclasses.py (1)

275-291: ⚡ Quick win

Annotate and document the new public factory.

from_hf is a new public API, but it currently omits the required return type annotation and a Google-style docstring. Please add both so this entry point follows the repo’s Python guidelines.

Suggested patch
-    def from_hf(cls, model_hf_name, hf_model_path):
+    def from_hf(
+        cls,
+        model_hf_name: str,
+        hf_model_path: str | None,
+    ) -> "Qwen3HybridConfig":
+        """Build a Qwen3 hybrid config from Hugging Face metadata."""
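A fuller sketch of the requested annotation plus Google-style docstring (parameter semantics are inferred from the PR description; treat this as an assumption, not the merged code):

```python
class Qwen3HybridConfig:
    # Docstring sketch only; the body and all other members are elided.
    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: "str | None") -> "Qwen3HybridConfig":
        """Build a Qwen3 hybrid config from Hugging Face metadata.

        Args:
            model_hf_name: Bench model identifier, e.g. "qwen3.5_9b_hf".
            hf_model_path: Local path or hub id of the checkpoint, or None.

        Returns:
            Qwen3HybridConfig: Instance with layer counts pre-computed
            from the loaded pretrained config.
        """
        raise NotImplementedError
```

The forward references are quoted so the annotations resolve without importing the class into its own definition.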
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/bench/build/dataclasses.py` around lines 275 - 291, The new
public factory method from_hf lacks a return type annotation and a Google-style
docstring; update the method signature to include the correct return type (the
dataclass type, e.g., -> "ModelDataclass" or use "cls" typing via
ClassVar/Type[...]/"Self" depending on project typing rules) and add a concise
Google-style docstring above from_hf describing parameters (model_hf_name,
hf_model_path), behavior (loads pretrained_config via load_pretrained_config,
derives layer types with get_qwen3_hybrid_layer_types, computes param_count via
cls.get_param_count) and the returned instance, and ensure any forward
references are quoted if necessary to satisfy static typing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b3846119-7d9f-45ff-ac9e-1d75915fd9a1

📥 Commits

Reviewing files that changed from the base of the PR and between b7aee0b and 51e70e7.

📒 Files selected for processing (1)
  • tensorrt_llm/bench/build/dataclasses.py

