
[https://nvbugs/6115039][fix] Override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers #13663

Open
tensorrt-cicd wants to merge 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6115039

Conversation

Collaborator

@tensorrt-cicd tensorrt-cicd commented Apr 30, 2026

Summary

  • Root cause: the Qwen3HybridConfig.set_values_if_none validator calls load_pretrained_config(self.name), where self.name is the bench identifier ("qwen3.5_9b_hf") rather than a valid model path, causing an OSError during trtllm-bench model type resolution.
  • Fix: override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers and num_linear_attention_layers from the already-loaded pretrained config, so the validator's reload path is never triggered.
  • Automated fix generated by repair-bot
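A minimal sketch of the fix described above, based only on this PR summary; the real method lives in tensorrt_llm/bench/build/dataclasses.py, and the field names, helper signature, and layer_types values here are assumptions, not the merged code:

```python
# Sketch of the from_hf override, assuming the fields named in the PR
# description. load_pretrained_config below is a hypothetical stand-in
# for the repo helper of the same name.
from types import SimpleNamespace
from typing import Optional


def load_pretrained_config(model_path: str) -> SimpleNamespace:
    # Stand-in: fabricates a config exposing a layer_types list, as Qwen3
    # hybrid checkpoints do. The real helper loads an HF config from disk.
    return SimpleNamespace(layer_types=[
        "linear_attention", "linear_attention", "full_attention",
        "linear_attention", "full_attention",
    ])


class Qwen3HybridConfig:
    # Only the fields relevant to this fix are modeled.
    def __init__(self, name: str, num_attention_layers: int,
                 num_linear_attention_layers: int) -> None:
        self.name = name
        self.num_attention_layers = num_attention_layers
        self.num_linear_attention_layers = num_linear_attention_layers

    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: Optional[str]) -> "Qwen3HybridConfig":
        # Load the HF config once from the real checkpoint path, instead of
        # letting a validator later re-resolve self.name -- the bench
        # identifier "qwen3.5_9b_hf", which is not a valid model path.
        pretrained = load_pretrained_config(hf_model_path or model_hf_name)
        layer_types = pretrained.layer_types
        return cls(
            name=model_hf_name,
            num_attention_layers=layer_types.count("full_attention"),
            num_linear_attention_layers=layer_types.count("linear_attention"),
        )


cfg = Qwen3HybridConfig.from_hf("qwen3.5_9b_hf", "/path/to/checkpoint")
print(cfg.num_attention_layers, cfg.num_linear_attention_layers)  # 2 3
```

With the counts pre-computed at construction time, nothing downstream needs to reload the config from the bench identifier.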

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

  • New Features
    • Qwen3Hybrid configurations can now be constructed directly from Hugging Face model names, automatically loading pretrained settings and computing necessary model parameters.

…f to avoid invalid model name lookup

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
Contributor

coderabbitai Bot commented Apr 30, 2026

📝 Walkthrough

Walkthrough

A classmethod from_hf is added to Qwen3HybridConfig to construct the config from a Hugging Face model by loading the pretrained config, computing the parameter count, and counting the attention layer types. Additionally, set_values_if_none is refactored to use list.count() instead of a generator-based sum().

Changes

Cohort / File(s): Qwen3HybridConfig Enhancement — tensorrt_llm/bench/build/dataclasses.py
Summary: Added a from_hf classmethod to build the config from HuggingFace models by loading the pretrained config, computing param_count, and counting full_attention and linear_attention layer types. Refactored set_values_if_none to use list.count() for deriving layer counts instead of a generator-based sum().
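The set_values_if_none refactor mentioned above — replacing a generator-based sum() with list.count() — is behaviorally equivalent for exact string matches; a minimal illustration with a made-up layer_types list:

```python
# Illustrative layer_types list; real values come from the HF config.
layer_types = ["full_attention", "linear_attention", "full_attention",
               "linear_attention", "linear_attention"]

# Before: generator-based sum over a predicate.
n_attn_before = sum(1 for t in layer_types if t == "full_attention")

# After: list.count(), shorter and equivalent for equality tests.
n_attn_after = layer_types.count("full_attention")

print(n_attn_before, n_attn_after)  # 2 2
```

list.count() only works for exact equality, so this substitution is safe here because the layer types are fixed strings.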

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 50.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)
  • Title check — ✅ Passed: the title is fully related to the main change — overriding from_hf in Qwen3HybridConfig to pre-compute layer counts, which directly addresses the root cause of the bug described in the PR.
  • Description check — ✅ Passed: the PR description includes all critical required sections: a clear summary of the root cause and fix, a test plan with verification steps, and a link to the bug. The description template structure is followed adequately despite minor formatting differences.
  • Linked Issues check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: check skipped because no linked issues were found for this pull request.


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tensorrt_llm/bench/build/dataclasses.py (1)

275-291: ⚡ Quick win

Annotate and document the new public factory.

from_hf is a new public API, but it currently omits the required return type annotation and a Google-style docstring. Please add both so this entry point follows the repo’s Python guidelines.

Suggested patch
-    def from_hf(cls, model_hf_name, hf_model_path):
+    def from_hf(
+        cls,
+        model_hf_name: str,
+        hf_model_path: str | None,
+    ) -> "Qwen3HybridConfig":
+        """Build a Qwen3 hybrid config from Hugging Face metadata."""
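A fuller sketch of the requested annotation plus Google-style docstring (parameter semantics are inferred from the PR description; treat this as an assumption, not the merged code):

```python
class Qwen3HybridConfig:
    # Docstring sketch only; the body and all other members are elided.
    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: "str | None") -> "Qwen3HybridConfig":
        """Build a Qwen3 hybrid config from Hugging Face metadata.

        Args:
            model_hf_name: Bench model identifier, e.g. "qwen3.5_9b_hf".
            hf_model_path: Local path or hub id of the checkpoint, or None.

        Returns:
            Qwen3HybridConfig: Instance with layer counts pre-computed
            from the loaded pretrained config.
        """
        raise NotImplementedError
```

The forward references are quoted so the annotations resolve without importing the class into its own definition.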
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/bench/build/dataclasses.py` around lines 275 - 291, The new
public factory method from_hf lacks a return type annotation and a Google-style
docstring; update the method signature to include the correct return type (the
dataclass type, e.g., -> "ModelDataclass" or use "cls" typing via
ClassVar/Type[...]/"Self" depending on project typing rules) and add a concise
Google-style docstring above from_hf describing parameters (model_hf_name,
hf_model_path), behavior (loads pretrained_config via load_pretrained_config,
derives layer types with get_qwen3_hybrid_layer_types, computes param_count via
cls.get_param_count) and the returned instance, and ensure any forward
references are quoted if necessary to satisfy static typing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b3846119-7d9f-45ff-ac9e-1d75915fd9a1

📥 Commits

Reviewing files that changed from the base of the PR and between b7aee0b and 51e70e7.

📒 Files selected for processing (1)
  • tensorrt_llm/bench/build/dataclasses.py

