[https://nvbugs/6115039][fix] Override from_hf in Qwen3HybridConfig to pre-compute num_attention_layers #13663
tensorrt-cicd wants to merge 1 commit into NVIDIA:main.
Conversation
…f to avoid invalid model name lookup

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
📝 Walkthrough

A classmethod `from_hf` was added to `Qwen3HybridConfig` to pre-compute `num_attention_layers` and `num_linear_attention_layers` from the already-loaded pretrained config.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
tensorrt_llm/bench/build/dataclasses.py (1)
Lines 275-291: ⚡ Quick win: Annotate and document the new public factory.

`from_hf` is a new public API, but it currently omits the required return type annotation and a Google-style docstring. Please add both so this entry point follows the repo's Python guidelines.

Suggested patch:
```diff
-    def from_hf(cls, model_hf_name, hf_model_path):
+    def from_hf(
+        cls,
+        model_hf_name: str,
+        hf_model_path: str | None,
+    ) -> "Qwen3HybridConfig":
+        """Build a Qwen3 hybrid config from Hugging Face metadata."""
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `tensorrt_llm/bench/build/dataclasses.py`, around lines 275-291: the new public factory method `from_hf` lacks a return type annotation and a Google-style docstring. Update the method signature to include the correct return type (the dataclass type, e.g. `-> "ModelDataclass"`, or typing via `Type[...]`/`Self` depending on project typing rules), and add a concise Google-style docstring to `from_hf` describing the parameters (`model_hf_name`, `hf_model_path`), the behavior (loads `pretrained_config` via `load_pretrained_config`, derives layer types with `get_qwen3_hybrid_layer_types`, computes `param_count` via `cls.get_param_count`), and the returned instance. Ensure any forward references are quoted if necessary to satisfy static typing.
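As an illustration of the requested change, here is a hedged sketch of what the annotated signature and Google-style docstring might look like. The parameter names come from the review; the return-type choice and the docstring wording are assumptions, not the repo's actual code.

```python
from typing import Optional


class Qwen3HybridConfig:
    """Minimal stand-in for the real config class (illustrative only)."""

    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: Optional[str]) -> "Qwen3HybridConfig":
        """Build a Qwen3 hybrid config from Hugging Face metadata.

        Args:
            model_hf_name: Bench identifier for the model
                (e.g. "qwen3.5_9b_hf").
            hf_model_path: Local path or hub ID of the pretrained checkpoint.

        Returns:
            A populated Qwen3HybridConfig instance.
        """
        # Real implementation would load and inspect the pretrained config here.
        return cls()


# The docstring is attached to the bound classmethod as usual.
print("Args:" in Qwen3HybridConfig.from_hf.__doc__)  # True
```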
ℹ️ Review info
⚙️ Run configuration
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b3846119-7d9f-45ff-ac9e-1d75915fd9a1
📒 Files selected for processing (1)
tensorrt_llm/bench/build/dataclasses.py
Summary
The `Qwen3HybridConfig.set_values_if_none` validator calls `load_pretrained_config(self.name)`, where `self.name` is the bench identifier ("qwen3.5_9b_hf"), not a valid model path, causing an OSError during trtllm-bench model type resolution. This change overrides `from_hf` in `Qwen3HybridConfig` to pre-compute `num_attention_layers` and `num_linear_attention_layers` from the already-loaded pretrained config, so the validator's reload path is never triggered.

Test plan
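The fix described above can be sketched as follows. This is an illustrative mock, not the repo's code: the helper `get_qwen3_hybrid_layer_types`, the inline `pretrained_config` dict, and the layer-type labels are assumptions standing in for the real loaded Hugging Face config.

```python
from dataclasses import dataclass
from typing import List, Optional


def get_qwen3_hybrid_layer_types(pretrained_config: dict) -> List[str]:
    """Hypothetical helper: per-layer types from the pretrained config."""
    return pretrained_config.get("layer_types", [])


@dataclass
class Qwen3HybridConfig:
    name: str
    num_attention_layers: Optional[int] = None
    num_linear_attention_layers: Optional[int] = None

    @classmethod
    def from_hf(cls, model_hf_name: str,
                hf_model_path: str) -> "Qwen3HybridConfig":
        # Stand-in for the already-loaded HF config; the real code would use
        # load_pretrained_config(hf_model_path) exactly once, here.
        pretrained_config = {
            "layer_types": ["full_attention", "linear_attention",
                            "full_attention"],
        }
        layer_types = get_qwen3_hybrid_layer_types(pretrained_config)
        # Pre-computing both counts means the validator never needs to
        # reload the config from self.name (the invalid bench identifier).
        return cls(
            name=model_hf_name,
            num_attention_layers=layer_types.count("full_attention"),
            num_linear_attention_layers=layer_types.count("linear_attention"),
        )


cfg = Qwen3HybridConfig.from_hf("qwen3.5_9b_hf", "/path/to/checkpoint")
print(cfg.num_attention_layers, cfg.num_linear_attention_layers)  # 2 1
```

Because both layer counts are populated at construction time, a `set_values_if_none`-style validator finds nothing missing and its `load_pretrained_config(self.name)` fallback is never reached.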