feat: add dynamic shapes kernel specialization strategy for TRT-RTX #4184
Merged
lanluo-nvidia merged 2 commits into pytorch:main Apr 21, 2026
Conversation
tp5uiuc commented Apr 12, 2026
Force-pushed from c222c72 to 385eec6
Expose IRuntimeConfig.setDynamicShapesKernelSpecializationStrategy()
through the Torch-TensorRT Python API. Users can now control how
shape-specialized kernels are compiled at runtime for dynamic shapes
on TensorRT-RTX via the new `dynamic_shapes_kernel_specialization_strategy`
compilation setting ("lazy", "eager", or "none").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address review feedback: compile with torchtrt.Input min/opt/max ranges so dynamic shapes are actually exercised. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 385eec6 to d7619ca
lanluo-nvidia (Collaborator) approved these changes Apr 20, 2026
lgtm, one minor comment.
hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
timing_cache_path (str): Path to the timing cache if it exists (or) where it will be saved after compilation. Not used for TensorRT-RTX.
runtime_cache_path (str): Path to the runtime cache for TensorRT-RTX JIT compilation results. Not used for standard TensorRT.
dynamic_shapes_kernel_specialization_strategy (str): Strategy for dynamic shape kernel specialization at runtime (TensorRT-RTX only). Options: "lazy", "eager", "none". Default: "lazy".
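The scheduling difference between the three options can be illustrated with a toy simulation. This is an illustration only, not how TensorRT-RTX is implemented: while a shape-specialized kernel is being JIT-compiled, requests are served by a generic fallback kernel, except under "eager", which blocks until compilation finishes.

```python
# Toy model of the three strategies (illustrative, not the real runtime):
# which kernel serves each inference request while specialization happens.
import threading
import time


def serve_requests(strategy: str, n_requests: int = 3) -> list:
    served = []
    specialized_ready = threading.Event()

    def compile_specialized_kernel():
        time.sleep(0.05)  # stand-in for JIT compilation latency
        specialized_ready.set()

    if strategy == "eager":
        compile_specialized_kernel()  # block until specialization finishes
    elif strategy == "lazy":
        # compile in the background; requests proceed on the fallback kernel
        threading.Thread(target=compile_specialized_kernel, daemon=True).start()
    # strategy == "none": never compile a specialized kernel

    for _ in range(n_requests):
        served.append("specialized" if specialized_ready.is_set() else "fallback")
        time.sleep(0.03)
    return served
```

With "eager", every request sees the specialized kernel (at the cost of blocking up front); with "none", every request uses the fallback; with "lazy", the first requests use the fallback and later ones pick up the specialized kernel once background compilation completes.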
Collaborator
Can we add a warning or check in case the user configures dynamic_shapes_kernel_specialization_strategy with standard TensorRT?
Author
This is a good suggestion, Lan. I have a follow-up task to emit user warnings for:
- timing cache used in TRT-RTX
- runtime cache used in standard TRT
- dynamic shape strategy used in standard TRT
- cudagraphs flag used in standard TRT
so that it's easier to review the change in behavior. I will put it in then.
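A sketch of what the follow-up warnings discussed above could look like. The setting names and which backend each applies to come from the docstrings in this PR; the helper itself is hypothetical, not the planned implementation.

```python
# Hypothetical check: warn about settings that have no effect on the
# active backend. Setting names mirror this PR; the mapping reflects the
# "Not used for ..." notes in the docstrings.
import warnings

# setting name -> the only backend that uses it
_BACKEND_ONLY_SETTINGS = {
    "timing_cache_path": "tensorrt",
    "runtime_cache_path": "tensorrt-rtx",
    "dynamic_shapes_kernel_specialization_strategy": "tensorrt-rtx",
}


def warn_inapplicable_settings(configured: dict, backend: str) -> list:
    """Warn about (and return) configured settings the backend ignores."""
    ignored = []
    for name, required_backend in _BACKEND_ONLY_SETTINGS.items():
        if name in configured and backend != required_backend:
            warnings.warn(
                f"{name} is only used by {required_backend}; "
                f"it will be ignored on {backend}.",
                UserWarning,
            )
            ignored.append(name)
    return ignored
```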
Description

Expose `IRuntimeConfig.setDynamicShapesKernelSpecializationStrategy()` through the Torch-TensorRT Python API for TensorRT-RTX builds. Users can now control how shape-specialized kernels are compiled at runtime for dynamic shapes via the new `dynamic_shapes_kernel_specialization_strategy` compilation setting:

- `"lazy"` (default): compile shape-specialized kernels in the background; use a fallback kernel until they are ready
- `"eager"`: compile shape-specialized kernels immediately (blocking)
- `"none"`: always use fallback kernels; never specialize

Depends on: #4180 (runtime cache API, which provides the `IRuntimeConfig` infrastructure)

Type of change
Checklist:
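The new setting accepts exactly three string values. A minimal sketch of the kind of validation the Python API could perform (the allowed values and the "lazy" default come from this PR; this helper is hypothetical, not the PR's actual code):

```python
# Hypothetical validation for the new compilation setting. The allowed
# values and the default come from this PR; the helper is illustrative.
_ALLOWED_STRATEGIES = ("lazy", "eager", "none")


def validate_strategy(value: str = "lazy") -> str:
    """Return the strategy unchanged, or raise on an unsupported value."""
    if value not in _ALLOWED_STRATEGIES:
        raise ValueError(
            "dynamic_shapes_kernel_specialization_strategy must be one of "
            f"{_ALLOWED_STRATEGIES}, got {value!r}"
        )
    return value
```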