Description
I have a ModernBERT model in FP32 and I want to convert it to FP16 while keeping the embedding layer and the classifier layer in FP32. What is the recommended way to do this?
The layerOutputTypes option seems relevant, but I could not find a corresponding Python API, and it appears to be gone in 10.12?
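If I understand correctly, the closest Python-side equivalent is ILayer.precision together with ILayer.set_output_type. For context, here is a minimal sketch of that weakly-typed flow; the ONNX path and the embedding/classifier layer-name substrings are placeholders I made up for ModernBERT and will depend on the export:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network()
parser = trt.OnnxParser(network, logger)
with open("modernbert.onnx", "rb") as f:  # placeholder model path
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)                        # allow FP16 kernels globally
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)  # enforce the per-layer pins below

# Pin layers belonging to the embedding or classifier to FP32.
# The name substrings are guesses and depend on how the model was exported.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if "embeddings" in layer.name or "classifier" in layer.name:
        layer.precision = trt.float32
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```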
However, tensorrt.BuilderFlag.FP16 is documented as deprecated in TensorRT 10.12: https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/Core/BuilderConfig.html#tensorrt.BuilderFlag
What is the alternative? The documentation says it has been superseded by strong typing. Do you have an example?
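My current understanding is that with strong typing the engine precision follows the tensor dtypes already present in the network, so the FP16/FP32 split would have to be expressed in the exported model itself (e.g. keeping the embedding and classifier submodules in float32 with explicit casts around the FP16 encoder body before export). A minimal sketch of the build side under that assumption:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)

# With strong typing, TensorRT derives precision from the dtypes in the
# network rather than from builder flags, so no FP16 flag is set here.
# The FP32 islands (embedding, classifier) must already exist as FP32
# tensors in the ONNX graph.
flags = 1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED)
network = builder.create_network(flags)
parser = trt.OnnxParser(network, logger)
with open("modernbert_mixed.onnx", "rb") as f:  # placeholder mixed-precision export
    parser.parse(f.read())

config = builder.create_builder_config()
engine_bytes = builder.build_serialized_network(network, config)
```

Is baking the casts into the exported model the intended replacement for per-layer pinning, or is there still a build-time way to control this under strong typing?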
- Update: I am on TensorRT 10.13, and tensorrt.BuilderFlag.FP16 actually still works.
Thanks so much!
Environment
TensorRT Version: 10.13
NVIDIA GPU: L40S
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):