What is the recommended way to convert from fp32 to fp16? #4656

@sarahliu-cisco

Description

I have a ModernBERT model in fp32 and I want to convert it to fp16 while keeping the embedding layer and classifier layer in fp32. What is the recommended way to do this?
The layerOutputTypes option seems to be what I need, but I could not find a Python API for it. Is it gone in 10.12?
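
For reference, this is the per-layer route I understand from the older, weakly typed API: enable FP16 globally, then pin the layers I want kept in fp32 via `layer.precision` and `set_output_type`. A minimal sketch; the layer-name substrings and file path are my guesses, not from any official example:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(0)
parser = trt.OnnxParser(network, logger)

# "modernbert.onnx" is a placeholder path for the exported model.
with open("modernbert.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # deprecated in 10.12 but still functional
# Make TensorRT honor the per-layer constraints below instead of
# treating them as optimization hints.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # Pin embedding and classifier layers to fp32; the name substrings
    # are guesses and depend on how the model was exported.
    if "embeddings" in layer.name or "classifier" in layer.name:
        layer.precision = trt.DataType.FLOAT
        for j in range(layer.num_outputs):
            layer.set_output_type(j, trt.DataType.FLOAT)

engine_bytes = builder.build_serialized_network(network, config)
```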

tensorrt.BuilderFlag.FP16 is documented as deprecated in TensorRT 10.12: https://docs.nvidia.com/deeplearning/tensorrt/latest/_static/python-api/infer/Core/BuilderConfig.html#tensorrt.BuilderFlag
What is the alternative? The docs say it is superseded by strong typing. Do you have an example? (My current understanding is sketched after the bullet below.)

  • I am on 10.13, and the flag actually still works there.
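
For what it's worth, my current understanding of the strongly typed path is that precision moves out of the builder config and into the model itself: cast the ONNX weights to fp16 (keeping the embedding/classifier nodes in fp32), then parse into a strongly typed network. A sketch assuming the onnxconverter-common package and hypothetical node names:

```python
import onnx
import tensorrt as trt
from onnxconverter_common import float16  # pip install onnxconverter-common

# Cast the ONNX weights to fp16 up front, keeping the blocked nodes in fp32.
# The node names in node_block_list are hypothetical; inspect the exported
# graph (e.g. with Netron) for the real embedding/classifier node names.
model = onnx.load("modernbert.onnx")
fp16_model = float16.convert_float_to_float16(
    model,
    keep_io_types=True,
    node_block_list=["/embeddings/Gather", "/classifier/MatMul"],
)
onnx.save(fp16_model, "modernbert_fp16.onnx")

# Build with the STRONGLY_TYPED flag: TensorRT then follows the dtypes
# baked into the graph, so no BuilderFlag.FP16 is needed.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED)
)
parser = trt.OnnxParser(network, logger)
with open("modernbert_fp16.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
engine_bytes = builder.build_serialized_network(network, config)
```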

Thanks so much!

Environment

TensorRT Version: 10.13

NVIDIA GPU: L40S

NVIDIA Driver Version:

CUDA Version:

CUDNN Version:

Operating System:

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
