NVIDIA bert on A100 - building engines fails when using fp16 #10

Open
saibulusu opened this issue Jan 3, 2025 · 1 comment

Comments


saibulusu commented Jan 3, 2025

I have added the following to the configuration for bert:
[image attachment: configuration change]

I am noticing the following error when I add the line that sets the precision to fp16 for high accuracy. What do I have to change so that I can use fp16 on NVIDIA A100 GPUs?
[image attachment: error output]

The default configuration (using fp32) works.
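For reference, fp16 precision in TensorRT is normally requested through the builder configuration rather than per-layer. This is a minimal, hedged config sketch (it assumes `network` is an already-populated `INetworkDefinition` and a TensorRT 10.x Python install; it cannot run without a GPU):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Request fp16 kernels only if the platform has fast fp16 support
# (A100 does). TensorRT may still fall back to fp32 per layer.
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)
```

Whether this interacts with the repo's own config file is an assumption; the error in the screenshot suggests the failure happens during network construction, not in the precision flag itself.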

saibulusu (Author) commented:

I have TensorRT version 10.5.0, and I am using CUDA 12.4 on Ubuntu 22.04.
[image attachment]

It looks like an older version of TensorRT has this function `add_fully_connected`. Are older versions of TensorRT compatible with CUDA 12.4?
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-803/api/python_api/infer/Graph/Network.html
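`INetworkDefinition.add_fully_connected` was deprecated in the TensorRT 8.x line and is no longer present in TensorRT 10.x, which would explain the failure on 10.5.0. The usual migration path is to express the fully-connected layer as a matrix multiply plus a bias add. A minimal sketch, assuming the Python network-definition API and 2-D activations (the helper name `add_fc` and the shape conventions are placeholders, not part of the repo):

```python
import numpy as np
import tensorrt as trt

def add_fc(network, x, weights, bias):
    """Emulate the removed add_fully_connected with matmul + bias add.

    x       -- ITensor of shape (N, K)
    weights -- numpy array of shape (K, M)
    bias    -- numpy array of shape (1, M)
    """
    # Fold the weights and bias into the network as constants.
    w = network.add_constant(weights.shape, trt.Weights(weights)).get_output(0)
    b = network.add_constant(bias.shape, trt.Weights(bias)).get_output(0)

    # (N, K) @ (K, M) -> (N, M), then broadcast-add the bias.
    mm = network.add_matrix_multiply(
        x, trt.MatrixOperation.NONE, w, trt.MatrixOperation.NONE)
    out = network.add_elementwise(
        mm.get_output(0), b, trt.ElementWiseOperation.SUM)
    return out.get_output(0)
```

If the repo's BERT builder script calls `add_fully_connected` directly, each call site would need a rewrite along these lines (or the script pinned to a TensorRT version that still ships the layer).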
