NVIDIA bert on A100 - building engines fails when using fp16 #10

Open
saibulusu opened this issue Jan 3, 2025 · 1 comment

Comments


saibulusu commented Jan 3, 2025

I have added the following to the configuration for bert:
[image attachment: configuration change]

I am noticing the following error when I add the line that sets the precision to fp16 for high accuracy. What do I have to change so that I can use fp16 on NVIDIA A100 GPUs?
[image attachment: error output]

The default configuration (using fp32) works.
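For reference, fp16 precision in TensorRT is normally requested through the builder configuration rather than per-layer. This is a minimal, hedged config sketch (it assumes `network` is an already-populated `INetworkDefinition` and a TensorRT 10.x Python install; it cannot run without a GPU):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Request fp16 kernels only if the platform has fast fp16 support
# (A100 does). TensorRT may still fall back to fp32 per layer.
if builder.platform_has_fast_fp16:
    config.set_flag(trt.BuilderFlag.FP16)
```

Whether this interacts with the repo's own config file is an assumption; the error in the screenshot suggests the failure happens during network construction, not in the precision flag itself.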

saibulusu (Author) commented:

I have TensorRT version 10.5.0, and I am using CUDA 12.4 on Ubuntu 22.04.
[image attachment]

It looks like an older version of TensorRT has this function `add_fully_connected`. Are older versions of TensorRT compatible with CUDA 12.4?
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-803/api/python_api/infer/Graph/Network.html
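`INetworkDefinition.add_fully_connected` was deprecated in the TensorRT 8.x line and is no longer present in TensorRT 10.x, which would explain the failure on 10.5.0. The usual migration path is to express the fully-connected layer as a matrix multiply plus a bias add. A minimal sketch, assuming the Python network-definition API and 2-D activations (the helper name `add_fc` and the shape conventions are placeholders, not part of the repo):

```python
import numpy as np
import tensorrt as trt

def add_fc(network, x, weights, bias):
    """Emulate the removed add_fully_connected with matmul + bias add.

    x       -- ITensor of shape (N, K)
    weights -- numpy array of shape (K, M)
    bias    -- numpy array of shape (1, M)
    """
    # Fold the weights and bias into the network as constants.
    w = network.add_constant(weights.shape, trt.Weights(weights)).get_output(0)
    b = network.add_constant(bias.shape, trt.Weights(bias)).get_output(0)

    # (N, K) @ (K, M) -> (N, M), then broadcast-add the bias.
    mm = network.add_matrix_multiply(
        x, trt.MatrixOperation.NONE, w, trt.MatrixOperation.NONE)
    out = network.add_elementwise(
        mm.get_output(0), b, trt.ElementWiseOperation.SUM)
    return out.get_output(0)
```

If the repo's BERT builder script calls `add_fully_connected` directly, each call site would need a rewrite along these lines (or the script pinned to a TensorRT version that still ships the layer).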
