Model inference performance degradation when converting to onnx and tensorrt engine. #1125
Unanswered
alexgrabit asked this question in Q&A
Replies: 1 comment
-
ONNX performance issues are beyond the scope of the issues here. Generally, ONNX GPU models haven't worked particularly quickly for me; I feel the TensorRT step is important. CPU with optimizations enabled is often a decent gain. This is all ONNX / PyTorch, though, and nothing to do with the models themselves. Moving to discussion as it's more appropriate for this sort of question.
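For context, a minimal sketch (not from the thread) of what "CPU with optimizations enabled" can look like in ONNX Runtime; the model path, input shape, and thread count below are placeholders:

```python
# Hedged sketch: ONNX Runtime inference on CPU with graph optimizations enabled.
# "model.onnx", the input shape, and the thread count are placeholders.
import numpy as np
import onnxruntime as ort

sess_options = ort.SessionOptions()
# Apply all graph-level optimizations (constant folding, node fusions, etc.).
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_options.intra_op_num_threads = 4  # tune to the host CPU

session = ort.InferenceSession(
    "model.onnx",
    sess_options,
    providers=["CPUExecutionProvider"],
)

# Dummy NCHW input; real use requires the same preprocessing as training.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)
```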
-
Hi everybody,
Problem description
We converted an EfficientNetV2 trained model to ONNX format and used it to run inference on some test images. The results are much worse than those from inference with the .pth.tar file directly (they make no sense to us). The results are similarly bad when running inference with the TensorRT engine converted from the ONNX model, so we suspect the problem is in the .pth.tar-to-ONNX conversion.
Does anybody know what we could be doing wrong?
Thanks in advance.
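For illustration, a hedged sketch of how the .pth.tar-to-ONNX export and a parity check against the PyTorch model might look; the timm model name, checkpoint keys, input resolution, and opset version are assumptions, not the poster's actual setup. A mismatch at this stage (e.g. exporting in train mode, or a preprocessing/resolution that differs from training) is a common cause of nonsensical ONNX and TensorRT outputs.

```python
# Hedged sketch (not the poster's code): export a timm EfficientNetV2 checkpoint
# to ONNX and compare outputs against the original PyTorch model on the same input.
import numpy as np
import torch
import timm
import onnxruntime as ort

# Model name, checkpoint path, and resolution are assumptions for illustration.
# exportable=True asks timm for export-friendly ops; worth checking against your timm version.
model = timm.create_model("tf_efficientnetv2_s", num_classes=1000, exportable=True)
ckpt = torch.load("checkpoint.pth.tar", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # timm training checkpoints usually nest weights under "state_dict"
model.load_state_dict(state_dict)
model.eval()  # exporting with BatchNorm/Dropout in train mode is a frequent cause of bad ONNX results

dummy = torch.randn(1, 3, 384, 384)  # must match the resolution used at inference time
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# Parity check: PyTorch vs. ONNX Runtime on the same tensor.
with torch.no_grad():
    ref = model(dummy).numpy()
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"input": dummy.numpy()})[0]
print("max abs diff:", np.abs(ref - onnx_out).max())  # should be on the order of 1e-4 or smaller
```

If the outputs already diverge at this step, the problem is in the export itself; if they match, the image preprocessing or the ONNX-to-TensorRT conversion is the more likely culprit.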