Model inference performance degradation when converting to onnx and tensorrt engine. #1125
Unanswered
alexgrabit asked this question in Q&A
Replies: 1 comment
-
ONNX performance issues are beyond the scope of the issues here. Generally, ONNX GPU models haven't worked particularly quickly for me; I feel the TensorRT step is important. CPU with optimizations enabled is often a decent gain. This is all ONNX / PyTorch, though, and nothing to do with the models themselves. Moving to discussion as it's more appropriate for this sort of question.
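For context, a minimal sketch (not from the thread) of what "CPU with optimizations enabled" can look like in ONNX Runtime; the model path, input shape, and thread count below are placeholders:

```python
# Hedged sketch: ONNX Runtime inference on CPU with graph optimizations enabled.
# "model.onnx", the input shape, and the thread count are placeholders.
import numpy as np
import onnxruntime as ort

sess_options = ort.SessionOptions()
# Apply all graph-level optimizations (constant folding, node fusions, etc.).
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_options.intra_op_num_threads = 4  # tune to the host CPU

session = ort.InferenceSession(
    "model.onnx",
    sess_options,
    providers=["CPUExecutionProvider"],
)

# Dummy NCHW input; real use requires the same preprocessing as training.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)
```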
-
Hi everybody,
Problem description
We converted an EfficientNetV2 trained model to ONNX format and used it to run inference on some test images. The results are much worse than those from inference with the .pth.tar file directly (they make no sense to us). The results are similarly bad when running inference with the TensorRT engine converted from the ONNX model, so we suspect the problem is in the .pth.tar-to-ONNX conversion.
Does anybody know what we could be doing wrong?
Thanks in advance.
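For illustration, a hedged sketch of how the .pth.tar-to-ONNX export and a parity check against the PyTorch model might look; the timm model name, checkpoint keys, input resolution, and opset version are assumptions, not the poster's actual setup. A mismatch at this stage (e.g. exporting in train mode, or a preprocessing/resolution that differs from training) is a common cause of nonsensical ONNX and TensorRT outputs.

```python
# Hedged sketch (not the poster's code): export a timm EfficientNetV2 checkpoint
# to ONNX and compare outputs against the original PyTorch model on the same input.
import numpy as np
import torch
import timm
import onnxruntime as ort

# Model name, checkpoint path, and resolution are assumptions for illustration.
# exportable=True asks timm for export-friendly ops; worth checking against your timm version.
model = timm.create_model("tf_efficientnetv2_s", num_classes=1000, exportable=True)
ckpt = torch.load("checkpoint.pth.tar", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # timm training checkpoints usually nest weights under "state_dict"
model.load_state_dict(state_dict)
model.eval()  # exporting with BatchNorm/Dropout in train mode is a frequent cause of bad ONNX results

dummy = torch.randn(1, 3, 384, 384)  # must match the resolution used at inference time
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# Parity check: PyTorch vs. ONNX Runtime on the same tensor.
with torch.no_grad():
    ref = model(dummy).numpy()
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"input": dummy.numpy()})[0]
print("max abs diff:", np.abs(ref - onnx_out).max())  # should be on the order of 1e-4 or smaller
```

If the outputs already diverge at this step, the problem is in the export itself; if they match, the image preprocessing or the ONNX-to-TensorRT conversion is the more likely culprit.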