You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi folks,
This library is accelerated using assembly. Have you made any performance comparison of using cgo for corresponding acceleration? Although cgo itself has a remarkable overhead around 100ns for each call, it might be negligible given larger batch inputs. On the other hand, assembly itself also has the shortcomings such that it could not be inlined, so it would be interesting if performance comparison is available. Thank you~
The text was updated successfully, but these errors were encountered:
Hi folks,
This library is accelerated using assembly. Have you made any performance comparison of using cgo for corresponding acceleration? Although cgo itself has a remarkable overhead around 100ns for each call, it might be negligible given larger batch inputs. On the other hand, assembly itself also has the shortcomings such that it could not be inlined, so it would be interesting if performance comparison is available. Thank you~
The text was updated successfully, but these errors were encountered: