Support for Hopper H100 #7
Comments
Also, I have another question: how does Marlin perform compared with TRT-LLM? Recently, a NIPS paper called Quip also shared a version of W2~W4 GEMM. It seems Marlin and Quip both use a similar mma, but one very different from TRT-LLM's. Any comments on the difference and performance?
@Ageliss Can you confirm the benchmark result you posted of llama 7B and 65B is on H800 with Marlin kernel? Thank you. Can you also run the marlin kernel bench in |
Hi! You've probably already considered this, but would you be able to add support for Hopper H100 GPUs? A100s don't have nearly as much memory bandwidth. Am happy to run tests/benchmarks on one if that would help, thanks