Replies: 1 comment
-
The docs mention flash-attention setup under ROCm.
-
I couldn't find how to set up flash attention on vLLM. I was wondering whether it is supported, and if not, what the reason is. Many thanks!
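For anyone landing here: below is a minimal sketch of what forcing the flash-attention backend might look like. It assumes a vLLM build that reads the `VLLM_ATTENTION_BACKEND` environment variable; the exact variable name and the accepted values (e.g. `FLASH_ATTN`, or `ROCM_FLASH` on ROCm) should be verified against the docs for the installed version.

```python
import os

# Assumption: this vLLM build selects its attention backend from the
# VLLM_ATTENTION_BACKEND environment variable (e.g. FLASH_ATTN on CUDA,
# ROCM_FLASH on ROCm). Set it before vLLM is imported.
os.environ.setdefault("VLLM_ATTENTION_BACKEND", "FLASH_ATTN")

from vllm import LLM, SamplingParams

# Small model used purely as a smoke test; any supported model works.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```

As far as I can tell, when the variable is unset vLLM picks a backend automatically based on the installed packages and hardware, so the override is only needed to force a specific implementation.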