Can we use vLLM only on CPU, without a GPU machine?

Replies: 3 comments · 1 reply
-
I think the short answer is no, as vLLM's engine relies on custom kernels written in CUDA.
-
You can try ctranslate2 or llama.cpp.
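For local CPU-only testing, a rough sketch with llama.cpp (through the llama-cpp-python bindings) might look like the following; the model path is a placeholder for whatever GGUF file you have downloaded, and the thread/context settings are only illustrative defaults:

```python
# Minimal, hypothetical sketch of CPU-only inference via llama-cpp-python.
# The GGUF path below is a placeholder, not a real file in any repo.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.Q4_K_M.gguf",  # placeholder: any locally downloaded GGUF file
    n_ctx=2048,   # context window size
    n_threads=8,  # number of CPU threads to use
)

output = llm(
    "Q: Can vLLM run without a GPU? A:",
    max_tokens=64,
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```

ctranslate2 similarly supports `device="cpu"` for models converted to its own format.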
-
This was not clear to me either. Is there any way to highlight this in bold somewhere in the main docs? Sorry if I overlooked it. I am trying to do some local testing; that's my use case.