It will be slower than llama.cpp, since MLX is a general machine-learning framework rather than one specialized for LLM inference. That said, the MLX team is actively working on performance, and I expect it to improve significantly in the future.
Do you have some benchmarks against llama.cpp?