I imported a 4-bit quantized GGUF model from Hugging Face into Ollama and asked it questions through Open WebUI. The output speed is very slow.
The Ollama host has an RTX 4060 Ti 16 GB, but only about 8 GB of video memory is in use. The GPU core clock often sits at 210 MHz and rarely reaches its maximum frequency, while CPU usage on the 7950X is around 50%.
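Low GPU clocks combined with high CPU usage could mean that part of the model is running on the CPU rather than the GPU. The following is a minimal diagnostic sketch, not a fix. It assumes the installed Ollama version exposes the `/api/ps` endpoint on the default port 11434 and that each entry carries `name`, `size`, and `size_vram` fields. The helper names `gpu_fraction`, `loaded_models`, and `report` are made up for illustration.

```python
import json
import urllib.request


def gpu_fraction(size: int, size_vram: int) -> float:
    """Return the share of a loaded model's bytes that sit in VRAM (0.0 to 1.0)."""
    if size <= 0:
        return 0.0
    return min(size_vram, size) / size


def loaded_models(base_url: str = "http://localhost:11434") -> list:
    """Fetch the currently loaded models (assumes Ollama's /api/ps endpoint)."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp).get("models", [])


def report(models: list) -> list:
    """Build one summary line per model; missing fields are treated as 0."""
    lines = []
    for m in models:
        frac = gpu_fraction(m.get("size", 0), m.get("size_vram", 0))
        lines.append(f"{m.get('name', '?')}: {frac:.0%} on GPU")
    return lines
```

Running `print("\n".join(report(loaded_models())))` on the Ollama host while the model is loaded would show how much of it is in VRAM. A value well below 100% would be consistent with the symptoms described here; the `ollama ps` command reports a similar CPU/GPU split.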