Models speak with themselves #605
This is a known issue caused by using the wrong template with a model. Templates are meant to be stored in the gguf, but often aren't, which leads to the app falling back to chatml.

Try setting the template parameter to "gemma" or "llama2". The available template strings are "chatml", "llama2", "phi3", "zephyr", "monarch", "gemma", "gemma2", "orion", "vicuna", "vicuna-orca", "deepseek", "command-r", "llama3", "minicpm" and "deepseek2". These are the templates llama.cpp handles out of the box; if a template isn't set in the gguf file, there's currently no way of detecting which one should be used.
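If you want to see what a given template name actually produces, here is a minimal sketch using llama.cpp's C API (assuming the `llama_chat_apply_template` signature from around the time of this thread; it has changed in later releases). Passing an explicit name such as "gemma" overrides whatever is, or is not, stored in the gguf:

```c
#include <stdio.h>
#include <stdbool.h>
#include "llama.h"

int main(void) {
    // One user turn; llama_chat_message is just { role, content }.
    struct llama_chat_message chat[] = {
        { "user", "Hello!" },
    };
    char buf[1024];

    // With an explicit template name the model pointer is not consulted,
    // so NULL is fine here. With tmpl == NULL the template would be read
    // from the gguf metadata, falling back to chatml if absent.
    int32_t n = llama_chat_apply_template(
        NULL,       // model (only used when tmpl is NULL)
        "gemma",    // one of the built-in template names listed above
        chat, 1,
        true,       // append the assistant prefix so generation can start
        buf, sizeof(buf));

    if (n > 0 && n < (int32_t) sizeof(buf)) {
        // Prints the <start_of_turn>/<end_of_turn> framing Gemma expects;
        // feeding a Gemma model chatml framing instead is exactly the kind
        // of mismatch that makes it "speak with itself".
        printf("%.*s\n", n, buf);
    }
    return 0;
}
```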
Hi @danemadsen, thanks for the previous comment!
Hello. I have the same issue.

We are testing the macOS version of Maid with gguf models from Hugging Face (https://huggingface.co/bartowski and https://huggingface.co/QuantFactory), and they start to speak with themselves. With AnythingLLM these models work fine. I assume there is some issue with end-of-response token initialization.
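One way to test that assumption is to inspect the EOS token the gguf actually reports. A hedged sketch using llama.cpp's C API (function names as of mid-2024; newer releases moved these into the `llama_vocab_*` family):

```c
#include <stdio.h>
#include "llama.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    llama_backend_init();
    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model = llama_load_model_from_file(argv[1], mparams);
    if (!model) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // If the EOS id or its text looks wrong for the architecture (e.g. a
    // Gemma model whose EOS is not <end_of_turn>), generation may never
    // stop at the end of a turn and the model keeps talking to itself.
    llama_token eos = llama_token_eos(model);
    printf("eos id: %d, text: %s\n", eos, llama_token_get_text(model, eos));

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```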