Very bad response #217
Comments
I changed prompt_chatbort.txt to the following and the problem was solved.
For completeness, if you want to use the Llama-2 chat finetune precisely as it was intended, you need a little bit more than the chatbot example in the repo. The initial prompt includes the system text wrapped in "<<SYS>>"..."<</SYS>>", and each user prompt gets wrapped in "[INST]"..."[/INST]". It's not a difficult thing to do for one model, but to support every possible model you'd need a comprehensive templating system, and I tried to keep it simple for the example. The simplest hack to get the chatbot example pretty close to the intended Llama-2 format would be to set a user name of "[INST]" and a bot name of "[/INST]".
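For reference, a minimal sketch of that format in plain Python, following the template documented in Meta's llama repo; build_first_turn is a hypothetical helper for illustration, not part of exllama:

```python
# Minimal sketch of the Llama-2 chat template (first turn only).
# B_INST/B_SYS follow the format documented in Meta's llama repo;
# build_first_turn is a hypothetical helper, not exllama's API.

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_first_turn(system_prompt: str, user_message: str) -> str:
    # The system prompt is folded into the first user turn.
    return f"{B_INST} {B_SYS}{system_prompt}{E_SYS}{user_message} {E_INST}"

prompt = build_first_turn(
    "You are a helpful, respectful and honest assistant.",
    "how can i connect to a device using mac address?",
)
print(prompt)
# The tokenizer is expected to prepend the BOS token (<s>) when encoding.
```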
Thanks for the reply. prompt_chatbort.txt:
Command:
The response was not good because the format in your example was not correct and there were no BOS and EOS tokens. PR #221 gives a correct Llama-2 chat implementation. @turboderp I would greatly appreciate it if you merged it :)
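The BOS/EOS detail matters because the finetune was trained with each exchange wrapped in BOS/EOS tokens. A minimal sketch of the multi-turn encoding, again following the documented Llama-2 template; format_dialog is a hypothetical helper for illustration, not the code from PR #221:

```python
# Sketch: each completed user/assistant exchange is wrapped in BOS/EOS,
# matching the training format of the Llama-2 chat finetune.
# In real code BOS/EOS should be added as token IDs by the tokenizer,
# not as literal strings (SentencePiece won't map "<s>" to BOS by default).

BOS, EOS = "<s>", "</s>"
B_INST, E_INST = "[INST]", "[/INST]"

def format_dialog(turns: list[tuple[str, str]], next_user_msg: str) -> str:
    # turns: (user_message, assistant_answer) pairs so far; the system
    # prompt would be folded into the first user message as shown earlier.
    text = ""
    for user_msg, answer in turns:
        text += f"{BOS}{B_INST} {user_msg} {E_INST} {answer} {EOS}"
    # The final, unanswered user turn gets BOS but no EOS.
    text += f"{BOS}{B_INST} {next_user_msg} {E_INST}"
    return text

print(format_dialog([("Hi!", "Hello! How can I help?")],
                    "how can i connect to a device using mac address?"))
```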
Thanks, params:
TheBloke_Llama-2-13B-chat-GPTQ/
TheBloke_Llama-2-70B-chat-GPTQ/
@turboderp Hm, interestingly, it also seems to work for him with 13B. @pourfard At the moment I don't own two GPUs, so I can't test 70B myself, but there is code below. Given your 70B output, I highly suspect that my slicing is incorrect. The code doesn't stream the response but still outputs text; would you please give it a try with the 70B model? (It's just draft code to see where the problem lies.) EDIT: I removed the code, I will supply new code in a bit...
Nvm, this code produces even more nonsense... I'm sorry, I'm working out why that is.
I tried another 70B model (localmodels_Llama-2-70B-Chat-GPTQ), but the result is the same.
I used text-generation-webui and it works fine. |
Hi, I'm trying to test TheBloke/LLaMA2-70b-chat, but the responses are very different from the original LLaMA2-70b-chat. I'm running the model on dual 3090s. For example, my question is "how can i connect to a device using mac address?":
command:
python example_chatbot.py -d ../text-generation-webui/models/TheBloke_Llama-2-70B-chat-GPTQ/ -un "Jeff" -p prompt_chatbort.txt -gs 17.2,24
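(For context, -gs takes the VRAM in GB to allot per GPU, so 17.2,24 puts roughly 17.2 GB of layers on the first 3090 and 24 GB on the second. A rough sketch of the equivalent programmatic setup, assuming exllama's Python API as used in its examples; the file names below are placeholders:)

```python
# Sketch: loading a GPTQ model split across two GPUs with exllama.
# Paths/file names are placeholders; set_auto_map takes per-GPU VRAM in GB.

from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer

model_dir = "../text-generation-webui/models/TheBloke_Llama-2-70B-chat-GPTQ"

config = ExLlamaConfig(f"{model_dir}/config.json")
config.model_path = f"{model_dir}/model.safetensors"  # placeholder filename
config.set_auto_map("17.2,24")  # ~17.2 GB on GPU 0, 24 GB on GPU 1

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(f"{model_dir}/tokenizer.model")
cache = ExLlamaCache(model)
```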
exllama params:
exllama response:
Chatbort: I'm Sorry, I am unable to assist you with that. My apologies for any unspecified assistance. (MACant disconnections).
Original LLaMA2 response: