Decoding script invitation #3
Comments
Hi Xumx, thanks for your interest. Due to company policy, we cannot share the decoding script freely. We are still working on a channel where researchers can register and apply for demo access. Please stay tuned; we will update you once it is ready. |
The decoding script is not really that hard to implement. However, I am not fully sure whether the inputs are exactly the same as in the original implementation, because of the weird tokenization implemented in this repo. |
@qywu I was about to post something similar :) I have created a repo with a decoding script that looks quite similar to yours :) I have added automatic model download and a window length for the dialogue history. If you have some time, check it out and let me know if there is something to improve. I tested it a bit yesterday and the responses are actually very good. I tried some of the inputs reported in the repo and I can reproduce some of the responses too. This is the link to the repo: https://github.com/andreamad8/DialoGPT2-Interact I hope this is helpful. Andrea |
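The "window length for the dialogue history" mentioned above can be illustrated with a minimal sketch. It is not taken from the linked repo: the function name `build_input_ids`, the `max_turns` parameter, and the use of GPT-2's end-of-text id (50256) as the turn separator are assumptions made for illustration.

```python
from typing import List

# Assumption: GPT-2's <|endoftext|> token id, used here as the turn separator.
EOS_ID = 50256


def build_input_ids(history: List[List[int]], max_turns: int = 2,
                    eos_id: int = EOS_ID) -> List[int]:
    """Flatten the last `max_turns` turns into one id sequence.

    `history` holds one list of token ids per turn, oldest first.
    Each kept turn is followed by `eos_id`.
    """
    if max_turns <= 0:
        raise ValueError("max_turns must be positive")
    input_ids: List[int] = []
    for turn in history[-max_turns:]:  # drop turns older than the window
        input_ids.extend(turn)
        input_ids.append(eos_id)
    return input_ids
```

Older turns simply fall out of the window, which keeps the prompt short and bounded no matter how long the chat runs.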
@andreamad8 Cool. Have you noticed the weird tokenization for words? It seems that they feed pre-tokenized sentences to GPT-2, which is not necessary. |
Oh, I have noticed that honestly, :) I went straight with the Hugging Face implementation. Which line are you referring to? |
DialoGPT/reddit_extractor/src/reddit.py Lines 114 to 115 in ef531a9
I am not referring to your code, but theirs. For GPT-2, there is no need to tokenize words first. So it doesn't generate sentences like: "Hello , how are you doing ? " |
Oh, I didn't check that file, but you are right, no need, GPT tokenizer does the job already. Maybe open another issue? but good to know. Andrea |
I made an implementation of the MMI decoder (from the description in the paper): https://github.com/LHolten/DialoGTP-MMI-decoder It features unlimited chat length and usage as a discord bot. |
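As a rough illustration of the MMI idea (not the code from the linked repo): candidate responses are generated first, then re-ranked by how likely a backward model finds the context given each response. In the sketch below, `mmi_rerank` and the `backward_logprob` callable are hypothetical names; a real scorer would wrap a backward language model.

```python
from typing import Callable, List, Sequence, Tuple


def mmi_rerank(context: str,
               candidates: Sequence[str],
               backward_logprob: Callable[[str, str], float]
               ) -> List[Tuple[float, str]]:
    """Sort candidate responses by a backward score, best first.

    `backward_logprob(context, response)` is a hypothetical scorer that
    should return log P(context | response) under a backward model.
    """
    scored = [(backward_logprob(context, cand), cand) for cand in candidates]
    # Higher backward log-probability means the response is more
    # specific to this context, so it ranks first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Bland replies that could follow any context tend to score poorly under the backward model, which is why this re-ranking favors more specific responses.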
@qywu Great job! If we want to batch the input_ids, what should we pad with? Padding with 0 gives terrible results. |
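One common approach for decoder-only models such as GPT-2, offered here as an assumption and not as the maintainers' answer, is to pad on the left and pass an attention mask so that pad positions are ignored. In the GPT-2 vocabulary, id 0 is an ordinary token, so padding with 0 and no mask feeds real tokens to the model. The helper name `left_pad_batch` is hypothetical.

```python
from typing import List, Sequence, Tuple


def left_pad_batch(sequences: Sequence[Sequence[int]],
                   pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Left-pad id sequences to equal length and build an attention mask.

    Returns (input_ids, attention_mask); mask entries are 1 for real
    tokens and 0 for padding.
    """
    if not sequences:
        return [], []
    max_len = max(len(seq) for seq in sequences)
    input_ids: List[List[int]] = []
    attention_mask: List[List[int]] = []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Padding goes on the left so the last position of every row
        # is a real token, which is where generation continues from.
        input_ids.append([pad_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask
```

The two lists can be converted to tensors and passed to the model together; the choice of `pad_id` matters little once the mask hides those positions.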
Based on the MMI ideas from DialoGPT, I have implemented a chatbot for Chinese chitchat; its performance is good. |
@andreamad8 can you post the responses that you got? ... and how you got them. We can't seem to match the ones reported. |
The dialogues generated by the chatbot are listed as follows: Sample 2: |
@jsedoc So, in my decoding script I use multinomial sampling, so the output is a bit different every time. If you want to try pure greedy decoding, use top-k 0 and change line 91 with Anyway, the generated responses are very good, but yeah, not exactly the same. For example: USR >>> The trading war between China and US is still happening . and USR >>> who won the world cup in 2018 ? USR >>> what is the boiling point of water? USR >>> who is the first president of the United States In general, I use top-k sampling. Let me know if this helps. |
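The difference between greedy decoding and top-k multinomial sampling can be sketched in plain Python. This is not the script discussed above: the function names and the convention that `top_k <= 0` disables filtering are assumptions made for illustration.

```python
import math
import random
from typing import List, Optional


def top_k_filter(logits: List[float], top_k: int) -> List[float]:
    """Keep the top_k largest logits and set the rest to -inf.

    top_k <= 0 disables filtering. Ties at the cutoff are all kept.
    """
    if top_k <= 0 or top_k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[top_k - 1]
    return [x if x >= threshold else -math.inf for x in logits]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax; -inf logits get probability 0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(logits: List[float], top_k: int = 0, greedy: bool = False,
               rng: Optional[random.Random] = None) -> int:
    """Pick the next token id from one step of logits."""
    if greedy:
        # Deterministic: always the single most likely token.
        return max(range(len(logits)), key=logits.__getitem__)
    rng = rng or random.Random()
    probs = softmax(top_k_filter(logits, top_k))
    # Multinomial sampling: output varies from run to run unless seeded.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Passing a seeded `random.Random` makes sampled runs repeatable, while `greedy=True` removes the randomness entirely.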
The results seem really impressive, thanks for your work! |
Thanks!!! The paper says that each response was chosen from 10 top-k samples. Reproducibility is always an issue with sampling, especially when one of the 10 top-k responses is selected by a human.
|
@yangjianxin1 The results look really impressive! We will mention your GitHub repo in our repo as well. Thanks for letting us know. |
@dreasysnail thank you very much |
First of all, thank you for releasing the code and the models, it's fantastic. Based on the current DialoGPT implementation, I adapted run_generation.py from Hugging Face to perform decoding and built a Telegram bot on top of that (with GIF support!). Texting the model in a messaging app feels very different from doing it in a console. Responses are sometimes out of this world but still very coherent. Here is a multi-turn chat example with a context window of 2 turns:
|
Looks awesome. Thanks for the contribution @polakowo ! |
@andreamad8 @polakowo @yangjianxin1 @LHolten thank you for releasing your code! Have you tried feeding the |
Here are the inputs for a sample dialog (
Is anything wrong with these inputs? Here are the decoded input tokens for your convenience:
|
Hey @nicolas-ivanov, yes, I tried, and yes, it breaks the model's output. I believe that the model has not been trained using this positional token, maybe because the model was working well without it. Anyhow, just keep those as None and it works okay. If you need to fine-tune it, then you can also use the position_ids, and they should work :) I hope this helps. Andrea |
@andreamad8 Thanks a lot for your response! @dreasysnail Could you please confirm that the model was trained without |
Yes @andreamad8 is right (Thanks!). We didn't have the |
Got it, thanks a lot for the clarification! |
Was wondering if you figured out a way to batch decode sentences? |
Hi all, are the third-party decoders still relevant?
Thanks! |
Is there a way to send in requests for the decoding script?
I understand the nature of the challenges surrounding reddit toxicity, we just want to try it out privately, and test different prompts.