training with my own gpt2 #22

Open
dyyzhmm opened this issue May 3, 2023 · 1 comment

Comments

@dyyzhmm

dyyzhmm commented May 3, 2023

To train RRHF with my own GPT-2 model, do I first need to generate responses to each query with my own model and then have ChatGPT score them? If so, doesn't that make wombat_train.json useless?

@GanjinZero
Owner

You can also use our data to train RRHF; however, we do not know how it performs when the responses are not sampled from the current policy model (i.e., gpt2).
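
For readers weighing the two options, here is a minimal sketch of the "sample from your own policy, then score" loop described in the question. It is an illustration only, not the repository's code: `build_rrhf_records`, `sample_fn`, and `score_fn` are hypothetical names, the toy functions stand in for GPT-2 generation and ChatGPT scoring, and the `query` / `responses` / `scores` record layout is an assumption that should be checked against wombat_train.json before use.

```python
import itertools
import json
from typing import Callable, Dict, List


def build_rrhf_records(
    queries: List[str],
    sample_fn: Callable[[str], str],
    score_fn: Callable[[str, str], float],
    num_samples: int = 4,
) -> List[Dict]:
    """Sample several responses per query and attach one score to each.

    sample_fn and score_fn are hypothetical hooks: plug in your own policy
    model (e.g. GPT-2 generation) and your own scorer (e.g. a ChatGPT call).
    """
    records = []
    for query in queries:
        responses = [sample_fn(query) for _ in range(num_samples)]
        scores = [float(score_fn(query, r)) for r in responses]
        # Assumed record layout; verify against the real training file.
        records.append({"query": query, "responses": responses, "scores": scores})
    return records


# Toy stand-ins so the sketch runs without any model or API access.
_draft_ids = itertools.count()


def toy_sample(query: str) -> str:
    # Placeholder for sampling one response from the policy model.
    return f"{query} -> draft {next(_draft_ids)}"


def toy_score(query: str, response: str) -> float:
    # Placeholder for an external scorer; here just the response length.
    return float(len(response))


records = build_rrhf_records(["What is RRHF?"], toy_sample, toy_score, num_samples=3)
print(json.dumps(records, indent=2))
```

Training on the provided data skips this loop entirely; running it with your own model gives responses that come from the current policy, which is the case the reply says has not been compared.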
