training with my own gpt2 #22

dyyzhmm · 2023-05-03T14:17:44Z

To train RRHF with my own GPT-2 model, do I need to first generate responses to each query with my own model and then have ChatGPT score them? If so, doesn't that make wombat_train.json useless?

GanjinZero · 2023-05-04T04:22:36Z

You can also use our data to train RRHF. However, we do not know how well it performs without responses sampled from the current policy model (i.e., your gpt2).
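For context on why the scored responses matter, here is a minimal sketch of the pairwise ranking term described in the RRHF paper: for every pair of responses where one has a lower score than the other, the model is penalized if it assigns the lower-scored response a higher (length-normalized) log-probability. The function name `rrhf_rank_loss` and the numbers are hypothetical and for illustration only; the repository's actual training code also includes a fine-tuning term on the best response and may differ in details.

```python
from itertools import permutations


def rrhf_rank_loss(logprobs, scores):
    """Sum of max(0, p_i - p_j) over all pairs with scores[i] < scores[j].

    logprobs: length-normalized log-probabilities the policy model assigns
              to each candidate response (one float per response).
    scores:   reward/score for each response (e.g. from ChatGPT).
    """
    if len(logprobs) != len(scores):
        raise ValueError("logprobs and scores must have the same length")

    loss = 0.0
    for i, j in permutations(range(len(scores)), 2):
        # Response i is ranked below response j, so the model should not
        # assign it a higher log-probability than response j.
        if scores[i] < scores[j]:
            loss += max(0.0, logprobs[i] - logprobs[j])
    return loss


if __name__ == "__main__":
    # Hypothetical values for three responses to one query.
    example_logprobs = [-1.2, -0.8, -2.0]
    example_scores = [0.9, 0.3, 0.5]
    print(rrhf_rank_loss(example_logprobs, example_scores))
```

The log-probabilities always come from the model being trained, whereas the responses and scores can come from anywhere. That is why the provided data is still usable with gpt2, even though its responses were not sampled from gpt2 itself.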
