Results on Comparison based on Vicuna test set #16
Comments
Vicuna has a test set that you can refer to for this. We have also found that GPT-4 scores are not stable from run to run and may not align with human preferences. Here is an example of what is sent to GPT-4 for scoring.
Hi, this is nice work.
I have some questions about the Results in Comparison based on Vicuna test set section shown in the README. How are score A and score B obtained? What do these scores mean? I could not find any information about them. Your clarification would be much appreciated.
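For readers who want to see how a pair of scores like score A and score B could be derived from a GPT-4 judge, and how the run-to-run instability mentioned in this thread could be measured, here is a minimal sketch. It assumes each judge reply starts with a line containing two numbers (one score per model), which is the convention used in Vicuna-style evaluation; this thread does not confirm the exact format used by this project. The function names `parse_score_pair` and `summarize_runs` and the sample replies are hypothetical.

```python
from statistics import mean, stdev


def parse_score_pair(review: str) -> tuple[float, float]:
    """Read (score_a, score_b) from the first line of a judge reply.

    Assumption: the first line looks like "8 9" or "8, 9".
    """
    first_line = review.strip().split("\n")[0]
    parts = first_line.replace(",", " ").split()
    if len(parts) != 2:
        raise ValueError(f"expected two scores, got: {first_line!r}")
    return float(parts[0]), float(parts[1])


def summarize_runs(reviews: list[str]) -> dict[str, float]:
    """Average the scores over repeated judge runs and report their spread."""
    pairs = [parse_score_pair(r) for r in reviews]
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    return {
        "mean_a": mean(a),
        "mean_b": mean(b),
        # Sample standard deviation needs at least two runs.
        "stdev_a": stdev(a) if len(a) > 1 else 0.0,
        "stdev_b": stdev(b) if len(b) > 1 else 0.0,
    }


# Hypothetical replies from three repeated judge calls on the same question.
sample_reviews = [
    "8 9\nAssistant 2 gave a more detailed answer.",
    "7 9\nAssistant 1 missed one point.",
    "9 9\nBoth answers are accurate.",
]
summary = summarize_runs(sample_reviews)
print(summary)
```

Averaging over several runs does not address the mismatch with human preferences; it only makes the variance in the judge's scores visible.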