
Some questions about the evaluation results #8

Open
qizheyanger opened this issue Aug 6, 2024 · 0 comments


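While trying to reproduce the DeepSeekMath-7B-RL baseline from the paper, I get evaluation results that differ substantially from the reported ones. Which step requires special attention and might be preventing me from reproducing the paper's results? Thank you very much!
Generation command: CUDA_VISIBLE_DEVICES=6 python evaluate_all.py --model_name /data2/wuzhuoyang/grade-school-math-master/model/deepseek --dataset_path /data2/wuzhuoyang/OlympiadBench-main/data/OE_TO_maths_en_COMP/data2/data2/OE_TO_maths_zh_CEE.json --cuda_device 6
python judge.py
Accuracy calculation command: python calculate_accuracy.py --ref_dir /data2/wuzhuoyang/OlympiadBench-main/data/reference --text_only
However, the results I reproduced are shown below and differ considerably from the 22.42 reported in the paper. I have checked several times without finding the problem. What might I have overlooked?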
[Image: screenshot of the reproduced evaluation results]
