Can't reproduce AE-LC numbers with HF checkpoints (Llama-3-8b-SFT-DPO, Llama-3-8b-SFT-SimPO) #77

schrieffer-z · 2024-12-02T04:07:51Z

Every vllm version I tried conflicts with the current env. Even if eval and train are separated into different envs, I still need to know which vllm version to use.
In a separate env, using the command below, I tried different annotator engines (4o and 4-turbo) and got numbers that do not make sense. Have you ever tried different annotators? With 4o I get a result where DPO > SimPO, while 4-turbo gives the opposite.

alpaca_eval evaluate_from_model \
  --model_configs /mnt/vepfs/fs_users/***/xAI-RLHF/***/SimPO/eval/alpacaeval2/configs/Llama-3-Base-8B-SFT-SimPO.yaml \
  --annotators_config weighted_alpaca_eval_gpt4_turbo
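
To compare annotators on equal footing, it may help to generate the same command once per annotator config so that only `--annotators_config` changes between runs. Below is a minimal sketch, not the repo's tooling. It only builds and prints the commands. The relative config path and the name `my_gpt4o_annotator` are hypothetical placeholders; only `weighted_alpaca_eval_gpt4_turbo` and the two flags come from the command above.

```python
import shlex


def build_eval_commands(model_config, annotators):
    """Return one alpaca_eval command (as an argv list) per annotator config.

    Only the flags that appear in the command above are used.
    """
    return [
        [
            "alpaca_eval", "evaluate_from_model",
            "--model_configs", model_config,
            "--annotators_config", annotator,
        ]
        for annotator in annotators
    ]


# Hypothetical inputs: replace the path and the second annotator name with
# whatever you actually use. "my_gpt4o_annotator" is a placeholder, not a
# config name known to ship with alpaca_eval.
commands = build_eval_commands(
    "configs/Llama-3-Base-8B-SFT-SimPO.yaml",
    ["weighted_alpaca_eval_gpt4_turbo", "my_gpt4o_annotator"],
)

# Print shell-safe command lines; run them manually or via subprocess.run.
for argv in commands:
    print(shlex.join(argv))
```

Keeping the model config fixed this way makes it easier to attribute a DPO/SimPO ranking flip to the annotator rather than to a change in generation settings.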

schrieffer-z · 2024-12-02T04:23:44Z

The repo's AE-LC numbers:
SimPO: 22%
DPO: 18.2%

lancerts · 2024-12-02T18:00:54Z

Had a similar issue with the checkpoint [princeton-nlp/Llama-3-Instruct-8B-SimPO-v0.2](https://huggingface.co/princeton-nlp/Llama-3-Instruct-8B-SimPO-v0.2).
3 AE runs give a consistent LC/WR of 48/44, whereas the reported numbers are 53.7/47.5.
One potential cause could be #75. Also, no vllm version is specified in https://github.com/princeton-nlp/SimPO/blob/main/environment.yml
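
Since the vllm version is unpinned, it could help if everyone reporting numbers here also posted the package versions in their eval env. A minimal sketch using only the standard library follows. The package list is an assumption about what is relevant, not something taken from the repo's `environment.yml`.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of packages that plausibly affect generation and scoring.
PACKAGES = ["vllm", "alpaca-eval", "torch", "transformers"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```

Pasting this output next to the LC/WR numbers would make it easier to see whether the gaps track a particular vllm release.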
