
Request for SFT Training scripts and implementation details #71

Open

HCY123902 opened this issue Oct 19, 2024 · 4 comments

HCY123902 commented Oct 19, 2024

Thank you for sharing your research work. I have a question related to the supervised fine-tuning step, which, according to the paper, is used to initialize the base model before running SimPO. While the SFT configuration file is provided at training_configs/llama-3-8b-base-sft.yaml, may I ask for the SFT training script itself?

In issue #27, there is a comment asking how HuggingFaceH4/ultrachat_200k is processed for SFT; I would like to know this as well. Since HuggingFaceH4/ultrachat_200k samples are multi-turn dialogues, I am curious what labels are used for SFT.

HCY123902 changed the title from "Training scripts for SFT" to "Request for SFT Training scripts and implementation details" on Oct 19, 2024
yumeng5 (Collaborator) commented Oct 19, 2024

Hi @HCY123902

We used the same SFT training script as the original alignment-handbook repo: https://github.com/huggingface/alignment-handbook/blob/main/scripts/run_sft.py

And the command for SFT training is as follows:

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py training_configs/llama-3-8b-base-sft.yaml

As for HuggingFaceH4/ultrachat_200k, we didn't do any specific processing of it, which means we train on all turns when a dialogue is multi-turn.
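For concreteness, here is a minimal, purely illustrative sketch of what that typically amounts to with the alignment-handbook/TRL preprocessing (this is not the repo's code; the chat template below is a hypothetical stand-in for the one the SFT config provides):

```python
# Illustrative sketch only -- not code from the SimPO repo. It assumes the
# standard alignment-handbook preprocessing: every turn of a multi-turn
# ultrachat_200k dialogue is rendered with the tokenizer's chat template, and
# the SFT labels are just the token ids of the full rendered text (causal LM
# next-token prediction), with no per-turn masking.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
if tokenizer.chat_template is None:
    # The base tokenizer ships without a chat template; the SFT config supplies
    # one. This simple Jinja template is a hypothetical stand-in.
    tokenizer.chat_template = (
        "{% for message in messages %}"
        "<|{{ message['role'] }}|>\n{{ message['content'] }}\n"
        "{% endfor %}"
    )

ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

def render(example):
    # example["messages"] is a list of {"role": ..., "content": ...} turns;
    # all of them (user and assistant) end up in the training string.
    example["text"] = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return example

ds = ds.map(render)
# An SFT trainer (e.g. trl's SFTTrainer) then trains on ds["text"] with plain
# next-token prediction, so every turn contributes to the loss.
```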

I hope this helps!

Best,
Yu

HCY123902 (Author) commented

Thank you for clarifying this.

OscarXZQ commented Nov 1, 2024

Hi, following up on this, may I kindly ask if it's possible to provide a separate script for SFT training? The run_simpo.py script seems to hard-code the preference optimization stage, which makes the command

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file accelerate_configs/deepspeed_zero3.yaml scripts/run_simpo.py training_configs/llama-3-8b-base-sft.yaml

as provided above not runnable.

yumeng5 (Collaborator) commented Nov 1, 2024

Hi @OscarXZQ

Sorry, there was a typo in my previous comment: run_simpo.py should be replaced with run_sft.py (the script from the original alignment-handbook repo). I have fixed the typo.

Best,
Yu
