[core / DDP] Fix RM trainer + DDP + quantization + propagate `gradient_checkpointing_kwargs` in SFT & DPO #912
Conversation
The documentation is not available anymore as the PR was closed or merged.
LG! One minor comment.
examples/scripts/reward_modeling.py (outdated)

```diff
@@ -103,7 +103,7 @@ class ScriptArguments:

 # Step 2: Load the dataset and pre-process it
 tokenizer = AutoTokenizer.from_pretrained(args.model_name)
-train_dataset = load_dataset(args.dataset_name, split="train")
+train_dataset = load_dataset(args.dataset_name, split="train[:50]")
```
Could this be not hardcoded? Maybe something like `split=args.split`?
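The reviewer's suggestion can be sketched as follows. This is a hedged illustration, not the script's actual code: the field names mirror the example `ScriptArguments` dataclass, but the defaults shown here are placeholders.

```python
# Sketch of exposing the dataset split as a script argument instead of
# hardcoding "train[:50]". Defaults are illustrative, not the real ones.
from dataclasses import dataclass, field

@dataclass
class ScriptArguments:
    dataset_name: str = field(default="Anthropic/hh-rlhf")
    split: str = field(default="train")  # e.g. "train" or "train[:50]"

args = ScriptArguments()
# The script would then load the dataset with the user-chosen split:
# train_dataset = load_dataset(args.dataset_name, split=args.split)
print(args.split)
```

This keeps the quick-test behavior available (`--split "train[:50]"`) without baking it into the example.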
Looks good!
* fix bug: `generate_args`-`do_sample`
* fix `gradient_checkpointing_kwargs` bug; see: huggingface/trl#912 and huggingface/transformers#26969
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…dient_checkpointing_kwargs` in SFT & DPO (huggingface#912)

* make use of forward hooks
* correctly delete attributes
* fix RM DPP issues
* revert unneeded changes
* more fixes
* fix diff
* fix
* propagate to SFT
* Update examples/scripts/reward_modeling.py
* propagate the fix on DPO trainer
* add to example scripts
* trigger CI
Needs: huggingface/peft#1036 / huggingface/transformers#27020
Fixes: #891
Fixes: #835
To avoid issues with PEFT + DDP, we need to call the gradient checkpointing method with `use_reentrant=False`, which you can pass via the `gradient_checkpointing_kwargs` argument directly in the trainer once huggingface/transformers#27020 lands. Users who do not have a recent enough transformers version need to update transformers to get that feature; otherwise `gradient_checkpointing_kwargs` will be ignored.

cc @lvwerra
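With a transformers version that includes the PR above, the fix described here reduces to a configuration change on the user side. A minimal sketch, assuming a recent transformers release (the `output_dir` value is illustrative):

```python
# Config fragment: enable non-reentrant gradient checkpointing so that
# PEFT + DDP training does not break. `gradient_checkpointing_kwargs`
# is forwarded to the model's gradient_checkpointing_enable() call and
# requires a transformers version with huggingface/transformers#27020.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./rm-output",  # illustrative path
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```

On older transformers versions the `gradient_checkpointing_kwargs` field does not exist and the kwargs are silently ignored, which is why the PR description asks users to upgrade.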