
[FSDP][Example] Fix FSDP example for use_orig_params=False #3298

Open
wants to merge 5 commits into main
Conversation

SumanthRH (Contributor) commented Dec 16, 2024

What does this PR do?

Fixes the FSDP example for use_orig_params=False. I recently started using FSDP via Accelerate and hit some roadblocks due to rough edges in the documentation and examples. Improvements in this PR:

  • use_orig_params=False should be the recommended route in all cases (full-parameter fine-tuning and LoRA), since it is typically more efficient.
  • use_orig_params=False requires the model to be prepared before the optimizer is instantiated; see the sketch after this list. Getting the order wrong is a silent failure - I noticed that the loss didn't decrease in this case. I believe it is best if .prepare has extra validation for this and raises an error to the user (might do this in a separate PR if I get time). The example currently prepares everything together because it assumes use_orig_params=True, following fsdp refactoring #2177.
  • The documentation no longer mentions this caveat about use_orig_params. I found the note in an older version, so I brought it back.
  • SHARD_GRAD_OP is a bit different from DeepSpeed ZeRO Stage 2, so this PR adds a brief note about it.
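For context, here is a minimal sketch of the setup this implies, assuming a recent version of Accelerate; `build_model` and the learning rate are hypothetical stand-ins, not part of this PR's diff:

```python
import torch
from torch.distributed.fsdp import ShardingStrategy
from accelerate import Accelerator, FullyShardedDataParallelPlugin

# FSDP plugin with flattened (non-original) parameters. SHARD_GRAD_OP shards
# gradients and optimizer state while parameters stay unsharded across the
# forward and backward passes, which is where it differs from ZeRO Stage 2.
fsdp_plugin = FullyShardedDataParallelPlugin(
    use_orig_params=False,
    sharding_strategy=ShardingStrategy.SHARD_GRAD_OP,
)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)

model = build_model()  # hypothetical helper returning an nn.Module

# With use_orig_params=False, prepare (FSDP-wrap) the model FIRST ...
model = accelerator.prepare(model)
# ... then instantiate the optimizer on the wrapped model's flattened parameters ...
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# ... and prepare the optimizer afterwards. Creating the optimizer before
# preparing the model leaves it holding references to the original, now-replaced
# parameters: training runs, but the loss never decreases.
optimizer = accelerator.prepare(optimizer)
```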

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

cc @muellerzr

@SumanthRH SumanthRH marked this pull request as ready for review December 16, 2024 15:06
BenjaminBossan (Member) commented:

  • use_orig_params=False requires model preparation before optimizer instantiation. This is a silent failure - I noticed that loss didn't decrease in this case.

Nice catch, I didn't know about that.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
