Make fsdp ram efficient loading optional #2037
Thanks for the fix! I added a few suggestions to try to make a few points clearer; let's see if we can improve it a bit :)
Thanks! cc @BenjaminBossan for a secondary review :)
Looks good, thanks Sourab. I only have two small nits, up to you if you want to address them.
Great, thanks for enhancing the doc of the parameter.
What does this PR do?
Addresses issues with `meta` devices during pre-trained model loading. Fixes #1948 (Error downloading wav2vec2 model when using accelerate launch with FSDP) and #2031 (NotImplementedError: Cannot copy out of meta tensor; no data! - while using FSDP [works with DDP and DeepSpeed]) by making the RAM-efficient loading optional.
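To illustrate what "optional" could look like, here is a minimal sketch of an opt-in toggle. It is not the code from this PR: the environment variable name `EXAMPLE_FSDP_RAM_EFFICIENT_LOADING` and the helpers `ram_efficient_loading_enabled` and `choose_load_device` are hypothetical, and the rule that only non-zero ranks use the `meta` device is an assumption about how RAM-efficient loading typically works.

```python
import os

# Hypothetical variable name, used only for this sketch (not the library's actual setting).
ENV_VAR = "EXAMPLE_FSDP_RAM_EFFICIENT_LOADING"


def str_to_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "y", "on"}


def ram_efficient_loading_enabled(default: bool = False) -> bool:
    """Return True only if the user has explicitly opted in via the environment."""
    raw = os.environ.get(ENV_VAR)
    if raw is None:
        return default
    return str_to_bool(raw)


def choose_load_device(rank: int) -> str:
    """Pick the device on which a rank materializes pre-trained weights.

    Assumption: with RAM-efficient loading, rank 0 loads real weights on CPU
    and the other ranks create empty `meta` tensors that are filled in later.
    Without it, every rank loads real weights, which avoids meta-tensor copy
    errors at the cost of more CPU RAM.
    """
    if ram_efficient_loading_enabled() and rank != 0:
        return "meta"
    return "cpu"
```

With the variable unset, every rank falls back to loading on `cpu`, which is the behavior that sidesteps the `Cannot copy out of meta tensor` class of errors in this sketch.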