-
-
Notifications
You must be signed in to change notification settings - Fork 926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Determine FSDP/deepspeed settings on device select. #883
Conversation
Without this, the OS env check for accelerate will fail.
Could you provide an example of the error that you're trying to fix and how to reproduce that? |
It's not a direct error, but a bug. Perhaps the behavior is intended, in which case it should be fixed in some other way. In either case, unless I'm mistaken, the if case in will never be true, unless the user manually does because |
I'm not so clear about that line. As I recall, the only env for that is |
thanks @kallewoof, I made a change to your PR, mostly on naming the new function and calling it between the config validation and normalization step (where the device setup happens) |
@winglian Looks good to me. |
Actually, the normalization and validation should be swapped. Right now if you have e.g. a Edit: this:
|
I'll address this in a different PR |
I think I just fixed this recently. #896 |
No, you circumvented one of the symptoms, but the underlying issue still remains. Normalization should happen before validation, to avoid having to put in code like in #896. |
The reason we validate before normalization is because any errors that need to be surfaced to the user could be harder to understand since what they input could be indirectly different from the validation after normalization. I'm open to figuring out a better way. |
That makes sense. I think the opposite is the case too, though, where a user may become unnecessarily confused because an error or warning is raised that ultimately does not reflect the actual parameters used in training. Such as the VRAM warning mentioned above. |
…#883) * Determine FSDP/deepspeed settings on device select. Without this, the OS env check for accelerate will fail. * rename and move env setup call * chore: lint --------- Co-authored-by: Karl-Johan Alm <[email protected]> Co-authored-by: Wing Lian <[email protected]>
* Determine FSDP/deepspeed settings on device select. Without this, the OS env check for accelerate will fail. * rename and move env setup call * chore: lint --------- Co-authored-by: Karl-Johan Alm <[email protected]> Co-authored-by: Wing Lian <[email protected]>
Without this, the OS env check for accelerate will fail.