functional approach with distributed training #5
Comments
Hi @kevinlin311tw, sure, I can add an example in a day or two. As a side note, the functional approach itself is actually agnostic to parallelism: you only need to wrap your encoder model and do the cross-process communication in the loss function. Maybe this comment will be helpful if you want to give it a try yourself.
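To illustrate the idea, here is a minimal sketch of what "cross-process communication in the loss function" can look like under DDP. This is not the repository's exact code: the helper names `gather_with_grad` and `contrastive_loss` are illustrative, and it assumes `torch.distributed` is already initialized and that the encoder itself is wrapped in DistributedDataParallel as usual.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

def gather_with_grad(t: torch.Tensor) -> torch.Tensor:
    # all_gather does not propagate gradients from other ranks, so splice the
    # local (grad-tracking) tensor back into its own slot after gathering.
    gathered = [torch.zeros_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t.contiguous())
    gathered[dist.get_rank()] = t
    return torch.cat(gathered, dim=0)

def contrastive_loss(query_reps: torch.Tensor, key_reps: torch.Tensor, temperature: float = 0.05):
    # The cross-process communication lives here, in the loss function, so the
    # encoder forward pass itself stays oblivious to parallelism.
    q = gather_with_grad(query_reps)
    k = gather_with_grad(key_reps)
    scores = q @ k.t() / temperature  # in-batch negatives over the global batch
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(scores, labels)
```

Each rank back-propagates only through its local representations, while the loss itself is computed over the globally gathered batch.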
I've added an example in the readme, along with a new all-gather decorator that may be helpful. Feel free to ping me if you have any questions or find any problems with the code.
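For readers without the readme at hand, a decorator of that flavor could be sketched roughly as below. This is an illustration of the pattern rather than the repository's implementation; the name `gather_input_tensors` is made up here, and the real decorator may differ in signature and behavior.

```python
import functools
import torch
import torch.distributed as dist

def gather_input_tensors(loss_fn):
    """All-gather every positional tensor argument before calling the loss,
    so single-process loss code can be reused unchanged under DDP."""
    @functools.wraps(loss_fn)
    def wrapped(*tensors: torch.Tensor, **kwargs):
        gathered_args = []
        for t in tensors:
            chunks = [torch.zeros_like(t) for _ in range(dist.get_world_size())]
            dist.all_gather(chunks, t.contiguous())
            chunks[dist.get_rank()] = t  # keep the grad-tracking local tensor
            gathered_args.append(torch.cat(chunks, dim=0))
        return loss_fn(*gathered_args, **kwargs)
    return wrapped
```

Applied to a plain in-batch contrastive loss, this keeps the training loop identical to the single-GPU case apart from the decorator.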
@luyug I wonder if you have any workable example of using the functional approach with DDP?
According to this post, DDP doesn't seem to support multiple forward/backward calls. Can you confirm whether that is the case and/or suggest any solutions? Thank you.
Thank you for the great work!
Could you please provide some examples of the functional approach with distributed multi-GPU training?