add workflow for syncing #76
Signed-off-by: Gerald Shen <[email protected]>
I'm not familiar enough with workflows to be able to review the yaml config, but at first glance it seems reasonable.
Signed-off-by: Gerald Shen <[email protected]>
Co-authored-by: Olivier Delalleau <[email protected]>
Thanks! I think it should be okay. I got it from the GitHub Marketplace, so it should work well.
@@ -318,7 +318,7 @@ def on_load_checkpoint(self, checkpoint) -> None:
"""
# mcore uses distributed checkpointing
if "state_dict" in checkpoint and checkpoint["state_dict"]:
    for index, module in enumerate(self.get_gpt_module_list()):
whoops: will remove once everything else looks good to you
@@ -38,7 +38,7 @@ def set_sync_funcs(ptl_model, forward_only):
param_sync_func = ptl_model.sync_overlap_parameters

# pipeline schedules will get these from ptl_model.model.config
for module in ptl_model.get_gpt_module_list():
whoops again: will remove once everything else looks good to you
Good to merge after removing the get_model_module_list() calls.
What does this PR do?
dev should be a branch that tracks main but also has all the commits required whenever NeMo changes things. We will sync this branch with main every release.