add workflow for syncing #76

Closed
wants to merge 4 commits into from
@gshennvm gshennvm commented Jan 8, 2024

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

dev should be a branch that tracks main but also has all the commits made whenever NeMo changes things. We will sync this branch with main every release.

Signed-off-by: Gerald Shen <[email protected]>
@gshennvm gshennvm requested a review from odelalleau January 8, 2024 22:17
@github-actions github-actions bot added the CI label Jan 8, 2024
Signed-off-by: Gerald Shen <[email protected]>

@odelalleau odelalleau left a comment


I'm not familiar enough with workflows to be able to review the yaml config, but at first glance it seems reasonable.

CONTRIBUTING.md (outdated, resolved)
@github-actions github-actions bot added the Utils label Jan 9, 2024
Co-authored-by: Olivier Delalleau <[email protected]>

gshennvm commented Jan 9, 2024

> I'm not familiar enough with workflows to be able to review the yaml config, but at first glance it seems reasonable.

Thanks! I think it should be okay; I got it from the GitHub Marketplace, so it should work well.

@@ -318,7 +318,7 @@ def on_load_checkpoint(self, checkpoint) -> None:
        """
        # mcore uses distributed checkpointing
        if "state_dict" in checkpoint and checkpoint["state_dict"]:
            for index, module in enumerate(self.get_gpt_module_list()):
gshennvm (author) commented:

whoops: will remove once everything else looks good to you

@@ -38,7 +38,7 @@ def set_sync_funcs(ptl_model, forward_only):
        param_sync_func = ptl_model.sync_overlap_parameters

    # pipeline schedules will get these from ptl_model.model.config
    for module in ptl_model.get_gpt_module_list():
gshennvm (author) commented:

whoops again: will remove once everything else looks good to you


@odelalleau odelalleau left a comment


Good to merge after removing the get_model_module_list() calls

2 participants