Add ability to log model instead of saving to folder #1683

Open
wants to merge 18 commits into
base: main

irenedea (Contributor) commented Dec 2, 2024

Manual Test

test-log-model-no-save-hNNfeX
https://dbc-559ffd80-2bfc.cloud.databricks.com/ml/experiments/482477677751793/runs/ea44b38569974edf8573b3f66558a15f?o=7395834863327820

Two models were logged, one per batch: the model from the last batch (ba10) is registered, while the ba5 model is only logged.
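The outcome of this manual test can be sketched as a small policy: every checkpointed model is logged, and only the final one is also registered. This is an illustration only, not the code in llmfoundry/callbacks/hf_checkpointer.py. `CheckpointAction` and `plan_checkpoint_actions` are hypothetical names, and the rule "register only the last batch" is inferred from the run described above rather than taken from the implementation.

```python
from dataclasses import dataclass


@dataclass
class CheckpointAction:
    """Hypothetical record of what happens to one checkpointed model."""
    batch: int
    logged: bool      # model is logged to the experiment run
    registered: bool  # model is additionally registered


def plan_checkpoint_actions(checkpoint_batches: list[int]) -> list[CheckpointAction]:
    """Log every checkpoint; register only the one from the final batch.

    Assumption: the final batch is the largest batch number in the list.
    """
    if not checkpoint_batches:
        return []
    final_batch = max(checkpoint_batches)
    return [
        CheckpointAction(batch=b, logged=True, registered=(b == final_batch))
        for b in sorted(checkpoint_batches)
    ]


# Mirrors the manual test: checkpoints at ba5 and ba10.
for action in plan_checkpoint_actions([5, 10]):
    status = "logged and registered" if action.registered else "logged only"
    print(f"ba{action.batch}: {status}")
```

With checkpoints at batches 5 and 10, the sketch marks ba5 as logged only and ba10 as logged and registered, matching the run above.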

@irenedea irenedea marked this pull request as ready for review December 4, 2024 20:15
@irenedea irenedea requested a review from a team as a code owner December 4, 2024 20:15
Four review threads on llmfoundry/callbacks/hf_checkpointer.py (three outdated), all resolved.
@irenedea irenedea requested a review from dakinggg December 6, 2024 22:28
dakinggg (Collaborator) commented Dec 7, 2024

GPU tests failed; will wait to review until CI passes.
