Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download from HuggingFace with private token #390
Download from HuggingFace with private token #390
Changes from all commits
4bda3db
e2aa785
7de66e9
b69db89
ce06288
8cf49c4
918efa1
a18fd37
1ea8605
632e76f
7dc0859
8d132e0
7266d4b
77a53e3
a2a7e3f
8440d47
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 117 in src/metatrain/utils/io.py
Codecov / codecov/patch
src/metatrain/utils/io.py#L116-L117
Check warning on line 124 in src/metatrain/utils/io.py
Codecov / codecov/patch
src/metatrain/utils/io.py#L124
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing string manipulation, should we pass more structured information to this code? Copied from my comment on the overall PR:
maybe a better interface here would look like
mtt export --mode huggingface --hf-api-token=.... --hf-owner=who/what --hf-file=file-path
(very bad syntax, just for the idea. This could also behuggingface://who/what/file-path
orhuggingface://token@who/what/file-path
or whatever).I.e. instead of trying to guess the info we need, ask the use about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because HuggingFace gives you the download link, so that's what most users will use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asking for the "split information" on the command line is just more work for the user
Check warning on line 137 in src/metatrain/utils/io.py
Codecov / codecov/patch
src/metatrain/utils/io.py#L137
Check warning on line 144 in src/metatrain/utils/io.py
Codecov / codecov/patch
src/metatrain/utils/io.py#L143-L144
Check warning on line 148 in src/metatrain/utils/io.py
Codecov / codecov/patch
src/metatrain/utils/io.py#L148
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the code properly, this will download
repo/owner/.../main/some/path/to/file.ckpt
to<cwd>/some/path/to/file.ckpt
. Is this intended? Otherwise we should be usingos.path.basename(filename)
as the destinationThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont understand why this is required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if you want the file to be directly downloaded to the current directory you can do
See https://docs.python.org/3/library/urllib.request.html#urllib.request.urlretrieve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is required for fine-tuning.
mtt export
only makes the.pt
available in the current directory, but that's useless for fine-tuningThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change this to use
filename
inurlretrieve
, and add the same INFO message about adding a file to cwd?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail for people running the tests locally, unless they have access to the repo & a token set, right?
Could we skip the test if
HUGGINGFACE_TOKEN_METATRAIN
is unset? And set it on CI/locally as needed. (using a different name in case people are already usingHUGGINGFACE_TOKEN
for some other reason.