[Jobs] Allowing to specify intermediate bucket for file upload #4257
base: master
Conversation
Thanks @zpoint! Left some comments about the design, correctness and lifecycle management for the bucket contents.
sky/utils/schemas.py
Outdated
```python
jobs_configs['properties']['bucket'] = {
    'type': 'string',
    'pattern': '^(https|s3|gs|r2|cos)://.+',
    'required': []
}
```
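To make the pattern concrete, here is a tiny self-check (illustrative only; the pattern string is copied from the snippet above):

```python
import re

# Pattern copied from the schema snippet above: the bucket must be a store URL.
bucket_pattern = re.compile(r'^(https|s3|gs|r2|cos)://.+')

assert bucket_pattern.match('s3://my-intermediate-bucket')
assert bucket_pattern.match('gs://my-bucket/some/sub/path')
assert not bucket_pattern.match('my-bucket')  # bare bucket names are rejected
```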
Maybe we should include it in `controller_resources_schema` and enable this feature for SkyServe as well?
Should we add this to our docs https://github.com/skypilot-org/skypilot/blob/master/docs/source/reference/config.rst?
Added to `controller_resources_schema`. Could you point me to the code I'd need to change to enable this for SkyServe?
Thanks @zpoint! I'm trying to run this yaml:

The task output should show me the contents of my workdir and file_mount. Instead, I get:

Related to this: #4257 (comment)?
Co-authored-by: Romil Bhardwaj <[email protected]>
Thanks @zpoint - I tested the functionality with an external S3 bucket; it works well. Left some comments.
sky/data/storage.py
Outdated
```python
def from_store_url(cls,
                   store_url: str) -> Tuple[str, str, str, Dict[str, Any]]:
```
`MyClass.from_xyz()` implies the method will return a `MyClass` object. Can we move this out from the class into a helper in `data_utils`, or outside the class in `storage.py`? Please reuse `data_utils.split_xyz_path` to get the bucket name and subpath.
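For illustration, a minimal sketch of what such a module-level helper could look like (the name `split_store_url` and its return shape are assumptions, not the actual SkyPilot API):

```python
from typing import Tuple


def split_store_url(store_url: str) -> Tuple[str, str, str]:
    """Splits e.g. 's3://bucket/sub/path' into ('s3', 'bucket', 'sub/path')."""
    scheme, rest = store_url.split('://', maxsplit=1)
    bucket, _, sub_path = rest.partition('/')
    return scheme, bucket, sub_path


assert split_store_url('s3://my-bucket/job-123/workdir') == (
    's3', 'my-bucket', 'job-123/workdir')
```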
It has to be here; otherwise, either data_util or storage_util will cause a circular import.
Renamed.
Thanks @zpoint. Left some comments on the new interfaces added in the latest commits.
sky/data/storage.py
Outdated
```python
def delete_sub_path(self) -> None:
    """Removes objects from the sub path in the bucket."""
```
I feel `delete_sub_path` should be an internal method that is called by `delete` when a `_bucket_sub_path` is set.

In summary (see the sketch after this list):
- If a Storage object is initialized with `_bucket_sub_path` and the bucket already existed, `is_sky_managed` should be false and `Storage.delete()` should just delete the sub_path.
- If a Storage object is initialized with `_bucket_sub_path` and the bucket did not already exist, `is_sky_managed` should be true and `Storage.delete()` should delete the entire bucket.
- If a Storage object is not initialized with any `_bucket_sub_path`, we follow current behavior.

Then `delete_sub_path()` will not be needed as a public method of this class.
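A minimal sketch of the proposed semantics, using hypothetical stand-in names (the real class lives in `sky/data/storage.py`):

```python
from typing import Optional


class StorageSketch:
    """Stand-in illustrating the proposed Storage.delete() semantics."""

    def __init__(self, bucket_existed: bool,
                 bucket_sub_path: Optional[str] = None):
        self._bucket_sub_path = bucket_sub_path
        # Per the review: sky-managed only if we created the bucket ourselves.
        self.is_sky_managed = not bucket_existed

    def delete(self) -> None:
        if self._bucket_sub_path is not None and not self.is_sky_managed:
            # Bucket pre-existed: only remove objects under the sub-path.
            self._delete_sub_path()
        else:
            # We created the bucket (or no sub-path is set): delete it
            # entirely, following current behavior.
            self._delete_bucket()

    def _delete_sub_path(self) -> None:
        print(f'Deleting objects under {self._bucket_sub_path!r}')

    def _delete_bucket(self) -> None:
        print('Deleting the entire bucket')
```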
Yeah, sounds good
sky/utils/controller_utils.py
Outdated
```python
if store_type is not None:
    storage_obj.construct_store(store_type, region,
```
See other comment on `construct_store`. Instead of this, can we initialize the store above and pass it through the stores arg, `storage_lib.Storage(stores=..., ...)`?
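A rough sketch of the suggestion with stand-in classes (only `Storage(stores=...)` is quoted from the review; everything else here is a placeholder, not the SkyPilot API):

```python
from typing import Dict, Optional


class Store:
    """Placeholder for a concrete store handle."""

    def __init__(self, store_type: str, region: Optional[str] = None):
        self.store_type = store_type
        self.region = region


class Storage:
    """Placeholder for storage_lib.Storage, accepting stores up front."""

    def __init__(self, stores: Optional[Dict[str, Store]] = None):
        self.stores = stores or {}


# Instead of two-phase init (create Storage, then call construct_store()),
# build the store first and hand it to the constructor.
store_type, region = 's3', 'us-east-1'
storage_obj = Storage(stores={store_type: Store(store_type, region)})
```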
Co-authored-by: Romil Bhardwaj <[email protected]>
```python
# Check that the bucket is empty: all files under the sub-directory
# should have been deleted.
store = tmp_local_storage_obj_with_sub_path.stores[store_type]
store._delete_sub_path()
```
When I test the delete path for Azure, this exception occurs. However, AWS and GCP pass. Azure worked before, so could it be an account permission problem? @Michaelvll @romilbhardwaj

```
[HttpResponseError] This request is not authorized to perform this operation using this permission.
RequestId:0aa67d01-401e-0048-3416-4ef84d000000
Time:2024-12-14T10:55:17.3755800Z
ErrorCode:AuthorizationPermissionMismatch
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>AuthorizationPermissionMismatch</Code><Message>This request is not authorized to perform this operation using this permission.
RequestId:0aa67d01-401e-0048-3416-4ef84d000000
Time:2024-12-14T10:55:17.3755800Z</Message></Error>
```
Hmm, I just checked the Azure portal and you should have full permissions...
Feature for #3978

- Support specifying an intermediate bucket for file uploads via `jobs: bucket` in `~/.sky/config.yaml`. Supported stores: s3, gcs, azure, r2, ibm.
- If not specified, temporary buckets named `skypilot-filemounts-{username}-{run_id}` are auto-created, as before.
- Add `_bucket_sub_path` (supports both `COPY` and `MOUNT` based on the `_bucket_sub_path`; the user will only see the subpath instead of the whole bucket if a subpath is specified). The leading `_` means internal use only; remove the `_` to expose it in the future.

Tests:
- `pytest -s tests/test_smoke.py::test_managed_jobs_intermediate_storage`
- `pytest -s tests/test_smoke.py::TestStorageWithCredentials::test_bucket_sub_path --gcp` verifies `_bucket_sub_path` works as expected.

For managed jobs:
Before (without an intermediate bucket):
```
workdir    -> {tmp-bucket-A}/workdir
file_mount -> {tmp-bucket-B}{i}
tmp_files  -> {tmp-bucket-C}/tmp-files
```

After (with an intermediate bucket):
```
workdir    -> {intermediate-bucket}/job-{run_id}/workdir
file_mount -> {intermediate-bucket}/job-{run_id}/local-file-mounts/{i}
tmp_files  -> {intermediate-bucket}/job-{run_id}/tmp-files
```
The `{intermediate-bucket}/job-{run_id}` sub-path will be deleted automatically, just as the auto-created buckets were before. Even when using the same bucket, each job creates its own subdirectory based on the `{run_id}`, and the subdirectories for auto-created files are cleaned up after the job finishes.
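Purely illustrative, the per-job isolation boils down to prefixing every upload with a run-specific sub-path (the helper name here is an assumption, not SkyPilot code):

```python
def job_sub_path(bucket_url: str, run_id: str, kind: str) -> str:
    """Builds the per-job prefix described above, e.g. for 'workdir'."""
    return f'{bucket_url}/job-{run_id}/{kind}'


assert job_sub_path('s3://my-bucket', 'abc123', 'workdir') == (
    's3://my-bucket/job-abc123/workdir')
```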
This allows users who don't have permission to create a bucket to manually specify a bucket name in `~/.sky/config.yaml`.

Test plan
- Smoke test: the pytest commands above.
- Custom test:
```
(sky) ➜ cat ~/.sky/config.yaml
jobs:
  bucket: s3://zpoint-bucket-s3/
```
Launch: looks good.
Tested (run the relevant ones):
- `bash format.sh`
- `pytest tests/test_smoke.py`
- `pytest tests/test_smoke.py::test_fill_in_the_name`
- `conda deactivate; bash -i tests/backward_compatibility_tests.sh`