Reading a webdataset from cloud storage #5785

Open
1 task done
omri-cavnue opened this issue Jan 16, 2025 · 3 comments
Labels
question Further information is requested

Comments

@omri-cavnue

Describe the question.

Hello,

How can I read a webdataset from cloud storage (e.g. S3 or GCS)? I have tried using a pipe, passing the gs:// URI, and other options in the sample code provided here, but I cannot open the webdataset. Is there a way to do this, or do I need to combine the webdataset reader with another reader?

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report
@omri-cavnue omri-cavnue added the question Further information is requested label Jan 16, 2025
@jantonguirao
Contributor

Hi. We support S3 storage, but not GCS at the moment. The example you are referring to is relevant: the webdataset reader can take S3 URIs instead of local paths. Could you please try with an S3 location?
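
For illustration, here is a minimal sketch of what passing S3 URIs to the webdataset reader might look like. The bucket name, prefix, shard naming scheme, and the `jpg`/`cls` component extensions are all assumptions, as are the helper names `shard_uris` and `build_s3_pipeline`. It also assumes the AWS credentials are available to the process through the usual environment configuration. DALI is imported lazily inside the builder, so the URI helper can be tried without a GPU.

```python
from typing import List


def shard_uris(bucket: str, prefix: str, num_shards: int) -> List[str]:
    """Build S3 URIs for shards named shard-00000.tar, shard-00001.tar, ...

    The naming scheme is hypothetical; adapt it to how your shards are stored.
    """
    return [f"s3://{bucket}/{prefix}/shard-{i:05d}.tar" for i in range(num_shards)]


def build_s3_pipeline(paths: List[str], batch_size: int = 8):
    """Create a DALI pipeline that reads webdataset shards from the given URIs."""
    # Imported here so that shard_uris() works on machines without DALI.
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=2, device_id=0)
    def pipe():
        # One output per requested extension; "jpg" and "cls" are assumed
        # component names inside each tar sample.
        jpegs, labels = fn.readers.webdataset(
            paths=paths,
            ext=["jpg", "cls"],
            missing_component_behavior="error",
        )
        images = fn.decoders.image(jpegs, device="mixed")
        return images, labels

    return pipe()
```

Usage would be along the lines of `build_s3_pipeline(shard_uris("my-bucket", "train", 4))`, followed by `build()` and `run()` on the returned pipeline.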

@omri-cavnue
Author

Hi @jantonguirao, we are using GCS. S3 was just an example in my description, so I would need GCS support. Is the alternative to use this then?

@jantonguirao
Contributor

Yes, if you are using GCS, then your only option is to use ExternalSource at the moment.
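
As a rough sketch of the ExternalSource route (not an official recipe), the tar shard can be parsed in Python and the raw bytes fed to `fn.external_source`. Everything named here is an assumption for illustration: the helpers `iter_wds_samples` and `build_external_source_pipeline`, the `jpg`/`cls` component extensions, and the `open_shard` callable, which is expected to return a binary file-like object for one shard (for GCS this could come from a client library such as `gcsfs` or `google-cloud-storage`). The parser follows the common webdataset convention that files sharing a name up to the first dot belong to one sample and are stored consecutively.

```python
import tarfile
from typing import BinaryIO, Callable, Dict, Iterator, Sequence, Tuple


def iter_wds_samples(
    stream: BinaryIO, exts: Sequence[str] = ("jpg", "cls")
) -> Iterator[Tuple[bytes, ...]]:
    """Yield one tuple of raw component bytes per sample from a tar stream."""
    current_key = None
    parts: Dict[str, bytes] = {}

    def complete(p: Dict[str, bytes]) -> bool:
        return all(e in p for e in exts)

    # "r|*" reads the archive sequentially, so non-seekable streams work too.
    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            dirname, _, filename = member.name.rpartition("/")
            stem, _, ext = filename.partition(".")
            key = f"{dirname}/{stem}" if dirname else stem
            if key != current_key:
                # A new key starts a new sample; emit the previous one if complete.
                if complete(parts):
                    yield tuple(parts[e] for e in exts)
                current_key, parts = key, {}
            parts[ext] = tar.extractfile(member).read()
    if complete(parts):
        yield tuple(parts[e] for e in exts)


def build_external_source_pipeline(
    open_shard: Callable[[], BinaryIO], batch_size: int = 8
):
    """Feed samples from one shard into DALI through external_source."""
    # Imported lazily so the parser above can be used without DALI installed.
    import numpy as np
    from nvidia.dali import fn, pipeline_def

    def source():
        with open_shard() as stream:
            for jpg, cls in iter_wds_samples(stream):
                yield (
                    np.frombuffer(jpg, dtype=np.uint8),
                    np.frombuffer(cls, dtype=np.uint8),
                )

    @pipeline_def(batch_size=batch_size, num_threads=2, device_id=0)
    def pipe():
        # batch=False: the source yields one sample at a time.
        jpegs, labels = fn.external_source(source=source, num_outputs=2, batch=False)
        images = fn.decoders.image(jpegs, device="mixed")
        return images, labels

    return pipe()
```

This sketch reads a single shard and leaves out shuffling, sharding across workers, and multi-shard iteration, all of which the built-in webdataset reader would otherwise handle. Incomplete samples (missing one of the requested extensions) are silently skipped here, which is a design choice you may want to change.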
