Provide clearer documentation for data download operations #323

Open
anth-volk opened this issue Dec 19, 2024 · 0 comments

Comments

@anth-volk
Contributor

At the moment, the data download operations we use are unclear, poorly documented, and worryingly loosely typed. For example, `dataset` can be a string, a `Dataset`, or `None`, and we make many in-place modifications that change its type without making this clear. We also appear to try to support every kind of interface and implementation without documenting which ones actually work. For example, it seems I could initialize a simulation and override our `dataset` option to (attempt to) download literally any Hugging Face dataset by prefixing it with `hf://`, though this will fail because the dataset must exist within our Hugging Face repo.
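To make the concern concrete, here is a minimal sketch of what an explicitly typed resolution step could look like. It is not the current implementation. Every name in it (`resolve_dataset`, `parse_hf_url`, `RemoteDatasetRef`, the stand-in `Dataset` class, and the `ALLOWED_REPO` value) is hypothetical and chosen only for illustration. The idea is to return a new, clearly typed value instead of mutating the input in place, and to reject `hf://` URLs outside our repo up front rather than failing at download time.

```python
from dataclasses import dataclass
from typing import Optional, Union

HF_PREFIX = "hf://"
# Hypothetical placeholder: the real repo name is not stated in this issue.
ALLOWED_REPO = "example-org/example-data"


@dataclass(frozen=True)
class Dataset:
    """Stand-in for the project's Dataset type (hypothetical)."""
    name: str


@dataclass(frozen=True)
class RemoteDatasetRef:
    """A validated hf:// reference that has not been downloaded yet."""
    repo: str
    filename: str


def parse_hf_url(url: str) -> RemoteDatasetRef:
    """Split 'hf://<owner>/<repo>/<file>' and check it targets the allowed repo."""
    if not url.startswith(HF_PREFIX):
        raise ValueError(f"Not an hf:// URL: {url!r}")
    path = url[len(HF_PREFIX):]
    # Everything before the last '/' is treated as the repo, the rest as the file.
    repo, sep, filename = path.rpartition("/")
    if not sep or not repo or not filename:
        raise ValueError(f"Expected hf://<owner>/<repo>/<file>, got {url!r}")
    if repo != ALLOWED_REPO:
        raise ValueError(
            f"Dataset must live in {ALLOWED_REPO!r}, but URL points to {repo!r}"
        )
    return RemoteDatasetRef(repo=repo, filename=filename)


def resolve_dataset(
    dataset: Optional[Union[str, Dataset]],
    default: Dataset,
) -> Union[Dataset, RemoteDatasetRef]:
    """Map the loosely typed input to one of two explicit result types.

    The input is never modified; callers get a new value whose type says
    whether a download is still needed.
    """
    if dataset is None:
        return default
    if isinstance(dataset, Dataset):
        return dataset
    if isinstance(dataset, str):
        if dataset.startswith(HF_PREFIX):
            return parse_hf_url(dataset)
        # Any other string is treated as the name of a known dataset.
        return Dataset(name=dataset)
    raise TypeError(f"Unsupported dataset type: {type(dataset).__name__}")
```

With something along these lines, overriding the option with an arbitrary `hf://` URL would raise a `ValueError` that names the allowed repo, and the return type would document which inputs are supported.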
