Feature request: Datasets with only validated recordings #27

Open
soliviantar opened this issue Oct 22, 2023 · 0 comments

I've already posted this in the main repo, but seeing #26 here makes me think this might be the more appropriate place to request it.

When downloading a dataset, one must download the whole set (or a delta), including all sentences and recordings, validated or not, even if only the validated data is needed. This consumes a lot of bandwidth, time and disk space, and it is not environmentally friendly either.

Offering the option to download only the validated recordings would save a lot of time and make the data accessible to more people. Being able to download only the TSV files would also be a good addition, but this is already addressed in #26.

I don't know how complex this would be to implement, but I feel it would be a very useful quality-of-life feature, so I hope it is taken into consideration.

Thanks for your work on this amazing project in any case!
