Add video support #3089
I remember you mentioned at one point that some video datasets have a single video, with each row referring to a range [start, end], which would be accessed as "media fragments" in HTML: https://developer.mozilla.org/en-US/docs/Web/Media/Audio_and_video_delivery#specifying_playback_range. Is it still a thing, or do we just have one row = one video?
one row = one video (we'll see later if we need to support other cases)
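For reference, if the "one video, many ranges" case ever needs support, the media-fragment form mentioned above is just a `#t=start,end` suffix (in seconds) on the video URL. A minimal sketch, assuming a hypothetical helper name `with_playback_range` and an example URL that are not part of this PR:

```python
from typing import Optional
from urllib.parse import urldefrag


def _seconds(value: float) -> str:
    # Render seconds without trailing zeros, e.g. 10 -> "10", 12.5 -> "12.5".
    return f"{value:.3f}".rstrip("0").rstrip(".")


def with_playback_range(video_url: str, start: float, end: Optional[float] = None) -> str:
    """Return video_url with a temporal media fragment (#t=start[,end]).

    Hypothetical helper for illustration only; not part of this PR.
    """
    if start < 0 or (end is not None and end <= start):
        raise ValueError("expected 0 <= start < end")
    base, _ = urldefrag(video_url)  # drop any fragment already on the URL
    fragment = f"t={_seconds(start)}"
    if end is not None:
        fragment += f",{_seconds(end)}"
    return f"{base}#{fragment}"


# One shared video, one row per range (example URL is made up).
print(with_playback_range("https://example.com/video.mp4", 10, 20))
```

With this approach every row would share the same file and differ only in the fragment, so the viewer could point a `<video>` element at the returned URL.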
This reverts commit 03060f3.
videos in /first-rows and /rows will be like
since videos are stored on the Hub as files in 99% of cases :)
(otherwise the back-ends store videos in the assets on S3 as usual, e.g. if they come from Parquet, WebDataset, or zipped data, but this is NOT recommended)
In particular, the pinned `datasets` implements the Video type with `decord` and, when preparing Parquet files for /rows, does NOT embed video bytes in the Parquet files (HF URL only) if the file is on HF.
Note that I also disable video decoding in `get_rows()` so that /first-rows does not have to download the video data (HF URL only as well).
cc @AndreaFrancis for viz