# Data Preparation

This document describes how we prepare the AVA, EPIC-Kitchens, and Charades datasets.

Note: After finishing the steps below, please verify that the images under frames are consistent with the "frame lists". Using frames extracted at a different FPS or resolution might result in different performance.
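As a quick consistency check, the sketch below builds a tiny synthetic frames tree and frame list, then confirms that every listed frame exists on disk. It assumes (hypothetically) that the fourth space-separated column of a frame list row is a frame path relative to the frames folder; adjust the field index to match the actual frame lists you download.

```shell
#!/bin/sh
# Sketch: verify that every frame referenced in a frame list exists on disk.
# The frame list column layout here is an assumption; check your downloaded files.
set -eu

# Build a tiny synthetic example so the sketch is self-contained.
root=$(mktemp -d)
mkdir -p "$root/frames/vid0" "$root/frame_lists"
touch "$root/frames/vid0/vid0_000001.jpg" "$root/frames/vid0/vid0_000002.jpg"
cat > "$root/frame_lists/train.csv" <<'EOF'
vid0 0 0 vid0/vid0_000001.jpg ""
vid0 0 1 vid0/vid0_000002.jpg ""
EOF

missing=0
# Assumed layout: the 4th field is the frame path relative to frames/.
while read -r _ _ _ path _; do
  [ -f "$root/frames/$path" ] || { echo "missing: $path"; missing=$((missing + 1)); }
done < "$root/frame_lists/train.csv"

echo "missing frames: $missing"
rm -rf "$root"
```

Run the same loop over your real train.csv and val.csv; any non-zero count means the frames and frame lists are out of sync.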

## License

All models, their output detections, and their output features available for download through this document are licensed under the Creative Commons Attribution-ShareAlike 3.0 license.

## AVA

We assume that the AVA dataset is placed at data/ava with the following structure.

```
ava
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]_000001.jpg
|  |  |_ [video name 0]_000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]_000001.jpg
|     |_ [video name 1]_000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
   |_ [official AVA annotation files]
   |_ ava_train_predicted_boxes.csv
   |_ ava_val_predicted_boxes.csv
```

You can prepare this structure with the following steps or by creating symlinks to your data.
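If your frames already live elsewhere, you can create the skeleton and symlink them in rather than re-extracting; a minimal sketch, where AVA_ROOT and EXISTING_FRAMES are illustrative variables, not part of the provided scripts:

```shell
#!/bin/sh
# Sketch: create the expected data/ava skeleton, optionally symlinking an
# existing frames directory instead of extracting frames again.
set -eu
AVA_ROOT=${AVA_ROOT:-data/ava}

mkdir -p "$AVA_ROOT/frame_lists" "$AVA_ROOT/annotations"

# If frames already exist elsewhere, link them into place.
if [ -n "${EXISTING_FRAMES:-}" ] && [ ! -e "$AVA_ROOT/frames" ]; then
  ln -s "$EXISTING_FRAMES" "$AVA_ROOT/frames"
else
  mkdir -p "$AVA_ROOT/frames"
fi
```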

1. Download videos:

   ```
   cd dataset_tools/ava
   ./download_videos.sh
   ```

   (These video files take 157 GB of space.)

2. Cut each video from its 15th to its 30th minute:

   ```
   ./cut_videos.sh
   ```

3. Extract frames:

   ```
   ./extract_frames.sh
   ```

   (These frames take 392 GB of space.)

4. Download annotations:

   ```
   ./download_annotations.sh
   ```

5. Download "frame lists" (train, val) and put them in the frame_lists folder (see structure above).

6. Download person boxes (train, val, test) and put them in the annotations folder (see structure above). If you prefer to use your own person detector, please see GETTING_STARTED.md for details.
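For reference, the per-video work in the cutting and extraction steps can be sketched with plain ffmpeg. The directory names and the 30 FPS rate below are assumptions; defer to the provided scripts for the exact parameters.

```shell
#!/bin/sh
# Sketch: cut minute 15-30 of each video, then dump JPEG frames.
# IN_DIR, OUT_DIR, and FRAME_DIR are illustrative; the real scripts define their own.
set -eu
IN_DIR=${IN_DIR:-videos}
OUT_DIR=${OUT_DIR:-videos_15min}
FRAME_DIR=${FRAME_DIR:-data/ava/frames}
mkdir -p "$OUT_DIR" "$FRAME_DIR"

if command -v ffmpeg >/dev/null 2>&1; then
  for video in "$IN_DIR"/*; do
    [ -f "$video" ] || continue
    name=$(basename "${video%.*}")
    # Keep the 15th-30th minute; AVA annotations cover only this window.
    ffmpeg -ss 900 -t 901 -i "$video" "$OUT_DIR/$name.mp4"
    # Dump frames named <video>_000001.jpg, ... (30 FPS is an assumption here).
    mkdir -p "$FRAME_DIR/$name"
    ffmpeg -i "$OUT_DIR/$name.mp4" -r 30 -q:v 1 "$FRAME_DIR/$name/${name}_%06d.jpg"
  done
else
  echo "ffmpeg not found; install it before cutting or extracting" >&2
fi
```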

## EPIC-Kitchens

We assume that the EPIC-Kitchens dataset is placed at data/epic with the following structure.

```
epic
|_ frames
|  |_ P01
|  |  |_ P01_01_000001.jpg
|  |  |_ ...
|  |_ ...
|  |_ P31
|     |_ P31_01_000001.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
|  |_ [official EPIC-Kitchens annotation files]
|_ noun_lfb
   |_ train_lfb.pkl
   |_ val_lfb.pkl
```

You can prepare this structure with the following steps or by creating symlinks to your data.

1. Download videos with https://github.com/epic-kitchens/download-scripts/blob/master/download_videos.sh

2. Extract frames (modify the script to match your data path):

   ```
   cd dataset_tools/epic
   ./extract_epic_frames.sh
   ```

   (These frames take 147 GB of space.)

3. Download annotations:

   ```
   cd [path/to/video-lfb/root]
   mkdir -p data/epic
   git clone https://github.com/epic-kitchens/annotations.git data/epic/annotations
   ```

4. Download "frame lists" (train, val) and put them in the frame_lists folder (see structure above).

5. Download the pre-computed "Noun LFB" (train, val) and put them in the noun_lfb folder (see structure above). If you prefer to train the detector yourself, please see GETTING_STARTED.md for details.
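After extraction, a quick per-participant frame count can catch missing or incompletely extracted videos early. A sketch, assuming the layout above (the root path is illustrative):

```shell
#!/bin/sh
# Sketch: count extracted EPIC-Kitchens frames per participant directory (P01..P31).
set -eu
EPIC_FRAMES=${EPIC_FRAMES:-data/epic/frames}

total=0
if [ -d "$EPIC_FRAMES" ]; then
  for p in "$EPIC_FRAMES"/P*; do
    [ -d "$p" ] || continue
    n=$(find "$p" -name '*.jpg' | wc -l)
    echo "$(basename "$p"): $n frames"
    total=$((total + n))
  done
fi
echo "total: $total"
```

A participant directory with zero frames usually means its videos failed to download or extract.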

## Charades

We assume that the Charades dataset is placed at data/charades with the following structure.

```
charades
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]-000001.jpg
|  |  |_ [video name 0]-000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]-000001.jpg
|     |_ [video name 1]-000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
```

You can prepare this structure with the following steps or by creating symlinks to your data.

1. Download RGB frames from http://ai2-website.s3.amazonaws.com/data/Charades_v1_rgb.tar and put them in (or create a symbolic link at) data/charades/frames. (The downloaded tarball takes 76 GB, and the extracted frames take 85 GB of space.)

2. Download "frame lists" (train, val) and put them in the frame_lists folder (see structure above).
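The two steps above can be sketched as follows. The top-level directory name inside the tar is a guess; inspect the archive with `tar -tf` and adjust before unpacking.

```shell
#!/bin/sh
# Sketch: unpack the Charades RGB frames as data/charades/frames.
# Needs ~76 GB for the tar and ~85 GB for the extracted frames.
set -eu
CHARADES_ROOT=${CHARADES_ROOT:-data/charades}
mkdir -p "$CHARADES_ROOT/frame_lists"

TARBALL=Charades_v1_rgb.tar
if [ -e "$CHARADES_ROOT/frames" ]; then
  echo "frames already in place" >&2
elif [ ! -f "$TARBALL" ]; then
  echo "download $TARBALL (~76 GB) first:" >&2
  echo "  wget http://ai2-website.s3.amazonaws.com/data/Charades_v1_rgb.tar" >&2
else
  tar -xf "$TARBALL" -C "$CHARADES_ROOT"
  # Assumed top-level directory name inside the tar; verify with `tar -tf`.
  if [ -d "$CHARADES_ROOT/Charades_v1_rgb" ]; then
    mv "$CHARADES_ROOT/Charades_v1_rgb" "$CHARADES_ROOT/frames"
  fi
fi
```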