Datasets

  • Conceptual Captions link

  • Flickr 30k link

  • TIMIT Speech corpus link

I think we should use the Flickr 30k dataset, since its 30k images should be enough for the limited time we have.

Once you have downloaded the Flickr dataset, extract it, then run the resize script located in the flickr30k_images folder. Run it from inside that folder:

bash resize_images.sh
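
The extract-and-resize steps above could be wrapped up roughly as follows. This is only a sketch: the function name `prepare_flickr` and the archive name are made up for illustration, and it assumes the download is a gzip-compressed tarball whose contents end up in the flickr30k_images folder next to resize_images.sh.

```shell
# Sketch: extract a downloaded archive, then run the resize script in place.
# Assumptions: gzip-compressed tarball; hypothetical archive name passed as $1.
prepare_flickr() {
    archive="$1"
    # Extract only if the archive is present; skip if already extracted.
    if [ -f "$archive" ]; then
        tar -xzf "$archive" || return 1
    fi
    # resize_images.sh is meant to be run from inside flickr30k_images,
    # so change into it in a subshell to leave the caller's directory alone.
    ( cd flickr30k_images && bash resize_images.sh )
}

# Example call (the file name is a placeholder, not taken from the dataset page):
# prepare_flickr flickr30k_images.tar.gz
```

If the download is a zip file or unpacks to a differently named folder, the `tar` line and the folder name would need to change.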