Skip to content

vzhou842/easy-VQA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

62 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

easy-vqa

Build Status PyPI

The official repository for the Easy Visual Question Answering (easy-VQA) dataset. Contains:

  • the official Python package for the dataset
  • the source code for generating the dataset

Read the easy-VQA blog post for more.

About the Dataset

easy-VQA contains

  • 4,000 train images and 38,575 train questions.
  • 1,000 test images and 9,673 test questions.
  • 13 total possible answers.
  • 28,407 training questions that are yes/no.
  • 7,136 testing questions that are yes/no.

All images are 64x64 color images. See a live demo of a model trained on the dataset.

Example Images

(these image links above only work on Github)

Example Questions

  • What color is the rectangle?
  • Does the image contain a triangle?
  • Is no blue shape present?
  • What shape does the image contain?

Installing the Package

pip install easy-vqa

Using the Package

Questions

Each question has 3 parts:

  • the question text
  • the answer
  • the image ID

The question getters return corresponding arrays for each of the 3 parts:

from easy_vqa import get_train_questions, get_test_questions

train_questions, train_answers, train_image_ids = get_train_questions()
test_questions, test_answers, test_image_ids = get_test_questions()

# Question 0 is at index 0 for all 3 arrays:
print(train_questions[0]) # what shape does the image contain?
print(train_answers[0])   # circle
print(train_image_ids[0]) # 0

Images

The image path getters return dicts that map image ID to absolute paths that can be used to load the image.

from easy_vqa import get_train_image_paths, get_test_image_paths

train_image_paths = get_train_image_paths()
test_image_paths = get_test_image_paths()

print(train_image_paths[0]) # ends in easy_vqa/data/train/images/0.png

Answers

The answers getter returns an array of all possible answers.

from easy_vqa import get_answers

answers = get_answers()

print(answers) # ['teal', 'brown', 'black', 'gray', 'yes', 'blue', 'rectangle', 'yellow', 'triangle', 'red', 'circle', 'no', 'green']

Generating the Dataset

The easy-VQA dataset was generated by running

python gen_data/generate_data.py

which writes to the easy_vqa/data/ directory. Be sure to install the dependencies for dataset generation running generate_data.py:

pip install -r gen_data/requirements.txt

If you want to generate a larger easy-VQA dataset, simply modify the NUM_TRAIN and NUM_TEST constants in generate_data.py. Otherwise, if you want to modify the dataset itself, the files and code in the gen_data/ directory should be pretty self-explanatory.