Created by Julia Peyre at INRIA, Paris.
This is the code for the paper:
Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic. Detecting Unseen Visual Relations Using Analogies. ICCV 2019.
The webpage for this project is available here, with a link to the paper.
This code is available for research purposes (MIT License).
This code was tested with Python 2.7, PyTorch 0.4.0 and CUDA 8.0. Install the dependencies with:
pip install -r requirements.txt
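As a quick sanity check of the environment (an illustrative snippet, not part of this repository), you can verify the versions listed above:

```python
# Minimal environment check (illustrative only): prints the interpreter and
# PyTorch versions and whether CUDA is visible to PyTorch.
import sys
import torch

print("Python %s" % sys.version.split()[0])          # expected 2.7.x
print("PyTorch %s" % torch.__version__)              # expected 0.4.0
print("CUDA available: %s" % torch.cuda.is_available())
```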
We release data and pre-trained models for HICO-DET. To set-up the directories, please follow these steps:
- Download the pre-computed data
wget https://www.rocq.inria.fr/cluster-willow/jpeyre/analogy/data.tar.gz
tar zxvf data.tar.gz
This should be unzipped into the ./data folder.
It contains the object detections, visual features and database objects needed to run our code on HICO-DET.
- Download HICO images
Download the HICO images and place them into the images directory: ./data/hico/images
- Download COCO API
Download the COCO API into a new directory ./data/coco and run make
- Download pre-computed models and detections
wget https://www.rocq.inria.fr/cluster-willow/jpeyre/analogy/runs.tar.gz
tar zxvf runs.tar.gz
This should be unzipped into the ./runs folder.
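Once these steps are done, a small script like the one below can verify that the expected directories are in place (an illustrative sketch; it only lists the paths mentioned in this README):

```python
# Quick check that the directories described above exist (illustrative only).
import os

expected = [
    "./data",              # pre-computed data (detections, features, databases)
    "./data/hico/images",  # HICO images
    "./data/coco",         # COCO API
    "./runs",              # pre-trained models and detections
]
for path in expected:
    status = "ok" if os.path.isdir(path) else "MISSING"
    print("%-22s %s" % (path, status))
```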
You can re-train our model by running:
python train.py --config_path $CONFIG_PATH
We provide config files in the ./configs directory.
Feel free to edit the config options to train variants of our model.
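For example, to inspect the options of a config before editing it, something like the following works (a sketch assuming PyYAML is installed and a flat key/value config; the file name is one referenced later in this README):

```python
# Illustrative sketch: load a config file and list its options before editing.
import yaml

config_path = "./configs/hico_trainvalzeroshot_analogy_vp.yaml"
with open(config_path) as f:
    config = yaml.safe_load(f)

# Assumes a flat key/value mapping; nested configs would need a recursive walk.
for key in sorted(config):
    print("%s: %s" % (key, config[key]))
```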
You can extract the detections by running:
python eval_hico.py --config_path $CONFIG_PATH
To extract the detections using our analogy model, you can run:
python eval_hico_analogy.py --config_path $CONFIG_PATH
We use the official evaluation code to evaluate performance on HICO-DET.
Please note that the numerical results in the paper were obtained with a slightly different version of the analogy transformation than the one described in Eq. (6) of the paper. This variant computes the analogy transformation as:

$w_{vp}(t') = w_{vp}(t) + \Gamma\big(w_s(s'), w_p(p'), w_o(o')\big) - \Gamma\big(w_s^{vp}(s), w_p^{vp}(p), w_o^{vp}(o)\big)$

where $w_s(s'), w_p(p'), w_o(o')$ are the embeddings of the target subject, predicate and object in the unigram spaces, and $w_s^{vp}(s), w_p^{vp}(p), w_o^{vp}(o)$ are the embeddings of the source subject, predicate and object in the visual phrase space.
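For illustration, here is a minimal PyTorch sketch of this 'hybrid' transformation; the dimension D, the form of Γ and all variable names are assumptions for illustration, not the actual modules of this repository:

```python
# Minimal sketch of the 'hybrid' analogy transformation described above.
# All names, dimensions and the form of Gamma are illustrative placeholders.
import torch
import torch.nn as nn

D = 300  # assumed common dimension of the unigram and visual phrase spaces

# Gamma: maps a concatenated (subject, predicate, object) embedding into the
# visual phrase space; the 2-layer MLP here is only a placeholder.
gamma = nn.Sequential(nn.Linear(3 * D, D), nn.ReLU(), nn.Linear(D, D))

def hybrid_analogy(w_vp_src, w_s_tgt, w_p_tgt, w_o_tgt, w_s_vp, w_p_vp, w_o_vp):
    """w_vp(t') = w_vp(t) + Gamma(target unigram embs) - Gamma(source vp embs).

    w_vp_src                  : visual phrase embedding of the source triplet t
    w_s_tgt, w_p_tgt, w_o_tgt : target subject/predicate/object (unigram spaces)
    w_s_vp, w_p_vp, w_o_vp    : source subject/predicate/object (visual phrase space)
    """
    target = gamma(torch.cat([w_s_tgt, w_p_tgt, w_o_tgt], dim=-1))
    source = gamma(torch.cat([w_s_vp, w_p_vp, w_o_vp], dim=-1))
    return w_vp_src + target - source
```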
You can choose between the two versions through the option --analogy_type. The default option, described above, is called 'hybrid'. To run the variant described in the paper, set --analogy_type='vp' in the config file, as in './configs/hico_trainvalzeroshot_analogy_vp.yaml'.
The 'vp' variant results in a drop of about 1% compared to the results in the paper (Table 2, s+o+vp+transfer (deep): 28.6 -> 27.5). The corresponding model is released in the runs/ directory. We are still investigating why the 'hybrid' version performs better than the 'vp' one.
We would like to thank Kenneth Wong from the Institute of Computing Technology, Chinese Academy of Sciences, for his careful code review and for pointing out this inconsistency.
We apologize for the inconvenience. Also, please do not hesitate to contact the first author for further clarifications.
If you find this code useful in your research, please consider citing our paper:
@InProceedings{Peyre19,
  author    = "Peyre, Julia and Laptev, Ivan and Schmid, Cordelia and Sivic, Josef",
  title     = "Detecting Unseen Visual Relations Using Analogies",
  booktitle = "ICCV",
  year      = "2019"
}
For any questions, please contact the first author: [email protected]