# Fine-Grained-Image-Captioning

The PyTorch implementation of "Fine-Grained Image Captioning with Global-Local Discriminative Objective".

## Requirements

## Download the MSCOCO dataset

- Download the COCO images from http://cocodataset.org/#download. Download the 2014 Train and 2014 Val images and put them into `./image/train2014/` and `./image/val2014/`. Download the 2014 Test images and put them into `./image/test2014/`.
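Before moving on, it can help to confirm the image folders are where the preprocessing scripts expect them. This is a small sanity-check sketch, not part of the repo; the `./image` root and the split folder names come from the step above.

```python
# Sketch: verify the expected COCO image layout before preprocessing.
# The ./image root and split folder names follow the download step above.
import os

EXPECTED_SPLITS = ["train2014", "val2014", "test2014"]

def check_image_root(root="image"):
    """Return the list of split folders missing under `root`."""
    return [s for s in EXPECTED_SPLITS
            if not os.path.isdir(os.path.join(root, s))]

missing = check_image_root("image")
if missing:
    print("Missing split folders:", ", ".join(missing))
else:
    print("All COCO split folders found.")
```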

## Download COCO captions and preprocess them

## Pre-extract the image features

- `python scripts/prepro_feats.py --input_json data/dataset_coco.json --images_root image`
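Conceptually, pre-extraction runs every image through the CNN encoder once and caches the resulting feature vector to disk, so training never has to repeat the forward pass. The sketch below illustrates that caching loop with a stand-in encoder (`fake_encoder` is a placeholder, not the repo's model; the real script uses a pretrained CNN and its own file layout).

```python
# Sketch of the caching pattern behind feature pre-extraction: encode each
# image once, save the vector as .npy, and reload it cheaply at train time.
# `fake_encoder` is a stand-in for a CNN forward pass (assumption).
import os
import tempfile
import numpy as np

def fake_encoder(img):
    # Placeholder for a CNN forward pass; returns a 2048-d feature vector.
    return np.asarray(img, dtype=np.float32).mean() * np.ones(2048, np.float32)

def preextract(images, out_dir):
    """Encode every image and cache its feature vector to out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for name, img in images.items():
        np.save(os.path.join(out_dir, name + ".npy"), fake_encoder(img))

out = tempfile.mkdtemp()
preextract({"COCO_train2014_000000000009": [[0.2, 0.4]]}, out)
feat = np.load(os.path.join(out, "COCO_train2014_000000000009.npy"))
print(feat.shape)  # (2048,)
```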

## Prepare for Reinforcement Learning

- Download CIDEr from https://github.com/vrama91/cider and put "ciderD_token.py" and "ciderD_scorer_token4.py" into "cider/pyciderevalcap/ciderD/". Then run:
- `python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train`
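The n-gram preprocessing step computes document frequencies of caption n-grams over the training split, which CIDEr-D later uses for its tf-idf weighting. A minimal sketch of that counting (tokenization here is a plain whitespace split, an assumption, not the repo's tokenizer):

```python
# Sketch: document frequencies of caption n-grams (n = 1..4), the
# statistic CIDEr-D needs for tf-idf weighting of reward scores.
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def doc_freq(captions, max_n=4):
    """Count, for each n-gram, how many captions contain it at least once."""
    df = Counter()
    for cap in captions:
        toks = cap.lower().split()
        seen = set()
        for n in range(1, max_n + 1):
            seen.update(ngrams(toks, n))
        df.update(seen)
    return df

df = doc_freq(["a dog runs on grass", "a dog sleeps"])
print(df[("a", "dog")])  # appears in both captions -> 2
```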

## Prepare for training

## Start training

Train with the MLE criterion for the initial 20 epochs:

- `python MLE_trainpro.py --id TDA --caption_model TDA --checkpoint_path RL_TDA`
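The MLE criterion is the standard per-step cross-entropy of the ground-truth token under the model's softmax, averaged over non-padding positions. A minimal numerical sketch (the shapes and padding-mask convention are assumptions, not the repo's exact code):

```python
# Sketch of the MLE (cross-entropy) caption loss: negative log-likelihood
# of each ground-truth token, masked so padding steps do not contribute.
import numpy as np

def mle_loss(logits, targets, mask):
    """logits: (T, V) per-step scores; targets: (T,) gt token ids;
    mask: (T,) 1.0 for real tokens, 0.0 for padding."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(targets)), targets]
    return (nll * mask).sum() / mask.sum()

# With uniform logits over a vocab of 10, the loss is exactly log(10).
loss = mle_loss(np.zeros((3, 10)), np.array([0, 1, 2]), np.ones(3))
print(round(float(loss), 4))  # 2.3026
```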

Then train with the Global-Local Discriminative Objective.
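This stage optimizes a sequence-level reward rather than per-token likelihood. As a hedged sketch only (not the repo's exact objective): self-critical-style training uses the sampled caption's reward minus a greedy-decoding baseline as the advantage, and the reward can mix a caption metric with a retrieval-based term; `lam` is a hypothetical trade-off weight.

```python
# Hedged sketch of a self-critical-style reward advantage with a mixed
# reward. combined_reward and lam are illustrative assumptions, not the
# repo's actual global-local objective.
def combined_reward(caption_score, retrieval_score, lam=0.3):
    """Mix a caption-quality reward with a retrieval-based reward."""
    return caption_score + lam * retrieval_score

def scst_advantage(sample_rewards, greedy_rewards):
    """Per-image advantage: reward(sample) - reward(greedy baseline)."""
    return [s - g for s, g in zip(sample_rewards, greedy_rewards)]

samples = [combined_reward(0.8, 0.5), combined_reward(0.4, 0.2)]
greedy = [combined_reward(0.6, 0.5), combined_reward(0.5, 0.2)]
print([round(a, 3) for a in scst_advantage(samples, greedy)])  # [0.2, -0.1]
```

A positive advantage reinforces the sampled caption; a negative one suppresses it, so the model only gets pushed toward samples that beat its own greedy decoding.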

## Eval

- `python evalpro.py --caption_model TDA --checkpoint_path RL_TDA`

## Self-retrieval Experiment

- `python generate_random_5000.py --caption_model TDA --checkpoint_path RL_TDA`
- `python self_retrieval.py --id TDA --caption_model TDA --checkpoint_path RL_TDA`
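In a self-retrieval experiment, each generated caption queries the sampled image pool, and retrieval succeeds if the caption's own source image ranks in the top K by caption-image similarity. The sketch below shows the recall@K bookkeeping; the similarity matrix is a stand-in (assumption), since the repo scores caption-image pairs with its own model.

```python
# Sketch of recall@K for self-retrieval: caption i's correct image is
# image i; count how often it lands in the top-K most similar images.
def recall_at_k(sim, k=1):
    """sim[i][j]: similarity of caption i to image j. Returns recall@k."""
    hits = 0
    for i, row in enumerate(sim):
        topk = sorted(range(len(row)), key=lambda j: -row[j])[:k]
        hits += int(i in topk)
    return hits / len(sim)

sim = [[0.9, 0.1, 0.2],
       [0.3, 0.8, 0.1],
       [0.6, 0.2, 0.5]]
print(recall_at_k(sim, k=1))  # caption 2's best match is image 0 -> 2/3
```

Higher recall@K means the generated captions are discriminative enough to pick out their own images, which is the property the Global-Local objective targets.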