[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"


[NeurIPS 2024] TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives

TripletCLIP

This repository will provide access to the dataset, pretrained checkpoints, inference, and training code for our paper, TripletCLIP. We provide our from-scratch training scripts used to train the models reported in the paper, as well as an OpenCLIP variant for easy reproducibility.


TODOs:

  • Release the high-quality subset of TripletData.
  • Release all pretrained and finetuned checkpoints.
  • Release the TripletCLIP adaptation for OpenCLIP (./src/openclip).
  • Release data generation scripts.
  • Release the full TripletData.
  • Release the original TripletCLIP training scripts for reproducibility.

Checkpoints

Below are the checkpoints for the models trained on the CC3M and CC12M datasets. A fine-tuning checkpoint is also provided for further customization.

Table of Checkpoints

| Method             | CC3M | CC12M |
|--------------------|------|-------|
| LaCLIP             | Link | Link  |
| LaCLIP+HN          | Link | -     |
| NegCLIP            | Link | Link  |
| NegCLIP++          | Link | Link  |
| TripletCLIP (ours) | Link | Link  |

Fine-tuning Checkpoint

For the fine-tuned model checkpoint, please refer to the following link:
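Once downloaded, a checkpoint can presumably be loaded through the open_clip library. The sketch below is an assumption, not part of this repository: it assumes the released checkpoints are in OpenCLIP-compatible format and that a ViT-B-32 backbone was used; the architecture name and local path are placeholders.

```python
# Hedged sketch: loading a released checkpoint via the open_clip library.
# The architecture name ("ViT-B-32") and the checkpoint path are
# placeholder assumptions, not confirmed by this repository.

CHECKPOINTS = {
    # method -> pretraining datasets with a released checkpoint (from the table above)
    "LaCLIP":      ["CC3M", "CC12M"],
    "LaCLIP+HN":   ["CC3M"],
    "NegCLIP":     ["CC3M", "CC12M"],
    "NegCLIP++":   ["CC3M", "CC12M"],
    "TripletCLIP": ["CC3M", "CC12M"],
}

def load_model(checkpoint_path, arch="ViT-B-32"):
    """Create an OpenCLIP model and load a downloaded checkpoint into it."""
    import open_clip  # imported lazily so the table above is usable without it
    model, _, preprocess = open_clip.create_model_and_transforms(
        arch, pretrained=checkpoint_path  # open_clip accepts a local file path here
    )
    tokenizer = open_clip.get_tokenizer(arch)
    return model, preprocess, tokenizer
```

Usage would then follow the standard OpenCLIP inference pattern, e.g. `model, preprocess, tokenizer = load_model("tripletclip_cc12m.pt")`.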

Citing

If you find TripletCLIP useful, please consider citing:

@article{patel2024tripletclip,
    author  = {Patel, Maitreya and Kusumba, Abhiram and Cheng, Sheng and Kim, Changhoon and Gokhale, Tejas and Baral, Chitta and Yang, Yezhou},
    title   = {TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives},
    journal = {Advances in Neural Information Processing Systems},
    year    = {2024},
}

Acknowledgement:

We would like to acknowledge the excellent open-source communities behind OpenCLIP, Hugging Face, LAION-AI, and OpenAI for their efforts in making CLIP inference, finetuning, and benchmarking easily accessible to all.
