Investigating-the-Arithmetic-of-Visual-Embeddings

This research examines how different visual pre-training paradigms, including self-supervision and language supervision, influence the properties of image embeddings. Inspired by the well-known NLP example of semantic arithmetic ("King - Man + Woman = Queen"), we assess how the pre-training method affects the geometric properties of image embeddings. Because the models differ in architecture and training objective, we expect their embeddings to exhibit distinct properties. This matters in multi-modal spaces that integrate textual and visual data, where understanding geometric relationships supports vector-oriented reasoning. The study aims to clarify the relationship between pre-training paradigms and image embedding properties, with a view to improving multi-modal learning techniques.
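To make the idea of semantic arithmetic concrete, here is a minimal sketch of an analogy query over embeddings, scored by cosine similarity. It is an illustration only, not the code from this repository: the function names (`normalize`, `solve_analogy`) and the 2-D toy vectors are assumptions chosen so the result is easy to verify by hand. With real models, the vectors would come from the image or text encoder instead.

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length along the last axis."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def solve_analogy(a, b, c, candidates):
    """Answer 'a is to b as c is to ?' by ranking candidates against b - a + c.

    Returns the index of the best candidate and the cosine scores.
    """
    query = normalize(np.asarray(b, dtype=float) - np.asarray(a) + np.asarray(c))
    scores = normalize(candidates) @ query  # cosine similarity per candidate
    return int(np.argmax(scores)), scores

# Hypothetical toy embeddings: axis 0 ~ "royalty", axis 1 ~ "gender".
emb = {
    "king": [1.0, 1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "queen": [1.0, -1.0],
    "apple": [-1.0, 0.0],
}
names = list(emb)
candidates = np.array([emb[n] for n in names])

# king - man + woman
best, scores = solve_analogy(emb["man"], emb["king"], emb["woman"], candidates)
print(names[best])  # queen
```

In practice the query items themselves are often excluded from the candidate set, since they tend to rank near the top with real embeddings.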

Image Dataset: https://drive.google.com/drive/folders/1S5zIpY8Tqayh_ln810Xa1CQ0LJsEug2i?usp=drive_link

Description of files:

File	Description
Analogy-pairs.txt	List of word pairs used in our study
analogy-pair-creation.py	Python script for pre-processing the analogy pairs
dataset-creation.py	Python script for creating the image dataset
evaluate-arithmetic-properties.ipynb	Helper functions for evaluating arithmetic properties
experiments.ipynb	Experiments performed with the different models
image_downloader.ipynb	Notebook used to download images for the dataset
Research findings.pdf	Research paper summarizing the study

Models used

CLIP: https://huggingface.co/sentence-transformers/clip-ViT-B-32
Word2Vec: https://huggingface.co/fse/word2vec-google-news-300
ResNet50: https://huggingface.co/microsoft/resnet-50

Rishabh42/Investigating-the-Arithmetic-of-Visual-Embeddings

Folders and files

Latest commit

History

Repository files navigation

Investigating-the-Arithmetic-of-Visual-Embeddings

Description of files:

Models used

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages