Decoding-AI-Art

In recent years, vision-language models have advanced remarkably, making them highly relevant to image-text tasks such as image-text generation, visual question answering, image-text contrastive learning, cross-modal retrieval, and art and design generation. Within this landscape, one model that has gained considerable attention is Stable Diffusion, a deep learning model that specializes in producing high-quality images with nuanced and precise visual content. High-quality here means images that preserve intricate detail and are free of the noise and distortions that might otherwise confuse a model during training or fine-tuning.

In this research, we use advanced vision-language models to improve the task of generating prompts for AI-generated images. Our dataset consists exclusively of high-quality AI-generated images created by a third party using Stable Diffusion. We focus on fine-tuning two prominent vision-language models, BLIP and GIT. Both are multimodal models pretrained on diverse tasks. BLIP employs a complex architecture with separate modules for different tasks, including image-text matching and captioning. GIT uses a simpler architecture that concatenates the encoded image features with the text embeddings for efficiency.

After fine-tuning, the two models generate noticeably different captions. Our study therefore seeks to identify the factors behind these differences, with a particular focus on why one model consistently produces more accurate captions than the other. The ultimate goal is to achieve better results than current state-of-the-art image-to-prompt models for AI-generated images.
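One simple way to make "more accurate captions" concrete is to score each generated caption against the original prompt. The sketch below uses a token-overlap F1 score; this metric is an assumption chosen for illustration, not the evaluation used in the notebooks, and the names `tokenize`, `token_f1`, `model_a`, and `model_b`, along with the example strings, are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text, treat commas as separators, and split on whitespace."""
    return text.lower().replace(",", " ").split()


def token_f1(predicted, reference):
    """Token-overlap F1 between a generated caption and the reference prompt."""
    pred = Counter(tokenize(predicted))
    ref = Counter(tokenize(reference))
    # Multiset intersection: each token counts at most as often as in both texts.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical prompt and captions, for illustration only (not real model output).
reference_prompt = "a castle on a hill, oil painting"
captions = {
    "model_a": "a castle on a hill",
    "model_b": "a painting of a house",
}

scores = {name: token_f1(text, reference_prompt) for name, text in captions.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Averaging such a score over a held-out set of image-prompt pairs would give a rough per-model comparison. Embedding-based similarity is a common alternative when paraphrases should count as matches.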

BLIP_Decoding_AI_Art.ipynb
Custom_Model_Decoding_AI_Art.ipynb
GIT_Decoding_AI_Art.ipynb
README.md
scraper.py

mohsinposts/Decoding-AI-Art
