My approach:

  1. CLIP is used for text-to-image comparison
  2. Unicom is used for image-to-image comparison
  3. ffmpeg is used to extract keyframes
  4. ChromaDB is used because I have not found a way to store metadata (image filename, etc.) in FAISS
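
The keyframe extraction step could look roughly like the sketch below. It is not taken from this repository: the function names, the output filename pattern, and the JPEG quality setting are assumptions made for illustration. It relies on ffmpeg's `-skip_frame nokey` decoder option, which decodes only keyframes, together with `-vsync vfr` so that the skipped frames are not duplicated in the output (newer ffmpeg builds prefer `-fps_mode vfr` for the same purpose).

```python
import subprocess
from pathlib import Path


def build_keyframe_command(video_path, out_dir, pattern="frame_%05d.jpg"):
    """Build an ffmpeg argument list that writes one image per keyframe.

    The names and the default pattern are hypothetical, chosen for this sketch.
    """
    output = Path(out_dir) / pattern
    return [
        "ffmpeg",
        "-skip_frame", "nokey",  # input option: decode keyframes only
        "-i", str(video_path),
        "-vsync", "vfr",         # do not duplicate frames to fill gaps
        "-q:v", "2",             # JPEG quality (lower is better); an assumption
        str(output),
    ]


def extract_keyframes(video_path, out_dir):
    """Run ffmpeg and return the sorted list of extracted image paths."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_keyframe_command(video_path, out_dir), check=True)
    return sorted(Path(out_dir).glob("frame_*.jpg"))
```

Calling `extract_keyframes("video.mp4", "frames")` would require ffmpeg on the `PATH`; the returned filenames are the kind of metadata that would then be stored next to each embedding.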

Challenges:

  1. For some reason text-to-image works badly after moving to ChromaDB :(

TODO:

  1. Try distance metrics other than the default cosine distance
  2. Fix text-to-image
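
Before switching metrics, it may help to see how the candidates relate. ChromaDB documents three distance spaces for a collection: `l2` (squared Euclidean), `cosine`, and `ip` (inner product). The sketch below reimplements them in plain Python, with no ChromaDB dependency, and ranks a few made-up vectors against a query. All function names and vectors are hypothetical. One thing it illustrates: if the embeddings are L2-normalized, squared L2 and cosine distance produce the same ranking, so a change in results after switching metric would point to unnormalized vectors rather than to the metric itself.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def squared_l2_distance(a, b):
    """Sum of squared coordinate differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product_distance(a, b):
    """1 - dot product; equals cosine distance only for unit vectors."""
    return 1.0 - dot(a, b)


def rank(query, items, distance):
    """Return item indices ordered from nearest to farthest."""
    return sorted(range(len(items)), key=lambda i: distance(query, items[i]))


# Made-up 3-dimensional "embeddings", for illustration only.
query = l2_normalize([1.0, 0.0, 0.2])
items = [l2_normalize(v) for v in ([0.9, 0.1, 0.1], [0.0, 1.0, 0.0], [0.5, 0.5, 0.7])]

cosine_order = rank(query, items, cosine_distance)
l2_order = rank(query, items, squared_l2_distance)
ip_order = rank(query, items, inner_product_distance)
```

With normalized inputs, `cosine_order`, `l2_order`, and `ip_order` come out identical, because for unit vectors the squared L2 distance is exactly twice the cosine distance.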