Multimodal Models

LLaVA and BakLLaVA are multimodal models available through Ollama. Select a multimodal model from the Lumos Options page and prompt away!
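For readers curious what a multimodal request to Ollama can look like outside of Lumos, here is a minimal sketch that builds the JSON body for Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` field. Whether Lumos itself calls this endpoint is an assumption; the helper name `build_generate_request` and the placeholder image bytes are hypothetical and only serve the illustration.

```python
import base64
import json

# Default address of a locally running Ollama server (adjust if yours differs).
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, images):
    """Build the JSON body for a multimodal /api/generate request.

    `images` is a list of raw image bytes; Ollama expects each image
    as a base64-encoded string.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(img).decode("ascii") for img in images],
        "stream": False,  # ask for one complete response instead of chunks
    }


# Hypothetical usage: the bytes below stand in for a real image file's contents.
body = build_generate_request(
    "llava",
    "Based on the image, describe the background",
    [b"placeholder image bytes"],
)
payload = json.dumps(body)  # POST this string to OLLAMA_GENERATE_URL
```

The sketch stops short of sending the request so that it runs without a local Ollama server; posting `payload` to `OLLAMA_GENERATE_URL` with any HTTP client would complete the call.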

Note: Some webpages contain many images. To reduce the number of images bound to the model, it may be preferable to open an individual image in a separate tab. At the moment, only 10 images are bound to the model for processing at a time. In the future, optimizations may be made to improve the user experience.
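The 10-image limit can be pictured with a short sketch. This is not Lumos's actual code: the function name `select_images` is hypothetical, and keeping the first 10 images in page order is an assumption, since the note does not say which images are chosen.

```python
MAX_BOUND_IMAGES = 10  # limit stated in the note above


def select_images(image_urls, limit=MAX_BOUND_IMAGES):
    """Return at most `limit` image URLs, keeping their original order.

    Assumption: the first images found on the page are the ones kept.
    """
    return list(image_urls[:limit])


# Hypothetical page with 25 images: only the first 10 would be bound.
page_images = [f"https://example.com/img{i}.png" for i in range(25)]
bound = select_images(page_images)
```

A page opened on a single image yields a one-element list, which is why opening an image in its own tab keeps the model focused on it.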

Prompting Tip

Prefix the prompt with the text "Based on the image". This prefix will override Lumos's internal prompt classification mechanism.

Examples

  1. "Based on the image, describe the background"
  2. "Based on the image, count how many dogs are in the photo"
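To illustrate how a prefix can override prompt classification, here is a minimal sketch. It is not Lumos's implementation: `classify_prompt`, the label strings, and the stand-in classifier `always_text` are all hypothetical, and matching the prefix case-insensitively after trimming whitespace is an assumption.

```python
IMAGE_PREFIX = "based on the image"


def classify_prompt(prompt, fallback):
    """Return "image" when the prompt starts with the prefix.

    Otherwise defer to `fallback`, a stand-in for whatever internal
    classifier would normally decide how to handle the prompt.
    Assumption: the prefix match ignores case and leading whitespace.
    """
    if prompt.strip().lower().startswith(IMAGE_PREFIX):
        return "image"  # the prefix short-circuits the classifier
    return fallback(prompt)


def always_text(prompt):
    """Hypothetical stand-in classifier that treats every prompt as text."""
    return "text"


forced = classify_prompt("Based on the image, describe the background", always_text)
deferred = classify_prompt("Summarize this page", always_text)
```

The prefix check runs before the classifier, so a prefixed prompt is routed to the multimodal path no matter how the classifier would have labeled it.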

Examples

What's for dinner? [Image: Screenshot of Multimodal]

More food... [Image: Screenshot of Multimodal]

Cool picture [Image: Screenshot of Multimodal]