[Gemini adapter] Support message content with image and video URLs #81

Open: 1 commit to be merged into main

samrat commented Nov 12, 2024

Fixes #80

The Gemini API expects you to upload images/videos separately and then reference the uploaded file by URL. Uploading feels outside the scope of Instructor, so I've left it out of this PR.

Example usage:

defmodule VideoDesc do
  use Ecto.Schema

  @primary_key false
  embedded_schema do
    field(:video_description, :string)
    field(:oscar_winning_moment, :boolean)
  end
end

Instructor.chat_completion(
  mode: :json_schema,
  model: "gemini-1.5-flash",
  response_model: VideoDesc,
  messages: [
    %{
      role: "user", 
      content: [
        %{
          type: "video_url",
          video_url: %{
            url: "https://generativelanguage.googleapis.com/v1beta/files/d2ngu357dmfx",
            mime_type: "video/mp4"
          }
        },
        %{
          type: "text",
          text: "What's going on in this video?"
        }
      ]
    }
  ]
)
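
For anyone wondering what the adapter has to do with a message like the one above, here is a rough sketch of the mapping from OpenAI-style content parts to Gemini `parts`. It is an illustration only, not the code in this PR: the module name `GeminiPartsSketch` and its functions are hypothetical, and the `file_data` / `file_uri` / `mime_type` keys follow the snake_case form used in Gemini's REST examples as far as I know.

```elixir
defmodule GeminiPartsSketch do
  @moduledoc """
  Hypothetical helper (not part of Instructor): converts OpenAI-style
  message content into the `parts` list of a Gemini request.
  """

  # A plain string becomes a single text part.
  def to_gemini_parts(content) when is_binary(content), do: [%{text: content}]

  # A list of typed parts is converted element by element.
  def to_gemini_parts(content) when is_list(content) do
    Enum.map(content, &to_gemini_part/1)
  end

  # Text parts carry over unchanged.
  def to_gemini_part(%{type: "text", text: text}), do: %{text: text}

  # Media parts point at a file that was uploaded beforehand.
  def to_gemini_part(%{type: "image_url", image_url: media}), do: file_data(media)
  def to_gemini_part(%{type: "video_url", video_url: media}), do: file_data(media)

  # Assumed request shape: a file reference by URI plus its MIME type.
  defp file_data(%{url: url, mime_type: mime_type}) do
    %{file_data: %{file_uri: url, mime_type: mime_type}}
  end
end
```

Calling `GeminiPartsSketch.to_gemini_parts/1` with the `content` list from the example would yield one `file_data` part followed by one `text` part, in the same order as the input.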

samrat (Author) commented Nov 12, 2024

> The Gemini API expects you to upload images/videos separately

In case anyone wants to grab the code to do the upload, I've posted it here.


Successfully merging this pull request may close these issues.

Multimodal support for Gemini