
supporting Gemini #74

Open
koreanssam opened this issue Oct 25, 2024 · 2 comments

@koreanssam

model = "gemini/gemini-1.5-flash-002"


2024-10-25 23:04:56,975 - INFO - HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-002:generateContent?key=secret^^ "HTTP/1.1 200 OK"
2024-10-25 23:05:16,321 - INFO -

LiteLLM completion() model= gemini-1.5-flash-002; provider = gemini
23:05:16 - LiteLLM:WARNING: vertex_ai_non_gemini.py:198 - No text in user content. Adding a blank text to user content, to ensure Gemini doesn't fail the request. Relevant Issue - https://github.com/BerriAI/litellm/issues/5515
2024-10-25 23:05:16,332 - WARNING - No text in user content. Adding a blank text to user content, to ensure Gemini doesn't fail the request. Relevant Issue - https://github.com/BerriAI/litellm/issues/5515

How can I use Gemini?

@pradhyumna85 (Contributor)

@koreanssam, Gemini models are already supported (refer #69); what you see are warnings, so it should be fine. Could you print the output of the zerox API to check whether you are getting sensible output? Also, I am assuming that you are setting the correct API key, as per the example:

import os
os.environ["GEMINI_API_KEY"] = "your-api-key"
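
For reference, here is a minimal sketch for sanity-checking the output; the file path and model name below are placeholders, not values from the original report:

import os, asyncio
from pyzerox import zerox

os.environ["GEMINI_API_KEY"] = "your-api-key"

async def check():
    # run zerox on a sample PDF with a Gemini model (placeholder path/model)
    result = await zerox(file_path="sample.pdf", model="gemini/gemini-1.5-flash")
    # print each page's markdown to eyeball whether the output is sensible
    for page in result.pages:
        print(page.content)

asyncio.run(check())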

@igorlima

The great news is that py-zerox already supports Gemini! The issue you're facing seems to be beyond py-zerox itself.

For those interested, here's a quick code snippet to get started with Gemini:

  • snippet (save as snippet.py)
    # PDF EXTRACT via zerox
    from pyzerox import zerox
    import pathlib, os, asyncio
    
    # SUPPRESS WARNINGS
    import litellm
    # https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
    litellm.suppress_debug_info = True
    # https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
    litellm.set_verbose=False
    # https://docs.python.org/3/library/logging.html#logging-levels
    # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
    os.environ['LITELLM_LOG'] = 'CRITICAL'
    
    # SUPPRESS WARNINGS
    # https://docs.python.org/3/library/warnings.html
    import warnings
    # https://docs.python.org/3/library/warnings.html#describing-warning-filters
    # `export PYTHONWARNINGS=ignore`
    warnings.simplefilter("ignore")
    
    
    kwargs = {}
    custom_system_prompt = None
    
    # https://ai.google.dev/gemini-api/docs/models/gemini
    # pick one model; the last assignment wins
    # model = "gemini/gemini-2.0-flash-exp"
    # model = "gemini/gemini-1.5-flash-8b"
    model = "gemini/gemini-1.5-flash"
    
    async def main():
      file_path = "data/input.pdf"
      select_pages = None
      # output_dir = "./data"  # set a directory to also write per-page markdown files
      output_dir = None
      result = await zerox(file_path=file_path, model=model, output_dir=output_dir,
                           custom_system_prompt=custom_system_prompt, select_pages=select_pages, **kwargs)
      return result
    
    result = asyncio.run(main())
    md_text = "\n".join([page.content for page in result.pages])
    
    print(md_text)
    pathlib.Path("data/output-zerox-pdf.md").write_text(md_text)
    print("Markdown saved to output-zerox-pdf.md")
  • run it
    export GEMINI_API_KEY="XXXXXXXXXXXXXXXXXXXX"
    export LITELLM_LOG=CRITICAL
    export PYTHONWARNINGS=ignore
    python3 snippet.py

Now, let's dive a bit deeper. Before I do, I must say how much I like using Gemini. It's my go-to LLM, and I've tried it in several other places, too.

The biggest perk of Gemini is its free tier. However, as with all things free, there are some limitations. For example, if you're working with a PDF over 10 pages, you might hit Google's rate limit for the Gemini model under the free tier (a chunking workaround is sketched after the list below). What you're facing, though, seems to be just a warning, as mentioned by @pradhyumna85.

  • To bypass this warning, you can choose between using Python code or environment variables:
    • using Python code
      import os
      import litellm
      # https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
      litellm.suppress_debug_info = True
      # https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
      litellm.set_verbose=False
      # https://docs.python.org/3/library/logging.html#logging-levels
      # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
      os.environ['LITELLM_LOG'] = 'CRITICAL'
      
      # https://docs.python.org/3/library/warnings.html
      import warnings
      # https://docs.python.org/3/library/warnings.html#describing-warning-filters
      # `export PYTHONWARNINGS=ignore`
      warnings.simplefilter("ignore")
    • using environment variables
      export GEMINI_API_KEY="XXXXXXXXXXXXXXXXXXXX"
      export LITELLM_LOG=CRITICAL
      export PYTHONWARNINGS=ignore
      python3 snippet.py
  • Using environment variables is generally more convenient than the code approach, since it requires no changes to the script itself.
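
As for the rate limit on longer PDFs: one workaround (my own sketch, not a py-zerox feature) is to process the document a few pages at a time via select_pages and pause between chunks. The chunk size, pause, and page count below are guesses to tune against your quota:

import asyncio
from pyzerox import zerox

model = "gemini/gemini-1.5-flash"

async def extract_in_chunks(file_path, total_pages, chunk_size=5, pause_s=60):
    pages = []
    # select_pages takes 1-indexed page numbers, so walk the PDF in small slices
    for start in range(1, total_pages + 1, chunk_size):
        selected = list(range(start, min(start + chunk_size, total_pages + 1)))
        result = await zerox(file_path=file_path, model=model, select_pages=selected)
        pages.extend(result.pages)
        if start + chunk_size <= total_pages:
            await asyncio.sleep(pause_s)  # back off between chunks to respect the rate limit
    return "\n".join(page.content for page in pages)

md_text = asyncio.run(extract_in_chunks("data/input.pdf", total_pages=20))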

Due to these limitations with Gemini, I was inspired to open a PR.

I hope this helps!
