
Use LiteLLM Proxy to Call Groq API

Use LiteLLM Proxy for:

  • Calling 100+ LLMs on Groq, OpenAI, Azure, Vertex, Bedrock, and more in the OpenAI ChatCompletions & Completions format
  • Track usage + set budgets with Virtual Keys (see the key-generation sketch after this list)
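
For the Virtual Keys point, here is a hedged sketch of issuing a key through the proxy's /key/generate endpoint. It assumes you started the proxy with a master key (e.g. LITELLM_MASTER_KEY=sk-1234); the model alias and budget values are illustrative:

curl http://0.0.0.0:4000/key/generate \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{"models": ["groq-llama3"], "max_budget": 10}'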

Using Groq API with LiteLLM

Sample Usage

Step 1. Create a Config for LiteLLM proxy

LiteLLM requires a config with all your models defined - we can call this file litellm_config.yaml

Detailed docs on how to set up the LiteLLM config: https://docs.litellm.ai/docs/proxy/configs

model_list:
  - model_name: groq-llama3 ### MODEL Alias ###
    litellm_params: # all params accepted by litellm.completion() - https://docs.litellm.ai/docs/completion/input
      model: groq/llama3-8b-8192 ### MODEL NAME sent to `litellm.completion()` ###
      api_key: "os.environ/GROQ_API_KEY" # does os.getenv("GROQ_API_KEY")
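
A config can list multiple model aliases; for example, a sketch of the same file with a second Groq alias added (the groq-mixtral alias name is illustrative):

model_list:
  - model_name: groq-llama3
    litellm_params:
      model: groq/llama3-8b-8192
      api_key: "os.environ/GROQ_API_KEY"
  - model_name: groq-mixtral ### second alias, name is illustrative ###
    litellm_params:
      model: groq/mixtral-8x7b-32768
      api_key: "os.environ/GROQ_API_KEY"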

Step 2. Start litellm proxy

docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e GROQ_API_KEY=<your-groq-api-key> \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug
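
Once the container is running, a quick way to sanity-check the proxy is its OpenAI-compatible models endpoint (a minimal check, assuming the default port mapping above; the dummy Authorization header is only needed if you configured auth):

curl http://0.0.0.0:4000/v1/models \
  -H "Authorization: Bearer anything"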

Step 3. Test it!

Use with Langchain, LlamaIndex, Instructor, etc.

import openai
client = openai.OpenAI(
    api_key="anything",             # proxy handles provider auth; any string works unless you set a master key
    base_url="http://0.0.0.0:4000"  # litellm proxy endpoint
)

response = client.chat.completions.create(
    model="groq-llama3",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)
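
Streaming works through the same OpenAI interface; a minimal sketch reusing the client above:

# Stream tokens as they arrive (standard OpenAI SDK streaming;
# assumes the proxy from Step 2 is running).
stream = client.chat.completions.create(
    model="groq-llama3",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")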

Tool Calling

from openai import OpenAI
client = OpenAI(api_key="anything", base_url="http://0.0.0.0:4000") # set base_url to litellm proxy endpoint

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
      },
    }
  }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
completion = client.chat.completions.create(
  model="groq-llama3",
  messages=messages,
  tools=tools,
  tool_choice="auto"
)

print(completion)
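
What you do with the returned tool call is up to your application. The sketch below shows the standard OpenAI round trip; the get_current_weather implementation is a stub for illustration and not part of LiteLLM or Groq:

import json

def get_current_weather(location, unit="fahrenheit"):
    # Stub for illustration only -- a real app would call a weather service.
    return {"location": location, "temperature": "72", "unit": unit}

message = completion.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_current_weather(**args)
    # Send back the assistant's tool call plus the tool result for a final answer.
    followup = client.chat.completions.create(
        model="groq-llama3",
        messages=messages + [
            message,
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)},
        ],
        tools=tools,
    )
    print(followup.choices[0].message.content)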

Supported Groq API Models

ALL MODELS SUPPORTED.

Just add groq/ to the beginning of the model name.

Example models:

Model Name          Usage
llama3-8b-8192      completion(model="groq/llama3-8b-8192", messages)
llama3-70b-8192     completion(model="groq/llama3-70b-8192", messages)
mixtral-8x7b-32768  completion(model="groq/mixtral-8x7b-32768", messages)
gemma-7b-it         completion(model="groq/gemma-7b-it", messages)
gemma2-9b-it        completion(model="groq/gemma2-9b-it", messages)
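
If you want to skip the proxy and call Groq from Python directly, the same groq/ prefix works with the LiteLLM SDK. A minimal sketch, assuming pip install litellm and GROQ_API_KEY set in your environment:

import litellm

response = litellm.completion(
    model="groq/llama3-8b-8192",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response.choices[0].message.content)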