AI Gateway

Reliably route to 100+ LLMs with 1 fast & friendly API

Gateway streamlines requests to 100+ open & closed source models with a unified API. It is also production-ready, with support for caching, fallbacks, retries, timeouts, and load balancing, and it can be edge-deployed for minimum latency.

✅ Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
✅ Load balance across multiple models, providers, and keys
✅ Fallbacks make sure your app stays resilient
✅ Automatic Retries with exponential backoff come by default
✅ Configurable Request Timeouts to easily handle unresponsive LLM requests
✅ Multimodal to support routing between Vision, TTS, STT, Image Gen, and more models
✅ Plug-in middleware as needed
✅ Battle tested over 300B tokens
✅ Enterprise-ready for enhanced security, scale, and custom deployments

How to Run Gateway?

Run it Locally for complete control & customization
Hosted by Portkey for quick setup without infrastructure concerns
Enterprise On-Prem for advanced features and dedicated support

Compatible with OpenAI API & SDK

Gateway is fully compatible with the OpenAI API & SDK, and extends them to call 100+ LLMs reliably. To use the Gateway through the OpenAI SDK, you only need to update the client's base URL (base_url in Python, baseURL in Node) and pass the provider name in headers.

To use through Portkey, set the base URL to: https://api.portkey.ai/v1
To run locally, set it to: http://localhost:8787/v1

Run it Locally

Run the following command in your terminal and it will spin up the Gateway on your local system:

npx @portkey-ai/gateway

Your AI Gateway is now running on http://localhost:8787 🚀

Gateway is also edge-deployment ready. Explore Cloudflare, Docker, AWS etc. deployment guides here.

Gateway Hosted by Portkey

This same open-source Gateway powers the Portkey API, which processes billions of tokens daily and is in production with companies like Postman, Haptik, Turing, MultiOn, SiteGPT, and more.

Sign up for the free developer plan (10K requests/month) here or discuss here for enterprise deployments.

How to Use Gateway?

The examples below use the Gateway to make an Anthropic request in the OpenAI spec. The same pattern applies to all the other providers.

Python

pip install portkey-ai

While instantiating your OpenAI client,

Set base_url to http://localhost:8787/v1 (or to PORTKEY_GATEWAY_URL from the Portkey SDK if you're using the hosted version)
Pass the provider name in the default_headers param (here we use the createHeaders method from the Portkey SDK to build the full set of headers)

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

gateway = OpenAI(
    api_key="ANTHROPIC_API_KEY",
    base_url=PORTKEY_GATEWAY_URL, # Or http://localhost:8787/v1 if you are running locally
    default_headers=createHeaders(
        provider="anthropic",
        api_key="PORTKEY_API_KEY" # Grab from https://app.portkey.ai; not needed if you are running locally
    )
)

chat_complete = gateway.chat.completions.create(
    model="claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "What's a fractal?"}],
    max_tokens=512
)

If you want to run the Gateway locally, don't forget to run npx @portkey-ai/gateway in your terminal before this! Otherwise just sign up on Portkey and keep your Portkey API Key handy.

Node

Works the same as in Python. Add baseURL and defaultHeaders while instantiating your OpenAI client and pass the relevant provider details.

npm install portkey-ai

import OpenAI from 'openai';
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'
 
const gateway = new OpenAI({
    apiKey: "ANTHROPIC_API_KEY",
    baseURL: PORTKEY_GATEWAY_URL,
    defaultHeaders: createHeaders({
        provider: "anthropic",
        apiKey: "PORTKEY_API_KEY"
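    })
});

async function main(){
  // Use the `gateway` client created above
  const chatCompletion = await gateway.chat.completions.create({
      messages: [{ role: 'user', content: 'Who are you?' }],
      model: 'claude-3-sonnet-20240229',
  });
  // Print the assistant's reply
  console.log(chatCompletion.choices[0].message.content);
}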

main()

REST

In a typical OpenAI REST request,

Change the request URL to http://localhost:8787/v1 (or https://api.portkey.ai/v1 if you're using the hosted version)
Pass an additional x-portkey-provider header with the provider's name
Change the model name to a Claude 3 model

curl 'http://localhost:8787/v1/chat/completions' \
  -H 'x-portkey-provider: anthropic' \
  -H "Authorization: Bearer $ANTHROPIC_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{ "model": "claude-3-haiku-20240307", "messages": [{"role": "user","content": "Hi"}] }'

Similarly for other providers, change the provider & model to their respective names.
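
If you prefer to call the Gateway from Python without any SDK, the curl request above can be sketched with the standard library alone. The helper name build_gateway_request is hypothetical and not part of any Portkey package; the URL, header names, and payload shape are copied from the curl example.

```python
import json
import urllib.request

# Local Gateway endpoint, as used in the curl example
GATEWAY_URL = "http://localhost:8787/v1/chat/completions"


def build_gateway_request(provider, api_key, model, prompt):
    """Build (but do not send) an OpenAI-style chat request routed via the Gateway.

    Hypothetical helper for illustration only.
    """
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-portkey-provider": provider,           # tells the Gateway where to route
            "Authorization": f"Bearer {api_key}",     # the provider's own API key
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With the Gateway running locally, passing the returned object to urllib.request.urlopen would send the request.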

Gateway Docs

Head over to Portkey docs for detailed guides & cookbooks on more provider integrations.

Supported Providers

Provider	Support	Stream	Supported Endpoints
OpenAI	✅	✅	`/completions`, `/chat/completions`, `/embeddings`, `/assistants`, `/threads`, `/runs`, `/images/generations`, `/audio/*`
Azure OpenAI	✅	✅	`/completions`, `/chat/completions`, `/embeddings`
Anyscale	✅	✅	`/chat/completions`
Google Gemini & Palm	✅	✅	`/generateMessage`, `/generateText`, `/embedText`
Anthropic	✅	✅	`/messages`, `/complete`
Cohere	✅	✅	`/generate`, `/embed`, `/rerank`
Together AI	✅	✅	`/chat/completions`, `/completions`, `/inference`
Perplexity	✅	✅	`/chat/completions`
Mistral	✅	✅	`/chat/completions`, `/embeddings`
Nomic	✅	✅	`/embeddings`
AI21	✅	✅	`/complete`, `/chat`, `/embed`
Stability AI	✅	✅	`/generation/{engine_id}/text-to-image`
DeepInfra	✅	✅	`/inference`
Ollama	✅	✅	`/chat/completions`

View the complete list of 100+ supported models here

Reliability Features

Fallback

This feature allows you to specify a prioritized list of LLMs. If the primary LLM fails, Portkey will automatically fall back to the next LLM in the list to ensure reliability.
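
To make the idea concrete, here is a minimal Python sketch of the fallback pattern itself. It is not the Gateway's implementation: call_with_fallback and the send callable are hypothetical names, and the sketch treats any exception as a failure.

```python
def call_with_fallback(targets, send):
    """Try each target in priority order and return the first successful result.

    `targets` is an ordered list of target dicts; `send` is any callable that
    performs the request for one target and raises an exception on failure.
    Both are illustrative assumptions.
    """
    errors = []
    for target in targets:
        try:
            return send(target)
        except Exception as exc:
            # Remember why this target failed, then move on to the next one
            errors.append((target.get("provider"), repr(exc)))
    raise RuntimeError(f"all {len(targets)} targets failed: {errors}")
```

The list order is the priority order, so the primary LLM goes first.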

Automatic Retries

AI Gateway can automatically retry failed requests up to 5 times. A backoff strategy spaces out retry attempts to prevent network overload.
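
As a rough illustration of retrying with backoff, the sketch below retries a failing call and doubles the wait between attempts. The one-second base delay and the doubling factor are assumptions made for the example; the Gateway's actual schedule is not specified here. The names backoff_delays and retry are hypothetical.

```python
import time


def backoff_delays(attempts, base=1.0, factor=2.0):
    """Return the wait (in seconds) before each retry: base, base*factor, ..."""
    return [base * factor ** i for i in range(attempts)]


def retry(send, attempts=5, base=1.0, factor=2.0, sleep=time.sleep):
    """Call `send`; on failure, retry up to `attempts` more times with backoff."""
    delays = backoff_delays(attempts, base, factor)
    for attempt in range(attempts + 1):  # one initial try plus `attempts` retries
        try:
            return send()
        except Exception:
            if attempt == attempts:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
```

Passing a custom sleep function makes it easy to exercise the logic without actually waiting.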

Load Balancing

Distribute load effectively across multiple API keys or providers based on custom weights to ensure high availability and optimal performance.
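
Weighted distribution can be pictured as a weighted random choice over the targets. The following is a sketch of that idea only, assuming each target carries a numeric weight as in the Config examples in this README; pick_target is a hypothetical name and the Gateway may select targets differently.

```python
import random


def pick_target(targets, rng=random):
    """Pick one target at random, with probability proportional to its weight.

    Targets without a "weight" key default to 1 (an assumption of this sketch).
    """
    weights = [t.get("weight", 1) for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]
```

With three targets of weight 1 each, every target is expected to receive about a third of the requests.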

Request Timeouts

Manage unruly LLMs & latencies by setting up granular request timeouts, allowing automatic termination of requests that exceed a specified duration.
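
The effect of a request timeout can be sketched in Python with asyncio.wait_for, which cancels the awaited call once the deadline passes. This is an illustration of the concept rather than Gateway code; with_timeout is a hypothetical helper, and it takes milliseconds to match the request_timeout value used in the Config example in this README.

```python
import asyncio


async def with_timeout(coro, timeout_ms):
    """Await `coro`, cancelling it if it runs longer than `timeout_ms` milliseconds."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_ms / 1000)
    except asyncio.TimeoutError:
        # Normalize to the built-in TimeoutError across Python versions
        raise TimeoutError(f"request exceeded {timeout_ms} ms") from None
```

A caller can catch TimeoutError and then retry or move on to another target.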

Reliability features are configured by passing a Gateway Config (JSON) in the `x-portkey-config` header or via the `config` param in the SDKs.

Example: Setting up Fallback from OpenAI to Anthropic

Write the fallback logic

{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "provider": "openai", "api_key": "OPENAI_API_KEY" },
    { "provider": "anthropic", "api_key": "ANTHROPIC_API_KEY" }
  ]
}

Pass it while making your request

Portkey Gateway will automatically trigger Anthropic if the OpenAI request fails:

REST

curl 'http://localhost:8787/v1/chat/completions' \
  -H 'x-portkey-provider: google' \
  -H "x-portkey-config: $CONFIG" \
  -H "Authorization: Bearer $GOOGLE_AI_STUDIO_KEY" \
  -H 'Content-Type: application/json' \
  -d '{ "model": "gemini-1.5-pro-latest", "messages": [{"role": "user","content": "Hi"}] }'

You can also trigger Fallbacks only on specific status codes by passing an array of status codes with the on_status_codes param in strategy.
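
As a sketch of how such a Config could be assembled and serialized for the `x-portkey-config` header from Python: the helper fallback_config_header is hypothetical, and the 429 status code in the usage note is only an example value. The Config shape follows the fallback example above, with on_status_codes placed inside strategy as described.

```python
import json


def fallback_config_header(targets, on_status_codes=None):
    """Return a headers dict carrying a fallback Config as a JSON string.

    Hypothetical helper; `on_status_codes` is added to `strategy` only if given.
    """
    strategy = {"mode": "fallback"}
    if on_status_codes:
        strategy["on_status_codes"] = list(on_status_codes)
    config = {"strategy": strategy, "targets": targets}
    return {"x-portkey-config": json.dumps(config)}
```

For example, fallback_config_header(targets, on_status_codes=[429]) would produce a header whose Config falls back only when the listed status code is returned.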

Read the full Fallback documentation here.

Example: Load Balance Requests Across 3 Accounts

Write the load balancer Config

{
  "strategy": { "mode": "loadbalance" },
  "targets": [
    { "provider": "openai", "api_key": "ACCOUNT_1_KEY", "weight": 1 },
    { "provider": "openai", "api_key": "ACCOUNT_2_KEY", "weight": 1 },
    { "provider": "openai", "api_key": "ACCOUNT_3_KEY", "weight": 1 }
  ]
}

Pass the Config while instantiating OpenAI client

import OpenAI from 'openai';
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'
 
const gateway = new OpenAI({
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    config: "CONFIG_ID"
  })
});

Read the full Load Balancing documentation here.

Automatic Retries

Similarly, you can write a Config that retries a failed request up to 5 times:

{
    "retry": { "attempts": 5 }
}

Read the full Retries documentation here.

Request Timeouts

Here, a request timeout of 10 seconds (10000 ms) is applied to *all* the targets.

{
  "strategy": { "mode": "fallback" },
  "request_timeout": 10000,
  "targets": [
    { "virtual_key": "open-ai-xxx" },
    { "virtual_key": "azure-open-ai-xxx" }
  ]
}

Read the full Request Timeouts documentation here.

Using Gateway Configs

Here's a guide to using the config object in your request.

Supported SDKs

Language	Supported SDKs
Node.js / JS / TS	Portkey SDK, OpenAI SDK, LangchainJS, LlamaIndex.TS
Python	Portkey SDK, OpenAI SDK, Langchain, LlamaIndex
Go	go-openai
Java	openai-java
Rust	async-openai
Ruby	ruby-openai

Deploying AI Gateway

See the docs on installing the AI Gateway locally or deploying it on popular platforms.

Deploy to Cloudflare Workers
Deploy using Docker
Deploy using Docker Compose
Deploy to Zeabur
Run a Node.js server

Gateway Enterprise Version

Make your AI app more reliable and forward compatible, while ensuring complete data security and privacy.

✅ Secure Key Management - for role-based access control and tracking
✅ Simple & Semantic Caching - to serve repeat queries faster & save costs
✅ Access Control & Inbound Rules - to control which IPs and Geos can connect to your deployments
✅ PII Redaction - to automatically remove sensitive data from your requests to prevent inadvertent exposure
✅ SOC2, ISO, HIPAA, GDPR Compliance - for best security practices
✅ Professional Support - along with feature prioritization

Schedule a call to discuss enterprise deployments

Contributing

The easiest way to contribute is to pick any issue with the good first issue tag 💪. Read the Contributing guidelines here.

Bug Report? File here | Feature Request? File here

Community

Join our growing community around the world, for help, ideas, and discussions on AI.

View our official Blog
Chat live with us on Discord
Follow us on Twitter
Connect with us on LinkedIn
