Update README.md
pandyamarut authored Oct 9, 2024
1 parent b84bf72 commit c3dafc2
Showing 1 changed file with 40 additions and 1 deletion.
41 changes: 40 additions & 1 deletion README.md
@@ -2,7 +2,7 @@

<h1>SGLang Worker</h1>

- 🚀 | SGLang is yet another fast serving framework for large language models and vision language models.
+ 🚀 | SGLang is a fast serving framework for large language models and vision language models.
</div>

## 📖 | Getting Started
@@ -32,6 +32,45 @@ print(run_request.status())
print(run_request.output())
```

### OpenAI compatible API
```python
from openai import OpenAI
import os

# Initialize the OpenAI client with your RunPod API key and endpoint URL
client = OpenAI(
    api_key=os.getenv("RUNPOD_API_KEY"),
    base_url="https://api.runpod.ai/v2/<endpoint_id>/openai/v1",
)
```

`Chat Completions (Non-Streaming)`
```python
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Give two lines on planet Earth."}],
    temperature=0,
    max_tokens=100,
)
print(f"Response: {response}")
```
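Once the call returns, the assistant's text is usually read from the first choice's message. A minimal sketch — the `response` object below is a hand-built stand-in mirroring the OpenAI client's shape, not live output from the endpoint:

```python
from types import SimpleNamespace

# Stand-in for the object returned by client.chat.completions.create above
# (assumption: your real response has the same choices/message/content shape).
response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(
                content="Earth is the third planet from the Sun."
            )
        )
    ]
)

# The reply text lives on the first choice's message.
reply = response.choices[0].message.content
print(reply)
```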

`Chat Completions (Streaming)`
```python
response_stream = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Give two lines on planet Earth."}],
    temperature=0,
    max_tokens=100,
    stream=True,
)
for response in response_stream:
    print(response.choices[0].delta.content or "", end="", flush=True)
```
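If you need the full reply as one string rather than printing deltas as they arrive, you can accumulate the chunks. A sketch with hand-built stand-in chunks shaped like the client's stream events (the real `response_stream` comes from the streaming call above):

```python
from types import SimpleNamespace

def make_chunk(text):
    # Build a stand-in stream event with the same shape as the client's chunks.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )

# Hand-built stream; note a delta's content may be None, hence the `or ""`.
response_stream = [
    make_chunk("Earth "),
    make_chunk("orbits "),
    make_chunk(None),
    make_chunk("the Sun."),
]

full_text = "".join(
    chunk.choices[0].delta.content or "" for chunk in response_stream
)
print(full_text)  # Earth orbits the Sun.
```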



## SGLang Server Configuration
When launching an endpoint, you can configure the SGLang server using environment variables. These variables let you adjust the server's behavior without modifying the code.
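As a sketch, such variables could be supplied when creating the endpoint. The names below are hypothetical placeholders, not the worker's documented settings — check this repository's configuration table for the actual supported variables:

```shell
# Hypothetical variable names for illustration only; the worker's real
# environment variables are listed in this repository's configuration docs.
export MODEL_PATH="meta-llama/Meta-Llama-3-8B-Instruct"
export MAX_TOTAL_TOKENS="4096"
```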
