docs(models-http-api): add vllm and deepseek completion kind (#3354)
* docs(models-http-api): add vllm and deepseek completion kind

* docs: add prompt_template intro

* docs: fix prompt template format
zwpaper authored Nov 1, 2024
1 parent 9304869 commit f8f4e32
Showing 3 changed files with 71 additions and 10 deletions.
24 changes: 24 additions & 0 deletions website/docs/references/models-http-api/deepseek.md
@@ -0,0 +1,24 @@
# DeepSeek

[DeepSeek](https://www.deepseek.com/) is a platform that offers a suite of AI models. Tabby supports DeepSeek's models for both code completion and chat.

DeepSeek provides OpenAI-compatible APIs, so the `openai/chat` kind can be used directly for chat.
For completion, however, the implementation differs from OpenAI's, so the `deepseek/completion` kind should be used.

Below is an example:

```toml title="~/.tabby/config.toml"
# Chat model
[model.chat.http]
kind = "openai/chat"
model_name = "your_model"
api_endpoint = "https://api.deepseek.com/chat"
api_key = "secret-api-key"

# Completion model
[model.completion.http]
kind = "deepseek/completion"
model_name = "your_model"
api_endpoint = "https://api.deepseek.com/beta"
api_key = "secret-api-key"
```
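
To sanity-check the completion endpoint before wiring it into Tabby, a request along the following lines can be used. This is a hedged sketch, not part of the commit: the path joins the `api_endpoint` above with the OpenAI-spec `/completions` route, and the `deepseek-chat` model name, prompt, and token limit are illustrative assumptions.

```bash
# Illustrative check of the beta completion endpoint
# (model name, prompt, and max_tokens are placeholders).
curl https://api.deepseek.com/beta/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer secret-api-key" \
  -d '{"model": "deepseek-chat", "prompt": "def fib(n):", "max_tokens": 32}'
```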
25 changes: 15 additions & 10 deletions website/docs/references/models-http-api/openai.md
@@ -1,19 +1,17 @@
# OpenAI

-OpenAI is a leading AI company that has developed a range of language models. Tabby supports OpenAI's models for chat and embedding tasks.
+OpenAI is a leading AI company that has developed an extensive range of language models.
+Tabby supports OpenAI's API specifications for chat, completion, and embedding tasks.

-Tabby also supports its legacy `/v1/completions` API for code completion, although **OpenAI itself no longer supports it**; it is still the API offered by some other vendors, such as (vLLM, Nvidia NIM, LocalAI, ...).
+The OpenAI API is widely used and is also provided by other vendors,
+such as vLLM, Nvidia NIM, and LocalAI.

-Below is an example configuration:
+OpenAI has designated its `/v1/completions` API for code completion as legacy,
+and **OpenAI itself no longer supports it**.

-```toml title="~/.tabby/config.toml"
-# Completion model
-[model.completion.http]
-kind = "openai/completion"
-model_name = "your_model"
-api_endpoint = "https://url_to_your_backend_or_service"
-api_key = "secret-api-key"
+Tabby continues to support the OpenAI Completion API specifications due to its widespread usage.

+```toml title="~/.tabby/config.toml"
# Chat model
[model.chat.http]
kind = "openai/chat"
@@ -27,4 +25,11 @@ kind = "openai/embedding"
model_name = "text-embedding-3-small"
api_endpoint = "https://api.openai.com/v1"
api_key = "secret-api-key"
+
+# Completion model
+[model.completion.http]
+kind = "openai/completion"
+model_name = "your_model"
+api_endpoint = "https://url_to_your_backend_or_service"
+api_key = "secret-api-key"
```
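
For reference, the legacy completion call that the `openai/completion` kind targets has this general shape under the OpenAI specification. A hedged sketch: the URL is the placeholder backend from the config above, and the prompt and token limit are illustrative.

```bash
# OpenAI-spec legacy completion request
# (endpoint, model, and prompt are placeholders).
curl https://url_to_your_backend_or_service/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer secret-api-key" \
  -d '{"model": "your_model", "prompt": "def hello():", "max_tokens": 16}'
```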
32 changes: 32 additions & 0 deletions website/docs/references/models-http-api/vllm.md
@@ -0,0 +1,32 @@
# vLLM

[vLLM](https://docs.vllm.ai/en/stable/) is a fast and user-friendly library for LLM inference and serving.

vLLM offers an `OpenAI Compatible Server`, enabling us to use the OpenAI kinds directly for chat and embedding.
For completion, however, the implementation differs, so the `vllm/completion` kind should be used, together with a `prompt_template` suited to the specific model being served. For context, such a server is commonly launched as shown in the sketch below.
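
This launch command is a hedged sketch, not part of the commit: the CodeLlama model name and port are assumptions, so substitute whatever you actually serve. With defaults like these, the `api_endpoint` values in the config below would typically be `http://localhost:8000/v1`.

```bash
# Start vLLM's OpenAI-compatible server (model name and port are placeholders).
vllm serve codellama/CodeLlama-7b-hf --port 8000
# Older vLLM releases expose the same server via the module entry point:
#   python -m vllm.entrypoints.openai.api_server --model codellama/CodeLlama-7b-hf --port 8000
```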

Below is an example:

```toml title="~/.tabby/config.toml"
# Chat model
[model.chat.http]
kind = "openai/chat"
model_name = "your_model"
api_endpoint = "https://url_to_your_backend_or_service"
api_key = "secret-api-key"

# Embedding model
[model.embedding.http]
kind = "openai/embedding"
model_name = "your_model"
api_endpoint = "https://url_to_your_backend_or_service"
api_key = "secret-api-key"

# Completion model
[model.completion.http]
kind = "vllm/completion"
model_name = "your_model"
api_endpoint = "https://url_to_your_backend_or_service"
api_key = "secret-api-key"
prompt_template = "<PRE> {prefix} <SUF>{suffix} <MID>" # Example prompt template for the CodeLlama model series.
```
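
The `prompt_template` must match the fill-in-the-middle tokens of the model actually being served; the CodeLlama template above will not work for other families. The alternatives below are illustrations only, not from this commit; verify the special tokens against your model's tokenizer before relying on them.

```toml
# Illustrative alternatives; pick the one matching your model family.

# StarCoder family:
# prompt_template = "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# DeepSeek-Coder base models:
prompt_template = "<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"
```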
