From 08f511525ddc6e18e6c74a9e93c8bc7f3dc05a30 Mon Sep 17 00:00:00 2001
From: Sander Niels Hummerich <64867257+hummerichsander@users.noreply.github.com>
Date: Wed, 25 Sep 2024 12:49:19 +0000
Subject: [PATCH] Refactor README to include available endpoints and
 environment variables

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index c335298..c3ceb84 100644
--- a/README.md
+++ b/README.md
@@ -2,9 +2,11 @@
 This is a simple fastapi based server mock that implements the OpenAI API.
 
 Available endpoints:
+
 - /v1/chat/completion
 
 Instead of running a LLM model to generate completions, it simply returns a response generated by surrogate models. Available surrogate models are:
+
 - "yes_no": returns random "Yes" or "No" response
 - "ja_nein": returns random "Ja" or "Nein" response
 - "lorem_ipsum": returns random "lorem ipsum" text
@@ -15,6 +17,7 @@ docker pull ghcr.io/hummerichsander/openai_api_server_mock:v ... # replace ... w
 ```
 
 Environment variables:
+
 - `CONTEXT_SIZE`: context size for the model (default: 4096)
 - `SLEEP_TIME`: sleep time in seconds before returning the response (default: 0)
 - `MAX_CONCURRENT_REQUESTS`: maximum number of concurrent requests (default: 10^9)
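For reference, the endpoint documented in the patched README can be exercised with a short client script. The sketch below is illustrative only: the host and port (`localhost:8000`), the request body, and the response shape (mirroring OpenAI's `choices[0].message.content`) are assumptions not confirmed by the patch; the path `/v1/chat/completion` and the surrogate model name `yes_no` are taken verbatim from the README.

```python
# Minimal sketch of calling the mock's documented endpoint. Assumptions:
# the server is reachable at http://localhost:8000, accepts an OpenAI-style
# chat request body, and returns an OpenAI-style response object. Only the
# path /v1/chat/completion and the model name "yes_no" come from the README.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completion",
    json={
        "model": "yes_no",  # surrogate model: random "Yes" or "No"
        "messages": [{"role": "user", "content": "Is this a mock server?"}],
    },
    timeout=30,  # generous, since SLEEP_TIME can delay the response
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The environment variables listed in the README would be passed to the container in the usual way, e.g. `docker run -e SLEEP_TIME=2 ...`, so a nonzero `SLEEP_TIME` is worth accounting for in the client timeout above.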