Llama Fast API Server

Installation

FastAPI

Windows

pip install fastapi uvicorn websockets huggingface_hub

macOS

pip3 install fastapi uvicorn websockets huggingface_hub
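
To confirm that the packages above landed in the active Python environment, a minimal sketch along these lines may help. The helper name `missing_packages` is hypothetical and not part of this repository; it uses only the standard-library `importlib.util.find_spec` and assumes each package's import name matches its pip name.

```python
from importlib.util import find_spec

# Import names for the packages installed above
# (assumed to be identical to the pip package names).
REQUIRED = ["fastapi", "uvicorn", "websockets", "huggingface_hub"]


def missing_packages(names):
    """Return the entries of `names` that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All FastAPI dependencies are importable.")
```

Run it with the same interpreter that `pip` (or `pip3`) installed into; a package reported as missing usually means the install went to a different environment.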

Llama-cpp-python

Windows + CUDA (cu121 wheels, built for CUDA 12.1)

pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121

Windows + CPU

pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

macOS + Metal

pip3 install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal
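
The three commands above differ only in the wheel index they point at. As a rough sketch of that mapping, the hypothetical helper below (`install_command` is an illustrative name, not part of this repository) returns the matching command for a given platform. It covers only the platforms listed in this README and treats anything else as unsupported.

```python
import platform

# Common prefix of the wheel indexes used in the commands above.
BASE = "https://abetlen.github.io/llama-cpp-python/whl/"


def install_command(system, use_cuda=False):
    """Map an OS name, as returned by platform.system(), to a command from this README."""
    if system == "Windows":
        variant = "cu121" if use_cuda else "cpu"
        return f"pip install llama-cpp-python --extra-index-url {BASE}{variant}"
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return f"pip3 install llama-cpp-python --extra-index-url {BASE}metal"
    raise ValueError(f"No command listed in this README for {system!r}")


if __name__ == "__main__":
    try:
        print(install_command(platform.system()))
    except ValueError as err:
        print(err)
```

The `use_cuda` flag is set by the caller; the sketch does not try to detect a GPU or an installed CUDA toolkit.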