# LLM Prompt Caching Benchmark

This Python script benchmarks the performance improvement from prompt caching for OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet language models. It measures the latency of API calls with and without prompt caching by requesting a small number of output tokens, which approximates the time to first byte, and then calculates the percentage improvement in latency.
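The core measurement can be sketched as follows. This is a minimal illustration of the timing and percentage-improvement logic, not the actual contents of `benchmark.py`; the helper names and the stubbed API call are assumptions for the example.

```python
import time

def measure_latency(call):
    """Time a single API call, returning elapsed seconds."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start

def percent_improvement(uncached_s, cached_s):
    """Latency reduction from caching, as a percentage of the uncached time."""
    return (uncached_s - cached_s) / uncached_s * 100

# Stand-in for a real client call (e.g. an OpenAI or Anthropic request with a
# small max_tokens value); the sleep simulates time to first byte.
def fake_call(delay):
    return lambda: time.sleep(delay)

uncached = measure_latency(fake_call(0.05))  # first request, cache miss
cached = measure_latency(fake_call(0.02))    # repeated prompt, cache hit
print(f"Improvement: {percent_improvement(uncached, cached):.0f}%")
```

In the real script, the same prompt is sent twice per provider so that the second request can hit the provider-side prompt cache, and the two timings are compared with this formula.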


## Usage

1. Install the required Python packages:

   ```
   pip install -r requirements.txt
   ```

2. Set the API keys for OpenAI and Anthropic in the `benchmark.py` script.

3. Run the script:

   ```
   python benchmark.py
   ```