How can I add prompt caching directive? #38

Open
chew-z opened this issue Aug 24, 2024 · 7 comments

chew-z commented Aug 24, 2024

I know how to add the beta header, but I can't add the cache_control directive to a Message...

Prompt Caching

like ...

      {
        "type": "text", 
        "text": "<the entire contents of Pride and Prejudice>",
        "cache_control": {"type": "ephemeral"}
      }
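
For context, that block shape can be produced from Go by hand while library support lands. A minimal sketch using ad hoc types (these are not anthropic-go's types, just the raw wire shape):

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // CacheControl mirrors the cache_control object from Anthropic's prompt caching beta.
    type CacheControl struct {
        Type string `json:"type"` // currently "ephemeral"
    }

    // TextBlock is an ad hoc content block; cache_control is omitted when nil.
    type TextBlock struct {
        Type         string        `json:"type"`
        Text         string        `json:"text"`
        CacheControl *CacheControl `json:"cache_control,omitempty"`
    }

    func main() {
        block := TextBlock{
            Type:         "text",
            Text:         "<the entire contents of Pride and Prejudice>",
            CacheControl: &CacheControl{Type: "ephemeral"},
        }
        out, _ := json.MarshalIndent(block, "", "  ")
        fmt.Println(string(out)) // prints the JSON shown above
    }
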
@madebywelch (Owner)

Hi @chew-z, we haven't added support for prompt caching yet. I might be able to get to it this weekend.

kb-sp (Contributor) commented Sep 17, 2024

It looks like the System object is now an array of Messages, so this needs either a major version bump or a new "MessageRequest2" with "everything" duplicated... unless I'm misreading their docs.

I'll probably fork and just replace SystemPrompt since I don't care about backwards compatibility, but if a major version bump is the chosen path, I could submit a PR if no one beats me to it.

@madebywelch (Owner)

@kb-sp I think both are accepted. In Anthropic's API reference it seems you can pass either a string or an array of objects. I believe the object-array form carries the prompt-caching functionality, which we haven't implemented yet. If you want to take a stab at that and bump the major version, I'm all for it.
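
For reference, the two accepted forms of the system field look roughly like this on the wire; a minimal Go sketch with illustrative types (not anthropic-go's):

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // request is illustrative only; System marshals either as a plain string
    // or as an array of text blocks.
    type request struct {
        System any `json:"system"`
    }

    func main() {
        // Form 1: system as a plain string (no per-block cache control).
        a, _ := json.Marshal(request{System: "You are a helpful assistant."})
        fmt.Println(string(a))

        // Form 2: system as an array of blocks; this is the form that can
        // carry cache_control for prompt caching.
        b, _ := json.Marshal(request{System: []map[string]any{
            {"type": "text", "text": "You are a helpful assistant."},
            {"type": "text", "text": "<large reference text>",
                "cache_control": map[string]string{"type": "ephemeral"}},
        }})
        fmt.Println(string(b))
    }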

@madebywelch (Owner)

https://github.com/madebywelch/anthropic-go/tree/v4.0.0 contains the cache header and config update.

kb-sp (Contributor) commented Sep 30, 2024

Can you update the go.mod for v4.0.0 so the module identifies itself as /v4? @madebywelch
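
For context, Go modules require major versions v2 and above to carry the version suffix in the module path, so the go.mod on the v4.0.0 tag would need to read roughly as follows (the go directive version here is illustrative):

    module github.com/madebywelch/anthropic-go/v4

    go 1.21

Without the /v4 suffix, the toolchain rejects the v4.0.0 tag because the declared module path doesn't match the major version, and consumers cannot import the /v4 path.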

kb-sp (Contributor) commented Sep 30, 2024

Also, as I'm prepping for this: implementing cache control on "user" messages becomes awkward when you have conversation history.

For a RAG case, ideally I'd like to store the content as an element in the SystemPrompt (as in Anthropic's docs on prompt caching) so that user/assistant pairs stay consistent.

For a first prompt, I have:

  • SystemPrompt = "Coaching for the LLM's role" (like an agent spec)
  • User = []Block{
    • ["...text file contents..."]
    • [Initial prompt] < CACHED >

But for a conversation, I need:

  • SystemPrompt = ditto
  • User = []Block{
    • "...text file contents..."
    • [User[0]] (first "user" from conversation history)
      • The hope is that [User[0]] is identical to [Initial prompt] and will trigger the cache.
  • Assistant = []Block{
    • [Assistant[0]] (first "assistant" from conversation history)
  • User = []Block{
    • [New prompt]

I think the above works (can't test caching with v4 yet).

But the SystemPrompt approach seems a little more straightforward and predictable:

  • SystemPrompt = []Block{
    • [Coaching]
    • ["...text file contents..."] < CACHED >
  • User = []Block{
    • [Initial prompt]

Then:

  • SystemPrompt = ditto
  • User = []Block{
    • [User[0]] (aka Initial prompt)
  • Assistant = []Block{
    • [Assistant[0]]
  • User = []Block{
    • [New prompt]

In other words, I don't have to hope that [Initial prompt] and [User[0]] from the history happen to be identical. Does that make sense?
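
For concreteness, the second arrangement maps onto the raw Messages API roughly as follows; an illustrative Go sketch of the wire shape (not anthropic-go's types), where the cached block lives in system and is identical on every turn. The prompt-caching beta header still has to be set on the request, as noted at the top of the thread.

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Illustrative request-body types, not the library's API.
    type block struct {
        Type         string            `json:"type"`
        Text         string            `json:"text"`
        CacheControl map[string]string `json:"cache_control,omitempty"`
    }

    type message struct {
        Role    string `json:"role"`
        Content string `json:"content"`
    }

    type request struct {
        Model     string    `json:"model"`
        MaxTokens int       `json:"max_tokens"`
        System    []block   `json:"system"`
        Messages  []message `json:"messages"`
    }

    func main() {
        system := []block{
            {Type: "text", Text: "Coaching for the LLM's role"},
            {Type: "text", Text: "...text file contents...",
                CacheControl: map[string]string{"type": "ephemeral"}}, // < CACHED >
        }

        // First turn: just the initial prompt.
        first := request{
            Model: "claude-3-5-sonnet-20240620", MaxTokens: 1024,
            System:   system,
            Messages: []message{{Role: "user", Content: "Initial prompt"}},
        }

        // Follow-up turn: history plus the new prompt. The system slice (and
        // its cached block) is unchanged, so the cached prefix matches without
        // relying on User[0] being byte-for-byte equal to the initial prompt.
        followUp := request{
            Model: "claude-3-5-sonnet-20240620", MaxTokens: 1024,
            System: system,
            Messages: []message{
                {Role: "user", Content: "Initial prompt"},
                {Role: "assistant", Content: "..."},
                {Role: "user", Content: "New prompt"},
            },
        }

        for _, r := range []request{first, followUp} {
            out, _ := json.MarshalIndent(r, "", "  ")
            fmt.Println(string(out))
        }
    }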

madebywelch reopened this Sep 30, 2024
@madebywelch (Owner)

@kb-sp Got it. Good idea. I will upgrade SystemPrompt into an array of blocks instead of a string.
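
A hypothetical sketch of what that change could look like on the Go side; the names below are invented for illustration and are not the library's actual v4 definitions:

    // Hypothetical only: SystemPrompt goes from a plain string to a slice of
    // blocks, each of which may opt into caching.
    package anthropic

    type CacheControl struct {
        Type string `json:"type"` // "ephemeral"
    }

    type SystemBlock struct {
        Type         string        `json:"type"`
        Text         string        `json:"text"`
        CacheControl *CacheControl `json:"cache_control,omitempty"`
    }

    type MessageRequest struct {
        // Previously: System string
        System []SystemBlock `json:"system,omitempty"`
        // ...remaining fields unchanged
    }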
