[Feature]: Context Caching for Vertex AI #6898

Closed
DreamGenX opened this issue Nov 25, 2024 · 6 comments · Fixed by #7773
Labels: enhancement (New feature or request)

Comments

@DreamGenX

The Feature

It looks like Gemini context caching does not work when using Vertex AI.
Doing a cursory search, it looks like this part would need to be implemented to support Vertex AI:

await context_caching_endpoints.async_check_and_create_cache(

Motivation, pitch

Context caching is already supported using the Gemini API and it's a good way to reduce costs.

Twitter / LinkedIn details

No response

DreamGenX added the enhancement (New feature or request) label Nov 25, 2024
krrishdholakia self-assigned this Nov 25, 2024
@codenprogressive

+1
Any updates on enabling context caching for LLMs (Gemini, Claude, etc) available through VertexAI?

@krrishdholakia
Contributor

@codenprogressive just tested claude on vertex ai - it doesn't look like prompt caching is available there yet.

I plan on picking up vertex ai context caching this week

@codenprogressive

Hey @krrishdholakia, indeed Claude on Vertex still doesn't support prompt caching! (there is a feature request: anthropics/anthropic-sdk-python#653)

I think for now we can enable it for Gemini on VertexAI.

@emerzon
Contributor

emerzon commented Dec 18, 2024

AWS has started a preview of prompt caching for Claude: https://pages.awscloud.com/promptcaching-Preview.html
Hope Vertex comes up next

@emerzon
Contributor

emerzon commented Jan 14, 2025

@krrishdholakia
Contributor

@emerzon i believe this ticket is for vertex ai gemini, the vertex ai anthropic prompt caching issue is probably separate. i'm working on it now though, since we shouldn't be passing any extra headers to it.

krrishdholakia added a commit that referenced this issue Jan 15, 2025
rajatvig pushed a commit to rajatvig/litellm that referenced this issue Jan 16, 2025
* build(pyproject.toml): bump uvicorn depedency requirement

Fixes BerriAI#7768

* fix(anthropic/chat/transformation.py): fix is_vertex_request check to actually use optional param passed in

Fixes BerriAI#6898 (comment)

* fix(o1_transformation.py): fix azure o1 'is_o1_model' check to just check for o1 in model string

BerriAI#7743

* test: load vertex creds
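
For readers following the commit message above, here is a minimal sketch of the two kinds of checks it describes. It is not the actual LiteLLM code: `build_extra_headers` and its `prompt_caching` parameter are hypothetical names chosen for illustration, and the `is_o1_model` body is only one plausible reading of "just check for o1 in model string". The idea shown is that an optional `is_vertex_request` flag has to be consulted before adding Anthropic beta headers, so that no extra headers are sent to Vertex AI.

```python
from typing import Dict, Optional


def build_extra_headers(
    prompt_caching: bool,
    is_vertex_request: Optional[bool] = None,
) -> Dict[str, str]:
    """Hypothetical helper: decide which extra headers to send.

    The flag passed by the caller is actually used here; a version that
    ignored it would attach the beta header to Vertex AI requests too.
    """
    headers: Dict[str, str] = {}
    if is_vertex_request:
        # Vertex AI requests get no extra Anthropic headers.
        return headers
    if prompt_caching:
        # Beta header used by Anthropic's direct API for prompt caching.
        headers["anthropic-beta"] = "prompt-caching-2024-07-31"
    return headers


def is_o1_model(model: str) -> bool:
    """Illustrative check: treat any model string containing 'o1' as an o1 model."""
    return "o1" in model
```

With this sketch, `build_extra_headers(True, is_vertex_request=True)` returns an empty dict, while the same call without the flag includes the beta header.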
4 participants