✨ Request call Bedrock API or increased Ollama compute f #6601
@jhpyke @julialawrence
2025-01-27 14:58:31,362 - ERROR - Error raised by bedrock service: An error occurred (ThrottlingException) when calling the InvokeModel operation (reached max retries: 4): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.
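The log above shows InvokeModel giving up after 4 retries. A minimal sketch of client-side exponential backoff with jitter, which can ride out short throttling windows; `ThrottledError` is a stand-in for Bedrock's ThrottlingException, and the callable to retry is whatever wraps the real `invoke_model` call:

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for a throttling exception (e.g. Bedrock's ThrottlingException)."""


def with_backoff(fn, max_retries=4, base_delay=1.0):
    """Call fn, retrying on ThrottledError with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_retries:
                raise
            # Sleep base_delay * 2^attempt, plus up to base_delay of jitter.
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)
```

With real boto3 code, the same effect can also be had by raising the SDK's built-in retry limit via a `botocore.config.Config(retries={...})` object instead of hand-rolling the loop.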
Actions: Further to this, the AP team will be investigating any bottlenecks in the existing Bedrock setup to understand whether the limits being hit are ones we control or are AWS-mandated.
Hello Jake, thank you for letting me know. A 5x increase would be great; however, the DPIA we have in place for this project restricts processing to Ireland or London. Thanks for following up with the AP team.
Describe the feature request.
I am working on a piece of critical analysis which requires making around 350,000 queries (each comprising input tokens plus a prompt) to an LLM, on either Ollama or Bedrock.
Input tokens: 148,946,466 × 2 × 1.3 (number of words, ×2 for the prompt sent with each query, ×1.3 to convert words to tokens)
Output tokens: 148,946,466 words × 1.3 (an overestimate).
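The token estimates above can be reproduced directly; the 1.3 words-to-tokens multiplier is the rough rule of thumb assumed in this issue, not an exact tokenizer count:

```python
WORDS = 148_946_466       # total input words from the issue
WORDS_TO_TOKENS = 1.3     # rough conversion factor (an assumption)

# x2 because a prompt is sent alongside the input for every query.
input_tokens = WORDS * 2 * WORDS_TO_TOKENS
# Output size is taken as roughly equal to input, as an overestimate.
output_tokens = WORDS * WORDS_TO_TOKENS
```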
The pricing for Bedrock is below:
Claude Sonnet total cost: (((148946466 * 1.3) / 1000) * 0.003) + (((148946466 * 1.3) / 1000) * 0.015) = $3485
Claude Haiku: ((((148946466 * 2) * 1.3) / 1000) * 0.00025) + (((148946466 * 1.3) / 1000) * 0.00125) = $338.85
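A quick sketch checking the two cost estimates, using the per-1,000-token prices quoted above. Note that, as written in the issue, the Sonnet input term omits the ×2 prompt factor while the Haiku input term includes it; both formulas are reproduced as given:

```python
WORDS = 148_946_466
TOKENS = WORDS * 1.3  # words-to-tokens conversion used throughout the issue

# Per-1,000-token prices quoted in the issue (input, output).
SONNET_IN, SONNET_OUT = 0.003, 0.015
HAIKU_IN, HAIKU_OUT = 0.00025, 0.00125

# Sonnet: input priced without the x2 prompt factor (as in the issue).
sonnet = (TOKENS / 1000) * SONNET_IN + (TOKENS / 1000) * SONNET_OUT

# Haiku: input priced with the x2 prompt factor (as in the issue).
haiku = (TOKENS * 2 / 1000) * HAIKU_IN + (TOKENS / 1000) * HAIKU_OUT
```

Both reproduce the quoted totals of roughly $3485 and $338.85.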
Describe the context.
No response
Value / Purpose
This analysis is critical for the delivery of a report.
User Types
No response