
AI kit does not support Cross-region inference #8121

Open
rpostulart opened this issue Nov 21, 2024 · 4 comments

Comments

@rpostulart

Environment information

System:
  OS: macOS 14.6.1
  CPU: (16) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  Memory: 94.01 MB / 16.00 GB
  Shell: /bin/zsh
Binaries:
  Node: 22.9.0 - /usr/local/bin/node
  Yarn: undefined - undefined
  npm: 10.8.3 - /usr/local/bin/npm
  pnpm: undefined - undefined
NPM Packages:
  @aws-amplify/auth-construct: 1.5.0
  @aws-amplify/backend: 1.8.0
  @aws-amplify/backend-auth: 1.4.1
  @aws-amplify/backend-cli: 1.4.2
  @aws-amplify/backend-data: 1.2.1
  @aws-amplify/backend-deployer: 1.1.9
  @aws-amplify/backend-function: 1.8.0
  @aws-amplify/backend-output-schemas: 1.4.0
  @aws-amplify/backend-output-storage: 1.1.3
  @aws-amplify/backend-secret: 1.1.5
  @aws-amplify/backend-storage: 1.2.3
  @aws-amplify/cli-core: 1.2.0
  @aws-amplify/client-config: 1.5.2
  @aws-amplify/deployed-backend-client: 1.4.2
  @aws-amplify/form-generator: 1.0.3
  @aws-amplify/model-generator: 1.0.9
  @aws-amplify/platform-core: 1.2.1
  @aws-amplify/plugin-types: 1.5.0
  @aws-amplify/sandbox: 1.2.6
  @aws-amplify/schema-generator: 1.2.5
  aws-amplify: 6.9.0
  aws-cdk: 2.169.0
  aws-cdk-lib: 2.169.0
  typescript: 5.6.3
No AWS environment variables
No CDK environment variables

Describe the bug

In the schema I can only define the model like this:

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: a.ai.model("Claude 3.5 Sonnet"),
      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});

But I get an error in my region because it only allows access to Claude 3.5 via an inference profile. This is the error in the Lambda:

{
    "timestamp": "2024-11-21T22:03:13.695Z",
    "level": "ERROR",
    "requestId": "37377ab9-f496-4cc3-b5a1-70115062ea0f",
    "message": "Failed to handle conversation turn event, currentMessageId=2ef0c357-e7ab-45b6-92be-71b855c65597, conversationId=411d1032-0704-4384-8952-5db49f27e5b1 ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.\n    at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)\n    at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18\n    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22\n    at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)\n    at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)\n    at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7) {\n  '$fault': 'client',\n  '$metadata': {\n    httpStatusCode: 400,\n    requestId: 'dbb16273-798b-4bad-946b-ac30835b2c0f',\n    extendedRequestId: undefined,\n    cfId: undefined,\n    attempts: 1,\n    totalRetryDelay: 0\n  }\n}",
    "errorType": "ValidationException",
    "errorMessage": "Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
    "stackTrace": [
        "ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
        "    at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)",
        "    at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18",
        "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38",
        "    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22",
        "    at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)",
        "    at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)",
        "    at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7)"
    ]
}

Reproduction steps

Watched the CloudWatch logs for errors, because I didn't get a response in the front end.

@rpostulart
Author

rpostulart commented Nov 22, 2024

It's unclear in the docs, but you can solve it by putting the inference profile ID in the resourcePath:

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: "eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
      },

      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});

but you also need to update the resource policy of the Lambda that invokes Bedrock:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Action": [
				"bedrock:InvokeModel",
				"bedrock:InvokeModelWithResponseStream"
			],
			"Resource": [
				"arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
				"arn:aws:bedrock:*:114244416074:inference-profile/anthropic.claude-3-5-sonnet-20240620-v1:0",
				"arn:aws:bedrock:*:114244416074:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0"
			],
			"Effect": "Allow"
		}
	]
}

It would be great if this could be adjusted.
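As an aside, the "eu." prefix in the resourcePath above is the geographic prefix Bedrock uses for cross-region inference profile IDs. A minimal sketch of that mapping, assuming the standard "us." / "eu." / "apac." prefixes (the helper name `inferenceProfileId` is hypothetical, not part of Amplify):

```typescript
// Hypothetical helper: derive a cross-region inference profile ID for a
// foundation model from the deployment region. Bedrock prefixes profile IDs
// with a geo code such as "us.", "eu.", or "apac." (assumption based on the
// documented cross-region inference profile naming scheme).
function geoPrefix(region: string): string {
  if (region.startsWith("us-")) return "us";
  if (region.startsWith("eu-")) return "eu";
  if (region.startsWith("ap-")) return "apac";
  throw new Error(`No known cross-region inference prefix for ${region}`);
}

function inferenceProfileId(region: string, modelId: string): string {
  return `${geoPrefix(region)}.${modelId}`;
}
```

For example, `inferenceProfileId("eu-west-1", "anthropic.claude-3-5-sonnet-20240620-v1:0")` yields the "eu."-prefixed ID used in the resourcePath above.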

@ykethan
Member

ykethan commented Nov 22, 2024

Hey, thanks for the feedback and information on how this can be solved. Transferring the issue to the documentation repository for updates.

@ykethan ykethan transferred this issue from aws-amplify/amplify-backend Nov 22, 2024
@atierian
Member

Thanks for opening this @rpostulart. We'll add an example to the docs. In the meantime, here's an example of how you can do this directly within your Amplify backend.

Add a custom conversation handler

Add the @aws-amplify/backend-ai package

npm install @aws-amplify/backend-ai

In amplify/data/resource.ts

import { a } from "@aws-amplify/backend";
import { defineConversationHandlerFunction } from "@aws-amplify/backend-ai/conversation";

export const model = 'anthropic.claude-3-5-sonnet-20240620-v1:0';
export const crossRegionModel = `eu.${model}`;

export const conversationHandler = defineConversationHandlerFunction({
  entry: "./conversationHandler.ts",
  name: "conversationHandler",
  models: [{ modelId: crossRegionModel }],
});

const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: crossRegionModel,
      },
      systemPrompt: 'You are a helpful assistant.',
      handler: conversationHandler,
    })
    .authorization((allow) => allow.owner()),
});

Create a new file amplify/data/conversationHandler.ts

import { handleConversationTurnEvent } from '@aws-amplify/backend-ai/conversation/runtime';

export const handler = handleConversationTurnEvent;

In amplify/backend.ts

import { defineBackend } from "@aws-amplify/backend";
import { auth } from "./auth/resource";
import { data, conversationHandler, crossRegionModel, model } from "./data/resource";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

const backend = defineBackend({
  auth,
  data,
  conversationHandler,
});

// This policy statement assumes that you're deploying in `eu-west-1`. 
// If that's not the case, adjust the resources block in the policy statements accordingly.
backend.conversationHandler.resources.lambda.addToRolePolicy(
  new PolicyStatement({
    resources: [
      `arn:aws:bedrock:eu-west-1:[account-number]:inference-profile/${crossRegionModel}`,
      `arn:aws:bedrock:eu-west-1::foundation-model/${model}`,
      `arn:aws:bedrock:eu-west-3::foundation-model/${model}`,
      `arn:aws:bedrock:eu-central-1::foundation-model/${model}`,
    ],
    actions: [
      'bedrock:InvokeModelWithResponseStream'
    ],
  })
);
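The resource list above follows a regular pattern: one inference-profile ARN in the deployment region (with an account ID), plus one foundation-model ARN per region the profile can route to (with no account ID component). A small sketch of that construction, assuming the ARN shapes shown in the policy above (`bedrockResourceArns` is a hypothetical helper, not an Amplify or CDK API):

```typescript
// Hypothetical helper: build the Bedrock resource ARNs for a cross-region
// inference profile plus its per-region foundation models, matching the
// shapes used in the PolicyStatement above.
function bedrockResourceArns(
  account: string,
  profileRegion: string,
  modelRegions: string[],
  profileId: string,
  modelId: string
): string[] {
  return [
    // Inference-profile ARNs include the account ID.
    `arn:aws:bedrock:${profileRegion}:${account}:inference-profile/${profileId}`,
    // Foundation-model ARNs have an empty account ID component.
    ...modelRegions.map(
      (region) => `arn:aws:bedrock:${region}::foundation-model/${modelId}`
    ),
  ];
}
```

Passing the same regions as the policy above (`eu-west-1`, `eu-west-3`, `eu-central-1`) reproduces its four-entry resources list.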

@rpostulart
Author

OK, this is great. I will close the issue, trusting your commitment to update the docs :)
