Deploy RAG/AI App To AWS

Getting Started

Configure AWS

You need an AWS account and the AWS CLI set up on your machine. You'll also need Bedrock enabled on AWS, with model access granted for Claude (or whichever model you want to use).

Update .env File with AWS Credentials

Create a file named .env in image/. Do NOT commit this file to Git. It should contain something like this:

AWS_ACCESS_KEY_ID=XXXXX
AWS_SECRET_ACCESS_KEY=XXXXX
AWS_DEFAULT_REGION=us-east-1
TABLE_NAME=YourTableName

Docker uses this file when we test the image locally. The AWS keys are your normal AWS credentials, and the region is the one you want to run in (even when running locally, the app still needs access to the Bedrock LLM and to the DynamoDB table to read and write data).

You'll also need to set TABLE_NAME to the name of the DynamoDB table, so create that table first.
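Before building anything, it can help to confirm that the .env file actually contains every key listed above. The following is a minimal sketch, not part of the project: the helper names `parse_env_text` and `missing_keys` are hypothetical, and it assumes plain `KEY=VALUE` lines with no quoting.

```python
from pathlib import Path

# Keys taken from the example .env content above.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "TABLE_NAME",
)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path("image/.env")  # assumes you run this from the repo root
    if not env_path.exists():
        print(f"{env_path} not found")
    else:
        missing = missing_keys(parse_env_text(env_path.read_text()))
        print("Missing keys:", missing if missing else "none")
```

This only checks that the keys are present; it does not verify that the credentials work or that the DynamoDB table exists.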

Installing Requirements

pip install -r image/requirements.txt

Building the Vector DB

Put all the PDF source files you want into image/src/data/source/. Then go to image/ and run:

# Use "--reset" if you want to overwrite an existing DB.
python populate_database.py --reset
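The source of populate_database.py is not shown here, so the following is only an illustration of how a `--reset` flag like this is commonly wired up: parse the flag, and if it is set, delete the existing database directory before rebuilding. The directory name `example_chroma_db` and the function names are hypothetical, not taken from the project.

```python
import argparse
import shutil
from pathlib import Path


def parse_args(argv=None):
    """Parse command-line flags; --reset is a simple on/off switch."""
    parser = argparse.ArgumentParser(description="Populate a vector DB (illustrative sketch).")
    parser.add_argument("--reset", action="store_true", help="Delete the existing DB before rebuilding.")
    return parser.parse_args(argv)


def clear_database(db_dir: Path) -> bool:
    """Remove the DB directory if it exists; return True if something was deleted."""
    if db_dir.exists():
        shutil.rmtree(db_dir)
        return True
    return False


if __name__ == "__main__":
    args = parse_args()
    db_dir = Path("example_chroma_db")  # hypothetical location, not the project's real path
    if args.reset:
        deleted = clear_database(db_dir)
        print("Existing DB removed" if deleted else "No existing DB to remove")
    # ...loading PDFs, chunking, and embedding would follow here.
```

Without `--reset`, a script built this way would leave the existing directory in place.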

Running the App

# Execute from image/src directory
cd image/src
python rag_app/query_rag.py "how much does a landing page cost?"

Example output:

Answer the question based on the above context: How much does a landing page cost to develop?

Response:  Based on the context provided, the cost for a landing page service offered by Galaxy Design Agency is $4,820. Specifically, under the "Our Services" section, it states "Landing Page for Small Businesses ($4,820)" when describing the landing page service. So the cost listed for a landing page is $4,820.
Sources: ['src/data/source/galaxy-design-client-guide.pdf:1:0', 'src/data/source/galaxy-design-client-guide.pdf:7:0', 'src/data/source/galaxy-design-client-guide.pdf:7:1']
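Each entry in Sources is a file path followed by two colon-separated numbers. A small sketch of splitting such an ID into its parts follows. Reading the two numbers as a page number and a chunk index is an assumption based on the example output, and `SourceRef` and `parse_source_id` are hypothetical names.

```python
from typing import NamedTuple


class SourceRef(NamedTuple):
    path: str
    page: int   # assumed meaning of the first number
    chunk: int  # assumed meaning of the second number


def parse_source_id(source_id: str) -> SourceRef:
    """Split 'path:page:chunk' from the right so colons in the path survive."""
    path, page, chunk = source_id.rsplit(":", 2)
    return SourceRef(path, int(page), int(chunk))


if __name__ == "__main__":
    ref = parse_source_id("src/data/source/galaxy-design-client-guide.pdf:7:1")
    print(ref.path, ref.page, ref.chunk)
```

Splitting from the right means a path that itself contains a colon is still handled correctly.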

Starting FastAPI Server

# From image/src directory.
python app_api_handler.py

Then go to http://0.0.0.0:8000/docs to try it out (use http://localhost:8000/docs if your browser cannot open the 0.0.0.0 address).

Using Docker Image

Build and Test the Image Locally

Run these commands from the image/ directory to build, test, and serve the app locally.

docker build --platform linux/amd64 -t aws_rag_app .

This builds the image for the linux/amd64 platform, which we need for the pysqlite3 dependency used by Chroma.

# Run the container with `python app_work_handler.py` as the command
docker run --rm -it \
    --entrypoint python \
    --env-file .env \
    aws_rag_app app_work_handler.py

This tests whether the image can run the RAG/AI component with a hard-coded question (see app_work_handler.py). Because it uses Bedrock for both the embeddings and the LLM, you need an AWS account and all the environment variables for your access set (AWS_ACCESS_KEY_ID, etc.).

You will also need Bedrock model access granted in the region you are running this in.

Running Locally as a Server

This assumes you've built the image in the previous step.

docker run --rm -p 8000:8000 \
    --entrypoint python \
    --env-file .env \
    aws_rag_app app_api_handler.py

Testing Locally

Once the Docker container is running, you can test it through the interactive API page at http://0.0.0.0:8000/docs, or with curl:

curl -X 'POST' \
  'http://0.0.0.0:8000/submit_query' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query_text": "How much does a landing page for a small business cost?"
}'
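The same request can be sent from Python using only the standard library. This is a rough equivalent of the curl command above, not code from the project: `build_query_request` is a hypothetical helper, and since the response shape is not documented here, the script prints the raw body rather than parsing it.

```python
import json
import urllib.request


def build_query_request(query_text: str, base_url: str = "http://0.0.0.0:8000") -> urllib.request.Request:
    """Build a POST request to /submit_query with the same headers and body as the curl example."""
    body = json.dumps({"query_text": query_text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/submit_query",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_query_request("How much does a landing page for a small business cost?")
    try:
        # The timeout is an arbitrary choice; LLM-backed calls can be slow.
        with urllib.request.urlopen(request, timeout=60) as response:
            print(response.read().decode("utf-8"))
    except OSError as exc:  # covers connection errors and timeouts
        print(f"Request failed: {exc}")
```

If the server is not running, the script reports the connection error instead of raising.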

Deploy to AWS

I have put all the AWS CDK files into rag-cdk-infra/. Go into the folder and install the Node dependencies.

npm install

Then run this command to deploy it (assuming you have the AWS CLI set up and AWS CDK bootstrapped). I recommend deploying to us-east-1 to start with (since most of the AI models are available there).

cdk deploy