Deploy RAG/AI App To AWS

Getting Started

Configure AWS

You need an AWS account and the AWS CLI set up on your machine. You'll also need Bedrock enabled on AWS, with model access granted for Claude or whichever model you want to use.

Update .env File with AWS Credentials

Create a file named .env in image/. Do NOT commit this file to Git. The file should have content like this:

AWS_ACCESS_KEY_ID=XXXXX
AWS_SECRET_ACCESS_KEY=XXXXX
AWS_DEFAULT_REGION=us-east-1
TABLE_NAME=YourTableName

Docker uses this file when we test the image locally. The AWS keys are your normal AWS credentials, plus the region you want to run in. Even when running locally, you still need access to the Bedrock LLM and to the DynamoDB table to read and write data.

You'll also need a TABLE_NAME for the DynamoDB table for this to work (so you'll have to create that first).
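
Before building or running anything, it can help to confirm that the .env file has every key listed above. The following is a minimal sketch, not part of the repo: the helper names (parse_env_text, missing_keys, check_env_file) are hypothetical, and it assumes the file only contains simple KEY=VALUE lines like the example.

```python
from pathlib import Path

# Keys shown in the example .env above; adjust if your setup needs more.
REQUIRED_KEYS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "TABLE_NAME",
)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def check_env_file(path: str = "image/.env") -> list[str]:
    """Read the .env file at `path` and report which required keys are missing."""
    return missing_keys(parse_env_text(Path(path).read_text()))
```

Calling check_env_file() from the repo root returns an empty list when all four keys are set. It only checks that values are present, not that the credentials or the table are valid.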

Installing Requirements

pip install -r image/requirements.txt

Building the Vector DB

Put all the PDF source files you want into image/src/data/source/. Then go to image/ and run:

# Use "--reset" if you want to overwrite an existing DB.
python populate_database.py --reset
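
To illustrate what a flag like --reset usually does, here is a minimal sketch of how such an option can be wired up with argparse. This is an assumption about the general pattern, not the actual contents of populate_database.py; the function names and step descriptions are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single optional --reset flag."""
    parser = argparse.ArgumentParser(description="Populate the vector DB.")
    # store_true means the flag is False when omitted and True when passed.
    parser.add_argument(
        "--reset",
        action="store_true",
        help="Clear the existing DB before loading documents.",
    )
    return parser


def plan_steps(argv: list[str]) -> list[str]:
    """Return the steps that would run for the given command-line arguments."""
    args = build_parser().parse_args(argv)
    steps = []
    if args.reset:
        steps.append("clear existing database")
    steps.append("load and index PDFs")
    return steps
```

With this pattern, running without --reset adds new documents to whatever DB already exists, while passing --reset clears it first.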

Running the App

# Execute from image/src directory
cd image/src
python rag_app/query_rag.py "how much does a landing page cost?"

Example output:

Answer the question based on the above context: How much does a landing page cost to develop?

Response:  Based on the context provided, the cost for a landing page service offered by Galaxy Design Agency is $4,820. Specifically, under the "Our Services" section, it states "Landing Page for Small Businesses ($4,820)" when describing the landing page service. So the cost listed for a landing page is $4,820.
Sources: ['src/data/source/galaxy-design-client-guide.pdf:1:0', 'src/data/source/galaxy-design-client-guide.pdf:7:0', 'src/data/source/galaxy-design-client-guide.pdf:7:1']
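
The source IDs in the output above look like a file path followed by two numbers. Assuming those numbers are the page number and the chunk index within that page (an assumption based on the example output, not something this README confirms), a small sketch for splitting them apart could look like this. SourceRef and parse_source_id are hypothetical names.

```python
from typing import NamedTuple


class SourceRef(NamedTuple):
    path: str
    page: int   # assumed: page number in the PDF
    chunk: int  # assumed: chunk index within that page


def parse_source_id(source_id: str) -> SourceRef:
    """Split an ID like 'dir/file.pdf:7:1' into its path and two numeric parts."""
    # Split from the right so a colon inside the path does not break parsing.
    path, page, chunk = source_id.rsplit(":", 2)
    return SourceRef(path, int(page), int(chunk))
```

For example, parse_source_id("src/data/source/galaxy-design-client-guide.pdf:7:1").page gives 7 under that assumption.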

Starting FastAPI Server

# From image/src directory.
python app_api_handler.py

Then go to http://0.0.0.0:8000/docs to try it out.

Using Docker Image

Build and Test the Image Locally

These commands can be run from the image/ directory to build, test, and serve the app locally.

docker build --platform linux/amd64 -t aws_rag_app .

This builds the image for the linux/amd64 platform, which we need for pysqlite3 for Chroma.

# Run the container with `python app_work_handler.py` as the command
docker run --rm -it \
    --entrypoint python \
    --env-file .env \
    aws_rag_app app_work_handler.py

This tests the image by running the RAG/AI component with a hard-coded question (see app_work_handler.py). Since it uses Bedrock for both the embeddings and the LLM, you need an AWS account and all the environment variables for your access set (AWS_ACCESS_KEY_ID, etc.).

You also need Bedrock model access granted in the region you are running this in.

Running Locally as a Server

This assumes you've built the image in the previous step.

docker run --rm -p 8000:8000 \
    --entrypoint python \
    --env-file .env \
    aws_rag_app app_api_handler.py
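
The container may take a moment to start. If you want to check from a script that something is accepting connections on the published port, a minimal sketch using only the standard library could look like this. The function name is hypothetical, and it only checks that the TCP port is open, not that the app itself is healthy.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising on connection errors.
        return sock.connect_ex((host, port)) == 0
```

Port 8000 here matches the -p 8000:8000 mapping in the docker run command above.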

Testing Locally

After running the Docker container on localhost, you can access an interactive API page locally to test it: http://0.0.0.0:8000/docs.

curl -X 'POST' \
  'http://0.0.0.0:8000/submit_query' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query_text": "How much does a landing page for a small business cost?"
}'
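
The same request can be sent from Python. This is a minimal sketch using only the standard library. It mirrors the curl command above (same URL, headers, and body) and assumes the endpoint returns JSON, since that is what the accept header asks for. The function names are hypothetical, and the shape of the response is not documented here.

```python
import json
import urllib.request

# Same address as the curl example; change it if your server is reachable elsewhere.
BASE_URL = "http://0.0.0.0:8000"


def build_submit_query_request(query_text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /submit_query without sending it."""
    body = json.dumps({"query_text": query_text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/submit_query",
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def submit_query(query_text: str, base_url: str = BASE_URL, timeout: float = 60.0):
    """Send the query and return the decoded JSON response (assumed to be JSON)."""
    request = build_submit_query_request(query_text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, submit_query("How much does a landing page for a small business cost?") sends the same request as the curl command.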

Deploy to AWS

I have put all the AWS CDK files into rag-cdk-infra/. Go into the folder and install the Node dependencies.

npm install

Then run this command to deploy it (assuming you have AWS CLI already set up, and AWS CDK already bootstrapped). I recommend deploying to us-east-1 to start with (since all the AI models are there).

cdk deploy


pixegami/deploy-rag-to-aws
