You need to have an AWS account, and AWS CLI set up on your machine. You'll also need to have Bedrock enabled on AWS (and granted model access to Claude or whatever you want to use).
Create a file named .env
in image/
. Do NOT commit the file to .git
. The file should have content like this:
AWS_ACCESS_KEY_ID=XXXXX
AWS_SECRET_ACCESS_KEY=XXXXX
AWS_DEFAULT_REGION=us-east-1
TABLE_NAME=YourTableName
This will be used by Docker for when we want to test the image locally. The AWS keys are just your normal AWS credentials and region you want to run this in (even when running locally you will still need access to Bedrock LLM and to the DynamoDB table to write/read the data).
You'll also need a TABLE_NAME for the DynamoDB table for this to work (so you'll have to create that first).
pip install -r image/requirements.txt
Put all the PDF source files you want into image/src/data/source/
. Then go image
and run:
# Use "--reset" if you want to overwrite an existing DB.
python populate_database.py --reset
# Execute from image/src directory
cd image/src
python rag_app/query_rag.py "how much does a landing page cost?"
Example output:
Answer the question based on the above context: How much does a landing page cost to develop?
Response: Based on the context provided, the cost for a landing page service offered by Galaxy Design Agency is $4,820. Specifically, under the "Our Services" section, it states "Landing Page for Small Businesses ($4,820)" when describing the landing page service. So the cost listed for a landing page is $4,820.
Sources: ['src/data/source/galaxy-design-client-guide.pdf:1:0', 'src/data/source/galaxy-design-client-guide.pdf:7:0', 'src/data/source/galaxy-design-client-guide.pdf:7:1']
# From image/src directory.
python app_api_handler.py
Then go to http://0.0.0.0:8000/docs
to try it out.
These commands can be run from image/
directory to build, test, and serve the app locally.
docker build --platform linux/amd64 -t aws_rag_app .
This will build the image (using linux amd64 as the platform — we need this for pysqlite3
for Chroma).
# Run the container using command `python app_work_handler.main`
docker run --rm -it \
--entrypoint python \
--env-file .env \
aws_rag_app app_work_handler.py
This will test the image, seeing if it can run the RAG/AI component with a hard-coded question (see app_work_handler.py
). But since it uses Bedrock as the embeddings and LLM platform, you will need an AWS account and have all the environment variables for your access set (AWS_ACCESS_KEY_ID
, etc).
You will also need to have Bedrock's models enabled and granted for the region you are running this in.
Assuming you've build the image from the previous step.
docker run --rm -p 8000:8000 \
--entrypoint python \
--env-file .env \
aws_rag_app app_api_handler.py
After running the Docker container on localhost, you can access an interactive API page locally to test it: http://0.0.0.0:8000/docs
.
curl -X 'POST' \
'http://0.0.0.0:8000/submit_query' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"query_text": "How much does a landing page for a small business cost?"
}'
I have put all the AWS CDK files into rag-cdk-infra/
. Go into the folder and install the Node dependencies.
npm install
Then run this command to deploy it (assuming you have AWS CLI already set up, and AWS CDK already bootstrapped). I recommend deploying to us-east-1
to start with (since all the AI models are there).
cdk deploy