Project Summary: Intelligent Document Analysis with Retrieval-Augmented Generation (RAG) and Vector Search
This open-source project leverages Optical Character Recognition (OCR) to convert files in various formats (PDF, TIFF, PNG, JPEG) into text. It integrates Retrieval-Augmented Generation (RAG) to extract relevant attributes from the text. The core functionality takes a query text as input, performs a vector search to identify relevant parts of the file, and uses Large Language Model (LLM) providers such as OpenAI, Kimi, and Tencent Hunyuan to generate answers from the search results.
| Feature | Description |
| --- | --- |
| File Upload | Facilitates the upload of files in supported formats for processing. |
| Multi-format OCR | Supports OCR for PDF, TIFF, PNG, and JPEG files, converting them into text. |
| Vector Search | Performs vector search to identify relevant parts of the text based on embeddings. |
| LLM Integration | Integrates with LLM providers like OpenAI, Kimi, and Tencent Hunyuan for generating responses. |
| Embedding-based Retrieval | Uses vector embeddings for accurate and efficient information retrieval. |
## Install with Docker

- Clone the repo
- Set necessary environment variables
  Make sure to set your required environment variables in the `.env` file. You can read more about how to set them up in the API Keys section.
- Deploy using Docker
With Docker installed and the rag repository cloned, navigate to the directory containing the Dockerfile in your terminal or command prompt. Run the following commands to build and start the rag application in detached mode, which allows it to run in the background:
```sh
# clone the rag repo
git clone https://github.com/likid1412/rag

# navigate to rag
cd rag

# build; this will download the necessary Docker images
docker build -t rag .

# run and start rag
docker run --env-file .env -dt --name rag -p 80:80 rag

# check the rag logs; once startup succeeds, you should see `Application startup complete.`
docker container logs rag
```
Remember, Docker must be installed on your system to use this method. For installation instructions and more details about Docker, visit the official Docker documentation.
You can read FastAPI in Containers for a quick start.
- Access rag
  - You can access your local rag Interactive API docs (by FastAPI's defaults, at http://localhost/docs)
  - You can access your local rag Alternative API docs (at http://localhost/redoc)
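To quickly verify the container is serving requests, you can curl the interactive docs page (the `/docs` path is FastAPI's default, and port 80 matches the `docker run` mapping above):

```sh
# should return the HTML of the interactive docs page
curl -s http://localhost/docs | head
```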
## Logs

We send logged messages to the `app.log` file and to stdout using loguru.

- The `app.log` file is located at `/rag/app.log`.
- For stdout, you can check it with a command such as `docker container logs -f rag`; use `docker container logs --help` to read more.
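Since `app.log` lives inside the container's filesystem, one way to follow it (assuming the container name `rag` from the run command above) is:

```sh
# follow the log file inside the running container
docker exec rag tail -f /rag/app.log
```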
## API Keys

Before starting rag you'll need to configure access to various components depending on your chosen technologies, such as OpenAI, hunyuan, and Kimi, via a `.env` file. Create this `.env` file in the same directory you want to start rag in. Check the `.env.example` as an example.

Make sure to only set the environment variables you intend to use; environment variables with missing or incorrect values may lead to errors.

Below is a comprehensive list of the API keys and variables you may require:
| Environment Variable | Value | Description |
| --- | --- | --- |
| MINIO_ENDPOINT | The endpoint of your Minio storage | See Minio as local storage |
| MINIO_ACCESS_KEY | Minio access key | See Minio as local storage |
| MINIO_SECRET_KEY | Minio secret key | See Minio as local storage |
| TENCENT_VECTOR_URL | URL for Tencent Vector Database | Access to Tencent Vector Database |
| TENCENT_VECTOR_USER | Username for Tencent Vector Database | Access to Tencent Vector Database |
| TENCENT_VECTOR_KEY | API key for Tencent Vector Database | Access to Tencent Vector Database |
| TENCENTCLOUD_SECRET_ID | Tencent Cloud Secret ID for the Tencent hunyuan LLM | Access to the Tencent API for the Tencent hunyuan LLM |
| TENCENTCLOUD_SECRET_KEY | Tencent Cloud Secret Key for the Tencent hunyuan LLM | Access to the Tencent API for the Tencent hunyuan LLM |
| TENCENT_MODEL | Tencent hunyuan model name | Tencent hunyuan model |
| API_KEY | OpenAI SDK API key | API key for OpenAI or a compatible LLM provider such as Kimi |
| BASE_URL | OpenAI SDK base URL | Base URL for OpenAI or a compatible LLM provider such as Kimi |
| MODEL | OpenAI SDK model name | Model of OpenAI or a compatible LLM provider |
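As a minimal sketch, a `.env` using the OpenAI-compatible path might look like the following. The Kimi (Moonshot) base URL and model name are shown as one example of a compatible provider; verify them against your provider's docs, and treat every `<...>` value as a placeholder:

```sh
# Minio object storage (see Minio as local storage)
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=<your-minio-access-key>
MINIO_SECRET_KEY=<your-minio-secret-key>

# Tencent Vector Database
TENCENT_VECTOR_URL=<your-vector-db-url>
TENCENT_VECTOR_USER=<your-vector-db-user>
TENCENT_VECTOR_KEY=<your-vector-db-key>

# OpenAI-compatible LLM provider (Kimi via Moonshot shown as an example)
API_KEY=<your-api-key>
BASE_URL=https://api.moonshot.cn/v1
MODEL=moonshot-v1-8k
```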
- Minio: use Minio as local storage; see Minio as local storage for more detail.
- Tencent Vector Database: you can get credentials from Tencent Vector Database; you can find instructions for obtaining a key here.
- OpenAI: you can get an API key from OpenAI.
- Kimi: check Moonshot for more detail; you can find instructions for obtaining a key here.
- Tencent hunyuan: check hunyuan and hunyuan-embedding-API for more detail; you can find instructions for obtaining a key here.
Once you have access to rag, you can interact with the API using the Interactive API docs. Below are usage examples for each endpoint.
## Upload

Functionality

- Accepts one or more file uploads (limited to pdf, tiff, png, jpeg formats).
- Saves the processed file to a storage solution (e.g., MinIO), returning one or more unique file identifiers or signed URLs for the upload.

Usage example

- Read the alternative automatic documentation for more: Upload - ReDoc
- Try it out: File Upload Endpoint: /upload
- Click `Add string item`, choose a file to upload, and it will return the uploaded file info: the original file name from the uploaded file, a unique file id, a signed URL, and a unique file name which you can search in Minio (see the curl sketch below).
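Outside the docs UI, here is a hedged curl sketch; the multipart field name `files` is an assumption, so check the Interactive API docs for the exact schema:

```sh
# upload one file; repeat -F to upload several
# (the field name "files" is assumed, not confirmed by the docs above)
curl -X POST http://localhost/upload \
  -F "files=@./example.pdf"
```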
## OCR

Functionality

- Runs an OCR service on the file downloaded from the `signed_url`.
- Processes OCR results with embedding models (e.g., OpenAI, Tencent hunyuan).
- Uploads the embeddings to a vector database (e.g., Pinecone, Tencent Vector Database) for future searches.

Usage example

- Read the alternative automatic documentation for more: Ocr - ReDoc
- Try it out: OCR Endpoint: /ocr
- Fill the `signed_url` value with the URL returned by the upload endpoint (see the curl sketch below). This endpoint returns immediately because the tasks mentioned above take some time and run in the background. You can check progress using the Get OCR Progress endpoint.
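A hedged curl sketch; whether `signed_url` is passed as a query parameter or in a JSON body is an assumption, so confirm in the Interactive API docs:

```sh
# kick off OCR + embedding in the background for an uploaded file
# (passing signed_url as a query parameter is assumed)
curl -X POST "http://localhost/ocr?signed_url=<signed-url-from-upload>"
```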
## Get OCR Progress

Functionality

- Gets the OCR progress.

Usage example

- Read the alternative automatic documentation for more: Get Ocr Progress - ReDoc
- Try it out: Get OCR Progress Endpoint: /ocr_progress/{file_id}
- Fill in the `file_id` that was passed to the OCR endpoint to get the current progress (see the polling sketch below).
- If still processing, it returns `{"status": "processing", "progress": 0.xxx}`.
- If completed, it returns `{"status": "completed"}`.
## Extract

Functionality

- Takes a query text and `file_id` as input, performs a vector search, and returns relevant text based on the embeddings.
- Chats with an LLM provider (e.g., OpenAI, Tencent hunyuan) to generate the answer from the search results.

Usage example

- Read the alternative automatic documentation for more: Extract - ReDoc
- Try it out: Attribute Extraction Endpoint: /extract
- Takes a query text and `file_id` as input; choose the LLM provider API (`OpenAI` or `hunyuan`) and it returns the answer to the query, generated by the LLM from the relevant texts retrieved from the vector database for that `file_id` (see the curl sketch below).
  - For the `OpenAI` API, you can use an OpenAI model or a compatible LLM provider model such as Kimi.
  - For the `hunyuan` API, you can use a Tencent hunyuan model.
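A hedged curl sketch; the JSON field names `query`, `file_id`, and `llm` are assumptions, so check the Interactive API docs for the exact schema:

```sh
# ask a question against one processed file
# (field names below are assumed, not confirmed by the docs above)
curl -X POST "http://localhost/extract" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the total amount?", "file_id": "<file_id>", "llm": "OpenAI"}'
```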
## TODO

- Upload large files using stream upload
- Add a requestId for tracing
- Add monitoring and observability
- Address TODO/FIXME items in code
## Chunking strategy

It seems the OCR result has already divided the content based on its structure and hierarchy, i.e., into paragraphs, resulting in more semantically coherent chunks, so we can simply use fixed-size chunking based on the paragraphs.

Read more: Chunking Strategies for LLM Applications | Pinecone
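As an illustration only, here is a fixed-size, paragraph-respecting chunker over a plain-text OCR dump. It assumes paragraphs are separated by blank lines in a hypothetical `ocr.txt`, and the 1000-character budget is arbitrary:

```sh
# group whole paragraphs into chunks of at most ~1000 characters
awk 'BEGIN { RS="" }   # RS="" makes awk read one paragraph per record
{
  if (buf != "" && length(buf) + length($0) > 1000) {  # next paragraph would overflow: flush
    print buf "\n--- chunk boundary ---"
    buf = ""
  }
  buf = buf (buf == "" ? "" : "\n\n") $0               # append the paragraph, never splitting it
}
END { if (buf != "") print buf }' ocr.txt
```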