This project was built for the AI R&D Challenge of the Sun* Summer Internship.
Author: Nguyen Tuan Ngoc, a sophomore at VNU-UET at the time of writing
This project is a Question-Answering system that accepts open-ended questions. To improve answer quality, you can add your own data files to the vector database so that answers can draw on your private data.
Here are some of the project's best features:
- Chat with the chatbot to get help with your problems.
- Add data in doc, docx (currently errors), or pdf format to extend the database and improve answers, which are generated with a RAG architecture.
1. Prerequisites
Python Version 3.10
Conda Command Prompt
2. Conda Setup
conda create -n agent python=3.10 -y
conda activate agent
# Use pip to install all packages from the requirements.txt file
pip install -r requirements.txt
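Since the setup pins Python 3.10, a quick check after activating the environment can catch a wrong interpreter early. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of the repository.

```python
import sys

# Hypothetical helper (not part of the repository).
# The setup steps above create the environment with python=3.10.
REQUIRED = (3, 10)


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version equals the required one."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    if is_supported_python():
        print("Python version OK")
    else:
        found = sys.version.split()[0]
        print(f"Expected Python {REQUIRED[0]}.{REQUIRED[1]}, found {found}")
```

Running this script inside the activated `agent` environment should report that the version is OK; any other message suggests the environment was not activated.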
Technologies used in the project:
- Frontend:
  - Streamlit
- Backend:
  - Langchain
  - ZenGuard for prompt injection prevention (not done)
  - FastAPI
- Vector database:
  - ChromaDB
- RAG Techniques:
  - Advanced RAG ReAct Agent
  - Hybrid Search
  - Agent External Tool: BingSearch Tool
  - Semantic Chunking
  - Contextual Compression
  - ReRanker (Demo Colab)
- LLM API:
  - OpenAI API Key
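Hybrid search combines a keyword-based ranking with an embedding-based ranking. This README does not say how the project merges the two, so the sketch below uses reciprocal rank fusion (RRF) purely as an illustration. The function name, the document IDs, and the constant `k=60` are assumptions, not taken from the repository.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked well by several retrievers accumulate higher scores.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # illustrative keyword (e.g. BM25) ranking
    vector_hits = ["b", "c", "d"]   # illustrative embedding-similarity ranking
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

In this example, documents `b` and `c` appear in both lists, so they rank ahead of `a` and `d`, which each appear in only one.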
Project structure:
* Tools:
  * AgentTools:
    - BingSearch.py | Bing Search Tools
    - RAG.py | RAG Tools
    - Tools.py | Various Tools Powered By Agent
  * Retrieval:
    - Model.py | Model For Retrieval
    - Retrieval.py | Retrieval Tools
  * VectorDatabase:
    - API.py | API For Vector Database
    - Chroma.py | Vector Database Tools
* Config_Model.yaml (used for higher-performance configuration)
* Config.py
* RunBackend.py
* requirements.txt
* frontend:
  * app.py: Streamlit Server
  * client.py
* RAG.py: Classical RAG with Vector Database
API endpoints:
- In Tools\VectorDatabase\API.py:
  - add_data: uploads a pdf or doc file (still has errors) to be chunked and stored in the Chroma database.
  - delete_database: deletes the database.
- In RunBackend.py:
  - query_handler: processes questions and generates answers.
- In RAG.py:
  - query_handler: processes questions and generates answers.
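Because add_data accepts only certain formats and docx uploads are noted as failing, a client may want to check the file extension before sending anything to the upload server. A minimal sketch, assuming a hypothetical helper `check_upload` that is not part of the repository:

```python
from pathlib import Path

# Formats listed in this README as accepted by add_data.
SUPPORTED = {".pdf", ".doc"}
# This README marks docx as erroring; treat it as a known problem, not a hard reject.
KNOWN_BROKEN = {".docx"}


def check_upload(filename: str) -> str:
    """Classify a file by extension before uploading it (hypothetical helper).

    Returns "ok", "known-broken", or "unsupported".
    """
    suffix = Path(filename).suffix.lower()
    if suffix in SUPPORTED:
        return "ok"
    if suffix in KNOWN_BROKEN:
        return "known-broken"
    return "unsupported"
```

A frontend could call `check_upload("report.pdf")` before posting the file and show a warning for anything that does not come back as `"ok"`.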
How to run:
1. Clone this project from GitHub
git clone https://github.com/ngoctuannguyen/SunSunChatBot.git
2. Run Backend Server
# Question Answering Server
python RunBackend.py
# File Upload Server
python Tools\VectorDatabase\API.py
3. Run Frontend Server
streamlit run frontend/app.py
4. Run Classical RAG (due to lack of data)
In RAG.py, change the return value of get_response() to response.json()["text"]
Try some of the questions in test.txt, or ask your own questions
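The change in step 4 amounts to returning only the "text" field of the JSON reply. A minimal sketch, assuming get_response receives an object with a json() method (as an HTTP client response typically provides); the real signature in the repository may differ, and `FakeResponse` is a hypothetical stand-in for trying the function without a running server.

```python
from typing import Any, Dict


class FakeResponse:
    """Hypothetical stand-in for an HTTP response object (illustration only)."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload

    def json(self) -> Dict[str, Any]:
        return self._payload


def get_response(response: Any) -> str:
    """Return only the answer text from the server's JSON reply.

    Assumes the reply is a JSON object with a "text" key, as step 4 describes.
    """
    return response.json()["text"]


if __name__ == "__main__":
    reply = FakeResponse({"text": "Hello from the classical RAG server"})
    print(get_response(reply))
```

If the reply has no "text" key, this raises a KeyError, which makes a mismatched server response easy to spot during testing.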
If you have questions, please open an issue in this repository or contact me via email at [email protected].