RAG is an AI framework designed to retrieve factual information from external knowledge bases. It provides large language models (LLMs) with context about your query by retrieving relevant information from your data.
We make it super easy and effective on any open-source model.
```python
from lamini import RetrievalAugmentedRunner

llm = RetrievalAugmentedRunner()
llm.load_data("data/")
llm.train()
```
Here's the response:
llm("Who won the case above about dana and wells fargo?")
>> The court ruled in favor of Dana and Linda Phillabaum, the defendants and appellees,
in the foreclosure action brought against them by Wells Fargo.
Run it yourself, change the query:
```bash
./example.sh --query "Who won the case above about dana and wells fargo?"
```
Change the model (any open-source model, just use the HuggingFace path!):

```python
llm = RetrievalAugmentedRunner(model_name="meta-llama/Llama-2-13b-chat-hf")
```
RAG, short for "Retrieval Augmented Generation," is designed to seamlessly integrate data retrieval and content generation. It consists of three key components:
- Indexer: The indexer component is responsible for indexing a user's data, making it easily accessible for the LLM pipeline.
- Retriever: The retriever module fetches pertinent information from the indexed user data in response to user queries.
- Generator: The generator then leverages the retrieved information to produce contextually relevant and coherent text, enriching the content generation process.
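
To make these components concrete, here is a minimal, framework-agnostic sketch (not the Lamini API): `embed` is a toy stand-in for a real embedding model, and `generate` only assembles the augmented prompt that a real LLM would complete.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy deterministic "embedding": hashes the text into a unit vector.
    # A real pipeline would call an embedding model here instead.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

class Indexer:
    """Indexes a user's documents as embedding vectors."""
    def __init__(self, docs: list[str]):
        self.docs = docs
        self.vectors = np.stack([embed(d) for d in docs])

class Retriever:
    """Fetches the top-k documents most similar to a query."""
    def __init__(self, index: Indexer):
        self.index = index

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        scores = self.index.vectors @ embed(query)  # cosine similarity
        return [self.index.docs[i] for i in np.argsort(-scores)[:k]]

def generate(query: str, context: list[str]) -> str:
    """Assembles the augmented prompt that the LLM would complete."""
    return "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"

docs = ["The court ruled for the defendants.", "The plaintiff filed an appeal."]
retriever = Retriever(Indexer(docs))
print(generate("Who won the case?", retriever.retrieve("Who won the case?")))
```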
This implementation makes use of the Lamini framework, which streamlines the integration of the RAG system into your AI projects. Lamini simplifies setup, training, and deployment of AI models, including RAG.
- Install Lamini

```bash
pip install lamini
```

- Clone the repository

```bash
git clone git@github.com:lamini-ai/RAG.git
```

- cd into the repository

```bash
cd RAG
```

- Add your data in the `data` folder. Sample data has been added for you.
- Run an example with a question you want to ask about your data:

```bash
./example.sh --query "Who represented the State of Louisiana in flowers vs rausch case?"
```
Let's look inside `RetrievalAugmentedRunner`:

- `llm.load_data`: Loads data using our directory loader. It loads all the text-readable files from the `data` directory and splits them into batches for faster indexing.
- `llm.train`: Generates embeddings using Lamini and indexes the loaded data using Lamini Index, powered by faiss.
- `llm("Who won the flowers vs rausch case?")`: Runs our query engine. It retrieves the top-k documents for a given query, appends them to the query, and runs inference on the LLM with the new query (a by-hand sketch of this step follows below).
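
For intuition, here is a rough sketch of what the indexing and query steps amount to when done by hand with faiss. The `embed_batch` function is a dummy stand-in for a real embedding model, and none of these names come from the Lamini API; see the actual building blocks below.

```python
import faiss  # pip install faiss-cpu
import numpy as np

def embed_batch(texts):
    # Dummy stand-in: real code would call an embedding model here.
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(texts), 384)).astype("float32")

docs = ["First case summary ...", "Second case summary ...", "Third case summary ..."]
vecs = embed_batch(docs)

# Index the document embeddings (IndexFlatL2 is the simplest exact faiss index).
index = faiss.IndexFlatL2(vecs.shape[1])
index.add(vecs)

# Retrieve the top-k documents nearest to the query embedding.
query = "Who won the flowers vs rausch case?"
_, ids = index.search(embed_batch([query]), 2)

# Append the retrieved documents to the query, then run LLM inference on it.
augmented_query = "\n".join(docs[i] for i in ids[0]) + "\n\n" + query
print(augmented_query)
```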
If you are interested in diving deeper into these tools, here's how we implement them in Python:
```python
from lamini import DirectoryLoader, LaminiIndex, QueryEngine

loader = DirectoryLoader("data")
index = LaminiIndex(loader)
engine = QueryEngine(index)
response = engine.answer_question("Who won the case above about dana and wells fargo?")
```
This code is in `RAG/rag-deeper.py`. You can run the script using the following command:

```bash
python RAG/rag-deeper.py
```
Thank you for choosing Lamini's RAG, a powerful tool that enhances content generation and data retrieval to help you build a ChatGPT for your data.