This demo combines VertexAI Matching Engine and VertexAI PaLM into a retrieval-augmented question answering system: the user asks a question, relevant documents are retrieved, and the LLM answers using that retrieved context.
The dataset used is the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.
The demo can be accessed here.
- VertexAI Matching Engine: ANN Similarity Search
- VertexAI PaLM: Conversational Engine
- Cloud Run: Hosting of the API
- Firestore: Document Database
- Firebase: Frontend hosting
- Cloud Build: CI/CD
Frameworks:
- LangChain: Framework for creating conversational agent and retrieval augmentation
- Tensorflow Hub: Embeddings
- Terraform: Infrastructure as Code
- An existing GCP project
- Infrastructure and Matching Engine Setup: Set up the required infrastructure using Terraform and create the Matching Engine index
- Create embeddings: Generate the embeddings for the documents and index them in Matching Engine
- Firestore: Index the documents in Firestore
- LangChain Retriever and Agent: Create a LangChain retriever and conversational agent
- Cloud Run: Package the code and deploy the API to Cloud Run
- Firebase WebUI: Create the Web app
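
The infrastructure step is driven by Terraform. A minimal sketch of a Matching Engine index resource follows; the resource and field names are assumptions based on the google-beta provider and should be checked against the provider documentation, and the bucket path and dimensions are placeholders:

```hcl
# Sketch only: verify resource and field names against the provider docs.
resource "google_vertex_ai_index" "squad_index" {
  provider     = google-beta
  region       = "us-central1"
  display_name = "squad-embeddings"

  metadata {
    contents_delta_uri = "gs://YOUR_BUCKET/embeddings" # assumed bucket layout
    config {
      dimensions                  = 512 # assumed embedding size
      approximate_neighbors_count = 100
      algorithm_config {
        tree_ah_config {}
      }
    }
  }
  index_update_method = "BATCH_UPDATE"
}
```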
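
The "Create embeddings" step can be sketched in Python. The TF Hub model URL and the exact JSON-lines record shape Matching Engine ingests are assumptions to verify against the docs; only the formatting helper below runs locally:

```python
import json

def to_index_records(ids, vectors):
    """Shape (id, vector) pairs into JSON-lines records for batch ingestion
    into Matching Engine (field names "id"/"embedding" are assumed)."""
    return [
        json.dumps({"id": str(doc_id), "embedding": [float(x) for x in vec]})
        for doc_id, vec in zip(ids, vectors)
    ]

# Generating the vectors with a TF Hub model (illustrative, not run here):
# import tensorflow_hub as hub
# embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
# vectors = embed(contexts).numpy()
# records = to_index_records(doc_ids, vectors)
# ...then write "\n".join(records) to the GCS path the index reads from.
```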
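
The Firestore step stores the raw passages so the retriever can look them up by the ids Matching Engine returns. A sketch, assuming a `documents` collection and SQuAD rows with `id`, `title`, and `context` fields:

```python
def squad_to_payloads(rows):
    """Map SQuAD rows to Firestore payloads keyed by a stable string id,
    so the neighbor ids from Matching Engine double as document ids."""
    return {
        str(row["id"]): {"title": row["title"], "context": row["context"]}
        for row in rows
    }

# Writing them (illustrative, needs GCP credentials; not run here):
# from google.cloud import firestore
# db = firestore.Client()
# for doc_id, payload in squad_to_payloads(rows).items():
#     db.collection("documents").document(doc_id).set(payload)
```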
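
LangChain's retriever contract boils down to "given a query, return documents". Rather than pin a fast-moving LangChain class API, here is a plain-Python sketch of what the custom retriever does: embed the query, run the ANN search, then hydrate each hit from Firestore (the three injected callables and their names are ours, not LangChain's):

```python
class MatchingEngineRetriever:
    """Retrieval-augmentation core: query -> embedding -> ANN ids -> documents.
    embed_fn, ann_search_fn and fetch_doc_fn would wrap TF Hub,
    Matching Engine and Firestore respectively."""

    def __init__(self, embed_fn, ann_search_fn, fetch_doc_fn, k=4):
        self.embed_fn = embed_fn
        self.ann_search_fn = ann_search_fn
        self.fetch_doc_fn = fetch_doc_fn
        self.k = k

    def get_relevant_documents(self, query):
        vector = self.embed_fn(query)
        neighbor_ids = self.ann_search_fn(vector, self.k)
        return [self.fetch_doc_fn(doc_id) for doc_id in neighbor_ids]
```

Wrapped in LangChain's retriever interface, this plugs into a conversational chain so the agent answers from the retrieved context rather than from the model's parameters alone.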
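
The Cloud Run step wraps the chain in a small HTTP API. A stdlib-only sketch, assuming a POST endpoint that takes and returns JSON (the real service's framework, route, and payload shape may differ):

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def answer_payload(question, qa_fn):
    """Run the QA callable and shape the JSON body the API returns."""
    return {"question": question, "answer": qa_fn(question)}

class QAHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        # qa_fn would call the LangChain agent; stubbed here.
        reply = answer_payload(body.get("question", ""), lambda q: "(stub)")
        data = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

# Cloud Run injects the listening port via $PORT (server not started here):
# HTTPServer(("", int(os.environ.get("PORT", "8080"))), QAHandler).serve_forever()
```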