# LLM retrieval augmentation in Google Cloud

This demo combines GCP Matching Engine and Vertex AI PaLM into a retrieval-augmented, conversational question-answering system: the user asks a question, relevant documents are retrieved as context, and the LLM answers using only its given context.
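The retrieval-augmentation loop described above can be sketched in a few lines. This is a toy stand-in, not the demo's actual code: the bag-of-words `embed` function substitutes for the Vertex AI text-embedding model, and the brute-force `retrieve` substitutes for a Matching Engine nearest-neighbour query.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real demo would call the
    # Vertex AI text-embedding model here instead.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, documents: list[str]) -> str:
    # Stand-in for a Matching Engine nearest-neighbour lookup:
    # return the document most similar to the question.
    q = embed(question)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def build_prompt(question: str, context: str) -> str:
    # The augmented prompt that would be sent to the PaLM model.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "The Eiffel Tower is located in Paris, France.",
    "SQuAD is a reading comprehension dataset built from Wikipedia.",
]
question = "Where is the Eiffel Tower?"
context = retrieve(question, docs)
prompt = build_prompt(question, context)
print(prompt)
```

The key design point is that the LLM never sees the whole corpus: retrieval narrows the context to a handful of relevant passages, and the prompt instructs the model to answer from that context alone.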

The dataset used is the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.
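SQuAD ships as nested JSON (articles, each with paragraphs, each with question/answer pairs), so a first preprocessing step is usually to flatten it into one record per question. A minimal sketch on an inline sample in the SQuAD v1.1 layout:

```python
import json

# A tiny slice in the SQuAD v1.1 JSON layout: articles hold paragraphs,
# each paragraph holds its context plus question/answer pairs.
sample = json.loads("""
{
  "data": [
    {
      "title": "Eiffel_Tower",
      "paragraphs": [
        {
          "context": "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
          "qas": [
            {
              "id": "q1",
              "question": "Where is the Eiffel Tower?",
              "answers": [{"text": "Paris", "answer_start": 52}]
            }
          ]
        }
      ]
    }
  ]
}
""")

def flatten(squad: dict) -> list[dict]:
    # Flatten the nested SQuAD layout into one record per question.
    rows = []
    for article in squad["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                rows.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": para["context"],
                    "answer": qa["answers"][0]["text"],
                })
    return rows

rows = flatten(sample)
print(rows[0]["question"], "->", rows[0]["answer"])
```

The flattened `context` strings are what get embedded and indexed; the questions can double as a quick sanity check that retrieval brings back the right paragraph.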

The demo can be accessed here.

## Services used

## Architecture

## Frameworks

## Prerequisites

## Docs

  1. Infrastructure and Matching Engine Setup: Set up the required infrastructure using Terraform and create the Matching Engine index
  2. Create embeddings: Generate embeddings for the documents and index them in Matching Engine
  3. Firestore: Index the documents in Firestore
  4. LangChain Retriever and Agent: Create a LangChain retriever and conversational agent
  5. Cloud Run: Package the code and deploy the API to Cloud Run
  6. Firebase WebUI: Create the web app
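Step 2 above feeds Matching Engine, whose batch index input is newline-delimited JSON, one record per document with an `"id"` and an `"embedding"` field, staged in a Cloud Storage bucket that the index is created from. A sketch of producing that file, with a hypothetical placeholder `embed` function standing in for the Vertex AI embedding model:

```python
import json

def embed(text: str) -> list[float]:
    # Placeholder, not a real embedding: the demo would call the
    # Vertex AI text-embedding model here and get a dense vector back.
    return [float(len(text) % 7), 0.5, 1.0]

documents = {
    "doc-0": "The Eiffel Tower is located in Paris, France.",
    "doc-1": "SQuAD is a reading comprehension dataset built from Wikipedia.",
}

# Matching Engine batch indexing reads newline-delimited JSON records,
# each carrying an "id" and an "embedding" list.
lines = [
    json.dumps({"id": doc_id, "embedding": embed(text)})
    for doc_id, text in documents.items()
]
jsonl = "\n".join(lines)
print(jsonl)
```

In the demo, this JSONL file would be uploaded to Cloud Storage and referenced when creating the Matching Engine index (step 1), while the full document texts keyed by the same IDs go into Firestore (step 3) so the retriever can map matched IDs back to their content.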