# LLM retrieval augmentation in Google Cloud

This demo combines GCP Matching Engine and Vertex AI PaLM into a retrieval-augmented, conversational question-answering system: the user asks a question, relevant documents are retrieved as context, and the LLM answers using only its given context.
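The retrieval-augmentation loop described above can be sketched in a few lines. This is a toy stand-in, not the demo's actual code: the bag-of-words `embed` function substitutes for the Vertex AI text-embedding model, and the brute-force `retrieve` substitutes for a Matching Engine nearest-neighbour query.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real demo would call the
    # Vertex AI text-embedding model here instead.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, documents: list[str]) -> str:
    # Stand-in for a Matching Engine nearest-neighbour lookup:
    # return the document most similar to the question.
    q = embed(question)
    return max(documents, key=lambda d: cosine(q, embed(d)))

def build_prompt(question: str, context: str) -> str:
    # The augmented prompt that would be sent to the PaLM model.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "The Eiffel Tower is located in Paris, France.",
    "SQuAD is a reading comprehension dataset built from Wikipedia.",
]
question = "Where is the Eiffel Tower?"
context = retrieve(question, docs)
prompt = build_prompt(question, context)
print(prompt)
```

The key design point is that the LLM never sees the whole corpus: retrieval narrows the context to a handful of relevant passages, and the prompt instructs the model to answer from that context alone.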

The dataset used is the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.
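SQuAD ships as nested JSON (articles, each with paragraphs, each with question/answer pairs), so a first preprocessing step is usually to flatten it into one record per question. A minimal sketch on an inline sample in the SQuAD v1.1 layout:

```python
import json

# A tiny slice in the SQuAD v1.1 JSON layout: articles hold paragraphs,
# each paragraph holds its context plus question/answer pairs.
sample = json.loads("""
{
  "data": [
    {
      "title": "Eiffel_Tower",
      "paragraphs": [
        {
          "context": "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
          "qas": [
            {
              "id": "q1",
              "question": "Where is the Eiffel Tower?",
              "answers": [{"text": "Paris", "answer_start": 52}]
            }
          ]
        }
      ]
    }
  ]
}
""")

def flatten(squad: dict) -> list[dict]:
    # Flatten the nested SQuAD layout into one record per question.
    rows = []
    for article in squad["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                rows.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": para["context"],
                    "answer": qa["answers"][0]["text"],
                })
    return rows

rows = flatten(sample)
print(rows[0]["question"], "->", rows[0]["answer"])
```

The flattened `context` strings are what get embedded and indexed; the questions can double as a quick sanity check that retrieval brings back the right paragraph.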

The demo can be accessed here.

## Services used

## Architecture

## Frameworks

## Prerequisites

## Docs

  1. Infrastructure and Matching Engine Setup: Set up the required infrastructure using Terraform and create the Matching Engine index
  2. Create embeddings: Generate embeddings for the documents and index them in Matching Engine
  3. Firestore: Index the documents in Firestore
  4. LangChain Retriever and Agent: Create a LangChain retriever and conversational agent
  5. Cloud Run: Package the code and deploy the API to Cloud Run
  6. Firebase WebUI: Create the web app
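Step 2 above feeds Matching Engine, whose batch index input is newline-delimited JSON, one record per document with an `"id"` and an `"embedding"` field, staged in a Cloud Storage bucket that the index is created from. A sketch of producing that file, with a hypothetical placeholder `embed` function standing in for the Vertex AI embedding model:

```python
import json

def embed(text: str) -> list[float]:
    # Placeholder, not a real embedding: the demo would call the
    # Vertex AI text-embedding model here and get a dense vector back.
    return [float(len(text) % 7), 0.5, 1.0]

documents = {
    "doc-0": "The Eiffel Tower is located in Paris, France.",
    "doc-1": "SQuAD is a reading comprehension dataset built from Wikipedia.",
}

# Matching Engine batch indexing reads newline-delimited JSON records,
# each carrying an "id" and an "embedding" list.
lines = [
    json.dumps({"id": doc_id, "embedding": embed(text)})
    for doc_id, text in documents.items()
]
jsonl = "\n".join(lines)
print(jsonl)
```

In the demo, this JSONL file would be uploaded to Cloud Storage and referenced when creating the Matching Engine index (step 1), while the full document texts keyed by the same IDs go into Firestore (step 3) so the retriever can map matched IDs back to their content.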