AI Custom Chatbot Quickstart · The Basics

Table of Contents

  1. AI Custom Chatbot Quickstart · The Basics
  2. Goals
  3. Architecture Overview
  4. Prerequisites
  5. Set up your environment
  6. Ingesting Data
  7. Query Data
  8. FAQ

This tutorial shows you how to get started building a custom chatbot using our preferred LLM tech stack.

We use Retrieval Augmented Generation (RAG), one of several techniques for enriching an LLM with custom knowledge. With RAG, documents relevant to the user's question are retrieved at query time and passed to the model as context.
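To make the "augmented" part of RAG concrete, here is a minimal sketch of how retrieved passages might be combined with a user's question before the LLM is called. The function name `build_rag_prompt`, the prompt wording, and the sample passages are illustrative assumptions and not code from this repository. Retrieval and the LLM call are left out on purpose.

```python
from typing import Sequence


def build_rag_prompt(question: str, passages: Sequence[str], max_passages: int = 3) -> str:
    """Combine retrieved passages and a question into one prompt string.

    Hypothetical helper for illustration; the prompt wording is an assumption.
    """
    selected = list(passages)[:max_passages]
    # Number each passage so the model (and a human reader) can tell them apart.
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(selected, start=1))
    return (
        "Answer the question using only the context below.\n"
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Made-up stand-ins for passages a vector database would return.
    retrieved = [
        "The widget ships in blue and green.",
        "Orders are processed within two business days.",
    ]
    print(build_rag_prompt("Which colors does the widget come in?", retrieved))
```

The resulting string is what would be sent to the model, which is why the quality of the retrieved passages matters so much.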

For more resources, check out our blog posts on AI.

Goals

  • Demonstrate how to add custom data to an LLM (we're using OpenAI's gpt-3.5-turbo)
  • Demonstrate an LLM chatbot with conversational memory
  • Demonstrate using agents and tools with the LangChain framework

Why?

Domain-specific AI chatbots can be used in ways such as the following:

  • Virtual Assistants
  • Knowledge retrieval
  • Text synthesis
  • Text formatting
  • Sentiment analysis

Technical Objectives

  • Ingest data into a vector database
  • Query the vector database
  • Query an agent that decides whether to query the vector database

Architecture Overview

Overview Diagram

Prerequisites

  1. Clone this repository, or fork it if you want to make your own edits.
  2. Pinecone vector database. You can create a free account at Pinecone's website.
  3. OpenAI API account. You can sign up at OpenAI's website.
    • You will need to add a payment method to your account.
  4. Python (and your favorite IDE). We are using Python v3.10.7.
  5. Your favorite API client tool (we used Postman, but you can also use curl).

Set up your environment

  1. Install dependencies: pip install -r requirements.txt
  2. Start the app: python3 main.py
  3. With your favorite API client tool, send a GET request to the root endpoint (localhost:8000/)

If you receive the message “Hello World”, you are good to go 🎉
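If you'd rather check from Python than from an API client, a small script along these lines may do. It is a sketch that assumes the app is listening on localhost:8000 as described above. Whether the response is plain text or JSON is not stated here, so the check only looks for the substring "Hello World". The helper names are made up for this example.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000/"  # root endpoint from the step above


def looks_healthy(body: str) -> bool:
    """Return True if the response body contains the expected greeting."""
    # The exact response format (plain text vs. JSON) is an assumption,
    # so this only checks for the substring.
    return "Hello World" in body


def fetch_root(url: str = BASE_URL, timeout: float = 5.0) -> str:
    """Send a GET request and return the decoded response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    try:
        body = fetch_root()
    except (urllib.error.URLError, OSError) as exc:
        # Typically means the app is not running yet.
        print(f"Could not reach {BASE_URL}: {exc}")
    else:
        print("App is up" if looks_healthy(body) else f"Unexpected response: {body!r}")
```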

Ingesting Data

We use Llama Hub loaders to facilitate our ETL process. These loaders handle tokenization and embedding for us, using OpenAI's embedding model to translate text into vectors.

For more information on how OpenAI's embedding model works, here's a good starting point.
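To give a feel for what the loaders do on our behalf, here is a rough sketch of the ETL shape: split text into chunks, turn each chunk into a vector, and package the result as records with an id, values, and metadata. Everything in it is hypothetical. In particular, `toy_embed` is a deterministic placeholder with no semantic meaning, standing in for a call to a real embedding model, and it produces 8 numbers instead of the 1536 that text-embedding-ada-002 returns.

```python
import hashlib
from typing import Dict, List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (real loaders work on tokens)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(chunk: str, dimension: int = 8) -> List[float]:
    """Placeholder for an embedding model: hash bytes scaled to [0, 1]."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()  # 32 bytes
    return [byte / 255.0 for byte in digest[:dimension]]


def to_records(source_url: str, text: str) -> List[Dict]:
    """Build one record per chunk, keeping the chunk text as metadata."""
    return [
        {
            "id": f"{source_url}#chunk-{n}",
            "values": toy_embed(chunk),
            "metadata": {"source": source_url, "text": chunk},
        }
        for n, chunk in enumerate(chunk_text(text))
    ]


if __name__ == "__main__":
    for record in to_records("https://example.com", "lorem ipsum " * 40):
        print(record["id"], len(record["values"]), repr(record["metadata"]["text"][:30]))
```

Keeping the original chunk text in the metadata is what lets a later query return readable passages and not just vector ids.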

Set up Infrastructure

We recommend setting up a Pinecone vector database. Many awesome vector databases exist, but Pinecone is a great starting point. Pinecone is a native vector database, built for similarity search, which can improve the accuracy of search results. It is also fully managed and provides a dashboard out of the box.

  1. Follow the Prerequisites steps if you haven't already.
  2. Set up your vector database
    • Create an index and give it a name.
    • Set the index dimension to 1536. This is the number of output dimensions of OpenAI's embedding model text-embedding-ada-002.
  3. Update environment variables
    • Create a .env file that contains the following:
OPENAI_API_KEY=<insert OpenAI API key>
PINECONE_API_KEY=<insert Pinecone API Key>
    • In the config.py file, update the Pinecone information:
PINECONE_INDEX=<name of your index>
PINECONE_ENVIRONMENT=<name of your pinecone environment, ex: asia-southeast1-gcp-free>
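A forgotten or still-templated key is an easy mistake at this step, so a quick check before starting the app can save some head-scratching. The sketch below uses only the standard library and invented helper names (`parse_env`, `missing_keys`). How this repository actually loads the .env file is not shown here, so treat this as an independent sanity check and not as the app's own loading code.

```python
from typing import Dict, Iterable, List

REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return keys that are absent, empty, or still hold a <placeholder>."""
    return [key for key in required if not env.get(key) or env[key].startswith("<")]


if __name__ == "__main__":
    # Sample content only; in practice you would read the text of your .env file.
    sample = "OPENAI_API_KEY=sk-example\nPINECONE_API_KEY=<insert Pinecone API Key>\n"
    print(missing_keys(parse_env(sample)))  # prints ['PINECONE_API_KEY']
```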

Run App

This developer kit contains a loader for scraping a website. It is located in import_service.py.

  1. Start the app: python3 main.py
  2. Send a POST request to the endpoint /load-website-docs with the following body:
{
  "page_urls": [""]
}

Example:

{
  "page_urls": [
    "https://focusedlabs.io",
    "https://focusedlabs.io/about",
    "https://focusedlabs.io/contact",
    "https://focusedlabs.io/case-studies",
    "https://focusedlabs.io/case-studies/agile-workflow-enabled-btr-automation",
    "https://focusedlabs.io/case-studies/hertz-technology-new-markets",
    "https://focusedlabs.io/case-studies/aperture-agile-transformation",
    "https://focusedlabs.io/case-studies/automated-core-business-functionality"
  ]
}

Outcome: You'll see the vector count increase in your Pinecone dashboard. Yay!!! Now you have data you can query.
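If you prefer to script the ingestion call, something like the following could work. It is a sketch that assumes the app runs on localhost:8000 (the same port as the root endpoint) and that the endpoint accepts a JSON body. `build_load_request` is a name invented for this example, and the URL validation is an extra safeguard that the endpoint itself may or may not perform.

```python
import json
import urllib.error
import urllib.request
from typing import List
from urllib.parse import urlparse

BASE_URL = "http://localhost:8000"  # assumes the port used for the root endpoint


def build_load_request(page_urls: List[str], base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for /load-website-docs."""
    for url in page_urls:
        parts = urlparse(url)
        # Reject empty strings and relative paths before they reach the scraper.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {url!r}")
    body = json.dumps({"page_urls": page_urls}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/load-website-docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_load_request(["https://focusedlabs.io", "https://focusedlabs.io/about"])
    try:
        # Scraping and embedding can take a while, hence the generous timeout.
        with urllib.request.urlopen(request, timeout=60) as response:
            print(response.status, response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError) as exc:
        print(f"Request failed: {exc}")
```

Note that the empty string in the body template is only a placeholder: this helper rejects it, so fill in real URLs first.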

Query Data

Search the Database

We'll start with the bare minimum: making sure we can query the database. This executes a semantic search over the data you've loaded. For more details on what semantic search with Pinecone looks like, start with this article.

  1. Start the app: python3 main.py
  2. Send a POST request to the endpoint /search-database with the following body:
{
  "text": ""
}

Example:

{
  "text": "What solutions did Focused Labs provide for Hertz?"
}

Outcome: You’ll receive an answer from the database.
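Under the hood, semantic search comes down to comparing the question's vector with the stored vectors and returning the closest ones. The toy example below illustrates that idea with hand-made three-dimensional vectors and cosine similarity. It is an assumption-laden sketch: the real vectors have 1536 dimensions, the similarity metric depends on how your Pinecone index was created, and the database does this ranking for you at scale.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k ids whose vectors are most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vector)) for doc_id, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    # Hand-made vectors with made-up ids, for illustration only.
    index = {
        "case-study": [0.9, 0.1, 0.0],
        "contact-page": [0.0, 0.2, 0.9],
        "about-page": [0.5, 0.5, 0.1],
    }
    for doc_id, score in top_k([1.0, 0.0, 0.0], index):
        print(f"{doc_id}: {score:.3f}")
```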

Query an agent

OK, you can retrieve data from the database. But what happens when a user asks an unrelated question like "Who are you?" For that, we need an agent. You can think of an agent as the brain that decides which tool to use. Sometimes you need to query the database. Sometimes you don't. The agent decides.

Here is an update to our Architecture Overview diagram showing the agent.

Overview with agent Diagram

  1. Start the app: python3 main.py
  2. Send a POST request to the endpoint /ask-agent with the following body:
{
  "text": ""
}

Example:

{
  "text": "Who are you?"
}

Outcome: You’ll receive an answer from the agent.
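The routing idea can be sketched without any framework. In the example below, a hard-coded phrase list stands in for the decision that a real agent delegates to the LLM, and both tools are placeholders. None of this is the repository's LangChain code. It only shows the shape of the flow: pick a tool, run it, and remember the exchange.

```python
from typing import Callable, Dict, List, Tuple


def search_database(question: str) -> str:
    """Placeholder tool; the real one would query the vector database."""
    return f"(database results for: {question})"


def answer_directly(question: str) -> str:
    """Placeholder tool for questions that need no lookup."""
    return "I'm a chatbot that answers questions about the loaded documents."


TOOLS: Dict[str, Callable[[str], str]] = {
    "search": search_database,
    "direct": answer_directly,
}

# ASSUMPTION: a fixed phrase list replaces the LLM's reasoning about which tool fits.
SMALL_TALK = ("who are you", "hello", "hi", "thanks")


def choose_tool(question: str) -> str:
    """Stand-in for the agent's decision step."""
    normalized = question.lower().strip(" ?!.")
    return "direct" if normalized in SMALL_TALK else "search"


class ToyAgent:
    """Routes each question to a tool and keeps a simple conversation history."""

    def __init__(self) -> None:
        self.history: List[Tuple[str, str]] = []  # (question, answer) pairs

    def ask(self, question: str) -> str:
        answer = TOOLS[choose_tool(question)](question)
        self.history.append((question, answer))
        return answer


if __name__ == "__main__":
    agent = ToyAgent()
    print(agent.ask("Who are you?"))
    print(agent.ask("What solutions did Focused Labs provide for Hertz?"))
```

A real agent would also feed the stored history back into the model, which is what gives the chatbot its conversational memory.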

FAQ

  • If you run into a Rate Limit Error, you need to make sure your OpenAI account has credit available.
  • If you run into a PermissionError: [Errno 13] Permission denied, make sure you are running your app with Python 3 (python3 main.py).
  • If you run into a MaxRetryError...Caused by SSLError when you are uploading your data to the vector database, wait another 5 minutes for your index to fully initialize and try again.
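For transient failures like the last one, a generic retry helper with increasing delays can take the place of waiting and re-running by hand. This is a standard-library sketch with invented names. You would pass in whichever exception types your client actually raises, since the default of `ConnectionError` is only a placeholder, and you would pick delays that suit the situation, such as minutes instead of seconds for an index that is still initializing.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 5,
    first_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds, doubling the delay after each failure."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
    raise ValueError("attempts must be at least 1")


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_upload() -> str:
        # Simulated operation that fails twice before succeeding.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("index not ready")
        return "uploaded"

    print(retry(flaky_upload, first_delay=0.01), "after", state["calls"], "calls")
```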