This is a repository for a prototype of a chatbot for the Department for Education (DfE) Explore Education Statistics service.
The app is powered by embeddings: when a user submits a query, the relevant parts of the knowledge base are retrieved and the app then calls the OpenAI API to answer the question.
The backend tech stack is the Python framework FastAPI and the vector database Qdrant. FastAPI is a fast, modern framework for building APIs in Python. LangChain is used to query Qdrant and to interact with the OpenAI API. For more information about FastAPI, LangChain or Qdrant, please visit their respective documentation.
The frontend tech stack is Next.js and TypeScript, although this is subject to change.
There are three projects contained within this repository: a Next.js frontend UI, a FastAPI server for data ingestion, and a FastAPI server for the backend, which live in the `chatbot-ui`, `data_ingestion` and `response_automater` folders respectively.
The FastAPI server for data ingestion has various endpoints to build, rebuild and delete different parts of the Qdrant vector database. To build the database, information is extracted from the content APIs of the explore-education-statistics service and chunked into smaller units of text. Via the OpenAI and Qdrant APIs these pieces of text are converted into vector embeddings and stored in the Qdrant vector database. The endpoint to build the database is `.../api/maintenance/publications/build`, which is defined in `data_ingestion/routers/maintenance.py`. It can be used to build or rebuild all the information from the latest publications in the Qdrant vector database. The same file also contains endpoints for building information relating to methodologies and for deleting the embeddings stored within the database. The other two files within the router directory, `publications.py` and `methodologies.py`, have endpoints for updating a specific publication or methodology within the Qdrant database. For example, if there were a new release of the attendance publication, a POST request to `.../pupil-attendance-in-schools/update` could be triggered.
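The chunking step described above can be sketched as follows. This is a simplified illustration only — the project uses LangChain's text splitters, and the chunk size and overlap values here are assumptions, not the project's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters.

    Overlap between consecutive chunks helps preserve context that would
    otherwise be lost at chunk boundaries.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each resulting chunk is then embedded via the OpenAI API and upserted into Qdrant.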
The latter FastAPI server, `response_automater`, ties together the Qdrant, OpenAI and LangChain APIs. When a user inputs a question into the app, the question is sent to the `.../api/chat` endpoint, where it is converted into a vector embedding. Based on the cosine similarity of this embedding with the embeddings in the vector database, the three most relevant chunks are returned. How the API responds is governed by the prompt template (contained in `utils.py`) and `services/message_service.py`. The latter contains a `send_message` function, which encompasses the logic for interacting with the Qdrant, OpenAI and LangChain APIs and allows the endpoint to send its response as an event stream.
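Qdrant performs the similarity search internally, so the app never computes this by hand; the sketch below just illustrates the idea of ranking stored embeddings by cosine similarity, using toy low-dimensional vectors in place of real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], store: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda name: cosine_similarity(query, store[name]), reverse=True)
    return ranked[:k]
```

The three chunks returned this way are interpolated into the prompt template before the question is forwarded to the OpenAI API.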
- Python version 3.11 or higher installed on your system.
- Docker installed and running on your system.
- Pipenv for managing Python dependencies.
- npm for managing frontend dependencies.
- Clone the repo:

  ```
  git clone https://github.com/dfe-analytical-services/chatbot-prototype.git
  cd chatbot-prototype
  ```
- Install pnpm if you haven't already:

  ```
  npm install -g pnpm
  ```
- Install Pipenv if you haven't already:

  ```
  pip install pipenv
  ```
- Create a virtual environment and install project dependencies:

  ```
  pipenv install --dev
  ```
- Set up pre-commit hooks:

  ```
  pipenv run pre-commit install
  ```
- In the project's root directory, `.env.example` contains placeholders for environment variables that need to be set. Copy `.env.example` to `.env`:

  ```
  cp .env.example .env
  ```
- Edit the `.env` file and customise the environment variables.
- Make sure Docker is up and running.
- To start the project using Docker Compose, run:

  ```
  docker-compose up -d
  ```

  This will start the required services defined in `docker-compose.yml`. To stop the running containers, navigate to the root of the project directory and run:

  ```
  docker-compose down
  ```
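For orientation, a Qdrant service in a Compose file typically looks like the sketch below. This is illustrative only — the repository's own `docker-compose.yml` is the source of truth for service names, ports and volumes:

```yaml
# Illustrative sketch; see the repository's docker-compose.yml for the real definitions.
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"          # REST API and dashboard
    volumes:
      - ./qdrant_storage:/qdrant/storage   # persists embeddings between runs
```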
Qdrant is the vector database, which runs locally in Docker.
Access the Qdrant dashboard: http://localhost:6333/dashboard
- To run the project locally (outside Docker), make sure you have activated the Pipenv environment:

  ```
  pipenv shell
  ```
- Start the data ingestion server:

  ```
  uvicorn data_ingestion.main:app --host 0.0.0.0 --port 8000 --reload
  ```

  You can optionally skip the previous step by invoking the Pipenv environment in the command itself, like so:

  ```
  pipenv run python -m uvicorn data_ingestion.main:app --host 0.0.0.0 --port 8000 --reload
  ```
- Access the data ingestion API docs: http://localhost:8000/docs.
The Python script `scripts/data_ingest.py` can be used as a helper to make API requests for data maintenance. Once the data is ingested you can stop the FastAPI data ingestion server (if it's being run locally). You can run the script via pnpm using `pnpm data-ingest`.
- For help: `pnpm data-ingest --help`
- To clear the vector database: `pnpm data-ingest --clear`
- To build all methodologies: `pnpm data-ingest --build-methodologies`
- To build all publications: `pnpm data-ingest --build-publications`
- To update a specific methodology: `pnpm data-ingest --update-methodology --slug SLUG`
- To update a specific publication: `pnpm data-ingest --update-publication --slug SLUG`
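The flags above suggest argument handling along these lines. This is a hedged sketch, not the script's actual implementation — the flag names match the pnpm examples, but the wiring from flags to API endpoints is omitted:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI flags mirroring the pnpm data-ingest examples (sketch only)."""
    parser = argparse.ArgumentParser(description="Data maintenance helper")
    parser.add_argument("--clear", action="store_true", help="Clear the vector database")
    parser.add_argument("--build-methodologies", action="store_true", help="Build all methodologies")
    parser.add_argument("--build-publications", action="store_true", help="Build all publications")
    parser.add_argument("--update-methodology", action="store_true", help="Update one methodology")
    parser.add_argument("--update-publication", action="store_true", help="Update one publication")
    parser.add_argument("--slug", help="Slug of the publication/methodology to update")
    return parser
```

Each parsed flag would then map to a request against the corresponding data ingestion endpoint.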
- To run the project locally (outside Docker), make sure you have activated the Pipenv environment:

  ```
  pipenv shell
  ```
- Start the response automater server:

  ```
  uvicorn response_automater.main:app --host 0.0.0.0 --port 8010 --reload
  ```

  You can optionally skip the previous step by invoking the Pipenv environment in the command itself, like so:

  ```
  pipenv run python -m uvicorn response_automater.main:app --host 0.0.0.0 --port 8010 --reload
  ```
- Access the response automater API docs: http://localhost:8010/docs.
- Install all dependencies for the project:

  ```
  pnpm i
  ```
- Start Next.js:

  ```
  pnpm --filter chatbot-ui dev
  ```
- Access the chatbot UI: http://localhost:3002.
This is a guide to starting up the chatbot UI, assuming you have already followed the initial setup and run everything once before. In particular, it assumes you have run the data ingestion server at least once, so that you have a Qdrant data volume and have used the API to ingest data.
- Open a new command prompt in the root directory of the project and run the following:

  ```
  docker-compose up -d
  pipenv shell
  uvicorn response_automater.main:app --host 0.0.0.0 --port 8010 --reload
  ```
- Open a new command prompt in the root directory of the project and run the following:

  ```
  pnpm --filter chatbot-ui dev
  ```