diff --git a/docs/images/boxplot-naive-vs-biochatter.pdf b/docs/images/boxplot-naive-vs-biochatter.pdf
index e959007d..710ca682 100644
Binary files a/docs/images/boxplot-naive-vs-biochatter.pdf and b/docs/images/boxplot-naive-vs-biochatter.pdf differ
diff --git a/docs/images/dotplot-per-task.pdf b/docs/images/dotplot-per-task.pdf
index 2e199ae9..e53b6865 100644
Binary files a/docs/images/dotplot-per-task.pdf and b/docs/images/dotplot-per-task.pdf differ
diff --git a/docs/images/dotplot-per-task.png b/docs/images/dotplot-per-task.png
index 2be37524..18f09d09 100644
Binary files a/docs/images/dotplot-per-task.png and b/docs/images/dotplot-per-task.png differ
diff --git a/docs/images/scatter-per-quantisation-name.pdf b/docs/images/scatter-per-quantisation-name.pdf
index ac259969..2364bba8 100644
Binary files a/docs/images/scatter-per-quantisation-name.pdf and b/docs/images/scatter-per-quantisation-name.pdf differ
diff --git a/docs/images/scatter-per-quantisation-name.png b/docs/images/scatter-per-quantisation-name.png
index b3fdb1bf..bc305928 100644
Binary files a/docs/images/scatter-per-quantisation-name.png and b/docs/images/scatter-per-quantisation-name.png differ
diff --git a/docs/images/scatter-quantisation-accuracy.pdf b/docs/images/scatter-quantisation-accuracy.pdf
index 84b4c6d0..be0e4de5 100644
Binary files a/docs/images/scatter-quantisation-accuracy.pdf and b/docs/images/scatter-quantisation-accuracy.pdf differ
diff --git a/docs/images/scatter-size-accuracy.pdf b/docs/images/scatter-size-accuracy.pdf
index 9641f896..61209ceb 100644
Binary files a/docs/images/scatter-size-accuracy.pdf and b/docs/images/scatter-size-accuracy.pdf differ
diff --git a/docs/images/stripplot-extraction-tasks.png b/docs/images/stripplot-extraction-tasks.png
index c50cbd7b..1d23ae3f 100644
Binary files a/docs/images/stripplot-extraction-tasks.png and b/docs/images/stripplot-extraction-tasks.png differ
diff --git a/docs/images/stripplot-per-model.png b/docs/images/stripplot-per-model.png
index 048f60cb..8eb8c779 100644
Binary files a/docs/images/stripplot-per-model.png and b/docs/images/stripplot-per-model.png differ
diff --git a/docs/images/stripplot-rag-tasks.pdf b/docs/images/stripplot-rag-tasks.pdf
index c4728686..9b9a8b80 100644
Binary files a/docs/images/stripplot-rag-tasks.pdf and b/docs/images/stripplot-rag-tasks.pdf differ
diff --git a/docs/images/stripplot-rag-tasks.png b/docs/images/stripplot-rag-tasks.png
index cf695aa3..9f09d347 100644
Binary files a/docs/images/stripplot-rag-tasks.png and b/docs/images/stripplot-rag-tasks.png differ
diff --git a/docs/rag.md b/docs/rag.md
index 1bb6aa60..63b73065 100644
--- a/docs/rag.md
+++ b/docs/rag.md
@@ -1,4 +1,4 @@
-# Retrieval-Augmented Generation (RAG)
+# Retrieval-Augmented Generation
 
 ## Overview
 
@@ -99,6 +99,24 @@ to LLM interaction with the database. The current prototypical implementation of
 query generation through an LLM is implemented in the `prompts.py` module on the
 example of a Neo4j knowledge graph connection.
 
+```mermaid
+sequenceDiagram
+    participant User/Primary Agent
+    participant DatabaseAgent
+    participant Knowledge Graph
+
+    User/Primary Agent ->> DatabaseAgent: question
+    Knowledge Graph ->> DatabaseAgent: schema information
+    DatabaseAgent ->> DatabaseAgent: select entities
+    DatabaseAgent ->> DatabaseAgent: select relationships
+    DatabaseAgent ->> DatabaseAgent: select properties
+    DatabaseAgent ->> DatabaseAgent: generate query
+    DatabaseAgent ->> Knowledge Graph: submit query
+    Knowledge Graph ->> DatabaseAgent: return results
+    DatabaseAgent ->> DatabaseAgent: summarise (optional)
+    DatabaseAgent ->> User/Primary Agent: return results
+```
+
 ### Connecting
 
 The database connectivity of BioChatter to BioCypher knowledge graphs is handled
@@ -273,6 +291,23 @@ in these repositories, you can call `docker compose up -d standalone`
 (`standalone` being the Milvus endpoint, which starts two other services
 alongside it).
 
+```mermaid
+sequenceDiagram
+    participant User/Primary Agent
+    participant VectorDatabaseAgent
+    participant Vector Database
+    participant Documents
+
+    Documents ->> Vector Database: embed text fragments
+    User/Primary Agent ->> VectorDatabaseAgent: question
+    VectorDatabaseAgent ->> VectorDatabaseAgent: generate artificial answer (optional)
+    VectorDatabaseAgent ->> VectorDatabaseAgent: embed question or artificial answer
+    VectorDatabaseAgent ->> Vector Database: submit search query embedding
+    Vector Database ->> VectorDatabaseAgent: return most similar embedded fragments
+    VectorDatabaseAgent ->> VectorDatabaseAgent: summarise (optional)
+    VectorDatabaseAgent ->> User/Primary Agent: return results
+```
+
 ### Connecting
 
 To connect to a vector DB host, we can use the corresponding class:
@@ -352,6 +387,22 @@ Agent. It is designed to interact with various external APIs and provides a
 structured approach to generating queries, fetching results, and interpreting
 the responses from different API services.
 
+```mermaid
+sequenceDiagram
+    participant User/Primary Agent
+    participant APIAgent
+    participant External Software
+
+    External Software ->> APIAgent: API definition
+    User/Primary Agent ->> APIAgent: question
+    APIAgent ->> APIAgent: parameterise API
+    APIAgent ->> APIAgent: generate API query
+    APIAgent ->> External Software: submit query (optional)
+    APIAgent ->> External Software: fetch result
+    External Software ->> APIAgent: return result
+    APIAgent ->> APIAgent: summarise / interpret (optional)
+    APIAgent ->> User/Primary Agent: return results
+```
 
 ### Example: OncoKB Integration
 