
Add conversation search documentation #5018

Merged (15 commits) on Sep 19, 2023
226 changes: 226 additions & 0 deletions _ml-commons-plugin/conversational-search.md
---
layout: default
title: Conversational search
has_children: false
nav_order: 200
---

# Conversational search

Conversational search is an experimental machine learning (ML) feature that enables a new search interface. Whereas traditional document search allows a user to ask a question and receive a list of documents that might contain the answer to that question, conversational search uses large language models (LLMs) to read the top N documents and synthesize them into a plaintext "answer" to your question.

Currently, conversational search uses two systems to synthesize documents:

- [Conversation memory](#conversation-memory)
- [Retrieval-Augmented Generation (RAG) pipeline](#rag-pipeline)

## Conversation memory

Conversation memory consists of a simple CRUD-like API comprising two resources: **Conversations** and **Interactions**. Conversations are made up of interactions. An interaction represents a pair of messages: a human input and the AI response.

To make it easier to build and debug applications that use conversation memory, conversations and interactions are stored in two system indexes: `conversation-meta` and `conversation-interactions`.

### `conversation-meta` index

In the `conversation-meta` index, you can customize the `name` field to make it easier for your users to identify and resume a conversation with the AI. The `conversation-meta` index uses the following schema:

```jsx
.plugins-ml-conversation-meta
{
"_meta": {
"schema_version": 1
},
"properties": {
"name": {"type": "keyword"},
"create_time": {"type": "date", "format": "strict_date_time||epoch_millis"},
"user": {"type": "keyword"}
}
}
```

### `conversation-interactions` index

In the `conversation-interactions` index, the following fields are set by the user or the AI application. Each field is input as a string:

| Field | Description |
| :--- | :--- |
| `input` | The question that forms the basis for the interaction. |
| `prompt_template` | The prompt template that was used as the framework for this interaction. |
| `response` | The AI’s response to the prompt. |
| `origin` | The name of the AI or other system that generated the response. |
| `additional_info` | Any other information that was sent to the “origin” in the prompt. |

The goal of the `conversation-interactions` index is to store each interaction cleanly so that the exact prompts sent to the LLM can easily be reconstructed, enabling robust debugging and explainability. The index uses the following schema:

```jsx
.plugins-ml-conversation-interactions
{
"_meta": {
"schema_version": 1
},
"properties": {
"conversation_id": {"type": "keyword"},
"create_time": {"type": "date", "format": "strict_date_time||epoch_millis"},
"input": {"type": "text"},
"prompt_template": {"type": "text"},
"response": {"type": "text"},
"origin": {"type": "keyword"},
"additional_info": {"type": "text"}
}
}
```
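
Because interactions are stored in an index, you can search that index directly to reconstruct the prompts for a given conversation. The following is a minimal sketch, not part of the Memory API: it assumes that your user has permission to read the `.plugins-ml-conversation-interactions` system index (system indexes may be protected when the Security plugin is enabled), and the conversation ID shown is the example ID used later in this document:

```json
GET .plugins-ml-conversation-interactions/_search
{
  "query": {
    "term": { "conversation_id": "4of2c9nhoIuhcr" }
  },
  "sort": [
    { "create_time": { "order": "asc" } }
  ]
}
```

Sorting by `create_time` returns the interactions in the order in which they occurred.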

### Creating conversations and interactions

To begin using conversation memory, enable the following cluster setting:

```json
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.memory_feature_enabled": true
}
}
```

After conversation memory is enabled, you can use the Memory API to create a conversation. Remember, you cannot create an interaction until you've created a conversation.

To make the conversation easily identifiable, use the optional `name` field in the Memory API, as shown in the following example. This will be your only opportunity to give your conversation a name.



```json
POST /_plugins/_ml/memory/conversation
{
"name": Example conversation
}
```

The Memory API responds with the conversation ID, as shown in the following example response:

```json
{ "conversation_id": "4of2c9nhoIuhcr" }
```

You'll use the `conversation_id` to create interactions inside the conversation. To create interactions, enter the `conversation_id` into the Memory API path. Then, customize the [fields](#conversation-interactions-index) in the request body, as shown in the following example:

```json
POST /_plugins/_ml/memory/conversation/4of2c9nhoIuhcr
{
"input": "How do I make an interaction?",
"prompt_template": "Hello OpenAI, can you answer this question? \
Here's some extra info that may help. \
[INFO] \n [QUESTION]",
"response": "Hello, this is OpenAI. Here is the answer to your question.",
"origin": "MyFirstOpenAIWrapper",
"additional_info": "Additional text related to the answer \
A JSON or other semi-structured response"
}
```

The Memory API then responds with an interaction ID, as shown in the following response:

```json
{ "interaction_id": "948vh_PoiY2hrnpo" }
```

### Getting conversations

You can get a list of conversations using the following Memory API operation:

```json
GET /_plugins/_ml/memory/conversation?max_results=3&next_token=0
```

Use the following query parameters to customize your results:

Parameter | Data type | Description
:--- | :--- | :---
`max_results` | Integer | The maximum number of results returned by the response. Default is `10`.
`next_token` | Integer | The ordered position in the sorted list of conversations from which to start returning the next page of results. Use this to paginate when more conversations than `max_results` exist.

The Memory API returns the most recently created conversations first, as indicated by the `create_time` field, as shown in the following example response:

```json
{
"conversations": [
{
"conversation_id": "0y4hto_in1",
"name": "",
"create_time": "2023-4-23 10:25.324662"
}, ... (2 more since we specified max_results=3)
],
"next_token": 3
}
```


If fewer conversations exist than the number set in `max_results`, the response returns only the conversations that exist. Lastly, `next_token` provides the ordered position within the sorted list of conversations. If a conversation is added between subsequent GET conversation calls, one of the listed conversations will be duplicated in the next result, for example:

```json
GetConversations -> [BCD]EFGH
CreateConversation -> ABCDEFGH
GetConversations(next_token=3) -> ABC[DEF]GH
```

### Getting interactions

To see a list of interactions in a conversation, enter the `conversation_id` at the end of the API request path, as shown in the following example. You can use `max_results` and `next_token` to paginate the response:

```json
GET /_plugins/_ml/memory/conversation/4of2c9nhoIuhcr
```

The Memory API returns the following interaction information:

```json
{
"interactions": [
{
"interaction_id": "342968y2u4-0",
"conversation_id": "0y4hto_in1",
"create_time": "2023-4-23 10:25.324662",
"input": "How do I make an interaction?",
"prompt_template": "Hello OpenAI, can you answer this question? \
Here's some extra info that may help. \
[INFO] \n [QUESTION]",
"response": "Hello, this is OpenAI. Here is the answer to your question.",
"origin": "MyFirstOpenAIWrapper",
"additional_info": "Additional text related to the answer \
A JSON or other semi-structured response"
}, ... (9 more since max_results defaults to 10)
],
"next_token": 10
}
```
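
The `max_results` and `next_token` query parameters work the same way for interactions as they do for conversations. For example, the following request, a minimal sketch based on the pagination behavior described earlier, retrieves the next page of interactions using the `next_token` value returned in the previous response:

```json
GET /_plugins/_ml/memory/conversation/4of2c9nhoIuhcr?max_results=10&next_token=10
```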

### Deleting conversations

To delete a conversation, use the `DELETE` operation, as shown in the following example:

```json
DELETE /_plugins/_ml/memory/conversation/4of2c9nhoIuhcr
```

The Memory API responds with the following:

```json
{ "success": true }
```

## RAG pipeline

RAG is a technique for grounding an LLM in retrieved documents, providing a natural language search interface. In OpenSearch, RAG is implemented as a search pipeline.

### Enabling RAG

Use the following cluster setting to enable the RAG pipeline feature:

```json
PUT /_cluster/settings
{
"persistent": {"plugins.ml_commons.rag_pipline_feature_enabled": "true"}
}
```

RAG requires an LLM in order to function. We recommend using a [connector](https://opensearch.org/docs/latest/ml-commons-plugin/extensibility/connectors/).
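
For example, you can create a connector to a remote model using the ML Commons connector API. The following is a minimal sketch rather than a complete blueprint: the provider, endpoint, model, and credential values are placeholders that you would replace with your own, and the exact `request_body` format depends on the model provider:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI chat connector",
  "description": "Example connector to an LLM used by the RAG pipeline",
  "version": 2,
  "protocol": "http",
  "parameters": {
    "endpoint": "api.openai.com",
    "model": "gpt-3.5-turbo"
  },
  "credential": {
    "openAI_key": "<YOUR_OPENAI_API_KEY>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/v1/chat/completions",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": ${parameters.messages} }"
    }
  ]
}
```

After creating a connector, you typically register and deploy a model that uses it; the resulting model ID is what the RAG search pipeline references.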