forked from pinterest/querybook
feat: refactor ai-assistant plugin and add vector search support (pinterest#1325)

* feat: use websocket for ai assistant (pinterest#1311)
* feat: use websocket for ai assistant
* fix node test
* comments
* remove memory and add keep button
* fix linter
* feat: add embedding based table search support (pinterest#1314)
* feat: add embedding based table search support
* update
* build fail
* linter
* test failure
* comments
* nodetest
* opensearch volumne path
* docs: ai assistant plugin (pinterest#1323)
* feat: add vector table search (pinterest#1322)
* feat: add vector table search
* fix linter
* remove realtime record query cell
* handle table select exceptions
* add public config
* comments
* remove unused config
* update var name metatore_id as metastoreId
* comments
1 parent 9964555 commit beda6e3
Showing 74 changed files with 2,143 additions and 850 deletions.
@@ -0,0 +1,47 @@
---
id: add_ai_assistant
title: AI Assistant
sidebar_label: AI Assistant
---

:::info
Please check the [user guide](../user_guide/ai_assistant.md) to see what the AI assistant features look like.
:::

The AI assistant plugin is powered by an LLM (large language model), such as ChatGPT from OpenAI. We use [LangChain](https://python.langchain.com/docs/get_started/introduction) to build the plugin.

## AI Assistant Plugin

The AI Assistant plugin lets users generate query titles, convert text to SQL, and automatically fix failed queries.

Follow the steps below to enable the AI assistant plugin:

1. [Optional] Create your own AI assistant provider if needed. Refer to `querybook/server/lib/ai_assistant/openai_assistant.py` as an example.

2. Add your provider in `plugins/ai_assistant_plugin/__init__.py`.

3. Add configs in `querybook_config.yaml`. Refer to `containers/bundled_querybook_config.yaml` as an example. Also check the model's official documentation for all available model args.

    - Don't forget to set the proper environment variables for your provider, e.g. for OpenAI you'll need `OPENAI_API_KEY`.

4. Enable it in `querybook/config/querybook_public_config.yaml`.

## Vector Store Plugin

The vector store plugin supports embedding-based table search using natural language. It requires an embeddings provider and a vector store. See the LangChain docs for details on the available [embeddings](https://python.langchain.com/docs/integrations/text_embedding/) and [vector stores](https://python.langchain.com/docs/integrations/vectorstores/).

:::note
Setting up and hosting a vector store, or using a cloud vector store service, is not covered here. You can choose your own vector database solution.
:::

1. [Optional] Create your own embeddings or vector store if needed. Refer to `querybook/server/lib/vector_store/stores/opensearch.py` as an example.

2. Add the providers in `plugins/vector_store_plugin/__init__.py`.

3. Add configs in `querybook_config.yaml`. Refer to `containers/bundled_querybook_config.yaml` as an example. Also check the LangChain docs for the configs each vector store requires.

    - Also, don't forget to set the proper environment variables for your provider, e.g. for OpenAI embeddings you'll need `OPENAI_API_KEY`.

4. Enable it in `querybook/config/querybook_public_config.yaml`.

With the vector store plugin enabled, text-to-sql will also use it to find tables when the user does not provide any.
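
The table fallback for text-to-sql can be pictured with a small sketch. This is only an illustration under assumptions: `resolve_tables`, `search_tables`, and `fake_vector_search` are hypothetical names that do not come from the Querybook code base, and the real plugin may rank or filter tables differently.

```python
from typing import Callable, List, Sequence


def resolve_tables(
    question: str,
    user_tables: Sequence[str],
    search_tables: Callable[[str], List[str]],
) -> List[str]:
    """Return the tables text-to-sql should use for a question.

    Hypothetical helper: tables picked by the user always win; the vector
    store search is only consulted when the user picked nothing.
    """
    if user_tables:
        return list(user_tables)
    return search_tables(question)


def fake_vector_search(question: str) -> List[str]:
    # Stand-in for an embedding based search; a real one would query the vector store.
    return ["main.orders"] if "order" in question.lower() else []
```

Passing the search as a callable keeps the fallback easy to test without a running vector store.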
@@ -0,0 +1,44 @@
---
id: ai_assistant
title: AI Assistant
sidebar_label: AI Assistant
---

If the [AI Assistant plugin](../integrations/add_ai_assistant.md) is enabled, you'll be able to use the LLM-powered AI features below.

## Title Generation

Clicking the '#' icon automatically generates a title for the query cell.
![](/img/user_guide/title_generation.gif)

## Text To SQL

Hover over the left side of the query cell and click the star-like icon to open the text-to-sql modal.

### Query Generation

To use it, select the table(s) you want to query, type your question as the prompt, and hit Enter.

If you're unsure which table to use, you can type your question directly and the AI will try to find the table(s) for you.

![](/img/user_guide/text_to_sql.gif)

### Query Editing

If you would like to modify the generated query, you can keep the query and type another prompt to edit it.

If the query cell already contains a query, opening the text-to-sql modal automatically goes to edit mode.

![](/img/user_guide/text_to_sql_edit.gif)

## SQL Fix

If your query fails, you will see an 'Auto fix' button in the right corner of the error message.

![](/img/user_guide/sql_fix.gif)

## Search Table by Natural Language

If the [vector store](../integrations/add_ai_assistant.md#vector-store-plugin) of the AI assistant plugin is also enabled, you'll be able to search tables by natural language in addition to keyword-based search.

![](/img/user_guide/table_vector_search.png)
@@ -0,0 +1,10 @@
ALL_PLUGIN_VECTOR_STORES = {}
ALL_PLUGIN_EMBEDDINGS = {}

# Example to add vector store

# from lib.vector_store.stores.opensearch import OpenSearchVectorStore
# from langchain.embeddings import OpenAIEmbeddings

# ALL_PLUGIN_VECTOR_STORES = {"opensearch": OpenSearchVectorStore}
# ALL_PLUGIN_EMBEDDINGS = {"openai": OpenAIEmbeddings}
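
To illustrate how name-to-class mappings like these are typically consumed, here is a minimal sketch. It is not Querybook's loader: `DummyVectorStore`, `DummyEmbeddings`, and `get_plugin_class` are hypothetical stand-ins, chosen so the snippet runs without OpenSearch or LangChain installed.

```python
from typing import Dict, Type


class DummyVectorStore:
    """Hypothetical stand-in for a class such as OpenSearchVectorStore."""


class DummyEmbeddings:
    """Hypothetical stand-in for a class such as OpenAIEmbeddings."""


# Same shape as the dictionaries in the plugin module: provider name -> class.
ALL_PLUGIN_VECTOR_STORES: Dict[str, Type] = {"dummy_store": DummyVectorStore}
ALL_PLUGIN_EMBEDDINGS: Dict[str, Type] = {"dummy_embeddings": DummyEmbeddings}


def get_plugin_class(registry: Dict[str, Type], name: str) -> Type:
    """Look up a registered class by its configured name, with a clear error."""
    try:
        return registry[name]
    except KeyError:
        known = ", ".join(sorted(registry)) or "<none>"
        raise ValueError(f"Unknown provider {name!r}; registered: {known}") from None
```

Keying the registry by a short name lets the config file refer to a provider by string while the plugin module decides which class that string maps to.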
@@ -10,3 +10,6 @@ ai_assistant:

    query_auto_fix:
        enabled: true

    table_vector_search:
        enabled: false
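
As a hedged illustration of how a flag such as `table_vector_search.enabled` might be checked, the sketch below mirrors the fragment as a plain dictionary. It assumes both entries are nested under `ai_assistant`, as the hunk header suggests; `public_config` and `is_ai_feature_enabled` are hypothetical names, not part of Querybook.

```python
from typing import Any, Mapping

# Mirrors the fragment above as a plain dict (roughly what a YAML loader would produce).
public_config = {
    "ai_assistant": {
        "query_auto_fix": {"enabled": True},
        "table_vector_search": {"enabled": False},
    }
}


def is_ai_feature_enabled(config: Mapping[str, Any], feature: str) -> bool:
    """Return True only if ai_assistant.<feature>.enabled is truthy.

    Missing sections or keys are treated as disabled.
    """
    feature_config = config.get("ai_assistant", {}).get(feature, {})
    return bool(feature_config.get("enabled", False))
```

Treating a missing key as disabled matches the diff's default of shipping `table_vector_search` turned off.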
@@ -0,0 +1,30 @@
from enum import Enum


# KEEP IT CONSISTENT AS webapp/const/aiAssistant.ts
class AICommandType(Enum):
    SQL_FIX = "sql_fix"
    SQL_TITLE = "sql_title"
    TEXT_TO_SQL = "text_to_sql"
    SQL_SUMMARY = "sql_summary"
    TABLE_SUMMARY = "table_summary"
    TABLE_SELECT = "table_select"


AI_ASSISTANT_NAMESPACE = "/ai_assistant"


DEFAULT_SAMPLE_QUERY_COUNT = 50
MAX_SAMPLE_QUERY_COUNT_FOR_TABLE_SUMMARY = 5


# the minimum score for a table to be considered as a match
DEFAULT_SIMILARITY_SCORE_THRESHOLD = 0.6
# the minimum score for a table to be considered as a great match
DEFAULT_SIMILARITY_SCORE_THRESHOLD_GREAT_MATCH = 0.7
# how many docs to fetch from vector store, it may include both table and query summary docs and they need additional processing.
DEFAULT_VECTOR_STORE_FETCH_LIMIT = 30
# how many tables to return from vector table search eventually
DEFAUTL_TABLE_SEARCH_LIMIT = 10
# how many tables to select for text-to-sql
DEFAUTL_TABLE_SELECT_LIMIT = 3
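
The comments on the similarity constants describe a filter-then-truncate step, which might look like the sketch below. It assumes that higher scores mean better matches and that results arrive as `(table_name, score)` pairs; `pick_tables` and `TABLE_SEARCH_LIMIT` are hypothetical names, and the actual search code in this commit may combine table and query summary docs differently.

```python
from typing import List, Tuple

DEFAULT_SIMILARITY_SCORE_THRESHOLD = 0.6
DEFAULT_SIMILARITY_SCORE_THRESHOLD_GREAT_MATCH = 0.7
TABLE_SEARCH_LIMIT = 10  # same value as DEFAUTL_TABLE_SEARCH_LIMIT above


def pick_tables(
    scored_tables: List[Tuple[str, float]],
    threshold: float = DEFAULT_SIMILARITY_SCORE_THRESHOLD,
    limit: int = TABLE_SEARCH_LIMIT,
) -> List[Tuple[str, float, bool]]:
    """Keep tables scoring at least `threshold`, best first, at most `limit`.

    Each result is (table_name, score, is_great_match).
    """
    matches = [(name, score) for name, score in scored_tables if score >= threshold]
    matches.sort(key=lambda item: item[1], reverse=True)
    return [
        (name, score, score >= DEFAULT_SIMILARITY_SCORE_THRESHOLD_GREAT_MATCH)
        for name, score in matches[:limit]
    ]
```

For example, `pick_tables([("a", 0.65), ("b", 0.9), ("c", 0.2)])` keeps `b` as a great match and `a` as a plain match, and drops `c`.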
@@ -1,7 +1,9 @@
from . import query_execution
from . import datadoc
from . import connect
from . import ai_assistant

connect
query_execution
datadoc
ai_assistant