docs: add change logs of 20230920 (pinterest#1327)
jczhong84 authored and aidenprice committed Jan 3, 2024
1 parent 6af1db4 commit 9d623e6
Showing 6 changed files with 101 additions and 1 deletion.
2 changes: 1 addition & 1 deletion containers/bundled_querybook_config.yaml
@@ -22,7 +22,7 @@ ELASTICSEARCH_HOST: http://elasticsearch:9200
# model_name: gpt-3.5-turbo-16k
# temperature: 0
# context_length: 16384
-# query_summary:
+# sql_summary:
# model_args:
# model_name: gpt-3.5-turbo-16k
# temperature: 0
89 changes: 89 additions & 0 deletions docs_website/docs/changelog/2023-09-20.md
@@ -0,0 +1,89 @@
---
id: sept_2023_9_20_0
title: Sept 2023 (version 3.28.0)
sidebar_label: Sept 2023 (3.28.0)
---

Welcome to the latest release of Querybook 🎉.

Here are the top new features we have added so far in 2023:

- **AI assistant**: Support query cell title generation, text-to-SQL, and query auto fix, powered by an LLM.
- **Vector table search**: Use natural language to search a table.
- **Data cell/table comment**: Users can leave comments for data cells and tables.
- **Query optimization suggestions**: Provide a tooltip of query optimization suggestions.
- **User group**: Introduce user groups, which can be used as table/datadoc owner/editor.
- **Data element**: Introduce a new metadata type, `data element`, which describes the semantic meaning of data and can be assigned to a table column.
- **Stats logging**: Add support for stats logging, such as the number of users and the number of API requests.

## Feature highlights

### AI Assistant

The LLM-powered AI assistant can help with:

- Query cell title generation
- Text to SQL
- Query error auto fix

Please check the [guide](../user_guide/ai_assistant.md) for more details.

### Vector Table Search

Previously, table search was keyword-based only. We have now introduced the [vector store plugin](../integrations/add_ai_assistant.md#vector-store-plugin) and added support for searching tables in natural language.
![](/img/user_guide/table_vector_search.png)
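
To illustrate the idea behind natural-language table search, here is a minimal, self-contained sketch. It is not Querybook's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory dictionary stands in for a vector store, and the table names and summaries are made up for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector.

    Assumption: a real setup would call an embedding model instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_tables(question: str, table_summaries: Dict[str, str], k: int = 3) -> List[str]:
    """Return up to k table names whose summaries are most similar to the question."""
    query_vec = embed(question)
    scored = [
        (cosine_similarity(query_vec, embed(summary)), name)
        for name, summary in table_summaries.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop tables that share no tokens with the question.
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical table summaries, for illustration only.
TABLE_SUMMARIES = {
    "ads.daily_spend": "daily advertising spend per advertiser and campaign",
    "users.signups": "one row per user signup with country and signup date",
    "search.queries": "search queries issued by users with timestamps",
}
```

Calling `search_tables("how much did each advertiser spend per campaign", TABLE_SUMMARIES, k=1)` ranks the tables by similarity to the question. With a real embedding model the match would be semantic rather than token-based.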

### Data Cell & Table Comment

Users can leave comments on a data cell or table and view comments from other people.

![](/changelog/20230920/cell_comment.png)
![](/changelog/20230920/table_comment.png)

### Query Optimization Suggestions

The query editor provides optimization suggestions in some cases. Here are some predefined ones for Presto/Trino:

- distinct count -> approx_distinct
- like 'a' or like 'b' -> regexp_like(column, 'a|b')
- union -> union all

You can create your own suggestions by following the example of [PrestoOptimizingValidator](https://github.com/pinterest/querybook/blob/c8949b21c854b367d7bf54f08fbe1a12ad4a47c2/querybook/server/lib/query_analysis/validation/validators/presto_optimizing_validator.py#L177).

Check the [PR](https://github.com/pinterest/querybook/pull/1302) for more details.
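
As a rough illustration of how such rules work, the sketch below flags the three patterns listed above using plain regular expressions. It is a simplified stand-in and not the `PrestoOptimizingValidator` API: the `Suggestion` type, the `RULES` table, and the `suggest` function are hypothetical names, and regex matching on raw text will misfire on comments or string literals.

```python
import re
from typing import List, NamedTuple


class Suggestion(NamedTuple):
    rule: str
    message: str


# (rule name, pattern, message). Patterns are deliberately simple and
# operate on raw query text, so they are only a sketch.
RULES = [
    (
        "approx_distinct",
        re.compile(r"\bcount\s*\(\s*distinct\b", re.IGNORECASE),
        "Consider approx_distinct(x) instead of COUNT(DISTINCT x) if an approximate count is acceptable.",
    ),
    (
        "union_all",
        # UNION that is not followed by ALL.
        re.compile(r"\bunion\b(?!\s+all\b)", re.IGNORECASE),
        "Consider UNION ALL if duplicate rows do not need to be removed.",
    ),
    (
        "regexp_like",
        # col LIKE 'a' OR col LIKE 'b'
        re.compile(r"\blike\s+'[^']*'\s+or\s+[\w.]+\s+like\s+'[^']*'", re.IGNORECASE),
        "Consider regexp_like(column, 'a|b') instead of multiple LIKE clauses joined by OR.",
    ),
]


def suggest(query: str) -> List[Suggestion]:
    """Return one suggestion for every rule whose pattern appears in the query."""
    return [
        Suggestion(name, message)
        for name, pattern, message in RULES
        if pattern.search(query)
    ]
```

For example, `suggest("SELECT COUNT(DISTINCT user_id) FROM a UNION SELECT COUNT(DISTINCT user_id) FROM b")` returns suggestions for the `approx_distinct` and `union_all` rules. Note that these rewrites can change results (approximate counts, retained duplicates), which is why they are surfaced as suggestions rather than applied automatically.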

### User Group

We introduced support for user groups. A user in Querybook can now be either a single user or a user group. A table can be owned by a user group, and a datadoc can be shared with a user group (not implemented yet; the PR is in progress).
![](https://user-images.githubusercontent.com/8308723/216733976-7c2c27cb-ec1b-4401-81e7-c5069798326e.png)

### Data element

A [data element](https://en.wikipedia.org/wiki/Data_element) is an atomic unit of data that has a precise meaning or precise semantics, such as country or age. We added data element as a new metadata type, which can be assigned to a table column to give the column more meaningful information.

Note: data elements can only be synced from the metastore.

![](https://user-images.githubusercontent.com/8308723/224444625-067f1527-d936-409d-b99c-a25f4a676c21.png)

### Stats logging

Add support for stats logging, such as the number of users and the number of API requests. Please check the [plugin](../integrations/add_stats_logger.md) for more details.

## Small Feature Improvements/Bug Fixes

- Add username and password authentication for the trino client [#1315](https://github.com/pinterest/querybook/pull/1315)
- Add two new plugins: [monkey patch plugin](../integrations/plugins.md#monkey-patch-plugin) and [api plugin](../integrations/plugins.md#api-plugin) [#1266](https://github.com/pinterest/querybook/pull/1266)
- Fix the display of long table names in search modal [#1246](https://github.com/pinterest/querybook/pull/1246)
- Allow data doc deletion from sidebar [#1241](https://github.com/pinterest/querybook/pull/1241)
- Ensure meta_info is updated when an exception occurs [#1230](https://github.com/pinterest/querybook/pull/1230)
- Add helm deployment guide [#1183](https://github.com/pinterest/querybook/pull/1183)
- Add more metadata support [#1182](https://github.com/pinterest/querybook/pull/1182)
- Enable mssql transpiling [#1178](https://github.com/pinterest/querybook/pull/1178)
- Add ability to cancel dead queries [#1159](https://github.com/pinterest/querybook/pull/1159)
- Fix json-bigint hasOwnProperty undefined issue [#1129](https://github.com/pinterest/querybook/pull/1129)
- Add frontend context logging [#1115](https://github.com/pinterest/querybook/pull/1115)
- Add drag and drop for templated variables [#1112](https://github.com/pinterest/querybook/pull/1112)

Querybook Team<br/>
Pinterest<br/>
🚀
10 changes: 10 additions & 0 deletions docs_website/docs/integrations/add_ai_assistant.md
@@ -45,3 +45,13 @@ How to set up and host a vector store or use a cloud vector store solution is no
4. Enable it in `querybook/config/querybook_public_config.yaml`

With vector store plugin enabled, text-to-sql will also use it to find tables if tables are not provided by the user.

### Initialize the Vector Index

In Docker-based deployments, attach to the `web` or `worker` component and run:

```shell
python ./querybook/server/scripts/init_vector_store.py
```

This adds summaries of all tables, along with summaries of sample queries for those tables, to the vector store. If you'd like to index only some of the tables, you can follow the example of `ingest_vector_index` to create your own script.
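
If you write your own script to index a subset of tables, the selection step could look like the sketch below. This is only an assumption about how you might structure it: `select_tables` and `index_selected_tables` are hypothetical helpers, and `index_table` is a placeholder for whatever indexing call your script makes. The actual signature of `ingest_vector_index` is not shown here, so check it in the Querybook source before wiring it in.

```python
from fnmatch import fnmatchcase
from typing import Callable, Iterable, List


def select_tables(table_names: Iterable[str], include_patterns: Iterable[str]) -> List[str]:
    """Keep only the table names that match at least one glob pattern."""
    patterns = list(include_patterns)
    return [
        name
        for name in table_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


def index_selected_tables(
    table_names: Iterable[str],
    include_patterns: Iterable[str],
    index_table: Callable[[str], None],
) -> List[str]:
    """Call index_table for every selected table and return the names indexed.

    index_table is a hypothetical callable supplied by the caller; it stands
    in for the real indexing logic.
    """
    selected = select_tables(table_names, include_patterns)
    for name in selected:
        index_table(name)
    return selected
```

For a dry run, you can pass `print` as `index_table` to see which tables a pattern such as `"ads.*"` would select before indexing anything.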
1 change: 1 addition & 0 deletions docs_website/sidebars.json
@@ -59,6 +59,7 @@
"Changelog": [
"changelog/breaking_changes",
"changelog/security_advisories",
"changelog/sept_2023_9_20_0",
"changelog/dec_2022_3_15_0",
"changelog/nov_2020_2_4_2",
"changelog/may_2020_2_3_0",