Commit

Merge pull request #36 from looker-open-source/uat_llm
Add LLM summaries and Kmeans model objective
ajcrutch authored Jan 18, 2024
2 parents 31aff02 + 88c50ea commit 7bdd79e
Showing 5 changed files with 1,085 additions and 2,541 deletions.
34 changes: 32 additions & 2 deletions README.md
@@ -25,7 +25,6 @@ The service account used by the BigQuery connection chosen in Step 1 should have
- BigQuery Job User
- Vertex AI User


#### 3. Create BigQuery Dataset for ML Models

Create a dataset (e.g., `looker_bqml`) in the BigQuery connection's GCP project.
@@ -49,7 +48,7 @@ The application can be installed directly from [Looker Marketplace](https://mark

#### 5. Configure Application with User Attributes

The application uses three [Looker user attributes](https://cloud.google.com/looker/docs/admin-panel-users-user-attributes) to store its configuration settings. The following user attributes are required for the application to work properly. Each user attribute needs to be named exactly as listed below with a data type of `String`. The recommended setting for user access is `None`.
The application uses three [Looker user attributes](https://cloud.google.com/looker/docs/admin-panel-users-user-attributes) to store its configuration settings. The following user attributes are required for the application to work properly. Each user attribute needs to be named exactly as listed below with a data type of `String`. The recommended setting for user access is `View`.

Create the following user attributes and set their default values.

@@ -70,3 +69,34 @@ We recommend creating a new Looker role to easily manage user access to the appl
- Create a new Looker permission set named `ML Accelerator` containing all the permissions in the [default User permission set](https://cloud.google.com/looker/docs/admin-panel-users-roles#default_permission_sets) AND the `use_sql_runner` permission
- Create a new Looker role named `ML Accelerator` using the new model and permission set
- Assign the `ML Accelerator` role to Looker users and groups

#### 7. Setup AI-Generated Model Evaluation Summaries

After release 2.2, the application can use text-generating AI to summarize a model's evaluation and communicate its performance more clearly. This optional feature requires additional setup.

##### 7a: Add an External Connection from BigQuery to Vertex AI
In BigQuery, an [external connection](https://cloud.google.com/bigquery/docs/external-data-sources) is required to reach pre-trained models in Vertex AI. If one is not already set up, create it as follows. A tutorial is also available [here](https://cloud.google.com/bigquery/docs/generate-text-tutorial).
1. In the GCP project already in use for the application, verify that the [BigQuery Connection](https://console.cloud.google.com/apis/library/bigqueryconnection.googleapis.com) and [Vertex AI](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com) APIs are both enabled.
2. In BigQuery, click "add," then "Connections to external data sources."
3. Select "BigLake and remote function" and use the same location as the dataset already in use by the application.
4. The ID will be the name of your connection. Because the connection can be used to reach any number of pre-trained models in Vertex AI, choose something generic, such as "ext-vertex-ai".
5. Create the connection.
6. Go to the connection and copy the service account ID. To access remote functions from Vertex AI, the [BigQuery connection delegation service agent](https://cloud.google.com/iam/docs/service-agents#bigquery-connection-delegation-service-agent) (of the form `bqcx-[#]@gcp-sa-bigquery-condel.iam.gserviceaccount.com`) associated with this connection must have the "Vertex AI User" role, which can be added in IAM.
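
The role grant in step 6 can also be made from the command line. The following is a minimal sketch, assuming the gcloud CLI is installed. `build_grant_command` is a hypothetical helper (not part of the application) that checks the service account ID has the form described above and assembles a `gcloud projects add-iam-policy-binding` command granting `roles/aiplatform.user`, the role ID for Vertex AI User. The project ID and service account in the usage line are placeholders.

```python
import shlex

# Domain of the service account attached to a BigQuery external connection.
CONDEL_DOMAIN = "@gcp-sa-bigquery-condel.iam.gserviceaccount.com"


def build_grant_command(project_id: str, service_account_id: str) -> str:
    """Return a gcloud command granting Vertex AI User to the connection's service account."""
    # Reject anything that does not look like bqcx-...@gcp-sa-bigquery-condel...
    looks_right = service_account_id.startswith("bqcx-") and service_account_id.endswith(CONDEL_DOMAIN)
    if not looks_right:
        raise ValueError(f"unexpected service account ID: {service_account_id!r}")
    args = [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_id}",
        "--role=roles/aiplatform.user",
    ]
    # shlex.join quotes each argument so the result can be pasted into a shell.
    return shlex.join(args)


if __name__ == "__main__":
    # Placeholder values; substitute your own project and copied service account ID.
    print(build_grant_command("my-project", "bqcx-123456789-abcd" + CONDEL_DOMAIN))
```

The helper only builds the command string; it does not run gcloud or change any IAM policy.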

##### 7b: Create the Remote Text-Generation Model

In the BigQuery query editor, run the following statement. It uses the naming conventions suggested in the steps above and assumes the US multi-region. The `text-bison@002` model is suggested, but other LLMs that perform well at generating text could also be used. The `model_name` will later be added as a user attribute value; a suggested `model_name` is "mla-text-bison".
```
-- Replace project_id, dataset_id, and model_name with your own values.
CREATE OR REPLACE MODEL `project_id.dataset_id.model_name`
  REMOTE WITH CONNECTION `us.ext-vertex-ai`
  OPTIONS (endpoint = 'text-bison@002');
```
The statement may take a few minutes to run and does not return any results.
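
Once the model exists, one way to confirm that it responds is to send it a short prompt with `ML.GENERATE_TEXT`. The sketch below only builds the query string. `build_smoke_test_query` and the prompt text are illustrative assumptions, not part of the application, and the resulting query still has to be run in the BigQuery query editor or another client.

```python
def build_smoke_test_query(project_id: str, dataset_id: str, model_name: str) -> str:
    """Build a BigQuery query that sends one short prompt to the remote model."""
    for part in (project_id, dataset_id, model_name):
        # A backtick would break out of the quoted model path below.
        if not part or "`" in part:
            raise ValueError(f"invalid identifier: {part!r}")
    model_path = f"`{project_id}.{dataset_id}.{model_name}`"
    return (
        "SELECT ml_generate_text_llm_result\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL {model_path},\n"
        "  (SELECT 'Reply with the single word OK.' AS prompt),\n"
        # flatten_json_output returns the generated text as a plain string column.
        "  STRUCT(TRUE AS flatten_json_output)\n"
        ");"
    )


if __name__ == "__main__":
    # Placeholder names matching the CREATE MODEL statement above.
    print(build_smoke_test_query("project_id", "dataset_id", "model_name"))
```

If the query returns a row of text, the connection, role grant, and remote model are working together.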

##### 7c: Update the Relevant User Attribute

As in step 5 above, create the following user attribute and set its default value. Name it exactly as listed below.

| **Required User Attribute Name** | **Default Value Description** |
|-----------------------------------------------------------------|--------------------------------|
| marketplace_bqml_ext_ml_accelerator_generate_text_model_name | Name chosen in step 7b above |
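
The full attribute name follows a pattern that can be read off the names in this commit: the LookML project name, the application ID with hyphens replaced by underscores, then the scoped attribute name from the manifest. The sketch below illustrates that pattern; `full_user_attribute_name` is a hypothetical helper and the pattern is inferred from these names rather than taken from a specification.

```python
def full_user_attribute_name(project_name: str, application_id: str, scoped_name: str) -> str:
    """Compose the full Looker user attribute name for a scoped extension attribute."""
    # Hyphens in the application ID become underscores (ml-accelerator -> ml_accelerator).
    return "_".join([project_name, application_id.replace("-", "_"), scoped_name])


if __name__ == "__main__":
    # Scoped names as listed under scoped_user_attributes in manifest.lkml.
    for scoped in (
        "bigquery_connection_name",
        "bqml_model_dataset_name",
        "generate_text_model_name",
        "gcp_project",
    ):
        print(full_user_attribute_name("marketplace_bqml_ext", "ml-accelerator", scoped))
```

A name that does not match this pattern exactly will not be picked up by the application's scoped user attributes.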
3,551 changes: 1,022 additions & 2,529 deletions bundle.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion bundle.js.map

Large diffs are not rendered by default.

17 changes: 14 additions & 3 deletions manifest.lkml
@@ -3,13 +3,15 @@ project_name: "marketplace_bqml_ext"
application: ml-accelerator {
label: "Machine Learning Accelerator"
file: "bundle.js"
sri_hash: "2Se7ajYLg8GY60c+rlO+X9q3qFnDjV4C7uiYgLWn8iM+/ufBt/+IR+bVOyVc+kmp"
sri_hash: "spnMLxVFVZ71aj3VUpreZGBwpLD9+lzINATVhn47QEAgD5WGV85ID3R9wFiSakLq"
entitlements: {
core_api_methods: [
"all_lookml_models",
"all_users",
"create_query",
"run_query",
"lookml_model_explore",
"get_model",
"model_fieldname_suggestions",
"me",
"user_attribute_user_values",
@@ -25,6 +27,7 @@ application: ml-accelerator {
scoped_user_attributes: [
"bigquery_connection_name",
"bqml_model_dataset_name",
"generate_text_model_name",
"gcp_project",
]
}
@@ -35,10 +38,18 @@ constant: CONNECTION_NAME {
export: override_required
}

constant: GCP_PROJECT {
value: "{{_user_attributes['marketplace_bqml_ext_ml_accelerator_gcp_project']}}"
}

constant: BQML_MODEL_DATASET_NAME {
value: "{{_user_attributes['marketplace_bqml_ext_ml_accelerator_bqml_model_dataset_name']}}"
}

constant: GCP_PROJECT {
value: "{{_user_attributes['marketplace_bqml_ext_ml_accelerator_gcp_project']}}"
constant: GENERATE_TEXT_MODEL_NAME {
value: "{{_user_attributes['marketplace_bqml_ext_ml_accelerator_generate_text_model_name']}}"
}
# First create an LLM model in the same dataset as specified in constant "BQML_MODEL_DATASET_NAME", then provide model name here
# https://cloud.google.com/bigquery/docs/generate-text
# Also, modify the service account used for the connection to obtain a new permission: bigquery.connections.use
# This is available to users with the BigQuery Connection User role (https://cloud.google.com/iam/docs/understanding-roles#bigquery.connectionUser)
22 changes: 16 additions & 6 deletions marketplace.json
@@ -15,33 +15,43 @@
},
"user_attributes": {
"bigquery_connection_name": {
"label": "Machine Learning Accelerator Setting: BigQuery Connection Name",
"label": "BigQuery Connection Name",
"description": "The BigQuery connection the application will be allowed to use. Must be the same connection as chosen above.",
"type": "string",
"required": true,
"value_is_hidden": false,
"user_can_view": false,
"user_can_view": true,
"user_can_edit": false,
"default_value": "",
"value_constraint": "connection"
},
"gcp_project": {
"label": "Machine Learning Accelerator Setting: GCP Project ID",
"label": "GCP Project ID",
"description": "The GCP project ID for the BigQuery dataset where ML models will be saved.",
"type": "string",
"required": true,
"value_is_hidden": false,
"user_can_view": false,
"user_can_view": true,
"user_can_edit": false,
"default_value": ""
},
"bqml_model_dataset_name": {
"label": "Machine Learning Accelerator Setting: BQML Model Dataset Name",
"label": "BQML Model Dataset Name",
"description": "The dataset where ML models will be saved. Create a new dataset for BQML models (recommended) or choose the same dataset used for Looker PDTs.",
"type": "string",
"required": true,
"value_is_hidden": false,
"user_can_view": false,
"user_can_view": true,
"user_can_edit": false,
"default_value": ""
},
"generate_text_model_name": {
"label": "GenAI Text Model Name",
"description": "Name of an LLM model to generate text summaries (optional feature). Must be in same dataset as above. See https://github.com/looker-open-source/app-ml-accelerator/blob/main/README.md#7-setup-ai-generated-evaluation-summaries for setup instructions.",
"type": "string",
"required": false,
"value_is_hidden": false,
"user_can_view": true,
"user_can_edit": false,
"default_value": ""
}