Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix documentation and especially configuration one
Showing 6 changed files with 134 additions and 122 deletions.
2 changes: 1 addition & 1 deletion
docs/audit-logs-vs-information-schema.md → ...ation/audit-logs-vs-information-schema.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
--- | ||
-sidebar_position: 5 | ||
+sidebar_position: 4.1 | ||
slug: /audit-logs-vs-information-schema | ||
--- | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
--- | ||
sidebar_position: 4.2 | ||
slug: /configuration/audit-logs | ||
--- | ||
|
||
# GCP BigQuery audit logs | ||
|
||
In this mode, the package monitors all the jobs written to a GCP BigQuery audit logs table instead of using the `INFORMATION_SCHEMA.JOBS` one. | ||
|
||
:::tip | ||
|
||
To get the best out of this mode, you should enable the `should_combine_audit_logs_and_information_schema` setting to combine both sources. | ||
More details on [the related page](/audit-logs-vs-information-schema). | ||
|
||
::: | ||
|
||
To enable the "cloud audit logs mode", you need to explicitly define the following mandatory settings in the `dbt_project.yml` file: | ||
|
||
```yml | ||
vars: | ||
  enable_gcp_bigquery_audit_logs: true | ||
  gcp_bigquery_audit_logs_storage_project: 'my-gcp-project' | ||
  gcp_bigquery_audit_logs_dataset: 'my_dataset' | ||
  gcp_bigquery_audit_logs_table: 'my_table' | ||
  # should_combine_audit_logs_and_information_schema: true # Optional, defaults to false, but you might want to combine both sources | ||
``` | ||
|
||
[You can use environment variables as well](/configuration/package-settings). | ||
|
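For illustration, here is a sketch of the audit logs configuration with the optional combination setting switched on, as the tip in the new page recommends. The project, dataset, and table names are the same placeholders used in the diff, not real resources.

```yml
vars:
  enable_gcp_bigquery_audit_logs: true
  gcp_bigquery_audit_logs_storage_project: 'my-gcp-project'  # placeholder project ID
  gcp_bigquery_audit_logs_dataset: 'my_dataset'              # placeholder dataset
  gcp_bigquery_audit_logs_table: 'my_table'                  # placeholder table
  # Optional (defaults to false): combine audit logs with INFORMATION_SCHEMA
  should_combine_audit_logs_and_information_schema: true
```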
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
--- | ||
sidebar_position: 4 | ||
slug: /configuration | ||
--- | ||
|
||
# Configuration | ||
|
||
Settings have default values that can be overridden using: | ||
|
||
- dbt project variables (and therefore also by CLI variable override) | ||
- environment variables | ||
|
||
Please note that the default region is `us` and, at the time of writing, there is no way to query tables across regions. You can, however, run the project in each region you want to monitor and [then replicate the tables to a central region](https://cloud.google.com/bigquery/docs/data-replication) to build an aggregated view. | ||
|
||
To find the region of a job, open `Job history` (bottom panel) in the BigQuery UI, click on a job, and look at the `Location` field. You can also find the region of a dataset or table by opening its details panel and checking the `Data location` field. | ||
|
||
:::tip | ||
|
||
To get the best out of this package, you should probably configure all data sources and settings: | ||
- Choose the [Baseline mode](#modes) that fits your GCP setup | ||
- [Add metadata to queries](#add-metadata-to-queries-recommended-but-optional) | ||
- [GCP BigQuery Audit logs](/configuration/audit-logs) | ||
- [GCP Billing export](/configuration/gcp-billing) | ||
- [Settings](/configuration/package-settings) (especially the pricing ones) | ||
|
||
::: | ||
|
||
|
||
## Modes | ||
|
||
### Region mode (default) | ||
|
||
In this mode, the package will monitor all the GCP projects in the region specified in the `dbt_project.yml` file. | ||
|
||
```yml | ||
vars: | ||
  # dbt bigquery monitoring vars | ||
  bq_region: 'us' | ||
``` | ||
**Requirements** | ||
- The execution project needs to be the same as the storage project; otherwise you'll need to use the "project mode". | ||
- If you have multiple GCP projects in the same region, you should use the "project mode" (with the `input_gcp_projects` setting to specify them); otherwise you will run into errors such as: `Within a standard SQL view, references to tables/views require explicit project IDs unless the entity is created in the same project that is issuing the query, but these references are not project-qualified: "region-us.INFORMATION_SCHEMA.JOBS"`. | ||
|
||
### Project mode | ||
|
||
To enable the "project mode", you need to explicitly define one mandatory setting in the `dbt_project.yml` file: | ||
|
||
```yml | ||
vars: | ||
  # dbt bigquery monitoring vars | ||
  input_gcp_projects: [ 'my-gcp-project', 'my-gcp-project-2' ] | ||
``` | ||
|
||
## Add metadata to queries (Recommended but optional) | ||
|
||
To enhance your query metadata with dbt model information, the package provides a dedicated macro that leverages "dbt query comments" (the header set at the top of each query). | ||
To configure the query comments, add the following config to `dbt_project.yml`: | ||
|
||
```yaml | ||
query-comment: | ||
  comment: '{{ dbt_bigquery_monitoring.get_query_comment(node) }}' | ||
  job-label: True # Use query comment JSON as job labels | ||
``` | ||
|
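As a sketch of the per-region approach described in the configuration page, a second run that monitors another region could override `bq_region`. The value `'eu'` is an assumption chosen for illustration. The `--vars` flag shown in the comment is dbt's standard CLI variable override.

```yml
# dbt_project.yml for the run that monitors a second region
vars:
  # dbt bigquery monitoring vars
  bq_region: 'eu'  # assumption: illustrative region; the package default is 'us'

# The same value can be passed for a single run without editing the file:
#   dbt run --vars '{"bq_region": "eu"}'
```

Each regional run writes its own tables, which can then be replicated to a central region as the page suggests.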
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
--- | ||
sidebar_position: 4.3 | ||
slug: /configuration/gcp-billing | ||
--- | ||
|
||
# GCP Billing export | ||
GCP Billing export is a feature that allows you to export your billing data to BigQuery. It allows the package to track the real cost of your queries and storage over time. | ||
|
||
To enable it on the GCP side, follow the [official documentation](https://cloud.google.com/billing/docs/how-to/export-data-bigquery) to set up the export. | ||
|
||
Then, to enable GCP billing export monitoring in the package, define the following settings in the `dbt_project.yml` file: | ||
|
||
```yml | ||
vars: | ||
  enable_gcp_billing_export: true | ||
  gcp_billing_export_storage_project: 'my-gcp-project' | ||
  gcp_billing_export_dataset: 'my_dataset' | ||
  gcp_billing_export_table: 'my_table' | ||
``` |