Merge branch 'current' into dbeatty/update-incremental-strategies
matthewshaver authored Dec 13, 2023
2 parents 69ca5ba + 0e99c69 commit fa2ca0e
Showing 5 changed files with 481 additions and 73 deletions.
2 changes: 1 addition & 1 deletion website/docs/community/resources/oss-expectations.md
@@ -94,7 +94,7 @@ PRs are your surest way to make the change you want to see in dbt / packages / d

**Every PR should be associated with an issue.** Why? Before you spend a lot of time working on a contribution, we want to make sure that your proposal will be accepted. You should open an issue first, describing your desired outcome and outlining your planned change. If you've found an older issue that's already open, comment on it with an outline for your planned implementation. Exception to this rule: If you're just opening a PR for a cosmetic fix, such as a typo in documentation, an issue isn't needed.

**PRs should include robust testing.** With the goal to substantially cut down the number and impact of regressions, we are taking a more meticulous approach to the tests that we require to merge a pull request. We recognize that robust testing can often take significantly more effort than the main portion of the code. Thank you for your help in contributing to this goal!
**PRs should include robust testing.** Comprehensive testing within pull requests is crucial for the stability of our project. By prioritizing robust testing, we ensure the reliability of our codebase, minimize unforeseen issues, and safeguard against potential regressions. We understand that creating thorough tests often requires significant effort, and your dedication to this process greatly contributes to the project's overall reliability. Thank you for your commitment to maintaining the integrity of our codebase!

**Our goal is to review most new PRs within 7 days.** The first review will include some high-level comments about the implementation, including whether we think it's suitable to merge. Depending on the scope of the PR, the first review may include line-level code suggestions, or we may delay specific code review until the PR is more finalized or until we have more time.

185 changes: 185 additions & 0 deletions website/docs/reference/resource-configs/bigquery-configs.md
@@ -718,3 +718,188 @@ Views with this configuration will be able to select from objects in `project_1.

The `grant_access_to` config is not thread-safe when multiple views need to be authorized for the same dataset. The initial `dbt run` operation after a new `grant_access_to` config is added should therefore be executed in a single thread. Subsequent runs using the same configuration will not attempt to re-apply existing access grants, and can make use of multiple threads.

<VersionBlock firstVersion="1.7">

## Materialized views

The BigQuery adapter supports [materialized views](https://cloud.google.com/bigquery/docs/materialized-views-intro)
with the following configuration parameters:

| Parameter | Type | Required | Default | Change Monitoring Support |
|-------------------------------------------------------------|------------------------|----------|---------|---------------------------|
| `on_configuration_change` | `<string>` | no | `apply` | n/a |
| [`cluster_by`](#clustering-clause) | `[<string>]` | no | `none` | drop/create |
| [`partition_by`](#partition-clause) | `{<dictionary>}` | no | `none` | drop/create |
| [`enable_refresh`](#auto-refresh) | `<boolean>` | no | `true` | alter |
| [`refresh_interval_minutes`](#auto-refresh) | `<float>` | no | `30` | alter |
| [`max_staleness`](#auto-refresh) (in Preview) | `<interval>` | no | `none` | alter |
| [`description`](/reference/resource-properties/description) | `<string>` | no | `none` | alter |
| [`labels`](#specifying-labels) | `{<string>: <string>}` | no | `none` | alter |
| [`hours_to_expiration`](#controlling-table-expiration) | `<integer>` | no | `none` | alter |
| [`kms_key_name`](#using-kms-encryption) | `<string>` | no | `none` | alter |

<Tabs
groupId="config-languages"
defaultValue="project-yaml"
values={[
{ label: 'Project file', value: 'project-yaml', },
{ label: 'Property file', value: 'property-yaml', },
{ label: 'Config block', value: 'config', },
]
}>


<TabItem value="project-yaml">

<File name='dbt_project.yml'>

```yaml
models:
  [<resource-path>](/reference/resource-configs/resource-path):
    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view
    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
    [+](/reference/resource-configs/plus-prefix)[cluster_by](#clustering-clause): <field-name> | [<field-name>]
    [+](/reference/resource-configs/plus-prefix)[partition_by](#partition-clause):
      field: <field-name>
      data_type: timestamp | date | datetime | int64
      # only if `data_type` is not 'int64'
      granularity: hour | day | month | year
      # only if `data_type` is 'int64'
      range:
        start: <integer>
        end: <integer>
        interval: <integer>
    [+](/reference/resource-configs/plus-prefix)[enable_refresh](#auto-refresh): true | false
    [+](/reference/resource-configs/plus-prefix)[refresh_interval_minutes](#auto-refresh): <float>
    [+](/reference/resource-configs/plus-prefix)[max_staleness](#auto-refresh): <interval>
    [+](/reference/resource-configs/plus-prefix)[description](/reference/resource-properties/description): <string>
    [+](/reference/resource-configs/plus-prefix)[labels](#specifying-labels): {<label-name>: <label-value>}
    [+](/reference/resource-configs/plus-prefix)[hours_to_expiration](#controlling-table-expiration): <integer>
    [+](/reference/resource-configs/plus-prefix)[kms_key_name](#using-kms-encryption): <path-to-key>
```

</File>

</TabItem>


<TabItem value="property-yaml">

<File name='models/properties.yml'>

```yaml
version: 2

models:
  - name: [<model-name>]
    config:
      [materialized](/reference/resource-configs/materialized): materialized_view
      on_configuration_change: apply | continue | fail
      [cluster_by](#clustering-clause): <field-name> | [<field-name>]
      [partition_by](#partition-clause):
        field: <field-name>
        data_type: timestamp | date | datetime | int64
        # only if `data_type` is not 'int64'
        granularity: hour | day | month | year
        # only if `data_type` is 'int64'
        range:
          start: <integer>
          end: <integer>
          interval: <integer>
      [enable_refresh](#auto-refresh): true | false
      [refresh_interval_minutes](#auto-refresh): <float>
      [max_staleness](#auto-refresh): <interval>
      [description](/reference/resource-properties/description): <string>
      [labels](#specifying-labels): {<label-name>: <label-value>}
      [hours_to_expiration](#controlling-table-expiration): <integer>
      [kms_key_name](#using-kms-encryption): <path-to-key>
```

</File>

</TabItem>


<TabItem value="config">

<File name='models/<model_name>.sql'>

```jinja
{{ config(
    [materialized](/reference/resource-configs/materialized)='materialized_view',
    on_configuration_change="apply" | "continue" | "fail",
    [cluster_by](#clustering-clause)="<field-name>" | ["<field-name>"],
    [partition_by](#partition-clause)={
        "field": "<field-name>",
        "data_type": "timestamp" | "date" | "datetime" | "int64",
        # only if `data_type` is not 'int64'
        "granularity": "hour" | "day" | "month" | "year",
        # only if `data_type` is 'int64'
        "range": {
            "start": <integer>,
            "end": <integer>,
            "interval": <integer>,
        }
    },
    # auto-refresh options
    [enable_refresh](#auto-refresh)=true | false,
    [refresh_interval_minutes](#auto-refresh)=<float>,
    [max_staleness](#auto-refresh)="<interval>",
    # additional options
    [description](/reference/resource-properties/description)="<description>",
    [labels](#specifying-labels)={
        "<label-name>": "<label-value>",
    },
    [hours_to_expiration](#controlling-table-expiration)=<integer>,
    [kms_key_name](#using-kms-encryption)="<path_to_key>",
) }}
```

</File>

</TabItem>

</Tabs>

Many of these parameters correspond to their table counterparts and have been linked above.
The set of parameters unique to materialized views covers [auto-refresh functionality](#auto-refresh).

Find more information about these parameters in the BigQuery docs:
- [CREATE MATERIALIZED VIEW statement](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_materialized_view_statement)
- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list)
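
As an illustration, the placeholder syntax above might be filled in as follows in a property file. This is a minimal sketch, not a recommended setup: the model name `daily_order_totals`, the column `customer_id`, and the label `team: analytics` are hypothetical, and only parameters listed in the table above are used.

```yaml
version: 2

models:
  - name: daily_order_totals      # hypothetical model name
    config:
      materialized: materialized_view
      # apply monitored changes in place where the table above says "alter"
      on_configuration_change: apply
      cluster_by: [customer_id]   # hypothetical column; changes here trigger drop/create
      labels:
        team: analytics           # hypothetical label
```

Per the table above, changing `labels` later would be handled with an `ALTER`, while changing `cluster_by` would require the materialized view to be dropped and re-created.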

### Auto-refresh

| Parameter | Type | Required | Default | Change Monitoring Support |
|------------------------------|--------------|----------|---------|---------------------------|
| `enable_refresh` | `<boolean>` | no | `true` | alter |
| `refresh_interval_minutes` | `<float>` | no | `30` | alter |
| `max_staleness` (in Preview) | `<interval>` | no | `none` | alter |

BigQuery supports [automatic refresh](https://cloud.google.com/bigquery/docs/materialized-views-manage#automatic_refresh) configuration for materialized views.
By default, a materialized view automatically refreshes within 5 minutes of a change to the base table, but no more often than once every 30 minutes.
BigQuery only officially supports configuring the frequency (the "once every 30 minutes" cap, set with `refresh_interval_minutes`);
however, a feature in preview allows configuring the staleness (the "5 minutes" window, set with `max_staleness`).
dbt monitors these parameters for changes and applies them using an `ALTER` statement.

Find more information about these parameters in the BigQuery docs:
- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list)
- [max_staleness](https://cloud.google.com/bigquery/docs/materialized-views-create#max_staleness)
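
A minimal sketch of the auto-refresh settings in `dbt_project.yml`, assuming a hypothetical project named `my_project` with a hypothetical `marts` folder. It raises the refresh cap from the default 30 minutes to 60. `max_staleness` is left out because it is in preview and its value format is not specified here.

```yaml
models:
  my_project:        # hypothetical project name
    marts:           # hypothetical folder of materialized view models
      +materialized: materialized_view
      +on_configuration_change: apply
      +enable_refresh: true
      # refresh no more often than once every 60 minutes (default is 30)
      +refresh_interval_minutes: 60
```

Because both refresh parameters are marked `alter` in the table above, editing either value later should not require a rebuild of the materialized view.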

### Limitations

As with most data platforms, there are limitations associated with materialized views. Some worth noting include:

- Materialized view SQL has a [limited feature set](https://cloud.google.com/bigquery/docs/materialized-views-create#supported-mvs).
- Materialized view SQL cannot be updated; the materialized view must go through a `--full-refresh` (DROP/CREATE).
- The `partition_by` clause on a materialized view must match that of the underlying base table.
- While materialized views can have descriptions, materialized view *columns* cannot.
- Recreating/dropping the base table requires recreating/dropping the materialized view.

Find more information about materialized view limitations in Google's BigQuery [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro#limitations).

</VersionBlock>
103 changes: 92 additions & 11 deletions website/docs/reference/resource-configs/postgres-configs.md
@@ -108,21 +108,100 @@ models:

## Materialized views

The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html).
Indexes are the only configuration that is specific to `dbt-postgres`.
The remaining configuration follows the general [materialized view](/docs/build/materializations#materialized-view) configuration.
There are also some limitations that we hope to address in the next version.
The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html)
with the following configuration parameters:

### Monitored configuration changes
| Parameter | Type | Required | Default | Change Monitoring Support |
|---------------------------|--------------------|----------|---------|---------------------------|
| `on_configuration_change` | `<string>` | no | `apply` | n/a |
| [`indexes`](#indexes) | `[{<dictionary>}]` | no | `none` | alter |

The settings below are monitored for changes applicable to `on_configuration_change`.
<Tabs
groupId="config-languages"
defaultValue="project-yaml"
values={[
{ label: 'Project file', value: 'project-yaml', },
{ label: 'Property file', value: 'property-yaml', },
{ label: 'Config block', value: 'config', },
]
}>

#### Indexes

Index changes (`CREATE`, `DROP`) can be applied without the need to rebuild the materialized view.
This differs from a table model, where the table needs to be dropped and re-created to update the indexes.
If the `indexes` portion of the `config` block is updated, the changes will be detected and applied
directly to the materialized view in place.
<TabItem value="project-yaml">

<File name='dbt_project.yml'>

```yaml
models:
  [<resource-path>](/reference/resource-configs/resource-path):
    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view
    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
    [+](/reference/resource-configs/plus-prefix)[indexes](#indexes):
      - columns: [<column-name>]
        unique: true | false
        type: hash | btree
```

</File>

</TabItem>


<TabItem value="property-yaml">

<File name='models/properties.yml'>

```yaml
version: 2

models:
  - name: [<model-name>]
    config:
      [materialized](/reference/resource-configs/materialized): materialized_view
      on_configuration_change: apply | continue | fail
      [indexes](#indexes):
        - columns: [<column-name>]
          unique: true | false
          type: hash | btree
```

</File>

</TabItem>


<TabItem value="config">

<File name='models/<model_name>.sql'>

```jinja
{{ config(
    [materialized](/reference/resource-configs/materialized)="materialized_view",
    on_configuration_change="apply" | "continue" | "fail",
    [indexes](#indexes)=[
        {
            "columns": ["<column-name>"],
            "unique": true | false,
            "type": "hash" | "btree",
        }
    ]
) }}
```

</File>

</TabItem>

</Tabs>

The [`indexes`](#indexes) parameter corresponds to that of a table, as explained above.
It's worth noting that, unlike tables, dbt monitors this parameter for changes and applies the changes without dropping the materialized view.
This happens via a `DROP/CREATE` of the indexes, which can be thought of as an `ALTER` of the materialized view.

Find more information about materialized view parameters in the Postgres docs:
- [CREATE MATERIALIZED VIEW](https://www.postgresql.org/docs/current/sql-creatematerializedview.html)
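
To make the `indexes` parameter concrete, here is a minimal property-file sketch. The model name `orders_summary` and the columns `customer_id` and `order_date` are hypothetical.

```yaml
version: 2

models:
  - name: orders_summary          # hypothetical model name
    config:
      materialized: materialized_view
      on_configuration_change: apply
      indexes:
        # btree index on a hypothetical lookup column
        - columns: [customer_id]
          unique: false
          type: btree
        # hash index on a hypothetical date column
        - columns: [order_date]
          type: hash
```

With `on_configuration_change: apply`, adding or removing an entry under `indexes` should lead dbt to create or drop only that index, leaving the materialized view itself in place, as described above.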

<VersionBlock firstVersion="1.6" lastVersion="1.6">

### Limitations

@@ -138,3 +217,5 @@ If the user changes the model's config to `materialized="materialized_view"`, th
The solution is to execute `DROP TABLE my_model` on the data warehouse before trying the model again.

</VersionBlock>

</VersionBlock>