Skip to content

Commit

Permalink
add links to microbatch configs (#6616)
Browse files Browse the repository at this point in the history
  • Loading branch information
mirnawong1 authored Dec 9, 2024
2 parents f34a58d + 42c4347 commit 4ebafab
Show file tree
Hide file tree
Showing 5 changed files with 9 additions and 10 deletions.
11 changes: 5 additions & 6 deletions website/docs/docs/build/incremental-microbatch.md
Original file line number Diff line number Diff line change
Expand Up @@ -184,11 +184,10 @@ Several configurations are relevant to microbatch models, and some are required:
| Config | Description | Default | Type | Required |
|----------|---------------|---------|------|---------|
| [`event_time`](/reference/resource-configs/event-time) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A | Column | Required |
| `begin` | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A | Date | Required |
| `batch_size` | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | String | Required |
| `lookback` | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` | Integer | Optional |
| `concurrent_batches` | An override for whether batches run concurrently (at the same time) or sequentially (one after the other). | `None` | Boolean | Optional |

| [`begin`](/reference/resource-configs/begin) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A | Date | Required |
| [`batch_size`](/reference/resource-configs/batch-size) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | String | Required |
| [`lookback`](/reference/resource-configs/lookback) | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` | Integer | Optional |
| [`concurrent_batches`](/reference/resource-properties/concurrent_batches) | An override for whether batches run concurrently (at the same time) or sequentially (one after the other). | `None` | Boolean | Optional |

<Lightbox src="/img/docs/building-a-dbt-project/microbatch/event_time.png" title="The event_time column configures the real-world time of this record"/>

Expand Down Expand Up @@ -290,7 +289,7 @@ The microbatch strategy offers the benefit of updating a model in smaller, more

Parallel batch execution means that multiple batches are processed at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models.

dbt automatically detects whether a batch can be run in parallel in most cases, which means you don’t need to configure this setting. However, the `concurrent_batches` config is available as an override (not a gate), allowing you to specify whether batches should or shouldn’t be run in parallel in specific cases.
dbt automatically detects whether a batch can be run in parallel in most cases, which means you don’t need to configure this setting. However, the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches) is available as an override (not a gate), allowing you to specify whether batches should or shouldn’t be run in parallel in specific cases.

For example, if you have a microbatch model with 12 batches, you can execute those batches to run in parallel. Specifically they'll run in parallel limited by the number of [available threads](/docs/running-a-dbt-project/using-threads).

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ Starting in Core 1.9, you can use the new [microbatch strategy](/docs/build/incr
- Simplified query design: Write your model query for a single batch of data. dbt will use your `event_time``lookback`, and `batch_size` configurations to automatically generate the necessary filters for you, making the process more streamlined and reducing the need for you to manage these details.
- Independent batch processing: dbt automatically breaks down the data to load into smaller batches based on the specified `batch_size` and processes each batch independently, improving efficiency and reducing the risk of query timeouts. If some of your batches fail, you can use `dbt retry` to load only the failed batches.
- Targeted reprocessing: To load a *specific* batch or batches, you can use the CLI arguments `--event-time-start` and `--event-time-end`.
- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. dbt intelligently auto-detects if your batches can run in parallel, while also allowing you to manually override parallel execution with the `concurrent_batches` config.
- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. dbt intelligently auto-detects if your batches can run in parallel, while also allowing you to manually override parallel execution with the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches).


Currently microbatch is supported on these adapters with more to come:
Expand Down
2 changes: 1 addition & 1 deletion website/docs/reference/resource-configs/batch_size.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ description: "dbt uses `batch_size` to determine how large batches are when runn
datatype: hour | day | month | year
---

Available in dbt Cloud Versionless and dbt Core v1.9 and higher.
Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher.

## Definition

Expand Down
2 changes: 1 addition & 1 deletion website/docs/reference/resource-configs/begin.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ description: "dbt uses `begin` to determine when a microbatch incremental model
datatype: string
---

Available in dbt Cloud Versionless and dbt Core v1.9 and higher.
Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher.

## Definition

Expand Down
2 changes: 1 addition & 1 deletion website/docs/reference/resource-configs/lookback.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ description: "dbt uses `lookback` to detrmine how many 'batches' of `batch_size`
datatype: int
---

Available in dbt Cloud Versionless and dbt Core v1.9 and higher.
Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher.

## Definition

Expand Down

0 comments on commit 4ebafab

Please sign in to comment.