[Bug] Defaulting lookback to 0 results in consistently incomplete batches
#10867
Labels: bug, microbatch, user docs
We should switch the default for `lookback` to `1`.

Currently the `lookback` value defaults to `0`. The problem is that a `lookback` of `0` means the dataset will never have a "complete" batch. When a microbatch model is run, the current time is used as the end of the latest batch. We do that because we're favoring freshness over ensuring "only complete batches". Unfortunately, combined with `lookback=0`, this means that no batch is ever "complete".

For example, consider a microbatch model with a `batch_size` of `day` that is run at noon every day (12:00:00). If our `lookback` is `0`, then when the model is run today it'll get data from today 00:00:00 to 12:00:00. When it is run tomorrow it'll get data from tomorrow 00:00:00 to 12:00:00, but it won't go back and get the rest of "today's" data (because the `lookback` is `0`).

Thus the default behavior should be a `lookback` of `1`. This ensures that batches are complete, because each run also reprocesses the previous batch, whose period has fully elapsed by then. The only caveats are regularly late-arriving data, for which one can set a larger `lookback` value, and one-off late-arriving data, for which one can use `--event-time-start` + `--event-time-end` to backfill the specific range.