Full Changelog: https://github.com/dbt-labs/dbt-utils/compare/1.2.0...main
- Add option to ignore columns in equality test by @brunocostalopes in dbt-labs#765
- The `equality` test now accepts an additional argument, `precision`, to aid in comparing floating point numbers by @rlh1994 in dbt-labs#765
- `deduplicate` macro for Databricks now uses the `QUALIFY` clause, which fixes `NULL` column issues from the default natural join logic by @graciegoheen in dbt-labs#786
- Use QUALIFY clause in `deduplicate` macro for Redshift by @yauhen-sobaleu in dbt-labs#811
- get redshift external tables by @brendan-cook-87 in dbt-labs#753
- Equality test will now raise an error when the second model has fewer columns than the first by @rlh1994 in dbt-labs#765
- Update documentation for `get_column_values()` to specify that the `order_by` argument must be expressed as an aggregate function by @bakerbryce in dbt-labs#872
- Set the correct language identifier in code blocks within the documentation by @yamotech in dbt-labs#876
- Fix typo of `not_null_proportion` in README.md by @PChambino in dbt-labs#853
- Fix failing example for `dbt_utils.deduplicate()` in README.md by @pruoff in dbt-labs#856
- Link to Haversine Distance article on Wikipedia by @dbeatty10 in dbt-labs#889
- GitHub Action to close issues as stale as-needed by @dbeatty10 in dbt-labs#813
- GitHub Action to add/remove triage labels as-needed by @dbeatty10 in dbt-labs#812
- Instructions for the release process by @dbeatty10 in dbt-labs#821
- Update dev-requirements for new pip syntax by @gwenwindflower in dbt-labs#870
- Disable auto-generation of table of contents (TOC) by @dbeatty10 in dbt-labs#887
- Update tests: -> data_tests: by @graciegoheen in dbt-labs#893
- @yauhen-sobaleu made their first contribution in dbt-labs#811
- @brendan-cook-87 made their first contribution in dbt-labs#753
- @gwenwindflower made their first contribution in dbt-labs#870
- @brunocostalopes made their first contribution in dbt-labs#765
- @rlh1994 made their first contribution in dbt-labs#765
- @yamotech made their first contribution in dbt-labs#876
- @PChambino made their first contribution in dbt-labs#853
- @pruoff made their first contribution in dbt-labs#856
Full Changelog: https://github.com/dbt-labs/dbt-utils/compare/1.1.1...1.2.0
- Improve the performance of the `at_least_one` test by pruning early. This is especially helpful when running against external tables. By @joshuahuntley in dbt-labs#775
- Fix legacy links in README by @dbeatty10 in dbt-labs#796
- Safe subtract by @dchess in dbt-labs#748
- Add Databricks handler for get_table_types_sql.sql by @Harmuth94 in dbt-labs#769
- Typo fix by @AndrewLane in dbt-labs#738
- Removed remark about Dbt 0.9.6 as utils 1.0.0 is now the default by @ilanbenb in dbt-labs#740
- Fix link in README by @b-per in dbt-labs#743
- Update README.md about use `where` with `accepted_range` tests by @eitsupi in dbt-labs#739
- doc: clarify that `union_relations()` uses `union all` by @owenprough-sift in dbt-labs#760
- Automatically generate TOC for utils readme by @joellabes in dbt-labs#486
- Use CircleCI contexts for environment variables by @dbeatty10 in dbt-labs#754
- fix: #755 - add whitespace control to generate_surrogate_key macro by @akv-akv in dbt-labs#756
- Fix CI by @joellabes in dbt-labs#771
- @AndrewLane made their first contribution in dbt-labs#738
- @ilanbenb made their first contribution in dbt-labs#740
- @eitsupi made their first contribution in dbt-labs#739
- @owenprough-sift made their first contribution in dbt-labs#760
- @akv-akv made their first contribution in dbt-labs#756
- @dchess made their first contribution in dbt-labs#748
- @Harmuth94 made their first contribution in dbt-labs#769
The full migration guide is at https://docs.getdbt.com/guides/migration/versions/upgrading-to-dbt-utils-v1.0
- New macro `get_single_value` (#696)
- New macro safe_divide(): returns null when the denominator is 0, instead of throwing a divide-by-zero error.
- Add `not_empty_string` generic test that asserts column values are not an empty string. (#632, #634)
- Implemented an optional `group_by_columns` argument across many of the generic testing macros to test for properties that pertain only to the group level or can be more rigorously tested at the group level. Available in the `recency`, `at_least_one`, `equal_row_count`, `fewer_rows_than`, `not_constant`, `not_null_proportion`, and `sequential` tests (#633)
- With the addition of an on-by-default quote_identifiers flag in the star() macro, you can now disable quoting if necessary. (#706)
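As a rough illustration of two items from this release, `safe_divide()` and the `quote_identifiers` flag on `star()` might be combined in a model as sketched below. The `stg_orders` model and its `revenue` and `order_count` columns are hypothetical, and the argument names are assumptions to check against the README of the installed version.

```sql
-- models/order_metrics.sql (hypothetical model; stg_orders, revenue and
-- order_count are illustrative names, not part of dbt-utils)
select
    -- quote_identifiers is on by default; passing false emits unquoted column names
    {{ dbt_utils.star(from=ref('stg_orders'), quote_identifiers=false) }},
    -- yields null instead of a divide-by-zero error when order_count is 0
    {{ dbt_utils.safe_divide('revenue', 'order_count') }} as revenue_per_order
from {{ ref('stg_orders') }}
```

Both macros take column names or SQL expressions as strings, so more complex expressions can be passed the same way.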
- `union()` now includes/excludes columns case-insensitively
- The `expression_is_true` test doesn't output * unless storing failures, a cost improvement for BigQuery (#683, #686)
- Updated the `slugify` macro to prepend "_" to column names beginning with a number since most databases do not allow names to begin with numbers.
- Remove deprecated table argument from `unpivot` (#671)
- Delete the deprecated identifier macro (#672)
- Handle deprecations in deduplicate macro (#673)
- Fully remove varargs usage in `surrogate_key` and `safe_add` (#674)
- Remove obsolete condition argument from `expression_is_true` (#699)
- Explicitly state the namespace for cross-db macros so that the dispatch logic works correctly, by restoring the `dbt.` prefix for all migrated cross-db macros (#701)
- [@CR-Lough](https://github.com/CR-Lough) (#706) (#696)
- @fivetran-catfritz
- @crowemi
- @SimonQuvang (#701)
- @christineberger (#624)
- @epapineau (#634)
- @courentin (#651)
- @zachoj10 (#692)
- @miles170
- @emilyriederer
- Stop showing cross-db deprecation warnings for macros that have already been migrated (#725)
Rolled back due to accidental incompatibilities
- Remove unnecessary generated new lines in `star` by @courentin in dbt-labs#651
- fix: Actually suppress `union_relations` source_column_name when passing `none` by @kmclaugh in dbt-labs#661
- Make `mutually_exclusive_ranges` test deterministic by adding `upper_bound_column` to `order by` clause by @sfc-gh-ancoleman in dbt-labs#660
- update union_relations to use core string literal macro by @dave-connors-3 in dbt-labs#665
- Add where clause example to get_column_values documentation by @arsenkhy in dbt-labs#623
- @courentin made their first contribution in dbt-labs#651
- @kmclaugh made their first contribution in dbt-labs#661
- @sfc-gh-ancoleman made their first contribution in dbt-labs#660
- @dave-connors-3 made their first contribution in dbt-labs#665
- @arsenkhy made their first contribution in dbt-labs#623
- Remove cross-db dbt_utils references by @clausherther in #650
- 🚨 (Almost all) cross-db macros are now implemented in dbt Core instead of dbt-utils. A backwards-compatibility layer remains for now and will be removed in dbt utils 1.0 later this year. Completed by @dbeatty10 and @jtcohen6 in dbt-labs#597, dbt-labs#586 and dbt-labs#615
  - See #487 for further discussion on the backstory
  - If you are a package maintainer with a dependency on these macros, prepare for their removal by switching to `{{ dbt.some_macro() }}`. Refer to #package-ecosystem in the Community Slack for further assistance
- Feature: Add option to remove the `source_column_name` on the `union_relations` macro by @christineberger in dbt-labs#624
- Use adapter.quote() instead of hardcoded BQ quoting for get_table_types_sql by @alla-bongard in dbt-labs#636
- standardize yml indentation under the 'models:' line on the README by @leoebfolsom in dbt-labs#613
- Use MADR 3.0.0 for formatting decision records by @dbeatty10 in dbt-labs#614
- Docs cleanup by @dbeatty10 in dbt-labs#620
- Add not_accepted_values to README ToC by @david-beallor in dbt-labs#646
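For package maintainers, the move of cross-db macros into dbt Core noted in this release might look like the sketch below. It uses `dateadd` as the example macro and assumes a dbt Core version that ships the cross-db macros; the `stg_orders` model and its `order_id` and `order_date` columns are hypothetical.

```sql
{# Before: cross-db macro resolved through the dbt-utils namespace.
   {{ dbt_utils.dateadd(datepart='day', interval=7, from_date_or_timestamp='order_date') }}
#}

-- After: the same macro called from the dbt namespace provided by dbt Core
select
    order_id,
    {{ dbt.dateadd(datepart='day', interval=7, from_date_or_timestamp='order_date') }} as due_date
from {{ ref('stg_orders') }}
```

The old call is kept inside a Jinja comment rather than a SQL comment because dbt still renders Jinja expressions that appear after `--`.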
- @leoebfolsom made their first contribution in dbt-labs#613
- @christineberger made their first contribution in dbt-labs#624
- @alla-bongard made their first contribution in dbt-labs#636
- @david-beallor made their first contribution in dbt-labs#646
- New macros `array_append` and `array_construct` (#595)
- Use `*` in `star` macro if no columns (for SQLFluff) (#605, #561)
- Only raise error within `union_relations` for `build`/`run` sub-commands (#606, #607)
- Add slugify to list of Jinja Helpers (#602)
- @swanjson (#561)
- @dataders (#561)
- @epapineau (#583)
- @graciegoheen (#595)
- @jeremyyeo (#606)
The call signature of `deduplicate` has changed. The previous call signature is marked as deprecated and will be removed in the next minor version.
- The `group_by` argument is now deprecated and replaced by `partition_by`.
- The `order_by` argument is now required.
- The `relation_alias` argument has been removed as the macro now supports `relation` as a string directly. If you were using `relation_alias` to point to a CTE previously then you can now pass the alias directly to `relation`.
Before:
{% macro deduplicate(relation, group_by, order_by=none, relation_alias=none) -%}
...
{% endmacro %}
After:
{% macro deduplicate(relation, partition_by, order_by) -%}
...
{% endmacro %}
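A call using the new signature, with a CTE alias passed directly as `relation`, might look like the following sketch. The source, CTE, and column names (`my_source`, `my_table`, `events`, `user_id`, `event_time`) are hypothetical.

```sql
-- Keep one row per user_id, preferring the most recent event_time
with events as (
    select *
    from {{ source('my_source', 'my_table') }}
)

{{ dbt_utils.deduplicate(
    relation='events',
    partition_by='user_id',
    order_by='event_time desc'
) }}
```

Passing `relation=source('my_source', 'my_table')` instead of the string alias should work the same way when no CTE is needed.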
- Add an optional `where` clause parameter to `get_column_values()` to filter values returned (#511, #583)
- Add `where` parameter to `union_relations` macro (#554)
- Add Postgres specific implementation of `deduplicate()` (#548)
- Add Snowflake specific implementation of `deduplicate()` (#543, #548)
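The `where` parameter on `get_column_values()` could be used roughly as follows. This is a sketch under assumptions: `stg_payments` and its `order_id`, `payment_method`, `status`, and `amount` columns are hypothetical, and the payment method values are assumed to be valid SQL identifiers.

```sql
{# Only collect payment methods that appear on completed payments. #}
{% set payment_methods = dbt_utils.get_column_values(
    table=ref('stg_payments'),
    column='payment_method',
    where="status = 'completed'"
) %}

select
    order_id
    {% for payment_method in payment_methods %}
    , sum(case when payment_method = '{{ payment_method }}' then amount end)
        as {{ payment_method }}_amount
    {% endfor %}
from {{ ref('stg_payments') }}
group by 1
```

Leading commas are used so the query stays valid even when the filter returns no values.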
- Fix `union_relations` `source_column_name` none option.
- Enable a negative part_number for `split_part()` (#557, #559)
- Make `exclude` case insensitive for `union_relations()` (#578, #587)
- Documentation about listagg macro (#544, #560)
- Fix links to macro section in table of contents (#555)
- Use the ADR (Architectural Design Record) pattern for documenting significant decisions (#573)
- Contributing guide (#574)
- Add better documentation for `deduplicate()` (#542, #548)
- Fail integration tests appropriately (#540, #545)
- Upgrade CircleCI postgres convenience image (#584, #585)
- Run test for `deduplicate` (#579, #580)
- Reduce warnings when executing integration tests (#558, #581)
- Framework for functional testing using `pytest` (#588)
- @graciegoheen (#560)
- @judahrand (#548)
- @clausherther (#555)
- @LewisDavies (#554)
- @epapineau (#583)
- @b-per (#559)
- @dbeatty10, @jeremyyeo (#587)
- A macro for deduplicating data, `deduplicate()` (#335, #512)
- A cross-database implementation of `listagg()` (#530)
- A new macro to get the columns in a relation as a list, `get_filtered_columns_in_relation()`. This is similar to the `star()` macro, but creates a Jinja list instead of a comma-separated string. (#516)
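Because `get_filtered_columns_in_relation()` returns a Jinja list, it can be looped over, which `star()` does not allow. The sketch below assumes a hypothetical `stg_orders` model with `order_id` and `_loaded_at` columns; the `from` and `except` argument names are assumptions to verify against the README of the installed version.

```sql
{# Column names come back as a list, so each one can be wrapped in an aggregate. #}
{% set column_names = dbt_utils.get_filtered_columns_in_relation(
    from=ref('stg_orders'),
    except=['order_id', '_loaded_at']
) %}

select
    order_id
    {% for column_name in column_names %}
    , max({{ column_name }}) as max_{{ column_name }}
    {% endfor %}
from {{ ref('stg_orders') }}
group by 1
```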
- `get_column_values()` once more raises an error when the model doesn't exist and there is no default provided (#531, #533)
- `get_column_values()` raises an error when used with an ephemeral model, instead of getting stuck in a compilation loop (#358, #518)
- BigQuery materialized views work correctly with `get_relations_by_pattern()` (#525)
- Updated references to 'schema test' in project file structure and documentation (#485, #521)
- `date_trunc()` and `datediff()` default macros now have whitespace control to assist with linting and readability #529
- `star()` no longer raises an error during SQLFluff linting (#506, #532)
- @judahrand (#512)
- @b-moynihan (#521)
- @sunriselong (#529)
- @jpmmcneill (#533)
- @KamranAMalik (#532)
- @graciegoheen (#530)
- @luisleon90 (#525)
- @epapineau (#518)
- @patkearns10 (#516)
- A cross-database implementation of `any_value()` (#497, #501)
- A cross-database implementation of `bool_or()` (#504)
- also ignore `dbt_packages/` directory #463
- Remove block comments to make date_spine macro compatible with the Athena connector (#462)
- `type_timestamp` macro now explicitly casts postgres and redshift warehouse timestamp data types as `timestamp without time zone`, to be consistent with Snowflake behaviour (`timestamp_ntz`).
- `union_relations` macro will now raise an exception if the use of `include` or `exclude` results in no columns (#473, #266).
- `get_relations_by_pattern()` works with foreign data wrappers on Postgres again. (#357, #476)
- `star()` will only alias columns if a prefix/suffix is provided, to allow the unmodified output to still be used in `group by` clauses etc. #468
- The `sequential_values` test is now compatible with quoted columns #479
- `pivot()` escapes values containing apostrophes #503
- grahamwetzler (#473)
- Aesthet (#476)
- Kamitenshi (#462)
- nickperrott (#468)
- jelstongreen (#468)
- armandduijn (#479)
- mdutoo (#503)
- dbt ONE POINT OH is here! This version of dbt-utils requires any version (minor and patch) of v1, which means far less need for compatibility releases in the future.
- The partition column in the `mutually_exclusive_ranges` test is now always called `partition_by_col`. This enables compatibility with `--store-failures` when multiple columns are concatenated together. If you have models built on top of the failures table, update them to reflect the new column name. (#423, #430)
- codigo-ergo-sum (#430)
🚨 This is a compatibility release in preparation for `dbt-core` v1.0.0. Projects using dbt-utils 0.7.4 with dbt-core v1.0.0 can expect to see a deprecation warning. This will be resolved in dbt_utils v0.8.0.
- Regression in `get_column_values()` where the default would not be respected if the model didn't exist. (#444, #448)
- get_url_host() macro now correctly handles URLs beginning with android-app:// (#426)
- `get_column_values()` now works correctly with mixed-quoting styles on Snowflake (#424, #440)
- Remove extra semicolon in `insert_by_period` materialization that was causing errors (#439)
- Swap `limit 0` out for `{{ limit_zero() }}` on the `slugify()` tests to allow for compatibility with tsql-utils (#437)
🚨🚨 We have renamed the `master` branch to `main`. If you have a local version of `dbt-utils`, you will need to update to the new branch. See the GitHub docs for more details.
- Bump `require-dbt-version` to have an upper bound of `'<=1.0.0'`.
- URL link fixes within the README for `not_constant`, `dateadd`, `datediff` and updated the header `Logger` to `Jinja Helpers`. (#431)
- Fully qualified a `cte_name.*` in the `equality` test to avoid an Exasol error (#420)
- `get_url_host()` macro now correctly handles URLs beginning with `android-app://` (#426)
- Fix bug introduced in 0.7.2 in dbt_utils.star which could cause the except argument to drop columns that were not explicitly specified (#418)
- Remove deprecated argument from not_null_proportion (#416)
- Change final select statement in not_null_proportion to avoid false positive failures (#416)
- Add `not_null_proportion` generic test that allows the user to specify the minimum (`at_least`) tolerated proportion (e.g., `0.95`) of non-null values (#411)
- Allow user to provide any case type when defining the `exclude` argument in `dbt_utils.star()` (#403)
- Log whole row instead of just column name in 'accepted_range' generic test to allow better visibility into failures (#413)
- Use column name to group in 'get_column_values' to allow better cross db functionality (#407)
- Declare compatibility with dbt v0.21.0, which has no breaking changes for this package (#398)
dbt v0.20.0 or greater is required for this release. If you are not ready to upgrade, consider using a previous release of this package.
In accordance with the version upgrade, this package release includes breaking changes to:
- Generic (schema) tests
- `dispatch` functionality
The order of (optional) arguments has changed in the `get_column_values` macro.
Before:
{% macro get_column_values(table, column, max_records=none, default=none) -%}
...
{% endmacro %}
After:
{% macro get_column_values(table, column, order_by='count(*) desc', max_records=none, default=none) -%}
...
{% endmacro %}
If you were relying on the position to match up your optional arguments, this may be a breaking change. In general, we recommend that you explicitly declare any optional arguments (if not all of your arguments!).
-- before: This works on previous versions of dbt-utils, but on 0.7.0, the `50` would be passed through as the `order_by` argument
{% set payment_methods = dbt_utils.get_column_values(
    ref('stg_payments'),
    'payment_method',
    50
) %}
-- after
{% set payment_methods = dbt_utils.get_column_values(
    ref('stg_payments'),
    'payment_method',
    max_records=50
) %}
- Add new argument, `order_by`, to `get_column_values` (code originally in #289 from @clausherther, merged via #349)
- Add `slugify` macro, and use it in the pivot macro. 🚨 This macro uses the `re` module, which is only available in dbt v0.19.0+. As a result, this feature introduces a breaking change. (#314)
- Add `not_null_proportion` generic test that allows the user to specify the minimum (`at_least`) tolerated proportion (e.g., `0.95`) of non-null values
- Update the default implementation of concat macro to use `||` operator (#373 from @ChristopheDuong). Note this may be a breaking change for adapters that support `concat()` but not `||`, such as Apache Spark.
- Use `power()` instead of `pow()` in `generate_series()` and `haversine_distance()` as they are synonyms in most SQL dialects, but some dialects only have `power()` (#354 from @swanderz)
- Make `get_column_values` return the default value passed as a parameter instead of an empty string before compilation (#304 from @jmriego)
- Make `sequential_values` generic test use `dbt_utils.type_timestamp()` to allow for compatibility with databases without a timestamp data type. #376 from @swanderz
- Add new `accepted_range` test (#276 @joellabes)
- Make `expression_is_true` work as a column test (code originally in #226 from @elliottohara, merged via #313)
- Add new generic test, `not_accepted_values` (#284 @JavierMonton)
- Support a new argument, `zero_length_range_allowed` in the `mutually_exclusive_ranges` test (#307 @zemekeneng)
- Add new generic test, `sequential_values` (#318, inspired by @hundredwatt)
- Support `quarter` in the `postgres__last_day` macro (#333 @seunghanhong)
- Add new argument, `unit`, to `haversine_distance` (#340 @bastienboutonnet)
- Add new generic test, `fewer_rows_than` (code originally in #221 from @dmarts, merged via #343)
- Handle booleans gracefully in the unpivot macro (#305 @avishalom)
- Fix a bug in `get_relation_by_prefix` that happens with Snowflake external tables. Now the macro will retrieve external tables that match the prefix (#351)
- Fix `cardinality_equality` test when the two tables' column names differed (#334 @joellabes)
- Fix Markdown formatting for hub rendering (#336 @coapacetic)
- Reorder readme and improve docs
- Fix `insert_by_period` to support `dbt v0.19.0`, with backwards compatibility for earlier versions (#319, #320)
- Speed up CI via threads, workflows (#315, #316)
- Fix `equality` test when used with ephemeral models + explicit column set (#321)
- Fix `get_query_results_as_dict` integration test with consistent ordering (#322)
- All macros are now properly dispatched, making it possible for non-core adapters to implement a shim package for dbt-utils (#312) Thanks @chaerinlee1 and @swanderz
- Small, non-breaking changes to accommodate TSQL (can't group by column number references, no real TRUE/FALSE values, aggregation CTEs need named columns) (#310) Thanks @swanderz
- Make `get_relations_by_pattern` and `get_relations_by_prefix` more powerful by returning `relation.type` (#323)
- Fix the logic in `get_tables_by_pattern_sql` to ensure non-default arguments are respected (#279)
- Fix the logic in `get_tables_by_pattern_sql` for matching a schema pattern on BigQuery (#275)
- 🚨 dbt v0.18.0 or greater is required for this release. If you are not ready to upgrade, consider using a previous release of this package
- 🚨 The `get_tables_by_prefix`, `union_tables` and `get_tables_by_pattern` macros have been removed
- Upgrade your dbt project to v0.18.0 using these instructions.
- Upgrade your `packages.yml` file to use version `0.6.0` of this package. Run `dbt clean` and `dbt deps`.
- If your project uses the `get_tables_by_prefix` macro, replace it with `get_relations_by_prefix`. All arguments have retained the same name.
- If your project uses the `union_tables` macro, replace it with `union_relations`. While the order of arguments has stayed consistent, the `tables` argument has been renamed to `relations`. Further, the default value for the `source_column_name` argument has changed from `'_dbt_source_table'` to `'_dbt_source_relation'`; you may want to explicitly define this argument to avoid breaking changes.
-- before:
{{ dbt_utils.union_tables(
tables=[ref('my_model'), source('my_source', 'my_table')],
exclude=["_loaded_at"]
) }}
-- after:
{{ dbt_utils.union_relations(
relations=[ref('my_model'), source('my_source', 'my_table')],
exclude=["_loaded_at"],
source_column_name='_dbt_source_table'
) }}
- If your project uses the `get_tables_by_pattern` macro, replace it with `get_tables_by_pattern_sql`; all arguments are consistent.
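Taken together, the `get_tables_by_prefix` and `union_tables` replacements from the steps above might be combined as in this sketch. The `events` schema and `event_` prefix are hypothetical, and `source_column_name` is set explicitly only to preserve the old column name.

```sql
{# Collect every relation in the hypothetical "events" schema whose name
   starts with "event_", then union them into one result set. #}
{% set event_relations = dbt_utils.get_relations_by_prefix(
    schema='events',
    prefix='event_'
) %}

{{ dbt_utils.union_relations(
    relations=event_relations,
    source_column_name='_dbt_source_table'
) }}
```

Omitting `source_column_name` would instead produce the new default column, `_dbt_source_relation`.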
- Switch usage of `adapter_macro` to `adapter.dispatch`, and define `dbt_utils_dispatch_list`, enabling users of community-supported database plugins to add or override macro implementations specific to their database (#267)
- Use `add_ephemeral_prefix` instead of hard-coding a string literal, to support database adapters that use different prefixes (#267)
- Implement a quote_columns argument in the unique_combination_of_columns generic test (#270 @JoshuaHuntley)
- Remove deprecated macros `get_tables_by_prefix` and `union_tables` (#268)
- Remove `get_tables_by_pattern` macro, which is equivalent to the `get_tables_by_pattern_sql` macro (the latter has a more logical name) (#268)
- Improve release process, and fix tests (#251)
- Make deprecation warnings more useful (#258 @tayloramurphy)
- Add more docs for `date_spine` (#265 @calvingiles)