
[#2479] Allow unique_key to take a list #4618

Merged
merged 6 commits on Feb 3, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

### Features
- New Dockerfile to support specific db adapters and platforms. See docker/README.md for details ([#4495](https://github.com/dbt-labs/dbt-core/issues/4495), [#4487](https://github.com/dbt-labs/dbt-core/pull/4487))
- Allow unique_key to take a list ([#2479](https://github.com/dbt-labs/dbt-core/issues/2479), [#4618](https://github.com/dbt-labs/dbt-core/pull/4618))
Let's make sure to add the contributor from the original PR to the changelog for v1.1:

- [@triedandtested-dev](https://github.com/triedandtested-dev) ([#4618](https://github.com/dbt-labs/dbt-core/pull/4618))


By the way, I noticed that this list option is missing from the docs here:
https://docs.getdbt.com/reference/resource-configs/unique_key#use-a-combination-of-two-columns-as-a-unique-key


Good call @cdabel !

Just opened an issue for that here: dbt-labs/docs.getdbt.com#4642


### Fixes
- User wasn't asked for permission to overwrite a profile entry when running init inside an existing project ([#4375](https://github.com/dbt-labs/dbt-core/issues/4375), [#4447](https://github.com/dbt-labs/dbt-core/pull/4447))
2 changes: 2 additions & 0 deletions core/dbt/contracts/graph/model_config.py
@@ -409,6 +409,7 @@ class NodeConfig(NodeAndTestConfig):
metadata=MergeBehavior.Update.meta(),
)
full_refresh: Optional[bool] = None
unique_key: Optional[Union[str, List[str]]] = None
on_schema_change: Optional[str] = 'ignore'

@classmethod
@@ -494,6 +495,7 @@ def same_contents(
@dataclass
class EmptySnapshotConfig(NodeConfig):
materialized: str = 'snapshot'
unique_key: Optional[str] = None # override NodeConfig unique_key definition


@dataclass
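The widened `Optional[Union[str, List[str]]` annotation above means any consumer of this config must now handle both shapes. A minimal sketch of the normalization a consumer might do — this helper and its name are hypothetical, not part of dbt-core:

```python
from typing import List, Optional, Union

def normalize_unique_key(unique_key: Optional[Union[str, List[str]]]) -> List[str]:
    """Return unique_key as a list of column names (empty list if unset)."""
    if unique_key is None:
        return []
    if isinstance(unique_key, str):
        # a single column name configured as a plain string
        return [unique_key]
    return list(unique_key)
```

So `normalize_unique_key("id")` yields `["id"]` and `normalize_unique_key(["id", "ts"])` yields `["id", "ts"]`, letting downstream code loop over keys uniformly.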
@@ -9,10 +9,19 @@
{%- set sql_header = config.get('sql_header', none) -%}

{% if unique_key %}
{% set unique_key_match %}
DBT_INTERNAL_SOURCE.{{ unique_key }} = DBT_INTERNAL_DEST.{{ unique_key }}
{% endset %}
{% do predicates.append(unique_key_match) %}
{% if unique_key is sequence and unique_key is not mapping and unique_key is not string %}
This update to default__get_merge_sql will work for Snowflake + BigQuery out of the box. For other databases:

- Postgres + Redshift don't support merge; they use get_delete_insert_merge_sql instead. We'd need to refactor the delete statement to support multiple unique keys, which is tricky enough that I think we can open a new issue for it. No need to block this PR in the meantime.
- Spark reimplements this as spark__get_merge_sql, so we'll need to make a similar change over there if we want to support the same functionality

For the Spark change here, @gshank maybe you can pair with @McKnight-42 or @VersusFacit on making that change.

@McKnight-42 @VersusFacit - this behavior difference is something good to document in the specific adapter pages if it isn't there yet

I think we'll also want integration tests for Snowflake + BigQuery, to ensure this change actually achieves the desired result

{% for key in unique_key %}
{% set this_key_match %}
DBT_INTERNAL_SOURCE.{{ key }} = DBT_INTERNAL_DEST.{{ key }}
{% endset %}
{% do predicates.append(this_key_match) %}
{% endfor %}
{% else %}
{% set unique_key_match %}
DBT_INTERNAL_SOURCE.{{ unique_key }} = DBT_INTERNAL_DEST.{{ unique_key }}
{% endset %}
{% do predicates.append(unique_key_match) %}
{% endif %}
{% else %}
{% do predicates.append('FALSE') %}
{% endif %}
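The Jinja test above (`unique_key is sequence and unique_key is not mapping and unique_key is not string`) is needed because Jinja considers strings and dicts to be sequences too. The predicate-building logic the macro implements can be sketched in plain Python (function name hypothetical; this mirrors, not reproduces, `default__get_merge_sql`):

```python
from collections.abc import Mapping, Sequence

def build_match_predicates(unique_key):
    """One equality predicate per key; FALSE when no key (insert-only merge)."""
    if not unique_key:
        return ["FALSE"]
    # mirror the Jinja check: a real list of keys, not a str or dict
    is_key_list = (
        isinstance(unique_key, Sequence)
        and not isinstance(unique_key, (str, Mapping))
    )
    keys = unique_key if is_key_list else [unique_key]
    return [
        f"DBT_INTERNAL_SOURCE.{key} = DBT_INTERNAL_DEST.{key}"
        for key in keys
    ]
```

With `["a", "b"]` this produces two predicates that the merge statement ANDs together, so a row only matches when every key column agrees.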
@@ -461,6 +461,7 @@ def rendered_model_config(self, **updates):
'full_refresh': None,
'on_schema_change': 'ignore',
'meta': {},
'unique_key': None,
}
result.update(updates)
return result
@@ -485,6 +486,7 @@ def rendered_seed_config(self, **updates):
'schema': None,
'alias': None,
'meta': {},
'unique_key': None,
}
result.update(updates)
return result
6 changes: 6 additions & 0 deletions test/integration/047_dbt_ls_tests/test_ls.py
@@ -128,6 +128,7 @@ def expect_analyses_output(self):
'schema': None,
'alias': None,
'meta': {},
'unique_key': None
},
'unique_id': 'analysis.test.a',
'original_file_path': normalize('analyses/a.sql'),
@@ -158,6 +159,7 @@ def expect_model_output(self):
'column_types': {},
'persist_docs': {},
'full_refresh': None,
'unique_key': None,
'on_schema_change': 'ignore',
'database': None,
'schema': None,
@@ -184,6 +186,7 @@ def expect_model_output(self):
'column_types': {},
'persist_docs': {},
'full_refresh': None,
'unique_key': None,
'on_schema_change': 'ignore',
'incremental_strategy': 'delete+insert',
'database': None,
@@ -211,6 +214,7 @@ def expect_model_output(self):
'column_types': {},
'persist_docs': {},
'full_refresh': None,
'unique_key': None,
'on_schema_change': 'ignore',
'database': None,
'schema': None,
@@ -237,6 +241,7 @@ def expect_model_output(self):
'column_types': {},
'persist_docs': {},
'full_refresh': None,
'unique_key': None,
'on_schema_change': 'ignore',
'database': None,
'schema': None,
@@ -332,6 +337,7 @@ def expect_seed_output(self):
'persist_docs': {},
'quote_columns': False,
'full_refresh': None,
'unique_key': None,
'on_schema_change': 'ignore',
'database': None,
'schema': None,
14 changes: 9 additions & 5 deletions test/unit/test_contracts_graph_parsed.py
@@ -302,6 +302,7 @@ def complex_parsed_model_object():
def test_model_basic(basic_parsed_model_object, base_parsed_model_dict, minimal_parsed_model_dict):
node = basic_parsed_model_object
node_dict = base_parsed_model_dict
compare_dicts(node.to_dict(), node_dict)
assert_symmetric(node, node_dict)
assert node.empty is False
assert node.is_refable is True
@@ -921,6 +922,7 @@ def test_basic_parsed_hook(minimal_parsed_hook_dict, base_parsed_hook_dict, base
def test_complex_parsed_hook(complex_parsed_hook_dict, complex_parsed_hook_object):
node = complex_parsed_hook_object
node_dict = complex_parsed_hook_dict
# what's different?
assert_symmetric(node, node_dict)
assert node.empty is False
assert node.is_refable is False
@@ -1494,6 +1496,7 @@ def basic_intermediate_timestamp_snapshot_object():
tags=[],
config=cfg,
checksum=FileHash.from_contents(''),
created_at = 1,
unrendered_config={
'strategy': 'timestamp',
'unique_key': 'id',
@@ -1596,7 +1599,7 @@ def basic_check_snapshot_object():


@pytest.fixture
def basic_intermedaite_check_snapshot_object():
def basic_intermediate_check_snapshot_object():
cfg = EmptySnapshotConfig()
cfg._extra.update({
'unique_key': 'id',
@@ -1626,6 +1629,7 @@ def basic_intermedaite_check_snapshot_object():
tags=[],
config=cfg,
checksum=FileHash.from_contents(''),
created_at = 1.0,
unrendered_config={
'target_database': 'some_snapshot_db',
'target_schema': 'some_snapshot_schema',
@@ -1642,20 +1646,20 @@ def test_timestamp_snapshot_ok(basic_timestamp_snapshot_dict, basic_timestamp_sn
inter = basic_intermediate_timestamp_snapshot_object

assert_symmetric(node, node_dict, ParsedSnapshotNode)
assert_symmetric(inter, node_dict, IntermediateSnapshotNode)
# node_from_dict = ParsedSnapshotNode.from_dict(inter.to_dict(omit_none=True))
# node_from_dict.created_at = 1
assert ParsedSnapshotNode.from_dict(inter.to_dict(omit_none=True)) == node
assert node.is_refable is True
assert node.is_ephemeral is False
pickle.loads(pickle.dumps(node))


def test_check_snapshot_ok(basic_check_snapshot_dict, basic_check_snapshot_object, basic_intermedaite_check_snapshot_object):
def test_check_snapshot_ok(basic_check_snapshot_dict, basic_check_snapshot_object, basic_intermediate_check_snapshot_object):
node_dict = basic_check_snapshot_dict
node = basic_check_snapshot_object
inter = basic_intermedaite_check_snapshot_object
inter = basic_intermediate_check_snapshot_object

assert_symmetric(node, node_dict, ParsedSnapshotNode)
assert_symmetric(inter, node_dict, IntermediateSnapshotNode)
assert ParsedSnapshotNode.from_dict(inter.to_dict(omit_none=True)) == node
assert node.is_refable is True
assert node.is_ephemeral is False
2 changes: 2 additions & 0 deletions test/unit/utils.py
@@ -207,6 +207,8 @@ def assert_to_dict(obj, dct):
obj_to_dict['created_at'] = 1
if 'created_at' in dct:
dct['created_at'] = 1
if obj_to_dict != dct:
compare_dicts(obj_to_dict, dct)
assert obj_to_dict == dct


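The diff above calls `compare_dicts` before the final assertion, but its body isn't shown in this PR page. A plausible sketch of such a helper — this implementation is hypothetical, intended only to illustrate why you'd call it before `assert obj_to_dict == dct` (it reports *which* keys differ instead of a bare assertion failure):

```python
def compare_dicts(dict_a, dict_b):
    """Print and return a description of every difference between two dicts."""
    diffs = []
    only_a = set(dict_a) - set(dict_b)
    only_b = set(dict_b) - set(dict_a)
    if only_a:
        diffs.append(f"only in first: {sorted(only_a)}")
    if only_b:
        diffs.append(f"only in second: {sorted(only_b)}")
    # compare values for the keys both dicts share
    for key in sorted(set(dict_a) & set(dict_b)):
        if dict_a[key] != dict_b[key]:
            diffs.append(f"{key}: {dict_a[key]!r} != {dict_b[key]!r}")
    for line in diffs:
        print(line)
    return diffs
```

Guarding the call with `if obj_to_dict != dct:`, as the diff does, keeps the passing path quiet and only prints the key-by-key breakdown when the subsequent assertion is about to fail.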