Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support render for versioned dbt models. #516

Merged
merged 16 commits into from
Oct 6, 2023

Conversation

binhnq94
Copy link
Contributor

@binhnq94 binhnq94 commented Sep 6, 2023

Description

Model version is supported by dbt-core >= 1.5.
As cosmos v1.1.0a1, it not support to render task for model have versions.

In this PR I propose to use node_dict["alias"] as DbtNode.name instead of node_dict["name"] if node_dict["resource_type"] == model.

This PR also help cosmos render the names of nodes to be correctly the same as identifier if users config alias or custom the generate_alias_name macro. Refer: alias document and the generate_alias_name implementation.

Related Issue(s)

#260

Breaking Change?

When model have version config, each model version will create a database relation with alias <model_name>_v<v>, for examples: dim_customers_v1, dim_customers_v2...

So I expected that cosmos will created run/test tasks for each versions, for example: dim_customers_v1_run, dim_customers_v2_run, dim_customers_v1_test, dim_customers_v2_test...

The second expectation is the dag will render right depedences, for example: stg_customers_v1 >> customers_v1, stg_customers_v2 >> customer_v2...

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@binhnq94 binhnq94 requested a review from a team as a code owner September 6, 2023 17:43
@binhnq94 binhnq94 requested a review from a team September 6, 2023 17:43
@netlify
Copy link

netlify bot commented Sep 6, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit 898569e
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/6516fb7836da88000811009d

@binhnq94 binhnq94 temporarily deployed to external September 6, 2023 18:39 — with GitHub Actions Inactive
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@binhnq94, thanks a lot for your contribution!

I will still read the docs to understand more about aliases and versions, but I have two initial questions:

  1. Would there be any benefit in introducing a new attribute to DbtNode called alias instead of overloading the property name with two possible sources? We could also introduce an identifier if it helped. What are the pros and cons of overriding name?
  2. When we render dbt nodes as Airflow nodes, what would be the benefit of rendering models using aliases instead of their names?

cosmos/dbt/graph.py Outdated Show resolved Hide resolved
@tatiana tatiana added this to the 1.2.0 milestone Sep 6, 2023
@codecov
Copy link

codecov bot commented Sep 6, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (fabedd8) 92.83% compared to head (898569e) 92.83%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #516   +/-   ##
=======================================
  Coverage   92.83%   92.83%           
=======================================
  Files          49       49           
  Lines        1983     1983           
=======================================
  Hits         1841     1841           
  Misses        142      142           
Files Coverage Δ
cosmos/dbt/graph.py 98.78% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@binhnq94
Copy link
Contributor Author

binhnq94 commented Sep 7, 2023

@binhnq94, thanks a lot for your contribution!

I will still read the docs to understand more about aliases and versions, but I have two initial questions:

  1. Would there be any benefit in introducing a new attribute to DbtNode called alias instead of overloading the property name with two possible sources? We could also introduce an identifier if it helped. What are the pros and cons of overriding name?
  2. When we render dbt nodes as Airflow nodes, what would be the benefit of rendering models using aliases instead of their names?
  1. No, because if you use model versions in dbt project, name is not unique any more. dbt run with --models <name> will build all model versions leading to select not work as expected.
  2. Use alias, cosmos will render the DAG that have the same lineage as in the docs created by dbt docs generate cli. . It help users to double check easier and it support model versions.

@binhnq94 binhnq94 temporarily deployed to external September 7, 2023 04:13 — with GitHub Actions Inactive
@pre-commit-ci pre-commit-ci bot temporarily deployed to external September 8, 2023 09:33 Inactive
@binhnq94 binhnq94 temporarily deployed to external September 8, 2023 15:02 — with GitHub Actions Inactive
@binhnq94 binhnq94 force-pushed the fix_dbt_model_version branch from c107c07 to 80be88d Compare September 8, 2023 16:11
@binhnq94 binhnq94 temporarily deployed to external September 8, 2023 16:11 — with GitHub Actions Inactive
@binhnq94 binhnq94 changed the title Add support dbt project render for models version. Add support render for versioned models. Sep 8, 2023
@binhnq94 binhnq94 changed the title Add support render for versioned models. Add support render for versioned dbt models. Sep 8, 2023
@binhnq94 binhnq94 temporarily deployed to external September 11, 2023 11:05 — with GitHub Actions Inactive
@tatiana tatiana temporarily deployed to external September 11, 2023 21:14 — with GitHub Actions Inactive
@tatiana
Copy link
Collaborator

tatiana commented Sep 13, 2023

@binhnq94 let's aim to have this change released as part of 1.2. Please, let me know if you'd like any further support on anything - I'm looking forward to merging this change.

@binhnq94 binhnq94 force-pushed the fix_dbt_model_version branch from 67eda36 to ce3d0eb Compare September 13, 2023 14:43
@binhnq94 binhnq94 temporarily deployed to external September 13, 2023 14:43 — with GitHub Actions Inactive
@binhnq94 binhnq94 temporarily deployed to external September 29, 2023 07:29 — with GitHub Actions Inactive
@binhnq94
Copy link
Contributor Author

binhnq94 commented Sep 29, 2023

@tatiana I fixed the error: improper relation name (too many dotted names) in commit
by deleting the macro file drop_table.sql.

@binhnq94 binhnq94 temporarily deployed to external September 29, 2023 09:38 — with GitHub Actions Inactive
@binhnq94 binhnq94 force-pushed the fix_dbt_model_version branch from 8e15561 to deb9964 Compare September 29, 2023 14:39
@binhnq94 binhnq94 temporarily deployed to external September 29, 2023 14:39 — with GitHub Actions Inactive
@binhnq94
Copy link
Contributor Author

@tatiana I have already fixed the fail in integrate tests (sorry, last time I only run model_version dag only.)
image
Let trigger integration tests :'(

@binhnq94 binhnq94 temporarily deployed to external September 29, 2023 16:29 — with GitHub Actions Inactive
@binhnq94 binhnq94 temporarily deployed to external September 29, 2023 18:05 — with GitHub Actions Inactive
@tatiana tatiana merged commit 6609f76 into astronomer:main Oct 6, 2023
39 checks passed
@tatiana tatiana mentioned this pull request Oct 13, 2023
tatiana added a commit that referenced this pull request Oct 13, 2023
**Features**

* Add support to model versioning available since dbt 1.6 by @binhnq94
in #516
* Add AWS Athena profile mapping by @benjamin-awd in #578
* Support customizing how dbt nodes are converted to Airflow by @tatiana
in #503
* Make the arg ``dbt_project_path`` in the ``ProjectConfig`` optional by
@MrBones757 in #581

**Bug fixes**

* Fix Cosmos custom selector to support filtering a single model by
@jlaneve and @harels in #576
* Fix using ``GoogleCloudServiceAccountDictProfileMapping`` together
with ``LoadMethod.DBT_LS`` by @joppevos in #587
* Fix using the ``full_refresh`` argument in projects that contain tests
by @EgorSemenov and @tatiana in #590
* Stop creating symbolic links for ``dbt_packages`` (solves
``LocalExecutor`` concurrency issue) by @tatiana in #600

**Others**

* Docs: add reference to original Jaffle Shop project by @erdos2n in
#583
* Docs: retries & note about DagBag error by @TJaniF in #592
* pre-commit updates in #575 and #585
@tatiana
Copy link
Collaborator

tatiana commented Oct 13, 2023

This was released as part of 1.2:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants