Explorer update: column level lineage #4767

Closed
wants to merge 12 commits into from
58 changes: 58 additions & 0 deletions website/docs/docs/collaborate/column-level-lineage.md
@@ -0,0 +1,58 @@
---
title: "Column level lineage"
description: "dbt Explorer provides column-level lineage so you can trace the provenance of the columns in your dbt project."
---

dbt Explorer now offers column level lineage (CLL) for the resources in your dbt project. Analytics engineers can quickly and easily gain insight into the provenance of their data products at a more granular level. For each column in a dbt resource (model, source, or snapshot), Explorer provides the full lineage for the data in that column.
nghi-ly marked this conversation as resolved.

Column level lineage is available to dbt Cloud Enterprise accounts that can use Explorer. It’s also available through the Discovery API.

:::tip Beta
Column-level lineage is now available in beta. Check it out! We'd love to [know what you think](https://docs.google.com/forms/d/e/1FAIpQLSdpCbVkGY9QwfExFonpWE4DTOKi3fQxBGLD0wwKYpkMjgcE7g/viewform)!
Contributor Author

i didn't use "open beta" here, aligned with the betas for Project recs and Model perf pages. lemme know if this doesn't work tho! happy to change

:::

## No setup required
nghi-ly marked this conversation as resolved.

There is no additional setup required for column level lineage if your account is on an Enterprise plan that can use Explorer. You can access column level lineage by expanding the column card in the **Columns** tab of an Explorer [resource details page](/docs/collaborate/explore-projects#view-resource-details) for a model, source, or snapshot.

Lineage will update after each run executed in the production environment. Make sure that `docs generate` is running within at least one job in the environment. Refer to [Generating metadata](/docs/collaborate/explore-projects#generate-metadata) for more details.

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-cll.png" title="Example of the Columns tab and where to expand for the CLL"/>

<LoomVideo id='278c948ba387457884cc6b9545793685' />
Contributor Author

@nghi-ly nghi-ly Jan 18, 2024

@dave-connors-3 : i'm using the loom video from the notion draft. we can update the link if you end up rerecording it, np

Contributor

@dave-connors-3 feel free to re-record, maybe on Fri or Mon? i'd like to see your take, and we may want to wait for more UX improvements to land


## What you can use column level lineage for {#use-cases}
nghi-ly marked this conversation as resolved.

Click on these tabs to learn more about the CLL use cases, the analysis you can do, and the results you can achieve:
Contributor

Explore the sections below to learn more about why and how you can use column-level lineage.

Contributor Author

updated this. for accessibility, "below" is not a great word choice so worked around that


<Tabs>
<TabItem value="root-cause" label="Root cause analysis">

When there is an unexpected breakage in a data pipeline, column level lineage can be a valuable tool to understand the exact point in the pipeline where the error took place. For example, a failing data test on a particular column in your dbt model might've stemmed from an untested column upstream. Using CLL can help quickly identify and fix breakages when they happen.

</TabItem>
<TabItem value="impact" label="Impact analysis">

During development, analytics engineers can use column level lineage to understand the full scope of the impact of their proposed changes. This knowledge empowers them to create higher quality pull requests that require less rework, as they can anticipate and preempt issues that would've been unchecked without column level insights.

</TabItem>
<TabItem value="collaboration" label="Collaboration and efficiency">

When exploring your data products, column lineage lets analytics engineers and data analysts more easily navigate and understand the origin and usage of their data, so they can make better decisions with higher confidence.
</TabItem>
Contributor

Another use case - thoughts @dave-connors-3 ?

Auditing

Column-level awareness makes it easier for data developers to see how columns flow through the project when examining how data is used in a dbt project.

</Tabs>
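
As a rough illustration of the root cause scenario described in the tabs, the sketch below shows a downstream model whose column depends on an untested upstream column. The model and column names (`fct_orders`, `stg_orders`, `discount_pct`, `net_amount`) are hypothetical and not taken from the docs; the point is only that a failing `not_null` test on `net_amount` would trace back, through column level lineage, to `discount_pct` upstream.

```sql
-- models/marts/fct_orders.sql (hypothetical downstream model)
-- If `discount_pct` is null in the upstream model, the arithmetic below
-- yields null, so a not_null test on `net_amount` fails here even though
-- the underlying problem sits in an untested upstream column.
select
    order_id,
    amount * (1 - discount_pct) as net_amount
from {{ ref('stg_orders') }}
```

In this sketch, the lineage for `net_amount` would point at both `amount` and `discount_pct` in the upstream model, which narrows the search to two columns instead of the whole model.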
nghi-ly marked this conversation as resolved.

## Caveats

Column level lineage relies on SQL parsing. Errors can occur when parsing fails or a column's origin is unknown (like with JSON unpacking, lateral joins, and so forth). In these cases, lineage may be incomplete and dbt Cloud will provide a warning about it in the column lineage. To review the error details, open the [full lineage graph](/docs/collaborate/explore-projects#project-lineage) and select the node to open the column’s details panel.

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-parsing-error-pill.png" width="90%" title="Example of warning in the full lineage graph"/>
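
To make the JSON unpacking and lateral join cases more concrete, here is a minimal sketch of the kind of query where a column's origin can be hard to determine. It assumes Snowflake syntax (`lateral flatten` and the `:` path operator) and hypothetical names (`stg_orders`, `line_items`, `sku`); whether this exact query triggers a warning is an assumption, not something the docs state.

```sql
-- models/marts/order_items.sql (hypothetical model, Snowflake syntax assumed)
-- `sku` is extracted from a semi-structured column at query time, so there
-- is no named upstream column called `sku` for a SQL parser to link it to.
select
    o.order_id,
    item.value:sku::string as sku
from {{ ref('stg_orders') }} as o,
    lateral flatten(input => o.line_items) as item
```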

Possible error cases:

- **Parsing error** &mdash; Error occurs when the SQL is ambiguous or too complex for parsing. An example of an ambiguous parsing scenario is a _complex_ lateral join.
- **Python error** &mdash; Error occurs when a Python model is used within the lineage. Due to the nature of Python models, it's not possible to parse and determine the lineage.
- **Unknown error** &mdash; Error occurs when the lineage can't be determined for an unknown reason. An example of this would be if a dbt best practice is not being followed, like using hardcoded table names instead of `ref` statements.
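
The hardcoded table name case from the list above can be sketched as follows. The model, schema, and column names are hypothetical; the only dbt feature relied on is the `ref` function, which registers the upstream model as a dependency in the project graph.

```sql
-- models/marts/fct_orders.sql (hypothetical model)
select
    order_id,
    amount
-- Avoid: from analytics.staging.stg_orders
-- A hardcoded relation name creates no dependency in the dbt project graph.
from {{ ref('stg_orders') }}
```

With `ref`, dbt knows which model the columns come from, which is what lets lineage be followed upstream from this model.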
Contributor Author

@nghi-ly nghi-ly Jan 18, 2024

"improper use of dbt" read a bit strong to me so trying this wording instead. this might be a bit of a mouthful tho




16 changes: 8 additions & 8 deletions website/docs/docs/collaborate/project-recommendations.md
@@ -3,20 +3,20 @@ title: "Project recommendations"
sidebar_label: "Project recommendations"
description: "dbt Explorer provides recommendations that you can take to improve the quality of your dbt project."
---

:::tip Beta

The project recommendations beta feature is now available in dbt Explorer! Check it out!

:::
Contributor Author

callouts at the top of pages affect SEO so as a best practice we're moving them to a lower location on the page but making sure they're still before the fold


dbt Explorer provides recommendations about your project from the `dbt_project_evaluator` [package](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) using metadata from the Discovery API.

Explorer also offers a global view, showing all the recommendations across the project for easy sorting and summarizing.

These recommendations provide insight into how you can build a more well documented, well tested, and well built project, leading to less confusion and more trust.

The Recommendations overview page includes two top-level metrics measuring the test and documentation coverage of the models in your project.
:::tip Beta

The project recommendations beta feature is now available in dbt Explorer! Check it out!

:::

The **Recommendations** overview page includes two top-level metrics measuring the test and documentation coverage of the models in your project.

- **Model test coverage** &mdash; The percent of models in your project (models not from a package or imported via dbt Mesh) with at least one dbt test configured on them.
- **Model documentation coverage** &mdash; The percent of models in your project (models not from a package or imported via dbt Mesh) with a description.
@@ -43,7 +43,7 @@ The Recommendations overview page includes two top-level metrics measuring the t

## The Recommendations tab

Models, sources and exposures each also have a Recommendations tab on their resource details page, with the specific recommendations that correspond to that resource:
Models, sources and exposures each also have a **Recommendations** tab on their resource details page, with the specific recommendations that correspond to that resource:

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png" width="80%" title="Example of the Recommendations tab "/>

1 change: 1 addition & 0 deletions website/sidebars.js
@@ -425,6 +425,7 @@ const sidebarSettings = {
link: { type: "doc", id: "docs/collaborate/explore-projects" },
items: [
"docs/collaborate/explore-projects",
"docs/collaborate/column-level-lineage",
"docs/collaborate/model-performance",
"docs/collaborate/project-recommendations",
"docs/collaborate/explore-multiple-projects",
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.