From 1d91550e48cdbb6583209c5bf8b66a74f5acc918 Mon Sep 17 00:00:00 2001
From: Justin Lane <94429064+justinl-sc@users.noreply.github.com>
Date: Wed, 27 Sep 2023 10:47:48 +1000
Subject: [PATCH 1/2] Updated hyperlink to section on base models
Updated the link in the "Staging: Models" section that references the section on base models.
---
.../docs/guides/best-practices/how-we-structure/2-staging.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/website/docs/guides/best-practices/how-we-structure/2-staging.md b/website/docs/guides/best-practices/how-we-structure/2-staging.md
index cb46fa19b33..34be2c69ba1 100644
--- a/website/docs/guides/best-practices/how-we-structure/2-staging.md
+++ b/website/docs/guides/best-practices/how-we-structure/2-staging.md
@@ -102,7 +102,7 @@ select * from renamed
- ✅ **Type casting**
- ✅ **Basic computations** (e.g. cents to dollars)
- ✅ **Categorizing** (using conditional logic to group values into buckets or booleans, such as in the `case when` statements above)
- - ❌ **Joins** — the goal of staging models is to clean and prepare individual source conformed concepts for downstream usage. We're creating the most useful version of a source system table, which we can use as a new modular component for our project. In our experience, joins are almost always a bad idea here — they create immediate duplicated computation and confusing relationships that ripple downstream — there are occasionally exceptions though (see [base models](guides/best-practices/how-we-structure/2-staging#staging-other-considerations) below).
+ - ❌ **Joins** — the goal of staging models is to clean and prepare individual source conformed concepts for downstream usage. We're creating the most useful version of a source system table, which we can use as a new modular component for our project. In our experience, joins are almost always a bad idea here — they create immediate duplicated computation and confusing relationships that ripple downstream — there are occasionally exceptions though (see [base models](#staging-other-considerations) below).
- ❌ **Aggregations** — aggregations entail grouping, and we're not doing that at this stage. Remember - staging models are your place to create the building blocks you’ll use all throughout the rest of your project — if we start changing the grain of our tables by grouping in this layer, we’ll lose access to source data that we’ll likely need at some point. We just want to get our individual concepts cleaned and ready for use, and will handle aggregating values downstream.
- ✅ **Materialized as views.** Looking at a partial view of our `dbt_project.yml` below, we can see that we’ve configured the entire staging directory to be materialized as views. As they’re not intended to be final artifacts themselves, but rather building blocks for later models, staging models should typically be materialized as views for two key reasons:
From 822c9a2a5c180768da25a02329aba16729139ab0 Mon Sep 17 00:00:00 2001
From: mirnawong1 <89008547+mirnawong1@users.noreply.github.com>
Date: Wed, 27 Sep 2023 09:51:28 +0100
Subject: [PATCH 2/2] Update
website/docs/guides/best-practices/how-we-structure/2-staging.md
---
.../docs/guides/best-practices/how-we-structure/2-staging.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/website/docs/guides/best-practices/how-we-structure/2-staging.md b/website/docs/guides/best-practices/how-we-structure/2-staging.md
index 34be2c69ba1..bcb589508e5 100644
--- a/website/docs/guides/best-practices/how-we-structure/2-staging.md
+++ b/website/docs/guides/best-practices/how-we-structure/2-staging.md
@@ -102,7 +102,7 @@ select * from renamed
- ✅ **Type casting**
- ✅ **Basic computations** (e.g. cents to dollars)
- ✅ **Categorizing** (using conditional logic to group values into buckets or booleans, such as in the `case when` statements above)
- - ❌ **Joins** — the goal of staging models is to clean and prepare individual source conformed concepts for downstream usage. We're creating the most useful version of a source system table, which we can use as a new modular component for our project. In our experience, joins are almost always a bad idea here — they create immediate duplicated computation and confusing relationships that ripple downstream — there are occasionally exceptions though (see [base models](#staging-other-considerations) below).
+ - ❌ **Joins** — the goal of staging models is to clean and prepare individual source-conformed concepts for downstream usage. We're creating the most useful version of a source system table, which we can use as a new modular component for our project. In our experience, joins are almost always a bad idea here — they create immediate duplicated computation and confusing relationships that ripple downstream — there are occasionally exceptions though (refer to [base models](#staging-other-considerations) for more info).
- ❌ **Aggregations** — aggregations entail grouping, and we're not doing that at this stage. Remember - staging models are your place to create the building blocks you’ll use all throughout the rest of your project — if we start changing the grain of our tables by grouping in this layer, we’ll lose access to source data that we’ll likely need at some point. We just want to get our individual concepts cleaned and ready for use, and will handle aggregating values downstream.
- ✅ **Materialized as views.** Looking at a partial view of our `dbt_project.yml` below, we can see that we’ve configured the entire staging directory to be materialized as views. As they’re not intended to be final artifacts themselves, but rather building blocks for later models, staging models should typically be materialized as views for two key reasons: