Commit

Merge branch 'dbt-labs:current' into current
okramarenko authored Nov 8, 2023
2 parents e897498 + 5b8991c commit 2fec8ca
Showing 129 changed files with 1,684 additions and 1,522 deletions.
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ To learn more about the writing conventions used in the dbt Labs docs, see the [
<!--
Uncomment if you're publishing docs for a prerelease version of dbt (delete if not applicable):
- [ ] Add versioning components, as described in [Versioning Docs](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-entire-pages)
- [ ] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/guides/migration/versions)
- [ ] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
-->
- [ ] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) and [About versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) so my content adheres to these guidelines.
- [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
33 changes: 33 additions & 0 deletions .github/workflows/crawler.yml
@@ -0,0 +1,33 @@
name: Algolia Crawler
on:
  pull_request:
    types:
      - closed

jobs:
  algolia_recrawl:
    # Comment out the if check below if running on every merge to current branch
    if: |
      contains(github.event.pull_request.labels.*.name, 'trigger-crawl')
      && github.event.pull_request.merged == true
    name: Trigger Algolia Crawl
    runs-on: ubuntu-latest
    steps:
      # Checkout repo
      - name: Checkout Repo
        uses: actions/checkout@v3

      # Wait 8 minutes to allow Vercel build to complete
      - run: sleep 480

      # Once deploy URL is found, trigger Algolia crawl
      - name: Run Algolia Crawler
        uses: algolia/algoliasearch-crawler-github-actions@v1
        id: crawler_push
        with:
          crawler-user-id: ${{ secrets.CRAWLER_USER_ID }}
          crawler-api-key: ${{ secrets.CRAWLER_API_KEY }}
          algolia-app-id: ${{ secrets.ALGOLIA_APP_ID }}
          algolia-api-key: ${{ secrets.ALGOLIA_API_KEY }}
          site-url: 'https://docs.getdbt.com'
          crawler-name: ${{ secrets.CRAWLER_NAME }}
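
As a rough illustration of the gating logic in the workflow's `if:` check, here is a minimal Python sketch. The function name `should_trigger_crawl` and its arguments are hypothetical and not part of the repository; the sketch only mirrors the condition that the pull request was merged and carries the `trigger-crawl` label. It compares label names exactly, which is a simplification and may differ from how GitHub's `contains` handles case.

```python
from typing import Iterable


def should_trigger_crawl(labels: Iterable[str], merged: bool) -> bool:
    """Mirror the workflow's `if:` check (hypothetical helper, not in the repo)."""
    # The crawl job should run only for merged pull requests
    # that carry the 'trigger-crawl' label.
    return merged and "trigger-crawl" in set(labels)
```

Under these assumptions, a closed but unmerged pull request never triggers the crawl, even when it is labeled.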
43 changes: 30 additions & 13 deletions contributing/content-style-guide.md
@@ -229,7 +229,7 @@ When referring to different sections of the IDE, use the name of the section and
People make use of titles in many places like table headers, section headings (such as an H2, H3, or H4), page titles, sidebars, and so much more.
When generating titles or updating them, use sentence case. It sets a more conversational tone to the docs&mdash;making the content more approachable and creating a friendly feel.
When generating titles or updating them, use sentence case. It sets a more conversational tone to the docs&mdash; making the content more approachable and creating a friendly feel.
We've defined five content types you can use when contributing to the docs (as in, writing or authoring). Learn more about title guidelines for [each content type](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-types.md).
@@ -239,7 +239,7 @@ Placeholder text is something that the user should replace with their own text.
Use all capital letters([screaming snake case](https://fission.codes/blog/screaming-snake-case/)) to indicate text that changes in the user interface or that the user needs to supply in a command or code snippet. Avoid surrounding it in brackets or braces, which someone might copy and use, producing an error.
Identify what the user should replace the placeholder text with in the paragraph preceding the code snippet or command.
Identify what the user should replace the placeholder text within the paragraph preceding the code snippet or command.
:white_check_mark: The following is an example of configuring a connection to a Redshift database. In your YAML file, you must replace `CLUSTER_ID` with the ID assigned to you during setup:

@@ -276,7 +276,7 @@ Guidelines for making lists are:
- There are at least two items.
- All list items follow a consistent, grammatical structure (like each item starts with a verb, each item begins with a capitalized word, each item is a sentence fragment).
- Lists items don't end in commas, semicolons, or conjunctions (like "and", "or"). However, you can use periods if they’re complete sentences.
- Introduce the list with a heading or, if it's within text, as a complete sentence or as a sentence fragment followed by a colon.
- Introduce the list with a heading or, if it's within the text, as a complete sentence or as a sentence fragment followed by a colon.

If the list starts getting lengthy and dense, consider presenting the same content in a different format such as a table, as separate subsections, or a new guide.

@@ -286,7 +286,7 @@ A bulleted list with introductory text:

> A dbt project is a directory of `.sql` and .yml` files. The directory must contain at a minimum:
>
> - Models: A model is a single `.sql` file. Each model contains a single `select` statement that either transforms raw data into a dataset that is ready for analytics, or, more often, is an intermediate step in such a transformation.
> - Models: A model is a single `.sql` file. Each model contains a single `select` statement that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation.
> - A project file: A `dbt_project.yml` file, which configures and defines your dbt project.

A bulleted list with sentence fragments:
@@ -307,10 +307,10 @@ A numbered list following an H2 heading:
## Tables
Tables provide a great way to present complex information and can help the content be more scannable for users, too.

There are many ways to construct a table, like row spanning and cell splitting. Make sure the content is clear, concise, and presents well on the web page (like avoid awkward word wrapping).
There are many ways to construct a table, such as row spanning and cell splitting. The content should be clear, concise, and presented well on the web page (for example, avoid awkward word wrapping).

Guidelines for making tables are:
- Introduce the table with a heading or, if it's within text, as a complete sentence or as a sentence fragment followed by a colon.
- Introduce the table with a heading or, if it's within the text, as a complete sentence or as a sentence fragment followed by a colon.
- Use a header row
- Use sentence case for all content, including the header row
- Content can be complete sentences, sentence fragments, or single words (like `Currency`)
@@ -338,7 +338,7 @@ A table following an H3 heading:
> | Name | Description | Values |
> | -----| ----------- | ------ |
> | `-help` | Displays information on how to use the command. | Doesn't take any values. |
> | `-readable` | Print output in human readable format. | <ul><li>`true`</li><li>`false`</li></ul> |
> | `-readable` | Print output in human-readable format. | <ul><li>`true`</li><li>`false`</li></ul> |
> | `-file` | Print output to file instead of stdout. | Name of the file. |

## Cards
@@ -349,7 +349,7 @@ You can configure a card in 2, 3, 4, or 5-column grids. To maintain a good user

There won't be many instances where you need to display 4 or 5 cards on the docs site. While we recommend you use 2 or 3-column grids, you can use 4 or 5-column grids in the following scenarios:

- For cards that contain little text and limited to under 15 words. (This is to make sure the text isn't squished)
- For cards that contain little text and are limited to 15 words or less. This is to make sure the text isn't squished.
- Always have the `hide_table_of_contents:` frontmatter set to `true` (This hides the right table of contents).

Otherwise, the text will appear squished and provide users with a bad experience.
@@ -371,16 +371,16 @@ To create cards in markdown, you need to:
- Add the props within the card component, including `title`,`body`,`link`,`icon`.
- Close out the div by using `</div>`

Refer to the following prop list for detailed explanation and examples:
Refer to the following prop list for detailed explanations and examples:

| Prop | Type | Info | Example |
| ---- | ---- | ---- | ------- |
| `title` | required | The title should be clear and explain an action the user should take or a product/feature. | `title: dbt Cloud IDE`
| `body` | required | The body contains the actionable or informative text for the user. You can include `<a href="` link within the body of the text. However, if you do this, you must not include the `link` prop set as that'll override any `<a href's` within the body text. | `body="The IDE is the easiest and most efficient way to develop dbt models`
| `link` | optional | Add a link to the entire card component so when users click on the card, it'll trigger the link. Adding a link prop means it'll override any links within the body and if users click on the card, they'll be directed to the link set by the link prop. | `link="/docs/cloud/dbt-cloud-ide/develop-in-the-cloud`
| `icon` | optional but recommended | You can add an icon to the card comonent by using any icons found in the [icons](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/static/img/icons) directory. <br /> * Icons are added in .svg format and you must add icons in two locations: website/static/img/icons and website/static/img/icons/white. This is so users can view the icons in dark or light mode on the docs.getdbt.com site. | ` icon="pencil-paper"/>` |
| `icon` | optional but recommended | You can add an icon to the card component by using any icons found in the [icons](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/static/img/icons) directory. <br /> * Icons are added in .svg format and you must add icons in two locations: website/static/img/icons and website/static/img/icons/white. This is so users can view the icons in dark or light mode on the docs.getdbt.com site. | ` icon="pencil-paper"/>` |

The following is an example of a 4 card column:
The following is an example of a 4-card column:

```
<div className="grid--4-col">
@@ -488,9 +488,24 @@ Avoid ending a sentence with a preposition unless the rewritten sentence would s

Product names, trademarks, services, and tools should be written as proper nouns, unless otherwise specified by the company or trademark owner.

As of October 2023, avoid using "dbt CLI" or "CLI" terminology when referring to the dbt Cloud CLI or dbt Core. However, if referring to the command line as a tool, CLI is acceptable.

dbt officially provides two command line tools for running dbt commands:

- [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) &mdash; This tool allows you to develop locally and execute dbt commands against your dbt Cloud development environment from your local command line.
- [dbt Core](https://github.com/dbt-labs/dbt-core) &mdash; This open-source tool is designed for local installation, enabling you to use dbt Core on the command line and communicate with databases through adapters.

Here are some examples of what to use and what to avoid: <br />

✅ Set up in the dbt Cloud CLI or dbt Core<br />
✅ Set up in the dbt Cloud CLI or dbt Core CLI<br />

❌ Set up via dbt CLI<br />
❌ Set up in dbt Cloud, **or** via the CLI<br />
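
The terminology rule above could be checked mechanically. The following is a minimal sketch, assuming a simple regular-expression scan; the helper name `find_ambiguous_cli_terms` is hypothetical and not part of the docs tooling. It only flags the literal phrase "dbt CLI" and does not attempt to judge bare uses of "the CLI", which depend on context.

```python
import re
from typing import List

# Matches "dbt CLI" only when the two words are adjacent, so
# "dbt Cloud CLI" and "dbt Core CLI" are not flagged.
AMBIGUOUS_TERM = re.compile(r"\bdbt CLI\b")


def find_ambiguous_cli_terms(text: str) -> List[int]:
    """Return 1-based line numbers that contain the discouraged "dbt CLI" term."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if AMBIGUOUS_TERM.search(line)
    ]
```

A check like this could run over changed Markdown files before review, leaving the context-dependent cases to a human editor.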

### Terms to use or avoid

Use industry-specific terms and research new/improved terminology. Also refer to the Inclusive Language section of this style guide for inclusive and accessible language and style.
Use industry-specific terms and research new/improved terminology. Also, refer to the Inclusive Language section of this style guide for inclusive and accessible language and style.

**DO NOT** use jargon or language familiar to a small subset of readers or assume that your readers understand ALL technical terms.

@@ -507,11 +522,13 @@ sign in | log in, login
sign up | signup
terminal | shell
username | login
dbt Cloud CLI | CLI, dbt CLI
dbt Core | CLI, dbt CLI
</div></b>

## Links

Links embedded in documentation are about trust. Users trust that we will lead them to sites or pages related to their reading content. In order to maintain that trust, it's important that links are transparent, up-to-date, and lead to legitimate resources.
Links embedded in the documentation are about trust. Users trust that we will lead them to sites or pages related to their reading content. In order to maintain that trust, it's important that links are transparent, up-to-date, and lead to legitimate resources.

### Internal links

6 changes: 3 additions & 3 deletions website/blog/2021-11-23-how-to-upgrade-dbt-versions.md
@@ -62,7 +62,7 @@ As noted above, the project is on 0.16.0 right now. 0.17.2 is the final patch re
>
> Practically, it also lets you lock in "checkpoints" of known-stable setups. If you need to pause your migration work to deal with an urgent request, you can safely deploy what you've finished so far instead of having a bunch of unrelated half-finished changes.
Review the migration guides to get an initial indication of what changes you might need to make. For example, in [the migration guide for 0.17.0](/guides/migration/versions), there are several significant changes to dbt's functionality, but it's unlikely that all of them will apply to your project. We'll cover this more later.
Review the migration guides to get an initial indication of what changes you might need to make. For example, in [the migration guide for 0.17.0](/docs/dbt-versions/core-upgrade), there are several significant changes to dbt's functionality, but it's unlikely that all of them will apply to your project. We'll cover this more later.

## Step 2: `Add require-dbt-version` to your `dbt_project.yml` file.

@@ -126,9 +126,9 @@ In this case, our example project probably has dbt 0.3.0 installed. By reviewing
### Step 5b. Fix errors, then warnings

Obviously, errors that stop you from running your dbt project at all are the most important to deal with. Let's assume that our project used a too-broadly-scoped variable in a macro file, support for which was removed in v0.17. The [migration guide explains what to do instead](/guides/migration/versions), and it's a pretty straightforward fix.
Obviously, errors that stop you from running your dbt project at all are the most important to deal with. Let's assume that our project used a too-broadly-scoped variable in a macro file, support for which was removed in v0.17. The [migration guide explains what to do instead](/docs/dbt-versions/core-upgrade), and it's a pretty straightforward fix.

Once your errors are out of the way, have a look at warnings. For example, 0.17 introduced `config-version: 2` to `dbt_project.yml`. Although it's backwards compatible for now, we know that support for the old version will be removed in a future version of dbt so we might as well deal with it now. Again, the migration guide explains [what we need to do](/guides/migration/versions), and how to take full advantage of the new functionality in the future.
Once your errors are out of the way, have a look at warnings. For example, 0.17 introduced `config-version: 2` to `dbt_project.yml`. Although it's backwards compatible for now, we know that support for the old version will be removed in a future version of dbt so we might as well deal with it now. Again, the migration guide explains [what we need to do](/docs/dbt-versions/core-upgrade), and how to take full advantage of the new functionality in the future.

### Stay focused

8 changes: 4 additions & 4 deletions website/blog/2021-11-29-dbt-airflow-spiritual-alignment.md
@@ -144,22 +144,22 @@ An analyst will be in the dark when attempting to debug this, and will need to r
This can be perfectly ok, in the event your data team is structured for data engineers to exclusively own dbt modeling duties, but that’s a quite uncommon org structure pattern from what I’ve seen. And if you have easy solutions for this analyst-blindness problem, I’d love to hear them.

Once the data has been ingested, dbt Core can be used to model it for consumption. Most of the time, users choose to either:
Use the dbt CLI+ [BashOperator](https://registry.astronomer.io/providers/apache-airflow/modules/bashoperator) with Airflow (If you take this route, you can use an external secrets manager to manage credentials externally), or
Use the dbt Core CLI+ [BashOperator](https://registry.astronomer.io/providers/apache-airflow/modules/bashoperator) with Airflow (If you take this route, you can use an external secrets manager to manage credentials externally), or
Use the [KubernetesPodOperator](https://registry.astronomer.io/providers/kubernetes/modules/kubernetespodoperator) for each dbt job, as data teams have at places like [Gitlab](https://gitlab.com/gitlab-data/analytics/-/blob/master/dags/transformation/dbt_trusted_data.py#L72) and [Snowflake](https://www.snowflake.com/blog/migrating-airflow-from-amazon-ec2-to-kubernetes/).

Both approaches are equally valid; the right one will depend on the team and use case at hand.

| | Dependency management | Overhead | Flexibility | Infrastructure Overhead |
|---|---|---|---|---|
| dbt CLI + BashOperator | Medium | Low | Medium | Low |
| dbt Core CLI + BashOperator | Medium | Low | Medium | Low |
| Kubernetes Pod Operator | Very Easy | Medium | High | Medium |
| | | | | |

If you have DevOps resources available to you, and your team is comfortable with concepts like Kubernetes pods and containers, you can use the KubernetesPodOperator to run each job in a Docker image so that you never have to think about Python dependencies. Furthermore, you’ll create a library of images containing your dbt models that can be run on any containerized environment. However, setting up development environments, CI/CD, and managing the arrays of containers can mean a lot of overhead for some teams. Tools like the [astro-cli](https://github.com/astronomer/astro-cli) can make this easier, but at the end of the day, there’s no getting around the need for Kubernetes resources for the Gitlab approach.

If you’re just looking to get started or just don’t want to deal with containers, using the BashOperator to call the dbt CLI can be a great way to begin scheduling your dbt workloads with Airflow.
If you’re just looking to get started or just don’t want to deal with containers, using the BashOperator to call the dbt Core CLI can be a great way to begin scheduling your dbt workloads with Airflow.
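
To make the BashOperator route a little more concrete, here is a minimal sketch of building the shell command that such a task would run. The helper `build_dbt_command` and the example paths are hypothetical; only the `dbt run` command and its `--project-dir`, `--profiles-dir`, and `--select` flags come from dbt Core. The resulting string is what you would pass as the `bash_command` argument of Airflow's `BashOperator`.

```python
import shlex
from typing import List, Optional


def build_dbt_command(
    project_dir: str,
    profiles_dir: str,
    select: Optional[str] = None,
) -> str:
    """Build a shell-safe `dbt run` command string (hypothetical helper)."""
    args: List[str] = [
        "dbt", "run",
        "--project-dir", project_dir,
        "--profiles-dir", profiles_dir,
    ]
    if select:
        # Restrict the run to a subset of models when a selector is given.
        args += ["--select", select]
    # shlex.join quotes each argument so paths with spaces stay intact.
    return shlex.join(args)


# Example paths are assumptions; adjust them to your own deployment.
command = build_dbt_command("/opt/dbt/project", "/opt/dbt/profiles")
```

Keeping command construction in a plain function makes it easy to test without an Airflow installation, while the DAG file only has to wire the string into a task.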

It’s important to note that whichever approach you choose, this is just a first step; your actual production needs may have more requirements. If you need granularity and dependencies between your dbt models, like the team at [Updater does, you may need to deconstruct the entire dbt DAG in Airflow.](https://www.astronomer.io/guides/airflow-dbt#use-case-2-dbt-airflow-at-the-model-level) If you’re okay managing some extra dependencies, but want to maximize control over what abstractions you expose to your end users, you may want to use the [GoCardlessProvider](https://github.com/gocardless/airflow-dbt), which wraps the BashOperator and dbt CLI.
It’s important to note that whichever approach you choose, this is just a first step; your actual production needs may have more requirements. If you need granularity and dependencies between your dbt models, like the team at [Updater does, you may need to deconstruct the entire dbt DAG in Airflow.](https://www.astronomer.io/guides/airflow-dbt#use-case-2-dbt-airflow-at-the-model-level) If you’re okay managing some extra dependencies, but want to maximize control over what abstractions you expose to your end users, you may want to use the [GoCardlessProvider](https://github.com/gocardless/airflow-dbt), which wraps the BashOperator and dbt Core CLI.

#### Rerunning jobs from failure
