Merge branch 'current' into add-links
mirnawong1 authored Nov 11, 2024
2 parents 1b94b78 + 385c30e commit 5b2a081
Showing 7 changed files with 27 additions and 108 deletions.
15 changes: 0 additions & 15 deletions README.md
@@ -62,18 +62,3 @@ You can click a link available in a Vercel bot PR comment to see and review your

Advisory:
- If you run into a `fatal error: 'vips/vips8' file not found` error when you run `npm install`, you may need to run `brew install vips`. Warning: this one will take a while -- go ahead and grab some coffee!
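
A minimal shell sketch of that recovery path, assuming Homebrew on macOS:

```shell
# `npm install` failed with: fatal error: 'vips/vips8' file not found
brew install vips   # install the libvips native dependency (this can take a while)
npm install         # retry the install once vips is available
```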

## Running the Cypress tests locally

Method 1: Utilizing the Cypress GUI
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress open` to open the Cypress GUI, choose `E2E Testing` as the Testing Type, then select your browser and click `Start E2E testing in {browser}`
5. Click on a test and watch it run!

Method 2: Running the Cypress E2E tests headlessly
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress run`
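
Both methods share the same setup; a condensed sketch of the commands above:

```shell
cd docs.getdbt.com/website   # the repo, then its website subdirectory
npm install                  # install the required node packages
npx cypress open             # Method 1: opens the GUI; choose "E2E Testing", then a browser
npx cypress run              # Method 2: runs the E2E tests headlessly
```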
67 changes: 0 additions & 67 deletions contributing/developer-blog.md

This file was deleted.

2 changes: 1 addition & 1 deletion website/docs/docs/build/incremental-strategy.md
@@ -27,7 +27,7 @@ Click the name of the adapter in the below table for more information about supp
| Data platform adapter | `append` | `merge` | `delete+insert` | `insert_overwrite` | `microbatch` <Lifecycle status="beta"/> |
|-----------------------|:--------:|:-------:|:---------------:|:------------------:|:-------------------:|
| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ |  | ✅ |
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ |  |  |
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ |  |  |
| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) |  | ✅ |  | ✅ | ✅ |
| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | ✅ | ✅ |  | ✅ | ✅ |
| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ |  | ✅ |  |
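
For reference, the strategy in this table is what a model selects via the `incremental_strategy` config; a minimal `dbt_project.yml` sketch, assuming a hypothetical project named `my_project`:

```yml
models:
  my_project:
    +materialized: incremental
    +incremental_strategy: merge   # must be a strategy the adapter supports (see the table above)
```
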
23 changes: 0 additions & 23 deletions website/docs/docs/build/snapshots.md
@@ -390,29 +390,6 @@ snapshots:

</VersionBlock>

## Snapshot query best practices

This section outlines some best practices for writing snapshot queries:

- #### Snapshot source data
As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data. Your models should then select from these snapshots, treating them like regular data sources.

- #### Use the `source` function in your query
This helps you understand <Term id="data-lineage">data lineage</Term> in your project.

- #### Include as many columns as possible
In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.

- #### Avoid joins in your snapshot query
Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.

- #### Limit the amount of transformation in your query
If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.

Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
* Selecting specific columns if the table is wide.
* Doing light transformation to get data into a reasonable shape, for example, unpacking a <Term id="json" /> blob to flatten your source data into columns.
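
Put together, a snapshot following these practices might look like this minimal sketch (the `jaffle_shop.orders` source is hypothetical; YAML snapshot syntax as in the example earlier in this file):

```yml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')   # raw source data, via the source() function
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at    # a reliable timestamp; no joins or transformation here
```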

## Snapshot meta-fields

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.
@@ -88,7 +88,7 @@ Please consider the following actions, as the steps you take will depend on the
<Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/connections-post-rollout-4.png" width="60%" title="Connections de-duplicated"/>

- Normalization
- Undertsand how new connections should be created to avoid local overrides. If you currently use extended attributes to override the warehouse instance in your production environment - you should instead create a new connection for that instance, and wire your production environment to it, removing the need for the local overrides
- Understand how new connections should be created to avoid local overrides. If you currently use extended attributes to override the warehouse instance in your production environment - you should instead create a new connection for that instance, and wire your production environment to it, removing the need for the local overrides
- Create new connections, update relevant environments to target these connections, and remove the now-unnecessary local overrides (which may not be all of them!)
- Test the new wiring by triggering jobs or starting IDE sessions

@@ -118,7 +118,7 @@ Once the connection is saved, a public key will be generated and displayed for t
To configure the SSH tunnel in dbt Cloud, you'll need to provide the hostname/IP of your bastion server, a username, and a port of your choosing for dbt Cloud to connect to. Review the following steps:

- Verify the bastion server has its network security rules set up to accept connections from the [dbt Cloud IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) on whatever port you configured.
- Set up the user account by using the bastion servers instance's CLI, The following example uses the username `dbtcloud:`
- Set up the user account by using the bastion servers instance's CLI, The following example uses the username `dbtcloud`:

```shell
sudo groupadd dbtcloud
```
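
A sketch of how this user setup typically continues, creating the `dbtcloud` user and authorizing the dbt Cloud public key (these commands are an assumption, not the verbatim continuation of the doc):

```shell
sudo useradd -m -g dbtcloud dbtcloud                               # create the user in the new group
sudo su - dbtcloud                                                 # switch to the dbtcloud user
mkdir -p ~/.ssh && chmod 700 ~/.ssh                                # prepare the .ssh directory
touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys   # holds the dbt Cloud public key
```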
24 changes: 24 additions & 0 deletions website/docs/reference/resource-configs/snowflake-configs.md
@@ -678,3 +678,27 @@ Per the [Snowflake documentation](https://docs.snowflake.com/en/sql-reference/in
>- DDL operations.
>- DML operations (for tables only).
>- Background maintenance operations on metadata performed by Snowflake.

<VersionBlock firstVersion="1.9">

## Pagination for object results

By default, when dbt encounters a schema with up to 100,000 objects, it will paginate the results from `show objects` at 10,000 per page for up to 10 pages.

Environments with more than 100,000 objects in a schema can customize the number of results per page and the page limit using the following [flags](/reference/global-configs/about-global-configs) in the `dbt_project.yml`:

- `list_relations_per_page` &mdash; The number of relations on each page (maximum 10,000, which is the most Snowflake allows).
- `list_relations_page_limit` &mdash; The maximum number of pages to include in the results.

For example, to include 10,000 objects per page and up to 100 pages (1 million objects), configure the flags as follows:

```yml
flags:
list_relations_per_page: 10000
list_relations_page_limit: 100
```

</VersionBlock>
