diff --git a/README.md b/README.md
index c749fedf95a..d306651f545 100644
--- a/README.md
+++ b/README.md
@@ -62,18 +62,3 @@ You can click a link available in a Vercel bot PR comment to see and review your
 
 Advisory:
 - If you run into an `fatal error: 'vips/vips8' file not found` error when you run `npm install`, you may need to run `brew install vips`. Warning: this one will take a while -- go ahead and grab some coffee!
-
-## Running the Cypress tests locally
-
-Method 1: Utilizing the Cypress GUI
-1. `cd` into the repo: `cd docs.getdbt.com`
-2. `cd` into the `website` subdirectory: `cd website`
-3. Install the required node packages: `npm install`
-4. Run `npx cypress open` to open the Cypress GUI, and choose `E2E Testing` as the Testing Type, before finally selecting your browser and clicking `Start E2E testing in {broswer}`
-5. Click on a test and watch it run!
-
-Method 2: Running the Cypress E2E tests headlessly
-1. `cd` into the repo: `cd docs.getdbt.com`
-2. `cd` into the `website` subdirectory: `cd website`
-3. Install the required node packages: `npm install`
-4. Run `npx cypress run`
diff --git a/contributing/developer-blog.md b/contributing/developer-blog.md
deleted file mode 100644
index 0d9b3becba2..00000000000
--- a/contributing/developer-blog.md
+++ /dev/null
@@ -1,67 +0,0 @@
-
-* [Contributing](#contributing)
-* [Core Principles](#core-principles)
-
-## Contributing
-
-The dbt Developer Blog is a place where analytics practitioners can go to share their knowledge with the community. Analytics Engineering is a discipline we’re all building together. The developer blog exists to cultivate the collective knowledge that exists on how to build and scale effective data teams.
-
-We currently have editorial capacity for a few Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts.
-
-### What makes a good developer blog post?
-
-- The short answer: Practical, hands on analytics engineering tutorials and stories
-  - [Slim CI/CD with Bitbucket](https://docs.getdbt.com/blog/slim-ci-cd-with-bitbucket-pipelines)
-  - [So You Want to Build a dbt Package](https://docs.getdbt.com/blog/so-you-want-to-build-a-package)
-  - [Founding an Analytics Engineering Team](https://docs.getdbt.com/blog/founding-an-analytics-engineering-team-smartsheet)
-- See the [Developer Blog Core Principles](#core-principles)
-
-### How do I submit a proposed post?
-
-To submit a proposed post, open a `Contribute to the dbt Developer Blog` issue on the [Developer Hub repo](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose). You will be asked for:
-
-- A short (one paragraph) summary of the post you’d like to publish
-- An outline of the post
-
-You’ll hear back from a member of the dbt Labs teams within 7 days with one of three responses:
-
-- The post looks good to go as is! We’ll ask you to start creating a draft based off of the initial outline you submitted
-- Proposed changes to the outline. This could be additional focus on a topic you mention that’s of high community interest or a tweak to the structure to help with narrative flow
-- Not a fit for the developer blog right now. We hugely appreciate *any* interest in submitting to the Developer Blog - right now our biggest backlog is capacity to help folks get these published. See below on how we are thinking about and evaluating potential posts.
-
-### What is the process once my blog is accepted?
-
-Once a blog is accepted, we’ll ask you for a date when we can expect the draft by. Typically we’ll ask that you can commit to having this ready within a month of submitting the issue.
-
-Once you submit a draft, we’ll return a first set of edits within 5 business days.
-
-The typical turnaround time from issue creation to going live on the developer blog is ~4 to 6 weeks.
-
-### What happens after my blog is published?
-
-We’ll share the blog on the dbt Labs social media channels! We also encourage you to share on the dbt Slack in #i-made-this.
-
-### What if my post doesn’t get approved?
-
-We want to publish as many community contributors as possible, but not every post will be a fit for the Developer Blog. That’s ok! There are many different reasons why we might not be able to publish a post right now and none of them reflect on the quality of the proposed post.
-
-- **dbt Labs capacity**: We’re committed to providing hands-on feedback and coaching throughout the process. Our goal is not just to generate great developer blogs - it’s to help build a community of great writers / practitioners who can share their knowledge with the community for years to come. This necessarily means we will be able to take on a lower absolute number of posts in the short term, but will hopefully be helpful for the community long term.
-- **Focus on narrative / problem solving - not industry trends**: The developer blog exists, primarily, to tell the stories of analytics engineering practitioners and how they solve problems. The idea is that reading the developer blog gives a feel for what it is like to be a data practitioner on the ground today. This is not a hard and fast rule, but a good way to approach this is “How I/we solved X problem” rather than “How everyone should solve X problem”.
-
-We are very interested in stacks, new tools and integrations and will happily publish posts about this - with the caveat that the *focus* of the post should be solving real world problems. Hopefully if you are writing about these, this is something that you have used yourself in a hands on, production implementation.
-
-- **Right sized scope**: We want to be able to cover a topic in-depth and dig into the nuances. Big topics like “How should you structure your data team” or “How to ensure data quality in your organization” will be tough to cover in the scope of a single post. If you have a big idea - try subdividing it! “How should you structure your data team” could become “How we successfully partnered with our RevOps team on improving lead tracking” and “How to ensure data quality in your organization” might be “How we cleaned up our utm tracking”.
-
-### What if I need help / have questions:
-
-- Feel free to post any questions in #community-writers on the dbt Slack.
-
-## Core Principles
-
-- 🧑🏻‍🤝‍🧑🏾 The dbt Developer blog is written by humans **- individual analytics professionals sharing their insight with the world. To the extent feasible, a community member posting on the developer blog is not staking an official organizational stance, but something that *they* have learned or believe based on their work. This is true for dbt Labs employees as well.
-- 💍 Developer blog content is knowledge rich - these are posts that readers share, bookmark and come back to time and time again.
-- ⛹🏼‍♂️ Developer blog content is written by and for *practitioners* - end users of analytics tools (and sometimes people that work with practitioners).
-- ⭐ Developer blog content is best when it is *the story which the author is uniquely positioned to tell.* Authors are encouraged to consider what insight they have that is specific to them and the work they have done.
-- 🏎️ Developer blog content is actionable - readers walk away with a clear sense of how they can use this information to be a more effective practitioner. Posts include code snippets, Loom walkthroughs and hands-on, practical information that can be integrated into daily workflows.
-- 🤏 Nothing is too small to share - what you think is simple has the potential to change someone's week.
-- 🔮 Developer blog content is present focused —posts tell a story of a thing that you've already done or are actively doing, not something that you may do in the future.
diff --git a/website/docs/docs/build/incremental-strategy.md b/website/docs/docs/build/incremental-strategy.md
index 30de135b09b..1fb35ba637c 100644
--- a/website/docs/docs/build/incremental-strategy.md
+++ b/website/docs/docs/build/incremental-strategy.md
@@ -27,7 +27,7 @@ Click the name of the adapter in the below table for more information about supp
 | Data platform adapter | `append` | `merge` | `delete+insert` | `insert_overwrite` | `microbatch` |
 |-----------------------|:--------:|:-------:|:---------------:|:------------------:|:-------------------:|
 | [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | ✅ |
-| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | |
+| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | ✅ |
 | [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | | ✅ | | ✅ | ✅ |
 | [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | ✅ | ✅ | | ✅ | ✅ |
 | [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | |
diff --git a/website/docs/docs/build/snapshots.md b/website/docs/docs/build/snapshots.md
index f5321aa626a..8045dac117b 100644
--- a/website/docs/docs/build/snapshots.md
+++ b/website/docs/docs/build/snapshots.md
@@ -390,29 +390,6 @@ snapshots:
 
 
 
-## Snapshot query best practices
-
-This section outlines some best practices for writing snapshot queries:
-
-- #### Snapshot source data
-  Your models should then select from these snapshots, treating them like regular data sources. As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data
-
-- #### Use the `source` function in your query
-  This helps when understanding data lineage in your project.
-
-- #### Include as many columns as possible
-  In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.
-
-- #### Avoid joins in your snapshot query
-  Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.
-
-- #### Limit the amount of transformation in your query
-  If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.
-
-Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
-* Selecting specific columns if the table is wide.
-* Doing light transformation to get data into a reasonable shape, for example, unpacking a blob to flatten your source data into columns.
-
 ## Snapshot meta-fields
 
 Snapshot tables will be created as a clone of your source dataset, plus some additional meta-fields*.
diff --git a/website/docs/docs/cloud/connect-data-platform/about-connections.md b/website/docs/docs/cloud/connect-data-platform/about-connections.md
index 89dd13808ec..6497e86de89 100644
--- a/website/docs/docs/cloud/connect-data-platform/about-connections.md
+++ b/website/docs/docs/cloud/connect-data-platform/about-connections.md
@@ -88,7 +88,7 @@ Please consider the following actions, as the steps you take will depend on the
 
 - Normalization
 
-  - Undertsand how new connections should be created to avoid local overrides. If you currently use extended attributes to override the warehouse instance in your production environment - you should instead create a new connection for that instance, and wire your production environment to it, removing the need for the local overrides
+  - Understand how new connections should be created to avoid local overrides. If you currently use extended attributes to override the warehouse instance in your production environment - you should instead create a new connection for that instance, and wire your production environment to it, removing the need for the local overrides
   - Create new connections, update relevant environments to target these connections, removing now unecessary local overrides (which may not be all of them!)
 - Test the new wiring by triggering jobs or starting IDE sessions
 
diff --git a/website/docs/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb.md b/website/docs/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb.md
index 4719095b87f..5be802cae77 100644
--- a/website/docs/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb.md
+++ b/website/docs/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb.md
@@ -118,7 +118,7 @@ Once the connection is saved, a public key will be generated and displayed for t
 To configure the SSH tunnel in dbt Cloud, you'll need to provide the hostname/IP of your bastion server, username, and port, of your choosing, that dbt Cloud will connect to. Review the following steps:
 
 - Verify the bastion server has its network security rules set up to accept connections from the [dbt Cloud IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) on whatever port you configured.
-- Set up the user account by using the bastion servers instance's CLI, The following example uses the username `dbtcloud:`
+- Set up the user account by using the bastion servers instance's CLI, The following example uses the username `dbtcloud`:
 
 ```shell
 sudo groupadd dbtcloud
 ```
diff --git a/website/docs/reference/resource-configs/snowflake-configs.md b/website/docs/reference/resource-configs/snowflake-configs.md
index b95b79241ba..7bef180e3d3 100644
--- a/website/docs/reference/resource-configs/snowflake-configs.md
+++ b/website/docs/reference/resource-configs/snowflake-configs.md
@@ -678,3 +678,27 @@ Per the [Snowflake documentation](https://docs.snowflake.com/en/sql-reference/in
 >- DDL operations.
 >- DML operations (for tables only).
 >- Background maintenance operations on metadata performed by Snowflake.
+
+
+
+## Pagination for object results
+
+By default, when dbt encounters a schema with up to 100,000 objects, it will paginate the results from `show objects` at 10,000 per page for up to 10 pages.
+
+Environments with more than 100,000 objects in a schema can customize the number of results per page and the page limit using the following [flags](/reference/global-configs/about-global-configs) in the `dbt_project.yml`:
+
+- `list_relations_per_page` — The number of relations on each page (Max 10k as this is the most Snowflake allows).
+- `list_relations_page_limit` — The maximum number of pages to include in the results.
+
+For example, if you wanted to include 10,000 objects per page and include up to 100 pages (1 million objects), configure the flags as follows:
+
+
+```yml
+
+flags:
+  list_relations_per_page: 10000
+  list_relations_page_limit: 100
+
+```
+
+
\ No newline at end of file