Skip to content

Commit

Permalink
Update readmes
Browse files Browse the repository at this point in the history
  • Loading branch information
jtcohen6 committed May 6, 2024
1 parent 431975c commit 13c99e8
Show file tree
Hide file tree
Showing 2 changed files with 32 additions and 34 deletions.
29 changes: 14 additions & 15 deletions core/dbt/artifacts/README.md
Original file line number Diff line number Diff line change
@@ -1,24 +1,23 @@
# dbt/artifacts

## Overview
This directory is meant to be a lightweight module that is independent (and upstream of) the rest of dbt-core internals.
This directory is meant to be a lightweight module that is independent (and upstream of) the rest of `dbt-core` internals.

It's primary responsibility is to define simple data classes that represent the versioned artifact schemas that dbt writes as JSON files throughout execution.
Its primary responsibility is to define simple data classes that represent the versioned artifact schemas that dbt writes as JSON files throughout execution.

Long term, this module may be released as a standalone package (e.g. dbt-artifacts) to support stable parsing dbt artifacts programmatically.
Eventually, this module may be released as a standalone package (e.g. `dbt-artifacts`) to support stable programmatic parsing of dbt artifacts.

`dbt/artifacts` is organized into artifact 'schemas' and 'resources'. Schemas represent the final serialized artifact object, while resources represent sub-components of the larger artifact schemas.
`dbt/artifacts` is organized into artifact 'schemas' and 'resources'. Schemas represent the final serialized artifact objects, while resources represent smaller components within those schemas.

### dbt/artifacts/schemas


Each major version of a schema under `dbt/artifacts/schema` is defined in its corresponding `dbt/artifacts/schema/<artifact-name>/v<version>` directory. Before `dbt/artifacts` artifact schemas were always modified in-place, which is why artifacts are missing class definitions for historical versions.
Each major version of a schema under `dbt/artifacts/schema` is defined in its corresponding `dbt/artifacts/schema/<artifact-name>/v<version>` directory. Before `dbt/artifacts` artifact schemas were always modified in-place, which is why older artifacts are those missing class definitions.

Currently, there are four artifact schemas defined in `dbt/artifact/schemas`:

| Artifact name | File | Class | Latest definition |
|---------------|------------------|----------------------------------|-----------------------------------|
| manifest | manifest.json | WritableManifest | dbt/artifacts/schema/manifest/v11 |
| manifest | manifest.json | WritableManifest | dbt/artifacts/schema/manifest/v12 |
| catalog | catalog.json | CatalogArtifact | dbt/artifacts/schema/catalog/v1 |
| run | run_results.json | RunResultsArtifact | dbt/artifacts/schema/run/v5 |
| freshness | sources.json | FreshnessExecutionResultArtifact | dbt/artifacts/schema/freshness/v3 |
Expand All @@ -32,27 +31,27 @@ All existing resources are defined under `dbt/artifacts/resources/v1`.

### Non-breaking changes

Freely make incremental, non-breaking changes in-place to the latest major version of any artifact in mantle (via minor or patch bumps). The only changes that are fully forward and backward compatible are:
Freely make incremental, non-breaking changes in-place to the latest major version of any artifact (minor or patch bumps). The only changes that are fully forward and backward compatible are:
* Adding a new field with a default
* Deleting an __optional__ field

### Breaking changes
A breaking change is anything that:
A breaking change is anything that:
* Deletes a required field
* Changes the name or type of an existing field
* Removes default from a field
* Removes the default value from a field

These should generally be avoided, and bundled together to aim for as minimal disruption across the integration ecosystem as possible.
These should be avoided however possible. When necessary, multiple breaking changes should be bundled together, to aim for minimal disruption across the ecosystem of tools that leverage dbt metadata.

However, when it comes time to make one (or more) of these, a new versioned artifact should be created as follows:
When it comes time to make breaking changes, a new versioned artifact should be created as follows:
1. Create a new version directory and file that defines the new artifact schema under `dbt/artifacts/schemas/<artifact>/v<next-artifact-version>/<artifact>.py`
2. If any resources are having breaking changes introduced, create a new resource class that defines the new resource schema under `dbt/artifacts/resources/v<next-resource-version>/<resource>.py`
3. Implement upgrade paths on the new versioned artifact class so it can be constructed given a dictionary representation of any previous version of the same artifact
* TODO: update once the design is finalized
* TODO: link example once available
4. Implement downgrade paths on all previous versions of the artifact class so they can still be constructed given a dictionary representation of the new artifact schema
* TODO: update once the design is finalized
* TODO: link example once available
5. Update the 'latest' aliases to point to the new version of the artifact and/or resource:
* Artifact: `dbt/artifacts/schemas/<artifact>/__init__.py `
* Resource: `dbt/artifacts/resources/__init__.py `

Downstream consumers (e.g. dbt-core) importing from the latest alias are susceptible to breaking changes. Ideally, any incompatibilities should be caught my static type checking in those systems. However, it is always possible for consumers to pin imports to previous versions via `dbt.artifacts.schemas.<artifact>.v<prev-version>`
Downstream consumers (e.g. `dbt-core`) importing from the latest alias are susceptible to breaking changes. Ideally, any incompatibilities should be caught my static type checking in those systems. However, it is always possible for consumers to pin imports to previous versions via `dbt.artifacts.schemas.<artifact>.v<prev-version>`.
37 changes: 18 additions & 19 deletions docs/eli64/behavior-change-flags.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,30 +4,29 @@ User documentation: https://docs.getdbt.com/reference/global-configs/legacy-beha

## Rules for introducing a new flag

1. **Naming.** All behavior change flags should be named so that their default value changes from **False → True**. This makes it significantly easier to document, talk about, and understand.
* If the flag is prohibiting something that we previously allowed, use the verb require. Examples:
1. **Naming.** All behavior change flags should be named so that their default value changes from **False → True**. This makes it significantly easier for us to document them and talk about them consistently, and it's more intuitive for end users.
a. If the flag is prohibiting something that we previously allowed, use the verb "require." Examples:
* `require_resource_names_without_spaces`
* `require_explicit_package_overrides_for_builtin_materializations`
* All flags should be of boolean type, and False by default when introduced: `bool = False`.
2. **Documentation.** Start with the docs! What is the change? Who might be affected? What action will users need to take to mitigate this change? At this point, the dates for flag Introduction + Maturity are TBD.
b. All flags should be of boolean type, and False by default when introduced: `bool = False`.
2. **Documentation.** Start with the docs. What is the change? Who might be affected? What action will users need to take to mitigate this change? At this point, the dates for flag Introduction + Maturity are "TBD."
3. **Deprecation warnings**. As a general rule, **all** behavior changes should be accompanied by a deprecation warning.
* Always use our standard deprecations module: [https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/deprecations.py](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/deprecations.py)
* This serves two purposes: Signalling the change to the user, and collecting telemetry so we can understand blast radius among users with telemtry enabled.
* These warning messages should link back to documentation: [https://docs.getdbt.com/reference/global-configs/legacy-behaviors](https://docs.getdbt.com/reference/global-configs/legacy-behaviors#deprecate_package_materialization_builtin_override)
* Even for additive behaviors that are not breaking changes, there is still an opportunity to signal these changes for users, and to gather an estimate of the impact. E.g. `source_freshness_run_project_hooks` should still include a proactive message any time someone runs the `source freshness` command in a project that has `on-run-*` hooks defined.
* The call site for these deprecation warnings should be as close as possible to the place where we’re evaluating conditional logic based on the project flag. Essentially, any time we check the flag value and it returns `False`, we should raise a deprecation warning while preserving the legacy behavior. (In the future, we might be able to streamline more of this boilerplate code.)
* If users want to silence these deprecation warnings, they can do so via `warn_error_options.silence`. Explicitly setting the flag to `False` in `dbt_project.yml` is not sufficient to silence the warning.
a. Always use our standard deprecations module: [https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/deprecations.py](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/deprecations.py)
b. This serves two purposes: Signalling the change to the user, and collecting telemetry so we can understand blast radius among users with telemtry enabled.
c. These warning messages should link back to documentation: [https://docs.getdbt.com/reference/global-configs/legacy-behaviors](https://docs.getdbt.com/reference/global-configs/legacy-behaviors#deprecate_package_materialization_builtin_override)
d. Even for additive behaviors that are not "breaking changes," there is still an opportunity to signal these changes for users, and to gather an estimate of the impact. E.g. `source_freshness_run_project_hooks` should still include a proactive message any time someone runs the `source freshness` command in a project that has `on-run-*` hooks defined.
e. The call site for these deprecation warnings should be as close as possible to the place where we’re evaluating conditional logic based on the project flag. Essentially, any time we check the flag value and it returns `False`, we should raise a deprecation warning while preserving the legacy behavior. (In the future, we might be able to streamline more of this boilerplate code.)
f. If users want to silence these deprecation warnings, they can do so via [`warn_error_options.silence`](https://docs.getdbt.com/reference/global-configs/warnings). Explicitly setting the flag to `False` in `dbt_project.yml` is not sufficient to silence the warning.
4. **Exceptions.** If the behavior change is to raise an exception that prohibits behavior which was previously permitted (e.g. spaces in model names), the exception message should also link to the docs on legacy behaviors.
5. **Backports.** Whenever possible, we should backport both the deprecation warning and the flag to the previous version of dbt Core.
6. **Open a GitHub issue** in the dbt-core repository that is the implementation ticket for switching the default from `false` to `true`. Add the `behavior_change_flag` issue label, and add it to the GitHub milestone for the next minor version. (This is true in most cases, see below for exceptional considerations.) During planning, we will bundle up the introduced behavior changes into an epic/tasklist that schedules their maturation.
6. **Open a GitHub issue** in the dbt-core repository that is the implementation ticket for switching the default from `false` to `true`. Add the `behavior_change_flag` issue label, and add it to the GitHub milestone for the next minor version. (This is true in most cases, see below for exceptional considerations.) During planning, we will bundle up the "introduced" behavior changes into an epic/tasklist that schedules their maturation.

## After introduction

1. **Mature flag(s) by switching value from `False``True` in dbt-core `main`.**
* This should land in **the next minor (`1.X.0`) release of dbt-core**.
If the behavior change is mitigating a security vulnerability, and the next minor release is still planned for several months away, we still backport the fix + flag (off by default) to supported OSS versions, and we strongly advise all users to opt into the flag sooner.
2. **Removing support for legacy behavior.**
* As a general rule, we will not entirely remove support for any legacy behaviors until dbt v2.0.
* We are not committing to supporting them forever (à la Rust editions). But we are also not taking them away willy-nilly.
* On a case-by-case basis, if there is a strong compelling reason to remove a legacy behavior and we see minimal in-the-wild usage (<1% of relevant projects), we can remove it entirely. This needs to be communicated well in advance — at least 2 minor versions after introduction in dbt Core.
* These are *project configurations*, not feature flags. While they add complexity to our codebase, such is the price of maintaining v1.* software.
1. **Maturing flag(s) by switching value from `False``True` in dbt-core `main`.**
a. This should land in **the next minor (`1.X.0`) release of dbt-core**.
b. If the behavior change is mitigating a security vulnerability, and the next minor release is still planned for several months away, we still backport the fix + flag (off by default) to supported OSS versions, and we strongly advise all users to opt into the flag sooner.
2. **Removing support for legacy behaviors.**
a. As a general rule, we will not entirely remove support for any legacy behaviors until dbt v2.0. At the same time, we are not committing to supporting them forever (à la Rust editions). We need to strike the right balance between _too fast_ and _never_.
c. On a case-by-case basis, if there is a strong compelling reason to remove a legacy behavior and we see minimal in-the-wild usage (<1% of relevant projects), we can remove it entirely. This needs to be communicated well in advance — at least 2 minor versions after introduction in dbt Core.
d. These are project configurations, not temporary feature flags. They add complexity to our codebase; that complexity compounds the more we have, and the longer we have them. Such is the price of maintaining mature v1.* software.

0 comments on commit 13c99e8

Please sign in to comment.