resolved merge conflicts
thomend committed Dec 3, 2024
2 parents 512b04f + 9f6aa67 commit f2b02a1
Showing 48 changed files with 2,386 additions and 1,186 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -18,7 +18,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ["3.10", "3.11", "3.12"]
python: ["3.10", "3.11", "3.12", "3.13"]

steps:
- uses: actions/checkout@v4
26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,32 @@ and this project adheres to

## [Unreleased]

- Documenting support for Python 3.13. (#86)

## [0.8.0] - 2024-11-12

- Support linting of sources.
- **Breaking**: Renamed modules: `dbt_score.model_filter` becomes
`dbt_score.rule_filter`
- **Breaking**: Renamed filter class and decorator: `@model_filter` becomes
`@rule_filter` and `ModelFilter` becomes `RuleFilter`.
- **Breaking**: Config option `model_filter_names` becomes `rule_filter_names`.
- **Breaking**: CLI flag naming fixes: `--fail_any_model_under` becomes
`--fail-any-item-under` and `--fail_project_under` becomes
`--fail-project-under`.

## [0.7.1] - 2024-11-01

- Fix mkdocs.

## [0.7.0] - 2024-11-01

- **Breaking**: The rule `public_model_has_example_sql` has been renamed
`has_example_sql` and applies by default to all models.
- **Breaking**: Remove `dbt-core` from dependencies. Since it is not mandatory
for `dbt-score` to execute `dbt`, remove the dependency.
- **Breaking**: Stop using `MultiOption` selection type.

## [0.6.0] - 2024-08-23

- **Breaking**: Improve error handling in CLI. Log messages are written in
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@

## What is `dbt-score`?

`dbt-score` is a linter for dbt model metadata.
`dbt-score` is a linter for dbt metadata.

[dbt][dbt] (Data Build Tool) is a great framework for creating, building,
organizing, testing and documenting _data models_, i.e. data sets living in a
10 changes: 5 additions & 5 deletions docs/configuration.md
@@ -18,7 +18,7 @@ rule_namespaces = ["dbt_score.rules", "dbt_score_rules", "custom_rules"]
disabled_rules = ["dbt_score.rules.generic.columns_have_description"]
inject_cwd_in_python_path = true
fail_project_under = 7.5
fail_any_model_under = 8.0
fail_any_item_under = 8.0

[tool.dbt-score.badges]
first.threshold = 10.0
@@ -51,8 +51,8 @@ The following options can be set in the `pyproject.toml` file:
- `disabled_rules`: A list of rules to disable.
- `fail_project_under` (default: `5.0`): If the project score is below this
value the command will fail with return code 1.
- `fail_any_model_under` (default: `5.0`): If any model scores below this value
the command will fail with return code 1.
- `fail_any_item_under` (default: `5.0`): If any model or source scores below
this value the command will fail with return code 1.

#### Badges configuration

@@ -70,7 +70,7 @@ All badges except `wip` can be configured with the following option:

- `threshold`: The threshold for the badge. A decimal number between `0.0` and
`10.0` that will be used to compare to the score. The threshold is the minimum
score required for a model to be rewarded with a certain badge.
score required for a model or source to be rewarded with a certain badge.

The default values can be found in the
[BadgeConfig](reference/config.md#dbt_score.config.BadgeConfig).
@@ -86,7 +86,7 @@ Every rule can be configured with the following option:
- `severity`: The severity of the rule. Rules have a default severity and can be
overridden. It's an integer with a minimum value of 1 and a maximum value
of 4.
- `model_filter_names`: Filters used by the rule. Takes a list of names that can
- `rule_filter_names`: Filters used by the rule. Takes a list of names that can
be found in the same namespace as the rules (see
[Package rules](package_rules.md)).

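The renames in this commit (`fail_any_model_under` to `fail_any_item_under`, `model_filter_names` to `rule_filter_names`) mean existing configuration needs updating. The following is a minimal sketch of how an already-parsed `[tool.dbt-score]` table could be migrated. The helper `migrate_config` and the `RENAMED_KEYS` mapping are hypothetical names introduced here for illustration; they are not part of `dbt-score`.

```python
# Hypothetical helper, not part of dbt-score: renames pre-0.8.0 config keys.
RENAMED_KEYS = {
    "fail_any_model_under": "fail_any_item_under",
    "model_filter_names": "rule_filter_names",
}


def migrate_config(section: dict) -> dict:
    """Return a copy of `section` with old keys renamed, recursing into nested tables."""
    migrated = {}
    for key, value in section.items():
        new_key = RENAMED_KEYS.get(key, key)
        # Nested tables (for example per-rule configuration) are migrated too.
        migrated[new_key] = migrate_config(value) if isinstance(value, dict) else value
    return migrated


old_config = {
    "fail_project_under": 7.5,
    "fail_any_model_under": 8.0,
    "rules": {
        "dbt_score.rules.generic.has_owner": {
            "severity": 2,
            "model_filter_names": ["only_schema_x"],
        }
    },
}
new_config = migrate_config(old_config)
```

The function walks every nested table rather than assuming a fixed layout, so it does not depend on how deeply the per-rule sections are nested.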
71 changes: 56 additions & 15 deletions docs/create_rules.md
@@ -1,9 +1,9 @@
# Create rules

In order to lint and score models, `dbt-score` uses a set of rules that are
applied to each model. A rule can pass or fail when it is run. Based on the
severity of the rule, models are scored with the weighted average of the rules
results. Note that `dbt-score` comes bundled with a
In order to lint and score models or sources, `dbt-score` uses a set of rules
that are applied to each item. A rule can pass or fail when it is run. Based on
the severity of the rule, items are scored with the weighted average of the
rules results. Note that `dbt-score` comes bundled with a
[set of default rules](rules/generic.md).

On top of the generic rules, it's possible to add your own rules. Two ways exist
@@ -21,7 +21,7 @@ The `@rule` decorator can be used to easily create a new rule:
from dbt_score import Model, rule, RuleViolation

@rule
def has_description(model: Model) -> RuleViolation | None:
def model_has_description(model: Model) -> RuleViolation | None:
    """A model should have a description."""
    if not model.description:
        return RuleViolation(message="Model lacks a description.")
@@ -31,6 +31,21 @@ The name of the function is the name of the rule and the docstring of the
function is its description. Therefore, it is important to use a
self-explanatory name for the function and document it well.

The type annotation for the rule's argument dictates whether the rule should be
applied to dbt models or sources.

Here is the same example rule, applied to sources:

```python
from dbt_score import rule, RuleViolation, Source

@rule
def source_has_description(source: Source) -> RuleViolation | None:
    """A source should have a description."""
    if not source.description:
        return RuleViolation(message="Source lacks a description.")
```

The severity of a rule can be set using the `severity` argument:

```python
@@ -45,15 +60,23 @@ For more advanced use cases, a rule can be created by inheriting from the `Rule`
class:

```python
from dbt_score import Model, Rule, RuleViolation
from dbt_score import Model, Rule, RuleViolation, Source

class HasDescription(Rule):
class ModelHasDescription(Rule):
    description = "A model should have a description."

    def evaluate(self, model: Model) -> RuleViolation | None:
        """Evaluate the rule."""
        if not model.description:
            return RuleViolation(message="Model lacks a description.")

class SourceHasDescription(Rule):
    description = "A source should have a description."

    def evaluate(self, source: Source) -> RuleViolation | None:
        """Evaluate the rule."""
        if not source.description:
            return RuleViolation(message="Source lacks a description.")
```

### Rules location
@@ -91,30 +114,48 @@ def sql_has_reasonable_number_of_lines(model: Model, max_lines: int = 200) -> Ru
)
```

### Filtering models
### Filtering rules

Custom and standard rules can be configured to have model filters. Filters allow
models to be ignored by one or multiple rules.
Custom and standard rules can be configured to have filters. Filters allow
models or sources to be ignored by one or multiple rules if the item doesn't
satisfy the filter criteria.

Filters are created using the same discovery mechanism and interface as custom
rules, except they do not accept parameters. Similar to Python's built-in
`filter` function, when the filter evaluation returns `True` the model should be
`filter` function, when the filter evaluation returns `True` the item should be
evaluated, otherwise it should be ignored.

```python
from dbt_score import ModelFilter, model_filter
from dbt_score import Model, RuleFilter, rule_filter

@model_filter
@rule_filter
def only_schema_x(model: Model) -> bool:
    """Only applies a rule to schema X."""
    return model.schema.lower() == 'x'

class SkipSchemaY(ModelFilter):
class SkipSchemaY(RuleFilter):
    description = "Applies a rule to every schema but Y."
    def evaluate(self, model: Model) -> bool:
        return model.schema.lower() != 'y'
```

Filters also rely on type-annotations to dictate whether they apply to models or
sources:

```python
from dbt_score import RuleFilter, rule_filter, Source

@rule_filter
def only_from_source_a(source: Source) -> bool:
    """Only applies a rule to source tables from source A."""
    return source.source_name.lower() == 'a'

class SkipSourceDatabaseB(RuleFilter):
    description = "Applies a rule to every source except Database B."
    def evaluate(self, source: Source) -> bool:
        return source.database.lower() != 'b'
```

Similar to setting a rule severity, standard rules can have filters set in the
[configuration file](configuration.md/#tooldbt-scorerulesrule_namespacerule_name),
while custom rules accept the configuration file or a decorator parameter.
@@ -123,7 +164,7 @@ while custom rules accept the configuration file or a decorator parameter.
from dbt_score import Model, rule, RuleViolation
from my_project import only_schema_x

@rule(model_filters={only_schema_x()})
@rule(rule_filters={only_schema_x()})
def models_in_x_follow_naming_standard(model: Model) -> RuleViolation | None:
    """Models in schema X must follow the naming standard."""
    if some_regex_fails(model.name):
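The `docs/create_rules.md` changes above state that the type annotation of a rule's argument dictates whether the rule applies to models or sources. The sketch below illustrates one way such annotation-based dispatch could work. It is not `dbt-score`'s actual implementation: `Model` and `Source` here are simplified stand-in dataclasses, rules return a plain message string instead of a `RuleViolation`, and `target_type` and `run_rules` are hypothetical helpers.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, Optional, get_type_hints


# Stand-ins for dbt-score's Model and Source classes (assumption for this sketch).
@dataclass
class Model:
    name: str
    description: str = ""


@dataclass
class Source:
    name: str
    description: str = ""


def target_type(func: Callable) -> type:
    """Return the annotated type of the function's first parameter."""
    first_param = next(iter(inspect.signature(func).parameters))
    return get_type_hints(func)[first_param]


def model_has_description(model: Model) -> Optional[str]:
    """A model should have a description."""
    return None if model.description else "Model lacks a description."


def source_has_description(source: Source) -> Optional[str]:
    """A source should have a description."""
    return None if source.description else "Source lacks a description."


def run_rules(rules, items) -> dict:
    """Apply each rule only to items matching the rule's annotated type."""
    results = {}
    for item in items:
        for rule in rules:
            if isinstance(item, target_type(rule)):
                results[(item.name, rule.__name__)] = rule(item)
    return results


results = run_rules(
    [model_has_description, source_has_description],
    [Model("customers", "All customers"), Source("raw_orders")],
)
```

In this sketch the model rule never sees the source and vice versa, so each item is only scored against rules written for its type.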
8 changes: 4 additions & 4 deletions docs/get_started.md
@@ -11,8 +11,8 @@ Installation of `dbt-score` is simple:
pip install dbt-score
```

If a virtual environment is used to run dbt, make sure to install `dbt-score` in
the same environment.
In order to run `dbt-score` with all its features, be sure to install
`dbt-score` in the same environment as `dbt-core`.

## Usage

@@ -40,8 +40,8 @@ It's also possible to automatically run `dbt parse`, to generate the
dbt-score lint --run-dbt-parse
```

To lint only a selection of models, the argument `--select` can be used. It
accepts any
To lint only a selection of models or sources, the argument `--select` can be
used. It accepts any
[dbt node selection syntax](https://docs.getdbt.com/reference/node-selection/syntax):

```shell
29 changes: 15 additions & 14 deletions docs/index.md
@@ -2,41 +2,42 @@

`dbt-score` is a linter for [dbt](https://www.getdbt.com/) metadata.

dbt allows data practitioners to organize their data into _models_. Those
models have metadata associated with them: documentation, tests, types, etc.
dbt allows data practitioners to organize their data into _models_ and
_sources_. Those models and sources have metadata associated with them:
documentation, tests, types, etc.

`dbt-score` lets you lint and score this metadata, in order to enforce (or
encourage) good practices.

## Example

```
> dbt-score lint --show-all
🥇 customers (score: 10.0)
> dbt-score lint --show all
🥇 M: customers (score: 10.0)
OK dbt_score.rules.generic.has_description
OK dbt_score.rules.generic.has_owner: Model lacks an owner.
OK dbt_score.rules.generic.has_owner
OK dbt_score.rules.generic.sql_has_reasonable_number_of_lines
Score: 10.0 🥇
```

In this example, the model `customers` scores the maximum value of `10.0` as it
passes all the rules. It also is awarded a golden medal because of the perfect
score. By default a passing model with or without rule violations will not be shown,
unless we pass the `--show-all` flag.
score. By default a passing model with or without rule violations will not be
shown, unless we pass the `--show-all` flag.

## Philosophy

dbt models are often used as metadata containers: either in YAML files or
through the use of `{{ config() }}` blocks, they are associated with a lot of
dbt models/sources are often used as metadata containers: either in YAML files
or through the use of `{{ config() }}` blocks, they are associated with a lot of
information. At scale, it becomes tedious to enforce good practices in large
data teams dealing with many models.
data teams dealing with many models/sources.

To that end, `dbt-score` has 2 main features:

- It runs rules on models, and displays rule violations. Those can be used in
interactive environments or in CI.
- Using those run results, it scores models, as to give them a measure of their
maturity. This score can help gamify model metadata improvements, and be
- It runs rules on dbt models and sources, and displays any rule violations.
These can be used in interactive environments or in CI.
- Using those run results, it scores items, to ascribe them a measure of their
maturity. This score can help gamify metadata improvements/coverage, and be
reflected in data catalogs.

`dbt-score` aims to:
8 changes: 4 additions & 4 deletions docs/programmatic_invocations.md
@@ -32,7 +32,7 @@ $ dbt-score lint --format json
"severity": "medium",
"message": "Model lacks an owner."
},
"dbt_score.rules.generic.public_model_has_example_sql": {
"dbt_score.rules.generic.has_example_sql": {
"result": "OK",
"severity": "low",
"message": null
@@ -61,9 +61,9 @@ When `dbt-score` terminates, it exits with one of the following exit codes:
project being linted either doesn't raise any warning, or the warnings are
small enough to be above the thresholds. This generally means "successful
linting".
- `1` in case of linting errors. This is the unhappy case: some models in the
project raise enough warnings to have a score below the defined thresholds.
This generally means "linting doesn't pass".
- `1` in case of linting errors. This is the unhappy case: some models or
sources in the project raise enough warnings to have a score below the defined
thresholds. This generally means "linting doesn't pass".
- `2` in case of an unexpected error. This happens for example if something is
misconfigured (for example a faulty dbt project), or the wrong parameters are
given to the CLI. This generally means "setup needs to be fixed".
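
The exit codes listed above lend themselves to scripting, for example in CI. A minimal sketch, assuming `dbt-score` is installed and on the `PATH`; the names `EXIT_CODE_MEANINGS`, `describe_exit_code` and `run_lint` are hypothetical and only used for illustration:

```python
import subprocess

# Meanings taken from the exit code list in docs/programmatic_invocations.md.
EXIT_CODE_MEANINGS = {
    0: "successful linting",
    1: "linting doesn't pass",
    2: "setup needs to be fixed",
}


def describe_exit_code(code: int) -> str:
    """Translate a dbt-score exit code into a short human-readable summary."""
    return EXIT_CODE_MEANINGS.get(code, f"unexpected exit code {code}")


def run_lint() -> str:
    """Run `dbt-score lint` and describe the outcome (requires dbt-score on PATH)."""
    completed = subprocess.run(["dbt-score", "lint"], check=False)
    return describe_exit_code(completed.returncode)
```

`check=False` keeps `subprocess.run` from raising on a non-zero exit code, so the caller can decide how to treat codes `1` and `2` differently.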