Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support configuration using pyproject.toml #16

Merged
merged 30 commits into from
May 14, 2024
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
30 commits
Select commit Hold shift + click to select a range
7736f7f
Support configuration using pyproject.toml
jochemvandooren May 1, 2024
376c6bc
Formatting
jochemvandooren May 1, 2024
c769daa
Fix existing tests
jochemvandooren May 1, 2024
0d278bf
Add check for pyproject.toml
jochemvandooren May 1, 2024
92398fc
Make rule_config an optional parameter
jochemvandooren May 1, 2024
1ab1c9e
Make general config an optional parameter
jochemvandooren May 1, 2024
651fd53
Add tests
jochemvandooren May 1, 2024
3622da3
Add mkdocs files
jochemvandooren May 1, 2024
58b15b4
Merge branch 'master' into jvandooren/rule-configuration
jochemvandooren May 1, 2024
a465a29
rule-configuration Fix formatting
jochemvandooren May 1, 2024
933207d
Add a test for disabling rules
jochemvandooren May 2, 2024
dc58c5a
Improve docstrings
jochemvandooren May 2, 2024
e62836d
Update src/dbt_score/rule.py
jochemvandooren May 3, 2024
4603f43
Process review comments
jochemvandooren May 3, 2024
edc2766
Make config defaults Final
jochemvandooren May 3, 2024
566b561
Drop python 3.10 support
jochemvandooren May 3, 2024
1abff4f
Rename config_parser to config
jochemvandooren May 3, 2024
7abc430
Revert copying rule config
jochemvandooren May 3, 2024
8cadd1a
Improve logging and error
jochemvandooren May 3, 2024
862da7f
Remove unused fixtures
jochemvandooren May 3, 2024
06e0957
Improve test
jochemvandooren May 3, 2024
da190b7
Fix python version in readme
jochemvandooren May 3, 2024
12d49fc
Fix type hints
jochemvandooren May 3, 2024
ae091fe
Address comments
jochemvandooren May 3, 2024
99c4da3
Update tests/test_cli.py
jochemvandooren May 10, 2024
a50d027
Require config in RuleRegistry
jochemvandooren May 10, 2024
52080c0
Convert Severity in RuleConfig
jochemvandooren May 10, 2024
923d020
Change params to config and make get_config_file static
jochemvandooren May 13, 2024
4af4971
Use rule.source() instead of reconstructing name
jochemvandooren May 14, 2024
87998db
Use proper namespaces in tests
jochemvandooren May 14, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,14 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python: ["3.10", "3.11"]
python: ["3.11"]
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved

steps:
- uses: actions/checkout@v4
- name: Set up PDM
uses: pdm-project/setup-pdm@v4
with:
python-version: "3.10"
python-version: "3.11"
- name: Install dependencies
run: |
pdm sync -d
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ Linter for dbt model metadata.

You'll need the following prerequisites:

- Any Python version starting from 3.10
- Any Python version starting from 3.11
- [pre-commit](https://pre-commit.com/)
- [PDM](https://pdm-project.org/2.12/)

Expand Down
3 changes: 3 additions & 0 deletions docs/reference/config.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
# Config

::: dbt_score.config
1 change: 1 addition & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ nav:
- Home: index.md
- Reference:
- reference/cli.md
- reference/config.md
- reference/exceptions.md
- reference/evaluation.md
- reference/models.md
Expand Down
4 changes: 2 additions & 2 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ dependencies = [
"dbt-core>=1.5",
"click>=7.1.1, <9.0.0",
]
requires-python = ">=3.10"
requires-python = ">=3.11"
readme = "README.md"
license = {text = "MIT"}

Expand Down Expand Up @@ -92,7 +92,7 @@ max-args = 6

[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = [
"PLR2004", # magic value comparisons
"PLR2004", # Magic value comparisons
]

### Coverage ###
Expand Down
7 changes: 6 additions & 1 deletion src/dbt_score/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
from click.core import ParameterSource
from dbt.cli.options import MultiOption

from dbt_score.config import DEFAULT_CONFIG_FILE, Config
from dbt_score.lint import lint_dbt_project
from dbt_score.parse import dbt_parse, get_default_manifest_path

Expand Down Expand Up @@ -57,7 +58,11 @@ def lint(select: tuple[str], manifest: Path, run_dbt_parse: bool) -> None:
if manifest_provided and run_dbt_parse:
raise click.UsageError("--run-dbt-parse cannot be used with --manifest.")

config = Config()
if Path(DEFAULT_CONFIG_FILE).exists():
config.load_toml_file(DEFAULT_CONFIG_FILE)
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved

if run_dbt_parse:
dbt_parse()

lint_dbt_project(manifest)
lint_dbt_project(manifest, config)
66 changes: 66 additions & 0 deletions src/dbt_score/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
"""This module is responsible for loading configuration."""

import logging
import tomllib
from dataclasses import dataclass, field
from typing import Any, Final

logger = logging.getLogger(__name__)

DEFAULT_CONFIG_FILE = "pyproject.toml"
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved


@dataclass
class RuleConfig:
"""Configuration for a rule."""

severity: int | None = None
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
params: dict[str, Any] = field(default_factory=dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be named config (also in Rule.params -> Rule.config)?
Reason is, I find it confusing to have these 2 concepts, which are fundamentally the same but named differently

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this actually prevents confusion because these things are not the same IMO. Yes both are to configure a Rule, but one configuration is input for a method whereas the other also does other things (update severity). I am open to change it, maybe @druzhinin-kirill has an opinion on what is less confusing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rule.config_params 😄

No strong opinion, however, I would prefer avoiding key mismatch variable names:

return RuleConfig(severity=severity, params=rule_config)

so either variable is params, or the argument is rules (only if it's achievable for zero cost, super minor anyway).

Copy link
Contributor

@matthieucan matthieucan May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the user point of view, this is all "rule config", right?
Now internally the config is divided in two, severity and params/config. I don't think it's an issue, we could do something like

return RuleConfig(severity=severity, config=config)

which I believe is self explanatory, and doesn't require the use of multiple terms? WDYT?

Copy link
Contributor Author

@jochemvandooren jochemvandooren May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the user point of view it's indeed all the same, it's internals only.

Ok I will go for using config inside RuleConfig 👌


@staticmethod
def from_dict(rule_config: dict[str, Any]) -> "RuleConfig":
"""Create a RuleConfig from a dictionary."""
severity = rule_config.pop("severity", None)
return RuleConfig(severity=severity, params=rule_config)


class Config:
"""Configuration for dbt-score."""

_main_section: Final[str] = "tool.dbt-score"
_options: Final[tuple[str, ...]] = ("rule_namespaces", "disabled_rules")
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
_rules_section: Final[str] = f"{_main_section}.rules"

def __init__(self) -> None:
"""Initialize the Config object."""
self.rule_namespaces: list[str] = ["dbt_score_rules"]
self.disabled_rules: list[str] = []
self.rules_config: dict[str, RuleConfig] = {}

def set_option(self, option: str, value: Any) -> None:
"""Set an option in the config."""
setattr(self, option, value)

def load_toml_file(self, file: str) -> None:
"""Load the options from a TOML file."""
with open(file, "rb") as f:
matthieucan marked this conversation as resolved.
Show resolved Hide resolved
toml_data = tomllib.load(f)
tools = toml_data.get("tool", {})
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
dbt_score_config = tools.get("dbt-score", {})
rules_config = dbt_score_config.pop("rules", {})

# Main configuration
for option, value in dbt_score_config.items():
# If value is a dictionary, it's another section
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
if option in self._options and not isinstance(value, dict):
self.set_option(option, value)
else:
logger.warning(
f"Option {option} in {self._main_section} not supported."
)

# Rule configuration
self.rules_config = {
name: RuleConfig.from_dict(config)
for name, config in rules_config.items()
}
6 changes: 2 additions & 4 deletions src/dbt_score/evaluation.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,15 +51,13 @@ def __init__(

def evaluate(self) -> None:
"""Evaluate all rules."""
# Instantiate all rules. In case they keep state across calls, this must be
# done only once.
rules = [rule_class() for rule_class in self._rule_registry.rules.values()]
rules = self._rule_registry.rules.values()

for model in self._manifest_loader.models:
self.results[model] = {}
for rule in rules:
try:
result: RuleViolation | None = rule.evaluate(model)
result: RuleViolation | None = rule.evaluate(model, **rule.params)
except Exception as e:
self.results[model][rule.__class__] = e
else:
Expand Down
5 changes: 3 additions & 2 deletions src/dbt_score/lint.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,20 @@

from pathlib import Path

from dbt_score.config import Config
from dbt_score.evaluation import Evaluation
from dbt_score.formatters.human_readable_formatter import HumanReadableFormatter
from dbt_score.models import ManifestLoader
from dbt_score.rule_registry import RuleRegistry
from dbt_score.scoring import Scorer


def lint_dbt_project(manifest_path: Path) -> None:
def lint_dbt_project(manifest_path: Path, config: Config) -> None:
"""Lint dbt manifest."""
if not manifest_path.exists():
raise FileNotFoundError(f"Manifest not found at {manifest_path}.")
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved

rule_registry = RuleRegistry()
rule_registry = RuleRegistry(config)
rule_registry.load_all()

manifest_loader = ManifestLoader(manifest_path)
Expand Down
42 changes: 42 additions & 0 deletions src/dbt_score/rule.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
"""Rule definitions."""

import inspect
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
import typing
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Type, TypeAlias, overload

from dbt_score.config import RuleConfig
from dbt_score.models import Model


Expand Down Expand Up @@ -31,17 +34,48 @@ class Rule:

description: str
severity: Severity = Severity.MEDIUM
default_params: typing.ClassVar[dict[str, Any]] = {}

def __init__(self, rule_config: RuleConfig | None = None) -> None:
"""Initialize the rule."""
self.params = self.process_config(rule_config) if rule_config else {}
druzhinin-kirill marked this conversation as resolved.
Show resolved Hide resolved

def __init_subclass__(cls, **kwargs) -> None: # type: ignore
"""Initializes the subclass."""
super().__init_subclass__(**kwargs)
if not hasattr(cls, "description"):
raise AttributeError("Subclass must define class attribute `description`.")

def process_config(self, rule_config: RuleConfig) -> dict[str, Any]:
"""Process the rule config."""
rule_params = self.default_params.copy()

# Overwrite default rule params
for k, v in rule_config.params.items():
if k in self.default_params:
rule_params[k] = v
else:
raise AttributeError(
f"Unknown rule parameter: {k} for rule {self.source()}."
)

self.set_severity(
rule_config.severity
) if rule_config.severity else rule_config.severity

return rule_params
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved

def evaluate(self, model: Model) -> RuleViolation | None:
"""Evaluates the rule."""
raise NotImplementedError("Subclass must implement method `evaluate`.")

@classmethod
def set_severity(cls, severity: int | Severity) -> None:
"""Set the severity of the rule."""
if isinstance(severity, int):
severity = Severity(severity)
cls.severity = severity

@classmethod
def source(cls) -> str:
"""Return the source of the rule, i.e. a fully qualified name."""
Expand Down Expand Up @@ -106,13 +140,21 @@ def wrapped_func(self: Rule, *args: Any, **kwargs: Any) -> RuleViolation | None:
"""Wrap func to add `self`."""
return func(*args, **kwargs)

# Get default parameters from the rule definition
default_params = {
key: val.default
for key, val in inspect.signature(func).parameters.items()
if val.default != inspect.Parameter.empty
}

# Create the rule class inheriting from Rule
rule_class = type(
func.__name__,
(Rule,),
{
"description": rule_description,
"severity": severity,
"default_params": default_params,
"evaluate": wrapped_func,
# Forward origin of the decorated function
"__qualname__": func.__qualname__, # https://peps.python.org/pep-3155/
Expand Down
29 changes: 20 additions & 9 deletions src/dbt_score/rule_registry.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,25 +8,32 @@
import pkgutil
from typing import Iterator, Type

from dbt_score.config import Config, RuleConfig
from dbt_score.exceptions import DuplicatedRuleException
from dbt_score.rule import Rule

logger = logging.getLogger(__name__)

THIRD_PARTY_RULES_NAMESPACE = "dbt_score_rules"


class RuleRegistry:
"""A container for configured rules."""

def __init__(self) -> None:
def __init__(self, config: Config | None = None) -> None:
"""Instantiate a rule registry."""
self.config = config or Config()
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
self._rules: dict[str, Type[Rule]] = {}
self._initialized_rules: dict[str, Rule] = {}

def init_rules(self) -> None:
"""Initialize rules."""
for rule_name, rule_class in self._rules.items():
rule_config = self.config.rules_config.get(rule_name, RuleConfig())
self._initialized_rules[rule_name] = rule_class(rule_config=rule_config)

@property
def rules(self) -> dict[str, Type[Rule]]:
def rules(self) -> dict[str, Rule]:
"""Get all rules."""
return self._rules
return self._initialized_rules

def _walk_packages(self, namespace_name: str) -> Iterator[str]:
"""Walk packages and sub-packages recursively."""
Expand All @@ -50,14 +57,18 @@ def _load(self, namespace_name: str) -> None:
for obj_name in dir(module):
obj = module.__dict__[obj_name]
if type(obj) is type and issubclass(obj, Rule) and obj is not Rule:
self._add_rule(obj_name, obj)
self._add_rule(f"{module_name}.{obj_name}", obj)

def _add_rule(self, name: str, rule: Type[Rule]) -> None:
if name in self.rules:
"""Add a rule."""
if name in self._rules:
raise DuplicatedRuleException(name)
self._rules[name] = rule
if name not in self.config.disabled_rules:
self._rules[name] = rule

def load_all(self) -> None:
"""Load all rules, core and third-party."""
self._load("dbt_score.rules")
self._load(THIRD_PARTY_RULES_NAMESPACE)
for namespace in self.config.rule_namespaces:
self._load(namespace)
self.init_rules()
30 changes: 30 additions & 0 deletions tests/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,21 @@
from dbt_score import Model, Rule, RuleViolation, Severity, rule
from pytest import fixture

# Configuration


@fixture
def valid_config_path() -> Path:
"""Return the path of the configuration."""
return Path(__file__).parent / "resources" / "pyproject.toml"


@fixture
def invalid_config_path() -> Path:
"""Return the path of the configuration."""
return Path(__file__).parent / "resources" / "invalid_pyproject.toml"


# Manifest


Expand Down Expand Up @@ -156,6 +171,21 @@ def rule_severity_critical(model: Model) -> RuleViolation | None:
return rule_severity_critical


@fixture
def rule_with_params() -> Type[Rule]:
"""An example rule with additional input params."""

@rule
def rule_with_params(
model: Model, model_name: str = "model1"
) -> RuleViolation | None:
"""Rule with additional input params."""
if model.name != model_name:
return RuleViolation(message=model_name)

return rule_with_params


@fixture
def rule_error() -> Type[Rule]:
"""An example rule which fails to run."""
Expand Down
2 changes: 2 additions & 0 deletions tests/resources/invalid_pyproject.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[tool.dbt-score]
foo = "bar"
13 changes: 13 additions & 0 deletions tests/resources/pyproject.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
[tool.dbt-score]
rule_namespaces = ["namespace_foo"]
jochemvandooren marked this conversation as resolved.
Show resolved Hide resolved
disabled_rules = ["foo", "bar"]


[tool.dbt-score.rules.foobar]
severity=4

[tool.dbt-score.rules.rule_with_params]
model_name="model2"

[tool.dbt-score.rules."tests.rules.example.rule_test_example"]
severity=4
Loading