
[Feature] add generic config for generic tests #10245

Merged: 12 commits merged into main from mcknight/10197 on Jun 5, 2024

Conversation

@McKnight-42 McKnight-42 commented May 30, 2024

resolves #10197

Problem

We would like to update our TestBuilder class for generic tests so that it can accept any config a user might need.

For example:

models:
  - name: my_model
    columns:
      - name: colour
        tests:
          - accepted_values:
              values: ['blue', 'red']
              severity: warn
              config:
                custom_config: custom_value

Solution

In the TestBuilder class, we first set up a variable with all of the defined accepted configs. A little further down, we add an if condition to process the nested config block, going beyond the original behavior of only searching for the keys in the CONFIG_ARGS variable.
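
A minimal sketch of the idea (hypothetical names and key list; the real implementation lives in core/dbt/parser/generic_test_builders.py and differs in detail):

from typing import Any, Dict

# Hypothetical subset of the keys TestBuilder already recognizes at the test root.
CONFIG_ARGS = ("severity", "store_failures", "where", "limit", "fail_calc", "warn_if", "error_if")

def build_test_config(test_args: Dict[str, Any]) -> Dict[str, Any]:
    # Legacy style: recognized config keys sit at the root of the test definition.
    config = {key: test_args[key] for key in CONFIG_ARGS if key in test_args}
    # New behavior: anything under the nested `config:` block is accepted as-is.
    for key, value in test_args.get("config", {}).items():
        if key in config:
            # A key set both at the root and under `config:` is an error.
            raise ValueError(f"config key {key!r} specified in both places")
        config[key] = value
    return config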

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@McKnight-42 McKnight-42 self-assigned this May 30, 2024
@cla-bot cla-bot bot added the cla:yes label May 30, 2024

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.


codecov bot commented May 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.70%. Comparing base (3e37d77) to head (8c2ad87).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10245      +/-   ##
==========================================
- Coverage   88.72%   88.70%   -0.02%     
==========================================
  Files         180      180              
  Lines       22476    22474       -2     
==========================================
- Hits        19942    19936       -6     
- Misses       2534     2538       +4     
Flag          Coverage             Δ
integration   85.96% <100.00%>     (-0.11%) ⬇️
unit          63.25% <80.00%>      (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.



@mikealfare mikealfare left a comment


I like the approach of breaking out the logic into separate methods for readability. And I provided comments that align with that approach as you've implemented it.

However, I also think this could be done even better by dispatching to methods that correspond to stage instead of legacy/preferred. There's a lot of duplication in the two methods, which suggests that pivoting how it's split could be useful.

To make sure I completely understand the context, here's how I understand the intent and constraints:

  • there is a legacy method of specifying config at the root; these expected config keys are captured in self.CONFIG_ARGS
  • there is a preferred method of specifying config which will show up as self.args["config"] (as a result of processing data_test via self.extract_test_args())
  • the same key should not show up in both methods, otherwise raise SameKeyNestedError()
  • we need to render str entries, and fail on a missing macro
  • we need to filter None entries

Assuming the above is correct, this could be done by translating the above steps into the below pseudo-code:

def __init__(...):
    ...
    raw_config = self._raw_config()
    self.config = self._rendered_config(raw_config)
    ...

def _raw_config(self) -> Dict[str, Optional[str]]:
    legacy_config = ...
    config = ...
    # identify issue where key is present in both
    raw_config = ...
    return raw_config

def _rendered_config(
    self,
    raw_config: Dict[str, Optional[str]],
    render_ctx: Dict[str, Any],
    column_name: Optional[str]
) -> Dict[str, Any]:
    rendered_config = {}
    for key, value in raw_config.items():
        # try to render, raising an error if the macro is missing
        # filter for `None` here since you're already looping, and you could make the argument that removing `None` values is a form of rendering
    return rendered_config
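
One way the pseudo-code above might be filled in (a hedged sketch only; self.args, self.CONFIG_ARGS, self.extract_test_args(), and SameKeyNestedError come from the discussion above, while render_value is a hypothetical stand-in for whatever jinja rendering helper dbt actually uses):

def _raw_config(self) -> Dict[str, Optional[str]]:
    # Legacy style: recognized keys at the root of the test definition.
    legacy_config = {key: self.args[key] for key in self.CONFIG_ARGS if key in self.args}
    # Preferred style: the nested `config:` block produced by self.extract_test_args().
    config = self.args.get("config", {})
    # The same key must not appear in both places.
    for key in config:
        if key in legacy_config:
            raise SameKeyNestedError()
    return {**legacy_config, **config}

def _rendered_config(
    self,
    raw_config: Dict[str, Optional[str]],
    render_ctx: Dict[str, Any],
    column_name: Optional[str],
) -> Dict[str, Any]:
    rendered_config = {}
    for key, value in raw_config.items():
        if isinstance(value, str):
            # Render jinja in string values; a missing macro should raise here.
            value = render_value(value, render_ctx, column_name)
        if value is not None:
            # Dropping `None` entries doubles as a form of rendering.
            rendered_config[key] = value
    return rendered_config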

Inline review comment threads (resolved) on:
  • core/dbt/parser/generic_test_builders.py
  • tests/functional/schema_tests/data_test_config.py
@McKnight-42 McKnight-42 requested a review from mikealfare June 3, 2024 16:03
@McKnight-42 McKnight-42 marked this pull request as ready for review June 3, 2024 17:02
@McKnight-42 McKnight-42 requested a review from a team as a code owner June 3, 2024 17:02
@McKnight-42 McKnight-42 marked this pull request as draft June 3, 2024 17:03
@McKnight-42 (author) commented:

Running the command

dbt list --quiet --output json --select accepted_values_table_color__blue__red

in a local project (after the models/seeds have been run) shows a JSON object with a config of:

"config": {"enabled": true, "alias": null, "schema": "dbt_test__audit", "database": null, "tags": [], "meta": {}, "group": null, "materialized": "test", "severity": "warn", "store_failures": null, "store_failures_as": null, "where": null, "limit": null, "fail_calc": "count(*)", "warn_if": "!= 0", "error_if": "!= 0", "custom_config_key": "some_value"}

@McKnight-42 McKnight-42 marked this pull request as ready for review June 3, 2024 17:36
@McKnight-42 (author) commented:

(Quoting @mikealfare's review comment above.)

When I tried to combine the levels, or push the code further in that direction, it hit macro errors in dbt-adapters; augmenting to that style seems like it would require a higher lift to supply more configs to other existing macros.

@McKnight-42 McKnight-42 closed this Jun 4, 2024
@McKnight-42 McKnight-42 reopened this Jun 4, 2024

@mikealfare mikealfare left a comment


I have some small nits that would make the code a bit more succinct, but nothing worth holding this up over. If you have the time, take a look; otherwise merge and move on.

Inline review comment threads (resolved) on core/dbt/parser/generic_test_builders.py
@McKnight-42 McKnight-42 merged commit b680c7a into main Jun 5, 2024
63 checks passed
@McKnight-42 McKnight-42 deleted the mcknight/10197 branch June 5, 2024 15:37
McKnight-42 added a commit that referenced this pull request Jun 6, 2024
* init push arbitrary configs for generic tests pr

* iterative work

* initial test design attempts

* test reformatting

* test rework, have basic structure for 3 of 4 passing, need to figure out how to best represent same key error, failing correctly though

* swap up test formats for new config dict and mixed variety, running dbt parse and inspecting the manifest

* modify tests to get passing, then modify the TestBuilder class work from earlier to be more dry

* add changelog

* modify code to match suggested changes around separate methods and test id fix

* add column_name reference to init for deeper nested _render_values can use the input

* add type annotations

* feedback based on mike review
@nghi-ly nghi-ly added the user docs [docs.getdbt.com] Needs better documentation label Jun 28, 2024
Labels
cla:yes, user docs [docs.getdbt.com] Needs better documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Impl] Allow generic tests to accept arbitrary configs
4 participants