-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Begin warning people about spaces in model names #9886
Begin warning people about spaces in model names #9886
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9886 +/- ##
==========================================
- Coverage 88.14% 88.13% -0.02%
==========================================
Files 178 178
Lines 22459 22507 +48
==========================================
+ Hits 19797 19837 +40
- Misses 2662 2670 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
tests/functional/manifest_validations/test_check_for_spaces_in_model_names.py
Show resolved
Hide resolved
41e5b5d
to
7b60ab0
Compare
Interesting the failing unit tests are also failing locally, but are failing differently 🤔 In the github action they are:
But locally the errors are:
They're failing on different flag attributes being accessed, and in different parts of the codebase... weird |
I just checked out main and am still running into the |
After busting my local tox cache, running the tests via |
678462d
to
958d5ba
Compare
Rebased off of main and tracked down the issue 🙂 |
We had a fix on main for the failing integration tests in |
For projects with a lot of models that have spaces in their names, the warning about this deprecation would be incredibly annoying. Now we instead only log the first model name issue and then a count of how many models have the issue, unless `--debug` is specified.
We want to be able to catch more than just `SpacesInModelNameDeprecation` events, and in the next commit we will alter our tests to do so. Thus instead of writing a new catcher for each event type, a slight modification to the existing `EventCatcher` makes this much easier.
Co-authored-by: Emily Rockman <[email protected]>
…config Previously in our `test_graph.py` unit tests we were setting the flags global, but not actually adding them to the config. The config, without the flags, would then get passed to the manifest loader and other things. Those down stream classes/functions would then look to the config for the flags and not find them. This change ensures that the flags get applied to the config that is being used in down stream operations during our unit tests.
18fb56c
to
9865f4a
Compare
Using `Note` events was causing test flakiness when run in a multi worker environment using `pytest -nauto`. This is because the event manager is currently a global. So in a situation where test `A` starts and test `tests_debug_when_spaces_in_name` starts shortly there after, the event manager for both tests will have the callbacks set in `tests_debug_when_spaces_in_name`. Then if something in test `A` fired a `Note` event, this would affect the count of `Note` events that `tests_debug_when_spaces_in_name` sees, causing assertion failures. By creating a custom event, `TotalModelNamesWithSpacesDeprecation`, we limit the possible flakiness to only tests that fire the custom event. Thus we didn't _eliminate_ all possibility of flakiness, but realistically only the tests in `test_check_for_spaces_in_model_names.py` can now interfere with each other. Which still isn't great, but to fully resolve the problem we need to work on how the event manager is handled (preferably not globally).
Previously we only logged out the count of how many invalid model names there were if there was two or more invalid names (and not in debug mode). However this message is important if there is even one invalid model name and regardless of whether you are running debug mode. That is because automated tools might be looking for the event type to track if anything is wrong. A related change in this commit is that we now only output the debug hint if it wasn't run with debug mode. The idea being that if they are already running it in debug mode, the hint could come accross as somewhat patronizing.
We want people running dbt to be able to at a glance see warnings/errors with running their project. In this case we are focused specifically on errors/warnings in regards to model names containing spaces. Previously we were only ever emitting the `warning_tag` in the message even if the event itself was being emitted at an `ERROR` level. We now properly have `[ERROR]` or `[WARNING]` in the message depending on the level. Unfortunately we couldn't just look what level the event was being fired at, because that information doesn't exist on the event itself. Additionally, we're using events that base off of `DynamicEvents` which unfortunately hard coded to `DEBUG`. Changing this would involve still having a `level` property on the definition in `core_types.proto` and then having `DynamicEvent`s look to `self.level` in the `level_tag` method. Then we could change how firing events works based on the an event's `level_tag` return value. This all sounds like a bit of tech debt suited for PR, possibly multiple, and thus is not being done here.
9865f4a
to
c678cbc
Compare
# TODO Move this to dbt_common.ui | ||
def _error_tag(msg: str) -> str: | ||
return f'[{red("ERROR")}]: {msg}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to do this TODO before merging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not. That would entail opening a a PR in dbt-common and also getting it released, and I didn't want that to block this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up tasks assigned to me:
|
||
|
||
@dataclass | ||
class EventCatcher: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
c9f83b9
to
a2d609f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
resolves #9397
Problem
We don't support models with spaces in their names. However, we haven't been actually enforcing this. If a person have spaces in their model names, it causes issues when using selectors. Additionally, depending on a person's operating system, there were other edge case problems with spaces in model names.
Solution
Begin warning people about spaces in their model names. Depending on some flags, we give more or less information. If
debug
mode is not on, then we log out the first offending model name and a count of how many offending model names their are in total. Ifdebug
mode is on, then we log out every offending model name. Additionally, by default currently these logs are warnings and won't stop dbt from running. However, if one setsallow_spaces_in_model_names
toFalse
in theirdbt_project.yml
, then the logs become errors and found offending model names will stop dbt from running.Checklist