Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Begin warning people about spaces in model names #9886

Merged
merged 20 commits into from
Apr 12, 2024
Merged
Show file tree
Hide file tree
Changes from 17 commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
0ac4b3f
Add event type for deprecation of spaces in model names
QMalcolm Apr 9, 2024
a20ef5a
Begin emitting deprecation warning for spaces in model names
QMalcolm Apr 10, 2024
1804c72
Only warn on first model name with spaces unless `--debug` is specified
QMalcolm Apr 10, 2024
b1c584f
Refactor `EventCatcher` so that the event to catch is setable
QMalcolm Apr 10, 2024
fb2547a
Update `tests_debug_when_spaces_in_name` to check for relevant `Note`…
QMalcolm Apr 10, 2024
3665fcc
Add project flag to control whether spaces are allowed in model names
QMalcolm Apr 9, 2024
41ef3f7
Log errors and raise exception when `allow_spaces_in_model_names` is …
QMalcolm Apr 10, 2024
cd62234
Add changie log for new manifest validation of spaces in model names
QMalcolm Apr 10, 2024
117b6d4
Fix capitalization in `Note` event message about improper model names
QMalcolm Apr 10, 2024
0352bf5
Update test to check `Note` event contents for total bad model names
QMalcolm Apr 10, 2024
f51d65a
Alter `SpacesInModelNameDeprecation` to inherit from `DynamicLevel`
QMalcolm Apr 10, 2024
92fc679
Fixup changelog to be a fix not a feature
QMalcolm Apr 10, 2024
f7b6f2d
Fix `test_graph.py` config creation process to ensure flags exist on …
QMalcolm Apr 11, 2024
f1278a3
Use custom even for output invalid name counts instead of `Note` events
QMalcolm Apr 12, 2024
b7e5c60
Always log total invalid model names if atleast one
QMalcolm Apr 12, 2024
c509d75
Reduce duplicate `if` logic in `check_for_spaces_in_model_names`
QMalcolm Apr 12, 2024
c678cbc
Improve readability of logs related to problematic model names
QMalcolm Apr 12, 2024
a2d609f
Alter `TotalModelNamesWithSpacesDeprecation` message to handle singul…
QMalcolm Apr 12, 2024
515b61b
Merge branch 'main' into qmalcolm--9397-deprecate-spaces-in-model-names
QMalcolm Apr 12, 2024
9cdecaa
Remove duplicate import in `test_graph.py` introduced from merging in…
QMalcolm Apr 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .changes/unreleased/Fixes-20240409-233347.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
kind: Fixes
body: Begin warning people about spaces in model names
time: 2024-04-09T23:33:47.850166-07:00
custom:
Author: QMalcolm
Issue: "9397"
6 changes: 5 additions & 1 deletion core/dbt/contracts/project.py
Original file line number Diff line number Diff line change
Expand Up @@ -296,6 +296,7 @@ def validate(cls, data):

@dataclass
class ProjectFlags(ExtensibleDbtClassMixin):
allow_spaces_in_model_names: Optional[bool] = True
cache_selected_only: Optional[bool] = None
debug: Optional[bool] = None
fail_fast: Optional[bool] = None
Expand All @@ -320,7 +321,10 @@ class ProjectFlags(ExtensibleDbtClassMixin):

@property
def project_only_flags(self) -> Dict[str, Any]:
return {"source_freshness_run_project_hooks": self.source_freshness_run_project_hooks}
return {
"source_freshness_run_project_hooks": self.source_freshness_run_project_hooks,
"allow_spaces_in_model_names": self.allow_spaces_in_model_names,
}


@dataclass
Expand Down
24 changes: 24 additions & 0 deletions core/dbt/events/core_types.proto
Original file line number Diff line number Diff line change
Expand Up @@ -403,6 +403,30 @@ message ProjectFlagsMovedDeprecationMsg {
ProjectFlagsMovedDeprecation data = 2;
}

// D014
message SpacesInModelNameDeprecation {
string model_name = 1;
string model_version = 2;
string level = 3;
}

message SpacesInModelNameDeprecationMsg {
CoreEventInfo info = 1;
SpacesInModelNameDeprecation data = 2;
}

// D015
message TotalModelNamesWithSpacesDeprecation {
int32 count_invalid_names = 1;
bool show_debug_hint = 2;
string level = 3;
}

message TotalModelNamesWithSpacesDeprecationMsg {
CoreEventInfo info = 1;
TotalModelNamesWithSpacesDeprecation data = 2;
}

// I065
message DeprecatedModel {
string model_name = 1;
Expand Down
1,146 changes: 577 additions & 569 deletions core/dbt/events/core_types_pb2.py

Large diffs are not rendered by default.

42 changes: 42 additions & 0 deletions core/dbt/events/types.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,11 @@
from dbt.events.base_types import WarnLevel, InfoLevel, DebugLevel, ErrorLevel, DynamicLevel


# TODO Move this to dbt_common.ui
def _error_tag(msg: str) -> str:
return f'[{red("ERROR")}]: {msg}'

Check warning on line 16 in core/dbt/events/types.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/events/types.py#L16

Added line #L16 was not covered by tests
Comment on lines +14 to +16
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to do this TODO before merging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not. That would entail opening a a PR in dbt-common and also getting it released, and I didn't want that to block this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



# Event codes have prefixes which follow this table
#
# | Code | Description |
Expand Down Expand Up @@ -413,6 +418,43 @@
return warning_tag(f"Deprecated functionality\n\n{description}")


class SpacesInModelNameDeprecation(DynamicLevel):
def code(self) -> str:
return "D014"

def message(self) -> str:
version = ".v" + self.model_version if self.model_version else ""
description = (
f"Model `{self.model_name}{version}` has spaces in its name. This is deprecated and "
"may cause errors when using dbt."
)

if self.level == EventLevel.ERROR.value:
description = _error_tag(description)

Check warning on line 433 in core/dbt/events/types.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/events/types.py#L433

Added line #L433 was not covered by tests
elif self.level == EventLevel.WARN.value:
description = warning_tag(description)

Check warning on line 435 in core/dbt/events/types.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/events/types.py#L435

Added line #L435 was not covered by tests

return line_wrap_message(description)


class TotalModelNamesWithSpacesDeprecation(DynamicLevel):
def code(self) -> str:
return "D015"

def message(self) -> str:
description = f"Found {self.count_invalid_names} models with spaces in their names, which is deprecated."
QMalcolm marked this conversation as resolved.
Show resolved Hide resolved

if self.show_debug_hint:
description += " Run again with `--debug` to see them all."

if self.level == EventLevel.ERROR.value:
description = _error_tag(description)

Check warning on line 451 in core/dbt/events/types.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/events/types.py#L451

Added line #L451 was not covered by tests
elif self.level == EventLevel.WARN.value:
description = warning_tag(description)

Check warning on line 453 in core/dbt/events/types.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/events/types.py#L453

Added line #L453 was not covered by tests

return line_wrap_message(description)


# =======================================================
# I - Project parsing
# =======================================================
Expand Down
45 changes: 45 additions & 0 deletions core/dbt/parser/manifest.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
from dbt.context.manifest import generate_query_header_context
from dbt.contracts.graph.semantic_manifest import SemanticManifest
from dbt_common.events.base_types import EventLevel
from dbt_common.exceptions.base import DbtValidationError
import dbt_common.utils
import json
import pprint
Expand Down Expand Up @@ -62,6 +63,8 @@
StateCheckVarsHash,
DeprecatedModel,
DeprecatedReference,
SpacesInModelNameDeprecation,
TotalModelNamesWithSpacesDeprecation,
UpcomingReferenceDeprecation,
)
from dbt.logger import DbtProcessState
Expand Down Expand Up @@ -520,6 +523,7 @@
self.write_manifest_for_partial_parse()

self.check_for_model_deprecations()
self.check_for_spaces_in_model_names()

return self.manifest

Expand Down Expand Up @@ -621,6 +625,47 @@
)
)

def check_for_spaces_in_model_names(self):
"""Validates that model names do not contain spaces

If `DEBUG` flag is `False`, logs only first bad model name
If `DEBUG` flag is `True`, logs every bad model name
If `ALLOW_SPACES_IN_MODEL_NAMES` is `False`, logs are `ERROR` level and an exception is raised if any names are bad
If `ALLOW_SPACES_IN_MODEL_NAMES` is `True`, logs are `WARN` level
"""
improper_model_names = 0
level = (
EventLevel.WARN
if self.root_project.args.ALLOW_SPACES_IN_MODEL_NAMES
else EventLevel.ERROR
)

for node in self.manifest.nodes.values():
if isinstance(node, ModelNode) and " " in node.name:
if improper_model_names == 0 or self.root_project.args.DEBUG:
fire_event(

Check warning on line 646 in core/dbt/parser/manifest.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/parser/manifest.py#L645-L646

Added lines #L645 - L646 were not covered by tests
SpacesInModelNameDeprecation(
model_name=node.name,
model_version=version_to_str(node.version),
level=level.value,
),
level=level,
)
improper_model_names += 1

Check warning on line 654 in core/dbt/parser/manifest.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/parser/manifest.py#L654

Added line #L654 was not covered by tests

if improper_model_names > 0:
fire_event(

Check warning on line 657 in core/dbt/parser/manifest.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/parser/manifest.py#L657

Added line #L657 was not covered by tests
TotalModelNamesWithSpacesDeprecation(
count_invalid_names=improper_model_names,
show_debug_hint=(not self.root_project.args.DEBUG),
level=level.value,
),
level=level,
)

if level == EventLevel.ERROR:
raise DbtValidationError("Model names cannot contain spaces")

Check warning on line 667 in core/dbt/parser/manifest.py

View check run for this annotation

Codecov / codecov/patch

core/dbt/parser/manifest.py#L666-L667

Added lines #L666 - L667 were not covered by tests

def load_and_parse_macros(self, project_parser_files):
for project in self.all_projects.values():
if project.project_name not in project_parser_files:
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
import pytest

from dataclasses import dataclass, field
from dbt.cli.main import dbtRunner
from dbt_common.events.base_types import BaseEvent, EventLevel, EventMsg
from dbt.events.types import SpacesInModelNameDeprecation, TotalModelNamesWithSpacesDeprecation
from dbt.tests.util import update_config_file
from typing import Dict, List


@dataclass
class EventCatcher:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

event_to_catch: BaseEvent
caught_events: List[EventMsg] = field(default_factory=list)

def catch(self, event: EventMsg):
if event.info.name == self.event_to_catch.__name__:
self.caught_events.append(event)


class TestSpacesInModelNamesHappyPath:
def test_no_warnings_when_no_spaces_in_name(self, project) -> None:
event_catcher = EventCatcher(SpacesInModelNameDeprecation)
runner = dbtRunner(callbacks=[event_catcher.catch])
runner.invoke(["parse"])
assert len(event_catcher.caught_events) == 0


class TestSpacesInModelNamesSadPath:
@pytest.fixture(scope="class")
def models(self) -> Dict[str, str]:
return {
"my model.sql": "select 1 as id",
}

def tests_warning_when_spaces_in_name(self, project) -> None:
QMalcolm marked this conversation as resolved.
Show resolved Hide resolved
event_catcher = EventCatcher(SpacesInModelNameDeprecation)
total_catcher = EventCatcher(TotalModelNamesWithSpacesDeprecation)
runner = dbtRunner(callbacks=[event_catcher.catch, total_catcher.catch])
runner.invoke(["parse"])

assert len(total_catcher.caught_events) == 1
assert len(event_catcher.caught_events) == 1
event = event_catcher.caught_events[0]
assert "Model `my model` has spaces in its name. This is deprecated" in event.info.msg
assert event.info.level == EventLevel.WARN


class TestSpaceInModelNamesWithDebug:
@pytest.fixture(scope="class")
def models(self) -> Dict[str, str]:
return {
"my model.sql": "select 1 as id",
"my model2.sql": "select 1 as id",
}

def tests_debug_when_spaces_in_name(self, project) -> None:
spaces_check_catcher = EventCatcher(SpacesInModelNameDeprecation)
total_catcher = EventCatcher(TotalModelNamesWithSpacesDeprecation)
runner = dbtRunner(callbacks=[spaces_check_catcher.catch, total_catcher.catch])
runner.invoke(["parse"])
assert len(spaces_check_catcher.caught_events) == 1
assert len(total_catcher.caught_events) == 1
assert (
"Found 2 models with spaces in their names" in total_catcher.caught_events[0].info.msg
)
assert (
"Run again with `--debug` to see them all." in total_catcher.caught_events[0].info.msg
)

spaces_check_catcher = EventCatcher(SpacesInModelNameDeprecation)
total_catcher = EventCatcher(TotalModelNamesWithSpacesDeprecation)
runner = dbtRunner(callbacks=[spaces_check_catcher.catch, total_catcher.catch])
runner.invoke(["parse", "--debug"])
assert len(spaces_check_catcher.caught_events) == 2
assert len(total_catcher.caught_events) == 1
assert (
"Run again with `--debug` to see them all."
not in total_catcher.caught_events[0].info.msg
)


class TestAllowSpacesInModelNamesFalse:
@pytest.fixture(scope="class")
def models(self) -> Dict[str, str]:
return {
"my model.sql": "select 1 as id",
}

def test_dont_allow_spaces_in_model_names(self, project):
spaces_check_catcher = EventCatcher(SpacesInModelNameDeprecation)
runner = dbtRunner(callbacks=[spaces_check_catcher.catch])
runner.invoke(["parse"])
assert len(spaces_check_catcher.caught_events) == 1
assert spaces_check_catcher.caught_events[0].info.level == EventLevel.WARN

config_patch = {"flags": {"allow_spaces_in_model_names": False}}
update_config_file(config_patch, project.project_root, "dbt_project.yml")

spaces_check_catcher = EventCatcher(SpacesInModelNameDeprecation)
runner = dbtRunner(callbacks=[spaces_check_catcher.catch])
result = runner.invoke(["parse"])
assert not result.success
assert "Model names cannot contain spaces" in result.exception.__str__()
assert len(spaces_check_catcher.caught_events) == 1
assert spaces_check_catcher.caught_events[0].info.level == EventLevel.ERROR
4 changes: 4 additions & 0 deletions tests/unit/test_events.py
Original file line number Diff line number Diff line change
Expand Up @@ -154,6 +154,10 @@ def test_event_codes(self):
adapter_types.CollectFreshnessReturnSignature(),
core_types.TestsConfigDeprecation(deprecated_path="", exp_path=""),
core_types.ProjectFlagsMovedDeprecation(),
core_types.SpacesInModelNameDeprecation(model_name="", model_version="", level=""),
core_types.TotalModelNamesWithSpacesDeprecation(
count_invalid_names=1, show_debug_hint=True, level=""
),
# E - DB Adapter ======================
adapter_types.AdapterEventDebug(),
adapter_types.AdapterEventInfo(),
Expand Down
15 changes: 11 additions & 4 deletions tests/unit/test_graph.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,15 +8,16 @@
from dbt.adapters.factory import reset_adapters, register_adapter
import dbt.compilation
import dbt.exceptions
import dbt.flags
import dbt.parser
import dbt.config
import dbt.utils
import dbt.parser.manifest
from dbt import tracking
from dbt.cli.flags import convert_config
from dbt.contracts.files import SourceFile, FileHash, FilePath
from dbt.contracts.graph.manifest import MacroManifest, ManifestStateCheck
from dbt.contracts.project import ProjectFlags
from dbt.flags import get_flags, set_from_args
from dbt.graph import NodeSelector, parse_difference
from dbt.events.logging import setup_event_logger
from dbt.mp_context import get_mp_context
Expand Down Expand Up @@ -126,9 +127,15 @@ def get_config(self, extra_cfg=None):
cfg.update(extra_cfg)

config = config_from_parts_or_dicts(project=cfg, profile=self.profile)
dbt.flags.set_from_args(Namespace(), ProjectFlags())
setup_event_logger(dbt.flags.get_flags())
object.__setattr__(dbt.flags.get_flags(), "PARTIAL_PARSE", False)
set_from_args(Namespace(), ProjectFlags())
flags = get_flags()
setup_event_logger(flags)
object.__setattr__(flags, "PARTIAL_PARSE", False)
for arg_name, args_param_value in vars(flags).items():
args_param_value = convert_config(arg_name, args_param_value)
object.__setattr__(config.args, arg_name.upper(), args_param_value)
object.__setattr__(config.args, arg_name.lower(), args_param_value)

return config

def get_compiler(self, project):
Expand Down