Create metricflow-semantics Package #1151

Merged
merged 80 commits into from
Apr 26, 2024
Commits
34bb6ff
Add missing `__init__.py` files in `tests`.
plypaul Apr 23, 2024
86420c6
Add semantics module.
plypaul Apr 23, 2024
f208ee1
Rename errors.
plypaul Apr 23, 2024
c75352d
Rename dataset.
plypaul Apr 23, 2024
678b993
Rename specs.
plypaul Apr 23, 2024
e50a1e5
Add missing `__init__.py` in `inference`.
plypaul Apr 23, 2024
8929eec
Move `base_time_grain.py`.
plypaul Apr 23, 2024
614f0e7
Move patterns.
plypaul Apr 23, 2024
2b2ea34
Move spec_classes.py
plypaul Apr 23, 2024
8657abf
Move specs.
plypaul Apr 23, 2024
38f689e
Move naming.
plypaul Apr 23, 2024
e9e04f4
Move filters.
plypaul Apr 23, 2024
91529ec
Move model.
plypaul Apr 23, 2024
248474b
Move dag.
plypaul Apr 23, 2024
c952eb0
Move dataset.
plypaul Apr 23, 2024
0cc46a3
Move dataflow plan sub-modules.
plypaul Apr 23, 2024
4f28202
Move the rest of dataflow.
plypaul Apr 23, 2024
64cad82
Move some files out of plan_conversion.
plypaul Apr 23, 2024
fb53b89
Move plan_conversion.
plypaul Apr 23, 2024
d6cbc72
Move errors.
plypaul Apr 23, 2024
02eda1b
Move protocols.
plypaul Apr 23, 2024
e1160dd
Move query.
plypaul Apr 23, 2024
d80ebc3
Move sql.
plypaul Apr 23, 2024
246b3b0
Move time.
plypaul Apr 23, 2024
c7f6a04
Move top level.
plypaul Apr 23, 2024
298b45e
Move collection_helpers.
plypaul Apr 23, 2024
d2cd7be
Move mf_logging.
plypaul Apr 23, 2024
5de0abd
Move out time.
plypaul Apr 23, 2024
be48a6d
Move out sql.
plypaul Apr 23, 2024
68875ee
Move out protocols/sql_client.
plypaul Apr 23, 2024
8724f02
Move out plan_conversion.
plypaul Apr 23, 2024
9c094dc
Move out dataset.
plypaul Apr 23, 2024
73580a4
Move out dataflow.
plypaul Apr 23, 2024
249cd1b
Move out time.
plypaul Apr 23, 2024
89bef6f
Move out `data_warehouse_model_validator.py`.
plypaul Apr 24, 2024
e629e9d
Move `sql_bind_parameters.py`.
plypaul Apr 24, 2024
471a7bd
Update `linkable_spec_resolver.py` to use metric time from DSI.
plypaul Apr 24, 2024
33752c7
Separate and move `SqlJoinType`.
plypaul Apr 24, 2024
fce72e9
Move `sql_join_type.py`.
plypaul Apr 24, 2024
2ef2163
Remove `SemanticManifestLookup.time_spine_source`.
plypaul Apr 24, 2024
77bd647
Move semantic tests.
plypaul Apr 24, 2024
cfc2eef
Add `metricflow-semantics` package skeleton.
plypaul Apr 24, 2024
58f631c
Move metricflow.semantics.
plypaul Apr 24, 2024
51431b2
Add `tests_metricflow_semantics`.
plypaul Apr 24, 2024
9173de7
Update `snapshot_path_prefix` to handle new test directories.
plypaul Apr 24, 2024
f88ad4e
Add `py.typed` for `metricflow_semantics`.
plypaul Apr 24, 2024
be48ed5
Move fixtures from `setup_fixtures.py` to `sql_client_fixtures.py`.
plypaul Apr 24, 2024
d2094c0
Rename tests -> tests_metricflow.
plypaul Apr 24, 2024
7c17c1e
Update test module name in tests.
plypaul Apr 24, 2024
f755143
Split `test_helpers.py` into separate files.
plypaul Apr 24, 2024
5426a61
Move helpers into `test_helpers` module.
plypaul Apr 24, 2024
3d9aba3
Change signature of `assert_snapshot_text_equal` to use `SnapshotConf…
plypaul Apr 24, 2024
2b87c74
Move `load_semantic_manifest` to `manifest_helpers.py`.
plypaul Apr 24, 2024
0313849
Move `semantic_manifest_yamls` to `test_helpers`.
plypaul Apr 24, 2024
d6d3cfc
Add `DirectoryAnchor` and use new manifest YAML dir.
plypaul Apr 24, 2024
d58a953
Move `assert_*_snapshot*` to `snapshot_helpers`.
plypaul Apr 24, 2024
4c9dda2
Add snapshot methods that don't depend on a SQL client.
plypaul Apr 24, 2024
3e349ff
Change signature of `assert*` methods to use `SnapshotConfiguration`.
plypaul Apr 24, 2024
7e9ee2c
Initial configuration for `metricflow-semantics` tests.
plypaul Apr 24, 2024
00129ac
Move a few tests to new locations.
plypaul Apr 24, 2024
6f21bb7
Remove `DunderColumnAssociationResolver` from `test_suggestions.py`.
plypaul Apr 24, 2024
b988db5
Move `metric_time_dimension.py` to `test_helpers`.
plypaul Apr 24, 2024
3da1773
Remove `DataSet` dependency from `metric_time_dimension.py`
plypaul Apr 24, 2024
76119b6
Separate dataflow validation from `SemanticModelJoinEvaluator`.
plypaul Apr 24, 2024
ec9d092
Move semantic tests to `tests_metricflow_semantics`.
plypaul Apr 24, 2024
9b468a9
Move `DunderColumnAssociationResolver` to `metricflow-semantics`.
plypaul Apr 24, 2024
d201599
Add `column_association_resolver` fixture.
plypaul Apr 24, 2024
ae485c0
Add missing `query_parser` fixture.
plypaul Apr 24, 2024
742b85b
Update `pyproject.toml`.
plypaul Apr 24, 2024
d80bf8a
Move tests to `metricflow-semantics`.
plypaul Apr 24, 2024
1699579
Fix pretty_printing for newer Pydantic versions.
plypaul Apr 24, 2024
dbf35f8
Update various build-related files.
plypaul Apr 24, 2024
f9d3544
Move ID-related objects to `test_helpers`.
plypaul Apr 24, 2024
b853369
Move snapshots.
plypaul Apr 24, 2024
d54a4ad
Add change log for #1150.
plypaul Apr 24, 2024
4afbd09
Lint fixes.
plypaul Apr 24, 2024
f6f64cb
Address comments.
plypaul Apr 25, 2024
9e2c131
Update `DirectoryPathAnchor` to not require `__file__`.
plypaul Apr 25, 2024
60773a9
Update / cleanup build configuration.
plypaul Apr 25, 2024
47d7057
Add package CI tests.
plypaul Apr 25, 2024
@@ -6,9 +6,19 @@
import re
import webbrowser
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple
from typing import Callable, List, Optional, Tuple, TypeVar

import _pytest.fixtures
import tabulate
from _pytest.fixtures import FixtureRequest

from metricflow_semantics.dag.mf_dag import MetricFlowDag
from metricflow_semantics.mf_logging.pretty_print import mf_pformat
from metricflow_semantics.model.semantics.linkable_element_set import LinkableElementSet
from metricflow_semantics.naming.object_builder_scheme import ObjectBuilderNamingScheme
from metricflow_semantics.specs.spec_classes import InstanceSpecSet, LinkableSpecSet
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from tests_metricflow.snapshot_utils import assert_object_snapshot_equal, assert_str_snapshot_equal

logger = logging.getLogger(__name__)

@@ -165,3 +175,167 @@ def add_overwrite_snapshots_cli_flag(parser: _pytest.config.argparsing.Parser) -
        action="store_true",
        help="Overwrites existing snapshots by ones generated during this testing session",
    )


# In plan outputs, replace strings that vary from run to run with this so that comparisons can be made
# consistently.
PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS = "*"


def make_schema_replacement_function(system_schema: str, source_schema: str) -> Callable[[str], str]:
Contributor

I assume this is all just a straight copy/paste with maybe some autoformatting applied, but nothing else is changed. If that isn't correct I should probably read these files.

Contributor Author

Yep, that's correct.

    """Generates a function to replace schema names in test outputs."""

    # The schema of the warehouse used in tests changes from run to run, so don't compare those.
    def replacement_function(text: str) -> str:
        # Replace with a string of the same length so that indents are preserved.
        text = text.replace(source_schema, PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS * len(source_schema))
        # Same with the MetricFlow system schema.
        return text.replace(system_schema, PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS * len(system_schema))

    return replacement_function
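
The same-length masking used by `make_schema_replacement_function` can be tried in isolation. This is a minimal sketch rather than the project's code: `make_masker`, the schema names, and the SQL text are hypothetical and exist only for illustration.

```python
from typing import Callable

PLACEHOLDER = "*"  # stands in for PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS


def make_masker(*volatile_strings: str) -> Callable[[str], str]:
    """Return a function that masks each volatile string with a same-length run of placeholders."""

    def mask(text: str) -> str:
        for volatile in volatile_strings:
            if volatile:  # skip empty strings, which str.replace would match everywhere
                text = text.replace(volatile, PLACEHOLDER * len(volatile))
        return text

    return mask


# Hypothetical schema names that would differ between test runs.
mask = make_masker("mf_test_a1b2", "mf_sys_9f")
sql = "SELECT 1\nFROM mf_test_a1b2.bookings\n  JOIN mf_sys_9f.time_spine"
masked = mask(sql)
print(masked)
```

Because each replacement has the same length as the string it hides, the masked text keeps its original column alignment, which keeps stored snapshots readable.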


def replace_dataset_id_hash(text: str) -> str:
    """Replaces data set ID hashes for primed semantic models.

    The data set ID hash changes from run to run because it's based on the DW schema in the semantic model, which changes
    run to run.
    """
    pattern = re.compile(r"'[a-zA-Z0-9_]+__[a-zA-Z0-9_]+__(?P<hash>[a-zA-Z0-9_]+)'")
    while True:
        match = pattern.search(text)
        if match:
            data_set_id_hash = match.group("hash")
            # Replace with the same length to preserve indents
            text = text.replace(data_set_id_hash, PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS * len(data_set_id_hash))
        else:
            break
    return text
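
For comparison, the hash masking can also be written as a single `re.sub` pass. This is a sketch of an alternative, not the code in this PR, and it behaves slightly differently: it masks the hash only inside the matched quoted identifier, whereas `replace_dataset_id_hash` replaces every occurrence of the hash string in the text. The name `mask_dataset_id_hash` and the sample identifier are hypothetical.

```python
import re

PLACEHOLDER = "*"  # stands in for PLACEHOLDER_CHAR_FOR_INCOMPARABLE_STRINGS
# Same pattern as in the diff: a quoted ID whose last "__"-separated part is the hash.
DATASET_ID_PATTERN = re.compile(r"'[a-zA-Z0-9_]+__[a-zA-Z0-9_]+__(?P<hash>[a-zA-Z0-9_]+)'")


def mask_dataset_id_hash(text: str) -> str:
    """Mask the hash part of each quoted data set ID with same-length placeholders."""

    def _mask(match: "re.Match[str]") -> str:
        whole = match.group(0)
        # Convert the hash group's absolute offsets into offsets within the match.
        start = match.start("hash") - match.start()
        end = match.end("hash") - match.start()
        return whole[:start] + PLACEHOLDER * (end - start) + whole[end:]

    return DATASET_ID_PATTERN.sub(_mask, text)


example = "node_id = 'pfe__bookings_source__abc123'"  # hypothetical ID
masked_id = mask_dataset_id_hash(example)
print(masked_id)
```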


PlanT = TypeVar("PlanT", bound=MetricFlowDag)


def assert_plan_snapshot_text_equal(
    request: FixtureRequest,
    mf_test_configuration: MetricFlowTestConfiguration,
    plan: PlanT,
    plan_snapshot_text: str,
    plan_snapshot_file_extension: str = ".xml",
    exclude_line_regex: Optional[str] = None,
    incomparable_strings_replacement_function: Optional[Callable[[str], str]] = None,
    additional_sub_directories_for_snapshots: Tuple[str, ...] = (),
) -> None:
    """Checks if the given plan text is equal to the one that's saved for comparison.

    * The location of the file is automatically generated based on the test and the plan's ID.
    * This may create a new saved plan file or overwrite the existing one, depending on the configuration.
    * incomparable_strings_replacement_function is used to replace strings in the plan text before comparison. Useful
      for making plans consistent when there are strings that vary between runs and shouldn't be compared.
    * additional_sub_directories_for_snapshots is used to specify additional sub-directories (in the automatically
      generated directory) where plan outputs should reside.

    TODO: Make this more generic by renaming plan -> DAG.
    """
    assert_snapshot_text_equal(
        request=request,
        mf_test_configuration=mf_test_configuration,
        group_id=plan.__class__.__name__,
        snapshot_id=str(plan.dag_id),
        snapshot_text=plan_snapshot_text,
        snapshot_file_extension=plan_snapshot_file_extension,
        exclude_line_regex=exclude_line_regex,
        incomparable_strings_replacement_function=incomparable_strings_replacement_function,
        additional_sub_directories_for_snapshots=additional_sub_directories_for_snapshots,
    )
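
The body of `assert_snapshot_text_equal` is not shown in this diff. The behavior described in the docstring above (create or overwrite the stored file, otherwise compare after replacing incomparable strings and dropping excluded lines) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `check_snapshot`, its parameters, and the file layout are hypothetical and are not the MetricFlow implementation.

```python
import re
import tempfile
from pathlib import Path
from typing import Callable, Optional


def check_snapshot(
    path: Path,
    text: str,
    overwrite: bool = False,
    exclude_line_regex: Optional[str] = None,
    replace: Optional[Callable[[str], str]] = None,
) -> None:
    """Compare text with the stored snapshot, creating or overwriting the file when asked to."""
    if replace is not None:
        text = replace(text)  # mask run-dependent strings before storing or comparing

    if overwrite or not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
        return

    expected = path.read_text()
    if exclude_line_regex is not None:
        pattern = re.compile(exclude_line_regex)

        def drop_excluded(s: str) -> str:
            return "\n".join(line for line in s.splitlines() if not pattern.search(line))

        text, expected = drop_excluded(text), drop_excluded(expected)

    assert text == expected, f"Snapshot mismatch for {path}"


with tempfile.TemporaryDirectory() as tmp_dir:
    snapshot_path = Path(tmp_dir) / "DataflowPlan" / "plan0.xml"
    check_snapshot(snapshot_path, "<plan schema='****'/>")  # first run creates the file
    check_snapshot(snapshot_path, "<plan schema='****'/>")  # second run compares and passes
```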


def assert_linkable_element_set_snapshot_equal(  # noqa: D103
    request: FixtureRequest,
    mf_test_configuration: MetricFlowTestConfiguration,
    set_id: str,
    linkable_element_set: LinkableElementSet,
) -> None:
    headers = ("Semantic Model", "Entity Links", "Name", "Time Granularity", "Date Part", "Properties")
    rows = []
    for linkable_dimension_iterable in linkable_element_set.path_key_to_linkable_dimensions.values():
        for linkable_dimension in linkable_dimension_iterable:
            rows.append(
                (
                    # Checking a limited set of fields as the result is large due to the paths in the object.
                    (
                        linkable_dimension.semantic_model_origin.semantic_model_name
                        if linkable_dimension.semantic_model_origin
                        else None
                    ),
                    tuple(entity_link.element_name for entity_link in linkable_dimension.entity_links),
                    linkable_dimension.element_name,
                    linkable_dimension.time_granularity.name if linkable_dimension.time_granularity is not None else "",
                    linkable_dimension.date_part.name if linkable_dimension.date_part is not None else "",
                    sorted(
                        linkable_element_property.name for linkable_element_property in linkable_dimension.properties
                    ),
                )
            )

    for linkable_entity_iterable in linkable_element_set.path_key_to_linkable_entities.values():
        for linkable_entity in linkable_entity_iterable:
            rows.append(
                (
                    # Checking a limited set of fields as the result is large due to the paths in the object.
                    linkable_entity.semantic_model_origin.semantic_model_name,
                    tuple(entity_link.element_name for entity_link in linkable_entity.entity_links),
                    linkable_entity.element_name,
                    "",
                    "",
                    sorted(linkable_element_property.name for linkable_element_property in linkable_entity.properties),
                )
            )

    for linkable_metric_iterable in linkable_element_set.path_key_to_linkable_metrics.values():
        for linkable_metric in linkable_metric_iterable:
            rows.append(
                (
                    # Checking a limited set of fields as the result is large due to the paths in the object.
                    linkable_metric.join_by_semantic_model.semantic_model_name,
                    # Use the metric's own entity links, not the leftover `linkable_entity` from the loop above.
                    tuple(entity_link.element_name for entity_link in linkable_metric.entity_links),
                    linkable_metric.element_name,
                    "",
                    "",
                    sorted(linkable_element_property.name for linkable_element_property in linkable_metric.properties),
                )
            )

    assert_str_snapshot_equal(
        request=request,
        mf_test_configuration=mf_test_configuration,
        snapshot_id=set_id,
        snapshot_str=tabulate.tabulate(headers=headers, tabular_data=sorted(rows)),
    )
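
Sorting the rows before rendering, as done above, makes the snapshot text independent of the order in which elements were collected. The sketch below shows that idea with the standard library only, assuming simplified three-column rows. `render_rows` and the sample data are hypothetical, and the string-based sort key is an added assumption so that a `None` cell can sit next to string cells (Python 3 cannot order `None` against `str` directly).

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[str], Tuple[str, ...], str]


def render_rows(headers: Sequence[str], rows: Sequence[Row]) -> str:
    """Render rows as text in a deterministic order, regardless of input order."""

    def sort_key(row: Row) -> Tuple[str, ...]:
        # Compare cells as strings so that None and str values can be ordered together.
        return tuple("" if cell is None else str(cell) for cell in row)

    lines: List[str] = [" | ".join(headers)]
    for row in sorted(rows, key=sort_key):
        lines.append(" | ".join("" if cell is None else str(cell) for cell in row))
    return "\n".join(lines)


HEADERS = ("Semantic Model", "Entity Links", "Name")
rows_a: List[Row] = [
    ("listings", ("listing",), "country"),
    (None, (), "metric_time"),
    ("bookings", ("guest",), "is_instant"),
]
rows_b = list(reversed(rows_a))

# The same rows in a different order produce identical snapshot text.
same_output = render_rows(HEADERS, rows_a) == render_rows(HEADERS, rows_b)
print(same_output)
```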


def assert_spec_set_snapshot_equal(  # noqa: D103
    request: FixtureRequest, mf_test_configuration: MetricFlowTestConfiguration, set_id: str, spec_set: InstanceSpecSet
) -> None:
    assert_object_snapshot_equal(
        request=request,
        mf_test_configuration=mf_test_configuration,
        obj_id=set_id,
        obj=sorted(spec.qualified_name for spec in spec_set.all_specs),
    )


def assert_linkable_spec_set_snapshot_equal(  # noqa: D103
    request: FixtureRequest, mf_test_configuration: MetricFlowTestConfiguration, set_id: str, spec_set: LinkableSpecSet
) -> None:
    # TODO: This will be used in a later PR and this message will be removed.
    naming_scheme = ObjectBuilderNamingScheme()
    assert_snapshot_text_equal(
        request=request,
        mf_test_configuration=mf_test_configuration,
        group_id=spec_set.__class__.__name__,
        snapshot_id=set_id,
        snapshot_text=mf_pformat(sorted(naming_scheme.input_str(spec) for spec in spec_set.as_tuple)),
        snapshot_file_extension=".txt",
        additional_sub_directories_for_snapshots=(),
    )
2 changes: 1 addition & 1 deletion tests_metricflow/dataflow/builder/test_cyclic_join.py
@@ -12,12 +12,12 @@
MetricSpec,
)
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from metricflow.dataflow.builder.dataflow_plan_builder import DataflowPlanBuilder
from tests_metricflow.dataflow_plan_to_svg import display_graph_if_requested
from tests_metricflow.fixtures.manifest_fixtures import MetricFlowEngineTestFixture, SemanticManifestSetup
from tests_metricflow.fixtures.sql_client_fixtures import sql_client # noqa: F401, F403
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal

logger = logging.getLogger(__name__)

@@ -21,11 +21,11 @@
TimeDimensionSpec,
)
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from metricflow.dataflow.builder.dataflow_plan_builder import DataflowPlanBuilder
from metricflow.dataset.dataset_classes import DataSet
from tests_metricflow.dataflow_plan_to_svg import display_graph_if_requested
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal
from tests_metricflow.time.metric_time_dimension import MTD_SPEC_DAY, MTD_SPEC_MONTH, MTD_SPEC_QUARTER, MTD_SPEC_WEEK

logger = logging.getLogger(__name__)
2 changes: 1 addition & 1 deletion tests_metricflow/dataflow/builder/test_node_data_set.py
@@ -18,6 +18,7 @@
)
from metricflow_semantics.sql.sql_join_type import SqlJoinType
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_spec_set_snapshot_equal

from metricflow.dataflow.builder.node_data_set import DataflowPlanNodeOutputDataSetResolver
from metricflow.dataflow.nodes.join_to_base import JoinDescription, JoinToBaseOutputNode
@@ -33,7 +34,6 @@
)
from metricflow.sql.sql_table import SqlTable
from tests_metricflow.fixtures.manifest_fixtures import MetricFlowEngineTestFixture, SemanticManifestSetup
from tests_metricflow.snapshot_utils import assert_spec_set_snapshot_equal

logger = logging.getLogger(__name__)

@@ -8,6 +8,7 @@
from metricflow_semantics.dag.mf_dag import DagId
from metricflow_semantics.specs.spec_classes import InstanceSpecSet, MeasureSpec
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from metricflow.dataflow.dataflow_plan import (
BaseOutput,
@@ -21,7 +22,6 @@
)
from tests_metricflow.dataflow_plan_to_svg import display_graph_if_requested
from tests_metricflow.fixtures.manifest_fixtures import MetricFlowEngineTestFixture, SemanticManifestSetup
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal


def make_dataflow_plan(node: BaseOutput) -> DataflowPlan: # noqa: D103
@@ -15,6 +15,7 @@
MetricSpec,
)
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from metricflow.dataflow.builder.dataflow_plan_builder import DataflowPlanBuilder
from metricflow.dataflow.dataflow_plan import (
@@ -43,7 +44,6 @@
from metricflow.dataflow.optimizer.source_scan.source_scan_optimizer import SourceScanOptimizer
from metricflow.dataset.dataset_classes import DataSet
from tests_metricflow.dataflow_plan_to_svg import display_graph_if_requested
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal

logger = logging.getLogger(__name__)

2 changes: 1 addition & 1 deletion tests_metricflow/dataset/test_convert_semantic_model.py
@@ -7,10 +7,10 @@
from _pytest.fixtures import FixtureRequest
from dbt_semantic_interfaces.references import SemanticModelReference
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_spec_set_snapshot_equal

from metricflow.protocols.sql_client import SqlClient
from tests_metricflow.fixtures.manifest_fixtures import MetricFlowEngineTestFixture, SemanticManifestSetup
from tests_metricflow.snapshot_utils import assert_spec_set_snapshot_equal
from tests_metricflow.sql.compare_sql_plan import assert_rendered_sql_equal

logger = logging.getLogger(__name__)
@@ -30,6 +30,7 @@
from metricflow_semantics.sql.sql_bind_parameters import SqlBindParameters
from metricflow_semantics.sql.sql_join_type import SqlJoinType
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from metricflow.dataflow.builder.dataflow_plan_builder import DataflowPlanBuilder
from metricflow.dataflow.dataflow_plan import (
@@ -53,7 +54,6 @@
from metricflow.sql.optimizer.optimization_levels import SqlQueryOptimizationLevel
from tests_metricflow.dataflow_plan_to_svg import display_graph_if_requested
from tests_metricflow.fixtures.manifest_fixtures import MetricFlowEngineTestFixture, SemanticManifestSetup
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal
from tests_metricflow.sql.compare_sql_plan import assert_rendered_sql_from_plan_equal, assert_sql_plan_text_equal
from tests_metricflow.time.metric_time_dimension import MTD_SPEC_DAY

@@ -21,8 +21,10 @@
)
from metricflow_semantics.model.semantics.semantic_model_join_evaluator import MAX_JOIN_HOPS
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration

from tests_metricflow.snapshot_utils import assert_linkable_element_set_snapshot_equal, assert_spec_set_snapshot_equal
from metricflow_semantics.test_helpers.snapshot_helpers import (
assert_linkable_element_set_snapshot_equal,
assert_spec_set_snapshot_equal,
)

logger = logging.getLogger(__name__)

@@ -10,8 +10,11 @@
from metricflow_semantics.model.semantics.metric_lookup import MetricLookup
from metricflow_semantics.model.semantics.semantic_model_lookup import SemanticModelLookup
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import (
assert_linkable_element_set_snapshot_equal,
)

from tests_metricflow.snapshot_utils import assert_linkable_element_set_snapshot_equal, assert_object_snapshot_equal
from tests_metricflow.snapshot_utils import assert_object_snapshot_equal

logger = logging.getLogger(__name__)

@@ -8,9 +8,9 @@
from metricflow_semantics.model.semantic_manifest_lookup import SemanticManifestLookup
from metricflow_semantics.query.group_by_item.resolution_dag.dag import GroupByItemResolutionDag
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_plan_snapshot_text_equal

from tests_metricflow.semantics.query.group_by_item.ambiguous_resolution_query_id import AmbiguousResolutionQueryId
from tests_metricflow.snapshot_utils import assert_plan_snapshot_text_equal

logger = logging.getLogger(__name__)

@@ -10,9 +10,9 @@
from metricflow_semantics.query.group_by_item.resolution_dag.dag import GroupByItemResolutionDag
from metricflow_semantics.specs.spec_classes import LinkableSpecSet
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration
from metricflow_semantics.test_helpers.snapshot_helpers import assert_linkable_spec_set_snapshot_equal

from tests_metricflow.semantics.query.group_by_item.conftest import AmbiguousResolutionQueryId
from tests_metricflow.snapshot_utils import assert_linkable_spec_set_snapshot_equal

logger = logging.getLogger(__name__)
