Initial implementation of cross-project ref #7276

Merged
merged 58 commits into from
May 3, 2023
Commits
0daea9f Create publication.py, various Publication classes, Dependency class (gshank, Apr 12, 2023)
6d511eb fix merge error (gshank, Apr 12, 2023)
7ffadaf Fix test (gshank, Apr 12, 2023)
8ba476a Move test_publication into multi_project dir (gshank, Apr 13, 2023)
9add2a6 Load dependencies.yml and the corresponding publication file (gshank, Apr 13, 2023)
77d2a5b use is_latest_version (gshank, Apr 14, 2023)
6bb6e57 Add version to PublicModel for now (gshank, Apr 14, 2023)
8dcc2b5 change test to use parse and get manifest (gshank, Apr 14, 2023)
ede647a Add unique_id and a couple of properties to PublicModel (gshank, Apr 14, 2023)
86045ca add "name" and "package_name" to PublicModel, some changes to ref_lookup (gshank, Apr 14, 2023)
8009349 Add "external_nodes" and populate ref_lookup (gshank, Apr 14, 2023)
6a9c06d resolve_ref working (gshank, Apr 14, 2023)
f8a7bfd add "external_nodes" dictionary, save external ref unique_ids to it (gshank, Apr 14, 2023)
22d6e85 Merge branch 'main' into ct-2327-model_publication (gshank, Apr 17, 2023)
babaf6a rename external_nodes to public_nodes (gshank, Apr 17, 2023)
bd652a7 relation from relation_name. Not tested. (gshank, Apr 17, 2023)
c96c2a2 Test working with public_node (gshank, Apr 17, 2023)
17542fb Add public nodes to parent and child maps (gshank, Apr 17, 2023)
c3c2940 Bump manifest version and fix tests, use ModelDependsOn (gshank, Apr 17, 2023)
fb12f7c re-comment line in test_previous_version_state.py (gshank, Apr 18, 2023)
22cb844 Merge branch 'main' into ct-2327-model_publication (gshank, Apr 18, 2023)
0dd963a Fix refs to manifest.json version, typo in Dependencies (gshank, Apr 18, 2023)
f7202ae Split out PublicationArtifact and PublicationConfig, store public_models (gshank, Apr 18, 2023)
5bd828e Store dependencies in publication artifact (gshank, Apr 19, 2023)
a43327d Remove ModelDependsOn (gshank, Apr 19, 2023)
942f712 Fix is_latest_version and add some comments (gshank, Apr 19, 2023)
3054aed change detection of PublicModel for >= python3.10 (gshank, Apr 20, 2023)
fcd545b Changie (gshank, Apr 20, 2023)
a5429cc Merge branch 'main' into ct-2327-model_publication (gshank, Apr 20, 2023)
c4db8c0 Fix another isinstance of public model (gshank, Apr 20, 2023)
3f54bb0 Update another instance check of PublicModel (gshank, Apr 21, 2023)
e9d5a4c Handle removing references for re-processing if publication has changed (gshank, Apr 21, 2023)
9e2284c Handle only changed publication artifacts (gshank, Apr 24, 2023)
34b813a Add another test (gshank, Apr 24, 2023)
719ab9a Add some logging events (gshank, Apr 24, 2023)
41364be Merge branch 'main' into ct-2327-model_publication (gshank, Apr 24, 2023)
bac92ea Remove duplicate nodes from manifest (gshank, Apr 24, 2023)
dac9fa8 Merge branch 'main' into ct-2327-model_publication (gshank, Apr 25, 2023)
2e938e3 refactor relation_from_relation_name (gshank, Apr 25, 2023)
6b01773 Some comments and minor cleanup (gshank, Apr 25, 2023)
f286af2 get quote character from class, quoting from publication artifact (gshank, Apr 25, 2023)
4fdfb2b cleanup depends_on_public_nodes; move call to rebuild_ref_lookup; tweak (gshank, Apr 26, 2023)
0425c3b Remove duplicate writing of manifest.json (gshank, Apr 26, 2023)
acfae95 Add public_nodes to flat_graph (gshank, Apr 26, 2023)
3254154 Rename some dependencies (gshank, Apr 26, 2023)
2aad937 Fix test_manifest.py (gshank, Apr 26, 2023)
fce5c07 Move some file name constants to core/dbt/constants.py (gshank, Apr 27, 2023)
a256083 Remove "environment" from ProjectDependency. Add (gshank, Apr 27, 2023)
1589c75 Include external publication dependencies in publication artifact dep… (gshank, Apr 27, 2023)
97f1fc9 Remove create_from_relation_name, call create_from_node instead (gshank, Apr 27, 2023)
3fc17fe Merge branch 'main' into ct-2327-model_publication (gshank, Apr 27, 2023)
b81157f Change PublicationArtifactChanged message to debug level (gshank, Apr 27, 2023)
552b623 Make write_publication_artifact a function in parser/manifest.py (gshank, Apr 27, 2023)
6369410 Merge branch 'main' into ct-2327-model_publication (gshank, Apr 28, 2023)
f705d26 Code review cleanup (gshank, May 1, 2023)
f12e9b6 Merge branch 'main' into ct-2327-model_publication (gshank, May 1, 2023)
a0611a4 Create fixture to create minimal alternate project (just models) (gshank, May 2, 2023)
7471b2f develop multi project test case (gshank, May 3, 2023)
add "name" and "package_name" to PublicModel, some changes to ref_lookup
gshank committed Apr 14, 2023
commit 86045ca67a7a8400cd77e3af2564095ed17be86c
40 changes: 28 additions & 12 deletions core/dbt/contracts/graph/manifest.py
@@ -22,7 +22,7 @@
 from typing_extensions import Protocol
 from uuid import UUID

-from dbt.contracts.publication import Dependencies, Publication
+from dbt.contracts.publication import Dependencies, Publication, PublicModel

 from dbt.contracts.graph.nodes import (
     Macro,
@@ -37,6 +37,7 @@
     GraphMemberNode,
     ResultNode,
     BaseNode,
+    ManifestOrPublicNode,
 )
 from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion
 from dbt.contracts.graph.manifest_upgrade import upgrade_manifest_json
@@ -156,6 +157,7 @@ class RefableLookup(dbtClassMixin):
     def __init__(self, manifest: "Manifest"):
         self.storage: Dict[str, Dict[PackageName, UniqueID]] = {}
         self.populate(manifest)
+        self.populate_public_nodes(manifest)

     def get_unique_id(self, key, package: Optional[PackageName], version: Optional[NodeVersion]):
         if version:
@@ -174,7 +176,7 @@ def find(
             return self.perform_lookup(unique_id, manifest)
         return None

-    def add_node(self, node: ManifestNode):
+    def add_node(self, node: ManifestOrPublicNode):
         if node.resource_type in self._lookup_types:
             if node.name not in self.storage:
                 self.storage[node.name] = {}
@@ -192,12 +194,21 @@ def populate(self, manifest):
         for node in manifest.nodes.values():
             self.add_node(node)

-    def perform_lookup(self, unique_id: UniqueID, manifest) -> ManifestNode:
-        if unique_id not in manifest.nodes:
+    def populate_public_nodes(self, manifest):
+        for external_package in manifest.publications.values():
+            for node in external_package.public_models:
+                self.add_node(node)
+
+    def perform_lookup(self, unique_id: UniqueID, manifest) -> ManifestOrPublicNode:
+        if unique_id in manifest.nodes:
+            node = manifest.nodes[unique_id]
+        if unique_id in manifest.external_nodes:
+            node = manifest.external_nodes[unique_id]
+        if not node:
             raise dbt.exceptions.DbtInternalError(
                 f"Node {unique_id} found in cache but not found in manifest"
             )
-        return manifest.nodes[unique_id]
+        return node


 class MetricLookup(dbtClassMixin):
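Review note on `perform_lookup`: in this commit `node` is only assigned inside the two `if` branches, so a `unique_id` that is in neither mapping would reach `if not node` with the name unbound and raise `UnboundLocalError` rather than the intended internal error. A minimal sketch of one way to avoid that follows. It uses plain dicts as stand-ins for `manifest.nodes` and `manifest.external_nodes`, and the built-in `LookupError` as a placeholder for `DbtInternalError`; both are assumptions for illustration, not the dbt API. Unlike the diff, this sketch prefers the local node when an id appears in both mappings.

```python
from typing import Any, Dict, Optional


def perform_lookup(
    unique_id: str,
    nodes: Dict[str, Any],
    external_nodes: Dict[str, Any],
) -> Any:
    # dict.get returns None on a miss, so `node` is always bound.
    node: Optional[Any] = nodes.get(unique_id)
    if node is None:
        node = external_nodes.get(unique_id)
    if node is None:
        # Placeholder for dbt.exceptions.DbtInternalError in the real code.
        raise LookupError(f"Node {unique_id} found in cache but not found in manifest")
    return node
```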
@@ -279,7 +290,7 @@ class AnalysisLookup(RefableLookup):
     _versioned_types: ClassVar[set] = set()


-def _search_packages(
+def _packages_to_search(
     current_project: str,
     node_package: str,
     target_package: Optional[str] = None,
@@ -637,6 +648,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
     env_vars: MutableMapping[str, str] = field(default_factory=dict)
     dependencies: Optional[Dependencies] = None
     publications: MutableMapping[str, Publication] = field(default_factory=dict)
+    external_nodes: MutableMapping[str, PublicModel] = field(default_factory=dict)

     _doc_lookup: Optional[DocLookup] = field(
         default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
@@ -921,7 +933,7 @@ def analysis_lookup(self) -> AnalysisLookup:
             self._analysis_lookup = AnalysisLookup(self)
         return self._analysis_lookup

-    # Called by dbt.parser.manifest._resolve_refs_for_exposure
+    # Called by dbt.parser.manifest._process_refs_for_exposure, _process_refs_for_metric,
     # and dbt.parser.manifest._process_refs_for_node
     def resolve_ref(
         self,
@@ -935,11 +947,15 @@ def resolve_ref(
         node: Optional[ManifestNode] = None
         disabled: Optional[List[ManifestNode]] = None

-        candidates = _search_packages(current_project, node_package, target_model_package)
+        candidates = _packages_to_search(current_project, node_package, target_model_package)
         for pkg in candidates:
             node = self.ref_lookup.find(target_model_name, pkg, target_model_version, self)

-            if node is not None and node.config.enabled:
+            if (
+                node is not None
+                and (hasattr(node, "config") and node.config.enabled)
+                or isinstance(node, PublicModel)
+            ):
                 return node

             # it's possible that the node is disabled
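Review note on the new condition in `resolve_ref`: Python binds `and` tighter than `or`, so the expression reads as `(node is not None and <enabled>) or isinstance(node, PublicModel)`. That still behaves as intended because `isinstance(None, PublicModel)` is false, but the grouping is easy to misread. The sketch below spells out the same decision as early returns. `Config`, `LocalNode`, `PublicModel`, and `is_resolvable` are hypothetical stand-ins defined here for illustration, not the dbt classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Config:
    enabled: bool = True


@dataclass
class LocalNode:  # hypothetical stand-in for a ManifestNode
    name: str
    config: Config = field(default_factory=Config)


@dataclass
class PublicModel:  # hypothetical stand-in; it carries no `config`
    name: str


def is_resolvable(node: Any) -> bool:
    """Return True when a looked-up node may be returned from a ref."""
    if node is None:
        return False
    if isinstance(node, PublicModel):
        # Public models from another project have no enabled/disabled state.
        return True
    config = getattr(node, "config", None)
    return config is not None and bool(config.enabled)
```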
@@ -960,7 +976,7 @@ def resolve_source(
         node_package: str,
     ) -> MaybeParsedSource:
         search_name = f"{target_source_name}.{target_table_name}"
-        candidates = _search_packages(current_project, node_package)
+        candidates = _packages_to_search(current_project, node_package)

         source: Optional[SourceDefinition] = None
         disabled: Optional[List[SourceDefinition]] = None
@@ -990,7 +1006,7 @@ def resolve_metric(
         metric: Optional[Metric] = None
         disabled: Optional[List[Metric]] = None

-        candidates = _search_packages(current_project, node_package, target_metric_package)
+        candidates = _packages_to_search(current_project, node_package, target_metric_package)
         for pkg in candidates:
             metric = self.metric_lookup.find(target_metric_name, pkg, self)

@@ -1016,7 +1032,7 @@ def resolve_doc(
         resolve_ref except the is_enabled checks are unnecessary as docs are
         always enabled.
         """
-        candidates = _search_packages(current_project, node_package, package)
+        candidates = _packages_to_search(current_project, node_package, package)

         for pkg in candidates:
             result = self.doc_lookup.find(name, pkg, self)
30 changes: 30 additions & 0 deletions core/dbt/contracts/graph/nodes.py
@@ -62,6 +62,12 @@
     EmptySnapshotConfig,
     SnapshotConfig,
 )
+import sys
+
+if sys.version_info >= (3, 8):
+    from typing import Protocol
+else:
+    from typing_extensions import Protocol


 # =====================================================================
@@ -1334,6 +1340,30 @@ class ParsedMacroPatch(ParsedPatch):
 # ====================================


+class ManifestOrPublicNode(Protocol):
+    name: str
+    package_name: str
+    unique_id: str
+    version: Optional[NodeVersion]
+    relation_name: str
+
+    @property
+    def is_latest_version(self):
+        pass
+
+    @property
+    def resource_type(self):
+        pass
+
+    @property
+    def access(self):
+        pass
+
+    @property
+    def search_name(self):
+        pass
+
+
 # ManifestNode without SeedNode, which doesn't have the
 # SQL related attributes
 ManifestSQLNode = Union[
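For readers less familiar with `typing.Protocol`: the point of `ManifestOrPublicNode` is that `RefableLookup.add_node` can accept any object exposing the listed attributes, whether it is a parsed manifest node or a `PublicModel`. A reduced sketch of that idea follows. `Refable`, `LocalModel`, `PublishedModel`, and the free function `add_node` are hypothetical names for illustration, and the `name -> package -> unique_id` layout is an assumption based on the `Dict[str, Dict[PackageName, UniqueID]]` annotation in the diff above.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class Refable(Protocol):  # hypothetical, reduced version of ManifestOrPublicNode
    name: str
    package_name: str
    unique_id: str


@dataclass
class LocalModel:  # does not inherit from Refable; it matches structurally
    name: str
    package_name: str
    unique_id: str


@dataclass
class PublishedModel:  # extra fields do not prevent a structural match
    name: str
    package_name: str
    unique_id: str
    relation_name: str


def add_node(storage: Dict[str, Dict[str, str]], node: Refable) -> None:
    # Index by model name first, then by owning package.
    storage.setdefault(node.name, {})[node.package_name] = node.unique_id
```

A static type checker accepts both classes as `Refable` without any shared base class. The PR additionally makes `PublicModel` inherit from the protocol explicitly, which is allowed but not required for structural matching.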
12 changes: 11 additions & 1 deletion core/dbt/contracts/publication.py
@@ -5,6 +5,7 @@

 from dbt.contracts.util import BaseArtifactMetadata, ArtifactMixin, schema_version
 from dbt.contracts.graph.unparsed import NodeVersion
+from dbt.contracts.graph.nodes import ManifestOrPublicNode
 from dbt.node_types import NodeType, AccessType


@@ -27,7 +28,9 @@ class PublicationMetadata(BaseArtifactMetadata):


 @dataclass
-class PublicModel(dbtClassMixin):
+class PublicModel(dbtClassMixin, ManifestOrPublicNode):
+    name: str
+    package_name: str
     unique_id: str
     relation_name: str
     version: Optional[NodeVersion] = None  # It's not totally clear if we actually need this
@@ -45,6 +48,13 @@ def resource_type(self):
     def access(self):
         return AccessType.Public

+    @property
+    def search_name(self):
+        if self.version is None:
+            return self.name
+        else:
+            return f"{self.name}.v{self.version}"
+

 @dataclass
 class PublicationMandatory:
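To make the new `search_name` property concrete, here is a small standalone sketch of the same rule: an unversioned model is looked up by its bare name, and a versioned one by `name.v<version>`. `VersionedName` is a hypothetical stand-in for `PublicModel`, and typing `version` as a string or number is an assumption about `NodeVersion`.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class VersionedName:  # hypothetical stand-in for PublicModel
    name: str
    version: Optional[Union[str, float]] = None

    @property
    def search_name(self) -> str:
        # Unversioned models keep their plain name.
        if self.version is None:
            return self.name
        # Versioned models get a ".v<version>" suffix.
        return f"{self.name}.v{self.version}"
```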
5 changes: 5 additions & 0 deletions core/dbt/parser/manifest.py
@@ -650,6 +650,8 @@ def write_artifacts(self):
             public_dependencies = parents.intersection(set_of_public_unique_ids)

             public_model = PublicModel(
+                name=model.name,
+                package_name=model.package_name,
                 unique_id=model.unique_id,
                 relation_name=model.relation_name,
                 version=model.version,
@@ -693,6 +695,9 @@ def build_dependencies(self):
                     pub_dict = load_yaml_text(contents)
                     pub_obj = Publication.from_dict(pub_dict)
                     self.manifest.publications[project.name] = pub_obj
+                    # Add to dictionary of external_nodes
+                    for external_node in pub_obj.public_models.values():
+                        self.manifest.external_nodes[external_node.unique_id] = external_node
                 else:
                     raise PublicationConfigNotFound(
                         project=project.name, file_name=publication_file_name
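Review note: `build_dependencies` iterates `pub_obj.public_models.values()`, while `populate_public_nodes` in `core/dbt/contracts/graph/manifest.py` iterates `external_package.public_models` directly. If `public_models` is a mapping keyed by `unique_id`, as the test fixture suggests, iterating it directly yields the string keys rather than the model objects. The sketch below shows the `.values()` form on simplified stand-ins. `PublicModel`, `Publication`, and `collect_external_nodes` here are hypothetical and reduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PublicModel:  # hypothetical, reduced stand-in
    name: str
    package_name: str
    unique_id: str


@dataclass
class Publication:  # hypothetical, reduced stand-in
    public_models: Dict[str, PublicModel] = field(default_factory=dict)


def collect_external_nodes(publications: Dict[str, Publication]) -> Dict[str, PublicModel]:
    """Flatten every publication's public models into one unique_id index."""
    external: Dict[str, PublicModel] = {}
    for publication in publications.values():
        # .values() yields the model objects; iterating the dict yields only keys.
        for model in publication.public_models.values():
            external[model.unique_id] = model
    return external
```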
4 changes: 4 additions & 0 deletions tests/functional/multi_project/test_publication.py
@@ -55,13 +55,17 @@
     },
     "public_models": {
         "model.marketing.fct_one": {
+            "name": "fct_one",
+            "package_name": "marketing",
             "unique_id": "model.marketing.fct_one",
             "relation_name": '"dbt"."test_schema"."fct_one"',
             "version": null,
             "is_latest_version": false,
             "public_dependencies": []
         },
         "model.marketing.fct_two": {
+            "name": "fct_two",
+            "package_name": "marketing",
             "unique_id": "model.marketing.fct_two",
             "relation_name": '"dbt"."test_schema"."fct_two"',
             "version": null,