Merge branch 'main' into main
colin-rogers-dbt authored Oct 11, 2023
2 parents f04993f + cd1783a commit 389fc73
Showing 24 changed files with 213 additions and 11 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.7.0b1
current_version = 1.7.0b2
parse = (?P<major>[\d]+) # major version number
\.(?P<minor>[\d]+) # minor version number
\.(?P<patch>[\d]+) # patch version number
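The `parse` pattern above splits a version string such as `1.7.0b2` into named parts. A minimal Python sketch of the same idea follows. The `major`, `minor` and `patch` groups mirror the lines shown in this hunk; the optional pre-release suffix (`prekind`, `num`) and the `parse_version` helper are assumptions added for illustration, because the rest of the pattern is collapsed in this diff.

```python
import re

# The first three groups mirror the `parse` lines shown in the hunk above.
# The optional pre-release suffix is hypothetical: the remainder of the
# real pattern is not visible in this diff.
VERSION_RE = re.compile(
    r"""
    (?P<major>[\d]+)          # major version number
    \.(?P<minor>[\d]+)        # minor version number
    \.(?P<patch>[\d]+)        # patch version number
    (?:(?P<prekind>a|b|rc)    # assumed: pre-release kind
       (?P<num>[\d]+))?       # assumed: pre-release number
    $
    """,
    re.VERBOSE,
)


def parse_version(text: str) -> dict:
    """Return the named parts of a version string, or raise ValueError."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognised version: {text!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_version("1.7.0b1"))
    print(parse_version("1.7.0b2"))
```

Under these assumptions, the bump in this commit changes only the pre-release number, while the major, minor and patch parts stay the same.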
27 changes: 27 additions & 0 deletions .changes/1.7.0-b2.md
@@ -0,0 +1,27 @@
## dbt-spark 1.7.0-b2 - October 02, 2023

### Features

- Persist Column level comments when creating views ([#372](https://github.com/dbt-labs/dbt-spark/issues/372))

### Under the Hood

- Remove dependency on hologram ([#881](https://github.com/dbt-labs/dbt-spark/issues/881))

### Dependencies

- Replace sasl with pure-sasl for PyHive ([#818](https://github.com/dbt-labs/dbt-spark/pull/818))
- Update tox requirement from ~=4.8 to ~=4.9 ([#874](https://github.com/dbt-labs/dbt-spark/pull/874))
- Bump mypy from 1.5.0 to 1.5.1 ([#875](https://github.com/dbt-labs/dbt-spark/pull/875))
- Update tox requirement from ~=4.9 to ~=4.10 ([#879](https://github.com/dbt-labs/dbt-spark/pull/879))
- Update pre-commit requirement from ~=3.3 to ~=3.4 ([#884](https://github.com/dbt-labs/dbt-spark/pull/884))
- Update black requirement from ~=23.7 to ~=23.9 ([#886](https://github.com/dbt-labs/dbt-spark/pull/886))
- Update tox requirement from ~=4.10 to ~=4.11 ([#887](https://github.com/dbt-labs/dbt-spark/pull/887))

### Security

- Add docker image to the repo ([#876](https://github.com/dbt-labs/dbt-spark/pull/876))

### Contributors
- [@Fokko](https://github.com/Fokko) ([#876](https://github.com/dbt-labs/dbt-spark/pull/876))
- [@jurasan](https://github.com/jurasan) ([#372](https://github.com/dbt-labs/dbt-spark/issues/372))
6 changes: 6 additions & 0 deletions .changes/1.7.0/Features-20230817-130731.yaml
@@ -0,0 +1,6 @@
kind: Features
body: Persist Column level comments when creating views
time: 2023-08-17T13:07:31.6812862Z
custom:
Author: jurasan
Issue: 372
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230921-180958.yaml
@@ -0,0 +1,6 @@
kind: Features
body: Support storing test failures as views
time: 2023-09-21T18:09:58.174136-04:00
custom:
Author: mikealfare
Issue: "6914"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231011-094718.yaml
@@ -0,0 +1,6 @@
kind: Features
body: Create temporary views with 'or replace'
time: 2023-10-11T09:47:18.485764-07:00
custom:
Author: annazizian
Issue: "350"
30 changes: 29 additions & 1 deletion CHANGELOG.md
@@ -5,6 +5,35 @@
- "Breaking changes" listed under a version may require action from end users or external maintainers when upgrading to that version.
- Do not edit this file directly. This file is auto-generated using [changie](https://github.com/miniscruff/changie). For details on how to document a change, see [the contributing guide](https://github.com/dbt-labs/dbt-spark/blob/main/CONTRIBUTING.md#adding-changelog-entry)

## dbt-spark 1.7.0-b2 - October 02, 2023

### Features

- Persist Column level comments when creating views ([#372](https://github.com/dbt-labs/dbt-spark/issues/372))

### Under the Hood

- Remove dependency on hologram ([#881](https://github.com/dbt-labs/dbt-spark/issues/881))

### Dependencies

- Replace sasl with pure-sasl for PyHive ([#818](https://github.com/dbt-labs/dbt-spark/pull/818))
- Update tox requirement from ~=4.8 to ~=4.9 ([#874](https://github.com/dbt-labs/dbt-spark/pull/874))
- Bump mypy from 1.5.0 to 1.5.1 ([#875](https://github.com/dbt-labs/dbt-spark/pull/875))
- Update tox requirement from ~=4.9 to ~=4.10 ([#879](https://github.com/dbt-labs/dbt-spark/pull/879))
- Update pre-commit requirement from ~=3.3 to ~=3.4 ([#884](https://github.com/dbt-labs/dbt-spark/pull/884))
- Update black requirement from ~=23.7 to ~=23.9 ([#886](https://github.com/dbt-labs/dbt-spark/pull/886))
- Update tox requirement from ~=4.10 to ~=4.11 ([#887](https://github.com/dbt-labs/dbt-spark/pull/887))

### Security

- Add docker image to the repo ([#876](https://github.com/dbt-labs/dbt-spark/pull/876))

### Contributors
- [@Fokko](https://github.com/Fokko) ([#876](https://github.com/dbt-labs/dbt-spark/pull/876))
- [@jurasan](https://github.com/jurasan) ([#372](https://github.com/dbt-labs/dbt-spark/issues/372))


## dbt-spark 1.7.0-b1 - August 17, 2023

### Features
@@ -53,7 +82,6 @@
- [@etheleon](https://github.com/etheleon) ([#865](https://github.com/dbt-labs/dbt-spark/issues/865))
- [@hanna-liashchuk](https://github.com/hanna-liashchuk) ([#387](https://github.com/dbt-labs/dbt-spark/issues/387))


## Previous Releases
For information on prior major and minor releases, see their changelogs:
- [1.6](https://github.com/dbt-labs/dbt-spark/blob/1.6.latest/CHANGELOG.md)
1 change: 1 addition & 0 deletions Makefile
@@ -9,6 +9,7 @@ dev: ## Installs adapter in develop mode along with development dependencies
dev-uninstall: ## Uninstalls all packages while maintaining the virtual environment
## Useful when updating versions, or if you accidentally installed into the system interpreter
pip freeze | grep -v "^-e" | cut -d "@" -f1 | xargs pip uninstall -y
pip uninstall -y dbt-spark
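One plausible reading of the added line: the existing pipeline drops every `pip freeze` line that starts with `-e`, so an adapter installed in develop (editable) mode would never reach `pip uninstall`. The Python sketch below mimics the `grep -v "^-e" | cut -d "@" -f1` filtering to make that visible. The function name and the sample freeze output are hypothetical, and the explanation itself is an inference from the diff rather than something the commit states.

```python
from typing import List


def packages_from_freeze(freeze_output: str) -> List[str]:
    """Mimic `grep -v "^-e" | cut -d "@" -f1` on `pip freeze` output."""
    names = []
    for line in freeze_output.splitlines():
        if line.startswith("-e"):
            # Editable installs are filtered out, like `grep -v "^-e"`.
            continue
        # Keep the text before the first "@", like `cut -d "@" -f1`.
        # Stripping stands in for xargs splitting on whitespace.
        name = line.split("@", 1)[0].strip()
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical freeze output, for illustration only.
    sample = "\n".join(
        [
            "-e git+https://example.invalid/repo.git#egg=dbt-spark",
            "agate==1.7.1",
            "some-package @ file:///tmp/some-package",
        ]
    )
    print(packages_from_freeze(sample))
```

In this sketch the editable `dbt-spark` entry is skipped, which is consistent with the diff adding an explicit `pip uninstall -y dbt-spark` step.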

.PHONY: mypy
mypy: ## Runs mypy against staged changes for static type checking.
12 changes: 7 additions & 5 deletions README.md
@@ -26,18 +26,20 @@ more information, consult [the docs](https://docs.getdbt.com/docs/profile-spark)

## Running locally
A `docker-compose` environment starts a Spark Thrift server and a Postgres database as a Hive Metastore backend.
Note: dbt-spark now supports Spark 3.1.1 (formerly on Spark 2.x).
Note: dbt-spark now supports Spark 3.3.2.

The following command would start two docker containers
```
The following command starts two docker containers:

```sh
docker-compose up -d
```

It will take a bit of time for the instance to start, you can check the logs of the two containers.
If the instance doesn't start correctly, try the complete reset command listed below and then try start again.

Create a profile like this one:

```
```yaml
spark_testing:
target: local
outputs:
@@ -60,7 +62,7 @@ Connecting to the local spark instance:

Note that the Hive metastore data is persisted under `./.hive-metastore/`, and the Spark-produced data under `./.spark-warehouse/`. To completely reset you environment run the following:

```
```sh
docker-compose down
rm -rf ./.hive-metastore/
rm -rf ./.spark-warehouse/
2 changes: 1 addition & 1 deletion dbt/adapters/spark/__version__.py
@@ -1 +1 @@
version = "1.7.0b1"
version = "1.7.0b2"
4 changes: 3 additions & 1 deletion dbt/adapters/spark/impl.py
@@ -347,7 +347,9 @@ def _get_columns_for_catalog(self, relation: BaseRelation) -> Iterable[Dict[str,
as_dict["table_database"] = None
yield as_dict

def get_catalog(self, manifest: Manifest) -> Tuple[agate.Table, List[Exception]]:
def get_catalog(
self, manifest: Manifest, selected_nodes: Optional[Set] = None
) -> Tuple[agate.Table, List[Exception]]:
schema_map = self._get_catalog_schemas(manifest)
if len(schema_map) > 1:
raise dbt.exceptions.CompilationError(
23 changes: 22 additions & 1 deletion dbt/include/spark/macros/adapters.sql
@@ -138,7 +138,7 @@

{#-- We can't use temporary tables with `create ... as ()` syntax --#}
{% macro spark__create_temporary_view(relation, compiled_code) -%}
create temporary view {{ relation }} as
create or replace temporary view {{ relation }} as
{{ compiled_code }}
{%- endmacro -%}

@@ -229,9 +229,30 @@
{% endfor %}
{% endmacro %}

{% macro get_column_comment_sql(column_name, column_dict) -%}
{% if column_name in column_dict and column_dict[column_name]["description"] -%}
{% set escaped_description = column_dict[column_name]["description"] | replace("'", "\\'") %}
{% set column_comment_clause = "comment '" ~ escaped_description ~ "'" %}
{%- endif -%}
{{ adapter.quote(column_name) }} {{ column_comment_clause }}
{% endmacro %}

{% macro get_persist_docs_column_list(model_columns, query_columns) %}
{% for column_name in query_columns %}
{{ get_column_comment_sql(column_name, model_columns) }}
{{- ", " if not loop.last else "" }}
{% endfor %}
{% endmacro %}

{% macro spark__create_view_as(relation, sql) -%}
create or replace view {{ relation }}
{% if config.persist_column_docs() -%}
{% set model_columns = model.columns %}
{% set query_columns = get_columns_in_query(sql) %}
(
{{ get_persist_docs_column_list(model_columns, query_columns) }}
)
{% endif %}
{{ comment_clause() }}
{%- set contract_config = config.get('contract') -%}
{%- if contract_config.enforced -%}
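The two new macros build the parenthesised column list for `create or replace view`: each column is quoted, and a `comment '...'` clause is appended only when the model defines a non-empty description, with single quotes escaped. A rough Python rendering of that logic is sketched below. The function names are illustrative, and the backtick quoting is an assumption standing in for `adapter.quote`.

```python
from typing import Dict, List


def column_comment_sql(column_name: str, model_columns: Dict[str, dict]) -> str:
    """Render one column, adding a comment clause when a description exists."""
    # Assumption: backticks stand in for whatever adapter.quote() returns.
    quoted = f"`{column_name}`"
    description = model_columns.get(column_name, {}).get("description")
    if not description:
        # Columns without a description are listed without a comment.
        return quoted
    # Escape single quotes so the description stays inside the SQL literal.
    escaped = description.replace("'", "\\'")
    return f"{quoted} comment '{escaped}'"


def persist_docs_column_list(
    model_columns: Dict[str, dict], query_columns: List[str]
) -> str:
    """Join the rendered columns in query order, separated by commas."""
    return ", ".join(
        column_comment_sql(name, model_columns) for name in query_columns
    )


if __name__ == "__main__":
    columns = {"id": {"description": "the row's id"}}
    print(persist_docs_column_list(columns, ["id", "count"]))
```

As in the macro, the list is driven by the columns the query actually returns, so a column that is absent from the model's YAML still appears in the view definition, only without a comment.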
2 changes: 1 addition & 1 deletion setup.py
@@ -49,7 +49,7 @@ def _get_dbt_core_version():


package_name = "dbt-spark"
package_version = "1.7.0b1"
package_version = "1.7.0b2"
dbt_core_version = _get_dbt_core_version()
description = """The Apache Spark adapter plugin for dbt"""

28 changes: 28 additions & 0 deletions tests/functional/adapter/persist_docs/fixtures.py
@@ -21,11 +21,39 @@
select 1 as id, 'Joe' as name
"""

_MODELS__VIEW_DELTA_MODEL = """
{{ config(materialized='view') }}
select id, count(*) as count from {{ ref('table_delta_model') }} group by id
"""

_MODELS__TABLE_DELTA_MODEL_MISSING_COLUMN = """
{{ config(materialized='table', file_format='delta') }}
select 1 as id, 'Joe' as different_name
"""
_VIEW_PROPERTIES_MODELS = """
version: 2
models:
- name: view_delta_model
description: |
View model description "with double quotes"
and with 'single quotes' as welll as other;
'''abc123'''
reserved -- characters
--
/* comment */
Some $lbl$ labeled $lbl$ and $$ unlabeled $$ dollar-quoting
columns:
- name: id
description: |
id Column description "with double quotes"
and with 'single quotes' as welll as other;
'''abc123'''
reserved -- characters
--
/* comment */
Some $lbl$ labeled $lbl$ and $$ unlabeled $$ dollar-quoting
"""
_PROPERTIES__MODELS = """
version: 2
44 changes: 44 additions & 0 deletions tests/functional/adapter/persist_docs/test_persist_docs.py
@@ -10,6 +10,8 @@
_PROPERTIES__MODELS,
_PROPERTIES__SEEDS,
_SEEDS__BASIC,
_MODELS__VIEW_DELTA_MODEL,
_VIEW_PROPERTIES_MODELS,
)


@@ -76,6 +78,48 @@ def test_delta_comments(self, project):
assert result[2].startswith("Some stuff here and then a call to")


@pytest.mark.skip_profile("apache_spark", "spark_session")
class TestPersistDocsDeltaView:
@pytest.fixture(scope="class")
def models(self):
return {
"table_delta_model.sql": _MODELS__TABLE_DELTA_MODEL,
"view_delta_model.sql": _MODELS__VIEW_DELTA_MODEL,
"schema.yml": _VIEW_PROPERTIES_MODELS,
}

@pytest.fixture(scope="class")
def project_config_update(self):
return {
"models": {
"test": {
"+persist_docs": {
"relation": True,
"columns": True,
},
}
},
}

def test_delta_comments(self, project):
run_dbt(["run"])

results = project.run_sql(
"describe extended {schema}.{table}".format(
schema=project.test_schema, table="view_delta_model"
),
fetch="all",
)

for result in results:
if result[0] == "Comment":
assert result[1].startswith("View model description")
if result[0] == "id":
assert result[2].startswith("id Column description")
if result[0] == "count":
assert result[2] is None
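The test above walks the rows returned by `describe extended` and reads the relation comment from the second field of the `Comment` row and each column comment from the third field of the row named after the column. The standalone sketch below isolates that lookup. The `collect_comments` helper and the sample rows are hypothetical, shaped only after the indices the test uses.

```python
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple


def collect_comments(
    rows: Iterable[Sequence[Optional[str]]], column_names: Set[str]
) -> Tuple[Optional[str], Dict[str, Optional[str]]]:
    """Split `describe extended` rows into relation and column comments."""
    relation_comment = None
    column_comments: Dict[str, Optional[str]] = {}
    for row in rows:
        key = row[0]
        if key == "Comment":
            # Relation-level rows keep their value in the second field.
            relation_comment = row[1]
        elif key in column_names:
            # Column rows keep their comment in the third field.
            column_comments[key] = row[2]
    return relation_comment, column_comments


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    sample_rows = [
        ("id", "int", "id Column description"),
        ("count", "bigint", None),
        ("Comment", "View model description", ""),
    ]
    print(collect_comments(sample_rows, {"id", "count"}))
```

Collecting the values first and asserting afterwards would also make a missing `Comment` row fail loudly, whereas the loop in the test passes silently if a row never appears.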


@pytest.mark.skip_profile("apache_spark", "spark_session")
class TestPersistDocsMissingColumn:
@pytest.fixture(scope="class")
@@ -1,5 +1,6 @@
import pytest

from dbt.tests.adapter.store_test_failures_tests import basic
from dbt.tests.adapter.store_test_failures_tests.test_store_test_failures import (
StoreTestFailuresBase,
TEST_AUDIT_SCHEMA_SUFFIX,
@@ -42,3 +43,33 @@ def project_config_update(self):
def test_store_and_assert_failure_with_delta(self, project):
self.run_tests_store_one_failure(project)
self.run_tests_store_failures_and_assert(project)


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsInteractions(basic.StoreTestFailuresAsInteractions):
pass


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsProjectLevelOff(basic.StoreTestFailuresAsProjectLevelOff):
pass


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsProjectLevelView(basic.StoreTestFailuresAsProjectLevelView):
pass


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsGeneric(basic.StoreTestFailuresAsGeneric):
pass


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsProjectLevelEphemeral(basic.StoreTestFailuresAsProjectLevelEphemeral):
pass


@pytest.mark.skip_profile("spark_session")
class TestStoreTestFailuresAsExceptions(basic.StoreTestFailuresAsExceptions):
pass
