add seed (#3)
* insert overwrite instead of insert into for new seed runs

* Mv1742 load csv table seed (#1)

* changie commit

* Update Fixes-20231013-120628.yaml

* insert overwrite instead of insert into for new seed runs

changie commit

Mv1742 load csv table seed (#1)

* changie commit

* Update Fixes-20231013-120628.yaml

* Mv1742 rebase truncate (#2)

* merge remote

* add docs-issue workflow to dbt-spark (#913)

* Update seed.sql

* Update seed.sql

* add truncate table function

* rm changelog

* merge remote
machov authored Oct 14, 2023
1 parent 7ac4a7e commit 873381f
Showing 4 changed files with 22 additions and 1 deletion.
7 changes: 7 additions & 0 deletions .changes/unreleased/Fixes-20231013-120628.yaml
@@ -0,0 +1,7 @@
+kind: Fixes
+body: Overwrite existing rows on existing seed tables. For unmanaged databases (no location specified), the current seed command in
+  dbt-spark appends to existing seeded tables instead of overwriting.
+time: 2023-10-13T12:06:28.078483-06:00
+custom:
+  Author: mv1742
+  Issue: "112"
7 changes: 7 additions & 0 deletions dbt/include/spark/macros/adapters.sql
@@ -342,6 +342,13 @@
   {%- endcall %}
 {% endmacro %}

+
+{% macro spark__truncate_relation(relation) -%}
+  {% call statement('truncate_relation', auto_begin=False) -%}
+    truncate {{ relation.type }} if exists {{ relation }}
+  {%- endcall %}
+{% endmacro %}
+
 {% macro spark__drop_relation(relation) -%}
   {% call statement('drop_relation', auto_begin=False) -%}
     drop {{ relation.type }} if exists {{ relation }}
Original file line number Diff line number Diff line change
@@ -66,6 +66,10 @@
     re: python models and temporary views.

     Also, why do neither drop_relation or adapter.drop_relation work here?!
+    'unmanaged' tables in spark need to manually delete the database
+    otherwise drop statement does not delete the underlying data.
+    TODO:add warning that this feature does not work for Unmanaged tables.
+    Managed tables are fine.
   --#}
   {% call statement('drop_relation') -%}
     drop table if exists {{ tmp_relation }}
5 changes: 4 additions & 1 deletion dbt/include/spark/macros/materializations/seed.sql
@@ -5,7 +5,10 @@

 {% macro spark__reset_csv_table(model, full_refresh, old_relation, agate_table) %}
     {% if old_relation %}
+        {{ adapter.truncate_relation(old_relation) }}
         {{ adapter.drop_relation(old_relation) }}
+
+        {{ return(sql) }}
     {% endif %}
     {% set sql = create_csv_table(model, agate_table) %}
     {{ return(sql) }}
@@ -27,7 +30,7 @@
       {% endfor %}

       {% set sql %}
-          insert into {{ this.render() }} values
+          insert {% if loop.index0 == 0 -%} overwrite {% else -%} into {% endif -%} {{ this.render() }} values
           {% for row in chunk -%}
             ({%- for col_name in agate_table.column_names -%}
                 {%- set inferred_type = adapter.convert_type(agate_table, loop.index0) -%}
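
The chunked insert in the hunk above can be read as: the first batch of seed rows replaces whatever the table already holds, and every later batch appends to it. The following Python is a minimal sketch of that statement-building logic, not the adapter's actual code. The names `batched`, `seed_insert_statements`, and `my_schema.my_seed` are hypothetical, and `repr` stands in for the type-aware value rendering the macro performs, so it is not safe SQL quoting.

```python
from typing import Iterator, List, Sequence, Tuple


def batched(rows: Sequence[Tuple], batch_size: int) -> Iterator[Sequence[Tuple]]:
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def seed_insert_statements(table: str, rows: Sequence[Tuple], batch_size: int) -> List[str]:
    """Build one insert statement per chunk (hypothetical helper).

    Only the first chunk uses `insert overwrite`, mirroring the
    `loop.index0 == 0` check in the macro; later chunks use `insert into`
    so they append to the rows written by the first chunk.
    """
    statements = []
    for index, chunk in enumerate(batched(rows, batch_size)):
        verb = "insert overwrite" if index == 0 else "insert into"
        # repr() is an illustration-only stand-in for proper value rendering.
        values = ", ".join(
            "(" + ", ".join(repr(value) for value in row) + ")" for row in chunk
        )
        statements.append(f"{verb} {table} values {values}")
    return statements


if __name__ == "__main__":
    for stmt in seed_insert_statements("my_schema.my_seed", [(1, "a"), (2, "b"), (3, "c")], 2):
        print(stmt)
```

With three rows and a batch size of two, the sketch yields one `insert overwrite` statement followed by one `insert into` statement, which is the behavior the changelog entry describes for re-running a seed against an existing table.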