dbt seed truncate tables #915

Closed
wants to merge 8 commits into from
7 changes: 7 additions & 0 deletions .changes/unreleased/Fixes-20231013-120628.yaml
@@ -0,0 +1,7 @@
kind: Fixes
body: Overwrite existing rows on existing seed tables. For unmanaged databases (no location specified), the current seed command in
dbt-spark appends to existing seeded tables instead of overwriting them.
time: 2023-10-13T12:06:28.078483-06:00
custom:
Author: mv1742
Issue: "112"
7 changes: 7 additions & 0 deletions dbt/include/spark/macros/adapters.sql
@@ -342,6 +342,13 @@
{%- endcall %}
{% endmacro %}


{% macro spark__truncate_relation(relation) -%}
{% call statement('truncate_relation', auto_begin=False) -%}
truncate {{ relation.type }} if exists {{ relation }}
{%- endcall %}
{% endmacro %}
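For reference, a hedged sketch of what this macro renders to for a table relation (the schema and table names are illustrative). Note that, unlike `DROP TABLE`, Spark SQL's `TRUNCATE TABLE` syntax may not accept an `IF EXISTS` clause, which is worth verifying against the target runtime:

```sql
-- Illustrative rendered output of spark__truncate_relation
-- for a relation of type "table" named my_schema.my_seed
truncate table if exists my_schema.my_seed
```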

{% macro spark__drop_relation(relation) -%}
{% call statement('drop_relation', auto_begin=False) -%}
drop {{ relation.type }} if exists {{ relation }}
@@ -66,6 +66,10 @@
re: python models and temporary views.

Also, why do neither drop_relation nor adapter.drop_relation work here?!
'Unmanaged' tables in Spark require manually deleting the underlying data;
a drop statement alone does not remove it.
TODO: add a warning that this feature does not work for unmanaged tables.
Managed tables are fine.
--#}
{% call statement('drop_relation') -%}
drop table if exists {{ tmp_relation }}
11 changes: 9 additions & 2 deletions dbt/include/spark/macros/materializations/seed.sql
@@ -1,16 +1,22 @@
{% macro spark__get_binding_char() %}
{{ return('?' if target.method == 'odbc' else '%s') }}
{% endmacro %}

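A minimal Python sketch of how a binding character like the one above is typically used to build a parameterized `INSERT` statement. The helper name and relation name are hypothetical; the two binding styles mirror the ODBC (`?`) and pyformat (`%s`) cases in `spark__get_binding_char`:

```python
def build_insert(table, column_count, row_count, binding_char):
    """Build a parameterized INSERT using the driver's binding character.

    binding_char is '?' for ODBC-style drivers and '%s' for
    pyformat-style drivers; actual values are passed separately
    to the cursor, never interpolated into the SQL string.
    """
    placeholders = ", ".join([binding_char] * column_count)
    rows = ", ".join([f"({placeholders})"] * row_count)
    return f"insert into {table} values {rows}"

# Example: two rows of three columns against an ODBC-style driver
sql = build_insert("my_schema.my_seed", 3, 2, "?")
# sql == "insert into my_schema.my_seed values (?, ?, ?), (?, ?, ?)"
```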
{% macro spark__reset_csv_table(model, full_refresh, old_relation, agate_table) %}
{% if old_relation %}
{{ adapter.truncate_relation(old_relation) }}
{{ adapter.drop_relation(old_relation) }}
{% endif %}
{% set sql = create_csv_table(model, agate_table) %}
{{ return(sql) }}
{% endmacro %}


{% macro spark__load_csv_rows(model, agate_table) %}

@@ -27,7 +33,7 @@
{% endfor %}

{% set sql %}
insert into {{ this.render() }} values
insert {% if loop.index0 == 0 -%} overwrite {% else -%} into {% endif -%} {{ this.render() }} values
{% for row in chunk -%}
({%- for col_name in agate_table.column_names -%}
{%- set inferred_type = adapter.convert_type(agate_table, loop.index0) -%}
@@ -77,3 +83,4 @@

{{ return(sql) }}
{% endmacro %}
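The chunked-insert change above uses `insert overwrite` for the first chunk only, so existing seed data is replaced, while later chunks use `insert into` so earlier chunks of the same run are preserved. A hedged Python sketch of that control flow (the function and table names are illustrative, not dbt-spark APIs):

```python
def seed_statements(table, rows, chunk_size):
    """Yield one SQL statement per chunk of seed rows.

    The first chunk uses INSERT OVERWRITE to replace any existing
    seed data; subsequent chunks use INSERT INTO so rows written by
    earlier chunks in the same run are kept.
    """
    statements = []
    for i in range(0, len(rows), chunk_size):
        chunk = rows[i:i + chunk_size]
        verb = "insert overwrite" if i == 0 else "insert into"
        values = ", ".join(
            "(" + ", ".join(repr(v) for v in row) + ")" for row in chunk
        )
        statements.append(f"{verb} {table} values {values}")
    return statements

stmts = seed_statements("my_schema.my_seed", [(1, "a"), (2, "b"), (3, "c")], 2)
# stmts[0] starts with "insert overwrite"; stmts[1] starts with "insert into"
```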