Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Materialized view config change identification not working with multiple sort keys #838

Closed
2 tasks done
jeremyyeo opened this issue Jun 5, 2024 · 0 comments · Fixed by #841 or Shiphero/dbt-redshift#1
Closed
2 tasks done
Labels
bug Something isn't working

Comments

@jeremyyeo
Copy link
Contributor

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I think this is a continuation of #621 as it's almost the same really but there was a note that says that that issue is 7 months old and to consider opening up another issue.

Expected Behavior

A config unchanged mv should not be identified as "config changed".

Steps To Reproduce

  1. Make sure to delete the MV first.
drop materialized view if exists dev.public.mv_double_sort
  1. Add an mv to the project with more than 1 sort key:
-- models/mv_double_sort.sql
{{
    config(
        materialized = 'materialized_view',
        dist = 'b',
        sort = ['b', 'a'],
        on_configuration_change = 'fail'
    )
}}

select a, b, c from foo
  1. Initial build:
$ dbt --debug build -s my_double_sort

03:08:31  1 of 1 START sql materialized_view model public.mv_double_sort ................. [RUN]
03:08:31  Re-using an available connection from the pool (formerly list_dev_public, now model.my_dbt_project.mv_double_sort)
03:08:31  Began compiling node model.my_dbt_project.mv_double_sort
03:08:31  Writing injected SQL for node "model.my_dbt_project.mv_double_sort"
03:08:31  Began executing node model.my_dbt_project.mv_double_sort
03:08:31  Writing runtime sql for node "model.my_dbt_project.mv_double_sort"
03:08:31  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:31  On model.my_dbt_project.mv_double_sort: BEGIN
03:08:31  Opening a new connection, currently in state closed
03:08:31  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:08:31  Redshift adapter: Connecting to redshift with username/password based auth...
03:08:34  SQL status: SUCCESS in 2.0 seconds
03:08:34  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:34  On model.my_dbt_project.mv_double_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_double_sort"} */

        create materialized view "dev"."public"."mv_double_sort"
        backup yes
        diststyle key
        distkey (b)
        sortkey (b,a)
        auto refresh no
    as (
        select a, b, c from foo
    )
03:08:35  SQL status: SUCCESS in 1.0 seconds
03:08:35  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:35  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:35  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:36  SQL status: SUCCESS in 1.0 seconds
03:08:36  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:36  On model.my_dbt_project.mv_double_sort: BEGIN
03:08:37  SQL status: SUCCESS in 0.0 seconds
03:08:37  Applying DROP to: "dev"."public"."mv_double_sort__dbt_backup"
03:08:37  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:37  On model.my_dbt_project.mv_double_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_double_sort"} */
drop materialized view if exists "dev"."public"."mv_double_sort__dbt_backup"
03:08:37  SQL status: SUCCESS in 0.0 seconds
03:08:37  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:37  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:37  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:38  SQL status: SUCCESS in 1.0 seconds
03:08:38  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:38  On model.my_dbt_project.mv_double_sort: BEGIN
03:08:38  SQL status: SUCCESS in 0.0 seconds
03:08:38  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:38  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:38  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:38  SQL status: SUCCESS in 0.0 seconds
03:08:38  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:38  On model.my_dbt_project.mv_double_sort: BEGIN
03:08:39  SQL status: SUCCESS in 0.0 seconds
03:08:39  Applying DROP to: "dev"."public"."mv_double_sort__dbt_tmp"
03:08:39  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:39  On model.my_dbt_project.mv_double_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_double_sort"} */
drop materialized view if exists "dev"."public"."mv_double_sort__dbt_tmp"
03:08:39  SQL status: SUCCESS in 1.0 seconds
03:08:39  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:39  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:39  On model.my_dbt_project.mv_double_sort: COMMIT
03:08:40  SQL status: SUCCESS in 1.0 seconds
03:08:40  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:08:40  On model.my_dbt_project.mv_double_sort: BEGIN
03:08:40  SQL status: SUCCESS in 0.0 seconds
03:08:40  On model.my_dbt_project.mv_double_sort: ROLLBACK
03:08:40  On model.my_dbt_project.mv_double_sort: Close
03:08:40  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '192109ef-68b1-4a97-a11c-6bfe02667917', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106393b50>]}
03:08:40  1 of 1 OK created sql materialized_view model public.mv_double_sort ............ [SUCCESS in 8.90s]
  1. Subsequent build:
03:09:20  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106007190>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10605fcd0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10607c550>]}
03:09:20  Running with dbt=1.8.1
03:09:20  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': '/Users/jeremy/.dbt', 'debug': 'True', 'version_check': 'True', 'log_path': '/Users/jeremy/git/dbt-basic/logs', 'fail_fast': 'False', 'warn_error': 'None', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'invocation_command': 'dbt --debug build -s mv_double_sort', 'log_format': 'default', 'introspect': 'True', 'target_path': 'None', 'static_parser': 'True', 'send_anonymous_usage_stats': 'True'}
03:09:20  Redshift adapter: Setting redshift_connector to ERROR
03:09:20  Redshift adapter: Setting redshift_connector.core to ERROR
03:09:20  Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10605fd50>]}
03:09:20  Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10434dfd0>]}
03:09:20  Registered adapter: redshift=1.8.0
03:09:20  checksum: ebc509fb14e151ea268e89b60fd5abf912ed99129f87516462718fa2f14f4838, vars: {}, profile: , target: , version: 1.8.1
03:09:21  Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
03:09:21  Partial parsing enabled, no changes found, skipping parsing
03:09:21  Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x106ba48d0>]}
03:09:21  Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1070c8150>]}
03:09:21  Found 3 models, 593 macros
03:09:21  
03:09:21  Acquiring new redshift connection 'master'
03:09:21  Acquiring new redshift connection 'list_dev'
03:09:21  Using redshift connection "list_dev"
03:09:21  On list_dev: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "connection_name": "list_dev"} */

    select distinct nspname from pg_namespace
03:09:21  Opening a new connection, currently in state init
03:09:21  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:09:21  Redshift adapter: Connecting to redshift with username/password based auth...
03:09:23  SQL status: SUCCESS in 2.0 seconds
03:09:23  On list_dev: Close
03:09:23  Re-using an available connection from the pool (formerly list_dev, now list_dev_public)
03:09:23  Using redshift connection "list_dev_public"
03:09:23  On list_dev_public: BEGIN
03:09:23  Opening a new connection, currently in state closed
03:09:23  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:09:23  Redshift adapter: Connecting to redshift with username/password based auth...
03:09:25  SQL status: SUCCESS in 2.0 seconds
03:09:25  Using redshift connection "list_dev_public"
03:09:25  On list_dev_public: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "connection_name": "list_dev_public"} */
select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        'table' as type
    from information_schema.tables
    where table_schema ilike 'public'
    and table_type = 'BASE TABLE'
    union all
    select
      table_catalog as database,
      table_name as name,
      table_schema as schema,
      case
        when view_definition ilike '%create materialized view%'
          then 'materialized_view'
        else 'view'
      end as type
    from information_schema.views
    where table_schema ilike 'public'
03:09:26  SQL status: SUCCESS in 1.0 seconds
03:09:26  On list_dev_public: ROLLBACK
03:09:26  On list_dev_public: Close
03:09:26  Using redshift connection "master"
03:09:26  On master: BEGIN
03:09:26  Opening a new connection, currently in state init
03:09:26  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:09:26  Redshift adapter: Connecting to redshift with username/password based auth...
03:09:29  SQL status: SUCCESS in 2.0 seconds
03:09:29  Using redshift connection "master"
03:09:29  On master: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "connection_name": "master"} */
with
    relation as (
        select
            pg_class.oid as relation_id,
            pg_class.relname as relation_name,
            pg_class.relnamespace as schema_id,
            pg_namespace.nspname as schema_name,
            pg_class.relkind as relation_type
        from pg_class
        join pg_namespace
          on pg_class.relnamespace = pg_namespace.oid
        where pg_namespace.nspname != 'information_schema'
          and pg_namespace.nspname not like 'pg\_%'
    ),
    dependency as (
        select distinct
            coalesce(pg_rewrite.ev_class, pg_depend.objid) as dep_relation_id,
            pg_depend.refobjid as ref_relation_id,
            pg_depend.refclassid as ref_class_id
        from pg_depend
        left join pg_rewrite
          on pg_depend.objid = pg_rewrite.oid
        where coalesce(pg_rewrite.ev_class, pg_depend.objid) != pg_depend.refobjid
    )

select distinct
    dep.schema_name as dependent_schema,
    dep.relation_name as dependent_name,
    ref.schema_name as referenced_schema,
    ref.relation_name as referenced_name
from dependency
join relation ref
    on dependency.ref_relation_id = ref.relation_id
join relation dep
    on dependency.dep_relation_id = dep.relation_id
03:09:29  SQL status: SUCCESS in 1.0 seconds
03:09:29  Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x10743dd10>]}
03:09:29  On master: ROLLBACK
03:09:30  Using redshift connection "master"
03:09:30  On master: BEGIN
03:09:30  SQL status: SUCCESS in 0.0 seconds
03:09:30  On master: COMMIT
03:09:30  Using redshift connection "master"
03:09:30  On master: COMMIT
03:09:31  SQL status: SUCCESS in 1.0 seconds
03:09:31  On master: Close
03:09:31  Concurrency: 1 threads (target='rs')
03:09:31  
03:09:31  Began running node model.my_dbt_project.mv_double_sort
03:09:31  1 of 1 START sql materialized_view model public.mv_double_sort ................. [RUN]
03:09:31  Re-using an available connection from the pool (formerly list_dev_public, now model.my_dbt_project.mv_double_sort)
03:09:31  Began compiling node model.my_dbt_project.mv_double_sort
03:09:31  Writing injected SQL for node "model.my_dbt_project.mv_double_sort"
03:09:31  Began executing node model.my_dbt_project.mv_double_sort
03:09:31  Determining configuration changes on: "dev"."public"."mv_double_sort"
03:09:31  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:09:31  On model.my_dbt_project.mv_double_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_double_sort"} */

    select
            tb.database,
            tb.schema,
            tb.table,
            tb.diststyle,
            tb.sortkey1,
            mv.autorefresh
        from svv_table_info tb
        -- svv_mv_info is queryable by Redshift Serverless, but stv_mv_info is not
        left join svv_mv_info mv
            on mv.database_name = tb.database
            and mv.schema_name = tb.schema
            and mv.name = tb.table
        where tb.table ilike 'mv_double_sort'
        and tb.schema ilike 'public'
        and tb.database ilike 'dev'
03:09:31  Opening a new connection, currently in state closed
03:09:31  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:09:31  Redshift adapter: Connecting to redshift with username/password based auth...
03:09:34  SQL status: SUCCESS in 3.0 seconds
03:09:34  Using redshift connection "model.my_dbt_project.mv_double_sort"
03:09:34  On model.my_dbt_project.mv_double_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_double_sort"} */

    select
            vw.definition
        from pg_views vw
        where vw.viewname = 'mv_double_sort'
        and vw.schemaname = 'public'
        and vw.definition ilike '%create materialized view%'
03:09:34  SQL status: SUCCESS in 1.0 seconds
03:09:34  On model.my_dbt_project.mv_double_sort: Close
03:09:34  FailFast Error in model mv_double_sort (models/mv_double_sort.sql)
  Configuration changes were identified and `on_configuration_change` was set to `fail` for `"dev"."public"."mv_double_sort"`
03:09:34  Sending event: {'category': 'dbt', 'action': 'run_model', 'label': '44212e74-9a36-497e-bd19-b512e1ed7892', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x104360e10>]}
03:09:34  1 of 1 ERROR creating sql materialized_view model public.mv_double_sort ........ [ERROR in 3.70s]
03:09:34  Finished running node model.my_dbt_project.mv_double_sort
03:09:34  Using redshift connection "master"
03:09:34  On master: BEGIN
03:09:34  Opening a new connection, currently in state closed
03:09:34  Redshift adapter: Establishing connection using ssl with `sslmode` set to 'prefer'.To connect without ssl, set `sslmode` to 'disable'.
03:09:34  Redshift adapter: Connecting to redshift with username/password based auth...
03:09:37  SQL status: SUCCESS in 2.0 seconds
03:09:37  On master: COMMIT
03:09:37  Using redshift connection "master"
03:09:37  On master: COMMIT
03:09:37  SQL status: SUCCESS in 0.0 seconds
03:09:37  On master: Close
03:09:37  Connection 'master' was properly closed.
03:09:37  Connection 'model.my_dbt_project.mv_double_sort' was properly closed.
03:09:37  
03:09:37  Finished running 1 materialized view model in 0 hours 0 minutes and 16.73 seconds (16.73s).
03:09:37  Command end result
03:09:37  
03:09:37  Completed with 1 error and 0 warnings:
03:09:37  
03:09:37    FailFast Error in model mv_double_sort (models/mv_double_sort.sql)
  Configuration changes were identified and `on_configuration_change` was set to `fail` for `"dev"."public"."mv_double_sort"`

Relevant log output

No response

Environment

- OS: macOS
- Python: 3.11
- dbt-core: 1.8.1
- dbt-redshift: 1.8.0

Additional Context

  • If I did not use on_configuration_change = 'fail' - the outcome would simply be that we would to a drop and create of the mv - as opposed to refresh which is what is expected here.
  • If the model only had a single sort key:
-- models/mv_single_sort.sql
{{
    config(
        materialized = 'materialized_view',
        dist = 'b',
        sort = ['b'],
        on_configuration_change = 'fail'
    )
}}

select a, b, c from foo

Then there is no issue and the subsequent run refreshes our mv as expected:

03:16:21  On model.my_dbt_project.mv_single_sort: /* {"app": "dbt", "dbt_version": "1.8.1", "profile_name": "all", "target_name": "rs", "node_id": "model.my_dbt_project.mv_single_sort"} */

        refresh materialized view "dev"."public"."mv_single_sort"
03:16:21  SQL status: SUCCESS in 1.0 seconds
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants