Describe the bug
Hi everyone, I am currently running into an issue with dbt (0.20.1) in production that seems related to some "concurrent updates" when I run different, not directly related queries at the same time from different models. As a simplified description, my models folder is divided into subfolders, for a total of 117 models.
Then at some point, let's say a Process-A starts model_7 (dbt run --models model_7 on a Server-A), which contains the query table_65.sql. From the beginning of the logs for this process I can see something like:
2021-09-14 08:43:56.293473 (MainThread): Acquiring new spark connection "model.my_proj.table_1".
2021-09-14 08:43:56.385776 (MainThread): Acquiring new spark connection "model.my_proj.table_2".
....
2021-09-14 08:43:56.390074 (MainThread): Acquiring new spark connection "model.my_proj.table_117".
Meanwhile a separate Process-B starts model_9 (dbt run --models model_9 on a Server-B), which requires dropping the table table_82 (table_65 and table_82 are not related).
Back to Process-A, the log now shows (likely caused by Process-B):
2021-09-14 08:44:35.863983 (ThreadPoolExecutor-1_0): Error while running:
/* {"app": "dbt", "dbt_version": "0.20.1", "profile_name": "my_proj_spark", "target_name": "prod", "connection_name": "list_None_my_models"} */
show table extended in my_models like '*'
2021-09-14 08:44:35.864026 (ThreadPoolExecutor-1_0): Database Error
org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'table_82' not found in database 'my_models'
This last log message gets repeated 3 times in a row. For some reason this also triggers the build of table_65 3 times (as I can see from the log):
2021-09-14 08:44:36.189855 (Thread-1): /* {"app": "dbt", "dbt_version": "0.20.1", "profile_name": "my_proj_spark", "target_name": "prod", "node_id": "model.my_proj.table_65"} */
create table model_7.table_65
using parquet
partitioned by (country,month_selected)
location 's3://my_data/....'
as ....
2021-09-14 08:44:36.281855 (Thread-1): /* {"app": "dbt", "dbt_version": "0.20.1", "profile_name": "my_proj_spark", "target_name": "prod", "node_id": "model.my_proj.table_65"} */
create table model_7.table_65
using parquet
partitioned by (country,month_selected)
location 's3://my_data/....'
as ....
2021-09-14 08:44:36.399835 (Thread-1): /* {"app": "dbt", "dbt_version": "0.20.1", "profile_name": "my_proj_spark", "target_name": "prod", "node_id": "model.my_proj.table_65"} */
create table model_7.table_65
using parquet
partitioned by (country,month_selected)
location 's3://my_data/....'
as ....
which understandably makes the process fail with this error:
2021-09-14 08:44:36.944890 (Thread-1): Database Error
org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Table my_proj.table_65 already exists. You need to drop it first.
If I am reading the logs correctly: is there any workaround to avoid multiple triggers of the same build when an unrelated table disappears and fails the analysis? Or is this a dbt bug? Thank you
Steps To Reproduce
Please look at the bug description. I think the process can be reproduced locally by following those steps.
Expected behavior
The expected behaviour is to avoid a failure when unrelated tables get dropped elsewhere.
Screenshots and log output
Please look at the bug description.
System information
Which database are you using dbt with?
- postgres
- redshift
- bigquery
- snowflake
- other (specify: AWS/EMR 6.3)
The output of dbt --version:
installed version: 0.20.1
latest version: 0.20.2
Your version of dbt is out of date! You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- bigquery: 0.19.2
- snowflake: 0.19.2
- redshift: 0.19.2
- postgres: 0.19.2
- spark: 0.20.1
The operating system you're using:
Ubuntu 20.04 (on AWS/EC2)
The output of python --version:
python --version
Python 3.8.14
Additional context
Running this process on AWS/EMR 6.3 with Glue as "Hive" replacement for Spark.
I found the root of the previous issue (at least I hope). This happens because dbt runs the following command via Spark: show table extended in my_glue_schema like '*'. In my case the same schema is used by several other processes, which can generate tables with "bad" schemas or without a "Location" (S3 path) defined (a view, for example). That makes the highlighted command fail, so an eventual "drop table" or "insert overwrite" won't happen, and only the "create table" step will. Why does dbt need to run show table extended in my_glue_schema like '*'? Can it be skipped somehow?
I have been pointed to this previous issue: #93. That seems to be a major issue for my case, as we cannot run separate parts of a potential single "tree": they may overlap the creation/dropping of tables that are expected in one tree or another.
In environments like Glue/Catalog via EMR it may not be required, though, as the "layer" is the Catalog service itself. Not sure if I am missing the point here. In that case, should that command be "optional"? For example something like:
show table extended in my_glue_schema in ('table_1', 'table_2', ...etc...)
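In the meantime, one possible mitigation under the current behaviour: since the failing metadata scan is scoped to a single schema (show table extended in <schema> like '*'), pointing each independent process at its own target schema keeps one process's scan from seeing tables another process is dropping. A hedged sketch of a profiles.yml along those lines (target names, host, and schemas below are hypothetical placeholders, not from the original report):

```yaml
# Hypothetical profiles.yml sketch: one target schema per process, so the
# "show table extended in <schema> like '*'" scan run for Process-A never
# sees tables that Process-B is concurrently dropping. Host/port/schema
# values are placeholders; adapt to your EMR/thrift setup.
my_proj_spark:
  target: prod_a
  outputs:
    prod_a:            # used by Process-A / Server-A
      type: spark
      method: thrift
      host: emr-master.example.internal
      port: 10001
      schema: my_models_a
    prod_b:            # used by Process-B / Server-B
      type: spark
      method: thrift
      host: emr-master.example.internal
      port: 10001
      schema: my_models_b
```

This only sidesteps the race when the concurrent runs can genuinely live in separate schemas; models that must share a schema would still hit the same scan.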
Sorry for the delay @danfran! I'm going to close this in favor of the discussion and next steps in #228, since I believe a solid resolution there would also avoid this bug.