Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SUPER SLOW dbt run -m model_name #4762

Closed
ashwintastic opened this issue Jan 17, 2024 · 4 comments
Closed

SUPER SLOW dbt run -m model_name #4762

ashwintastic opened this issue Jan 17, 2024 · 4 comments

Comments

@ashwintastic
Copy link

ashwintastic commented Jan 17, 2024

I'm running dbt run -m model_name , which is super slow when I checked the logs dbt is checking each and every tables, views in that schema by running show extended table.

Earlier in version 1.6 it was responding faster

---------------------------------LOGS---------------------------------------
`

============================== 22:47:58.530684 | d8769778-6f77-45ce-8fa3-2564bbaf55fb ==============================
22:47:58.530684 [info ] [MainThread]: Running with dbt=1.7.4
22:47:58.533295 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'write_json': 'True', 'log_cache_events': 'False', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'version_check': 'True', 'fail_fast': 'False', 'log_path': 'some_path\logs', 'debug': 'False', 'profiles_dir': 'some_path', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'invocation_command': "dbt run -m some_model --vars {'param1': 'xxxx', 'param2':'yyyyy', 'param3': 'xyxyxyx' }", 'log_format': 'default', 'introspect': 'True', 'target_path': 'None', 'static_parser': 'True', 'send_anonymous_usage_stats': 'True'}
22:37:21.670827 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': 'da6d902d-ca85-448d-a17c-23c11405b822', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001A5A639DE10>]}
22:37:21.958659 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': 'da6d902d-ca85-448d-a17c-23c11405b822', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001A5B71DD4B0>]}
22:37:21.960659 [info ] [MainThread]: Registered adapter: databricks=1.7.3
22:37:22.018129 [debug] [MainThread]: checksum: d3f38ad46d5643a4a6fd32ce6a88e70ccb2066f8630b1f5ec7fec51d1271723b, vars: {'source_dataset': 'SOME_SOURCE',
'param2': 'xxxxxxxxxxxxxxxxxxx',
'param1': 'XYZ'}, profile: , target: , version: 1.7.4

22:37:22.912043 [debug] [MainThread]: Partial parsing enabled: 0 files deleted, 0 files added, 1 files changed.
22:37:22.918044 [debug] [MainThread]: Partial parsing: updated file: databricks://macros\SOME_SCHEMA\futuristic_columns.sql
22:37:24.115807 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': 'da6d902d-ca85-448d-a17c-23c11405b822', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001A5B7E640D0>]}
22:37:24.509945 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': 'da6d902d-ca85-448d-a17c-23c11405b822', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001A5B7364280>]}
22:37:24.512072 [info ] [MainThread]: Found 110 models, 16 snapshots, 501 tests, 16 sources, 0 exposures, 0 metrics, 688 macros, 0 groups, 0 semantic models
22:37:24.513947 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': 'da6d902d-ca85-448d-a17c-23c11405b822', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x000001A5B73642E0>]}
22:37:24.536996 [info ] [MainThread]:
22:37:24.539946 [debug] [MainThread]: Databricks adapter: Creating DatabricksDBTConnection. name: master, thread: (11688, 24468), compute: 22:37:24.541971 [debug] [MainThread]: Acquiring new databricks connection 'master' 22:37:24.544117 [debug] [MainThread]: Databricks adapter: DatabricksDBTConnection._acquire: name: master, thread: (11688, 24468), compute:, acquire_release_count: 0, idle time: 0s
22:37:24.545945 [debug] [MainThread]: Databricks adapter: Thread (11688, 24468) using default compute resource.
22:37:24.549979 [debug] [ThreadPool]: Databricks adapter: Creating DatabricksDBTConnection. name: list_hive_metastore, thread: (11688, 5936), compute: 22:37:24.551945 [debug] [ThreadPool]: Acquiring new databricks connection 'list_hive_metastore' 22:37:24.552982 [debug] [ThreadPool]: Databricks adapter: DatabricksDBTConnection._acquire: name: list_hive_metastore, thread: (11688, 5936), compute:, acquire_release_count: 0, idle time: 0s
22:37:24.554949 [debug] [ThreadPool]: Databricks adapter: Thread (11688, 5936) using default compute resource.
22:37:24.555980 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore"
22:37:24.557169 [debug] [ThreadPool]: On list_hive_metastore: GetSchemas(database=hive_metastore, schema=None)
22:37:24.558000 [debug] [ThreadPool]: Opening a new connection, currently in state init
22:37:28.841685 [debug] [ThreadPool]: SQL status: OK in 4.28000020980835 seconds
22:37:28.854641 [debug] [ThreadPool]: Databricks adapter: DatabricksDBTConnection._release: name: list_hive_metastore, thread: (11688, 5936), compute: , acquire_release_count: 1, idle time: 0s 22:37:28.892143 [debug] [ThreadPool]: Databricks adapter: Creating DatabricksDBTConnection. name: list_hive_metastore_SOME_SCHEMA, thread: (11688, 23084), compute:
22:37:28.895142 [debug] [ThreadPool]: Acquiring new databricks connection 'list_hive_metastore_SOME_SCHEMA'
22:37:28.898143 [debug] [ThreadPool]: Databricks adapter: DatabricksDBTConnection._acquire: name: list_hive_metastore_SOME_SCHEMA, thread: (11688, 23084), compute: ``, acquire_release_count: 0, idle time: 0s
22:37:28.900141 [debug] [ThreadPool]: Databricks adapter: Thread (11688, 23084) using default compute resource.
22:37:28.927617 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_SCHEMA"
22:37:28.933000 [debug] [ThreadPool]: On list_hive_metastore_SOME_SCHEMA: GetTables(database=hive_metastore, schema=SOME_SCHEMA, identifier=None)
22:37:28.935875 [debug] [ThreadPool]: Opening a new connection, currently in state init

22:37:31.564096 [debug] [ThreadPool]: SQL status: OK in 2.630000114440918 seconds
22:37:31.719021 [debug] [ThreadPool]: Spark adapter: NotImplemented: add_begin_query
22:37:31.722029 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_SCHEMA"
22:37:31.725030 [debug] [ThreadPool]: On list_hive_metastore_SOME_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "some_profile", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_SCHEMA"} */

select current_catalog()

22:37:32.260311 [debug] [ThreadPool]: SQL status: OK in 0.5299999713897705 seconds
22:37:32.296242 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_SCHEMA"
22:37:32.299069 [debug] [ThreadPool]: On list_hive_metastore_SOME_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "some_profile", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_SCHEMA"} */
show views in hive_metastore.SOME_SCHEMA

22:37:32.938768 [debug] [ThreadPool]: SQL status: OK in 0.6399999856948853 seconds
22:37:32.980341 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_SCHEMA"
22:37:32.983068 [debug] [ThreadPool]: On list_hive_metastore_SOME_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "some_profile", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_SCHEMA"} /
show table extended in hive_metastore.SOME_SCHEMA like '
'
22:38:12.162673 [debug] [ThreadPool]: Databricks adapter: Reusing DatabricksDBTConnection. name: list_hive_metastore_SOME_O_SCHEMA, thread: (11688, 23084), compute: , acquire_release_count: 0, idle time: 0.07942080497741699s 22:38:12.212333 [debug] [ThreadPool]: Databricks adapter: DatabricksDBTConnection._acquire: name: list_hive_metastore_SOME_O_SCHEMA, thread: (11688, 23084), compute: , acquire_release_count: 0, idle time: 0.12878203392028809s
22:38:12.256643 [debug] [ThreadPool]: Databricks adapter: Thread (11688, 23084) using default compute resource.
22:38:12.342730 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_O_SCHEMA"
22:38:12.369688 [debug] [ThreadPool]: On list_hive_metastore_SOME_O_SCHEMA: GetTables(database=hive_metastore, schema=SOME_O_SCHEMA, identifier=None)
22:38:14.369630 [debug] [ThreadPool]: SQL status: OK in 2.0 seconds
22:38:14.532526 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_O_SCHEMA"
22:38:14.534479 [debug] [ThreadPool]: On list_hive_metastore_SOME_O_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "data_transformation", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_O_SCHEMA"} */

select current_catalog()

22:38:15.058400 [debug] [ThreadPool]: SQL status: OK in 0.5199999809265137 seconds
22:38:15.072412 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_O_SCHEMA"
22:38:15.075356 [debug] [ThreadPool]: On list_hive_metastore_SOME_O_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "data_transformation", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_O_SCHEMA"} */
show views in hive_metastore.SOME_O_SCHEMA

22:38:15.934159 [debug] [ThreadPool]: SQL status: OK in 0.8600000143051147 seconds
22:38:15.952715 [debug] [ThreadPool]: Using databricks connection "list_hive_metastore_SOME_O_SCHEMA"
22:38:15.955722 [debug] [ThreadPool]: On list_hive_metastore_SOME_O_SCHEMA: /* {"app": "dbt", "dbt_version": "1.7.4", "dbt_databricks_version": "1.7.3", "databricks_sql_connector_version": "2.9.3", "profile_name": "data_transformation", "target_name": "some_env", "connection_name": "list_hive_metastore_SOME_O_SCHEMA"} /
show table extended in hive_metastore.SOME_O_SCHEMA like '
'

`

@ashwintastic
Copy link
Author

ashwintastic commented Jan 17, 2024

It took 5 minutes 12 seconds to complete

image

@runleonarun
Copy link
Collaborator

runleonarun commented Jan 18, 2024

Hey @ashwintastic Thanks for letting us know. I think this might belong in the the dbt-core repo. I will check with the team:
@graciegoheen or @jtcohen6 or @dbeatty10 Should this be moved to the Core repo?

@graciegoheen
Copy link
Collaborator

Yes, let's move it to dbt-core or dbt-databricks

@dbeatty10
Copy link
Contributor

Closing this issue and dbt-labs/dbt-core#9402 in favor of databricks/dbt-databricks#556

@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale Jan 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants