Commit
[dagster-databricks] Fix setting databricks cluster node configuration (#20000)

## Summary & Motivation

When submitting a Databricks Job, the `DatabricksJobRunner` does not set the cluster's `driver_instance_pool_id` node configuration correctly. As a result, it is not possible to launch a Databricks Job from Dagster that runs on Databricks instance pools. This change sets the cluster's node configuration in accordance with the Databricks Jobs API so that Databricks Jobs can be launched on instance pools.

<img width="1522" alt="Screenshot 2024-02-28 at 8 49 36 PM" src="https://github.com/dagster-io/dagster/assets/23409221/c1f806d0-2f3c-4364-8fb0-9e14d03640c6">

The code currently attempts to read `driver_node_type_id` directly from `cluster.new.nodes`. However, the step launcher's `run_config` spec states that `cluster.new.nodes` expects `driver_node_type_id` and `node_type_id` to be nested within an object called `node_types`. The fields `driver_instance_pool_id` and `instance_pool_id` are first-class properties alongside `node_types` (see the `run_config` spec [here](https://github.com/dagster-io/dagster/blob/2cabe8733cb517c8caaa21e0b323f46b944ef3ef/python_modules/libraries/dagster-databricks/dagster_databricks/configs.py#L362)).

## How I Tested These Changes

I launched Dagster runs in a Databricks workspace from a local Dagster deployment.
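
The nesting described in the summary can be illustrated with a minimal sketch. This is not the code from the commit: `build_node_config` is a hypothetical helper, the pool and node type values are made up, and the config is modelled as a plain dict. It only shows how `node_types` fields and instance pool fields could be read from `cluster.new.nodes` and flattened into the field names the summary mentions.

```python
from typing import Any, Dict


def build_node_config(nodes: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical helper: flatten a `cluster.new.nodes` dict into cluster fields."""
    cluster_fields: Dict[str, str] = {}

    # Node type IDs are nested one level down, under `node_types`.
    node_types = nodes.get("node_types") or {}
    for key in ("node_type_id", "driver_node_type_id"):
        if node_types.get(key):
            cluster_fields[key] = node_types[key]

    # Instance pool IDs are first-class properties alongside `node_types`.
    for key in ("instance_pool_id", "driver_instance_pool_id"):
        if nodes.get(key):
            cluster_fields[key] = nodes[key]

    return cluster_fields


# Illustrative inputs only; the IDs below are placeholders.
pool_nodes = {
    "instance_pool_id": "pool-workers",
    "driver_instance_pool_id": "pool-driver",
}
typed_nodes = {
    "node_types": {
        "node_type_id": "i3.xlarge",
        "driver_node_type_id": "i3.xlarge",
    }
}

print(build_node_config(pool_nodes))
print(build_node_config(typed_nodes))
```

Reading `driver_node_type_id` from the top level of `nodes`, as the summary says the old code did, would find nothing in `typed_nodes`, because the value lives under `node_types`.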