Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAP-386] [Bug] prerelease macro snowflake_warehouse unexpected behaviour #533

Closed
2 tasks done
manugarri opened this issue Mar 20, 2023 · 11 comments
Closed
2 tasks done
Assignees
Labels
type:bug Something isn't working

Comments

@manugarri
Copy link

manugarri commented Mar 20, 2023

Is this a new bug in dbt-snowflake?

  • I believe this is a new bug in dbt-snowflake
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

On current prerelease (v1.5.0b3) a default macro snowflake_warehouse has been added (here is the PR).

I am testing a way to use environment variables to dynamically decide which warehouse to allocate based on the user.

But adding an env_var lookup to the macro throws exceptions.

For example, this macro definition throws the exception when doing dbt run:
'MacroManifest' object has no attribute 'env_vars'

{% macro snowflake_warehouse(warehouse) -%}
  {% set foo = env_var('SNOWFLAKE_USER') %}
  {{ return(warehouse) }}
{%- endmacro %}

while this one doesnt

{% macro snowflake_warehouse(warehouse) -%}
  {{ return(warehouse) }}
{%- endmacro %}

Expected Behavior

environment variables should be accesible in the snowflake_warehouse macro as usual.

Steps To Reproduce

  1. on prerelease (v1.5.0b3)
  2. add a macro named snowflake_warehouse to a project with this definition:
{% macro snowflake_warehouse(warehouse) -%}
  {{ return(warehouse) }}
{%- endmacro %}
  1. run dbt run

Relevant log output

No response

Environment

- OS: Mac os
- Python: 3.9.1
- dbt-core: dbt-core==1.4.5
- dbt-snowflake: v1.5.0b3 (prerelease)

Additional Context

No response

@manugarri manugarri added type:bug Something isn't working triage:product labels Mar 20, 2023
@github-actions github-actions bot changed the title [Bug] prerelease macro snowflake_warehouse unexpected behaviour [ADAP-386] [Bug] prerelease macro snowflake_warehouse unexpected behaviour Mar 20, 2023
@Fleid
Copy link
Contributor

Fleid commented Mar 20, 2023

Hey @manugarri,

I was not able to reproduce the issue.
I am running python 3.10.5 with:

Core:
  - installed: 1.5.0-b4
  - latest:    1.4.5    - Ahead of latest version!

Plugins:
  - snowflake: 1.5.0b3 - Ahead of latest version!

I added the following environment variable:

export adap_386=prod

I added the following macro:

{% macro snowflake_warehouse(warehouse) -%}
{% set foo = env_var('adap_386') %}
  {{ return(foo) }}
{%- endmacro %}

Ran the following model:

{{ config(materialized='table') }}

{% set w = snowflake_warehouse() %}

with source_data as (

    select 1 as id, '{{w}}'  as wh
    union all
    select null as id, '{{w}}' as wh

)

select *
from source_data

And got prod as the value for wh in my table.
Can you think of anything else in your setup that may be at cause here?

@Fleid Fleid self-assigned this Mar 20, 2023
@manugarri
Copy link
Author

hey @Fleid thanks for reaching out!, I will try your example, would you mind sharing your example dbt_project so i can replicate?

@Fleid
Copy link
Contributor

Fleid commented Mar 21, 2023

I'm using what you get from dbt init (minimizing conflict surface with anything else).

@McKnight-42
Copy link
Contributor

McKnight-42 commented Mar 21, 2023

@manugarri was hoping on to test his as well and was seeing same thing as you while working in cli.

I had missed this step export adap_386=prod to apply the env got the model to run for me locally

@manugarri
Copy link
Author

@Fleid thanks, i tried to replicate by running dbt init with minimal modifications given my company snowflake policies (we dont use public schema).

so i run dbt init

add the macro in macros/snowflake_warehouse.sql with this content (to ensure is running)

{% macro snowflake_warehouse(warehouse) -%}
    {{ log("SELECTING SNOWFLAKE_WAREHOUSE:" ~ warehouse ~ " FOR USER: " ~ target.user, True) }}
  {{ return('COMPUTE_MED') }}
{%- endmacro %}

set up the dbt project settings like this

models:
  +snowflake_warehouse: "MEDIUM"
  test:
    example:
      +materialized: view
      +schema: test

Run dbt run .... works with that macro and it prints the logging message.

Now replacing the macro with

{% macro snowflake_warehouse(warehouse) -%}
    {{ log("SELECTING SNOWFLAKE_WAREHOUSE:" ~ warehouse ~ " FOR USER: " ~ target.user, True) }}
  {% set foo = env_var('SNOWFLAKE_USER') %}

  {{ return('COMPUTE_MED') }}
{%- endmacro %}

Fails with the error i mentioned. here is a screenshot of the logs
image

@manugarri
Copy link
Author

Another issue that might just be a feature request is, shouldnt the default macro snowflake_warehouse also be applied to the profile? right now you are forced to use a hardcoded warehouse in the profiles.yml while having flexibility in the dbt_project file.

@McKnight-42
Copy link
Contributor

McKnight-42 commented Mar 22, 2023

@manugarri working toward replicating with your example; out of curiosity what would you expect foo to be applied to in this

    {{ log("SELECTING SNOWFLAKE_WAREHOUSE:" ~ warehouse ~ " FOR USER: " ~ target.user, True) }}
  {% set foo = env_var('SNOWFLAKE_USER') %}

  {{ return('COMPUTE_MED') }}
{%- endmacro %}```

@manugarri
Copy link
Author

manugarri commented Mar 23, 2023

@McKnight-42 my original intentions where to setup a macro that select different warehouses depending on whether an analyst is testing dbt locally versus running automated runs on airflow.

We use environment variables on our profiles.yml

snowflake:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      authenticator: "{{ env_var('SNOWFLAKE_AUTHENTICATOR') }}"
      schema: public

My original implementation for the macro was something like this:

{%- macro snowflake_warehouse(warehouse) -%}
    {{ log("SELECTING SNOWFLAKE_WAREHOUSE:" ~ warehouse ~ " FOR USER: " ~ env_var('SNOWFLAKE_USER') , True) }}

    {% set warehouses_specs = {
            "service_user": {
                "MEDIUM": "DBT_MED",
                "LARGE": "DBT_LARGE"
            },
            "human_user": {
                "MEDIUM": "LOCAL_MED",
                "LARGE": "LOCAL_LARGE"
            }
        }
    %}
  #we use SSO for snowflake
    {% if "@" in env_var('SNOWFLAKE_USER')  %}
        {% if warehouse in warehouses_specs["human_user"] %}
            {{ return(warehouses_specs["human_user"][warehouse]) }}
        {% else %}
            {{ return(warehouse) }}
        {% endif %}
    {% else %}
        {% if warehouse in warehouses_specs["service_user"] %}
            {{ return(warehouses_specs["service_user"][warehouse]) }}
        {% else %}
            {{ return(warehouse) }}
        {% endif %}
    {% endif %}
{%- endmacro %}

this was failing with the error i mentioned.

In the end some very helpful person at the dbt slack helped me and suggested replacing env_var('SNOWFLAKE_USER') for target.user , which solves my current needs.

But still, it sounds like allowing environment variables in this macro like any other macro would be useful for other use cases.

Another use case that just arose at my company, we want to have different warehouses per environment, but we want a single profiles.yml (since the prod dbt runs on docker ECS but local runs for multiple analysts) if we have an environment variable to define the environment dbt is running, we can include that environment variable in the mapping function i shared above

@McKnight-42
Copy link
Contributor

McKnight-42 commented Mar 24, 2023

@manugarri after some more digging we have decided for now to actually revert #503 we now think that this method is the wrong way as it allows for some unintended consequences and will be trying to find a better way forward. sorry for any inconvenience this causes

@Fleid
Copy link
Contributor

Fleid commented Mar 24, 2023

As @McKnight-42 mentioned, we are reverting the change, so I'm closing this issue.
@manugarri would you mind copying your latest comment with your use case to #103. I'm re-opening it to find an alternative solution.

@Fleid Fleid closed this as completed Mar 24, 2023
@manugarri
Copy link
Author

@McKnight-42 no problem, better to find an improved solution in a future release. In the meantime, what do you think would be the best way of tackling the issue of dynamic warehouses?

@Fleid will do that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants