Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAP-1014] [Bug] Contract enforcement method is not reliable #659

Closed
2 tasks done
jack-johnson-instapro opened this issue Nov 7, 2023 · 8 comments
Closed
2 tasks done
Labels
bug Something isn't working

Comments

@jack-johnson-instapro
Copy link

jack-johnson-instapro commented Nov 7, 2023

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When enforcing a contract, the SQL generated by dbt to recover the column types of a model (select * from (...model...) where false limit 0) sometimes incorrectly categorises a column as VARCHAR/TEXT. This seems to occur when there are literal NULLs involved (even if they are beforehand explicitly cast to the desired type)

Example:

create table test_bigint_varchar (
    col1 bigint
);

create table result as (
    select *
    from (
        select null::bigint as col1 
        union all
        select col1
        from test_bigint_varchar
    )
    where false
    limit 0
)

This query results in col1 having type VARCHAR, not BIGINT. This seems to be some kind of Redshift strange behaviour but I believe that dbt should handle it.

Expected Behavior

The contract enforcement logic should handle this properly.

Steps To Reproduce

create table test_bigint_varchar (
    col1 bigint
);

create table result as (
    select *
    from (
        select null::bigint as col1 
        union all
        select col1
        from test_bigint_varchar
    )
    where false
    limit 0
)

Relevant log output

No response

Environment

- OS: macOS 14.0 (23A344)
- Python: 3.9.18
- dbt-core: 1.6.6
- dbt-redshift: 1.6.2

Additional Context

No response

@jack-johnson-instapro jack-johnson-instapro added bug Something isn't working triage labels Nov 7, 2023
@github-actions github-actions bot changed the title [Bug] Contract enforcement method is not reliable [ADAP-1014] [Bug] Contract enforcement method is not reliable Nov 7, 2023
@dbeatty10
Copy link
Contributor

dbeatty10 commented Nov 7, 2023

Thanks for reporting this @jack-johnson-instapro !

For clarity, what does your example dbt model look like? Is it something like this? Or something else?

select null::bigint as col1 
union all
select col1
from test_bigint_varchar

Were you able to find any workarounds to avoid this issue within your own project?

@dbeatty10 dbeatty10 self-assigned this Nov 7, 2023
@jack-johnson-instapro
Copy link
Author

Hi @dbeatty10, our model looks very much like this, although I simplified it to isolate and clarify the issue. Basically we are unioning results which contain columns defined as NULL::BIGINT to other results coming from a table.

Currently we have not found a workaround for this issue, we could probably redesign our data model(s) to avoid this questionable union but we don't have bandwidth for this at this time. Right now, the solution we decided for is to not use contracts on this model.

@dbeatty10
Copy link
Contributor

Thanks for using the most simple model possible for isolating and clarifying the issue 😎.

Did you try out this, by any chance? I'd be curious if flipping the order of the unions makes a difference one way or the other.

-- models/my_model.sql

select col1
from test_bigint_varchar
union all
select null::bigint as col1 

@jack-johnson-instapro
Copy link
Author

jack-johnson-instapro commented Nov 9, 2023

That's a nice finding @dbeatty10, flipping the union order does make a difference! The snippet you suggested results in col1 being of type BIGINT and not VARCHAR. However I'm afraid we cannot use it as a workaround... The a-bit-less-simplified model we are using has NULL literals on both side of the union 🤦 something like:

select 
    col1,
    null::bigint as col2
from table1
union all
select 
    null::bigint as col1,
    col2
from table2

So unfortunately flipping the order results in either one or the other being of type VARCHAR...

@dbeatty10
Copy link
Contributor

Okay!

What about adding an empty relation that has explicit values & types?

select 
    0::bigint as col1,
    0::bigint as col2
where 1=0
union all
select 
    col1,
    null::bigint as col2
from table1
union all
select 
    null::bigint as col1,
    col2
from table2

@dbeatty10
Copy link
Contributor

@jack-johnson-instapro did you get a chance to check if this works for you or not?

I tried it out on my end, and it seemed to work.

@jack-johnson-instapro
Copy link
Author

jack-johnson-instapro commented Jan 15, 2024

Sorry for the late response @dbeatty10. Yes this seems to function correctly as a workaround. Thanks for your help.

@dbeatty10
Copy link
Contributor

Yes this seems to function correctly as a workaround. Thanks for your help.

Awesome news @jack-johnson-instapro !

Since we can't do anything about how Redshift handles null values but we have a workaround now, I'm going to close this as "not planned".

Workaround

If anyone else runs into this, our recommendation is union an empty relation like the following:

select 
    0::bigint as col1,
    0::bigint as col2
where 1=0
union all
select 
    col1,
    null::bigint as col2
from table1
union all
select 
    null::bigint as col1,
    col2
from table2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants