Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spike+] can't unit test precision #9627

Closed
2 tasks done
Tracked by #8283
graciegoheen opened this issue Feb 22, 2024 · 4 comments
Closed
2 tasks done
Tracked by #8283

[Spike+] can't unit test precision #9627

graciegoheen opened this issue Feb 22, 2024 · 4 comments
Assignees
Labels
bug Something isn't working High Severity bug with significant impact that should be resolved in a reasonable timeframe pre-release Bug not yet in a stable release unit tests Issues related to built-in dbt unit testing functionality user docs [docs.getdbt.com] Needs better documentation
Milestone

Comments

@graciegoheen
Copy link
Contributor

graciegoheen commented Feb 22, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have a stg_payments model:

with

source as (

    select * from {{ source('stripe',  'payment') }}

),

staged as (

    select
        id as payment_id,
        orderid as order_id,
        paymentmethod as payment_method,
        status,
        -- amount is stored in cents, convert it to dollars
        round(amount / 100, 2) as amount,
        created as created_at
    from source

)

select * from staged

I create a unit test to make sure my amount field is doing the proper conversion / rounding:

unit_tests:
  - name: test_dollar_conversion # this is the unique name of the test
    model: stg_payments # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: source('stripe',  'payment')
        rows:
         - {id: 1, amount: 1000}
         - {id: 2, amount: 50}
         - {id: 3, amount: 33.33}
    expect: # the expected output given the inputs above
      rows:
        - {payment_id: 1, amount: 10}
        - {payment_id: 2, amount: 0.5}
        - {payment_id: 3, amount: 0.3333}

This passes successfully even though 0.3333 is not technically equivalent to 0.33.

Which is surprising given:
Screenshot 2024-02-21 at 3 33 03 PM

Essentially what’s happening is dbt is converting 0.3333 to the output data type (?) to execute the unit test, so 0.33 = 0.33:

try_cast('0.3333' as NUMBER(38,2))
     as amount

Expected Behavior

My unit test should fail because 0.33 does not equal 0.3333.
Time boxed spike to see if we can do this. If we can, let's complete the work with the spike.

Steps To Reproduce

See above

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

snowflake (not unique to snowflake)

Additional Context

No response

@graciegoheen graciegoheen added bug Something isn't working triage pre-release Bug not yet in a stable release and removed triage labels Feb 22, 2024
@graciegoheen graciegoheen added this to the v1.8 milestone Feb 22, 2024
@martynydbt martynydbt added the High Severity bug with significant impact that should be resolved in a reasonable timeframe label Feb 22, 2024
@martynydbt martynydbt changed the title can't unit test precision [Spike+] can't unit test precision Mar 11, 2024
@emmyoop
Copy link
Member

emmyoop commented Mar 19, 2024

I have not found the exact cause but was able to narrow it down more. It's actually using the precision from the sql when the cents get converted to dollars. When I set the column type to be numeric(10,2) the _sql_results (generated in the materialization macro here) have 3 decimal places instead of 2 for the dollar column's expected values. The precision is set to three in the sql where the cents are converted to dollars.

With the half field I don't round and it uses the precision set in the data type field of the yaml. Results there are as expected.

By the time we generate the agate tables for comparison of the dollar fields here the values are already incorrect and have been rounded.

My hunch is the problem is in the get_fixture_sql macro.

With postgres, I used

stg_payments model

with
source as (
    select * from {{ source('stripe',  'payment') }}
),
staged as (
    select
        id as payment_id,
        -- amount is stored in cents, convert it to dollars
        CAST(amount / 100.0 AS numeric(10, 3)) as dollar,
        amount / 2 as half
    from source
)
select * from staged

My yaml

unit_tests:
  - name: test_dollar_conversion
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 33.3323}
         - {id: 4, amount: 33.3323}
         - {id: 5, amount: 33.3323}
         - {id: 6, amount: 0.55}
         - {id: 8, amount: 0.01111}
         - {id: 9, amount: 0.01111}
    expect:
      rows:
        - {payment_id: 3, dollar: .333}            # expected
        - {payment_id: 4, dollar: 0.333323}    # right math, wrong precision
        - {payment_id: 5, dollar: .333167}      # just wrong
        - {payment_id: 6, dollar: 0.006}         # expected
        - {payment_id: 8, dollar: 0.0001111}   # right math, wrong precision
        - {payment_id: 9, dollar: 0}                  # expected

  - name: test_half
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 5}
         - {id: 4, amount: 6}
    expect:
      rows:  # use precision 4 and expect failure
        - {payment_id: 3, half: 2.50}
        - {payment_id: 4, half: 6.00}

sources:
  - name: stripe
    schema: "{{ target.schema }}"
    tables:
      - name: payment
        columns:
          - name: id
            type: integer
          - name: amount
            type: numeric(10,5)

models:
  - name: stg_payments
    columns:
      - name: payment_id
        type: integer
      - name: dollar
        type: numeric(10,2)
      - name: half
        type: numeric(10,2)

my seed for my payments source

id,amount
1,545
2,336
3,103762
4,33.33
5,5

@graciegoheen graciegoheen added the user docs [docs.getdbt.com] Needs better documentation label Mar 26, 2024
@MichelleArk
Copy link
Contributor

MichelleArk commented Mar 26, 2024

My hunch is the problem is in the get_fixture_sql macro.

I think it would likely be in get_expected_sql, since that applies casting to the 'expected' structure based on the sql of the model being tested.

I think there are a couple ways we can approach a fix here:

  1. Explicit option for users to opt out of the automatic casting of expect values. E.g. something called '_dbt_skip_casting' or '_dbt_use_literal'. I think this option should only be valid on the expect rows as opposed to input ones, since there isn't any benefit to being able to test your model given a different input schema as those are fixed in production.
unit_tests:
  - name: test_dollar_conversion
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 33.3323}
         - {id: 4, amount: 33.3323}
         - {id: 5, amount: 33.3323}
         - {id: 6, amount: 0.55}
         - {id: 8, amount: 0.01111}
         - {id: 9, amount: 0.01111}
    expect:
      rows:
        - {payment_id: 3, dollar: .333}            # expected
        - {payment_id: 4, dollar: 0.333323, _dbt_skip_casting: ["dollar"]}    # explicit option to skip casting for the 'half' fixture, and use its value as a literal. this might mean that the user must provide a different value, e.g. "0.333323::numeric" 
  1. Same as (1), but we do it under the hood for certain known warehouse types. e.g, if the type to cast to is numeric, don't safe cast and just plumb the literal value through.
    • downside: different values may be valid for input/expected specs. e.g: you can just say '0.33' for the input fixture, but if that is not a valid numeric literal, you will need to do something like 0.33::numeric

@emmyoop
Copy link
Member

emmyoop commented Apr 9, 2024

Closing for implementation ticket #9884

@emmyoop emmyoop closed this as completed Apr 9, 2024
@FishtownBuildBot
Copy link
Collaborator

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#5254

@dbeatty10 dbeatty10 added the unit tests Issues related to built-in dbt unit testing functionality label Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working High Severity bug with significant impact that should be resolved in a reasonable timeframe pre-release Bug not yet in a stable release unit tests Issues related to built-in dbt unit testing functionality user docs [docs.getdbt.com] Needs better documentation
Projects
None yet
Development

No branches or pull requests

7 participants