[Spike+] can't unit test precision #9627

graciegoheen · 2024-02-22T15:54:47Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have a stg_payments model:

with

source as (

    select * from {{ source('stripe',  'payment') }}

),

staged as (

    select
        id as payment_id,
        orderid as order_id,
        paymentmethod as payment_method,
        status,
        -- amount is stored in cents, convert it to dollars
        round(amount / 100, 2) as amount,
        created as created_at
    from source

)

select * from staged

I create a unit test to make sure my amount field is doing the proper conversion / rounding:

unit_tests:
  - name: test_dollar_conversion # this is the unique name of the test
    model: stg_payments # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: source('stripe',  'payment')
        rows:
         - {id: 1, amount: 1000}
         - {id: 2, amount: 50}
         - {id: 3, amount: 33.33}
    expect: # the expected output given the inputs above
      rows:
        - {payment_id: 1, amount: 10}
        - {payment_id: 2, amount: 0.5}
        - {payment_id: 3, amount: 0.3333}

This passes successfully even though 0.3333 is not technically equivalent to 0.33.

Which is surprising given:

Essentially what’s happening is dbt is converting 0.3333 to the output data type (?) to execute the unit test, so 0.33 = 0.33:

try_cast('0.3333' as NUMBER(38,2))
     as amount

Expected Behavior

My unit test should fail because 0.33 does not equal 0.3333.
Time boxed spike to see if we can do this. If we can, let's complete the work with the spike.

Steps To Reproduce

See above

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

snowflake (not unique to snowflake)

Additional Context

No response

The text was updated successfully, but these errors were encountered:

emmyoop · 2024-03-19T16:13:38Z

I have not found the exact cause but was able to narrow it down more. It's actually using the precision from the sql when the cents get converted to dollars. When I set the column type to be numeric(10,2) the _sql_results (generated in the materialization macro here) have 3 decimal places instead of 2 for the dollar column's expected values. The precision is set to three in the sql where the cents are converted to dollars.

With the half field I don't round and it uses the precision set in the data type field of the yaml. Results there are as expected.

By the time we generate the agate tables for comparison of the dollar fields here the values are already incorrect and have been rounded.

My hunch is the problem is in the get_fixture_sql macro.

With postgres, I used

stg_payments model

with
source as (
    select * from {{ source('stripe',  'payment') }}
),
staged as (
    select
        id as payment_id,
        -- amount is stored in cents, convert it to dollars
        CAST(amount / 100.0 AS numeric(10, 3)) as dollar,
        amount / 2 as half
    from source
)
select * from staged

My yaml

unit_tests:
  - name: test_dollar_conversion
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 33.3323}
         - {id: 4, amount: 33.3323}
         - {id: 5, amount: 33.3323}
         - {id: 6, amount: 0.55}
         - {id: 8, amount: 0.01111}
         - {id: 9, amount: 0.01111}
    expect:
      rows:
        - {payment_id: 3, dollar: .333}            # expected
        - {payment_id: 4, dollar: 0.333323}    # right math, wrong precision
        - {payment_id: 5, dollar: .333167}      # just wrong
        - {payment_id: 6, dollar: 0.006}         # expected
        - {payment_id: 8, dollar: 0.0001111}   # right math, wrong precision
        - {payment_id: 9, dollar: 0}                  # expected

  - name: test_half
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 5}
         - {id: 4, amount: 6}
    expect:
      rows:  # use precision 4 and expect failure
        - {payment_id: 3, half: 2.50}
        - {payment_id: 4, half: 6.00}

sources:
  - name: stripe
    schema: "{{ target.schema }}"
    tables:
      - name: payment
        columns:
          - name: id
            type: integer
          - name: amount
            type: numeric(10,5)

models:
  - name: stg_payments
    columns:
      - name: payment_id
        type: integer
      - name: dollar
        type: numeric(10,2)
      - name: half
        type: numeric(10,2)

my seed for my payments source

id,amount
1,545
2,336
3,103762
4,33.33
5,5

MichelleArk · 2024-03-26T21:53:27Z

My hunch is the problem is in the get_fixture_sql macro.

I think it would likely be in get_expected_sql, since that applies casting to the 'expected' structure based on the sql of the model being tested.

I think there are a couple ways we can approach a fix here:

Explicit option for users to opt out of the automatic casting of expect values. E.g. something called '_dbt_skip_casting' or '_dbt_use_literal'. I think this option should only be valid on the expect rows as opposed to input ones, since there isn't any benefit to being able to test your model given a different input schema as those are fixed in production.

unit_tests:
  - name: test_dollar_conversion
    model: stg_payments
    given:
      - input: source('stripe',  'payment')
        rows:
         - {id: 3, amount: 33.3323}
         - {id: 4, amount: 33.3323}
         - {id: 5, amount: 33.3323}
         - {id: 6, amount: 0.55}
         - {id: 8, amount: 0.01111}
         - {id: 9, amount: 0.01111}
    expect:
      rows:
        - {payment_id: 3, dollar: .333}            # expected
        - {payment_id: 4, dollar: 0.333323, _dbt_skip_casting: ["dollar"]}    # explicit option to skip casting for the 'half' fixture, and use its value as a literal. this might mean that the user must provide a different value, e.g. "0.333323::numeric"

Same as (1), but we do it under the hood for certain known warehouse types. e.g, if the type to cast to is numeric, don't safe cast and just plumb the literal value through.
- downside: different values may be valid for input/expected specs. e.g: you can just say '0.33' for the input fixture, but if that is not a valid numeric literal, you will need to do something like 0.33::numeric

emmyoop · 2024-04-09T15:50:51Z

Closing for implementation ticket #9884

FishtownBuildBot · 2024-04-09T15:51:19Z

Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#5254

graciegoheen added bug Something isn't working triage pre-release Bug not yet in a stable release and removed triage labels Feb 22, 2024

graciegoheen mentioned this issue Feb 22, 2024

[CT-2911] [Epic] Unit testing dbt models #8283

Closed

graciegoheen added this to the v1.8 milestone Feb 22, 2024

martynydbt added the High Severity bug with significant impact that should be resolved in a reasonable timeframe label Feb 22, 2024

martynydbt changed the title ~~can't unit test precision~~ [Spike+] can't unit test precision Mar 11, 2024

martynydbt assigned emmyoop Mar 12, 2024

martynydbt assigned McKnight-42 Mar 20, 2024

graciegoheen added the user docs [docs.getdbt.com] Needs better documentation label Mar 26, 2024

emmyoop mentioned this issue Apr 9, 2024

[Unit Testing] Allow explicit precision testing in unit tests #9884

Open

1 task

emmyoop closed this as completed Apr 9, 2024

dbeatty10 added the unit tests Issues related to built-in dbt unit testing functionality label Sep 20, 2024

dbeatty10 mentioned this issue Sep 20, 2024

[Feature] Support epsilon / tolerance for float values in unit test expected values. #10470

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Spike+] can't unit test precision #9627

[Spike+] can't unit test precision #9627

graciegoheen commented Feb 22, 2024 •

edited by martynydbt

Loading

emmyoop commented Mar 19, 2024 •

edited

Loading

MichelleArk commented Mar 26, 2024 •

edited

Loading

emmyoop commented Apr 9, 2024

FishtownBuildBot commented Apr 9, 2024

[Spike+] can't unit test precision #9627

[Spike+] can't unit test precision #9627

Comments

graciegoheen commented Feb 22, 2024 • edited by martynydbt Loading

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output

Environment

Which database adapter are you using with dbt?

Additional Context

emmyoop commented Mar 19, 2024 • edited Loading

MichelleArk commented Mar 26, 2024 • edited Loading

emmyoop commented Apr 9, 2024

FishtownBuildBot commented Apr 9, 2024

graciegoheen commented Feb 22, 2024 •

edited by martynydbt

Loading

emmyoop commented Mar 19, 2024 •

edited

Loading

MichelleArk commented Mar 26, 2024 •

edited

Loading