Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decimal to Float conversion in jsonschema causes validation error #14

Open
walkabout21 opened this issue Oct 9, 2024 · 9 comments
Open

Comments

@walkabout21
Copy link

loading Decimal type from mssql server to oracle using tap-mssql and target-oracle throws jsonschema validation error

2024-10-08T20:10:20.033688Z [info     ]     raise error                cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-oracle name=target-oracle producer=False run_id=097d3cec-5958-49a2-aa24-50a73adc0779 stdio=stderr string_id=target-oracle
2024-10-08T20:10:20.033753Z [info     ] jsonschema.exceptions.ValidationError: 4.6 is not a multiple of 0.1 cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-oracle name=target-oracle producer=False run_id=097d3cec-5958-49a2-aa24-50a73adc0779 stdio=stderr string_id=target-oracle
2024-10-08T20:10:20.033817Z [info     ]                                cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-oracle name=target-oracle producer=False run_id=097d3cec-5958-49a2-aa24-50a73adc0779 stdio=stderr string_id=target-oracle
2024-10-08T20:10:20.033881Z [info     ] Failed validating 'multipleOf' in schema['properties']['course_mean']: cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-oracle name=target-oracle producer=False run_id=097d3cec-5958-49a2-aa24-50a73ad

Most likely the change needs to be made here:

        if self._jsonschema_type_check(jsonschema_type, ("number",)):
            if self.config.get("prefer_float_over_numeric", False):
                return cast(sqlalchemy.types.TypeEngine, sqlalchemy.types.FLOAT())
            return cast(sqlalchemy.types.TypeEngine, sqlalchemy.types.NUMERIC(38, 10))

Similar to open issue here
Float-decimal validation issue

@walkabout21
Copy link
Author

Forgot to mention that open issue is for target-csv

@radbrt
Copy link
Owner

radbrt commented Oct 9, 2024

This one was interesting @walkabout21. I tried to get some decimal types from tap-mssql, but I don't think my decimals are the same as your decimals. The test-file (test singer records) I intercepted is here: https://github.com/radbrt/target-oracle/blob/decimaltypes/target_oracle/tests/data_files/decimal_types.singer (fyi, these are inserted as varchar)

As you can see, there is no MultipleOf to be seen anywhere in there. Are you using the default wintersrd variant of tap-mssql? I can try to generate the records again with a different variant, or if you are able to find some of the singer records at issue that would be even better.

@walkabout21
Copy link
Author

I'm using the wintersrd variant.
The source column is defined this way by jsonschema

"course_mean":{"inclusion":"available","multipleOf":0.1,"type":["null","number"]}

The value is defined as

"course_mean":4.6

The column in sql server is defined as

course_mean decimal(2,1) NULL,

I'm not sure if this is an issue with the target handling the input or the tap not defining the sqlalchemy datatype properly.

@walkabout21
Copy link
Author

I did try testing this with the TARGET_ORACLE_PREFER_FLOAT_OVER_NUMERIC set to True, but it was the same result.

@radbrt
Copy link
Owner

radbrt commented Oct 10, 2024

Thanks for the details @walkabout21, I was able to reproduce the error by replacing the schema definition for yours: {"inclusion":"available","multipleOf":0.1,"type":["null","number"]} in the schema record. Realistically I might have a fix over the weekend.

@walkabout21
Copy link
Author

I did find a work around by setting the mssql-tap.

use_singer_decimal value: true

That setting exports number columns as strings.

Without setting this, even after I set the stream_maps config to null for the decimal columns, I was still getting type errors saying the numbers had no n length attribute.

@radbrt
Copy link
Owner

radbrt commented Oct 10, 2024

@walkabout21 great to hear, and that explains why I couldn't reproduce it, for some reason I had set that one to true when I was testing.

The Meltano Slack is full of discussions of the multipleOf issue, it seems this is not the only target that has issues with it.

@walkabout21
Copy link
Author

@radbrt should this issue stay open? There is a work around, but it's not ideal. Since this issue shows up in a number of targets is this really a meltano/singer issue?

@radbrt
Copy link
Owner

radbrt commented Oct 11, 2024

@walkabout21 I do think it it should ideally be handled elsewhere, but I do want to look into it more and hopefully find a decent workaround myself, so ot can just stay open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants