-
Notifications
You must be signed in to change notification settings - Fork 595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(udf): fix decimal values #11839
fix(udf): fix decimal values #11839
Conversation
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
This solution is too hacky to me. Converting to arrow not only used in udf, but also used in inserting to iceberg/parquet. |
But arrow does not have a type for floating-point decimal. It's impossible to use a fixed type decimal array to store them. |
Yeah, that's a problem. I think we need a discussion before deciding a solution. |
Signed-off-by: Runji Wang <[email protected]>
This PR has been open for 60 days with no activity. Could you please update the status? Feel free to ping a reviewer if you are waiting for review. |
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #11839 +/- ##
==========================================
- Coverage 68.15% 68.13% -0.03%
==========================================
Files 1524 1524
Lines 262331 262309 -22
==========================================
- Hits 178793 178721 -72
- Misses 83538 83588 +50
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, it's better to add some comments to explain it.
This change breaks decimal handling in iceberg. Should be fixed in the future. |
Signed-off-by: Runji Wang <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR fixes #11823.
The problem is that Arrow has no data type equivalent to the
DECIMAL
(unconstrained numeric) in RisingWave/PostgreSQL. The decimal types in Arrow are fixed-point, while our decimal type is floating-point. Previously we useddecimal(38,0)
in arrow to represent decimal values. This is clearly wrong because it only accepts integer numbers.This PR switches the arrow type for decimal to "large-binary". Values are stored as their string representations in the array, like
[b"1.23", b"0", b"-inf"]
. This may not be efficient in encoding, but is least error prone. And theoretically, this design can accept arbitrary-precision decimal values.Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.