-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Decimal <--> Floating conversion #15905
New Decimal <--> Floating conversion #15905
Conversation
NOTE: Nevermind, the below is now working, the algorithm was tweaked accordingly. Note that currently the python tests are failing because of 97938.2f -> decimal64 with scale-factor -2. Note that the tests this was failing on before (in the original linked issue) now pass (higher levels of precision), this failure is different. This failure due to the increment_on_truncation() function, where we increment when we should not. Specifically we are triggering the I could change the xfail in the python test to fail this case instead of the others, but I'm trying to see if I can think of a solution ... I'm betting arrow is able to figure it out BECAUSE of this precision argument (7 decimal digits, which is the max precision of float) which we don't have ... perhaps we should have this as an optional argument? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the comments
…-nvidia/cudf into main_conversion_branch
…input so the test was bad.
…-nvidia/cudf into main_conversion_branch
Co-authored-by: Nghia Truong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java approval
@@ -34,6 +35,49 @@ namespace numeric { | |||
|
|||
namespace detail { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entire source of this file is in the detail namespace but the file is in a public header directory.
Are these meant to be called only internally then? We are evaluating public/internal functions/source recently.
Perhaps this could be moved to detail directory instead -- cudf/fixed_point/detail
would probably be ok.
Now I'm seeing this is called from public unary.hpp
which has its own issue.
So this is probably out of scope for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an issue for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of this functionality needs to be exposed for the spark-rapids-jni repository to wrap around for it to perform conversions with the spark-specific rounding. I'm still finalizing the details for the sr-jni code, when that's done I'll have a better idea of what the minimal amount of code we can expose is.
Note that cuDF necessarily will NOT match pyarrow, due to choosing to truncate where pyarrow wants to round. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments in this code are really good. I appreciate the time spent adding them to explain the dense conversions going on in here.
@@ -34,6 +35,49 @@ namespace numeric { | |||
|
|||
namespace detail { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an issue for this change?
/merge |
This reverts commit 3c83ce4.
This PR contains the main algorithm for the new decimal <--> floating conversion code. This algorithm was written to address the precision issues described here.
Summary
Previous PR's
These contain the supporting parts of this work:
Algorithm Outline
We convert floating -> (integer) decimal by:
Decimal -> floating is just the reverse operation.
Supplemental Changes
Performance for various conversions/input-ranges
New algorithm is FASTER by:
New algorithm is SLOWER by:
Other kernels:
Performance discussion
Checklist