-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Losing the battle to convert a string to double #129
Comments
If you know your input is the "shortest" format of a floating-point value of a given width, then there is in fact a fast algorithm to parse these values. I have some code for that. I'll need to do a bit of work in order to commit it here, but it shouldn't be much (famous last words). |
@ulfjack There is an amazing library for Rust that parses float/double values precisely and fast I've ported the moderate path from it to our library for Scala |
I'm aware of the article. I think we can do better. |
I have pushed some code to the https://github.com/ulfjack/ryu/tree/parsing branch. There are still big holes in the code, but that's the general shape. Note that it can only handle up to ~18 decimal digits (which is fine for input that was formatted as shortest), and I need to double-check that the tables have enough bits for this usage. |
I pushed another commit filling in some of the missing pieces. |
I pushed a couple of fixes to the branch. Before it can be merged, I need to figure out error handling, make it work on Mac/Win, and double-check that the bit widths are ok. More tests would also be good, but IMO not a blocker for merging into master. |
@ulfjack please consider using of SWAR techniques for parsing the decimal representation, like here: sirthias/borer#114 |
A limited implementation for up to 17-digit numbers. See #129. * Move some d2s intrinsics to d2s_intrinsics.h * Rudimentary string to double parsing * Implement more pieces for parsing - parse exponents - parse dot - handle negative exponents - handle rounding - add some basic tests for these things * Fix constants * Use __builtin_clzll Thanks to expnkx for finding this. See #133. * Allow exponent to start with '+' * Correctly handle mantissa overflow * Increase the table size Denormal numbers require a larger table because the decimal exponent can go down to -326 (we're using the table inversely to how it's originally used for shortest). * Handle most overflow/underflow situations * Increase table size some more * More tests close to the underflow These are almost exactly halfway between 0 and the smallest representable float. * Implement error handling Mention the restrictions on the input, and that this implementation is experimental. * Add another caveat to the documentation * Make it compile with -DRYU_ONLY_64_BIT_OPS This mode was not defining mulShift. We just use the same implementation as if it's not set. * Use _BitScanReverse64 on Windows * Rename mulShift to mulShift{32,64} Since I moved the mulShift implementations to d2s_intrinsics, they conflict with the ones defined in f2s.c. This is only an issue when -DRYU_OPTIMIZE_SIZE is set, but I think it's better to disambiguate anyway. * Rename max to max32 to avoid name conflicts on MSVC * MSVC: Always use _BitScanReverse64 The HAS_64_BIT_INTRINSICS macro is false if RYU_ONLY_64_BIT_OPS is set, so using that is incorrect. Instead, only check for _MSC_VER. I can't seem to find a macro to check for existence of __builtin_clzll.
@vinniefalco posted on the PR: "How about 0.X1E1000 where X is a thousand zeroes" Looks like you found a bug! |
We were able to fix our number parser: However, it fails to produce the same result as |
Well.... it would have been nice to read that before I struggled to write the code !!! |
@vinniefalco @ulfjack I am currently looking at fast string-to-double conversion, and reading this discussion it's not clear whether current ryu implementation is correct, or whether json implementation is good to use. Can one of those solutions be used yet? |
The JSON implementation seems to work (i.e. passes tests) but it only implements the "fast path" conversion. Which means it can be off by one bit for some strings. And note that it is not officially part of Boost yet, and is still being worked on. |
Going from double to string is amazing using this lib, but I am having trouble efficiently and correctly parsing a string into a double. This is what I have, any ideas? Is there some ulfjack magic that could be worked here?
https://github.com/vinniefalco/json/blob/develop/include/boost/json/detail/number.ipp
The text was updated successfully, but these errors were encountered: