Feature request: quad-double precision. #184
Comments
This definitely would be great, especially if float128 will use the compiler/assembler version when available.
Yeah, it definitely depends on the approach. The simplest one is of course to use qd-2.3.22.tar.gz verbatim and only add the necessary multiprecision wrapper. A much more interesting approach is to extract the design used to "quadruple" the precision in there, so that quad-float128 would simply be another instantiation of the same template design. 512 bits is exactly what I will need in my calculations.
I was thinking more of the binary128 and binary256 formats that are mentioned in IEEE 754:2008. But these would be (at least in their portable and generic form) software-intensive and would thus lose the hardware acceleration that is the very thing you seek. So that is not exactly what the original request is about.
This might unfortunately have licensing inconsistencies, in the sense that using the code directly under the BSL might be problematic. I think that binary128/256 is an idea we could talk about, but as I mentioned previously, I think there would be several important design considerations. Another idea would be to create a set of popular backends intended to wrap a given type for the "number" template in Boost.Multiprecision. That might be a way to navigate through the licensing differences. I mean, you could even wrap the old FORTRAN77 REAL*16 that way. I think I'd like to continue this discussion and see what possibilities there are. Kind regards, Chris
Since binary128/256 would be software-intensive, I would prefer another approach. When hardware support appears in future CPUs, adding such binary support should be fairly easy. As you said, it is not a matter for this thread.
This solution is much more interesting for me. I think that you have already done that with … Also, the idea to use the "quadrupling" method described in that paper is a worthwhile direction for me, even if it means completely reimplementing this idea in Boost from scratch. This will take a lot of time though. So I could summarise the following:
Correct. There are already Boost.Multiprecision "float" backend wrappers for __float128/quadmath, GMP/MPIR and MPFR, and an "int" wrapper for tommath, and these require that the user environment is able to find and link to the required target-specific lib. I'm not sure what it means for an existing UDT number library to be important/cool/special enough to be wrapped with a supported backend in Boost. I'm also a bit hesitant to directly use the DD/QD prototypes when looking at the licensing of the original work. Nonetheless, DD and QD have been around for a while. In fact, I used them heavily about 10-12 years ago. Back then there was a different DD package with 106 binary digits. Maybe it was the predecessor of the work we are mentioning in this post. Even though I'm not particularly fond of the "FP hack" required for the x86 architecture in these packages, and I'm also not sure exactly how portable DD and QD are, these libs are quite useful. This idea seems feasible to me. Kind regards, Chris
The downside of double-double (and I presume of quad-double) has been shown to be some uncertainty in estimating epsilon and consequent difficulties in estimating precision and uncertainties. The Apple/Darwin platforms all show 'different' behaviour in many Boost.Math tests. So the 106 or 212 significand bits may be over-optimistic. If you value speed more, then ...
Indeed. Interestingly enough, two of the greater challenges in potentially wrapping DD/QD and higher orders would be getting the types to behave like proper C++ types and figuring out how to mix the function-name calls within BSL headers. These are the two challenges I'm facing in another (unpublished) project that wraps the high-precision type of a popular computer algebra system for Boost.Multiprecision. Still, I think the DD/QD topic might be interesting to pursue in a feasibility study. Perhaps some of the epsilon inconsistencies could be addressed if the concepts of unity, comparison, frexp and ldexp are established more clearly in small add-on functionalities beyond the raw original implementation. Kind regards, Chris
This is a good idea, and in principle easy enough. As mentioned above already, epsilon is likely to be the main sticking point; there are two possible definitions:
BTW some time back I tried pretty hard to get Boost.Math working with Apple's "double double" and just couldn't get everything working - so much of the reasoning about what happens to a number under operation X breaks for these types that it's a bit of a losing battle. That said, we have since added cpp_dec_float and mpf_float, both of which have hidden guard digits and seem to be working OK. In principle, these types aren't that different, I hope.
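For a double-double the two candidate definitions differ wildly: the "nominal precision" epsilon derived from the 106-bit combined significand, versus the smallest increment above 1 that is still exactly representable, which can be as small as the double subnormal minimum because the low component may be arbitrarily tiny. A minimal sketch contrasting the two (an illustration assuming IEEE doubles and round-to-nearest, not code from any of the libraries discussed here):

```cpp
#include <cmath>
#include <cstdio>

int main()
{
   // Candidate 1: epsilon from the nominal precision, 2^(1 - 106),
   // treating double-double as one 106-bit significand.
   const double eps_nominal = std::ldexp(1.0, 1 - 106);

   // Candidate 2: the smallest x for which 1 + x is still exactly
   // representable as a (hi, lo) pair. The low part can be arbitrarily
   // tiny, so this is the double subnormal minimum, 2^-1074 - far too
   // small to be useful for error estimation.
   const double eps_representable = 0x1p-1074;

   std::printf("nominal epsilon:       %.6g\n", eps_nominal);
   std::printf("representable epsilon: %.6g\n", eps_representable);
}
```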
Thanks, John, for your experienced advice. That's what I was seeking in terms of a kind of nod. If there are no other blocking points, then I will write this up ASAP as a feasibility study for Boost GSoC 2020. The timing of this user-raised issue is, I believe, perfect for that potential venue. Kind regards, Chris
I'm curious about @cosurgi's use case. I feel like the current Boost.Multiprecision covers the intended use well as it stands, with only maybe a factor of 2 available for performance increase. But if you're using quad-double, well, performance can't be a huge issue anyway.
@NAThompson even a two-times performance increase means waiting one month instead of two for a result :) My use case will be quantum mechanics and quantum chromodynamics. Currently I am preparing the YADE source code for all of this. The experimental measurements in some cases are more precise than 20 or even 30 significant digits, so performing high-precision calculations is a big deal here. Since I plan to calculate volumes significantly larger than Planck's length, I have estimated that my target required precision is about 150 digits, which is 512 bits, and it has to be as fast as possible. I know it will take years to achieve, but that's the goal. My current strategy is to make sure that YADE is 100% compatible with Boost.Multiprecision, so that when new, faster and more precise types appear I will be able to instantly benefit from them. And hope that someday CPUs with a native 512-bit floating-point type will appear :)
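For reference, the arithmetic behind that digits-to-bits estimate:

$$150 \times \log_2 10 \approx 150 \times 3.3219 \approx 498 \text{ bits} \le 512 \text{ bits},$$

so rounding up to 512 bits leaves a little headroom.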
BTW: MPFR is about 10 times slower. The difference between waiting one month and ten months for a result becomes really meaningful.
So let's prefer the 1 month.
Yes.
This comment rang a bell in my mind, so I mounted a drive from 2007. As it turns out, I developed my own version of … I got remarkably far with this code, and am somewhat surprised that I had forgotten about it. If we pursue this topic, I will be sharing my work in a smaller e-mail circle. In that work, it looks like I simply empirically selected 31 for … Best regards, Chris
Interesting comments and use cases - thanks for this. There's one other potential blocker that springs to mind: the special functions are kind of tricky to implement for types with large bit counts but small exponent ranges. quad_float is perhaps the archetype here :( There's an open bug for tgamma on this: #150, but it wouldn't surprise me if there aren't others lurking once you go hunting. Question: can double_double be doubled? And doubled again, etc.? In other words, could these be implemented via nested templates?
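A minimal sketch of the nested-template idea (hypothetical names, not the eventual design; real arithmetic would need error-free transformations defined generically over T):

```cpp
// Hypothetical: a generic "doubled" representation over any
// floating-point-like type T with round-to-nearest arithmetic.
template <typename T>
struct doubled
{
   T hi{}; // leading approximation
   T lo{}; // error term, |lo| <= 0.5 ulp(hi)
};

// Doubling, then doubling again:
using double_double = doubled<double>;
using quad_double   = doubled<doubled<double>>;
```

Note that nesting pairs-of-pairs is not quite the same layout as the flat four-component quad-double of the QD package, so the algorithms would need adapting rather than merely re-instantiating.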
One thing I mentioned about this package is the licensing of the code. I believe that is the interpretation I am getting, though I am not yet fully sure... But I also notice that the x86-FPU-fix wraps the top-level call mechanism of all functions using the package. This means, as is confirmed in the DD/QD documentation, that the code will have a hard time being used in multithreading on x86. I would prefer a package that sets and resets the x86-FPU-fix within the individual subroutines that need it, thereby allowing for a synchronization mechanism when trying for multithreading. So at present, I am not aware of an existing kind of DD or QD that would fully satisfy my personal requirements. That being said, one could focus on a popular one. But there seem to be several popular ones dating all the way back to … Best regards, Chris
You read my mind. I think the answers are yes and yes. But I am also wary of such nested templates, having once gotten poor results with an integer type using something like this. But that experience was more than 20 years ago; I was less experienced at programming C++ back then, and compilers have gotten much (massively much) better at parsing nested templates since then.
Wouldn't that only be required for code executing on the old x87 co-processor, and not for x64 code using the SSE registers?
Yes. I have also just sent you some work in another thread. In that work, the x86-FPU-fix is needed for that legacy FPU, and only for that.
This has been done near the bottom of this page
@cosurgi: There's a very tentative version in the draft PR referenced above. I would welcome feedback if you'd like to give it a try and see how far you can get. There are some issues, and some possibly controversial design choices:
Anyhow, if you have any real-world code you can plug this into, it would be interesting to hear how you get on. It's all a bit too "fast and loose" for me!
@jzmaddock I'm sorry for the late reply. This looks very interesting. Yes, I have real-world code: YADE, with a full-blown testing suite for math functions. I will try to use it this week or next. Just to be sure - should I take this commit? The issues:
See some example math test results for MPFR 150 - scroll up to see the math functions being tested by the earlier-mentioned script.
Oh, btw, if 2^10 is 1023.99999999(9)..... then it's acceptable for me, because the error will be smaller than the tolerance level. We only need round tripping to work.
This is actually the reason that round tripping doesn't work - I mean it's approximately correct (to within an epsilon or so), but without a correctly functioning pow() we basically can't implement correct string output. Actually, even the basic arithmetic operations aren't correctly rounded either... As long as string IO is approximately correct (to within an epsilon or so - i.e. drop one digit and it would round to the "correct" result), wouldn't your tests still function OK?
Heh. I suppose that all math tests would work, but the round-tripping test would fail ;) I suppose I could implement an exception for this one.
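For concreteness, a round-trip test of the sort being discussed might look like this (a hypothetical sketch, not YADE's actual test):

```cpp
#include <limits>
#include <sstream>

// Print x with max_digits10 digits and read it back; for a correctly
// rounded type the read-back value must compare equal to the original.
template <typename T>
bool round_trips(const T& x)
{
   std::stringstream ss;
   ss.precision(std::numeric_limits<T>::max_digits10);
   ss << x;

   T y;
   ss >> y;

   return y == x;
}
```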
Meanwhile... there are a few issues with the current version - mostly order-of-initialization issues in numeric_limits<qd_real>. Give me a day or so and I'll push a fix, as they prevent tgamma and others from working.
Never mind, fix pushed. The C99 functions including tgamma etc. are now working. Note that they are not quite std-conforming, in that they don't adhere to the non-normative annex F, which deals with the handling of infinities and NaNs.
BTW: after reading the paper and source code, I think that extracting the "quadding" method from this paper into a template is possible. The requirement is that the underlying type satisfies IEEE correct rounding of the last bit. I will probably need quad-float128 sometime in the future; we could get back to that later. EDIT: the explanation for this requirement is very simple: if the last bit is wrong (incorrectly rounded), then the next element in the quad - although correct by itself - is completely useless, because the single bit preceding it is all wrong.
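That correct-rounding requirement is exactly what makes error-free product transformations possible. A minimal sketch (assuming correctly rounded IEEE arithmetic and a hardware FMA; not code from the QD package):

```cpp
#include <cmath>

// Error-free product: p = fl(a * b) and e such that a * b == p + e
// exactly. The residual from std::fma is exact only because both the
// multiply and the FMA are correctly rounded to the last bit.
inline void two_prod(double a, double b, double& p, double& e)
{
   p = a * b;
   e = std::fma(a, b, -p);
}
```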
Just a quick heads-up: I see that you have made a lot of progress here. Should I try testing it again with your latest changes?
It's way better than it was - especially the library support - but there's still one failing test that hasn't been investigated yet...
Which one?
It should be, in principle. I've done some preliminary work I needed for my own purposes here, mostly based on a paper by Knuth, I believe, on error-free transformations, but it's utterly unfinished as I had other things to do. Still, the problems are significant: to have the DD type behave like a real C++ primitive, a lot of high-precision constants are necessary, e.g. coefficients for the elementary functions' approximation polynomials, for normalizations of the function arguments, etc.
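The error-free transformation in question is presumably Knuth's TwoSum, the basic building block of double-double addition; a self-contained sketch, assuming IEEE round-to-nearest doubles:

```cpp
#include <utility>

// Knuth's TwoSum: returns s = fl(a + b) together with the exact
// rounding error e, so that a + b == s + e holds exactly
// (no precondition on the relative magnitudes of a and b).
std::pair<double, double> two_sum(double a, double b)
{
   const double s  = a + b;
   const double bv = s - a;                     // part of b absorbed into s
   const double e  = (a - (s - bv)) + (b - bv); // exact residual
   return { s, e };
}
```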
Restarting (with my offer to make potential progress as mentor on this) for Boost Google Summer of Code 2021.
Very nice, thank you! I will be watching closely, ready to integrate this into YADE. In principle it shouldn't be much more than just changing this file: https://gitlab.com/yade-dev/trunk/-/blob/master/lib/high-precision/RealHP.hpp#L106 and https://gitlab.com/yade-dev/trunk/-/blob/master/lib/high-precision/RealHP.hpp#L119 :) Then we will have all of YADE's testing at hand, for example this: https://gitlab.com/yade-dev/trunk/-/jobs/1048563190
I saw this issue recently and took notice because it seems connected to the floating-point expansions that I implemented for applications in geometric predicates (based on the works of Shewchuk, see https://www.cs.cmu.edu/~quake/robust.html ). Maybe a templated implementation that takes an arbitrary number of components could offer the same functionality as double-double or quad-double with more flexibility. My implementation of expansion arithmetic can currently be found in a PR to Boost.Geometry. It only contains the operations +, -, *, all implemented for exact computations. Because I think the concept might be of interest, and most of the implementation that would be required for a demo is there already, I created a quick proof of concept based on the backend skeleton. It is not polished, and the sqrt and divide methods are probably not very robust, but it should be sufficient to illustrate the idea. The code can be found at https://github.com/tinko92/floating_point_expansion_number_type Here is an example of the results that this proof of concept produces: let a = 8, b = 3; then the formula (a/b) * b - a evaluates roughly in line with what would be expected if a double has a precision of ~16 digits. The choice of algorithms is probably not ideal, because my implementations of +, - and * are all for exact calculations; if we want to limit the number of components, one could probably use faster algorithms, maybe the ones presented in http://perso.ens-lyon.fr/jean-michel.muller/07118139.pdf . The example I created uses a template with a fixed limit of components, but it could also be dynamic for arbitrary precision (limited by the exponent range, of course, so it would never make sense to have a result of more than ~40 components) using a std::vector instead of an array. There are other potential questions, e.g. whether to use zero elimination for short expansions, whether to use the compress algorithm from Shewchuk's paper for renormalization or the renormalization presented by Joldes et al., and so on.
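For readers unfamiliar with expansion arithmetic, the core primitive is Shewchuk's GROW-EXPANSION, which adds a single double to a nonoverlapping expansion exactly; a self-contained sketch (following the cited paper, not code from the linked PR):

```cpp
#include <cstddef>
#include <vector>

// Knuth's TwoSum: s = fl(a + b) and the exact error err.
static void two_sum(double a, double b, double& s, double& err)
{
   s = a + b;
   const double bv = s - a;
   err = (a - (s - bv)) + (b - bv);
}

// Shewchuk's GROW-EXPANSION: given a nonoverlapping expansion e
// (components in increasing order of magnitude) and a double b,
// returns an expansion h with sum(h) == sum(e) + b exactly.
std::vector<double> grow_expansion(const std::vector<double>& e, double b)
{
   std::vector<double> h(e.size() + 1);
   double q = b;
   for (std::size_t i = 0; i < e.size(); ++i)
   {
      two_sum(q, e[i], q, h[i]); // peel off the exact low-order part
   }
   h.back() = q;                 // the largest component comes last
   return h;
}
```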
Many thanks. I will look into this (and probably ask relevant questions) as the work on quad materializes and gets closer to moving.
Work on quad has begun within the context of GSoC 2021. Some design choices have already been made; others are still to be made. Let's review:
@jzmaddock and @pabristow and @mborland and @NAThompson: Comments, suggestions, ideas?
See also this post
Work on quad has begun... If you have been following this older thread, there is a lot of new progress on double/quad-float within the context of GSoC21 (thanks @sinandredemption and @cosurgi). We now have a really cool draft of quad. We decided to implement a flexible template called

```cpp
template <typename FloatingPointType>
class cpp_quad_float;
```

which can be instantiated for float, double, long double or boost::multiprecision::float128. This class basically quadruples the precision of these floating-point types. If you get the fork here (https://github.com/BoostGSoC21/multiprecision), you can use <boost/multiprecision/cpp_quad_float.hpp> and some of its convenience types such as boost::multiprecision::cpp_quad_dbl. If you like, try quad-float on your favorite function or part of Boost.Math. We would like to hear field reports. I like using tgamma for such checks; a sample is shown below.

```cpp
#include <iomanip>
#include <iostream>
#include <limits>

#include <boost/math/constants/constants.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <boost/multiprecision/cpp_quad_float.hpp>

int main()
{
   using big_float_type = boost::multiprecision::cpp_quad_dbl;

   // N[Sqrt[Pi], 201]
   // 1.77245385090551602729816748334114518279754945612238712821380778985291128459103218137495065673854466541622682362428257066623615286572442260252509370960278706846203769865310512284992517302895082622893210

   std::cout << std::endl << "represent_tgamma_half" << std::endl;

   const big_float_type g     = boost::math::tgamma(big_float_type(0.5F));
   const big_float_type ctrl  = sqrt(boost::math::constants::pi<big_float_type>());
   const big_float_type delta = fabs(1 - (g / ctrl));

   const std::streamsize original_streamsize = std::cout.precision();
   std::cout << std::setprecision(std::numeric_limits<big_float_type>::digits10) << g    << std::endl;
   std::cout << std::setprecision(std::numeric_limits<big_float_type>::digits10) << ctrl << std::endl;
   std::cout << std::scientific << std::setprecision(4) << delta << std::endl;
   std::cout.precision(original_streamsize);
   std::cout.unsetf(std::ios::scientific);

   const bool result_is_ok = (delta < std::numeric_limits<big_float_type>::epsilon() * 40U);

   std::cout << "result_is_ok: " << std::boolalpha << result_is_ok << std::endl;

   return (result_is_ok ? 0 : -1);
}
```

Cc: @jzmaddock, @mborland, @NAThompson, @pabristow
Would the normal math tests – but including the type boost::multiprecision::cpp_quad_dbl – be a good starting test? It will take a fair while, obviously, so is it unsuitable for a CI test? Or has this been done already?
Hi @pabristow, thank you for following up on this. Actually, there are quite a few dedicated Multiprecision tests in the standard Multiprecision CI which exercise both the type's mathematical functions and operations as well as certain special functions from Math. We are in the process of using/adapting these for the reduced range(s) of double/quad-float. I think we have about 1/3 or more of the tests running... so we are moving forward on that. We made a dedicated double/quad-float CI for this project only in the GSoC timeframe, so we have been running tests on pushes, PRs and also a nightly CI.
@ckormanyos Seeing as we did not participate in GSoC 2022, what still needs to happen on the GSoC 2021 fork to bring quad-double to MP?
Hi Matt (@mborland), we got a good start, but a lot still remains to do to get this Multiprecision-ready. Together with Janek (@cosurgi) and our excellent GSoC '21 colleague Fahad (@sinandredemption) we got a good start on the basic algorithms and functionality. We implemented this thing with double-float and quad-float, whereby instantiation with float, double, long double or even __float128 is foreseen. Off the top of my head, we still need to...
This was (and is) a really cool project and it would be good to make some progress. I remember toward the end we did some really preliminary benchmarking and were a bit disappointed with the performance of quad-double. Perhaps Fahad has some clearer recollections of the state of the open points. We also have some open issues and a few branches that might handle the merge from main; I'll see if a good branch can be rebased or merged from main tonight. Cc: @jzmaddock
Hi Matt (@mborland), I was able to successfully merge the develop branch from BoostOrg into the GSoC double-double/quad-double GSoC21 repository. At the moment, only a few elementary function tests and (most of) the arithmetic tests run in a special CI process that I wrote for that project. The branches were about 500 commits behind BoostOrg, so it took a bit of time to merge. At the moment, the three branches … are up to date. If I get a chance in the upcoming year, I'd like to get back to this project and maybe see if double-double or quad-double can be finished (enough) for productive use. We still have to work out the edge cases and get all the elementary function tests running. If anyone has interest, time, ideas or thoughts, please feel free to add/join in. Cc: @cosurgi and @mborland and @jzmaddock and @sinandredemption
The fork in its updated state can be found here. Cc: @mborland and @jzmaddock
Thanks Chris! As you say, it was really just the corner cases and adequate testing that were holding this back; it would be nice to see it finished off...
Hi @cosurgi and @sinandredemption
Can anyone actually help me with item 1? Cc: @mborland and @jzmaddock
Hello @ckormanyos, I would definitely like to contribute to #1. Please allow me some time to re-familiarize myself with the codebase and set up my development environment.
Hi @sinandredemption (Fahad), I'm happy to see you on this thread. At the present time, I am making quite a few small but significant changes regarding the use of NaNs, infs, limits, zeros and the 128-bit floating-point type. I actually found that Multiprecision standalone is really clean regarding some of these items. I also just finished up using Boost.Multiprecision's near-native … One of the first significant testing problems I found has been rounding near the split-point of the division between first/second. So if you'd like to look at the rounding tests, that would be a great place to start. Cc: @cosurgi and @jzmaddock and @mborland
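The split between the first/second components mentioned here is maintained by renormalization; the usual primitive is Dekker's fast two-sum (a sketch assuming |a| >= |b| and round-to-nearest, not the project's actual code):

```cpp
// Dekker's fast two-sum: requires |a| >= |b|. Afterward hi holds the
// correctly rounded sum and lo the exact remainder, re-establishing
// the double-double invariant |lo| <= 0.5 ulp(hi).
inline void quick_two_sum(double a, double b, double& hi, double& lo)
{
   hi = a + b;
   lo = b - (hi - a);
}
```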
Hi @sinandredemption, there is also another point regarding even more adaptation to … This results in a handful of functions that could use some of Matt's (@mborland) … This could also be an interesting place to jump in and contribute. I can explain more in offline chat(s) if/when needed.
Introduction of the … is in progress. If/when this finishes in a while, I'd then suggest we move on with quad-float. Cc: @sinandredemption and @cosurgi and @jzmaddock and @mborland
If possible, could you support a greater variety of combinations? For example, MultiFloats.jl is a very fast library that represents extended-precision numbers with 2x, 3x, ..., up to 8x the precision of double, which might be helpful.
Hi @AuroraDysis, that looks interesting, and many thanks for your interest in double/quad/multi-float(s). During the GSoC and after it, I remember we were having trouble figuring out the exact algorithms for add/sub/mul/div and struggling with overflow/underflow/NaNs. For the moment, this project is on the back burner. Perhaps the Julia code you reference could give insight into this. If I/we ever pick up this issue, we will be sure to have a look, but I fear this might have rather low priority at the moment.
@ckormanyos Thanks for your reply. I have found multi-floats to be very useful in my research area and believe they will play an even more important role in the future. If you have the chance to pick up this project again and need some help, I'd like to contribute or collaborate with you. As for double-floats, DoubleFloats.jl is the most mature library I have ever used, although it is not as fast as MultiFloats.jl.
The plan is to do this, look into multi-floats, and ultimately get something adhering to Boost.Multiprecision/Math. If (which actually means when) I/we warm this up again, I'll be sure to look into your references and contact you regarding forward motion. Thanks @AuroraDysis for your interest and willingness to help. When this gets moving again, we will be in touch.
Hi,
as suggested in math #303 I repost this here:
How feasible would it be to integrate quad-double precision? It has 212 bits of significand and is faster than MPFR:
I have just recently integrated Boost.Multiprecision with a fairly large numerical computation software, YADE.
Float128 isn't enough for my needs; YADE would greatly benefit from quad-double (as it is much faster than MPFR). Also, YADE can serve as a really large testing ground. I have implemented CGAL numerical traits and Eigen numerical traits to use Multiprecision. And all of them are tested daily in our pipeline; see an example run here, or a more specific example for float128: test (need to scroll up) and check …