optimized (smaller) lookup table for float (binary32 only) #99
Pull requests invited!
To be clear, if I understand correctly, @jrahlf wants an implementation that supports only binary32 numbers (float). Squeezing the table is easy; one can simply work through the paper at https://arxiv.org/abs/2101.11408. Of course, the net result will only support binary32 numbers.
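For context, a rough reading of the paper's bounds (worth double-checking): the binary64 table covers 5^q for q in [-342, 308], since 10^309 already overflows the largest double and w × 10^-343 underflows to zero for any significand w below 10^19. Running the same argument for binary32 (largest finite float ≈ 3.4 × 10^38, smallest subnormal ≈ 1.4 × 10^-45) suggests a range on the order of q in [-65, 38], i.e. about 104 powers instead of 651, which is consistent with the 208-entry figure below.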
My mistake then, and I just confirmed that this would have off-by-one values, which would mess up the logic.
If you change these to float, the table size shrinks from 1302 to 208 entries, i.e., you can save approximately 8 kB. So one could add another table, a float variant of power_of_five_128, and then let the templatized code use the correct table.
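A minimal sketch of what that per-type selection could look like (C++17 for brevity; the names pow5_table / lookup_pow5 and the [-65, 38] binary32 range are my assumptions, not fast_float's actual internals, and the real tables would hold precomputed powers of five rather than the zero placeholders used here):

```cpp
#include <cstdint>
#include <cstdio>

template <typename T> struct pow5_table; // primary template left undefined

template <> struct pow5_table<double> {
  static constexpr int smallest = -342, largest = 308;
  // 651 powers x two 64-bit halves = 1302 entries (~10 kB)
  static constexpr std::uint64_t data[2 * (largest - smallest + 1)] = {}; // placeholder
};

template <> struct pow5_table<float> {
  static constexpr int smallest = -65, largest = 38; // assumed binary32 range
  // 104 powers x two 64-bit halves = 208 entries (~1.7 kB)
  static constexpr std::uint64_t data[2 * (largest - smallest + 1)] = {}; // placeholder
};

// The templatized parser indexes the table matching its target type, so a
// float-only instantiation never references the large double table.
template <typename T>
void lookup_pow5(int q, std::uint64_t &hi, std::uint64_t &lo) {
  using tbl = pow5_table<T>;
  hi = tbl::data[2 * (q - tbl::smallest)];
  lo = tbl::data[2 * (q - tbl::smallest) + 1];
}

int main() {
  std::uint64_t hi = 0, lo = 0;
  lookup_pow5<float>(0, hi, lo); // only the 208-entry table is touched
  std::printf("%llu %llu\n", static_cast<unsigned long long>(hi),
              static_cast<unsigned long long>(lo));
}
```

Whether the unused specialization's data actually drops out of the binary depends on the toolchain (e.g. section garbage collection), which ties into the catch below.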
There is one catch: if you used both double and float, the code size would be greater (worse) than when providing only the double table. Two possible solutions:
Yes, it is.
So I got a proof of concept: #103
With the separate float LUT the sizes are:
There are two notable things:
I would prefer to make the double LUT a composite of the float LUT plus additional data, but reading a composite object as one linear array would violate C++ aliasing rules. :( Overall it might make sense to always use either …
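To illustrate the aliasing point with a toy example (names and the index split are hypothetical, and the real mapping would be messier since the float power range sits inside the double range):

```cpp
#include <cstdint>

// Tempting but NOT valid C++: indexing float_part past its end to reach
// double_extra is undefined behavior, regardless of how the linker lays
// the arrays out in memory.
//
//   struct composite_lut {
//     std::uint64_t float_part[208];
//     std::uint64_t double_extra[1094];
//   };
//   // composite_lut{}.float_part[300] is UB, not extra entry number 92.
//
// A conforming alternative keeps two arrays and branches on the index:
constexpr std::uint64_t float_part[208] = {};    // placeholder values
constexpr std::uint64_t double_extra[1094] = {}; // placeholder values

inline std::uint64_t pow5_entry(int i) {
  // One well-predicted branch per lookup instead of one flat array.
  return i < 208 ? float_part[i] : double_extra[i - 208];
}

int main() { return static_cast<int>(pow5_entry(300)); }
```

The branch is essentially free in practice since the table half is fixed per call site, and in a templatized parser it could even be resolved at compile time.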
Have you considered optimizing the code size for parsing floats? The LUT power_of_five_128 has approximately 1400 entries, which are needed for parsing doubles. I don't know how many entries are required for parsing a float, but I suspect the LUT could be a lot smaller in that case. If there were a separate LUT for parsing floats, the compiled binary size could be reduced significantly.
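For a rough sense of scale (my own back-of-the-envelope, assuming each entry is a uint64_t): 1302 entries × 8 bytes ≈ 10.4 kB for the double table, versus 208 × 8 ≈ 1.7 kB for a float-only table, which lines up with the roughly 8 kB saving quoted earlier in the thread.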