Decimals and other binary bit sizes #9
Could you elaborate in more detail? Do you mean support for decimal float formats, and for formats like IEEE 754 binary128 (quad float)? |
Can all these be supported:
|
First, Dragonbox has nothing to do with decimal floats, because Dragonbox is a binary-to-decimal conversion algorithm. The format is already decimal, so there is nothing to convert.

For other binary float formats, I have to mention that some components in this library are there for possible extensions to other formats. More specifically, extensions for other formats can be provided by specializing

However, currently, I'm using an enum constant to distinguish the formats. The problem with an enum is that it is not open-ended: users can't add a new enum identifier without modifying the library code, so a better approach might be to replace the enums with tag types.

The reason I made it like this is that, in order to use the core features of Dragonbox with an exotic format, the cache table for that format has to be prepared, which requires a fair amount of knowledge of the algorithm. It's not that much if table generation is the only goal: each entry of the cache table is just the first Q significand bits of a power of 10, where the exponent runs from the constant

However, I'm not sure which one is better between the closed-ended enum approach and the open-ended tag type approach. Do you have any idea on this?

Another issue is that I don't know all of the formats you listed. I'm pretty sure that the algorithm can be generalized to the binary16, binary128, and binary256 formats, but it is possible that some of the other formats are out of range. (In fact, a naive approach of generating the full cache table for binary128 or binary256 will not be usable, as the table will be enormous, so it will be necessary to come up with a table compression scheme similar to what I've done for the binary64 format. I'm pretty sure it should be possible, but it's a nontrivial task.)

tl;dr |
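To make the "first Q significand bits of powers of 10" description concrete, here is a minimal sketch, not the generator that ships in the repo. It assumes Q = 64, handles nonnegative exponents only, and truncates rather than rounds; the function name `top64_of_pow10` is made up for illustration, and a real table also has to cover negative exponents and follow whatever rounding convention the algorithm requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the leading 64 bits of 10^k (k >= 0), truncated.
// The result is normalized so that its most significant bit is always 1.
std::uint64_t top64_of_pow10(unsigned k) {
    // Tiny bignum: little-endian 32-bit limbs, starting at 1.
    std::vector<std::uint32_t> limbs{1};
    for (unsigned i = 0; i < k; ++i) {
        std::uint64_t carry = 0;
        for (auto& limb : limbs) {
            std::uint64_t t = std::uint64_t(limb) * 10 + carry;
            limb = static_cast<std::uint32_t>(t);
            carry = t >> 32;
        }
        if (carry != 0) limbs.push_back(static_cast<std::uint32_t>(carry));
    }

    // Collect up to 64 bits, starting from the leading 1 bit.
    std::uint64_t result = 0;
    int collected = 0;
    bool started = false;
    for (std::size_t i = limbs.size(); i-- > 0 && collected < 64;) {
        for (int b = 31; b >= 0 && collected < 64; --b) {
            unsigned bit = (limbs[i] >> b) & 1u;
            if (!started && bit == 0) continue;  // skip leading zeros
            started = true;
            result = (result << 1) | bit;
            ++collected;
        }
    }
    assert(collected > 0);  // 10^k >= 1, so a leading 1 always exists

    // If 10^k has fewer than 64 bits, pad with zeros on the right.
    if (collected < 64) result <<= (64 - collected);
    return result;
}
```

Even this toy version shows why compile-time generation is awkward: it needs arbitrary-precision multiplication, and the number of limbs grows with the exponent range of the format.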
Making open-ended would be better.
Maybe this can be generated rather than hardcoded.
Even decimals have the string conversion part.
See: |
Generating them at compile time is theoretically possible, but only if the users can afford a stupidly long compilation time, the compilers' constexpr evaluation limit is absurdly high, and we have a constexpr bignum library. (And as I pointed out, precision checking is even more involved.) If you are talking about having separate runtime code for generating the table, it's already in this repo, although it's currently not supposed to be consumed by users.
As written in the README, string conversion is not the focus of this library. And it would not be difficult to write it manually since no conversion is necessary. You can even reuse the string generation part of this library if you want; you can just extract the significand and the exponent from the input and wrap it with
Okay, I'll have a look. |
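For the decimal case discussed above, where the significand and exponent are already decimal, a hand-written formatter can be quite small. The following is only a sketch under that assumption: `format_decimal` is a hypothetical name, it takes an already-extracted sign, decimal significand, and power-of-10 exponent, and it does not use any part of this library.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: prints (-1)^negative * significand * 10^exponent
// in scientific notation, e.g. (false, 12345, -2) -> "1.2345E2".
// Trailing zeros of the significand are kept as given.
std::string format_decimal(bool negative, std::uint64_t significand, int exponent) {
    std::string out = negative ? "-" : "";
    if (significand == 0) return out + "0E0";

    const std::string digits = std::to_string(significand);
    // Moving the decimal point behind the first digit raises the exponent
    // by (number of digits - 1).
    const int sci_exponent = exponent + static_cast<int>(digits.size()) - 1;

    out += digits[0];
    if (digits.size() > 1) {
        out += '.';
        out.append(digits, 1, std::string::npos);
    }
    out += 'E';
    out += std::to_string(sci_exponent);
    return out;
}
```

Extracting the significand and exponent from an actual decimal floating-point encoding is a separate step that depends on the encoding in use and is not shown here.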
A few thoughts after looking at it.
Thanks for bringing this issue up. I think I should first decide between the tag-type approach and the enum approach. |
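The difference between the two approaches can be sketched roughly as follows. All names here (`format_id`, `format_info`, the `*_tag` types) are invented for illustration and are not this library's interface; the point is only that a user can add `binary16_tag` without touching the "library" part.

```cpp
// Closed-ended: adding a format means editing this enum inside the library.
enum class format_id { binary32, binary64 };

// Open-ended: each format is an empty tag type, and a traits template
// is specialized once per tag.
struct binary32_tag {};
struct binary64_tag {};

template <class Tag>
struct format_info;  // primary template intentionally left undefined

template <>
struct format_info<binary32_tag> {
    static constexpr int significand_bits = 23;  // stored fraction bits
    static constexpr int exponent_bits = 8;
};

template <>
struct format_info<binary64_tag> {
    static constexpr int significand_bits = 52;
    static constexpr int exponent_bits = 11;
};

// User code, outside the library: a new format is just a new tag
// plus a specialization.
struct binary16_tag {};

template <>
struct format_info<binary16_tag> {
    static constexpr int significand_bits = 10;
    static constexpr int exponent_bits = 5;
};

// Generic code only talks to format_info<Tag>.
template <class Tag>
constexpr int total_bits() {
    return 1 + format_info<Tag>::exponent_bits + format_info<Tag>::significand_bits;
}
```

The trade-off mentioned earlier in the thread still applies: a tag type makes the extension point open, but whoever adds the tag must also supply a correct cache table for it.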
@expnkx Okay, I'll send you a message in discord |
I have a similar requirement, but only for the 80-bit floating-point format (i.e., the typical long double type in g++, for example; MSVC's long double is the same size as double). I am happy to eventually work on adding this, but I have no idea how difficult it is for someone not familiar with the Dragonbox code. I'd be grateful if you could advise on this. |
@zejal First of all, thanks a lot for your attention. Ideally, step 0 is to understand the algorithm. This is probably the hardest step if you are not familiar with this kind of thing, or possibly not so daunting if you already have experience with similar algorithms. The whole algorithm is explained in detail in this paper. Assuming that you have enough energy to dive into it, let me briefly go over the paper.
After understanding the algorithm, the next step is the float traits. All of the "float traits" code in this repo is intended for possible extension to other floating-point formats. Specifically, you will need to define a new float traits struct, using the default implementation as a reference. Currently, the default one is somewhat broken since it relies on undefined behavior. (I will probably fix it by today.) Then you need to generate the cache table, which is probably a somewhat annoying step. I guess the "full cache table" might be too large to be practical, so you may need to go the "compressed cache table" route. Relevant files are this and this. After that, I guess you might need to implement several format-specific optimized subroutines. At this point I'm not 100% sure that no significant change is needed to make it work. Please let me know if you find anything needing a big overhaul. Again, thanks a lot for your attention, and please let me know if there is anything I can help with. |
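As a starting point for the traits step, the basic parameters of the x87 80-bit extended format can be written down as below. This is only a sketch: the struct name and member names are hypothetical and do not follow the interface of the float traits in this repo, which should be taken from the default implementation instead. The numeric parameters themselves are those of the x87 extended format.

```cpp
#include <limits>

// Hypothetical description of the x87 80-bit extended format.
// Member names are illustrative, not the library's traits interface.
struct x87_extended80_format {
    static constexpr int significand_bits = 64;  // includes the explicit integer bit
    static constexpr int exponent_bits = 15;
    static constexpr int exponent_bias = 16383;
    static constexpr int min_exponent = -16382;  // smallest normal exponent
    static constexpr int max_exponent = 16383;   // largest finite exponent
    // Unlike binary32/binary64, the leading significand bit is stored explicitly.
    static constexpr bool has_explicit_leading_bit = true;
    static constexpr int total_bits = 1 + exponent_bits + significand_bits;
};

static_assert(x87_extended80_format::total_bits == 80, "sign + exponent + significand");
static_assert(x87_extended80_format::max_exponent ==
                  (1 << x87_extended80_format::exponent_bits) - 2 -
                      x87_extended80_format::exponent_bias,
              "the all-ones exponent field is reserved for infinity and NaN");

// Whether long double actually uses this format is platform-dependent
// (it does not on MSVC, as noted above), so check before relying on it.
constexpr bool long_double_is_x87_extended =
    std::numeric_limits<long double>::digits == 64 &&
    std::numeric_limits<long double>::max_exponent == 16384;
```

The explicit integer bit is the main structural difference from binary32 and binary64, and it is the kind of detail any code extracting the significand would have to account for.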
Just did it: 14c02bd |
Thanks a lot for all this. I've started having a look at the paper, but guidance is very helpful.
I'm far from understanding all the algorithm details, but I see that the implementation quite strongly assumes either the 32-bit or the 64-bit IEEE 754 format. For example, in several locations: if constexpr (std::is_same_v<format, ieee754_binary32>) { .... } else { static_assert(std::is_same_v<format, ieee754_binary64>); ... }. |
I somewhat already tried to minimize the occurrences of such
I believe this is more or less a necessary quirk of how the algorithm works. Right now I don't see how to avoid it, but any suggestions are welcome of course. |
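For readers unfamiliar with the pattern being discussed, here is a self-contained sketch of such a dispatch, assuming C++17. The tag types mirror the names quoted above, but `carrier_bits` and its return values are an invented example and not code from the library.

```cpp
#include <type_traits>

// Stand-ins for the format tags quoted in the comment above.
struct ieee754_binary32 {};
struct ieee754_binary64 {};

// Illustrative function: each branch holds format-specific logic.
template <class format>
constexpr int carrier_bits() {
    if constexpr (std::is_same_v<format, ieee754_binary32>) {
        return 32;
    } else {
        // Any format other than binary64 is rejected at compile time.
        static_assert(std::is_same_v<format, ieee754_binary64>,
                      "unsupported floating-point format");
        return 64;
    }
}

static_assert(carrier_bits<ieee754_binary32>() == 32, "binary32 uses 32 bits");
static_assert(carrier_bits<ieee754_binary64>() == 64, "binary64 uses 64 bits");
```

Supporting a third format with this pattern means adding a branch at every such location, which is why each occurrence matters for an 80-bit port.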
hello @ALL,
yes, it's a little complex, didn't try 128 or other figures but 80-bit long doubles work with the three steps as in 'method B.' in the snippet below. 'method A.' is the path for doubles or floats.
|
You just need to follow the usual procedure of wrapping C++ libs with a C interface. I.e., (1) write a header file with whatever list of function declarations you need, (2) write a cpp file that defines those functions by calling functions from (I accidentally clicked the "Close with comment" button while writing this😅) You can find lots of information on this on the Internet. Most of the articles on it seem to assume an object (class) API and therefore define an opaque pointer, but since |
hello :-) @jk-jeon, thank you for your fast reply, you didn't say something about 'would like a version for 80-bit long doubles, and lack the skills to build such from above info.'. in general i can try to dive into, but have seen here and at ryu that people with better skills didn't finish, thus think it's too big a task for me. let me explain the value of a long double version: in some way longs are dead, and also intel doesn't push them anymore, but! they provide a good way to check the results of double calculations which I consider very helpful for my work on last bit accuracy.
IMHO the two points above ( 'longs' and 'C' access ) could help a lot of people once they were available, whereas for the 'C' access float to string, double to string and long double to string would be sufficient. TIA for any help from you or other persons ... |
I would say it is not a dauntingly hard task, but it certainly requires a nontrivial amount of time and energy. It is also not something super-advanced or something that requires a lot of creativity and innovation, but it does require enough understanding of the underlying algorithms, which I assume not many people have already acquired, or are willing to acquire, at this point. Fast but 100% accurate float-to-string conversion is probably quite a niche thing that not many people care about, and doing it for 80-bit floats is presumably even more niche. Maybe no one has done it yet because it is not so rewarding given the amount of effort it requires. Also, regarding the C interface, I think this issue thread is not really the right place to talk about it. As I said, you can google, e.g., "wrapping C++ with C interface" and find plenty of information on it. Just be aware that the library API is not object-oriented, so you do not need a so-called "opaque pointer" in the interface. |
How does this change with decimals and other binary bit sizes? Can they be added?