10-70x faster $float, roundtrip + correct rounding via dragonbox #18008
I don't mind the "new mechanism for managing cacheable dependencies"; in fact, I recently had similar ideas, though aimed at letting the Nim compiler use Nimble packages. But skimming over dragonbox_impl.cc, c2nim eats this file for breakfast, and then we would have a native Nim implementation. In the past, adding C/C++ code didn't work well for us, and not just because of the lack of dependency tracking.
great, i think that's a real enabler to add runtime functionality without adding import/CT bloat;
yes, I'm also thinking in that direction; it's one way to break the cyclic dependency issue
how?
and from a local clone:
I'm not against it if there's an automatic source to source translation (that doesn't incur performance loss), but note that even in that case, the
Well, it needs some c2nim bugfixes but it shouldn't be too hard.
Why not use the official repo, https://github.com/jk-jeon/dragonbox? It has a header-only version too. Does it work with float32?
Note though that jk-jeon/dragonbox supports more options, and we can wrap it in future work to allow customizing (see policy options in jk-jeon/dragonbox, as well as float32 direct support); in fact I've done exactly that, in timotheecour#732 which wraps both, but we can discuss that after this PR is merged; the good news is that with the logic from this PR, adding new dependencies is easy, more on this later.
[1] see timotheecour#732 in which I wrap both jk-jeon/dragonbox and abolz/Drachennest; here's my benchmark which shows jk-jeon/dragonbox (aka toStringDragonbox0) is slower for double but faster for float32 compared to abolz/Drachennest (aka toStringDragonbox):
[2] see jk-jeon/dragonbox#8 (comment),
I honestly doubt it; there'll always be C++ constructs it can't handle. While abolz/Drachennest is C++11 (or C++03), other projects we may want to depend on in future (eg bigint, etc) might have more complex requirements (eg
compiler/builddeps.nim
let objFile = dir / ("$1nimdragonbox.o" % prefix)
if optForceFullMake in conf.globalOptions or not objFile.fileExists:
  # xxx
  # let cppExe = getCompilerExe(c.config; compiler: TSystemCC; cfile: AbsoluteFile): string =
I'm currently hardcoding the C++ compiler; a followup PR will improve this to use extccomp.getCompilerExe instead so it honors user configs (deferring this to a future PR as it requires a few other changes to allow this and this PR is already quite big)
We should c2nim the code, as I said.
as I wrote above, I've tried that already; c2nim can't handle dragonbox.cc, it lacks a lot of functionality to support C++ files like dragonbox.cc and chokes on many different parts of the code (see also https://github.com/nim-lang/c2nim/issues).
Compiling dragonbox.cc directly as in this PR is much simpler than fixing c2nim to parse C++ properly and convert it to nim (only to be later converted back to C or C++ during cgen); fixing c2nim could easily represent a multi-year dedicated effort, time that could be better spent elsewhere.
Anyway, if people start converting it to Nim like the last time with ryu, we should probably coordinate and split tasks.
Thanks for trying Drachennest! :-) Nice to see that this might actually be useful for someone else.
Just a few remarks:
The Dragonbox implementation in Drachennest currently only works correctly for double-precision (float64) numbers. float32 is actually not supported yet. (The double-precision implementation might produce more digits than necessary for single-precision numbers.)
So you should actually use Junekey Jeon's implementation (which currently requires C++20). Or the simplified implementation of Dragonbox in the fmt formatting library here (contributed by Junekey Jeon), which works with C++11, produces output similar to printf("%g"), is widely used and well tested.
Alternatively you might want to try out the Schubfach implementation (float32 version, float64 version). Dragonbox is an optimized version of this algorithm. It might be slightly slower than Dragonbox, but the implementation is less complex. (You may tweak the output using the MinFixedDecimalPoint/MaxFixedDecimalPoint constants in the ToChars methods to better match the output of printf.)
If you want to try this out, I may try to simplify the code such that (the current version of) c2nim is able to translate it.
HTH
and thanks for writing this library!
indeed, as can be seen with:
But note that this isn't a regression, because nim currently renders float32's by first converting them to float64 (and then using sprintf, until this PR), so with this PR you still get all the benefits of dragonbox for
Future work can add support for float32's, which will require other changes to nim anyways to avoid the pre-existing conversions from float32 to float64 done in a few places.
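To make the float32 point concrete, here is a minimal C++ sketch. It is not the Dragonbox or Schubfach algorithm; it is a slow brute-force reference that tries increasing `printf` precisions until the text parses back to the same value. The names `shortestDouble` and `shortestFloat` are hypothetical and do not come from Drachennest or from this PR, and the sketch assumes finite inputs and a correctly rounding `printf`/`strtod`/`strtof` in the C library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical reference helper (not from Drachennest): shortest "%g" text
// that parses back to exactly the same double. 17 significant digits are
// always enough for an IEEE-754 double.
std::string shortestDouble(double x) {
  assert(std::isfinite(x));  // assumption: finite input only
  char buf[64];
  for (int prec = 1; prec <= 17; ++prec) {
    std::snprintf(buf, sizeof buf, "%.*g", prec, x);
    if (std::strtod(buf, nullptr) == x) break;  // round trip succeeded
  }
  return buf;
}

// Same search for a float: 9 significant digits are always enough, and the
// round-trip check is done in single precision via strtof.
std::string shortestFloat(float x) {
  assert(std::isfinite(x));  // assumption: finite input only
  char buf[64];
  for (int prec = 1; prec <= 9; ++prec) {
    std::snprintf(buf, sizeof buf, "%.*g", prec, static_cast<double>(x));
    if (std::strtof(buf, nullptr) == x) break;  // round trip succeeded
  }
  return buf;
}
```

Under those assumptions, `shortestFloat(0.1f)` yields `0.1`, while `shortestDouble(static_cast<double>(0.1f))` yields `0.10000000149011612`: once a float32 has been widened to float64, a double-precision shortest-digits routine has to emit the extra digits to stay round-trip exact, which is the "more digits than necessary" effect described above.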
I don't see the point in doing that: you wouldn't get the benefits of future improvements from upstream libraries, and it'd have to be compiled into object code anyways (as done in this PR) to avoid adding ever more import/cyclic dependencies just to get a working
@Araq PTAL I have another branch (which depends on this PR) where it also applies dragonbox algorithm to float32 (using Drachennest-schubfach_32) in addition to float64 (still using Drachennest-dragonbox), along with performance benchmarks justifying it; it builds on this PR's
c2nim can now parse dragonbox.cc, and the way you implemented the "new mechanism for managing cacheable dependencies" via yet another compiler magic is not acceptable. We need an external "bundler" tool that is available at boot time (maybe by generating C sources for it).
(EDIT with https://gist.github.com/timotheecour/8f45eddc5b3533f5e166b305fde537f9) definitely better than before. But right now, c2nim doesn't produce a valid nim file. If c2nim ever becomes capable of generating correct nim code (without performance loss) for dragonbox.cc, that's great; we'll always have the option to replace dragonbox.cc with c2nim_dragonbox.nim at that point. Until then, I don't see anything wrong with using dragonbox.cc directly as I did in this PR, so we can benefit from this PR's speedup and correctness improvements without waiting for c2nim.
That was my first thought, but doing this at boot time is strictly less flexible, and prevents optional, on-demand dependencies (bigint etc, for which i have a PR in the work using the same approach leveraging existing libraries).
There are plenty of magics that could be replaced by regular procs (eg
I agree with Araq that we should port those algorithms to Nim, so there is no need to rush a new dependency injection system into the compiler. A port also has the benefit that you can use the algorithms at runtime, at compile time and in nimscript, whereas with the wrapper to dragonbox.cc you get different results in different contexts. I'd add that we should have both float32 and float64 versions.
then please write the port and submit a PR after making sure the performance is comparable to or better than what this PR offers? I don't see the point in re-inventing the wheel, this is NIH.
so, just like this PR then?
how so?
I have a followup branch that supports float32 as mentioned in #18008 (comment)
I'm not sure, it seems the usage of dragonbox is disabled for nimscript https://github.com/nim-lang/Nim/pull/18008/files#diff-f056834522efcf8e1e24c2c0cb2408cb1b1e6da5ac7a28129f514b89d3134394R17 but maybe it's just me being a newbie in nim compiler code... I didn't want to sound harsh, but just wanted to say that having proper nim support could lead to simplified internals and better control on the compiler code.
see for yourself:
nim should be able to handle external dependencies (and already does with pcre and all the existing wrappers); re-implementing existing libraries in nim doesn't reduce the overall workload/maintenance, the opposite is true.
Then what
it's a vmops, see
closing this because #18139 was merged, but see #18139 (comment) which shows a 1.4X performance drop compared to this PR
* use dragonbox algorithm; alternative to nim-lang#18008
* removed unsafe code
fixes
it also makes round actually useful for the purpose of formatting
dragonbox
$float, addFloat with dragonbox algorithm
nim r -d:danger tests/benchmarks/tstrfloats_bench
-d:nimLegacyAddFloat for legacy addFloat/$, or use toStringSprintf directly
std/strfloats
--forceBuild is passed)
new mechanism for managing cacheable dependencies
std/private/dependency_utils which allows specifying dependencies at CT, eg:
design goals
note
a key part of this PR is that it copies (with minor adjustments) https://github.com/abolz/Drachennest/blob/master/src/dragonbox.cc from this repo: https://github.com/abolz/Drachennest, which uses Boost Software License 1.0: https://github.com/abolz/Drachennest/blob/master/LICENSE; that's the only caveat with this PR, but I believe this license should be fine. If not, we could provide compiler options to make this dependency optional (possibly even downloading the sources depending on compiler flags if we must avoid bundling it with nim repo); note that we already have a precedent for that though, with linenoise (bsd2), plus other dependencies past or present.
links