Float point reading is lossy. #120

xpol · 2014-08-28T13:33:33Z

The floating-point value 0.9868011474609375 becomes 0.9868011474609376.

If this is a performance tradeoff, I think we should provide an option for users who prefer correctness over speed.
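
A minimal sketch of how the report can be checked outside the parser: the two decimal strings above are handed to the standard `std::strtod()` and the results compared. The helper names `ParsesToDifferentDoubles` and `PrintRoundTrip` are made up for this illustration and are not part of RapidJSON.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: true when the two decimal strings convert to
// different double values under the C library's strtod().
bool ParsesToDifferentDoubles(const char* a, const char* b) {
    return std::strtod(a, nullptr) != std::strtod(b, nullptr);
}

// Hypothetical helper: prints a decimal string next to the double it
// converts to, using 17 significant digits so distinct doubles print
// differently.
void PrintRoundTrip(const char* text) {
    std::printf("%s -> %.17g\n", text, std::strtod(text, nullptr));
}
```

0.9868011474609375 equals 64671/65536 and is therefore exactly representable, so a parser that returns the neighbouring double has lost information that was present in the input.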

miloyip · 2014-09-03T16:38:02Z

I recently tried an approach in this branch.

It backs up the valid characters onto the stack (which was previously used only by ParseString()).
If the value has the form s * 10^p with 0 <= s <= 2^53 - 1 and -22 <= p <= 22, where s and p are integers, it is converted via the fast path.
Otherwise, strtod() is called with the backed-up string on the stack.
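
The three steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the branch: `FastPath`, `ParseDoubleSketch` and `kPow10` are hypothetical names, the caller is assumed to have already extracted the integer significand `s` and decimal exponent `p`, and the exactness argument assumes IEEE 754 double arithmetic without extended-precision intermediates.

```cpp
#include <cstdint>
#include <cstdlib>

// 10^0 .. 10^22 are all exactly representable as doubles.
static const double kPow10[23] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

// Hypothetical fast path: succeeds only when s fits in 53 bits and
// |p| <= 22, so both operands are exact and a single multiplication
// or division rounds once.
bool FastPath(uint64_t s, int p, double* out) {
    const uint64_t kMaxSignificand = (uint64_t(1) << 53) - 1;
    if (s > kMaxSignificand || p < -22 || p > 22)
        return false;
    const double d = static_cast<double>(s);  // exact, s < 2^53
    *out = (p >= 0) ? d * kPow10[p] : d / kPow10[-p];
    return true;
}

// Hypothetical driver: `text` plays the role of the backed-up
// characters; it is only consulted when the fast path is rejected.
double ParseDoubleSketch(const char* text, uint64_t s, int p) {
    double d;
    if (FastPath(s, p, &d))
        return d;
    return std::strtod(text, nullptr);
}
```

Note that the value from the original report, written as 9868011474609375 * 10^-16, has a significand larger than 2^53 - 1, so in this sketch it is rejected by the fast path and goes through `strtod()`.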

More tests have been added, and it now parses numbers exactly (so in the Google Test cases, EXPECT_DOUBLE_EQ() can be changed to EXPECT_EQ()).

However, the downside is that the backup process incurs unnecessary overhead even when parsing integers. Calling strtod() may also be slow when the fast-path criteria are not met.

This raises two questions:

Whether to support the previous inexact solution, which uses only the fast path.
If both solutions are supported, which one should be the default.

Another point: with the exact solution, the lookup table in Pow10() can be reduced from the range [0..308] to [0..22].

pah · 2014-09-04T07:34:46Z

My 3¢:

I would prefer to have both options available, especially as there is a cost for integer numbers as well.
We should try to encapsulate the differences somehow to avoid spilling #if(def)s everywhere.
Can you quantify the maximum error introduced by the fast path solution?

miloyip · 2014-09-04T07:49:20Z

To keep both options, possible implementations:

1. Macro with #if everywhere in a single function.
2. Macro to switch between two different functions. Doubles the source lines.
3. New parse option (template parameter) that dispatches to two functions. Doubles the source lines. (A single function with if is difficult, because some local variables are used by only one implementation.)

Option 3 seems more flexible and easier to test. I think the function is quite difficult to refactor, but I can try.

I am not sure how to calculate the maximum error analytically. I can try to get an empirical result with random values.

pah · 2014-09-04T08:05:08Z

I think if EXPECT_DOUBLE_EQ passes for all random values on the fast path, this would be sufficient for many use cases.

I dislike (1), and I dislike code duplication even more. The "local variable" problem can be solved with a technique similar to the one used for the local copy of the stream: have a small template class wrap the local variable and provide its operations, doing nothing in the fast-path-only case. This way, the cost for the fast path is a single word on the stack, which should be acceptable.
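
One way to picture the wrapper idea is the sketch below. Everything in it is hypothetical (`CharBackup`, `ParseFractionSketch`, the `kSketch*` flags); it handles only unsigned digits with an optional decimal point and uses `std::string` in place of the parser's internal stack. The point is only that a single function body serves both configurations, with the backup operations compiling to no-ops when full precision is not requested.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical flags for this sketch only.
enum SketchParseFlag { kSketchNoFlags = 0, kSketchFullPrecision = 1 };

template <bool Enabled> class CharBackup;

// Fast-path-only configuration: every operation does nothing.
template <> class CharBackup<false> {
public:
    void Push(char) {}
    bool Convert(double*) const { return false; }
};

// Full-precision configuration: keep the characters for strtod().
template <> class CharBackup<true> {
public:
    void Push(char c) { buf_.push_back(c); }
    bool Convert(double* out) const {
        *out = std::strtod(buf_.c_str(), nullptr);
        return true;
    }
private:
    std::string buf_;
};

// One function body for both modes; the template flag selects the wrapper.
template <unsigned Flags>
double ParseFractionSketch(const char* s) {
    CharBackup<(Flags & kSketchFullPrecision) != 0> backup;
    double value = 0.0, scale = 1.0;
    bool seenDot = false;
    for (; (*s >= '0' && *s <= '9') || (*s == '.' && !seenDot); ++s) {
        backup.Push(*s);
        if (*s == '.') { seenDot = true; continue; }
        value = value * 10.0 + (*s - '0');  // may round for long inputs
        if (seenDot) scale *= 10.0;
    }
    double exact;
    if (backup.Convert(&exact))
        return exact;        // full-precision mode
    return value / scale;    // inexact accumulation only
}
```

A caller would pick the behaviour at compile time, for example `ParseFractionSketch<kSketchFullPrecision>("0.9868011474609375")`.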

If we go for the template parameter, we'd need a way to set default parse flags globally to avoid cluttering the user code when choosing a non-default configuration.

miloyip · 2014-09-04T08:09:40Z

> If we go for the template parameter, we'd need a way to set default parse flags globally to avoid cluttering the user code when choosing a non-default configuration.

Any suggestion? Using macro as parameter default value?

pah · 2014-09-04T08:16:57Z

> Using macro as parameter default value?

Yes, this could work as a simple solution:

// Users may define RAPIDJSON_DEFAULT_PARSEFLAGS before including the header.
#ifndef RAPIDJSON_DEFAULT_PARSEFLAGS
#define RAPIDJSON_DEFAULT_PARSEFLAGS kParseNoFlags
#endif
enum ParseFlag {
    kParseNoFlags = 0,
// ....
    kParseDefaultFlags = RAPIDJSON_DEFAULT_PARSEFLAGS
};

... or something like this.
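
To show how such a macro would reach user code, here is a self-contained sketch with made-up names (`DEMO_DEFAULT_PARSEFLAGS`, `DemoFlag`, `UsesFullPrecision`). It assumes the parse flags are a template parameter whose default is the enum value derived from the macro, and it requires C++11 for the default template argument on a function template.

```cpp
// In a real build this would usually come from the compiler command
// line (e.g. -DDEMO_DEFAULT_PARSEFLAGS=kDemoFullPrecision).
#define DEMO_DEFAULT_PARSEFLAGS kDemoFullPrecision

#ifndef DEMO_DEFAULT_PARSEFLAGS
#define DEMO_DEFAULT_PARSEFLAGS kDemoNoFlags
#endif

// Hypothetical flag set; the macro is expanded inside the enum, so it
// may name an enumerator declared earlier in the same enum.
enum DemoFlag {
    kDemoNoFlags = 0,
    kDemoFullPrecision = 1,
    kDemoDefaultFlags = DEMO_DEFAULT_PARSEFLAGS
};

// Callers that do not pass flags pick up the globally configured default.
template <unsigned Flags = kDemoDefaultFlags>
bool UsesFullPrecision() {
    return (Flags & kDemoFullPrecision) != 0;
}
```

With this arrangement `UsesFullPrecision()` follows the global setting, while `UsesFullPrecision<kDemoNoFlags>()` still overrides it at a single call site.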

miloyip · 2014-09-05T14:34:16Z

Today I have refactored ParseNumber() and added kParseFullPrecision option.

Regarding @pah's comment:

> I think if EXPECT_DOUBLE_EQ passes for all random values on the fast path, this would be sufficient for many use cases.

I have added an experiment with randomly generated doubles (excluding denormals). Using the same definition of ULP as Google Test, the result is:

[ RUN      ] Reader.ParseNumber_NormalPrecisionError
ULP Average = 0.382924, Max = 3 
[       OK ] Reader.ParseNumber_NormalPrecisionError (547 ms)

Since EXPECT_DOUBLE_EQ() considers two doubles equal if they are within 4 ULPs of each other, a maximum error of 3 ULPs certainly passes EXPECT_DOUBLE_EQ().
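
For readers unfamiliar with the ULP measure, the following sketch computes the distance in the same spirit as Google Test: the sign-and-magnitude bit pattern of each double is mapped to a biased unsigned integer and the two integers are subtracted. The names `ToBiased`, `UlpDistance` and `WithinFourUlps` are hypothetical, and unlike Google Test this sketch does not special-case NaN.

```cpp
#include <cstdint>
#include <cstring>

// Map the IEEE 754 bit pattern to an unsigned integer that increases
// monotonically with the double's value.
uint64_t ToBiased(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    const uint64_t kSignBit = uint64_t(1) << 63;
    return (bits & kSignBit) ? (~bits + 1)       // negative values
                             : (bits | kSignBit);  // zero and positive values
}

// Number of representable doubles between a and b (0 when equal).
uint64_t UlpDistance(double a, double b) {
    const uint64_t x = ToBiased(a), y = ToBiased(b);
    return x >= y ? x - y : y - x;
}

// The tolerance quoted above: at most 4 ULPs apart.
bool WithinFourUlps(double a, double b) {
    return UlpDistance(a, b) <= 4;
}
```

Under this measure the two values from the original report are adjacent doubles, so they would compare equal with a 4-ULP tolerance but not with an exact comparison.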

pah · 2014-09-05T14:45:54Z

Nice work! 👍

Now only the "default parse flags" are missing, but this is a separate issue, I guess.

spl · 2014-09-15T19:06:10Z

For me, numerical correctness is more important than speed, but I do like the focus you have on performance.

Parse JSON number to double in full-precision with custom strtod. Fix #120

miloyip added this to the v1.0 Beta milestone Sep 1, 2014

miloyip added the bug label Sep 1, 2014

miloyip self-assigned this Sep 1, 2014

miloyip mentioned this issue Sep 6, 2014

Parse JSON number to double in full-precision. #137

Merged

pah mentioned this issue Nov 17, 2014

Cannot parse min normal positive double #197

Closed

miloyip added a commit that referenced this issue Nov 30, 2014

Add RAPIDJSON_PARSE_DEFAULT_FLAGS for customizing kParseDefaultFlags

23b7a5e

miloyip closed this as completed in #137 Nov 30, 2014

miloyip added a commit that referenced this issue Nov 30, 2014

Merge pull request #137 from miloyip/issue120floatprecision

454146b

Parse JSON number to double in full-precision with custom strtod. Fix #120

m7thon mentioned this issue Jul 14, 2015

Issue when serializing double USCiLab/cereal#202

Open
