Adopting Alexhuszagh's decimal comparison approach for long input strings #104

lemire · 2021-09-14T01:30:31Z

I did some binary bloat checks.

Replaces the existing decimal implementation, for substantial performance improvements with near-halfway cases. This is especially fast with a large number of digits. **Big Integer Implementation** A small subset of big-integer arithmetic has been added, with the `bigint` struct. It uses a stack-allocated vector with enough bits to store the float with the large number of significant digits. This is log2(10^(769 + 342)), to account for the largest possible magnitude exponent, and number of digits (3600 bits), and then rounded up to 4k bits. The limb size is determined by the architecture: most 64-bit architectures have efficient 128-bit multiplication, either by a single hardware instruction or 2 native multiplications for the high and low bits. This includes x86_64, mips64, s390x, aarch64, powerpc64, riscv64, and the only known exception is sparcv8 and sparcv9. Therefore, we define a limb size of 64-bits on 64-bit architectures except SPARC, otherwise we fallback to 32-bit limbs. A simple stackvector is used, which just has operations to add elements, index, and truncate the vector. `bigint` is then just a wrapper around this, with methods for big-integer arithmetic. For our algorithms, we just need multiplication by a power (x * b^N), multiplication by a bigint or scalar value, and addition by a bigint or scalar value. Scalar addition and multiplication uses compiler extensions when possible (__builtin_add_overflow and __uint128_t), if not, then we implement simple logic shown to optimize well on MSVC. Big-integer multiplication is done via grade school multiplication, which is more efficient than any asymptotically faster algorithms. Multiplication by a power is then done via bitshifts for powers-of-two, and by iterative multiplications of a large and then scalar value for powers-of-5. **compute_float** Compute float has been slightly modified so if the algorithm cannot round correctly, it returns a normalized, extended-precision adjusted mantissa with the power2 shifted by INT16_MIN so the exponent is always negative. `compute_error` and `compute_error_scaled` have been added. **Digit Optimiations** To improve performance for numbers with many digits, `parse_eight_digits_unrolled` is used for both integers and fractions, and uses a while loop than two nested if statements. This adds no noticeable performance cost for common floats, but dramatically improves performance for numbers with large digits (without these optimizations, ~65% of the total runtime cost is in parse_number_string). **Parsed Number** Two fields have been added to `parsed_number_string`, which contains a slice of the integer and fraction digits. This is extremely cheap, since the work is already done, and the strings are pre-tokenized during parsing. This allows us on overflow to re-parse these tokenized strings, without checking if each character is an integer. Likewise, for the big-integer algorithms, we can merely re-parse the pre-tokenized strings. **Slow Algorithm** The new algorithm is `digit_comp`, which takes the parsed number string and the `adjusted_mantissa` from `compute_float`. The significant digits are parsed into a big integer, and the exponent relative to the significant digits is calculated. If the exponent is >= 0, we use `positive_digit_comp`, otherwise, we use `negative_digit_comp`. `positive_digit_comp` is quite simple: we scale the significant digits to the exponent, and then we get the high 64-bits for the native float, determine if any lower bits were truncated, and use that to direct rounding. `negative_digit_comp` is a little more complex, but also quite trivial: we use the parsed significant digits as the real digits, and calculate the theoretical digits from `b+h`, the halfway point between `b` and `b+u`, the next-positive float. To get `b`, we round the adjusted mantissa down, create an extended-precision representation, and calculate the halfway point. We now have a base-10 exponent for the real digits, and a base-2 exponent for the theoretical digits. We scale these two to the same exponent by multiplying the theoretixal digits by `5**-real_exp`. We then get the base-2 exponent as `theor_exp - real_exp`, and if this is positive, we multipy the theoretical digits by it, otherwise, we multiply the real digits by it. Now, both are scaled to the same magnitude, and we simply compare the digits in the big integer, and use that to direct rounding. **Rust-Isms** A few Rust-isms have been added, since it simplifies logic assertions. These can be trivially removed or reworked, as needed. - a `slice` type has been added, which is a pointer and length. - `FASTFLOAT_ASSERT`, `FASTFLOAT_DEBUG_ASSERT`, and `FASTFLOAT_TRY` have been added - `FASTFLOAT_ASSERT` aborts, even in release builds, if the condition fails. - `FASTFLOAT_DEBUG_ASSERT` defaults to `assert`, for logic errors. - `FASTFLOAT_TRY` is like a Rust `Option` type, which propagates errors. Specifically, `FASTFLOAT_TRY` is useful in combination with `FASTFLOAT_ASSERT` to ensure there are no memory corruption errors possible in the big-integer arithmetic. Although the `bigint` type ensures we have enough storage for all valid floats, memory issues are quite a severe class of vulnerabilities, and due to the low performance cost of checks, we abort if we would have out-of-bounds writes. This can only occur when we are adding items to the vector, which is a very small number of steps. Therefore, we abort if our memory safety guarantees ever fail. lexical has never aborted, so it's unlikely we will ever fail these guarantees.

Implement the big-integer arithmetic algorithm.

Alexhuszagh and others added 2 commits September 10, 2021 18:53

Merge pull request #96 from Alexhuszagh/bigint

3f0ba09

Implement the big-integer arithmetic algorithm.

lemire merged commit 5c85d38 into main Sep 14, 2021

lemire deleted the dlemire/bigint branch January 28, 2023 01:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adopting Alexhuszagh's decimal comparison approach for long input strings #104

Adopting Alexhuszagh's decimal comparison approach for long input strings #104

lemire commented Sep 14, 2021

Adopting Alexhuszagh's decimal comparison approach for long input strings #104

Adopting Alexhuszagh's decimal comparison approach for long input strings #104

Conversation

lemire commented Sep 14, 2021