dzhang314 committed Sep 30, 2024
2 parents 2438063 + f9dfddc commit 776c6bb
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -2,9 +2,9 @@

**Copyright © 2019-2024 by David K. Zhang. Released under the [MIT License][1].**

-**MultiFloats.jl** is a Julia package for extended-precision arithmetic using 100–400 bits (30–120 decimal digits). In this range, it is the fastest library that I am aware of. At 100-bit precision, **MultiFloats.jl** is roughly **40× faster than [`BigFloat`][2]**, **5× faster than [Quadmath.jl][3]**, and **1.5× faster than [DoubleFloats.jl][4]**.
+**MultiFloats.jl** is a Julia package for extended-precision arithmetic using 100–400 bits (30–120 decimal digits). In this range, it is the fastest library that I am aware of. At 100-bit precision, **MultiFloats.jl** is roughly **30× faster than [`BigFloat`][2]**, **6× faster than [Quadmath.jl][3]**, and **2× faster than [DoubleFloats.jl][4]**.

-**MultiFloats.jl** is fast because it uses native `Float64` operations on static data structures that do not dynamically allocate memory. In contrast, [`BigFloat`][2] allocates memory for every single arithmetic operation, requiring frequent pauses for garbage collection. In addition, **MultiFloats.jl** uses branch-free algorithms that can be vectorized for even faster execution on [SIMD][5] processors.
+**MultiFloats.jl** is fast because it uses native `Float64` operations on static data structures that do not dynamically allocate memory. In contrast, [`BigFloat`][2] allocates memory for every single arithmetic operation, requiring frequent pauses for garbage collection. In addition, **MultiFloats.jl** uses branch-free vectorized algorithms for even faster execution on [SIMD][5] processors.
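
To illustrate what "native `Float64` operations on static data structures" and "branch-free" can mean in practice, here is a minimal sketch of two-limb addition built on Knuth's TwoSum. The names `TwoLimb`, `two_sum`, `fast_two_sum`, and `add_two_limb` are hypothetical and chosen for this example only; this is not the package's actual code, and its real algorithms may differ.

```julia
# Hypothetical two-limb value: the unevaluated sum hi + lo of two Float64s.
# An immutable struct of Float64 fields is a bits type, so values can live
# in registers or on the stack without heap allocation.
struct TwoLimb
    hi::Float64
    lo::Float64
end

# Knuth's TwoSum: returns (s, e) with s = fl(a + b) and s + e == a + b
# exactly (barring overflow). Six additions/subtractions, no branches.
function two_sum(a::Float64, b::Float64)
    s = a + b
    bv = s - a                 # portion of b that made it into s
    av = s - bv                # portion of a that made it into s
    e = (a - av) + (b - bv)    # rounding error of a + b
    return s, e
end

# Dekker's FastTwoSum: same result with three operations,
# but only valid when |a| >= |b| (or a == 0).
function fast_two_sum(a::Float64, b::Float64)
    s = a + b
    e = b - (s - a)
    return s, e
end

# A simple ("sloppy") two-limb addition: add the high limbs exactly,
# fold the low limbs into the error term, then renormalize.
function add_two_limb(x::TwoLimb, y::TwoLimb)
    s, e = two_sum(x.hi, y.hi)
    e += x.lo + y.lo
    hi, lo = fast_two_sum(s, e)
    return TwoLimb(hi, lo)
end

# 2^-60 is too small to survive a plain Float64 addition with 1.0,
# but the two-limb sum keeps it in the low limb.
r = add_two_limb(TwoLimb(1.0, 0.0), TwoLimb(2.0^-60, 0.0))
println(r)
```

Because every step is a fixed sequence of `Float64` additions and subtractions with no data-dependent branch, a loop over many such values is a plausible candidate for SIMD vectorization by the compiler.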

**MultiFloats.jl** provides pure-Julia implementations of the basic arithmetic operations (`+`, `-`, `*`, `/`, `sqrt`), comparison operators (`==`, `!=`, `<`, `>`, `<=`, `>=`), and floating-point introspection methods (`isfinite`, `eps`, `minfloat`, etc.). Transcendental functions (`exp`, `log`, `sin`, `cos`, etc.) are supported through [MPFR][6].

@@ -81,14 +81,14 @@ We use [two linear algebra tasks][11] to compare the performance of extended-pre
* QR factorization of a random 400×400 matrix
* Pseudoinverse of a random 400×250 matrix using [GenericLinearAlgebra.jl][12]

-The timings reported below are averages of 10 single-threaded runs performed on an Intel Core i9-11900KF processor using Julia 1.10.0.
+The timings reported below are minima of 3 single-threaded runs performed on an Intel Core i9-11900KF processor using Julia 1.10.5.
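
As a rough sketch of what "minimum of 3 runs" could look like in code, the snippet below times `qr` on a small `BigFloat` matrix using only the standard library. The helper `min_time`, the warm-up call, and the 50×50 matrix size are assumptions made for this illustration; the source does not describe the actual benchmark harness beyond the run count and the tasks.

```julia
using LinearAlgebra

# Hypothetical helper: smallest wall-clock time in seconds over `runs`
# calls of the zero-argument function `f`.
function min_time(f; runs::Int = 3)
    f()  # warm-up call so JIT compilation is not included (an assumption)
    return minimum(@elapsed(f()) for _ in 1:runs)
end

# Small stand-in problem; the README's benchmark uses a 400×400 matrix.
A = BigFloat.(rand(50, 50))
t = min_time(() -> qr(A))
println("qr on a 50×50 BigFloat matrix: ", t, " sec (minimum of 3 runs)")
```

Taking the minimum rather than the mean reduces the influence of one-off disturbances such as garbage-collection pauses or other processes on the machine.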

-| | MultiFloats<br>`Float64x2` | MPFR<br>[`BigFloat`][2] | Arb<br>[`ArbFloat`][13] | Intel<br>[`Dec128`][14] | Julia<br>[`Double64`][4] | libquadmath<br>[`Float128`][3] |
-|----------------|----------------------------|--------------------------|--------------------------|--------------------------|--------------------------|------------------------|
-| 400×400 `qr` | 0.276 sec | 7.311 sec<br>27× slower | 13.259 sec<br>48× slower | 11.963 sec<br>43× slower | 0.384 sec<br>1.4× slower | 1.399 sec<br>5× slower |
-| correct digits | 26.2 | 25.9 | 25.9 | 27.7 | 26.1 | 27.9 |
-| 400×250 `pinv` | 1.236 sec | 49.581 sec<br>40× slower | ❌ Error | ❌ Error | 1.899 sec<br>1.5× slower | 7.551 sec<br>6× slower |
-| correct digits | 26.0 | 25.8 | ❌ Error | ❌ Error | 25.9 | 27.9 |
+| | MultiFloats<br>`Float64x2` | MPFR<br>[`BigFloat`][2] | Arb<br>[`ArbFloat`][13] | Intel<br>[`Dec128`][14] | Julia<br>[`Double64`][4] | GNU<br>[`Float128`][3] |
+|----------------|----------------------------|-------------------------|-------------------------|-------------------------|--------------------------|------------------------|
+| 400×400 `qr` | 0.213 sec | 3.74 sec<br>18× slower | ❌ Error | ❌ Error | 0.408 sec<br>1.9× slower | 1.19 sec<br>5.6× slower |
+| correct digits | 26.3 | 26.1 | ❌ Error | ❌ Error | 26.3 | 27.9 |
+| 400×250 `pinv` | 0.872 sec | 29.3 sec<br>34× slower | ❌ Error | ❌ Error | 1.95 sec<br>2.2× slower | 6.37 sec<br>7.3× slower |
+| correct digits | 26.0 | 25.9 | ❌ Error | ❌ Error | 26.0 | 27.9 |
| selectable precision | ✔️ | ✔️ | ✔️ ||||
| avoids allocation | ✔️ ||| ✔️ | ✔️ | ✔️ |
| arithmetic<br>`+`, `-`, `*`, `/`, `sqrt` | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |