Emulate Float64 #520
ggkountouras started this conversation in Ideas
Replies: 1 comment 3 replies
-
I think this would be useful to have, but ideally as part of a vendor-neutral package (à la DoubleFloats.jl; maybe something exists already).
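For readers unfamiliar with the double-float idea, here is a minimal sketch of it, assuming the approach is to carry a value as an unevaluated sum of two Float32 numbers. It is not code from DoubleFloats.jl; the helper names (`f32`, `two_sum`, `fast_two_sum`, `split`, `df_add`) are made up for illustration, and Float32 arithmetic is simulated on the CPU by rounding through `struct`. Note that this gives roughly 48 bits of significand, not a bit-exact IEEE-754 Float64.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a, b):
    """Error-free addition (Knuth): s is the rounded Float32 sum, e the rounding error."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def fast_two_sum(a, b):
    """Error-free addition that assumes |a| >= |b| (Dekker)."""
    s = f32(a + b)
    e = f32(b - f32(s - a))
    return s, e

def split(x):
    """Split a binary64 value into a (hi, lo) pair of Float32 values."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo

def df_add(x, y):
    """Add two (hi, lo) pairs; a simple variant, not the most accurate one."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    return fast_two_sum(s, e)  # renormalize so that |lo| is small relative to hi

hi, lo = df_add(split(0.1), split(0.2))
print(hi + lo)                      # pair result, evaluated in binary64 for display
print(f32(f32(0.1) + f32(0.2)))     # plain Float32 result, for comparison
```

The pair result tracks the Float64 sum far more closely than plain Float32 does, at the cost of several Float32 operations per addition.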
-
1) Make it work
Using the theory from SoftFloat (https://github.com/ucb-bar/berkeley-softfloat-3) and the partially finished libMetalFloat64 (https://github.com/philipturner/metal-float64), implement a proof-of-concept version. At this stage, it is okay to have low throughput compared to native Float32.
2) Make it right
Implement rounding modes. Add atomics. Ensure IEEE-754 compliance with tests.
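To give a flavor of what steps 1 and 2 involve, here is a rough sketch of a SoftFloat-style multiply done purely with integer operations on the 64-bit pattern, followed by a comparison against native Float64. It is an illustration, not code from SoftFloat or libMetalFloat64: the function names are hypothetical, only round-to-nearest-even is implemented, and zeros, subnormals, Inf and NaN are deliberately rejected. Python's big integers also hide the 64x64-bit product, which a GPU kernel would have to assemble from 32-bit pieces.

```python
import struct

MASK52 = (1 << 52) - 1

def f64_to_bits(x):
    """Reinterpret a binary64 value as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_f64(b):
    """Reinterpret a 64-bit unsigned integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def soft_mul(a_bits, b_bits):
    """Multiply two binary64 bit patterns using integer arithmetic only.

    Sketch only: normal finite inputs and results, round-to-nearest-even.
    """
    sign = (a_bits ^ b_bits) >> 63
    ea = (a_bits >> 52) & 0x7FF
    eb = (b_bits >> 52) & 0x7FF
    if ea in (0, 0x7FF) or eb in (0, 0x7FF):
        raise NotImplementedError("zero, subnormal, Inf and NaN are not handled")
    ma = (a_bits & MASK52) | (1 << 52)   # restore the implicit leading one
    mb = (b_bits & MASK52) | (1 << 52)
    prod = ma * mb                        # lies in [2**104, 2**106)
    exp = ea + eb - 1023                  # remove one copy of the bias
    shift = 52
    if prod >> 105:                       # significand product in [2, 4)
        shift = 53
        exp += 1
    mant = prod >> shift                  # 53-bit significand
    rem = prod & ((1 << shift) - 1)       # discarded low bits
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (mant & 1)):
        mant += 1                         # round to nearest, ties to even
        if mant >> 53:                    # rounding carried into a new bit
            mant >>= 1
            exp += 1
    if not 0 < exp < 0x7FF:
        raise NotImplementedError("overflow and underflow are not handled")
    return (sign << 63) | (exp << 52) | (mant & MASK52)

# Compliance-style check: compare bit-for-bit against native Float64.
for a, b in [(1.5, 2.25), (0.1, 0.3), (-3.7, 1e10)]:
    assert soft_mul(f64_to_bits(a), f64_to_bits(b)) == f64_to_bits(a * b)
```

A real test suite would cover every rounding mode and all the special cases this sketch skips.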
3) Make it fast
Add an option to drop strict IEEE-754 compliance (remove denormals, don't check for Inf/NaN). Add vectorization. Inline at a higher level. Implement fused multiply-add.
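As one example of the relaxations in step 3, "remove denormals" could mean flushing subnormal values to zero so the arithmetic never has to normalize them. The sketch below shows that on the raw 64-bit pattern; the function name `flush_denormal` is hypothetical, and where such a flush would sit (on inputs, outputs, or both) is an open design choice.

```python
import struct

EXP_MASK = 0x7FF << 52
SIGN_MASK = 1 << 63

def f64_to_bits(x):
    """Reinterpret a binary64 value as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def flush_denormal(bits):
    """Replace a subnormal bit pattern with a zero of the same sign.

    An all-zero exponent field means zero or subnormal; either way the
    result is a signed zero. All other patterns pass through unchanged.
    """
    if (bits & EXP_MASK) == 0:
        return bits & SIGN_MASK
    return bits

tiny = f64_to_bits(5e-324)            # smallest positive subnormal
assert flush_denormal(tiny) == 0
assert flush_denormal(f64_to_bits(1.0)) == f64_to_bits(1.0)
```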