Float64 to Float16 conversion is slow #41161
Comments
Yeah. This is one of those functions that should be really easy to implement, but is surprisingly hard to get correct and fast. It's been on my list for a while.
Which is the Line 1490 in 15c19c8
Looking at it closely, it internally converts to
Interestingly, this is already fast on M1 Macs (ARM) with Julia 1.9.1.
It is still slow on an x86_64 machine (also using Julia 1.9.1).
@oscardssmith should we do the double conversion, i.e. define Float16(Float64) as Float16(Float32(Float64)), or is the double rounding wrong?
Double rounding is wrong.
Demo: round 0.499 to 2 decimal places and you get 0.50. Now round that to the nearest integer and you get 1 (with "round half up"). But round 0.499 to the nearest integer directly and you get 0, even with round half up.
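The decimal demo can be checked mechanically. This is a minimal sketch using Python's standard `decimal` module with an explicit round-half-up mode; Python is used only for illustration and is not part of the Julia code under discussion.

```python
from decimal import Decimal, ROUND_HALF_UP

x = Decimal("0.499")

# Two-step rounding: first to 2 decimal places, then to an integer.
two_places = x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
twice = two_places.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# One-step rounding: straight to an integer.
once = x.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

print(two_places, twice, once)  # prints: 0.50 1 0
```

The intermediate step lands exactly on a tie (0.50), so the second rounding goes the wrong way.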
I believe we are calling compiler-rt for this. Of course, this can't be implemented by converting via Float32, since that rounds twice, but it's frustrating that that method is so much faster. It would be nice to have a better implementation of this. See also #40315.
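To make the "rounds twice" problem concrete for binary floats, here is a small sketch that uses Python's standard `struct` module as a stand-in: the `"e"` format packs a binary64 value directly to binary16, and `"f"` packs it to binary32. The helper names `to_f32` and `to_f16` and the test value are assumptions chosen for illustration. This is not Julia's or compiler-rt's code path.

```python
import struct

def to_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x: float) -> float:
    """Round a binary64 value to the nearest binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Slightly above the midpoint between the adjacent Float16 values
# 1.0 and 1.0 + 2**-10. The extra 2**-30 is too small for Float32 to keep.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)           # single rounding: above the midpoint, rounds up
via_f32 = to_f16(to_f32(x))  # Float32 step lands exactly on the midpoint,
                             # then ties-to-even rounds down to 1.0

print(direct, via_f32)  # prints: 1.0009765625 1.0
```

The two results differ by one Float16 ulp, which is why the faster route through Float32 cannot simply replace the direct conversion.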