Switch Float16 LLVM representation from `i16` to `half` #26381
Some LLVM intrinsics can't be lowered to instructions on some target platforms. Some of these are therefore lowered to libc libcalls, and glibc does not implement all of them. This commit uses `extern_c` to implement these fallback methods in native Julia. Similar projects to RTLIB are glibc and compiler-rt. For now we only implement conversion between `Float16` and `Float32`; we also cheat by going through `Float32` for the conversion between `Float64` and `Float16`.
LLVM intrinsics either map to instructions or to functions in compiler-rt. Since we provide our own implementation, we can just look them up in sys.so and resolve to the function there. On Darwin we have to use an unmangled version of the function name.
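To make the kind of fallback described above concrete, here is a minimal sketch of a software widening conversion from a `Float16` bit pattern to `Float32`, plus the `Float64` path routed through `Float32`. The names `half_to_float` and `half_to_double` are hypothetical and are not taken from the PR; this only illustrates the technique and is not the RTLIB implementation.

```julia
# Hypothetical helper, not the PR's code: widen IEEE binary16 bits to Float32.
function half_to_float(h::UInt16)::Float32
    sign = UInt32(h & 0x8000) << 16      # move the sign bit to bit 31
    exp  = UInt32(h >> 10) & 0x1f        # 5-bit biased exponent
    frac = UInt32(h & 0x03ff)            # 10-bit fraction
    if exp == 0x1f
        # Inf (frac == 0) or NaN (frac != 0); the payload is shifted up as-is.
        bits = sign | 0x7f800000 | (frac << 13)
    elseif exp != 0
        # Normal number: rebias the exponent from 15 to 127 (difference 112).
        bits = sign | ((exp + UInt32(112)) << 23) | (frac << 13)
    else
        # Zero or subnormal: the magnitude is exactly frac * 2^-24.
        bits = sign | reinterpret(UInt32, ldexp(Float32(frac), -24))
    end
    return reinterpret(Float32, bits)
end

# Widening to Float64 by way of Float32 is exact, since every Float16
# value is representable in Float32.
half_to_double(h::UInt16)::Float64 = Float64(half_to_float(h))
```

Only the widening direction is sketched here. The narrowing direction needs explicit rounding, and routing a `Float64` through `Float32` on the way down to `Float16` can in principle round twice, which is presumably why the description calls it a cheat.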
base/rtlib/RTLIB.jl (outdated)
# We would like to use `@ccallable` here,
# but building the sysimage fails, so we use a bootstrapped version.
function register(f, rtype, argt, name)
@vtjnash I was trying to use `@ccallable` here but I was getting failures during sysimage building.
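For readers unfamiliar with exposing Julia functions to C callers, the sketch below uses `@cfunction`, which is a different mechanism from both `@ccallable` and the PR's bootstrapped `register`. It only shows the general idea of obtaining a C-callable entry point for a conversion routine. The function name `h2f` is hypothetical, and its body is a stand-in that relies on Base's own conversion.

```julia
# Hypothetical stand-in for a fallback routine: takes Float16 bits as UInt16.
h2f(x::UInt16)::Float32 = Float32(reinterpret(Float16, x))

# `@cfunction` produces a C-callable function pointer for the given signature.
h2f_ptr = @cfunction(h2f, Float32, (UInt16,))

# Call through the pointer the way C code would.
result = ccall(h2f_ptr, Float32, (UInt16,), 0x3c00)
```

Unlike `@ccallable`, this yields a pointer rather than a named exported symbol, so it does not by itself make the function resolvable by name from sys.so.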
@vtjnash On AArch64 I am running into:
Further update on PPC situation. The backend can't select FP16 operations (even on master), so I will first need to work with upstream on adding support for that. I wouldn't want to simply disable this on PPC since that would inhibit GPU code on PPC from using
Today does not seem to be my day. It only worked beautifully on my tests because I have the
The title of this PR refers to switching the representation. Does this actually change the
I believe it changes the representation in LLVM but it shouldn't change the bit pattern.
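The point about the bit pattern can be checked from the Julia side, independent of which LLVM type is used internally. A small sketch (the variable names are illustrative only):

```julia
x = Float16(1.5)

# Float16 occupies two bytes, and its bits are the IEEE binary16 encoding.
nbytes = sizeof(Float16)
bits   = reinterpret(UInt16, x)          # sign 0, exponent 01111, fraction 1000000000
text   = bitstring(x)

# Round-tripping through the raw bits returns the same value.
roundtrip = reinterpret(Float16, bits)
```

Whether generated code shows `i16` or `half` for such values depends on the Julia version and can be inspected with `@code_llvm`; the values above should not depend on it.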
What's the status of this PR? (conflicts, of course, but is anyone working on it?)
Still blocked today on #26381 (comment). I would welcome another set of eyes on this: I would like to see it happen, but I can't dedicate time to it.
Is there any way to disable it on PPC host but allow it to stay on for the NVPTX backend used by LLVM.jl for GPU code?
No, I don't think that is feasible, we have
Close this?
While this PR is outdated, it is still an issue that needs fixing. So I would say leave it open.
On Sun, May 5, 2019, 19:47 Viral B. Shah wrote: Close this?
@vchuravy says this PR is outdated
Closed in favour of #37510
This is a first step towards implementing parts of #18927 and #18734 in the laziest, most straightforward way possible. It forgoes the completeness aspect of #18927 while providing a way forward to potentially extend this to `Float128`, and it doesn't add additional dependencies like #18734.
Follow-up PRs could extend RTLIB to be more generic or to not use Base (thus allowing it to be its own shared library). There are other places where `Float16` values are currently eagerly converted to `Float32`, so this is not an attempt at completeness.
All the code paths are tested on x86 Linux and I am currently testing on ARM. (There is a known PPC codegen issue, but I guess nobody but me cares about that.)
@vtjnash is there anything I would need to do for `anticodegen`/LLVM-free builds?
Notes: