-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add macOS support #602
base: main
Are you sure you want to change the base?
Add macOS support #602
Conversation
#define CL_DEPRECATED(...) | ||
#define GCL_API_SUFFIX__VERSION_1_1 | ||
//#include <OpenCL/opencl.h> | ||
#include <CL/opencl.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can probably be changed back; I believe ICD Loader also overloads <OpenCL/opencl.h>
list(APPEND CHIP_SPV_DEFINITIONS HAVE_OPENCL) | ||
list(PREPEND CHIP_INTERFACE_LIBS ${OpenCL_LIBRARY}) | ||
list(PREPEND CHIP_INTERFACE_LIBS | ||
#OpenCL::Headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deadcode
@@ -114,10 +140,10 @@ if(NOT DEFINED LevelZero_LIBRARY) | |||
endif() | |||
endif() | |||
|
|||
message(STATUS "OpenCL_LIBRARY: ${OpenCL_LIBRARY}") | |||
message(STATUS "OpenCL_FOUND: ${OpenCL_FOUND}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is obviously ugly; I'll have to figure out how to show the path with import targets.
Maybe it's also enough to report YES / NO for each of these.
find_package(OpenCLHeaders) | ||
find_package(OpenCLICDLoader) | ||
|
||
#add_library(OpenCL ALIAS OpenCLICDLoader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dead code
#add_library(OpenCL ALIAS OpenCLICDLoader) | ||
message("OpenCL_FOUND: ${OpenCL_FOUND}") | ||
set(OpenCL_FOUND ${OpenCLICDLoader_FOUND}) | ||
message("OpenCL_FOUND: ${OpenCL_FOUND}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug leftover
set(ARCH "x64") | ||
else() | ||
set(ARCH "unknown") | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire block is super ugly. I'm not sure how to do this best
add_compile_options(-mf16c) | ||
elseif("${CMAKE_C_COMPILER_ARCHITECTURE_ID}" STREQUAL "x64") | ||
add_compile_options(-mf16c) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How important is -mf16c
for x86? Could it be removed? It's in a "temporary" block, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall, the -mf16c
is required for host side native half type and operations (hip_fp16.h).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -52,4 +52,7 @@ | |||
typedef _Float16 api_half; | |||
typedef _Float16 api_half2 __attribute__((ext_vector_type(2))); | |||
|
|||
typedef unsigned int uint; | |||
typedef unsigned long ulong; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary on macOS? Does it still work for other platforms or will it report duplicate identifier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might cause troubles. On my system (Ubuntu 22.04) <stdlib.h> includes <sys/types.h> which defines typedefs for u{short,int,long}.
/usr/include/x86_64-linux-gnu/sys/types.h:
...
#ifdef __USE_MISC
/* Old compatibility names for C types. */
typedef unsigned long int ulong;
typedef unsigned short int ushort;
typedef unsigned int uint;
#endif
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's not a problem as long as the duplicate typedefs aliases the same type.
target_link_libraries(LLVMHipDynMem "-undefined dynamic_lookup") | ||
target_link_libraries(LLVMHipStripUsedIntrinsics "-undefined dynamic_lookup") | ||
target_link_libraries(LLVMHipDefrost "-undefined dynamic_lookup") | ||
target_link_libraries(LLVMHipPasses "-undefined dynamic_lookup") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should potentially be set_target_properties(LLVMHipDynMem PROPERTIES LINK_FLAGS "-undefined dynamic_lookup)
etc.; I'll have to read up on it.
I'm not sure why this is necessary in the first place (I think it should be the default for MODULE
libs)
Works for now.
InverseCallGraphNode *N = popAny(WorkList); | ||
if (N == nullptr) { | ||
printf("N was nullptr! skipping\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets triggered in the abort
sample. Why?
Is this expected or a macOS issue?
@@ -129,8 +129,14 @@ set(SAMPLES | |||
ccompat | |||
hipComplex | |||
hipHostMallocSample | |||
) | |||
# We skip this test on macOS, because macho does not support EXCLUDE section flags (used by clang-offload-bundler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan to fix clang-offload-bundler eventually, but I'm new to most of LLVM, mach-o, HIP, OpenCL and chipStar.
So I'll have to understand why this is necessary in the first place / what exactly it does.
@@ -1257,7 +1257,12 @@ void chipstar::Backend::waitForThreadExit() { | |||
* | |||
* So we just wait for 0.5 seconds before starting to check for thread exit. | |||
*/ | |||
#if defined(__APPLE__) || defined(__MACOSX) | |||
sched_yield(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Untested
@@ -77,7 +77,7 @@ void CHIPReadEnvVarsCallOnce() { | |||
|
|||
CHIPDeviceTypeStr = readEnvVar("CHIP_DEVICE_TYPE"); | |||
if (CHIPDeviceTypeStr.size() == 0) | |||
CHIPDeviceTypeStr = "gpu"; | |||
CHIPDeviceTypeStr = "default"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After spending half an hour until realizing pocl was missing SPIR-V, I've spent another half an hour trying to figure out why chipStar still didn't work, until I realized it was filtering out my CPU device.
I've tried to find a good solution but the CL spec was lacking. Would "default" type still work for most people? (or will they be surprised if their OpenCL platform suddenly kicks them from CPU to GPU after a chipStar update?)
launchUnaryFn<float>([] __device__(auto x) { return std::abs(x); }); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting warnings about this being ambiguous, I can add a log with these errors later.
Why isn't this an issue for other platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a log with these errors later.
Please, send us the log.
cool! I will try this on my m1 this week |
Nice! Thanks for working on this. For the "Metal backend" -- you might do more good contributing that backend to PoCL instead. You'd get at least SYCL support (via DPC++ or OpenSYCL) too that way as a side effect. |
@JayFoxRox Thank you so much for this contribution! We discussed it extensively this morning,and we would very much like to merge it when it is ready. I order to do so we will need CI support. |
Yes, pocl integration was also what I was thinking. However, for chipStar, it might be interesting to add a Vulkan backend, because that would probably work with MoltenVK, so you'd get the rest of the metal API "for-free" + you'd also get support for a bunch of Vulkan devices which lack OpenCL (or proper drivers).
I probably won't find time to ever work on that. I'm fine with having CPU-only (for now). For me, it's mostly about getting a bit into HIP/CUDA + I want to run some projects which are CUDA only.
Note that it doesn't really work yet, but patches are welcome.
Sounds good. |
I forgot that there's already also a starting point PoCL-Vulkan backend (quite early and primitive) that could be improved for MacOS needs if it supports Vulkan.
Why not... To me HIP is a higher level than OpenCL though so I'd simplify this to SYCL -> DPC++/OpenSYCL -> OpenCL -> X instead of hopping through HIP for SYCL support. |
I've tried the pocl-vulkan backend with macos-m1, but couldn't get it to work. I also tried writing a Metal backend for PoCL, and made some progress, but the Metal documentation was very hard to digest. (For eg: how to use events) |
@isuruf is Metal backend starting point usable at all? I suggest to contribute it to the main if it runs any examples so it's easier for other volunteers to contribute to make it better. |
So far, I've only tried the CPU backend, but I ran into a bunch of issues with PoCL (partially issues in the brew package, but also in PoCL itself) which is also why there hasn't been much progress here. Until there's a solution for pocl/pocl#1288 I can't really continue on the chipStar side either - I can't really know if the issues I'm seeing are from PoCL or from chipStar. |
No, it's not. |
Introduction
This adds macOS support in chipStar and its underlying infrastructure (LLVM, HIP, HIPCC).
To make this even more interesting, this is on M1 (Apple Silicon), e.g. ARM.
I assume this will take a while to get merged.
For now, I hope that others can help to fix Windows / Linux things that I break in this macOS port.
OpenCL / SPIR-V on macOS
Obviously there isn't a metal backend currently (and I don't plan to write one), although I theorize it might be much easier to write once this has been merged.
This probably won't work with the Apple OpenCL implementation due to the lack of SPIR-V support.
Instead, I added support for the OpenCL ICD loader and I use a hacked version of pocl from brew for which I enabled SPIR-V support.
I don't care about performance right now either - all I want is to run some CUDA and HIP projects on macOS.. so chipStar it is!
Installation
Once everything is merged, I plan to create a Formula for brew (in homebrew-core), so this can be installed more easily (
brew install chipstar
).However, I expect this can only happen once more of the chipStar changes in HIP, HIPCC and LLVM have been integrated in their respective upstreams (so that we don't need the forks in brew).
We could probably have a CHIP-SPV/homebrew-chipStar or similar though, to distribute this to macOS users.
For now, if you want to test this (on macOS):
Cloning
Follow steps in README for structure, but clone this branch.
For each submodule, check the TODO section below and checkout the corresponding branches.
Also see the note about pocl in the next section.
Install dependencies
Set up some config:
Build LLVM:
Build chipStar:
Testing:
These are mostly notes to myself, but this is how to test:
CHIP_LOGLEVEL=trace $CHIPSTAR_INSTALL/bin/chip_spv_samples/hipmath
Status
TODO