RFC: Implementing a proper Clang toolchain #640
Comments
@geimer I'm looking into this again now from the position that Looking at your first point, how about we do everything except the front end compilers Clang and Flang in an
so I wonder if it might get away with just not running the As regards OpenMP, we could disable the creation of the symlinks in the Now that I see all that written down, I wonder if it doesn't just look like your second solution, just with |
Hmm, the issue with OpenMP and GCCcore is actually really a general one, we can probably side-step that whole thing by making libgomp a banned library when using GCCcore (easybuilders/easybuild-framework#4535) |
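To make the "banned library" idea concrete, here is a minimal sketch of such a check. It is not EasyBuild's implementation; the function name `banned_libs`, the pattern list, and the sample `ldd`-style text (including its paths and addresses) are made up for illustration.

```python
import re

# Hypothetical ban list: GCC's OpenMP runtime, as suggested in the comment above.
BANNED_PATTERNS = [r"^libgomp\.so(\.\d+)*$"]


def banned_libs(ldd_output, patterns=BANNED_PATTERNS):
    """Return the library names in ldd-style output that match a banned pattern."""
    hits = []
    for line in ldd_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        # The first field is the library name; drop any directory prefix.
        soname = parts[0].rsplit("/", 1)[-1]
        if any(re.match(p, soname) for p in patterns):
            hits.append(soname)
    return hits


# Made-up sample input in the shape that `ldd` prints.
SAMPLE = """\
    linux-vdso.so.1 (0x00007ffd3a5f2000)
    libgomp.so.1 => /path/to/GCCcore/lib64/libgomp.so.1 (0x00007f1c2a000000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f1c29c00000)
"""

print(banned_libs(SAMPLE))  # ['libgomp.so.1']
```

A check like this would run on installed binaries and libraries after the build, failing the installation if the list is non-empty.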
Initial indications are that we should be able to separate out
|
I agree that the Fortran compiler is slowly approaching a usable state. With LLVM 18, there are still very simple OpenMP tests which fail to compile, but those are fixed in the current trunk version. This means that LLVM 19 might be a version where one can realistically try to build a toolchain.
Just to understand correctly, the idea would be to build OpenMP and so on with GCC first and afterwards build the Clang & Flang with this LLVM? We could do this, but should only build the minimal set of things we need for this to work or else users may miss features. The most important one I can think of right now would be support for OpenMP offloading, which may not work if OpenMP is built with GCC. See (source):
We would not only need to take See this example with Clang/trunk:
$ clang --version
clang version 19.0.0git (https://github.com/llvm/llvm-project.git 0f323dc0c43bd45147bdf8ee9cbeef0d8f57165b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/apps/software/Clang/trunk/bin
Build config: +assertions
$ clang -fopenmp=libgomp ompt_tool.c test.c -I$(pwd)/..
ompt_tool.c:196:14: warning: enumeration values 'ompt_dependence_type_out_all_memory' and 'ompt_dependence_type_inout_all_memory' not handled in switch [-Wswitch]
196 | switch ( t )
| ^
ompt_tool.c:1014:14: warning: enumeration value 'ompt_work_loop_static' not handled in switch [-Wswitch]
1014 | switch ( t )
| ^
ompt_tool.c:1168:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
1165 | OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING: thread_begin_cb not dispatched; thread_data->value = %" PRId32 " (supposed to be >= 1)\n",
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1166 | __FUNCTION__,
| ~~~~~~~~~~~~~
1167 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1168 | thread_data->value );
| ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
81 | printf( __VA_ARGS__ ); \
| ^~~~~~~~~~~
ompt_tool.c:1176:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
1172 | OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING tid != thread_data->value (%" PRId32 " != %" PRId32 ")\n",
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1173 | __FUNCTION__,
| ~~~~~~~~~~~~~
1174 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1175 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1176 | thread_data->value );
| ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
81 | printf( __VA_ARGS__ ); \
| ^~~~~~~~~~~
4 warnings generated.
$ ./a.out
$ clang -fopenmp=libomp ompt_tool.c test.c -I$(pwd)/..
ompt_tool.c:196:14: warning: enumeration values 'ompt_dependence_type_out_all_memory' and 'ompt_dependence_type_inout_all_memory' not handled in switch [-Wswitch]
196 | switch ( t )
| ^
ompt_tool.c:1014:14: warning: enumeration value 'ompt_work_loop_static' not handled in switch [-Wswitch]
1014 | switch ( t )
| ^
ompt_tool.c:1168:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
1165 | OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING: thread_begin_cb not dispatched; thread_data->value = %" PRId32 " (supposed to be >= 1)\n",
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1166 | __FUNCTION__,
| ~~~~~~~~~~~~~
1167 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1168 | thread_data->value );
| ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
81 | printf( __VA_ARGS__ ); \
| ^~~~~~~~~~~
ompt_tool.c:1176:35: warning: format specifies type 'int' but the argument has type 'uint64_t' (aka 'unsigned long') [-Wformat]
1172 | OMPT_TOOL_GUARDED_PRINTF( "[%s] tid = %" PRId32 "; WARNING tid != thread_data->value (%" PRId32 " != %" PRId32 ")\n",
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1173 | __FUNCTION__,
| ~~~~~~~~~~~~~
1174 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1175 | ompt_tool_tid,
| ~~~~~~~~~~~~~~
1176 | thread_data->value );
| ^~~~~~~~~~~~~~~~~~~~
./ompt_tool.h:81:13: note: expanded from macro 'OMPT_TOOL_GUARDED_PRINTF'
81 | printf( __VA_ARGS__ ); \
| ^~~~~~~~~~~
4 warnings generated.
$ OMP_NUM_THREADS=1 ./a.out
[ompt_start_tool] tid = -1 | omp_version 201611 | runtime_version = 'LLVM OMP version: 5.0.20140926'
[my_initialize_tool] tid = -1 | initial_device_num 0
[register_callbacks] tid = -1 | thread_begin = always (expect always)
[register_callbacks] tid = -1 | thread_end = always (expect always)
[register_callbacks] tid = -1 | parallel_begin = always (expect always)
[register_callbacks] tid = -1 | parallel_end = always (expect always)
[register_callbacks] tid = -1 | task_create = always (expect always)
[register_callbacks] tid = -1 | task_schedule = always (expect always)
[register_callbacks] tid = -1 | implicit_task = always (expect always)
[register_callbacks] tid = -1 | target = always (expect always)
[register_callbacks] tid = -1 | target_emi = always (expect always)
[register_callbacks] tid = -1 | target_data_op = always (expect always)
[register_callbacks] tid = -1 | target_data_op_emi = always (expect always)
[register_callbacks] tid = -1 | target_submit = always (expect always)
[register_callbacks] tid = -1 | target_submit_emi = always (expect always)
[register_callbacks] tid = -1 | control_tool = always (expect always)
[register_callbacks] tid = -1 | device_initialize = always (expect always)
[register_callbacks] tid = -1 | device_finalize = always (expect always)
[register_callbacks] tid = -1 | device_load = always (expect always)
[register_callbacks] tid = -1 | device_unload = never (expect always)
[register_callbacks] tid = -1 | sync_region_wait = always
[register_callbacks] tid = -1 | mutex_released = always
[register_callbacks] tid = -1 | dependences = always
[register_callbacks] tid = -1 | task_dependence = always
[register_callbacks] tid = -1 | work = always
[register_callbacks] tid = -1 | masked = always
[register_callbacks] tid = -1 | target_map = never
[register_callbacks] tid = -1 | target_map_emi = never
[register_callbacks] tid = -1 | sync_region = always
[register_callbacks] tid = -1 | reduction = always
[register_callbacks] tid = -1 | lock_init = always
[register_callbacks] tid = -1 | lock_destroy = always
[register_callbacks] tid = -1 | mutex_acquire = always
[register_callbacks] tid = -1 | mutex_acquired = always
[register_callbacks] tid = -1 | nest_lock = always
[register_callbacks] tid = -1 | flush = always
[register_callbacks] tid = -1 | cancel = always
[register_callbacks] tid = -1 | dispatch = always
[register_callbacks] tid = -1 | error = always
[thread_begin_cb] tid = 1 | type = initial
[implicit_task_cb] tid = 1 | parallel_data = 0 | task_data = 6660001 | endpoint = begin | actual_parallelism = 1 | index = 1 | flags = initial
[parallel_begin_cb] tid = 1 | parallel_data = 7770001 | encountering_task_data = 6660001 | flags = invoker_runtime_team | requested_parallelism = 1 | codeptr_ra = 0x59fcfcd069bb
[implicit_task_cb] tid = 1 | parallel_data = 7770001 | task_data = 6660002 | endpoint = begin | actual_parallelism = 1 | index = 0 | flags = implicit
[implicit_task_cb] tid = 1 | parallel_data = 7777777 | task_data = 6660002 | endpoint = end | actual_parallelism = 1 | index = 0 | flags = implicit
[parallel_end_cb] tid = 1 | parallel_data = 7770001 | encountering_task_data = 6660001 | flags = invoker_runtime_team | codeptr_ra = 0x59fcfcd069bb
[implicit_task_cb] tid = 1 | parallel_data = 0 | task_data = 6660001 | endpoint = end | actual_parallelism = 0 | index = 1 | flags = initial
[thread_end_cb] tid = 1
[my_finalize_tool] tid = 1 |
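The registration lines in the libomp run are easier to scan with a small filter. The sketch below is only an illustration: the regular expression and the name `mismatches` are my own, and the three sample lines are copied from the output above. It reports callbacks whose registration result differs from the expectation printed by the tool.

```python
import re

# Matches lines such as:
# [register_callbacks] tid = -1 | device_unload = never (expect always)
LINE = re.compile(
    r"^\[register_callbacks\] tid = -?\d+ \| (\w+) = (\w+)(?: \(expect (\w+)\))?"
)


def mismatches(log):
    """Return (callback, actual, expected) for lines where the two differ."""
    out = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        # Lines without an "(expect ...)" suffix carry no expectation; skip them.
        if m and m.group(3) and m.group(2) != m.group(3):
            out.append((m.group(1), m.group(2), m.group(3)))
    return out


LOG = """\
[register_callbacks] tid = -1 | device_load = always (expect always)
[register_callbacks] tid = -1 | device_unload = never (expect always)
[register_callbacks] tid = -1 | target_map = never
"""

print(mismatches(LOG))  # [('device_unload', 'never', 'always')]
```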
It might be helpful if I provide an atypical(?) perspective on why I'm interested in a[1] Flang/Clang toolchain, and in fact why I'm interested even if the compiler is not bug-free. Since I'm developing and maintaining a Fortran application, one which also offloads to GPU, I want to be able to test the compilers (also without offloading) and find out whether I need to i) work around compiler issues, or ii) ask a vendor to prioritize a feature for us. [1] Actually, I'd like to have ~3 toolchains, but that's maybe getting a bit far ahead of ourselves: |
I'm absolutely with you on this one. The Fortran compiler is getting mature enough that building a toolchain is feasible and, even if not entirely bug-free, might be of great interest to users.
While I agree that this is a very interesting scenario, having an up-to-date latest and greatest version available all the time might be difficult, especially if a whole toolchain is built with that compiler. That's one reason why I chose to build a very small toolchain manually (basically only LLVM/Clang + OpenMPI), even if it's more painful to do so. This allows me to test a daily Clang (and sometimes also AOMP) build for issues.
I would guess that flang-classic will fade out once flang-new is ready. Building amdflang is more complicated though, as you would probably also want to have their entire LLVM toolchain. At that point, you're basically building AOMP (or the equivalent ROCm components) from source. That's possible, but as you said, we should focus on LLVM/Clang first. |
I think this is the general idea here, we move ahead in the C/C++ space and see where things break with Fortran, as things improve we will get more and more Fortran applications building. I'm particularly interested in pushing some Fortran applications we are connected to to start working on OpenMP device offloading.
So, there is a subtle issue here. EasyBuild itself updates toolchains twice a year (e.g.,
I don't think we would be considering |
I've been working on this (still on making sure all components of Will have to try with HDF5 and LAPACK |
Looking at the current EasyConfigs, there seems to be one version per toolchain update (with Clang 16 being the exception), normally using the last release in the LLVM/Clang update cycle. From my perspective, this is sufficient for EasyBuild and its users. Like you've said, additional versions can easily be added and if a major version is supported already (e.g. 18.1.0), the chance is high that another version (e.g. 18.1.7) also works fine when passed via
That sounds great! There are certainly some quirks that can come up. Just as an example: We're developing an LLVM IR plug-in for our application that will be used as an additional pass when a user compiles his application. There, we want to use |
To have a point of reference, I've just opened two PRs for EB and the related EC files
There are still some things that need fixing/improving, but in the meantime suggestions/comments are welcome |
We should see LLVM 19 in September: https://discourse.llvm.org/t/llvm-19-release-schedule-and-planning/79828 |
Disclaimer: I'm by no means an LLVM or Clang expert. The information below is just a collection of bits and pieces found in various places as well as my personal thoughts on how EasyBuild support could be improved.
Target
A working LLVM-based toolchain -- at least for C/C++ -- with minimal redundancy. Here, "toolchain" is not meant in the EasyBuild sense (i.e., including an MPI, math libs, etc.), but merely refers to a compiler environment that can be used by end users to build their codes. (This doesn't rule out having an MPI without Fortran support using Clang, though.)
With proper Fortran support being on the horizon, however, it might become a full toolchain in the EasyBuild sense in the future. This should be taken into account in the design.
Status quo
LLVM / Clang / flang
LLVM provides a framework for code optimization and generation for many different target CPUs. The most prominent language frontend is Clang, which focuses on C-like languages (C, C++, Objective-C, OpenCL). Basically all commercial compiler vendors (Intel, PGI, Cray, IBM, Fujitsu, ARM) have meanwhile switched to Clang as the basis for their C/C++ compilers.
Fortran support was started based on the PGI Fortran compiler frontend, see the flang project on GitHub, now called "old/legacy/classic flang". However, it requires patched versions of LLVM and Clang, and seems stuck at LLVM 9. That said, this mailing list post suggests that there might be an update for LLVM 11 ("LLVM11 with classic flang is on various vendor's roadmap for this autumn, so one of us will do it I'm sure.")
Besides, there is a "new flang" frontend (formerly called f18) written from scratch, now developed as an official LLVM project. However, it isn't fully functional yet and still depends on another compiler to do the actual work, see this mailing list post.
EasyBuild
EasyBuild currently includes various LLVM packages which are used as dependencies by, for example, Mesa, numba, and Rust. Recent versions are built on top of GCCcore, and only include the core LLVM libraries and tools.
In addition, there are various Clang easyconfigs. Again, recent versions are (usually) built on top of GCCcore. These can be used as a stand-alone compiler, but are also used as dependencies by various packages, such as pocl, TRIQS, and Longshot, and could be used by additional packages such as Score-P and Doxygen. This is due to also providing libraries for source-code parsing and processing. The Clang packages build their own copy of LLVM, and include other LLVM projects such as an OpenMP runtime library, the lld linker, the libc++ C++ Standard Library, and the polly polyhedral optimizer, though not all of those components are used by default with the current configuration.
There has been some work on packaging "legacy flang" (see easybuilders/easybuild-easyconfigs#8335 and easybuilders/easybuild-easyblocks#1729), however, the question is whether it is worth putting more effort into this since things might change considerably with the "new flang".
Possible ways to organize things in EasyBuild
1. Build full Clang (including lld, libraries, etc.) using an existing LLVM built with GCCcore as dependency
   - Building LLVM projects out-of-tree is basically undocumented. Therefore, it is unclear how projects interrelate to each other and how to configure things correctly. However, some information could be extracted from the Fedora RPM specs (e.g., for Clang).
   - The LLVM OpenMP library by default installs symlinks for libgomp and libiomp5, i.e., the OpenMP runtimes of the GCC and Intel compilers, as it implements both APIs. Thus, the order in which modules are loaded determines which runtime is found by ld.so and affects the runtime behavior of codes using OpenMP.
   - Creating these symlinks can be disabled via a CMake configuration option, but doing so may lead to simultaneously using two different OpenMP runtimes if some OpenMP code compiled with Clang is linked to a library built with GCCcore also using OpenMP.
   - Likewise, enabling libc++ by default for Clang is likely to make code incompatible with C++ libraries compiled with GCCcore using libstdc++.
   - [...] LLVM rather than Clang.
2. Introduce a new package named, e.g., LLVM-Clang built with GCCcore providing a full Clang (including lld, libraries, etc.) and use it as a dependency for all packages that currently depend on either LLVM or Clang. A Clang compiler package would then be a bundle of GCCcore, LLVM-Clang, and binutils.
   - [...] libc++ issues outlined above
   - LLVM-Clang vs. Clang packaging would probably cause questions similar to the GCCcore vs. GCC separation.
3. Build minimal Clang (excluding lld, libraries, OpenMP runtime) on top of GCCcore -- either using an existing LLVM or as part of an LLVM-Clang package as outlined above -- to provide the Clang libraries to packages that need it as a dependency. In addition, build a full LLVM/Clang (including everything) on the SYSTEM level as a separate toolchain.
   - [...] GCCcore, as it is a completely separate toolchain.
   - [...] Clang on the SYSTEM level. (How does one properly do this? Use GCC as a builddep rather than toolchain???)
   - [...] gfortran could (temporarily) serve as a Fortran compiler in a full LLVM toolchain using, e.g., compiler-rt instead of libgcc_s. It's very likely that this won't work.
   - [...] Clang module under GCCcore serves a very limited purpose and should thus be avoided by end-users, unless they really know what they are doing. Not sure how to best prevent/document this. It is also unclear whether such a stripped down Clang would be sufficient for all packages that currently depend on the existing Clang packages.
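The module load-order concern raised for the libgomp/libiomp5 symlinks can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: it models only a first-match search over an ordered directory list (as with a colon-separated LD_LIBRARY_PATH) and ignores RPATH/RUNPATH and the ld.so cache. The directory layout, the file names, and the helper `first_provider` are hypothetical.

```python
import os
import tempfile


def first_provider(soname, search_dirs):
    """Return the first directory in search_dirs that contains soname, or None."""
    for d in search_dirs:
        if os.path.lexists(os.path.join(d, soname)):
            return d
    return None


with tempfile.TemporaryDirectory() as root:
    # Hypothetical install prefixes for the two modules.
    gcc_lib = os.path.join(root, "GCCcore", "lib64")
    llvm_lib = os.path.join(root, "Clang", "lib")
    os.makedirs(gcc_lib)
    os.makedirs(llvm_lib)

    # GCC ships its own OpenMP runtime ...
    open(os.path.join(gcc_lib, "libgomp.so.1"), "w").close()
    # ... while the LLVM OpenMP install is assumed to add an alias to libomp.so.
    open(os.path.join(llvm_lib, "libomp.so"), "w").close()
    os.symlink("libomp.so", os.path.join(llvm_lib, "libgomp.so.1"))

    # A module loaded later is assumed to prepend its directory to the search list.
    clang_last = first_provider("libgomp.so.1", [llvm_lib, gcc_lib])
    gcc_last = first_provider("libgomp.so.1", [gcc_lib, llvm_lib])

    print(os.path.relpath(clang_last, root))
    print(os.path.relpath(gcc_last, root))
```

With the Clang directory first, a binary linked against libgomp would pick up the LLVM runtime through the alias; with the GCCcore directory first, it gets GCC's own runtime. The same executable can therefore behave differently depending on which module was loaded last.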