LLVM fails to build on PPC with GCC>=9 and -fno-plt #51205

Closed
h-vetinari opened this issue Sep 14, 2021 · 15 comments
Labels
bugzilla (issues migrated from Bugzilla), build-problem, cmake (build system in general and CMake in particular), obsolete (issues with old, unsupported versions of LLVM)

Comments

@h-vetinari
Contributor

Bugzilla Link 51863
Version trunk
OS Linux
Blocks #50580
CC @h-vetinari,@nemanjai,@tstellar,@xtkoba

Extended Description

Originally when building v11.1.0 for conda-forge (conda-forge/llvmdev-feedstock#115), there was an issue with using GCC 9 (only on PPC), so we kept using GCC 8.

Now, two major versions later, GCC 8 fails for other reasons (unrelated to LLVM), and I wanted to try upgrading the version again, see conda-forge/llvmdev-feedstock#131.

As it turns out, the same error persists with GCC 11.1 & LLVM 13. I cannot tell where exactly the fault lies, because there's not much output from the linker to go on.

[...]
[ 88%] Building CXX object tools/llvm-exegesis/lib/X86/CMakeFiles/LLVMExegesisX86.dir/X86Counter.cpp.o
collect2: error: ld returned 1 exit status
make[2]: *** [tools/llvm-shlib/CMakeFiles/LLVM.dir/build.make:299: lib/libLLVM-13.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:30826: tools/llvm-shlib/CMakeFiles/LLVM.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 88%] Linking CXX static library ../../../../lib/libLLVMExegesisX86.a
[ 88%] Built target LLVMExegesisX86
make: *** [Makefile:156: all] Error 2

Sample CI run is here: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=376252

@h-vetinari
Contributor Author

It always seems to happen at the same place, somewhere in tools/llvm-shlib/CMakeFiles/LLVM.dir/build. Note that LLVM 13.0.0-rc3 (like 11.1 before) also fails at the same point with GCC 9 (in this case 9.4.0). I haven't tested GCC 10 yet.

@xtkoba
Mannequin

xtkoba mannequin commented Sep 15, 2021

Possible context of error(s):

2021-09-14T23:07:40.7220247Z [ 88%] Linking CXX shared library ../../lib/libLLVM-13.so
2021-09-14T23:07:49.3795293Z ../../lib/libLLVMXCoreCodeGen.a(XCoreLowerThreadLocal.cpp.o): in function `initializeXCoreLowerThreadLocalPassOnce(llvm::PassRegistry&)':
2021-09-14T23:07:49.3797855Z XCoreLowerThreadLocal.cpp:(.text._ZL39initializeXCoreLowerThreadLocalPassOnceRN4llvm12PassRegistryE+0xb4): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::PassRegistry::registerPass(llvm::PassInfo const&, bool)' defined in .text._ZN4llvm12PassRegistry12registerPassERKNS_8PassInfoEb section in ../../lib/libLLVMCore.a(PassRegistry.cpp.o)
2021-09-14T23:07:49.3800154Z ../../lib/libLLVMXCoreCodeGen.a(XCoreLowerThreadLocal.cpp.o): in function `(anonymous namespace)::XCoreLowerThreadLocal::~XCoreLowerThreadLocal()':
2021-09-14T23:07:49.3801958Z XCoreLowerThreadLocal.cpp:(.text._ZN12_GLOBAL__N_121XCoreLowerThreadLocalD2Ev+0x30): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ModulePass::~ModulePass()' defined in .text._ZN4llvm10ModulePassD2Ev section in ../../lib/libLLVMCore.a(Pass.cpp.o)
2021-09-14T23:07:49.3803758Z ../../lib/libLLVMXCoreCodeGen.a(XCoreLowerThreadLocal.cpp.o): in function `(anonymous namespace)::XCoreLowerThreadLocal::~XCoreLowerThreadLocal()':
2021-09-14T23:07:49.3805417Z XCoreLowerThreadLocal.cpp:(.text._ZN12_GLOBAL__N_121XCoreLowerThreadLocalD0Ev+0x38): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ModulePass::~ModulePass()' defined in .text._ZN4llvm10ModulePassD2Ev section in ../../lib/libLLVMCore.a(Pass.cpp.o)
2021-09-14T23:07:49.3806986Z ../../lib/libLLVMXCoreCodeGen.a(XCoreLowerThreadLocal.cpp.o): in function `llvm::Pass* llvm::callDefaultCtor<(anonymous namespace)::XCoreLowerThreadLocal>()':
2021-09-14T23:07:49.3808840Z XCoreLowerThreadLocal.cpp:(.text._ZN4llvm15callDefaultCtorIN12_GLOBAL__N_121XCoreLowerThreadLocalEEEPNS_4PassEv+0x7c): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::PassRegistry::getPassRegistry()' defined in .text._ZN4llvm12PassRegistry15getPassRegistryEv section in ../../lib/libLLVMCore.a(PassRegistry.cpp.o)
2021-09-14T23:07:49.3810853Z ../../lib/libLLVMXCoreCodeGen.a(XCoreLowerThreadLocal.cpp.o): in function `void std::__insertion_sort<llvm::WeakTrackingVH*, __gnu_cxx::__ops::_Iter_less_iter>(llvm::WeakTrackingVH*, llvm::WeakTrackingVH*, __gnu_cxx::__ops::_Iter_less_iter) [clone .isra.0]':
2021-09-14T23:07:49.3813593Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0xe8): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::AddToExistingUseList(llvm::ValueHandleBase**)' defined in .text._ZN4llvm15ValueHandleBase20AddToExistingUseListEPPS0_ section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3815936Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x14c): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::RemoveFromUseList()' defined in .text._ZN4llvm15ValueHandleBase17RemoveFromUseListEv section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3818265Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x18c): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::AddToExistingUseList(llvm::ValueHandleBase**)' defined in .text._ZN4llvm15ValueHandleBase20AddToExistingUseListEPPS0_ section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3821811Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x1dc): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::RemoveFromUseList()' defined in .text._ZN4llvm15ValueHandleBase17RemoveFromUseListEv section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3824695Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x21c): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::AddToExistingUseList(llvm::ValueHandleBase**)' defined in .text._ZN4llvm15ValueHandleBase20AddToExistingUseListEPPS0_ section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3827107Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x250): relocation truncated to fit: R_PPC64_REL24 against symbol `llvm::ValueHandleBase::RemoveFromUseList()' defined in .text._ZN4llvm15ValueHandleBase17RemoveFromUseListEv section in ../../lib/libLLVMCore.a(Value.cpp.o)
2021-09-14T23:07:49.3828856Z XCoreLowerThreadLocal.cpp:(.text._ZSt16__insertion_sortIPN4llvm14WeakTrackingVHEN9__gnu_cxx5__ops15_Iter_less_iterEEvT_S6_T0_.isra.0+0x2ec): additional relocation overflows omitted from the output

@xtkoba
Mannequin

xtkoba mannequin commented Sep 15, 2021

*FLAGS:

2021-09-14T20:50:25.7942787Z +CFLAGS=-mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/llvm-package-13.0.0.rc3 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix
2021-09-14T20:50:25.7963393Z +CPPFLAGS=-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem $PREFIX/include
2021-09-14T20:50:25.7965392Z +DEBUG_CFLAGS=-mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/llvm-package-13.0.0.rc3 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix
2021-09-14T20:50:25.7966976Z +DEBUG_CPPFLAGS=-D_DEBUG -D_FORTIFY_SOURCE=2 -Og -isystem $PREFIX/include
2021-09-14T20:50:25.7975958Z +LDFLAGS=-Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib -L$PREFIX/lib
2021-09-14T20:50:25.8494323Z +CXXFLAGS=-fvisibility-inlines-hidden -fmessage-length=0 -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/llvm-package-13.0.0.rc3 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix
2021-09-14T20:50:25.8498367Z +DEBUG_CXXFLAGS=-fvisibility-inlines-hidden -fmessage-length=0 -mcpu=power8 -mtune=power8 -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/llvm-package-13.0.0.rc3 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix

@h-vetinari
Contributor Author

I realised that 11.1 is not necessarily the lower bound. It was just the first time that we in conda-forge had tried to use GCC >= 9 to compile LLVM. Therefore, the problem could have existed for longer already.

@h-vetinari
Contributor Author

PS. The build script currently in use by this feedstock (or rather, by my PR) can be found here: https://github.com/h-vetinari/llvmdev-feedstock/blob/rc/recipe/build.sh

Happy to explain how to reconstruct/reproduce this.

@xtkoba
Mannequin

xtkoba mannequin commented Sep 15, 2021

Try removing -fno-plt from C(XX)?FLAGS, or try appending -fplt to them to overwrite the existing -fno-plt. I'm not sure if this resolves the issue, but it can be said that the compiler option -fno-plt affects relocations in some manner.
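
For a build script like the one linked above, a minimal sketch of the first option could look as follows. `strip_no_plt` is a hypothetical helper, not part of any conda-forge tooling, and it assumes the flag strings contain no paths with spaces.

```shell
# Hypothetical helper: print a flag string with every -fno-plt token removed.
# Assumes flags are separated by whitespace and contain no embedded spaces.
strip_no_plt() {
    _out=""
    for _flag in $1; do
        # Keep every token except an exact -fno-plt match.
        [ "$_flag" = "-fno-plt" ] || _out="${_out:+$_out }$_flag"
    done
    printf '%s\n' "$_out"
}

# Rewrite the flags before invoking CMake.
CFLAGS="$(strip_no_plt "${CFLAGS:-}")"
CXXFLAGS="$(strip_no_plt "${CXXFLAGS:-}")"
export CFLAGS CXXFLAGS
```

The second option, appending `-fplt`, relies on the compiler honouring the last of two conflicting options, which is how GCC normally treats `-f`/`-fno-` pairs.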

@h-vetinari
Contributor Author

> Try removing -fno-plt from C(XX)?FLAGS, or try appending -fplt to them to
> overwrite the existing -fno-plt. I'm not sure if this resolves the issue,
> but it can be said that the compiler option -fno-plt affects relocations in
> some manner.

Thanks a lot for the suggestion, indeed the build runs through on GCC 9 when -fno-plt is removed! :)

This helps take off the pressure a bit, though it should IMO still be solved, because IIUC it is a performance-relevant flag, and conda-forge has it on by default everywhere.

@nemanjai
Member

> > Try removing -fno-plt from C(XX)?FLAGS, or try appending -fplt to them to
> > overwrite the existing -fno-plt. I'm not sure if this resolves the issue,
> > but it can be said that the compiler option -fno-plt affects relocations in
> > some manner.
>
> Thanks a lot for the suggestion, indeed the build runs through on GCC 9 when
> -fno-plt is removed! :)
>
> This helps take off the pressure a bit, though it should IMO still be
> solved, because IIUC it is a performance-relevant flag, and conda-forge has
> it on by default everywhere.

I am really sorry that I didn't get to this until now...

I don't foresee being able to build LLVM on PPC with an option such as -fno-plt since the PLT stubs are what allows calls to reach their destination functions.
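
For context on the `R_PPC64_REL24` messages in the linker output earlier in this thread: that relocation patches the displacement field of a direct branch, which is 24 bits wide and counts in units of 4 bytes, so a direct call can only reach roughly 32 MiB in either direction. A small sketch of that range check, with `fits_rel24` as a hypothetical helper name:

```shell
# The 24-bit field is shifted left by 2, giving a signed 26-bit byte offset.
REL24_MIN=$(( -(1 << 25) ))      # -32 MiB
REL24_MAX=$(( (1 << 25) - 4 ))   # +32 MiB minus one 4-byte instruction

# Hypothetical helper: succeeds if a byte offset is reachable by a direct branch.
fits_rel24() {
    [ "$1" -ge "$REL24_MIN" ] && [ "$1" -le "$REL24_MAX" ] && [ $(( $1 % 4 )) -eq 0 ]
}

fits_rel24 $(( 16 * 1024 * 1024 )) && echo "16 MiB: direct branch reaches"
fits_rel24 $(( 48 * 1024 * 1024 )) || echo "48 MiB: needs a stub or indirect call"
```

In general, once the text of a shared library grows past that distance, calls between far-apart functions have to go through linker-generated stubs or some indirect call sequence.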

I think it would be interesting to see if there is actually a performance impact of this option on PPC - i.e. take the last known configuration that successfully builds with that option and build it also without that option. You should then be able to compare the performance of the two builds.
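
A very rough sketch of such a comparison, assuming two installs that differ only in `-fno-plt` and some representative workload to run against each. `time_runs`, the install prefixes and `input.bc` are all placeholders, and the whole-second resolution of `date +%s` only makes sense for workloads that run for a while.

```shell
# Hypothetical helper: run a command N times and print the total elapsed seconds.
time_runs() {
    _n=$1; shift
    _start=$(date +%s)
    _i=0
    while [ "$_i" -lt "$_n" ]; do
        # Abort if the workload itself fails, so a broken build is not "fast".
        "$@" >/dev/null 2>&1 || return 1
        _i=$((_i + 1))
    done
    _end=$(date +%s)
    echo $((_end - _start))
}

# Placeholder prefixes and input; substitute real ones before uncommenting.
# with_plt=$(time_runs 20 /opt/llvm-plt/bin/opt -O3 input.bc -o /dev/null)
# no_plt=$(time_runs 20 /opt/llvm-no-plt/bin/opt -O3 input.bc -o /dev/null)
# echo "with PLT: ${with_plt}s, with -fno-plt: ${no_plt}s"
```

Whether the tools exercise the shared library at all depends on how they were linked, so the workload should be something that actually loads libLLVM.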

@h-vetinari
Contributor Author

> I am really sorry that I didn't get to this until now...

No worries, thanks for helping now!

> I don't foresee being able to build LLVM on PPC with an option such as
> -fno-plt since the PLT stubs are what allows calls to reach their
> destination functions.

I don't claim to understand the details, but it did work in the past. What changed?

It's maybe worth noting that only the LLVM-component of LLVM is affected by this. Clang/OpenMP/etc. build fine on PPC with -fno-plt.

> I think it would be interesting to see if there is actually a performance
> impact of this option on PPC - i.e. take the last known configuration that
> successfully builds with that option and build it also without that option.
> You should then be able to compare the performance of the two builds.

Benchmarking the library code seems like a very complex task that I don't think I'll be able to execute. If this issue is declared to lean towards WONTFIX, I'll check back with the core conda-forge folks what their thoughts are on this.

@nemanjai
Member

Unfortunately, the fix for this would not be something this community can provide as the bug is in GCC.

I do however think that removing the -fno-plt option for PPC builds is the correct course of action (irrespective of whether any GCC bugs with this option are ever fixed).

@h-vetinari
Contributor Author

> Unfortunately, the fix for this would not be something this community can
> provide as the bug is in GCC.

Understood - is this a known bug or should it be raised with GCC?

> I do however think that removing the -fno-plt option for PPC builds is the
> correct course of action (irrespective of whether any GCC bugs with this
> option are ever fixed).

Thanks for your help!

@nemanjai
Member

> > Unfortunately, the fix for this would not be something this community can
> > provide as the bug is in GCC.
>
> Understood - is this a known bug or should it be raised with GCC?

I don't think they're aware of this issue so a bug report for GCC may be in order. However, I imagine it will be difficult to reduce this to something that is manageable in order to produce a reproducer for the GCC bug report.

@llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021

@asl
Collaborator

asl commented Dec 15, 2021

@tstellar Move to 13.0.1?

@h-vetinari
Contributor Author

> @tstellar Move to 13.0.1?

I think this never got removed from 13.0.0, but since it has no patch/solution, I don't think it makes sense to add it to 13.0.1.

@arsenm
Contributor

arsenm commented Aug 14, 2023

Old build issue with old release

@arsenm closed this as not planned Aug 14, 2023
@EugeneZelenko added the obsolete label Aug 14, 2023