[le11] Add support for building with MOLD linker v2 #6875

Merged on Oct 7, 2022 (7 commits)

@SupervisedThinking (Contributor) commented Sep 7, 2022

The basic idea is based on PR #6665, which is somewhat stalled, not completely functional, and needs a rebase. Since mold linker benchmarks show nice performance gains, it could speed up compilation of the distro image on systems with a lot of cores/threads. See https://github.com/rui314/mold#readme and xbmc/xbmc#20891 for details.

  • build mold:host conditionally as a dependency of the gcc package
  • add MOLD_SUPPORT distro var
  • build spdlog as shared lib to fix Kodi linking by mold
  • add Kodi support for mold linking in pkg build opts

EDIT: rebased on #6928

@SupervisedThinking (Contributor Author) commented Sep 8, 2022

[review threads on config/functions, marked outdated and resolved]
@antonlacon (Contributor)

Code changes look ok. I believe it should get the ok from project/device maintainers too, as it's changing the default linker.

The TBB patch's PR: oneapi-src/oneTBB#824

@rui314 commented Sep 12, 2022

@SupervisedThinking What was the issue you experienced with some Qt packages? We'd like to fix it on mold side if possible.

@SupervisedThinking (Contributor Author) commented Sep 12, 2022

> @SupervisedThinking What was the issue you experienced with some Qt packages? We'd like to fix it on mold side if possible.

The affected packages are "out-of-tree" packages and aren't included in vanilla LE. I can open issues on the mold issue tracker later this week 👍🏻

@HiassofT (Member)

@rui314 thanks a lot for looking into it!

I dug out the slowest working x86_64 box I could find here, an ancient 3GHz dual-core P4 with 2GB RAM, and ran the test suite on it.

Because of the slower speed (and the performance frequency governor, which I forgot to activate in previous tests), the results are a bit more stable and pronounced.

Interestingly, mold (now latest master) is a bit slower than gold in x86_64 linking.

hias@hp:~/linkertest$ ./linktest.sh 1000 gcc 
mold 1.5.1 (7d8bf817e92c6cbfd8888420067e92a6c2fe33ab; compatible with GNU ld)

gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    61.628   -fuse-ld=bfd
    26.539   -fuse-ld=gold
    38.330   -B/opt/mold/libexec/mold
    34.314   -B/opt/mold/libexec/mold -Wl,--no-threads
hias@hp:~/linkertest$ ./linktest.sh 1000 arm-linux-gnueabihf-gcc
mold 1.5.1 (7d8bf817e92c6cbfd8888420067e92a6c2fe33ab; compatible with GNU ld)

arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    80.345   -fuse-ld=bfd
    24.399   -fuse-ld=gold
    54.573   -B/opt/mold/libexec/mold
    51.298   -B/opt/mold/libexec/mold -Wl,--no-threads
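
To put the numbers above in proportion, a small sketch like the following divides each timing by the gold result for the same compiler. The figures are copied from the output above; the dictionary layout and the helper name `relative_to` are illustrative assumptions, and the unit of the timings is whatever linktest.sh reports.

```python
# Timings copied from the linktest.sh output above.
# The unit is whatever the script reports; only the ratios matter here.
results = {
    "gcc": {
        "bfd": 61.628,
        "gold": 26.539,
        "mold": 38.330,
        "mold --no-threads": 34.314,
    },
    "arm-linux-gnueabihf-gcc": {
        "bfd": 80.345,
        "gold": 24.399,
        "mold": 54.573,
        "mold --no-threads": 51.298,
    },
}


def relative_to(baseline, timings):
    """Return each linker's timing divided by the baseline linker's timing."""
    base = timings[baseline]
    return {name: value / base for name, value in timings.items()}


for compiler, timings in results.items():
    print(compiler)
    for name, ratio in relative_to("gold", timings).items():
        print(f"  {name:<18} {ratio:.2f}x gold")
```

With these inputs, mold comes out at roughly 1.4x gold's time for the native gcc run and roughly 2.2x for the ARM cross-compiler run.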

Perf stats with threads (-Wl,--perf)

hias@hp:~/linkertest$ grep -A 2 all perf-threads-*
perf-threads-arm-linux-gnueabihf-gcc.001:    0.025    0.016    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.001-    0.025    0.017    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.001-    0.010    0.003    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.002:    0.019    0.022    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.002-    0.014    0.028    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.002-    0.009    0.003    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.003:    0.019    0.021    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.003-    0.026    0.015    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.003-    0.003    0.009    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.004:    0.017    0.025    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.004-    0.017    0.026    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.004-    0.007    0.006    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.005:    0.020    0.021    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.005-    0.023    0.019    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.005-    0.006    0.006    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.006:    0.029    0.012    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.006-    0.027    0.016    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.006-    0.007    0.006    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.007:    0.027    0.014    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.007-    0.030    0.013    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.007-    0.004    0.008    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.008:    0.022    0.019    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.008-    0.021    0.021    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.008-    0.012    0.001    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.009:    0.032    0.009    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.009-    0.029    0.014    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.009-    0.003    0.009    0.006    total
--
perf-threads-arm-linux-gnueabihf-gcc.010:    0.026    0.013    0.033  all
perf-threads-arm-linux-gnueabihf-gcc.010-    0.030    0.011    0.025    read_input_files
perf-threads-arm-linux-gnueabihf-gcc.010-    0.003    0.009    0.006    total
--
perf-threads-gcc.001:    0.002    0.020    0.019  all
perf-threads-gcc.001-    0.005    0.022    0.015    read_input_files
perf-threads-gcc.001-    0.000    0.005    0.003    total
--
perf-threads-gcc.002:    0.007    0.011    0.019  all
perf-threads-gcc.002-    0.012    0.013    0.015    read_input_files
perf-threads-gcc.002-    0.003    0.003    0.003    total
--
perf-threads-gcc.003:    0.003    0.018    0.019  all
perf-threads-gcc.003-    0.007    0.019    0.015    read_input_files
perf-threads-gcc.003-    0.003    0.002    0.003    total
--
perf-threads-gcc.004:    0.001    0.019    0.019  all
perf-threads-gcc.004-    0.007    0.019    0.015    read_input_files
perf-threads-gcc.004-    0.002    0.004    0.003    total
--
perf-threads-gcc.005:    0.011    0.010    0.019  all
perf-threads-gcc.005-    0.011    0.015    0.015    read_input_files
perf-threads-gcc.005-    0.003    0.003    0.003    total
--
perf-threads-gcc.006:    0.002    0.020    0.019  all
perf-threads-gcc.006-    0.000    0.027    0.015    read_input_files
perf-threads-gcc.006-    0.005    0.000    0.003    total
--
perf-threads-gcc.007:   -0.003    0.024    0.019  all
perf-threads-gcc.007-    0.003    0.023    0.015    read_input_files
perf-threads-gcc.007-    0.001    0.004    0.003    total
--
perf-threads-gcc.008:    0.006    0.015    0.019  all
perf-threads-gcc.008-    0.004    0.022    0.015    read_input_files
perf-threads-gcc.008-    0.006    0.000    0.003    total
--
perf-threads-gcc.009:    0.018    0.002    0.019  all
perf-threads-gcc.009-    0.015    0.011    0.015    read_input_files
perf-threads-gcc.009-    0.003    0.002    0.003    total
--
perf-threads-gcc.010:    0.006    0.015    0.019  all
perf-threads-gcc.010-    0.013    0.014    0.015    read_input_files
perf-threads-gcc.010-    0.004    0.002    0.003    total

Perf stats without threads (-Wl,--perf -Wl,--no-threads)

hias@hp:~/linkertest$ grep -A 2 all perf-nothreads-*
perf-nothreads-arm-linux-gnueabihf-gcc.001:    0.011    0.004    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.001-    0.015    0.010    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.001-    0.001    0.002    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.002:    0.010    0.006    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.002-    0.014    0.011    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.002-    0.002    0.001    0.003    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.003:    0.010    0.005    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.003-    0.017    0.009    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.003-    0.004    0.000    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.004:    0.011    0.005    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.004-    0.016    0.008    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.004-    0.003    0.001    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.005:    0.012    0.004    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.005-    0.018    0.007    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.005-    0.003    0.001    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.006:    0.006    0.009    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.006-    0.012    0.013    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.006-    0.004    0.000    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.007:    0.010    0.006    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.007-    0.016    0.008    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.007-    0.003    0.001    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.008:    0.004    0.011    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.008-    0.007    0.018    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.008-    0.002    0.002    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.009:    0.005    0.010    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.009-    0.008    0.017    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.009-    0.004    0.000    0.004    total
--
perf-nothreads-arm-linux-gnueabihf-gcc.010:    0.009    0.006    0.030  all
perf-nothreads-arm-linux-gnueabihf-gcc.010-    0.012    0.013    0.025    read_input_files
perf-nothreads-arm-linux-gnueabihf-gcc.010-    0.004    0.000    0.004    total
--
perf-nothreads-gcc.001:   -0.001    0.005    0.016  all
perf-nothreads-gcc.001-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.001-    0.002    0.000    0.002    total
--
perf-nothreads-gcc.002:   -0.001    0.004    0.016  all
perf-nothreads-gcc.002-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.002-    0.001    0.002    0.002    total
--
perf-nothreads-gcc.003:   -0.002    0.005    0.016  all
perf-nothreads-gcc.003-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.003-    0.001    0.001    0.002    total
--
perf-nothreads-gcc.004:   -0.000    0.004    0.016  all
perf-nothreads-gcc.004-    0.006    0.006    0.012    read_input_files
perf-nothreads-gcc.004-    0.001    0.001    0.002    total
--
perf-nothreads-gcc.005:    0.001    0.002    0.016  all
perf-nothreads-gcc.005-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.005-    0.001    0.002    0.002    total
--
perf-nothreads-gcc.006:    0.002    0.001    0.016  all
perf-nothreads-gcc.006-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.006-    0.002    0.001    0.002    total
--
perf-nothreads-gcc.007:   -0.001    0.004    0.016  all
perf-nothreads-gcc.007-    0.004    0.009    0.012    read_input_files
perf-nothreads-gcc.007-    0.001    0.002    0.002    total
--
perf-nothreads-gcc.008:   -0.003    0.006    0.016  all
perf-nothreads-gcc.008-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.008-    0.000    0.002    0.002    total
--
perf-nothreads-gcc.009:    0.001    0.003    0.016  all
perf-nothreads-gcc.009-    0.008    0.004    0.012    read_input_files
perf-nothreads-gcc.009-    0.000    0.002    0.002    total
--
perf-nothreads-gcc.010:    0.005   -0.001    0.016  all
perf-nothreads-gcc.010-    0.004    0.008    0.012    read_input_files
perf-nothreads-gcc.010-    0.001    0.002    0.002    total

- Kodi fails to link static spdlog libs
```
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans2.ltrans.o: std::_Sp_make_shared_tag::_S_ti()::__tag
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::do_count_digits(unsigned int)::table
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::do_count_digits(unsigned long long)::bsr2log10
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::do_count_digits(unsigned long long)::zero_or_powers_of_10
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::basic_data<void>::pow10_significands
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::basic_data<void>::pow10_exponents
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::basic_data<void>::power_of_10_64
```
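
When many of these errors scroll past, it can help to pull out just the symbol names and the two objects that define them. The sketch below is a hypothetical helper, not part of mold or LibreELEC. It assumes every line has the `mold: error: duplicate symbol: <file>: <file>: <symbol>` shape seen above and that the file paths never contain `": "`. The two sample lines are copied verbatim from the log.

```python
from typing import NamedTuple

# Two lines copied verbatim from the log above.
LOG = """\
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans2.ltrans.o: std::_Sp_make_shared_tag::_S_ti()::__tag
mold: error: duplicate symbol: /mnt/dev/LibreELEC-RR/build.LibreELEC-RK3399.arm-11.0-devel/toolchain/armv8a-libreelec-linux-gnueabihf/sysroot/usr/lib/libspdlog.a(spdlog.cpp.o): /tmp/ccIINps6.ltrans0.ltrans.o: fmt::v9::detail::do_count_digits(unsigned int)::table
"""

PREFIX = "mold: error: duplicate symbol: "


class Duplicate(NamedTuple):
    first: str   # first object reported as defining the symbol
    second: str  # second object reported as defining the symbol
    symbol: str  # demangled symbol name as printed by mold


def parse_duplicates(log):
    """Extract (first, second, symbol) from duplicate-symbol error lines."""
    found = []
    for line in log.splitlines():
        if not line.startswith(PREFIX):
            continue
        # C++ names contain "::" but not ": ", so two splits on ": " suffice.
        first, second, symbol = line[len(PREFIX):].split(": ", 2)
        found.append(Duplicate(first, second, symbol))
    return found


duplicates = parse_duplicates(LOG)
for dup in duplicates:
    print(dup.symbol)
```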
@SupervisedThinking (Contributor Author) commented Oct 6, 2022

@HiassofT ping - this should be good to go. Mold will only be built & used if the flags are set.

=== tested on ===
x11.x86_64-RR-20221006-001a405
Linux phoenix 6.0.0 #1 SMP Thu Oct 6 03:02:35 CEST 2022 x86_64 GNU/Linux
2022-10-06 20:33:43.489 T:1371     info <general>: Starting Kodi (20.0-ALPHA3 (19.90.710) Git:3066d800934f39c28b951c69930d3cbee5fd6308). Platform: Linux x86 64-bit
2022-10-06 20:33:43.489 T:1371     info <general>: Using Release Kodi x64
2022-10-06 20:33:43.489 T:1371     info <general>: Kodi compiled 2022-10-06 by GCC 12.2.0 for Linux x86 64-bit version 6.0.0 (393216)
2022-10-06 20:33:43.489 T:1371     info <general>: Running on LibreELEC (ST): RR-20221006-001a405 11.0, kernel: Linux x86 64-bit version 6.0.0
2022-10-06 20:33:43.489 T:1371     info <general>: FFmpeg version/source: 4.4.1-Nexus-Alpha1-Kodi
2022-10-06 20:33:43.489 T:1371     info <general>: Host CPU: Intel(R) Core(TM) i3-6100 CPU @ 3.70GHz, 4 cores available
[    0.000000] DMI: Gigabyte Technology Co., Ltd. B150N Phoenix-WIFI/B150N Phoenix-WIFI-CF, BIOS F22e 03/09/2018
CApplication::CreateGUI - using the x11 windowing system
RetroPlayer[PROCESS]: Registering process control for X11
CRenderSystemGL::InitRenderSystem - Version: 4.6 (Core Profile) Mesa 22.2.0, Major: 4, Minor: 6
GL_RENDERER = Mesa Intel(R) HD Graphics 530 (SKL GT2)
GL_VERSION = 4.6 (Core Profile) Mesa 22.2.0
libva info: VA-API version 1.15.0
vainfo: VA-API version: 1.15 (libva 2.15.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.5.4 (001a405718)

@HiassofT (Member) left a comment

Thanks, this looks good now.

The default hasn't changed: we still link with gold, and mold isn't built unless MOLD_SUPPORT is changed to yes.

BTW: During testing I noticed that mold takes quite a long time to build, both on my 2C/4T X230 when testing native Linux builds on Debian and on my 4C/8T LE build laptop.

mimalloc and tbb are negligible (they need roughly 10 seconds total), but mold needs well over 4 minutes to build (total LE build time is roughly 2 hours with the default gold).

Test with ccache wiped, building tbb+mimalloc+mold, with all other deps built beforehand:

$ time ../run-rpi4.sh scripts/build_mt mold:host
real    4m44.906s
user    33m51.877s
sys     1m34.302s
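
As a rough reading of these figures, the sketch below converts the three `time` fields to seconds and divides CPU time by wall-clock time to estimate how many hardware threads were busy on average. The helper name `parse_time` is illustrative, and the 2-hour total is the approximate figure quoted above, so the resulting share is only a ballpark.

```python
import re


def parse_time(value):
    """Convert a bash `time` field such as '4m44.906s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the `time` output above.
real = parse_time("4m44.906s")
user = parse_time("33m51.877s")
sys_time = parse_time("1m34.302s")

# CPU seconds spent per wall-clock second: average number of busy threads.
avg_busy_threads = (user + sys_time) / real

# Share of the whole image build, using the approximate 2-hour total.
total_build_seconds = 2 * 60 * 60
share_of_build = real / total_build_seconds

print(f"average busy threads: {avg_busy_threads:.1f}")
print(f"share of total build: {share_of_build:.1%}")
```

This works out to about 7.5 busy threads on average, so the mold build already parallelises well, and it accounts for roughly 4% of the quoted 2-hour total.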

@HiassofT HiassofT merged commit bbb8c6b into LibreELEC:master Oct 7, 2022
@SupervisedThinking SupervisedThinking deleted the up_mold_v2 branch October 7, 2022 21:36
@rui314 commented Oct 7, 2022

mold's code is template heavy, and C++ compilers are unfortunately slow at compiling it.

@SupervisedThinking (Contributor Author) commented Oct 23, 2022

@rui314 just wondering: could you implement an option to enable or disable certain platforms? For example, build mold with x86_64 & 32-bit ARM support only? Wouldn't this decrease build time?

@rui314 commented Oct 23, 2022

@SupervisedThinking Technically we could, but it's very unlikely that we will add such a build-time option to mold because of https://github.com/rui314/mold/blob/main/CMakeLists.txt#L38-L44. We can't stop users from applying local patches to build only a part of the mold linker, but we don't want to officially support it.
