Sync meeting on EESSI software layer (2023 06 14)
Kenneth Hoste edited this page Jun 14, 2023
This document collects information about the EESSI/2023.06 software layer: which toolchains and software packages it should include, how to build the software, and which issues were encountered and how they were resolved. It also serves to coordinate building this version of the software layer and to keep track of related meetings and discussions.
- EESSI/2023.04 abandoned
  - compat layer was built, but includes OpenSSL 3.x which is problematic
    - problems with old versions of Rust (issue #257)
    - acceptable workaround could be to switch to newer versions of Rust (since it's a build-only dependency)
    - problems with old versions of `cryptography` extension in Python (issue #258)
    - problems with `Net::SSLeay` extension in Perl (issue #259)
  - OpenSSL/1.1 in software-layer installed with EasyBuild would be an actual from-source installation of OpenSSL, which is not a good idea long term
- EESSI/2023.06 kickstarted
  - compat layer (PR #188)
    - sticking to OpenSSL 1.1.1 by masking `>=dev-libs/openssl-3`
    - bumping GCC to 10.4.0 (was 9.5.0), since GCC 9.x doesn't seem actively maintained anymore
    - automated build with bot now works
      - problem with `x11-base/xorg-proto` fixed by not using `gcc-config` anymore (see issue #187)
      - build takes ~5h on 16 cores (`c4.4xlarge` for `x86_64` + `c6g.4xlarge` for `aarch64`)
    - update to recent commit of Gentoo Git repository + latest version of Gentoo Prefix bootstrap script
    - deployment with bot does not work yet, so will still be done manually for 2023.06
      - tarball with `/cvmfs/pilot.eessi-hpc.org/versions/2023.06` is not being created correctly by `bot/build.sh` script in `EESSI/compatibility-layer` repo
      - check for successful build done by bot is currently hardcoded for software layer
        - cfr. WIP PR #179 + bot PR #174 to switch to using a `bot/check-result.sh` script
      - job manager part of bot is currently failing in `bot` account on AWS CitC Slurm cluster due to not having permission to update comments (?!)
  - software layer
    - test installations on top of 2023.06 compat layer look promising
      - OK on `aarch64`: `foss/2021a` + `foss/2022a` + `foss/2022b` (incl. all dependencies: GCC, Rust, Python, Perl, OpenBLAS, OpenMPI, ...)
        - OpenBLAS for `foss/2021b` failed because of too many failing tests in OpenBLAS (238 > 150) - to be checked
      - building `foss` from scratch on top of EESSI 2023.06 compat layer takes about 4h on 16 cores (`c6g.4xlarge`)
    - PR #260 to `2023.06` branch of `software-layer` repo
      - bump version to `2023.06`
      - clean up install script to only use easystack file
      - add hook to correctly deal with OpenBLAS for `generic` CPU target
      - add easystack file for `foss/2021a`
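
The notes do not say what the OpenBLAS hook in PR #260 actually does. As a rough illustration of the idea only, the sketch below assumes the hook adds OpenBLAS's `DYNAMIC_ARCH=1` make option to the build, test and install steps when the CPU target is generic. The `StubEasyConfig` class and the `openblas_generic_hook` name are hypothetical stand-ins so the example runs without EasyBuild installed; a real hook would receive EasyBuild's own objects instead.

```python
# Sketch only: a stand-in for an EasyBuild hook that special-cases OpenBLAS
# when building for a generic CPU target. Names below are hypothetical.

OPTARCH_GENERIC = "GENERIC"  # assumed label for the generic CPU target


class StubEasyConfig:
    """Minimal stand-in for an easyconfig: a name plus per-step options."""

    def __init__(self, name):
        self.name = name
        self.opts = {"buildopts": "", "testopts": "", "installopts": ""}

    def update(self, key, value):
        # Append a value to an existing option string.
        self.opts[key] = (self.opts[key] + " " + value).strip()


def openblas_generic_hook(ec, optarch):
    """Add DYNAMIC_ARCH=1 for OpenBLAS when targeting a generic CPU.

    Assumption: this is the adjustment the hook makes; the meeting notes
    only say that a hook exists, not what it changes.
    """
    if ec.name == "OpenBLAS" and optarch == OPTARCH_GENERIC:
        for step in ("build", "test", "install"):
            ec.update(f"{step}opts", "DYNAMIC_ARCH=1")
    return ec


if __name__ == "__main__":
    ec = openblas_generic_hook(StubEasyConfig("OpenBLAS"), OPTARCH_GENERIC)
    print(ec.opts)
```

Keeping the condition on both the software name and the CPU target means other packages, and OpenBLAS builds for specific microarchitectures, are left untouched.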
How to build:
- Add the software to the EasyStack file, and make a PR
- To make the bot build, comment in the PR (the bot will only act on this when it's configured to listen to you):
  - E.g. `bot: build arch:x86_64/generic repo:eessi-2023.06-software` would build the software added to the EasyStack file in this PR with generic optimization for the 2023.06 version of the software layer.
  - `bot: build arch=x86_64 repo=eessi-2023.06-software` => builds for all x86_64-based archs, targeting the 2023.06 repo
  - `bot: build repo=eessi-2023.06-software` => builds for all archs, targeting the 2023.06 repo
  - `bot: build inst:AWS repo=eessi-2023.06-software` => builds for all archs, targeting the 2023.06 repo, only the bot called 'AWS' listens to this
  - implemented in bot in https://github.com/EESSI/eessi-bot-software-layer/pull/172
  - NESSI example https://github.com/NorESSI/software-layer/pull/123
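
To make the structure of these comments concrete, here is a small sketch of how such a command could be split into an action and its filters. It is not the bot's actual parser: the function name `parse_bot_command` is hypothetical, and because the examples above use both `:` and `=` between key and value, the sketch accepts either, which may not match what the real bot accepts.

```python
import re

# Filter keys that appear in the example commands in these notes.
FILTER_KEYS = ("arch", "repo", "inst")


def parse_bot_command(comment):
    """Split a 'bot: <action> key:value ...' comment into (action, filters).

    Returns None if the comment is not addressed to the bot.
    Raises ValueError for a token that is not a known key/value filter.
    """
    match = re.match(r"^\s*bot:\s*(\w+)(.*)$", comment)
    if match is None:
        return None
    action, rest = match.group(1), match.group(2)

    filters = {}
    for token in rest.split():
        # Assumption: accept both ':' and '=' as separator.
        kv = re.match(r"^(\w+)[:=](.+)$", token)
        if kv is None or kv.group(1) not in FILTER_KEYS:
            raise ValueError(f"unrecognised filter: {token!r}")
        filters[kv.group(1)] = kv.group(2)
    return action, filters


if __name__ == "__main__":
    print(parse_bot_command(
        "bot: build arch:x86_64/generic repo:eessi-2023.06-software"))
    print(parse_bot_command("bot: build inst:AWS repo=eessi-2023.06-software"))
```

A missing filter simply means "no restriction", which matches the examples above where omitting `arch` builds for all architectures.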
How to deploy:
- Add the `bot:deploy` label to the PR (the bot will only act on this when it's configured to listen to you)
Access to build logs in bot
- easy solution: give access to `bot` account on AWS CitC cluster
- put job output files in shared read-only location on AWS CitC cluster
- implement support in bot to get build log somehow
Access to Slurm cluster in AWS
Bot configuration
- see https://github.com/EESSI/software-layer/blob/main/bot/bot-eessi-aws-citc.cfg (needs to be updated with currently active configuration)
- who has permissions to build/deploy/talk to bot should not be public
- can look into setting up secret teams in GitHub for this?
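
As an illustration of what a permission check driven by the bot configuration could look like, the sketch below reads a list of allowed GitHub accounts from an INI-style configuration and tests whether a given user is in it. The section and option names (`buildenv`/`build_permission`, `deploycfg`/`deploy_permission`) and the account names are assumptions made for the example and are not taken from the configuration file linked above.

```python
import configparser

# Hypothetical configuration fragment: space-separated GitHub accounts
# that are allowed to trigger builds or deployments.
EXAMPLE_CFG = """
[buildenv]
build_permission = alice bob

[deploycfg]
deploy_permission = alice
"""


def allowed_users(cfg_text, section, option):
    """Return the set of accounts listed under section/option (empty if unset)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return set(parser.get(section, option, fallback="").split())


def may_trigger(cfg_text, user, section, option):
    """True if `user` is listed for the given permission."""
    return user in allowed_users(cfg_text, section, option)


if __name__ == "__main__":
    print(may_trigger(EXAMPLE_CFG, "bob", "buildenv", "build_permission"))
    print(may_trigger(EXAMPLE_CFG, "bob", "deploycfg", "deploy_permission"))
```

Because the account lists live in the configuration, publishing that file also publishes who may control the bot, which is the concern raised above.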