
Sync meeting on EESSI software layer (2023-06-14)

Kenneth Hoste edited this page Jun 14, 2023 · 1 revision

EESSI/2023.06 software layer

This document collects information about the EESSI/2023.06 software layer: which toolchains and software packages it should include, how to build the software, which issues were encountered, and how they were resolved. It also serves to coordinate building this version of the software layer and to keep track of related meetings and discussions.

Meeting 2023-06-14, 14:00 CEST (UTC+2)

  • EESSI/2023.04 abandoned
    • compat layer was built, but includes OpenSSL 3.x which is problematic
      • problems with old versions of Rust (issue #257)
        • acceptable workaround could be to switch to newer versions of Rust (since it's a build-only dependency)
      • problems with old versions of cryptography extension in Python (issue #258)
      • problems with Net::SSLeay extension in Perl (issue #259)
    • installing OpenSSL/1.1 in the software layer with EasyBuild would mean an actual from-source installation of OpenSSL, which is not a good idea in the long term
  • EESSI/2023.06 kickstarted
    • compat layer (PR #188)
      • sticking to OpenSSL 1.1.1 by masking >=dev-libs/openssl-3
      • bumping GCC to 10.4.0 (was 9.5.0), since GCC 9.x doesn't seem to be actively maintained anymore
      • automated build with bot now works
        • problem with x11-base/xorg-proto fixed by not using gcc-config anymore (see issue #187)
        • build takes ~5h on 16 cores (c4.4xlarge for x86_64 + c6g.4xlarge for aarch64)
      • update to recent commit of Gentoo Git repository + latest version of Gentoo Prefix bootstrap script
      • deployment with bot does not work yet, so will still be done manually for 2023.06
        • tarball with /cvmfs/pilot.eessi-hpc.org/versions/2023.06 is not being created correctly by bot/build.sh script in EESSI/compatibility-layer repo
        • check for successful build done by bot is currently hardcoded for software layer
        • job manager part of bot is currently failing in bot account on AWS CitC Slurm cluster due to not having permission to update comments (?!)
    • software layer
      • test installations on top of 2023.06 compat layer look promising
        • OK on aarch64: foss/2021a + foss/2022a + foss/2022b (incl. all dependencies: GCC, Rust, Python, Perl, OpenBLAS, OpenMPI, ...)
          • OpenBLAS for foss/2021b failed because of too many failing tests in OpenBLAS (238 > 150) - to be checked
        • building foss from scratch on top of EESSI 2023.06 compat layer takes about 4h on 16 cores (c6g.4xlarge)
      • PR #260 to 2023.06 branch of software-layer repo
        • bump version to 2023.06
          • clean up install script to only use easystack file
        • add hook to correctly deal with OpenBLAS for generic CPU target
        • add easystack file for foss/2021a
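
The OpenBLAS hook mentioned above is not spelled out in these notes. As a rough sketch of the idea only, the decision logic could look like the Python below. It assumes that a generic build is obtained by passing `DYNAMIC_ARCH=1` to the OpenBLAS make steps, which is an assumption and not something the notes confirm. The function name `openblas_generic_opts` and the `GENERIC` constant are hypothetical; in EasyBuild, logic like this would typically be called from a hook such as `pre_configure_hook`.

```python
# Assumed value of the optarch setting for a generic CPU target (illustrative).
GENERIC = "GENERIC"


def openblas_generic_opts(software_name, optarch):
    """Return extra make options per build step for a generic OpenBLAS build.

    Hypothetical helper: returns an empty dict unless OpenBLAS is being
    built for a generic CPU target.
    """
    if software_name != "OpenBLAS" or optarch != GENERIC:
        return {}
    # Assumption: DYNAMIC_ARCH=1 lets OpenBLAS select kernels at runtime
    # instead of tuning for the CPU of the build host.
    return {step: "DYNAMIC_ARCH=1" for step in ("buildopts", "testopts", "installopts")}
```

Keeping the decision in a small pure function makes it easy to check in isolation, independent of how the hook itself is wired into the build.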

How to build:

  • Add to EasyStack file, and make a PR
  • To make the bot build, comment in the PR (the bot will only act on this when it's configured to listen to you):
    • E.g. bot: build arch:x86_64/generic repo:eessi-2023.06-software would build the software added to the EasyStack file in this PR with generic optimization for the 2023.06 version of the software layer.
    • bot: build arch=x86_64 repo=eessi-2023.06-software => builds for all x86_64 based archs, targeting the 2023.06 repo
    • bot: build repo=eessi-2023.06-software => builds for all archs, targeting the 2023.06 repo
    • bot: build inst:AWS repo=eessi-2023.06-software => builds for all archs, targeting the 2023.06 repo, only the bot called 'AWS' listens to this
    • implemented in bot in https://github.com/EESSI/eessi-bot-software-layer/pull/172
    • NESSI example https://github.com/NorESSI/software-layer/pull/123
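
To illustrate how the `arch`, `repo` and `inst` filters in the comments above narrow down which builds are triggered, here is a minimal Python sketch. It is not the bot's actual implementation: the function names `parse_bot_command` and `job_matches` are hypothetical, and it accepts both `key:value` and `key=value` only because both spellings appear in the examples above.

```python
import re


def parse_bot_command(comment):
    """Parse a single-line 'bot: <command> [filters...]' PR comment.

    Returns (command, filters), or None if the comment is not a bot command.
    """
    match = re.match(r"^\s*bot:\s*(\S+)\s*(.*)$", comment)
    if match is None:
        return None
    command, rest = match.groups()
    filters = {}
    for token in rest.split():
        # Accept 'key:value' and 'key=value' (both appear in the notes).
        kv = re.fullmatch(r"([A-Za-z]+)[:=](\S+)", token)
        if kv is None:
            raise ValueError(f"cannot parse filter: {token!r}")
        filters[kv.group(1)] = kv.group(2)
    return command, filters


def job_matches(filters, inst, arch, repo):
    """Check whether a build target on a given bot instance passes the filters."""
    if "inst" in filters and filters["inst"] != inst:
        return False
    if "repo" in filters and filters["repo"] != repo:
        return False
    if "arch" in filters:
        wanted = filters["arch"]
        # 'x86_64' selects every x86_64/* target; 'x86_64/generic' only itself.
        if arch != wanted and not arch.startswith(wanted + "/"):
            return False
    return True
```

For example, parsing `bot: build arch=x86_64 repo=eessi-2023.06-software` yields the command `build` with two filters, and `job_matches` then accepts an `x86_64/generic` target but rejects an `aarch64/generic` one. A missing filter matches everything, which mirrors the "builds for all archs" behaviour described above.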

How to deploy:

  • Add the bot:deploy label to the PR (the bot will only act on this when it's configured to listen to you)

Access to build logs in bot

  • easy solution: give access to the bot account on the AWS CitC cluster
  • put job output files in shared read-only location on AWS CitC cluster
  • implement support in bot to get build log somehow

Access to Slurm cluster in AWS

Bot configuration
