Brainstorm software stacks new clusters Apr 23 2021
Bob Dröge edited this page Apr 23, 2021 · 2 revisions
Software stacks on new clusters, how to make it easy to switch to EESSI later?
See https://github.com/EESSI/meetings/blob/main/meetings/EESSI_pilot_2020.10_brainstorm_20201027.pdf
- Bob Dröge
- Sabry Razick (University of Oslo)
- Åke Sangren (Umeå Univ)
- Xin Wu (UPB)
- Ward Poelmans (Vrije Universiteit Brussel)
- Thomas Röblitz
- Sebastian Potthoff
- Robert Schade
- Robert Externbrink
- Robbert Eggermont
- Rachel Glaves (RUB)
- Peter Stol
- Kenneth Hoste
- Paul Keekstra
- Michael Vetter
- Maxim (SURF)
- Martin Errenst
- Koen (TU Delft)
- Jörg Sassmannshausen
- Jure Pečar
- Holger Angenent (WWU Münster)
- Dennis Terhorst (Jülich)
- Axel Rosén
- Alan O'Cais
Options:
- Use local stack like you did before
- Go completely the EESSI way and use the same system to install software
- Mix the two options above
- Use different, disjoint modulepaths for both: loading the EESSI stack is then just loading a module.
- Make a hard switch: start with local and then rebuild everything you have in EESSI so you can do a clean but hard switch
- Be careful when mixing both systems: the easyconfigs in EESSI are tweaked compared to upstream EasyBuild, so using both will cause clashes
- Users don't care where the modules come from
- EESSI will provide multiple 'views' to the EESSI stack
- standard EasyBuildMNS, HierarchicalMNS, lowercase, ...
- Building with or without RPATH is potentially a big difference, and users will most likely notice it. EasyBuild does not use RPATH by default, while EESSI does.
- Building on top of EESSI can solve some issues and keep things compatible, but how do we deal with outdated compat layers that are going to be removed?
- Tell people who do this to make their own copy
- Store old versions somewhere else
- Alan is already doing this and this works quite well
- How do users compile their code when RPATH is being used?
- Can we provide a "stable" minimal EESSI release that is not being removed in the near future, and only provides toolchains?
- Maybe make it available for a set of test users first, as Thomas has done in Norway
- Jörg also had the idea to mix module files from the local installation and EESSI; how do you make sure that dependencies (e.g. `libpng` and `libjpeg`) are picked up from the right stack?
- RPATH should solve the issue
- Whatever comes first in the `$MODULEPATH` will be used, so make sure that your local module directories are listed first
- Using the easystacks, it should be easy to rebuild things, either fully or only the module files
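The `$MODULEPATH` precedence rule noted above can be illustrated with a minimal Python sketch. It is a simplified model, not how any particular modules tool is implemented: it only looks up a fully qualified `name/version` and returns the first directory that provides it. The function name `first_match` is made up for this illustration, and the assumption that a module file is either `name/version` or `name/version.lua` is a simplification.

```python
import os


def first_match(module, modulepath):
    """Return the first directory in modulepath that provides module.

    module is a fully qualified name such as "libpng/1.6.37" (hypothetical).
    modulepath is a string in $MODULEPATH format (os.pathsep-separated).
    Simplifying assumption: a module file is either <dir>/<module>
    or <dir>/<module>.lua.
    """
    for directory in modulepath.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, module)
        if os.path.isfile(candidate) or os.path.isfile(candidate + ".lua"):
            return directory
    return None
```

Calling `first_match("libpng/1.6.37", os.environ.get("MODULEPATH", ""))` on a system where both the local stack and EESSI provide that module would show which of the two wins under this simplified model.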
- Focus on stable release:
- GCCcore + useful apps built with GCCcore (e.g. Perl)
- Start testing already and see how this works
- Jörg (assuming they're going the EESSI/EasyBuild route), Åke, and Bob volunteer to test this
- Thomas can also help when CVMFS clients are available on at least one cluster
- Doing the transition with the release of a new GCCcore version may make things much easier
- As a conservative approach: first build everything locally, then switch to EESSI when a new toolchain becomes available and build everything with that new toolchain from then on
- Åke asks about the build procedure for EESSI and how software is added to the repo
- We use a Singularity container with fuse-overlayfs: https://github.com/EESSI/software-layer/blob/main/Dockerfile.fuse-overlay-debian10-x86_64
- This allows you to build on any machine and write to `/cvmfs`
- Installations end up in an overlay directory on the host
- This directory is tarred, copied to the Stratum 0, and ingested
- We're working on more automation for this process
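The "tar the overlay directory" step described above could look roughly like the following Python sketch. It only covers packing: copying the tarball to the Stratum 0 and ingesting it are not shown. The function name `pack_overlay` and the paths in the usage note are hypothetical, and the actual EESSI scripts may pack the directory differently.

```python
import os
import tarfile


def pack_overlay(upper_dir, tarball):
    """Pack the contents of an overlay directory into a gzipped tarball.

    Entries are stored relative to upper_dir, so the tarball can be
    unpacked under any base directory. Returns the sorted member names.
    """
    with tarfile.open(tarball, "w:gz") as tar:
        for entry in sorted(os.listdir(upper_dir)):
            # arcname drops the host-specific prefix of the overlay directory
            tar.add(os.path.join(upper_dir, entry), arcname=entry)
    with tarfile.open(tarball, "r:gz") as tar:
        return sorted(tar.getnames())
```

For example, `pack_overlay("/tmp/overlay-upper", "/tmp/eessi-new-software.tar.gz")` (both paths made up) would produce a tarball containing only the files that the build added on top of the read-only `/cvmfs` mount.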
- What happens when security updates are installed? How often do we do that?
- We've only had to do this a couple of times so far, as we don't keep our versions for too long. We may have to do it more often once we have a stable release
- It shouldn't really break stuff, but if it does, it can be bad...
- How often do we still see issues with software?
- It's quite often related to the compatibility layer (e.g. missing symlink to a host file), and often these are weird issues that you only detect when actually using the software
- In terms of building, things usually work quite well
- A difference with Compute Canada is that we include lots of graphical libraries in the software layer instead of the compatibility layer; we'll have to see if any issues pop up here...
- If you want to build software on top of EESSI, see the issue opened by Alan https://github.com/EESSI/software-layer/issues/59 (or take a look at the build script to find the settings that it uses: https://github.com/EESSI/software-layer/blob/main/EESSI-pilot-install-software.sh)