[WIP] Project subdirectories in dev.eessi.io install path and EESSI-extend integration #804

Draft
wants to merge 31 commits into
base: 2023.06-software.eessi.io
from

Conversation

Neves-P
Member

@Neves-P Neves-P commented Nov 5, 2024

Please don't merge for now

Previous installations to dev.eessi.io are landing in /cvmfs/dev.eessi.io/versions/2023.06/, but the path should also include a project-specific subdirectory such as /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/. Right now, this should be done by setting the environment variable $EESSI_DEV_PROJECT in the dev.eessi.io bot build script used in each of the projects (example: https://github.com/EESSI/dev.eessi.io-scripts/blob/main/bot/bot-build-dev.eessi.io.slurm). This could perhaps be improved by reading the information from the job.cfg file, which includes a field like this:

[repository]
...
repo_id: repo_name
...

For now, however, I kept it simple to reduce points of failure until we are sure the installations land in the right place.
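
As a rough illustration of the job.cfg idea, here is a minimal Python sketch that reads `repo_id` from the `[repository]` section and falls back to `$EESSI_DEV_PROJECT`. The function name `project_from_job_cfg` is hypothetical, and treating the `repo_id` value directly as the project name is an assumption; the real value may need to be mapped to a project subdirectory first.

```python
import configparser
import os


def project_from_job_cfg(cfg_text, env=None):
    """Return a dev project name, preferring job.cfg over the environment.

    Hypothetical helper: not part of the bot or the software-layer scripts.
    """
    env = os.environ if env is None else env
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # [repository] / repo_id is the field quoted in this PR description.
    # Using its value as the project name is an assumption of this sketch.
    repo_id = parser.get("repository", "repo_id", fallback=None)
    if repo_id:
        return repo_id
    # Fall back to the variable currently set in the bot build script.
    return env.get("EESSI_DEV_PROJECT")
```

Keeping the environment variable as a fallback means the current behaviour would be unchanged for projects whose job.cfg does not carry a usable `repo_id`.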

EESSI-extend

One reason this PR shouldn't be merged right now: EasyBuild is ignoring the cvmfs repository and installpath overrides and trying to write to software.eessi.io (and failing, because we mount this as read-only). Because this was working earlier, I think some wires got crossed when we switched to EESSI-extend in EESSI-install-software.sh (see #790), similar to what is described in #802. I've not managed to figure out where things are going wrong.

Relevant error from log:

== building and installing ESPResSo/4.2.2-foss-2023a-1f672a00e22a9bc5fec1f7070bd28ff7718a706e...
  >> installation prefix: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/ESPResSo/4.2.2-foss-2023a-1f672a00e22a9bc5fec1f7070bd28ff7718a706e
== FAILED: Installation ended unsuccessfully (build directory: /tmp/bot/easybuild/build/ESPResSo/4.2.2/foss-2023a-1f672a00e22a9bc5fec1f7070bd28ff7718a706e): build failed (first 300 chars): Failed to create lock /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/.locks/_cvmfs_software.eessi.io_versions_2023.06_software_linux_x86_64_amd_zen2_software_ESPResSo_4.2.2-foss-2023a-1f672a00e22a9bc5fec1f7070bd28ff7718a706e.lock: [Errno 38] Function not implemente (took 0 secs)
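
A mismatch like this could be flagged automatically by scanning the EasyBuild trace output for `>> installation prefix:` lines and reporting any prefix outside the expected repository. The following is only a diagnostic sketch in Python; the helper name `prefixes_outside_repo` is hypothetical, and the line format is taken from the log excerpt above.

```python
import re

# Matches trace lines such as:
#   >> installation prefix: /cvmfs/.../software/ESPResSo/<version>
PREFIX_RE = re.compile(r"^\s*>> installation prefix: (\S+)", re.MULTILINE)


def prefixes_outside_repo(log_text, expected_root):
    """Return installation prefixes in log_text not located under expected_root."""
    root = expected_root.rstrip("/") + "/"
    return [p for p in PREFIX_RE.findall(log_text) if not p.startswith(root)]
```

For the log above, calling it with `/cvmfs/dev.eessi.io` as the expected root would report the software.eessi.io prefix, which makes the wrong target visible before the lock-file error appears.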

I've tried changing the installpath without setting the EESSI-extend environment variables such as $EESSI_SITE_INSTALL and $EESSI_USER_INSTALL, which might work, but I think users and sites should be able to set these as needed when building for dev.eessi.io.

To try to figure this out, I also added a bunch of extra output to the logs that should be removed before this PR is merged. I will do so before marking this as ready for review.

I've tried a few approaches already and I'm out of ideas on how to correct the installpath...

TODO after this PR

Address #799


eessi-bot bot commented Nov 5, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

@riscv-eessi-io-bot

Instance eessi-bot-riscv is configured to build for:

  • architectures: riscv64/generic
  • repositories: riscv.eessi.io-20240402


eessi-bot bot commented Nov 5, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software, eessi.io-2023.06-compat

@Neves-P
Member Author

Neves-P commented Nov 5, 2024

Seems like the last attempt 821455b got a bit further!

Now the installation is pointing to dev.eessi.io. However, this is a pretty dirty hack imo. The issue was that $EESSI_CVMFS_REPO was not being overwritten to dev.eessi.io, so I just replaced the check with one that checks whether $EESSI_DEV_PROJECT is defined (which indirectly means that we're building for dev.eessi.io). It is still weird that $EESSI_CVMFS_REPO is not /cvmfs/dev.eessi.io as it should be, and that is likely still problematic.

Edit: to clarify, installation starts but the build fails, which is probably happening because I just used the first recent commit ID I could find, with no guarantee that it would build in the first place.
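
The check described in this comment amounts to roughly the following decision, written here as a Python sketch rather than the actual shell logic. The function `resolve_install_root` is hypothetical, and the non-dev branch is deliberately simplified: the real scripts derive the full software path from more variables than `$EESSI_CVMFS_REPO` alone.

```python
import os


def resolve_install_root(env=None, version="2023.06"):
    """Pick the install root based on whether a dev project is defined.

    Illustrative only: mirrors the idea of keying on EESSI_DEV_PROJECT
    instead of EESSI_CVMFS_REPO, not the real implementation.
    """
    env = os.environ if env is None else env
    project = env.get("EESSI_DEV_PROJECT", "").strip("/")
    if project:
        # A defined project name indirectly means a dev.eessi.io build.
        return f"/cvmfs/dev.eessi.io/versions/{version}/{project}"
    # Simplified default; the fallback repository path is an assumption.
    repo = env.get("EESSI_CVMFS_REPO", "/cvmfs/software.eessi.io")
    return f"{repo.rstrip('/')}/versions/{version}"
```

Keying on the project variable sidesteps the stale `$EESSI_CVMFS_REPO` value, but it does not explain why that variable is not being overridden in the first place.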

@Neves-P
Member Author

Neves-P commented Nov 10, 2024

Right now the builds do indeed progress further, and the last (hopefully) issue to solve is creating the tarballs. I am testing this functionality via EESSI/dev.eessi.io-example#7

A brief outline of what should happen on dev.eessi.io builds:

  • The bot runs the script in https://github.com/EESSI/dev.eessi.io-scripts/blob/main/bot/bot-build-dev.eessi.io.slurm (or in another repository; this can be controlled via the changes in "add support for specifying that build job script is located in another repository", eessi-bot-software-layer#283).
  • This script sets the EESSI compat layer version, the cvmfs compat layer override, and the EESSI dev project name (which will be in the installation prefix, for example: /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/). The bot/build.sh script can also be changed, but we are using the same script as builds for software.eessi.io, and it should stay that way.
  • The necessary variables (overrides, paths, dev name, etc.) are exported to the compat layer and to the tarball creation script.
  • The builds are being created in the right installation path (see the paste of the build step in the log below).
  • create_tarball.sh uses EESSI_DEV_PROJECT to determine whether the build targets dev.eessi.io and, if so, which project. The script should see any installed software in this project subdirectory (including accel builds), but something is going wrong (see also the log below). Most likely I'm forgetting something, or there is a mistake somewhere, but I've not been able to find the problem.

Any help with this is much appreciated!

Build log excerpt

#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
allow-loaded-modules (E) = EasyBuild, EESSI-extend
buildpath            (E) = /tmp/bot/easybuild/build
containerpath        (E) = /tmp/bot/easybuild/containers
debug                (E) = True
experimental         (E) = True
filter-deps          (E) = Autoconf, Automake, Autotools, binutils, bzip2, DBus, flex, gettext, gperf, help2man, intltool, libreadline, libtool, M4, makeinfo, ncurses, util-linux, XZ, zlib
filter-env-vars      (E) = LD_LIBRARY_PATH
hooks                (E) = /cvmfs/software.eessi.io/versions/2023.06/init/easybuild/eb_hooks.py
ignore-osdeps        (E) = True
installpath          (E) = /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/
module-extensions    (E) = True
packagepath          (E) = /tmp/bot/easybuild/packages
prefix               (E) = /tmp/bot/easybuild
read-only-installdir (E) = True
repositorypath       (E) = /tmp/bot/easybuild/ebfiles_repo
robot-paths          (E) = /project/60006/SHARED/jobs/2024.11/pr_7/event_e1610170-9de4-11ef-9b92-6e7c0f49e560/run_000/linux_x86_64_amd_zen2/dev.eessi.io/easyconfigs, /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/EasyBuild/4.9.4/easybuild/easyconfigs
rpath                (E) = True
sourcepath           (E) = /project/def-users/bot/shared/easybuild/sources:
sysroot              (E) = /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64
trace                (E) = True
umask                (E) = 022
zip-logs             (E) = bzip2
...
== ... (took 3 secs)
== cleaning up...
== ... (took < 1 sec)
== creating module...
  >> generating module file @ /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/modules/all/ESPResSo/4.2.2-foss-2023a-2ba17de6096933275abec0550981d9122e4e5f28.lua
== ... (took 2 secs)
== permissions...
== ... (took < 1 sec)
== packaging...
== ... (took < 1 sec)
  >> running command:
        [started at: 2024-11-08 15:43:39]
        [working dir: /project/60006/SHARED/jobs/2024.11/pr_7/event_e1610170-9de4-11ef-9b92-6e7c0f49e560/run_000/linux_x86_64_amd_zen2/dev.eessi.io]
        [output logged in /tmp/eb-wop0g31p/eb-ff2_n_24/easybuild-run_cmd-j974lq6t.log]
        bzip2 /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/software/ESPResSo/4.2.2-foss-2023a-2ba17de6096933275abec0550981d9122e4e5f28/easybuild/easybuild-ESPResSo-4.2.2-20241108.154339.log
  >> command completed: exit 0, ran in < 1s
== COMPLETED: Installation ended successfully (took 23 mins 14 secs)
== Results of the build can be found in the log file(s) /cvmfs/dev.eessi.io/versions/2023.06/ESPResSo/software/ESPResSo/4.2.2-foss-2023a-2ba17de6096933275abec0550981d9122e4e5f28/easybuild/easybuild-ESPResSo-4.2.2-20241108.154339.log.bz2
== Build succeeded for 1 out of 1

Create tarball excerpt

Launching container with command (next line):
singularity  run  --fusemount container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch --fusemount container:cvmfs2 dev.eessi.io /cvmfs_ro/dev.eessi.io --fusemount container:fuse-overlayfs -o lowerdir=/cvmfs_ro/dev.eessi.io -o upperdir=/tmp/dev.eessi.io/overlay-upper -o workdir=/tmp/dev.eessi.io/overlay-work /cvmfs/dev.eessi.io --fusemount container:cvmfs2 software.eessi.io /cvmfs/software.eessi.io /tmp/bot/EESSI/eessi.7nv7ZDhC4r/ghcr.io_eessi_build_node_debian11.sif /project/60006/SHARED/jobs/2024.11/pr_7/event_e1610170-9de4-11ef-9b92-6e7c0f49e560/run_000/linux_x86_64_amd_zen2/dev.eessi.io/software-layer/create_tarball.sh /tmp 2023.06 x86_64/amd/zen2  ESPResSo /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-1731080674.tar.gz
INFO:    Environment variable SINGULARITY_BIND is set, but APPTAINER_BIND is preferred
INFO:    Environment variable SINGULARITY_HOME is set, but APPTAINER_HOME is preferred
INFO:    Environment variable SINGULARITY_TMPDIR is set, but APPTAINER_TMPDIR is preferred
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: loading Fuse module... >> tmpdir: /tmp/tmp.lLQojo5k9N
Setting Install prefix directory to: 2023.06/ESPResSo/
done
CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... done
done
>> Collecting list of files/directories to include in tarball via /tmp/dev.eessi.io/overlay-upper/versions...
>> Creating tarball /eessi_bot_job/eessi-2023.06-software-linux-x86_64-amd-zen2-1731080674.tar.gz from /cvmfs/dev.eessi.io/versions/...
tar: /tmp/tmp.lLQojo5k9N/files.list.txt: Cannot stat: No such file or directory
tar: Error is not recoverable: exiting now
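
The tar error shows that files.list.txt was never written to the temporary directory. One way to make that failure easier to diagnose is to build the list explicitly from the project subdirectory in the overlay upper dir and fail with a clear message when nothing is there. The Python sketch below only illustrates that idea; it is not how create_tarball.sh works, and the function name and arguments are hypothetical.

```python
import os


def write_files_list(overlay_upper, version, project, tmpdir):
    """Write files.list.txt with paths relative to <overlay_upper>/versions.

    Raises FileNotFoundError if the project subdirectory does not exist,
    instead of leaving tar to fail on a missing list file.
    """
    top = os.path.join(overlay_upper, "versions")
    project_dir = os.path.join(top, version, project)
    if not os.path.isdir(project_dir):
        raise FileNotFoundError(f"nothing installed under {project_dir}")
    list_path = os.path.join(tmpdir, "files.list.txt")
    with open(list_path, "w") as handle:
        for dirpath, dirnames, filenames in os.walk(project_dir):
            dirnames.sort()  # deterministic traversal order
            for name in sorted(filenames):
                rel = os.path.relpath(os.path.join(dirpath, name), top)
                handle.write(rel + "\n")
    return list_path
```

With the values from the log, the arguments would be `/tmp/dev.eessi.io/overlay-upper`, `2023.06`, `ESPResSo`, and the tmpdir printed at startup. If the project directory is missing from the overlay upper dir, that would point at the mount or prefix setup rather than at tar.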
