meeting Oct 7 2021

Bob Dröge edited this page Oct 7, 2021 · 3 revisions

Notes for 20211007 meeting

  • date & time: Thu October 7th 2021 - 2pm CEST (12:00 UTC)
    • (every first Thursday of the month)
  • venue: (online, see mail for meeting link, or ask in Slack)
  • agenda:
    • Quick introduction by new people
    • EESSI-related meetings in last month
    • Progress update per EESSI layer
    • 2021.06 version of pilot repository
    • ReFrame updates
    • Infrastructure status and updates
    • AWS/Azure sponsorship update
    • Update on EESSI journal paper + S4 NeIC project proposal
    • Upcoming events
    • Q&A

Slides

Meeting notes

(by Kenneth, Bob)

Quick introduction by new people

  • Vasileios Karakasis (CSCS)
    • ReFrame lead developer, will share some updates on ReFrame
  • Jean-Guillaume Piccinali (CSCS)

EESSI-related meetings in last month

  • Next CernVM-FS workshop in Amsterdam! (Sept 12-14 2022)

Progress update per EESSI layer

Filesystem layer

  • Two additional Stratum-1 mirror servers: in AWS (eu-west) + Azure (us-east)
  • Alan: Would using a runner in an Arm VM help?
    • Yes, would avoid having to use QEMU for building Arm images
    • Still a problem for POWER: the GitHub runner can't run natively on POWER (because it's implemented in .NET)
    • Could consider implementing our own GitHub App for this (but that's probably a significant effort)
  • Jörg: building Debian packages and making them installable should be possible with a reasonable amount of effort
    • But the hope is that the CernVM-FS developers are willing to provide additional packages like this
  • good progress on automatic ingestion of tarball with additional software installations for EESSI repository
    • workflow:
      • tarball is uploaded to S3 bucket
      • Stratum-0 notices new tarball, opens PR to EESSI/staging repo to request appr

2021.06 version of pilot repository

  • GPU support
    • Getting the meeting set up to get approval for shipping CUDA installations is taking forever...
    • Maybe we need to switch tactics, and get things working first
      • provide CUDA-aware installations of TensorFlow/GROMACS/...
      • provide a script to easily install missing CUDA components (+ check GPU driver version)
      • leverage host_injections to let sites provide the CUDA runtime libraries and let EESSI use them
      • explain to NVIDIA why this setup is not ideal, and why approval for shipping CUDA installations in EESSI could avoid it
      • maybe also look into supporting AMD GPUs (ROCm is an open source software stack)...
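
The driver version check mentioned above could look roughly like the sketch below. This is only an illustration, not an agreed EESSI script: the function names and the minimum version are hypothetical placeholders, and it assumes `nvidia-smi` is available on the host and reports a purely numeric dotted version.

```python
import shutil
import subprocess


def parse_version(text):
    """Turn a dotted version string such as '470.57.02' into a tuple of ints.

    Raises ValueError if a component is not numeric.
    """
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(found, minimum):
    """Return True if the found driver version is at least the minimum."""
    return parse_version(found) >= parse_version(minimum)


def query_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi prints one line per GPU; use the first one.
    return result.stdout.splitlines()[0].strip()


if __name__ == "__main__":
    # Placeholder value for illustration only; the real minimum depends on
    # the CUDA version the installations were built against.
    minimum = "450.80.02"
    found = query_driver_version()
    if found is None:
        print("No NVIDIA driver detected (nvidia-smi not found or failed)")
    elif driver_is_sufficient(found, minimum):
        print(f"Driver {found} is recent enough (minimum {minimum})")
    else:
        print(f"Driver {found} is too old (minimum {minimum})")
```

Comparing versions as tuples of integers avoids the pitfalls of plain string comparison (for example, "99.1" sorting after "450.80").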

ReFrame updates

Infrastructure status and updates

AWS/Azure sponsorship update

Update on EESSI journal paper + S4 NeIC project proposal

  • other funding opportunities
    • next NeIC call (Feb-Mar'22)
    • EOSC? service-oriented
      • Thomas: more data processing tools?

Upcoming events

  • CIUK 2021 will be held on December 9-10, both live and remote. Jörg is planning to submit something about how EasyBuild and EESSI can help in getting reproducible, reliable, and "use-anywhere" software stacks for sequencing workflows in COVID-19 research.

Q&A

  • Alan: the Fenix resources are going to expire by the end of the year. We can apply for renewal, but this has to be done by the end of October.
    • In the application we wrote that we would set up a Stratum 1, but this hasn't been done yet. We should try to do this a.s.a.p.; Bob will look into it.
    • Alan ran some benchmarks for the EESSI paper, but they used another account/allocation. We can mention this when we apply for renewal.
    • We can still benefit from these resources for training and (multi-node) testing purposes, especially now that we're in the stage where this becomes even more relevant, so we should try to reapply.