AWS meeting 2023 04 13
Kenneth Hoste edited this page Apr 17, 2023
- link to AWS project doc: https://docs.google.com/document/d/1CHG9fCh2LkfJ-EI8J-_Wr5NpHL5iwm8Wu6syfK9h7-c
- next meeting: Thu 11 May 2023, 12:00 UTC
- next meeting: Thu 8 June 2023, 12:00 UTC
- previous meeting, 9 Mar 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-03-09
- previous meeting, 11 Jan 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-01-11
- status update on sponsored credits
- see slide 14 in https://raw.githubusercontent.com/EESSI/meetings/main/meetings/EESSI_meeting_20230406.pdf
- ~$2.6k credits consumed in Mar'23
- significant increase compared to previous months
- remaining credits (~$7.1k) should be sufficient until end of July'23 (at the current burn rate)
- budget alerts are set at $1k and $2k per month
- to be reviewed in May, planning for refresh in June
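The credit figures above lend themselves to a quick runway check. This is a minimal sketch: the function names are made up for illustration, and using the Mar'23 consumption as the monthly burn rate is an assumption, since the notes do not say which rate the end-of-July estimate is based on.

```python
def months_of_runway(remaining_usd: float, monthly_burn_usd: float) -> float:
    """Return how many months the remaining credits last at a constant burn rate."""
    if monthly_burn_usd <= 0:
        raise ValueError("monthly burn must be positive")
    return remaining_usd / monthly_burn_usd


def triggered_alerts(spent_this_month_usd: float, thresholds_usd=(1000, 2000)):
    """Return the monthly budget alert thresholds that have been reached."""
    return [t for t in thresholds_usd if spent_this_month_usd >= t]


# Figures from the notes: ~$7.1k remaining, ~$2.6k consumed in Mar'23.
print(round(months_of_runway(7100, 2600), 1))
print(triggered_alerts(2600))
```

At the March rate the credits would last a bit under three months; a lower average monthly rate stretches that further.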
- MultiXscale EuroHPC project is now up and running
- kickoff meeting was on 20-23 March'23
- see https://www.multixscale.eu/wp-content/uploads/2023/04/MultiXscale-Kick-off-meeting_Press-Release_vf.pdf
- poster at EuroHPC Summit: https://www.multixscale.eu/wp-content/uploads/2023/03/32-Poster-MultiXscale.pdf
- which questions were raised by people visiting the poster?
- CASTIEL2 coordination project for EuroHPC CoEs and NCCs
- primary goals for EESSI in coming weeks/months
- new EESSI pilot version (2023.04)
- slowly getting past problems that presented themselves in bootstrapping Gentoo Prefix
- start using build-and-deploy bot for all software builds (software layer + compat layer)
- extend software stack
- more recent toolchains
- apps relevant for MultiXscale (ESPResSo, waLBerla, LAMMPS), and other EuroHPC CoEs (GROMACS, OpenFOAM, ...)
- eye catchers like AlphaFold, OpenFold, ...
- NVIDIA GPU support
- see https://github.com/EESSI/software-layer/pull/172
- the GPU driver version available on a system is not known in advance, so CUDA compat libraries may be needed
- compat libraries are installed into /opt/eessi, which gets symlinked into EESSI software stack
- write permissions to /opt/eessi needed
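The two checks implied above (is the host driver too old, and is /opt/eessi writable) could look roughly like this. It is only a sketch: the function names are hypothetical, and the minimum driver version a given CUDA release needs has to be supplied by the caller, since it is not stated in these notes.

```python
import os


def parse_version(version: str) -> tuple:
    """Turn a dotted version string like '525.85.12' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def needs_cuda_compat(driver_version: str, min_driver_version: str) -> bool:
    """True if the host driver is older than the minimum the CUDA runtime needs."""
    return parse_version(driver_version) < parse_version(min_driver_version)


def can_install_compat(target_dir: str = "/opt/eessi") -> bool:
    """True if the target directory exists and the current user may write to it."""
    return os.path.isdir(target_dir) and os.access(target_dir, os.W_OK)
```

Both version strings are inputs here on purpose: how the driver version is detected on a given system is left open.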
- EESSI test suite
- https://github.com/EESSI/test-suite, using ReFrame
- focus on portability of test suite + performance
- currently only a GROMACS test, which will serve as a blueprint for other tests
- working on tests for OSU Microbenchmarks, TensorFlow, ...
- Add support for customising EESSI initialisation (including enabling tracking support)
- different variants of EESSI init script can be provided
- sites (like AWS) providing EESSI can opt in to using an init script that does usage tracking (by leveraging the "variant symlink" feature in CVMFS)
- /cvmfs/pilot.eessi-hpc.org/latest/init/bash -> /cvmfs/pilot.eessi-hpc.org/latest/init/bash_tracking_aws
- AWS-specific init script can also set additional environment variables required in the AWS environment (for example, related to EFA)
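To illustrate how a variant symlink lets each site pick its own init script, here is a small emulation of the client-side resolution. It is not CernVM-FS code: it assumes the `$(NAME)` / `$(NAME:-default)` form for symlink targets, and the variable name `EESSI_INIT_VARIANT` and the default `bash_default` are hypothetical.

```python
import re

# Matches $(NAME) or $(NAME:-default) inside a symlink target.
_VARIANT = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)(?::-([^)]*))?\)")


def resolve_variant_symlink(target: str, client_config: dict) -> str:
    """Expand variables in a symlink target using client-side settings."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        # Fall back to the default from the target, or to an empty string.
        return client_config.get(name, default if default is not None else "")
    return _VARIANT.sub(substitute, target)


# Hypothetical variable name; the default keeps the regular init script.
target = "$(EESSI_INIT_VARIANT:-bash_default)"
print(resolve_variant_symlink(target, {}))
print(resolve_variant_symlink(target, {"EESSI_INIT_VARIANT": "bash_tracking_aws"}))
```

A site that sets nothing keeps the default script, while a site that sets the variable in its client configuration ends up at the tracking variant, without any change to the repository contents.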
- tutorials
- EESSI introductory tutorial at HPC Knowledge Meeting 2023 in Barcelona (17-18 May 2023 - https://hpckp.org/annual-meeting)
- "CernVM-FS Best Practices" online tutorial
- fall 2023
- in collaboration with CernVM-FS developers
- focus on use of CernVM-FS on HPC systems
- we hope that the EESSI configuration will be included in a CernVM-FS release by then (one fewer step to deploy EESSI)
- ISC'23
- AWS presence?
- workshop/tutorial on Sunday: Best Practices for HPC in the Cloud
- maybe mention EESSI alongside Spack w.r.t. software deployment?
- HPC Tech Shorts interviews
- example interview: https://youtu.be/o_noqdK5Sdc
- get Brendan in touch with Eli (HPCNow!) to see if an on-floor interview is possible
- "How EESSI helped us solve X"
- obvious call to action at the end
- EESSI as streaming software pitch (a la Spotify/Netflix)
- walk-in meeting rooms (scheduled)
- additional opportunities for EESSI promotion?
- booth talk at MS Azure booth
- MultiXscale will have presence in EuroHPC booth (poster, presentation time)
- Q&A
- container images
- container images to archive deprecated EESSI versions
- service to create container images on-demand?
- see container image for accessing EESSI
- can also create a squashfs file from CernVM-FS and mount that in container
- think about long-lived clusters vs short-lived (AWS Batch), persistent storage, ...
- need to check how much data is being pulled in when using EESSI pilot container (incl. container image itself)
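One way to approach the "how much data is pulled in" question is to measure the size of the local cache directory before and after a run. This is a rough sketch: the function names are hypothetical, the cache location depends on how the container is set up and is therefore passed in as a parameter, and the size of the container image itself would have to be added separately.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def pulled_bytes(size_before: int, size_after: int) -> int:
    """Data added to the cache between two measurements (never negative)."""
    return max(size_after - size_before, 0)
```

Usage would be: record `directory_size_bytes(cache_dir)` on a fresh cache, run the workload, measure again, and pass both values to `pulled_bytes`.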
- ParallelCluster repositories for accessing software modules
- incl. Spack, could also cover EESSI
- similar to what is done for Cluster-in-the-Cloud, azhop (Azure), Magic Castle (https://github.com/ComputeCanada/magic_castle)