Sync meeting on EESSI test suite (2024 07 25)
Caspar van Leeuwen edited this page Jul 25, 2024 · 2 revisions
- every 2 weeks on Thursday at 14:00 CE(S)T
- next meetings:
  - Thu 8 Aug'24 14:00 CEST (Caspar, Satish, Sam (maybe), Lara/Kenneth?)
Attending: Sam Moors, Caspar van Leeuwen
- Caspar worked on a tutorial for writing a portable test
  - Based on the `mpi4py` all-reduce example
  - Substantial progress, but not finished yet. Now at the stage where we have a standard ReFrame test and discuss the steps to make it portable.
- Three releases (0.3.0, 0.3.1, 0.3.2) by the end of June, used for the deliverable in MultiXscale
  - Still todo: update docs:
    - Describe ESPResSo test cases (Satish)
    - Update tag names (Satish)
    - Add small section on debugging if a test doesn't succeed
      - Where to find the full logs
      - How to run manually
- Apply memory limits using memory hook for all tests
  - Caspar will go through the tests and update them where needed
  - Suggestion: run `top` and dump the info to figure out the maximum memory usage, e.g. `for i in {1..4}; do sleep 0.1 && top -b -n1 | grep "MiB Mem" ; done > cron.txt` or, for a specific process, `for i in {1..4}; do sleep 0.1 && top -b -n1 -p <pid>; done > cron.txt`. More info: https://www.tecmint.com/save-top-command-output-to-a-file/
  - Suggestion 2: get it directly from `/proc/<pid>/status`
  - Suggestion 3: get the maximum usage from the cgroup at the end of the job: `cat /sys/fs/cgroup/memory/$(</proc/self/cpuset)/memory.max_usage_in_bytes`
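A minimal sketch of Suggestion 2, not something that was agreed on in the meeting: on Linux, `/proc/<pid>/status` contains a `VmHWM` line with the peak resident set size ("high water mark") of the process, which could serve as an estimate of its maximum memory usage. The helper names `parse_peak_rss_kib` and `peak_rss_kib` are hypothetical and chosen for illustration only.

```python
import os
import re


def parse_peak_rss_kib(status_text):
    """Return the VmHWM value (peak resident set size, in kiB) or None.

    `status_text` is the full content of a /proc/<pid>/status file.
    None is returned if no VmHWM line is present (e.g. kernel threads).
    """
    match = re.search(r"^VmHWM:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def peak_rss_kib(pid="self"):
    """Read /proc/<pid>/status (Linux only) and return the peak RSS in kiB."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_peak_rss_kib(handle.read())


if __name__ == "__main__":
    # Only attempt the read where procfs is available.
    if os.path.exists("/proc/self/status"):
        print(f"Peak RSS of this process: {peak_rss_kib()} kiB")
```

Note that this reports the peak for a single process only; for a multi-process job the per-process values would have to be collected separately, which is where the cgroup-based Suggestion 3 may be more convenient.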
- OpenFOAM test
  - Satish has a test that works, but no ReFrame test => No progress
- Merged PRs
- Open PRs
  - PyTorch: Caspar still needs to set `OMP_NUM_THREADS`, then Sam will look at it again
  - CP2K:
    - OOM on Snellius for the 1/8th node test (16 cores). Caspar will rerun to see if it is consistent; if so, he will try increasing the memory request until it succeeds.
    - Caspar will rerun on Karolina to see if the failures on 16 nodes are consistent
  - LAMMPS:
    - Seems to be ready for review, but Lara isn't here to check
    - Caspar will try to run it on Snellius / Karolina, maybe Vega
    - Sam will try to have a look too