Sync meeting on EESSI test suite (2024 05 23)

EESSI test suite sync meetings

Planning

every 2 weeks on Thursday at 14:00 CE(S)T
next meetings:
- Thu 13 June 2024 13:00 CEST
- Thu 27 June 14:00 CEST (excused: Sam)

Meeting (2024-05-23)

Attending: Kenneth Hoste, Samuel Moors, Lara Peeters, Xin An, Caspar van Leeuwen

start fleshing out structure for "Best practices" part in test suite docs on writing tests
- Had a discussion on this during the meeting, updated https://hackmd.io/mFovCXgSSDmF_Aybt47DOg
- Key decision: create the API documentation first (auto-generated). Then create a tutorial that shows how to write a portable mpi4py-based "hello world" test.
OpenFOAM test
- Are we dropping this (for now)? (i.e. prioritize other tests) If so, should we at least make a PR of what's there, or isn't there anything yet?
- Ask Satish in the next meeting (maybe make a PR and close it, even if unfinished)
Problems with TensorFlow test during demo at EUM'24
- What were the problems? :) => Mixup by Kenneth with his own module env, not a problem of the test suite
Open PRs
- ESPResSo test #144
  - Review from Jean-Noel. Does this cover the questions from the previous meeting?
  - What else is needed to complete this PR?
  - Xin met with Satish, some syntax polishing is needed. Xin tested it, but it failed in the sanity step.
  - Kenneth: there is a tarball that needs to be cleaned up
  - Lara will test it on Hortense
- LAMMPS #131
  - "Check with Tilen if we can implement a sanity check that checks scientific correctness of the result" <= Does this still need to be done? => Yes, still needs to be done
  - Will use a hook from #133, so that needs to be merged first.
- QuantumESPRESSO #128
  - Caspar: implemented a hook to request memory from the scheduler in this PR. Supports SLURM (tested) and PBS (untested). Should be easily reusable in other tests.
  - Only thing needed from @crivella is to call the hook with the correct memory requirement for these QE test cases. Right now, I requested 4 GB + 0.9 GB/task, which seems to be enough for the largest use case. This can probably be tightened, at least for the smaller use cases. (no big deal: most people have this amount of memory anyway)
- CP2K #133
  - Tested by Lara, some failures, probably because of OOM => could use this PR
  - Time limit should be increased for the largest test case / smallest core count
  - Should we separate out the generic part of this PR so it can be used in the CP2K PR already?
- PyTorch #130
  - Ready for review => Sam will look at it
  - Requires a local torchvision or PyTorch-bundle module, as this is not in EESSI yet (WIP, this PR)
We should switch the CI workflow to use software.eessi.io
- https://github.com/EESSI/test-suite/issues/107
- Kenneth?
OSU collectives
- Sam will give them another test, because they were merged without testing
Next step / 0.3.0 milestone, what do we want in here?
- For sure:
  - Request memory hook, from this PR
  - ESPResSo
  - Update docs:
    - new tag names
    - cover new tests in the docs
  - Do release on PyPI
  - Apply memory limits using memory hook for all tests
- Optionally:
  - QE and PyTorch, which are probably close enough to being done (PyTorch depends on whether things work on other systems...)
  - LAMMPS/CP2K?
  - Other features we really 'need'?
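
To make the memory request discussed for the QuantumESPRESSO PR concrete, here is a minimal Python sketch of the arithmetic (4 GB + 0.9 GB/task) and of how the result might be turned into a scheduler option. This is not the hook from the PR: the function names `required_memory_mib` and `scheduler_memory_option` are hypothetical, 1 GB is treated as 1024 MiB, and the PBS resource string is an assumption, since the exact syntax differs between PBS flavours and sites.

```python
import math


def required_memory_mib(num_tasks, base_gib=4.0, per_task_gib=0.9):
    """Memory to request in MiB: a fixed base plus a per-task amount.

    The defaults mirror the "4 GB + 0.9 GB/task" figure from the notes;
    treating 1 GB as 1024 MiB is an assumption of this sketch.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    return math.ceil((base_gib + per_task_gib * num_tasks) * 1024)


def scheduler_memory_option(scheduler, mem_mib):
    """Format a memory request as a command-line option (hypothetical helper)."""
    if scheduler == "slurm":
        # sbatch/srun accept --mem=<size>[K|M|G|T]
        return f"--mem={mem_mib}M"
    if scheduler == "pbs":
        # Assumed Torque-style resource string; PBS Pro sites may need
        # a select statement instead.
        return f"-l mem={mem_mib}mb"
    raise ValueError(f"unsupported scheduler: {scheduler}")


if __name__ == "__main__":
    mem = required_memory_mib(num_tasks=10)
    print(scheduler_memory_option("slurm", mem))
    print(scheduler_memory_option("pbs", mem))
```

Tightening the request for smaller use cases would then only mean passing different `base_gib` and `per_task_gib` values per test case.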
