
Sync meeting on EESSI test suite (2024 05 23)

Kenneth Hoste edited this page Jun 6, 2024 · 1 revision

EESSI test suite sync meetings

Planning

  • every 2 weeks on Thursday at 14:00 CE(S)T
  • next meetings:
    • Thu 13 June 2024 13:00 CEST
    • Thu 27 June 14:00 CEST (excused: Sam)

Meeting (2024-05-23)

Attending: Kenneth Hoste, Samuel Moors, Lara Peeters, Xin An, Caspar van Leeuwen

  • start fleshing out structure for "Best practices" part in test suite docs on writing tests

    • Had a discussion on this during the meeting, updated https://hackmd.io/mFovCXgSSDmF_Aybt47DOg
    • Key decision: create the API documentation first (auto-generated). Then, create a tutorial that shows how to write a portable mpi4py-based "hello world" test
  • OpenFOAM test

    • Are we dropping this (for now)? (i.e. prioritize other tests) If so, should we at least make a PR of what's there, or isn't there anything yet?
    • Ask Satish in the next meeting (maybe make a PR and close it, even if unfinished)
  • Problems with TensorFlow test during demo at EUM'24

    • What were the problems? :) => Mixup by Kenneth with his own module env, not a problem of the test suite
  • Open PRs

    • ESPResSo test #144
      • Review from Jean-Noel. Does this cover the questions from the previous meeting?
      • What else is needed to complete this PR?
      • Xin met with Satish; some syntax polishing is needed. Xin tested it, but it failed in the sanity step.
      • Kenneth: there is a tarball that needs to be cleaned up
      • Lara will test it on Hortense
    • LAMMPS #131
      • "Check with Tilen if we can implement a sanity check that checks scientific correctness of the result" <= Does this still need to be done? => Yes, still needs to be done
      • Will use a hook from #133, so that needs to be merged first.
    • QuantumESPRESSO #128
      • Caspar: implemented a hook to request memory from the scheduler in this PR. Supports SLURM (tested) and PBS (untested). Should be easily reusable in other tests.
      • Only thing needed from @crivella is to call the hook with the correct memory requirement for these QE test cases. Right now, I requested 4 GB + 0.9 GB/task, which seems to be enough for the largest use case. But can probably be tightened, at least for the smaller use cases. (no big deal: most people have this amount of memory anyway)
    • CP2K #133
      • Tested by Lara, some failures, probably because of OOM => could use this PR
      • Time limit should be increased for the largest test case / smallest core count
      • Should we separate out the generic part of this PR so it can be used in the CP2K PR already?
    • PyTorch #130
      • Ready for review => Sam will look at it
      • Requires a local torchvision or PyTorch-bundle module, as this is not in EESSI yet (WIP, this PR)
  • We should switch CI workflow to use software.eessi.io

  • OSU collectives

    • Sam will give them another test, because they were merged without testing
  • Next step / 0.3.0 milestone, what do we want in here?

    • For sure:
      • Request memory hook, from this PR
      • ESPResSo
      • Update docs:
        • new tag names
        • cover new tests in the docs
      • Do release on PyPI
      • Apply memory limits using memory hook for all tests
    • Optionally:
      • QE and PyTorch, which are probably close enough to being done (PyTorch depends on whether things work on other systems...)
      • LAMMPS/CP2K?
      • Other features we really 'need'?
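
As a rough illustration of the memory request mentioned under QuantumESPRESSO #128 (4 GB + 0.9 GB/task), the arithmetic can be sketched as below. This is only a sketch: the function name `qe_memory_request_gb` and its parameters are hypothetical and are not part of the test suite or of the hook in that PR, and it assumes the per-task term scales with the number of tasks the request covers.

```python
def qe_memory_request_gb(num_tasks, base_gb=4.0, per_task_gb=0.9):
    """Return the memory (in GB) to request for num_tasks tasks.

    Hypothetical helper: the defaults mirror the "4 GB + 0.9 GB/task"
    figure from the notes; it is not the actual hook from the PR.
    """
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    return base_gb + per_task_gb * num_tasks


if __name__ == "__main__":
    # Print the request for a few illustrative task counts.
    for tasks in (1, 16, 128):
        print(f"{tasks:4d} tasks: {qe_memory_request_gb(tasks):.1f} GB")
```

Tightening the request for the smaller use cases, as suggested in the notes, would amount to passing smaller `base_gb` or `per_task_gb` values per test case.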

Previous meetings
