Recipe test workflow proposal #2723

ehogan · 2022-07-12T16:36:51Z

ehogan
Jul 12, 2022
Collaborator

Hi @ESMValGroup/esmvaltool-developmentteam 👋

We (at the Met Office) have been thinking about whether we can bring some of the best techniques from our model development working practices to ESMValTool. One of these is a nightly test workflow that helps to identify issues very soon after merging branches. The new comparison tool developed by @bouweandela could enable us to implement something similar.

At the moment, all the recipes are run during the ESMValTool release process. This can leave the release manager with many issues to deal with at a late stage. Running the recipes nightly would identify issues sooner, making the release process less painful.

I have gathered from comments on various GitHub issues and PRs that, during a release, the recipes are run on DKRZ. However, I am unaware of the specific details (I was unable to find any release-related documentation), so I don't know how the recipes (and comparisons) are run currently; would it be possible for someone to provide details about this, please?

Our proposal:

- Would it be worth us (i.e. the Met Office) creating a workflow (written using Rose and Cylc) that automatically runs recipes nightly and compares the outputs?
- The workflow would, by default, run on JASMIN (where the data are located).
- The workflow could be run at other institutions.
- The workflow would be located in a new ESMValGroup repository (nightly-recipe-tests?).
- The workflow would only run when main has changed.

Some considerations / questions:

- Is it possible to check whether just the required input data exist using ESMValTool?
- It should be possible to (optionally) check in the workflow whether the outputs are CF compliant (if this is a requirement for the recipe).
- What resources would be used on JASMIN if all ESMValTool recipes are run nightly? Is there a resource limit on JASMIN? Perhaps the most expensive recipes could be run weekly? Or we could cycle through recipes in groups so that all recipes are tested, e.g., every 3 days.
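
To make the "cycle through recipes in groups" idea concrete, here is a minimal sketch. The function name, the 3-day cycle length and the recipe names are assumptions made for illustration; nothing like this exists in ESMValTool yet.

```python
from datetime import date


def recipes_for_day(recipes, day, cycle_length=3):
    """Return the recipes scheduled for `day` in a rotating cycle.

    Recipes are sorted so the assignment is stable between runs, then
    dealt round-robin into `cycle_length` groups; one group runs per day.
    """
    slot = day.toordinal() % cycle_length
    return [
        recipe
        for index, recipe in enumerate(sorted(recipes))
        if index % cycle_length == slot
    ]


if __name__ == "__main__":
    # Placeholder recipe names, for illustration only.
    recipes = ["recipe_a.yml", "recipe_b.yml", "recipe_c.yml", "recipe_d.yml"]
    print(recipes_for_day(recipes, date.today()))
```

Because the slot is derived from the calendar date, any `cycle_length` consecutive days cover every recipe exactly once, and no state has to be kept between runs.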

Thoughts? :)

zklaus · 2022-07-13T08:44:50Z

zklaus
Jul 13, 2022
Maintainer

Hi @ehogan, that sounds great!
You may use the cylc suite that @bouweandela created as a starting point.

Running such a thing every night might be a bit costly, which is why we started collecting a reduced set of simplified recipes for this purpose that should still cover most of the ESMValCore, but perhaps a weekly or bi-weekly run of all recipes would be good? Regardless of frequency and exact recipe selection, automatic comparison and reporting would certainly be valuable and we can probably learn a lot from the application of the new comparison in this release cycle by @sloosvel and others.

I also like the added CF compliance check. Of course we will not reject anything for not being compliant right now, but it may help to nudge us in the right direction.

0 replies

remi-kazeroni · 2022-07-13T14:42:45Z

remi-kazeroni
Jul 13, 2022
Maintainer

Hi @ehogan, thanks for the very interesting suggestion! That would be very helpful to reduce the workload for the release manager and further simplify the release process. Here are my thoughts:

We do indeed have a Cylc suite to run all recipes, which was implemented by @bouweandela. For previous releases, this Cylc suite was used to run all recipes at DKRZ a few times (once per release candidate for the Core) and the outputs are displayed on this website. For the upcoming release (v2.6), things got a bit trickier because the new DKRZ machine (Levante) does not (yet?) provide a Cylc module, so I think it was troublesome for @sloosvel to run all recipes. I guess running all recipes daily or even weekly would require too many resources (see resources needed for the v2.5 release). At DKRZ we would not have enough resources in our compute project to run all recipes more than ~once per month. As an alternative, we could start thinking of a way to classify recipes depending on how many resources are needed to run them:

- A large fraction of recipes could be run with one core in less than 10 minutes, and those could indeed be tested very frequently (~daily).
- The recipes that typically take 1-2 hours could be run every 1-2 weeks.
- For the dozen or so recipes requiring lots of resources (>4 hours, large RAM), I doubt it would be realistic to run them more than once per release.

One difficulty is that, by default, the user of the Cylc suite needs to request the same amount of resources for all recipes, resulting in crashes for the memory-intensive ones and probably too many resources used for the "fast recipes". I think it would be helpful to break the set of recipes into groups based on their computational footprint and have separate Cylc suites for each group.
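
A rough sketch of how such a grouping could be computed from measured run costs. The `RecipeCost` class, the `high_memory` flag and the exact thresholds are assumptions: the categories above leave gaps (10 minutes to 1 hour, and 2 to 4 hours), which this sketch assigns to the middle group.

```python
from dataclasses import dataclass


@dataclass
class RecipeCost:
    """Measured cost of one recipe run (hypothetical record layout)."""

    name: str
    runtime_minutes: float
    cores: int = 1
    high_memory: bool = False  # "large RAM" is not quantified in the thread


def classify(cost):
    """Assign a recipe to the 'fast', 'medium' or 'slow' group."""
    if cost.high_memory or cost.runtime_minutes > 240:
        return "slow"  # >4 hours or memory intensive
    if cost.cores == 1 and cost.runtime_minutes < 10:
        return "fast"  # one core, under 10 minutes
    return "medium"  # everything in between (assumption)


def group_recipes(costs):
    """Return a mapping from group name to the recipe names in it."""
    groups = {"fast": [], "medium": [], "slow": []}
    for cost in costs:
        groups[classify(cost)].append(cost.name)
    return groups
```

Each group could then get its own resource request and its own testing frequency.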

Also, I'm not too sure how many recipes could actually be run on Jasmin given that we don't have Tier3 data there and these are used in about half of the recipes (56/125).

Regarding the comparison tool developed by @bouweandela and used for the first time as part of this release, it would be very helpful to automate the comparison as much as possible. Otherwise, the manual inspection of the differences is done by the release manager and this can be very time consuming (see #2704). Once the comparison tool is fully optimized (see #2708), we could take some further steps by:

- defining reference output against which new recipe runs would be compared
- including the comparison in the Cylc suite to run recipes
- having an automated mechanism to notify the developers when recipe outputs have changed between 2 test runs or 2 releases.

You may also want to have a look at the summary of the testing recipe workshop from last year. We might have discussed things that would be relevant here, see #2345 and #2346

0 replies

sloosvel · 2022-07-13T16:37:32Z

sloosvel
Jul 13, 2022
Maintainer

> We do indeed have a Cylc suite to run all recipes, which was implemented by @bouweandela. For previous releases, this Cylc suite was used to run all recipes at DKRZ a few times (once per release candidate for the Core) and the outputs are displayed on this website. For the upcoming release (v2.6), things got a bit trickier because the new DKRZ machine (Levante) does not (yet?) provide a Cylc module, so I think it was troublesome for @sloosvel to run all recipes.

I wrote some python code to create a submission script for each recipe and launch them one after the other, taking care of the resources that some recipes need. I was planning on adding it in utils. It's nothing fancy, but using cylc to launch one job after the other, with no dependencies between jobs, is a bit of an overkill anyway and I did not feel like installing the tool myself either.
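
For readers wondering what such a script might look like, here is a minimal sketch. It is not the code described above: it assumes a Slurm scheduler, and the default resource values, the function names and the example recipe name are all made up for illustration. Only `esmvaltool run <recipe>` and the `#SBATCH` options are real.

```python
from pathlib import Path

# Assumed defaults; recipes that need more override individual keys.
DEFAULT_RESOURCES = {"time": "01:00:00", "mem": "8G", "cpus": 1}


def render_job_script(recipe, resources=None):
    """Return the text of a Slurm batch script that runs one recipe."""
    res = {**DEFAULT_RESOURCES, **(resources or {})}
    name = Path(recipe).stem
    return "\n".join(
        [
            "#!/bin/bash",
            f"#SBATCH --job-name={name}",
            f"#SBATCH --time={res['time']}",
            f"#SBATCH --mem={res['mem']}",
            f"#SBATCH --cpus-per-task={res['cpus']}",
            f"#SBATCH --output={name}.%j.out",
            "",
            f"esmvaltool run {recipe}",
            "",
        ]
    )


def write_job_scripts(recipe_resources, outdir):
    """Write one batch script per recipe and return the script paths."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    paths = []
    for recipe, resources in recipe_resources.items():
        path = outdir / f"{Path(recipe).stem}.sh"
        path.write_text(render_job_script(recipe, resources))
        paths.append(path)
    return paths
```

For example, `write_job_scripts({"recipe_example.yml": {"mem": "64G"}}, "jobs")` would write `jobs/recipe_example.sh`, which could then be submitted with `sbatch`.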

0 replies

bouweandela · 2022-07-15T06:11:29Z

bouweandela
Jul 15, 2022
Maintainer

@ehogan I transferred this to the ESMValTool repository, because I think it will be a bit too noisy for the Community repository (see here for the type of discussions we hope to have in the community repo).

1 reply

ehogan Jul 15, 2022
Collaborator Author

Many thanks @bouweandela, and for the link; apologies for not posting in the correct place the first time! :)

bouweandela · 2022-07-15T06:29:25Z

bouweandela
Jul 15, 2022
Maintainer

> I was unable to find any release-related documentation

The documentation for making a release is here:

but note #2623.

> Is it possible to check whether just the required input data exist using ESMValTool?

Not without making (minor) modifications to the code

Some more related effort: ESMValGroup/ESMValCore#1636.

0 replies

ehogan · 2022-07-27T11:34:31Z

ehogan
Jul 27, 2022
Collaborator Author

Thanks everyone for your input! 🥳 To summarise the requirements:

The workflow must:

- be portable
  - we will only be able to test the workflow on JASMIN and at the Met Office, as those are the systems we currently have access to
- use the latest (main) versions of ESMValTool and ESMValCore
  - but the recipes would only run if the versions of ESMValTool or ESMValCore had changed since the last time the recipes were run
- have the ability to run recipes based on their duration (recipes that complete quickly can be run more frequently)
  - proposed groupings:
    - fast (tested ~daily): recipes that run with one core in less than 10 minutes
    - medium (tested ~weekly): recipes that typically take 1-2 hours
    - slow (tested ~per release): recipes requiring lots of resources (>4 hours, large RAM)
- ensure the appropriate resources are allocated based on the requirements of each recipe (as detailed on, e.g. the debug page)
- perform automatic comparison of the recipe outputs with Known Good Outputs (KGOs) using the comparison tool developed by @bouweandela
- send e-mails to notify the developers when a comparison fails
- optionally check the recipe outputs for compliance with the CF conventions
- report the results (this can be achieved by Cylc review)
- use Rose 2 and Cylc 8
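
The "only run if the versions have changed" requirement could be sketched as follows, assuming "version" means the commit at the head of `main` in each repository. The state file name and the function names are hypothetical; `git ls-remote` is used so that no local clone is needed.

```python
import json
import subprocess
from pathlib import Path

# Public repository URLs (not given in this thread; stated as an assumption).
REPOS = {
    "ESMValTool": "https://github.com/ESMValGroup/ESMValTool.git",
    "ESMValCore": "https://github.com/ESMValGroup/ESMValCore.git",
}


def remote_head(url, branch="main"):
    """Return the commit hash at the tip of `branch` on the remote."""
    result = subprocess.run(
        ["git", "ls-remote", url, f"refs/heads/{branch}"],
        check=True,
        capture_output=True,
        text=True,
    )
    fields = result.stdout.split()
    if not fields:
        raise RuntimeError(f"branch {branch!r} not found at {url}")
    return fields[0]


def has_changed(previous, current):
    """True if any repository head differs from the recorded one."""
    return any(previous.get(name) != sha for name, sha in current.items())


def load_state(path):
    """Read the recorded heads, or return {} if nothing was recorded."""
    path = Path(path)
    return json.loads(path.read_text()) if path.exists() else {}


def save_state(path, state):
    """Record the heads that were tested."""
    Path(path).write_text(json.dumps(state, indent=2))


if __name__ == "__main__":
    state_file = "last_tested.json"  # hypothetical location
    current = {name: remote_head(url) for name, url in REPOS.items()}
    if has_changed(load_state(state_file), current):
        print("main has moved: run the recipes")
        save_state(state_file, current)
    else:
        print("nothing to do")
```

In a real workflow the state would more sensibly be saved after the recipe runs have finished, so that a failed cycle is retried the next night.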

Considerations:

- There is already a Cylc suite available
- JASMIN does not store Tier 3 data, which is used in 56 of the 125 recipes

Outstanding questions:

- Does anyone object to us creating a new repository (called nightly-recipe-tests) in the ESMValGroup organisation for this work?
- Where should we store the KGOs? We can start by storing them locally, but we might want to consider adding them to a repository (perhaps Git LFS would be useful here?)

If everyone is happy with this we can start with a prototype that runs a "quick" recipe, then demo it to you all before continuing? What do you think? :)

0 replies

valeriupredoi · 2022-07-27T11:48:57Z

valeriupredoi
Jul 27, 2022
Maintainer

Hey guys! I've not managed to reply in time here - my apologies, things went under the proverbial rug. Emma, good initiative! Here's my 2c: only run a certain recipe on a certain night if and only if any of the preprocessors that that recipe runs have changed the previous day, or if the recipe itself has changed (doh!). Otherwise it's a massive waste of resources. You can gather that info by examining git logs with a script (I think I can help with that). You may say that configuration/data finding changes may impact that recipe too, and one should run it if those change too - but those apply to all the other runs, so if there's an issue, then surely it'd pop up elsewhere. What do you think?
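
A minimal sketch of what such a script could do. The dependency mapping (which source files each recipe relies on) is hypothetical and would have to be derived separately, for instance from the preprocessor functions each recipe uses; the file paths below are illustrative only, and the changed files would need to be collected from both the ESMValTool and ESMValCore repositories.

```python
import subprocess


def changed_files(repo_dir, since="1 day ago", branch="main"):
    """List files touched by commits on `branch` since the given time."""
    result = subprocess.run(
        [
            "git", "-C", repo_dir, "log",
            f"--since={since}", "--name-only", "--pretty=format:", branch,
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})


def select_recipes(changed, recipe_dependencies):
    """Return recipes that changed themselves or depend on a changed file.

    `recipe_dependencies` maps a recipe path to the set of source files
    it relies on (a hypothetical mapping, not something ESMValTool provides).
    """
    changed = set(changed)
    return sorted(
        recipe
        for recipe, deps in recipe_dependencies.items()
        if recipe in changed or changed & set(deps)
    )


if __name__ == "__main__":
    # Illustrative paths only.
    dependencies = {
        "esmvaltool/recipes/recipe_a.yml": {"esmvalcore/preprocessor/_regrid.py"},
        "esmvaltool/recipes/recipe_b.yml": set(),
    }
    print(select_recipes(changed_files("."), dependencies))
```

A selection like this only catches dependencies that are listed in the mapping, so it would not replace periodic runs of everything.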

2 replies

bettina-gier Jul 27, 2022
Collaborator

But isn't part of why we do this to also spot errors we might not expect? Connections we don't think about/forgot or didn't even know existed? It's happened before that something on first glance unrelated introduced a problem elsewhere. The daily recipes are short exactly so that it's not a super massive waste of resources, no?
Though I do like the idea of prioritizing when we know something that influences a certain recipe has changed, but I've got no concrete idea how to go about this.

valeriupredoi Jul 27, 2022
Maintainer

sure, I'm only talking about nightly runs of massive full throttle recipes - running those is a huge waste of resources if we don't have a solid reason to run them on that particular night

bouweandela · 2022-07-28T08:24:35Z

bouweandela
Jul 28, 2022
Maintainer

> we will only be able to test the workflow on JASMIN and at the Met Office, as those are the systems we currently have access to

If you need an account at DKRZ, this can be arranged

0 replies

bouweandela · 2022-07-28T08:28:03Z

bouweandela
Jul 28, 2022
Maintainer

> Where should we store the KGOs? We can start by storing them locally, but we might want to consider adding them to a repository (perhaps Git LFS would be useful here?)

At the moment these are stored on the virtual machine we have at DKRZ, but a backed-up publicly accessible repository would be much better indeed.

1 reply

zklaus Jul 28, 2022
Maintainer

I like the use of Git LFS for this, though be aware that the GitHub-hosted version has a limit of 1 GB on both storage and monthly bandwidth at the free tier, so we would likely have to either buy more or find a different hosting option that would need to be configured.

ehogan · 2022-08-03T16:21:53Z

ehogan
Aug 3, 2022
Collaborator Author

> Does anyone object to us creating a new repository (called nightly-recipe-tests) in the ESMValGroup organisation for this work?

I have been thinking a bit more about the location of the recipe test workflow.

I acknowledge the current Cylc suite is located in the ESMValTool repository, but since the proposed workflow would install the latest versions of ESMValCore and ESMValTool, I'm not sure it makes sense to check out ESMValTool to get a copy of the workflow, which, when run, would itself check out and install the latest version of ESMValTool! This is what is making me lean towards creating a separate repository for the recipe test workflow. Or am I overthinking this? Would the community prefer the workflow to be located within the ESMValTool repository?

One other possible concern with locating the workflow in its own repository is that the documentation would need to be included in the ESMValTool documentation, but I'm guessing this wouldn't be too tricky, given that the ESMValTool documentation already includes documentation located in the ESMValCore repository?

2 replies

bouweandela Aug 4, 2022
Maintainer

Maybe you could start by just creating a subdirectory in esmvaltool/utils for the files you're creating and if you find that it doesn't work for some reason move them to their own repository? It seems unlikely that anyone who doesn't already have a copy of the ESMValTool repo will be using this anyway. Integrating the documentation from multiple repositories is pretty tricky, so if you decide on a separate repository it may still be easiest to include the documentation in the ESMValTool one, just like we did for the ESMValBot.

ehogan Aug 5, 2022
Collaborator Author

Thinking a bit more about this, if we store the KGOs in a repository, we would need to create a new repository (since adding large files to the ESMValTool repository would not be a good idea). Then, it might make sense to store the workflow in the same repository. Until then, your suggestion of adding the workflow to the ESMValTool repository is a good starting point. Thanks! :)
