Add new hook to filter invalid scales based on features set in the config file #111

casparvl · 2024-02-01T17:00:44Z

This avoids issues on Snellius GPU, where partially allocating multiple nodes is not allowed by the Slurm configuration.

casparvl · 2024-02-01T17:06:28Z

It's ready to be reviewed/tested, but not merged, since I didn't change all the other configs yet (they should all get a `+ list(SCALES.keys())` on their features). Whoever tests this can add that to the config of their system manually, and/or filter like I did for the Snellius GPU partition to see if that works properly for you.
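As a rough illustration of the config change described above, a partition's `features` list would be extended with every scale name, or with a filtered subset for partitions that cannot run some scales. This is only a sketch: the `SCALES` contents and the partition dictionaries are stand-ins rather than the suite's real definitions, the partition names are hypothetical, and the plain strings `'cpu'` and `'gpu'` stand in for the feature constants.

```python
# Stand-in for the SCALES mapping; the real one is defined in the test suite.
SCALES = {
    '1_cpn_2_nodes': {},
    '1_cpn_4_nodes': {},
    '2_nodes': {},
}

# Default: the partition advertises every scale as a feature.
cpu_partition = {
    'name': 'cpu_partition',  # hypothetical partition name
    'features': ['cpu'] + list(SCALES.keys()),
}

# Filtered: leave out the scales that must not run on this partition.
excluded_scales = ['1_cpn_2_nodes', '1_cpn_4_nodes']
gpu_partition = {
    'name': 'gpu_partition',  # hypothetical partition name
    'features': ['gpu'] + [s for s in SCALES if s not in excluded_scales],
}
```

With this layout, a scale that is absent from a partition's features can be used by a hook to skip that partition for tests of that scale.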

casparvl · 2024-02-01T17:12:07Z

Ah, and we should also include it in the config for the CI (which is why it is now failing :))

casparvl · 2024-02-02T16:02:05Z

ok, I updated the features for all configs, including for github-actions. Let's see if the CI passes now again.

smoors

just tested this for the gromacs test, and it seems to work nicely.

however, the way we are currently using valid_systems is not very logical, and prone to mistakes, because we use [] both for the begin state (no filtering yet) and the end state (this test should not run).

i propose to change it as follows:

we start out with setting valid_systems to ['*'] in the class attribute of the test i.e. no filtering.
in the hooks whenever valid_systems is [] we know the test should be filtered out (not run).
in the hooks, if valid_systems is [*] and we want to filter, we replace it with [<filter>]
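The three states proposed above can be sketched in plain Python. This is a minimal model of the convention only, not the hook code from this PR; the function names `describe_state` and `apply_filter` are made up for the illustration, and combining a new filter with an earlier one is deliberately left out.

```python
def describe_state(valid_systems):
    """Classify a valid_systems list under the proposed convention."""
    if valid_systems == ['*']:
        return 'unfiltered'      # begin state: no hook has filtered yet
    if not valid_systems:
        return 'filtered out'    # end state: the test should not run
    return 'restricted'          # one or more filters have been applied


def apply_filter(valid_systems, system_filter):
    """Apply a filter to the begin state; keep every other state as it is."""
    if valid_systems == ['*']:
        return [system_filter]
    # [] stays [] (still filtered out); an already restricted list is
    # returned unchanged because merging filters is out of scope here.
    return list(valid_systems)
```

The point of the convention is that `[]` no longer has two meanings: only `['*']` means "nothing decided yet".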

casparvl · 2024-02-03T09:33:57Z

> in the hooks, if valid_systems is [*] and we want to filter, we replace it with []

Yeah I actually thought about something like this as well, and then thought "What if someone is stupid enough to set this as system name?". But I agree, it's probably a better way to set it to a non-empty value. And note to self: I think the elegant way to do this is to set it to a constant that we define in eessi.testsuite.constants. That way we can name the constant something sensible (like FILTER) and make the value of that constant something that no one would ever set as system name ("FILTER=thisystemnamewillalwaysbefiltered" or whatever :D)

smoors · 2024-02-03T12:27:33Z

> Yeah I actually thought about something like this as well, and then thought "What if someone is stupid enough to set this as system name?". But I agree, it's probably a better way to set it to a non-empty value. And note to self: I think the elegant way to do this is to set it to a constant that we define in eessi.testsuite.constants. That way we can name the constant something sensible (like FILTER) and make the value of that constant something that no one would ever set as system name ("FILTER=thisystemnamewillalwaysbefiltered" or whatever :D)

i'm not sure i understand your point.

what i meant is to change this line to valid_systems = ['*']:

test-suite/eessi/testsuite/tests/apps/gromacs.py

Line 45 in 937e34e

valid_systems = []

['*'] has a special meaning in reframe: The test is valid for any system, which is exactly what it should be initially because we didn't do any filtering yet.
also, for system names, "Only alphanumeric characters, dashes (-) and underscores (_) are allowed", so '*' can never clash with a real system name.
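The naming rule quoted above can be checked with a small regular expression. Note the assumption: the pattern below is transcribed from that quoted sentence, not copied from ReFrame's own validation code, and `is_valid_system_name` is a name invented for this sketch.

```python
import re

# Assumption: alphanumerics, dashes and underscores only, per the quoted rule.
SYSTEM_NAME_PATTERN = re.compile(r'[A-Za-z0-9_-]+')


def is_valid_system_name(name):
    """Return True if `name` satisfies the quoted naming rule."""
    return SYSTEM_NAME_PATTERN.fullmatch(name) is not None
```

Under this rule `'*'` is rejected while ordinary names such as `'snellius'` pass, which is why using `['*']` as the unfiltered default cannot collide with a configured system.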

casparvl · 2024-02-05T14:52:21Z

My point was mostly about your third point:

> in the hooks, if valid_systems is [*] and we want to filter, we replace it with []

I.e. instead of setting it to empty [] to filter, set it to a value that will never ever be a valid system name (as in my example). But, now I also understood what you meant with

> we start out with setting valid_systems to ['*'] in the class attribute of the test i.e. no filtering.

I.e. you want to actually change the 'default' valid_systems as set in the class attribute for all tests.

I'm actually fine with both changes. I'll give a try to implementing it :)

smoors · 2024-02-05T14:56:40Z

> I.e. instead of setting it to empty [] to filter, set it to a value that will never ever be a valid system name (as in my example).

ok, now i understand. we could actually do both to avoid possible confusion :)

casparvl · 2024-02-05T15:46:43Z

@smoors how about this? I'm not sure changing the default really helps much, it just gives me one more thing to check on (which I now generalized in the internal helper function _set_or_append_valid_systems). The reason you need to overwrite '*' in these cases btw is that appending a feature to it leads to an invalid system name. I.e. [* +gpu] does not match any system and all my tests were filtered.

In any case, the correct set of tests gets generated for me. For the sake of testing, I excluded 1_cpn_2_nodes (and 1_cpn_4_nodes) from the valid scales for the rome partition in my ReFrame config:

valid_scales_snellius_rome = [s for s in SCALES if s not in ['1_cpn_2_nodes', '1_cpn_4_nodes']]
...
'partitions': [
    {
        'name': 'rome',
        ...
        'features': [
            FEATURES[CPU],
        ] + valid_scales_snellius_rome,
...

I then get:

$ reframe -n GROMACS.*2021a$ --tag CI --checkpath eessi/testsuite/tests/apps/gromacs.py --tag 2_nodes --run
[ RUN      ] GROMACS_EESSI %benchmark_info=HECBioSim/Crambin %nb_impl=cpu %scale=2_nodes %module_name=GROMACS/2021.3-foss-2021a /d597cff4 @snellius:rome+default
[ RUN      ] GROMACS_EESSI %benchmark_info=HECBioSim/Crambin %nb_impl=cpu %scale=2_nodes %module_name=GROMACS/2021.3-foss-2021a /d597cff4 @snellius:genoa+default
[ RUN      ] GROMACS_EESSI %benchmark_info=HECBioSim/Crambin %nb_impl=cpu %scale=1_cpn_2_nodes %module_name=GROMACS/2021.3-foss-2021a /f4194106 @snellius:genoa+default

As you can see, the 1_cpn_2_nodes scale is still generated for genoa (since I didn't exclude it there), but not for rome.
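The overwrite-versus-append behaviour described for `_set_or_append_valid_systems` can be sketched as a pure function on lists. This is one reading of the comments and commit messages in this PR, not the merged implementation: the constant name and value `INVALID_SYSTEM`, the function signature, and the exact handling of the multi-item case are assumptions made for the illustration.

```python
import warnings

# Placeholder; the real constant in eessi.testsuite.constants may differ.
INVALID_SYSTEM = 'INVALID_SYSTEM'


def set_or_append_valid_systems(current, new_filter):
    """Return the updated valid_systems list after applying new_filter."""
    if new_filter == '':
        # Assumed: an empty filter means nothing can match, so mark the
        # test as explicitly filtered out with a recognisable value.
        return [INVALID_SYSTEM]
    if len(current) == 0:
        # The test was already excluded elsewhere; leave that alone.
        return []
    if current == ['*']:
        # '* +gpu' would match no system, so overwrite instead of append.
        return [new_filter]
    if len(current) > 1:
        # Assumed: warn, but still apply the filter to every entry.
        warnings.warn('valid_systems has more than one item')
    return [f'{item} {new_filter}' for item in current]
```

For example, `set_or_append_valid_systems(['*'], '+gpu')` gives `['+gpu']`, whereas starting from `['+cpu']` and adding `'+2_nodes'` gives `['+cpu +2_nodes']`.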

smoors · 2024-02-05T17:24:02Z

eessi/testsuite/hooks.py

+    if len(test.valid_systems) == 0:
+        test.valid_systems = [valid_systems]


valid_systems is a required field, which means that it is always set, and we are now setting it to ['*'] in the class attribute. this means we can never have 0 items, and we can remove the first if condition.

casparvl · 2024-02-09

I wasn't sure if it couldn't be set to [] on the command line, but it seems not (at least I wasn't able to do it). Nevertheless: while we now have ['*'] as default valid systems by convention, this is something we set in the test class itself. If someone creates a test class with [] as default (which can easily happen), at least the current hook still works.

I guess it depends on how 'hard' we want to enforce ['*'] as default. If we really want to enforce that, having this hook fail could be one way to figure it out. But then the == 0 case would just have to lead to a hard error or something.

What do you think, do we want to be that strict? Or do you prefer to keep the current version, which would also make the [] default work?

smoors · 2024-02-09

i’m thinking of the case where someone sets it to [] somewhere outside the class attributes, meaning that the test should not run due to some condition. in that case we should respect that, and not override it or fail. so i think it should always be ['*'] in the class attributes, but we should not enforce it in this hook.

i don’t think it is a big problem if someone sets it accidentally to [] in the class attribute, because in that case no test will run, so it’s clear something is wrong with the test.

if we really want to enforce it, a solution could be to set it to ['*'] in a class that inherits from RegressionMixin and make sure every test class inherits from both RunOnlyRegressionTest and the custom RegressionMixin child class, but of course then the question is how to enforce that, and maybe it makes things overly complex.
see here for an example: https://reframe-hpc.readthedocs.io/en/stable/tutorial_advanced.html#grouping-parameter-packs

casparvl · 2024-02-13

> i don’t think it is a big problem if someone sets it accidentally to [] in the class attribute, because in that case no test will run, so it’s clear something is wrong with the test.

That's a good point actually

> if we really want to enforce it, a solution could be to set it to ['*'] in a class that inherits from RegressionMixin and make sure every test class inherits from both RunOnlyRegressionTest and the custom RegressionMixin child class, but of course then the question is how to enforce that, and maybe it makes things overly complex.
> see here for an example: https://reframe-hpc.readthedocs.io/en/stable/tutorial_advanced.html#grouping-parameter-packs

Yeah, I also thought maybe we should use Mixin classes more, after hearing e.g. Vasileios mention he also did that for his test library. I guess it might avoid more code duplication compared to the hooks and allows you to do default stuff like this. Enforcing it could be reasonably simple in the CI of the test suite by parsing the test file (it has to contain something like class classname(..., my_mixin_class)). Well, not something I'm going to dive into now, but it might make things cleaner. If we want to move in that direction, we should do it before we have a million tests...
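The CI check floated above, parsing a test file to confirm that every class lists the mixin among its bases, could look roughly like this. It is a sketch using only Python's standard `ast` module; the function name `classes_missing_mixin` and the example class names are hypothetical, and nothing like it was added in this PR.

```python
import ast


def classes_missing_mixin(source, mixin_name):
    """Return the names of classes in `source` that do not list `mixin_name` as a base."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef) or node.name == mixin_name:
            continue
        base_names = set()
        for base in node.bases:
            if isinstance(base, ast.Name):         # e.g. my_mixin_class
                base_names.add(base.id)
            elif isinstance(base, ast.Attribute):  # e.g. rfm.RunOnlyRegressionTest
                base_names.add(base.attr)
        if mixin_name not in base_names:
            missing.append(node.name)
    return missing


# The source is only parsed, never executed, so `rfm` need not be importable.
example = '''
class GoodTest(rfm.RunOnlyRegressionTest, my_mixin_class):
    pass

class BadTest(rfm.RunOnlyRegressionTest):
    pass
'''
print(classes_missing_mixin(example, 'my_mixin_class'))  # prints ['BadTest']
```

A check like this only looks at direct bases, so a class that inherits the mixin indirectly would be reported as missing; whether that is acceptable depends on how strict the convention should be.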

smoors · 2024-02-05T17:39:36Z

> I'm not sure changing the default really helps much

that's true, but my motivation for this change is to avoid future mistakes, because it's much more logical now.

btw, i do like the addition of the helper function, as we have only 1 place to check for.

smoors

lgtm!

boegel · 2024-02-13T20:36:56Z

@casparvl Seems worthwhile to update the EESSI docs page on the test suite accordingly?

casparvl · 2024-02-14T12:30:01Z

Good point, done in EESSI/docs#156

Add new hook to filter invalid scales based on features set in the config file

9f9b380

casparvl marked this pull request as draft February 1, 2024 17:04

casparvl mentioned this pull request Feb 1, 2024

support for filtering out incompatible scales in configuration file #100

Closed

Alter all config files to allow all scales by default

b52fd19

smoors reviewed Feb 2, 2024

Updated hook name, and make the description a bit more clear

f5339a7

Caspar van Leeuwen added 5 commits February 5, 2024 15:58

Also apply hook for OSU and tensorflow

39c8430

Change the default valid_systems to ['*'] when it is set as class attribute

3a45915

Implement a constant for an invalid system. Whenever valid_systems is empty, explicitly set it equal to this constant. That way we know (and can test in other hooks) that it was explicitly filtered out as an invalid test by one of our hooks

dd09b70

Corrected that we import * from constants.py

4dbaa11

Make sure that if the valid_system is *, it gets overwritten instead of appended

73b876e

smoors reviewed Feb 5, 2024

casparvl and others added 2 commits February 9, 2024 09:59

Handle case where valid_systems has more than one element

39d99f0

Added missing colon

5955203

smoors reviewed Feb 10, 2024

smoors reviewed Feb 10, 2024

smoors reviewed Feb 10, 2024

casparvl and others added 2 commits February 13, 2024 15:00

Apply formatting suggestions from Sam's review

9667d7e

Co-authored-by: Sam Moors <smoors@users.noreply.github.com>

Leave test.valid_systems alone if it has length 0

b748594

casparvl commented Feb 13, 2024

Updated function description to match the changes after review

cae996b

Caspar van Leeuwen added 2 commits February 13, 2024 16:12

Merge branch 'main' into filter_incompatible_scales

ca8f366

Warn in the 'else' case, but make it go through

950863b

smoors reviewed Feb 13, 2024

Update eessi/testsuite/hooks.py

f879bb1

Co-authored-by: Sam Moors <smoors@users.noreply.github.com>

casparvl marked this pull request as ready for review February 13, 2024 15:35

smoors approved these changes Feb 13, 2024

smoors merged commit d516f05 into EESSI:main Feb 13, 2024
9 checks passed

casparvl pushed a commit to casparvl/docs that referenced this pull request Feb 14, 2024

Document new partition features set in EESSI/test-suite#111

3798975

casparvl mentioned this pull request Feb 14, 2024

Document new partition features for the EESSI test suite to exclude execution scales EESSI/docs#156

Merged

casparvl deleted the filter_incompatible_scales branch September 4, 2024 18:40
