-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make ModulesTool.exist more robust w.r.t. module wrappers, aliases, defaults, etc. #3216
Conversation
@Flamefire This would basically also consider the output of In addition, this would make things like Also, the tests should be enhanced. |
I don't understand. If you call Currently the problem is: If "Java/11" is an alias for "Java/whatever-11" then
Why? There is only an additional regexp check, not another |
Ugh, I clearly didn't look at this closely enough... Thanks for clarifying @Flamefire! In that case, I would:
|
5dc3bd6
to
79fce64
Compare
@boegel Done as suggested. I also implemented the show command so that it returns stderr not the combined stdout+stderr to reduce false positives (which can be fatal). Reasoning is that the output we want is contained in stderr (at least for LMod) and checking stderr to can have a path matching the regexp which we don't want. |
79fce64
to
3ff16d7
Compare
easybuild/tools/modules.py
Outdated
# "Java/whatever-11" when searching for "Java/11" (--> basename="Java") | ||
basename = os.path.dirname(mod_name) | ||
if basename: | ||
names_to_check.append(basename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While looking into this, I noticed this change is not required at all to make the enhanced test pass, because of the fallback mechanism with module_wrapper_exists
in ModulesTool.show
that was added in #2606 .
So, I'm still puzzled why this change is required at al...
I enhanced the checks in test_exist
a bit more, and roll back the changes you made to mod_exists_via_show
here, and it still works fine.
See Flamefire/easybuild-framework@improve_module_exist...boegel:improve_module_exist.
I can open a PR to your branch with these changes once I get the test_run_module
I added passing with all different module tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose we roll back this change for now, which is done as a part of Flamefire#2 .
We can further extend test_exist
to cover the use cases that are actually broken (for example module aliases as discussed in #3215), before looking into a fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I'm still puzzled why this change is required at al...
Because as explained in #3215 the fallback shouldn't be there: It is fuzzy and incomplete. If module show
returns a module, then the module exists else it does not, no fallback required.
Just looked at why the tests pass and comparing the #3215 and the docu at https://lmod.readthedocs.io/en/latest/093_modulerc.html:
Fallback considers only:
module_version
command- only module paths and 1 given directory
Fallback hence fails for:
module_alias
- system MODULERC files
- ~/.modulerc.lua or ~/.modulerc
The last one was the problem I found. Users can't change the files in modulepath but they may want to change aliases via a home-folder modulerc.
Counter-proposal: We keep this change and print a fat warning when a module is not found via module show
but via the fallback and gradually get rid of the fallback (My assumption is the warning will never trigger with this change)
Or do you see any reason to keep and improve a hand-rolled implementation of part of LMod/envModules?
More arguments: The fallback is particularly tricky as you noted in #2606 (comment) and I'm not sure that this is addressed: If a module wrapper exists but the underlying module does not, what does happen? I can't find such a 2nd check in
def module_wrapper_exists(self, mod_name, modulerc_fn='.modulerc', mod_wrapper_regex_template=None): |
Using
module show
only avoids this problem altogether.
To go forward faster: Is there anything wrong with this PR? Any possible failure you can see? Any test that fails with it? If you want a test which passes after but not before I guess using module_alias
will be enough. Can add that Tuesday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any strong objections against your approach, other than that it partially solves the same issue as module_wrapper_exists
does.
I agree that your approach is better, thanks a lot for the clarification.
module_wrapper_exists
indeed has shortcomings (quite a lot actually, as you point them out), and it's indeed not EasyBuild's task to re-implement the modules tool.
With that in mind, I suggest proceeding as follows:
-
Merge undo changes to mod_exists_via_show + additional/enhanced tests Flamefire/easybuild-framework#2 to update this PR, mostly for the additional tests (which are largely not directly related to the changes made in this PR, but are still useful). I spent quite a bit of time on the test for
run_module
in particular, it would be great if that effort doesn't go to waste... -
Revert commit a10a6ac that undoes your changes in
mod_exists_via_show
(or do agit rebase --interactive
to kick it out entirely). I can also kick out that commit from undo changes to mod_exists_via_show + additional/enhanced tests Flamefire/easybuild-framework#2 before you merge it, if you prefer. -
Add more checks in
test_exist
to cover the use cases that are not dealt with bymodule_wrapper_exists
, preferably not onlymodule_alias
but also the others, if possible. For~/.modulerc.lua
and~/.modulerc
, you can probably cover those in the test by temporarily redefining$HOME
? -
Remove
module_wrapper_exists
entirely, since it's no longer needed with your enhancedmod_exists_via_show
. I see no real reason to be careful there, the checks intest_exist
should be good enough to ensure that nothing gets broken that was covered bymodule_wrapper_exists
.If you prefer being cautious, I guess adding a
log.deprecated
statement inmodule_wrapper_exists
makes sense too, to leave the fallback in place for now but print a clear warning if it's still triggered. If we take this route, the message should clearly mention to report the use case that triggers the warning, so we know we missed something before we go ahead and remove that block of code. -
Make your enhanced
mod_exists_via_show
a little bit stricter, as follows:names_to_check.append(basename + '/')
The regular expression used to check the output of
module show
is a bit too liberal now in combination with only using the directory part of the module name.Using just
Java
in the fallback check would result in checking with'^\s*\S*/Java.*(\.lua)?:\s*$'
for example, while it should really be'^\s*\S*/Java/.*(\.lua)?:\s*$'
.
Maybe we should even usebasename + '/.*'
, and remove the.*
from themod_exists_regex_template
values?
I'm not sure why it's even there at all, we expect an exact match (except for the.lua
part), don't we? -
Rename
basename
to something else, since it causes confusion. You useos.path.dirname
to define a variable namedbasename
, which clashes directly withos.path.basename
which does the exact opposite (return just the filename). :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will work on this Tuesday, some comments now to leave time for decisions:
For ~/.modulerc.lua and ~/.modulerc, you can probably cover those in the test by temporarily redefining $HOME?
I didn't add tests yet for fear of accidentally removing the real file or e.g. missing to undefine it or so... I'll see what I can come up with
Remove module_wrapper_exists entirely,
I'm all in for that. But what about your usual reasoning someone else might be using it?
print a clear warning if it's still triggered. If we take this route, the message should clearly mention to report the use case that triggers the warning, so we know we missed something before we go ahead and remove that block of code.
We must not call it within EasyBuild or make the message only trigger if module_wrapper_exists
returns anything because it is currently called if the previous check does not find anything, which is valid: When the module indeed does not exist. Problem: An alias might exist but the underlying module not. module show
would return nothing but module_wrapper_exists
would return something. So I'd never call it from EB and deprecate it like other functions
Make your enhanced mod_exists_via_show a little bit stricter, as follows:
Why? If $LMOD_CMD show foo/1.2
returns bar_3.4.lua
because someone set the alias as such, then this is valid. AFAIK module show
returns nothing when the module is not found (maybe an error? But even that not on stderr of the module tool as that is up to the wrapper)
Hence I don't think it needs to be made any stricter. Use case for the loose version: IBM has a custom TensorFlow. Maybe someone makes an alias TensorFlow_PowerAI
to TensorFlow/1.13_Anaconda_...
or so to distinguish it from official TF
Do you know of anything where the strict check would be required? Then we should add negative tests too that cover that. But until then I would keep it loose
easybuild/tools/modules.py
Outdated
# "Java/whatever-11" when searching for "Java/11" (--> basename="Java") | ||
basename = os.path.dirname(mod_name) | ||
if basename: | ||
names_to_check.append(basename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any strong objections against your approach, other than that it partially solves the same issue as module_wrapper_exists
does.
I agree that your approach is better, thanks a lot for the clarification.
module_wrapper_exists
indeed has shortcomings (quite a lot actually, as you point them out), and it's indeed not EasyBuild's task to re-implement the modules tool.
With that in mind, I suggest proceeding as follows:
-
Merge undo changes to mod_exists_via_show + additional/enhanced tests Flamefire/easybuild-framework#2 to update this PR, mostly for the additional tests (which are largely not directly related to the changes made in this PR, but are still useful). I spent quite a bit of time on the test for
run_module
in particular, it would be great if that effort doesn't go to waste... -
Revert commit a10a6ac that undoes your changes in
mod_exists_via_show
(or do agit rebase --interactive
to kick it out entirely). I can also kick out that commit from undo changes to mod_exists_via_show + additional/enhanced tests Flamefire/easybuild-framework#2 before you merge it, if you prefer. -
Add more checks in
test_exist
to cover the use cases that are not dealt with bymodule_wrapper_exists
, preferably not onlymodule_alias
but also the others, if possible. For~/.modulerc.lua
and~/.modulerc
, you can probably cover those in the test by temporarily redefining$HOME
? -
Remove
module_wrapper_exists
entirely, since it's no longer needed with your enhancedmod_exists_via_show
. I see no real reason to be careful there, the checks intest_exist
should be good enough to ensure that nothing gets broken that was covered bymodule_wrapper_exists
.If you prefer being cautious, I guess adding a
log.deprecated
statement inmodule_wrapper_exists
makes sense too, to leave the fallback in place for now but print a clear warning if it's still triggered. If we take this route, the message should clearly mention to report the use case that triggers the warning, so we know we missed something before we go ahead and remove that block of code. -
Make your enhanced
mod_exists_via_show
a little bit stricter, as follows:names_to_check.append(basename + '/')
The regular expression used to check the output of
module show
is a bit too liberal now in combination with only using the directory part of the module name.Using just
Java
in the fallback check would result in checking with'^\s*\S*/Java.*(\.lua)?:\s*$'
for example, while it should really be'^\s*\S*/Java/.*(\.lua)?:\s*$'
.
Maybe we should even usebasename + '/.*'
, and remove the.*
from themod_exists_regex_template
values?
I'm not sure why it's even there at all, we expect an exact match (except for the.lua
part), don't we? -
Rename
basename
to something else, since it causes confusion. You useos.path.dirname
to define a variable namedbasename
, which clashes directly withos.path.basename
which does the exact opposite (return just the filename). :)
9b01ac8
to
62de10d
Compare
@boegel Tests failing due to the tests for the fallback printing deprecated messages. Shall I remove those tests? Edit: Or add "Deprecated functionality" to |
Allows to find Java/whatver-11 from "module show Java/11"
62de10d
to
13361a6
Compare
Rebased and removed the succeeding but echoing "DEPRECATED" module_wrapper_exists tests |
@boegel Found multiple issues:
I fixed and tested the first case. The second may need a major change and deprecation warning. However it is not totally clear what |
9dbf558
to
45135ab
Compare
…ces between different module tools...
…return non-zero exit code
… avoid failing test with EnvironmentModulesTcl
b3bd1f3
to
506397b
Compare
For HMNS the .modulerc needs updating on EnvironmentModules however the old fallback wrongly considered the current version as correct (depending on the actual location of the .modulerc) and returned True even though the module is not usable
Deprecate mod_exists_regex_template in favor for better output parsing Output parsing of module show is now more accurate as errors are handled and only first non-whitespace line is checked Usual outputs for non-existant modules: Empty or `(0):ERROR:...` For existing modules the first line (after comments etc) contains the full path of the module file with a colon at the end.
The issue was caused by self.show returning stdout and stderr As it now returns only stderr the issue is no longer possible
506397b
to
1454dd3
Compare
cdecb65
to
8cb5517
Compare
Allows to find Java/whatver-11 from "module show Java/11" as well as
AIModule
aliased toTensorFlow/2.1.0
or similar.The scope widened during the course of this PR trying to fix various test failures with different module tools resulting in a complete implementation which is much more robust.
This now fixes #3215
Proposal: Remove the previous check via
module avail
and solely rely onmodule show
as the true source which handles broken aliases by returning false instead of True formodule avail
in Lmod and False otherwise.