-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for implementing pre- and post-step hooks #2343
Conversation
I've thinking about support for hooks for some time already, but in a different way. The use case I had in mind was for updating Lmod's cache as soon as easybuild finishes. What is different from this, is that it'd be a hook at the very end of the installation process for all the easyconfigs being installed, instead of a hook after any given step for each easyconfig. What do you think @boegel ? Also, thinking in that use case, wouldn't it make sense to allow to call external scripts, instead of python scripts? |
@damianam We could extend this to also cover other hooks, rather than just pre- and post-step hooks, sure. How would you call a hook that only runs at the very end? Note that this way, you can still call external scripts, with a trivial hook implementation like:
You just lose a bit of context that way: which software name/version/toolchain, potentially also environment Requiring the hooks that Maybe we should discuss this a bit during the next EB conf call? |
Don't get me wrong, I think that most hooks must be python files, for the precise reason you mention. But as you said, in the use case I had in mind the script would be trivial and all that power won't be needed, which means that having a python script "wrapping" a shell script means you have to maintain both and it is kind of overkill. Not a biggie, obviously, as the wrapper is trivial and the extra work minimal. |
I see your point, but should we add further configuration options so people can provide hook scripts in whatever scripting language that is now cool? That would just be overhead on our end, without very little benefit for the EasyBuild users, I think? |
Well, I think that'd be overhead if you plan to provide interfaces. But if you simply call |
So, are you saying we should consider adding a |
I think the distinction between |
@damianam So, just use |
Isn't |
@damianam Something like that yes, but that's already quite a bit more logic required than what's there now. :) |
Ignore my comments then :-). With an |
@damianam Support for a |
easybuild/framework/easyblock.py
Outdated
@@ -71,8 +71,7 @@ | |||
from easybuild.tools.filetools import adjust_permissions, apply_patch, back_up_file, change_dir, convert_name | |||
from easybuild.tools.filetools import compute_checksum, copy_file, derive_alt_pypi_url, diff_files, download_file | |||
from easybuild.tools.filetools import encode_class_name, extract_file, is_alt_pypi_url, mkdir, move_logs, read_file | |||
from easybuild.tools.filetools import remove_file, rmtree2, write_file | |||
from easybuild.tools.filetools import verify_checksum, weld_paths | |||
from easybuild.tools.filetools import remove_file, rmtree2, run_hook, write_file, verify_checksum, weld_paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write_file
at the end to preserve alphabetical order?
@@ -2434,6 +2436,9 @@ def run_step(self, step, step_methods): | |||
""" | |||
self.log.info("Starting %s step", step) | |||
self.update_config_template_run_step() | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling run_hook
when hooks = None
would work, but seems odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have the logic for dealing with not having any hooks defined in run_hook
rather than having to copy-paste the logic to check whether any hooks are defined in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable.
easybuild/main.py
Outdated
# obtain a copy of the starting environment so each build can start afresh | ||
# we shouldn't use the environment from init_session_state, since relevant env vars might have been set since | ||
# e.g. via easyconfig.handle_allowed_system_deps | ||
init_env = copy.deepcopy(os.environ) | ||
|
||
run_hook('start', hooks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The labelling is at the moment "freeform". Nothing wrong with it, but I wonder if it makes sense to restrict it and create a list of valid labels? That way one can also inform the user in case of typos in the hooks list. Think of "why is my stat_hook
not working?" kind of situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good idea.
So, if someone includes stat_hook
, they would get an error that stat
is not a valid hook, and then spit out a list of known hooks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
easybuild/tools/filetools.py
Outdated
:param post_step_hook: indicates whether hook to run is a post-step hook | ||
""" | ||
res = None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if below means that all hooks, except start
and end
need to have prefixes. Conceptually, I find it odd. If I create a install_hook
I expect it to be executed after the install step. But it needs to be called post_install_hook
. That way it is clear what it does, but I wonder if it makes sense to consider too install_hook
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel install_hook
is ambiguous. Some may assume it would be called as soon as the install
step is triggered (equivalent with the current pre_install_hook
), while your assumption is different.
To avoid guessing & confusion, I feel it's better to be explicit...
With your suggestion above in mind, you'd get an error like this when you define install_hook
:
ERROR: 'install_hook' is not a valid hook; available hooks are: ..., pre_install_hook, post_install_hook, ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a reasonable approach.
@damianam Check for valid defined hooks implemented. We can probably do better w.r.t. the produced error message though...
|
easybuild/tools/hooks.py
Outdated
return res | ||
|
||
|
||
def run_hook(label, known_hooks, pre_step_hook=False, post_step_hook=False, args=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have args
at all? They aren't used anywhere.
easybuild/tools/hooks.py
Outdated
|
||
hook_name = hook_prefix + label + '_hook' | ||
|
||
for hook in known_hooks: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make sense to have a dictionary instead of a list here?
@@ -2434,6 +2437,9 @@ def run_step(self, step, step_methods): | |||
""" | |||
self.log.info("Starting %s step", step) | |||
self.update_config_template_run_step() | |||
|
|||
run_hook(step, self.hooks, pre_step_hook=True, args=[self]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now understand why args
is needed (in a previous comment on a separate commit I didn't :-P). Question is: Can we test it somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean, test that self
is being passed as an argument?
That's already done in , see test_toy_build_hooks
in toy_build.py
where the post_install_hook
there which uses self.name
and self.version
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that args
seems like a kind of wildcard that I find trouble understanding. It is passed directly to the hook in all cases except the start and end hooks, where run_hook
is executed without any of the optional arguments. And then the hook itself might do something with it, but being it either []
or [self]
I don't find it very clear what is the purpose of it. I guess that now, with a slightly better understanding of what you are doing here, my comment is simply "can we rename args
to something more descriptive?". With a different name I suppose I would understand it from the beginning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, only the step hooks get passed an argument (i.e. self
, the EasyBlock
instance that gives access to all information).
This may change in the future though, if a reason pops up to pass additional arguments.
args
is commonly used in Python code, see for example https://pythontips.com/2013/08/04/args-and-kwargs-in-python-explained/.
The idea is that the args
list is passed down with *args
to "unpack" it when the hook is called.
Hooks should be implemented like this, to be future-proof (i.e. keep working when we may pass down additional (even named) arguments):
def pre_install_hook(self, *args, **kwargs):
# do something with self
def start_hook(*args, **kwargs):
# do something (no (named) arguments passed, for now)
I'll clarify this in the yet-to-be-written documentation.
easybuild/tools/filetools.py
Outdated
@@ -42,6 +42,7 @@ | |||
import fileinput | |||
import glob | |||
import hashlib | |||
import imp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer needed
easybuild/tools/hooks.py
Outdated
_log = fancylogger.getLogger('hooks', fname=False) | ||
|
||
# this should be obtained via EasyBlock.get_steps(), | ||
# but we can't import from easybuild.framework.easyblock without introducing a cyclic dependency... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe we can push out this list, so we can import the same list on both sides?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not too happy with this either...
We could keep this list somewhere else, I'll see if that a good option.
easybuild/tools/hooks.py
Outdated
|
||
def load_hooks(hooks_path): | ||
"""Load defined hooks (if any).""" | ||
hooks = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to convert this to a dictionary
@boegel Regarding the error message, why not using |
…rror message for unknown hooks
@damianam Great suggestion, thanks! This looks way better/cooler now, see below. Other remarks are also tackled, can you please give this another look, and also a quick test? I'll also look into writing up some documentation for this...
|
easybuild/framework/easyblock.py
Outdated
@@ -72,7 +72,9 @@ | |||
from easybuild.tools.filetools import compute_checksum, copy_file, derive_alt_pypi_url, diff_files, download_file | |||
from easybuild.tools.filetools import encode_class_name, extract_file, is_alt_pypi_url, mkdir, move_logs, read_file | |||
from easybuild.tools.filetools import remove_file, rmtree2, verify_checksum, weld_paths, write_file | |||
from easybuild.tools.hooks import run_hook | |||
from easybuild.tools.hooks import BUILD_STEP, CLEANUP_STEP, CONFIGURE_STEP, EXTENSIONS_STEP, FETCH_STEP, INSTALL_STEP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to be picky, but wouldn't it make sense to put the steps in a dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you envision that exactly? What would be the key, what the value?
Also, dictionaries are mutable, so not a good choice for constants...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd do something like:
steps_dict = {
'BUILD_STEP': 'build',
'CLEANUP_STEP': 'cleanup',
....
}
steps = frozenset(steps_dict.items())
And then import steps
and use dict(steps)['BUILD_STEP']
for instance. It isn't too elegant, but importing every step individually seems like a PITA, and more difficult to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@damianam That's kind of bypassing the whole point of introducing constants.
The point here is to avoid "magic strings" like 'build'
, while typos in constants like BUILD_STEP
are easier to catch by linters, etc.
With your proposal, we're back to magic strings like 'BUILD_STEP'
...
Also, w.r.t. to maintenance: adding additional steps should be rare in the future, removing steps even more so.
I agree that the import of the constants looks a bit ugly though, but I don't see a better option without bypassing the purpose of using constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. We can leave it as it is and blame it on the language ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have discussed sufficiently about this PR and is mature enough. Let's push it in!
update: documentation update in easybuilders/easybuild#392