Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slurm: Call slurm_init() once before any call to Slurm API #148

Merged
merged 2 commits into from
Dec 19, 2023

Conversation

e4t
Copy link
Contributor

@e4t e4t commented Mar 13, 2023

Since Slurm 20.11, slurm_init() needs to be called before any call to the Slurm API to make sure the config file is read in. We cannot call it in mod_slurm_init() or in mod_slurm_wcoll() directly as it requires slurm to be configured before use, thus pdsh would fail whenever the slurm plugin is found but Slurm is not configured correctly.

@codecov-commenter
Copy link

codecov-commenter commented Mar 13, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.43 🎉

Comparison is base (3bc95a8) 64.20% compared to head (4adcddc) 64.64%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #148      +/-   ##
==========================================
+ Coverage   64.20%   64.64%   +0.43%     
==========================================
  Files          27       27              
  Lines        5588     5886     +298     
==========================================
+ Hits         3588     3805     +217     
- Misses       2000     2081      +81     
Impacted Files Coverage Δ
src/pdsh/mod.c 76.44% <100.00%> (+0.94%) ⬆️
src/pdsh/opt.c 84.99% <100.00%> (-0.10%) ⬇️

... and 8 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@grondo
Copy link
Member

grondo commented Mar 13, 2023

Thanks!

We cannot call it in mod_slurm_init() or in mod_slurm_wcoll() directly as it requires slurm to be configured before use,

This looks fine to me, but could I ask why slurm_init() can't be called in mod_slurm_init() if any error is ignored (or saved for later examination if a Slurm option is utilized)? There may be a good reason, I was more curious than anything else.

Since Slurm 20.11, slurm_init() needs to be called before any call
to the Slurm API to make sure the config file is read in.
We cannot call it in mod_slurm_init() or in mod_slurm_wcoll() directly
as it requires slurm to be configured before use, thus pdsh would
fail whenever the slurm plugin is found but Slurm is not configured
correctly.

Signed-off-by: Egbert Eich <eich@suse.com>
@e4t
Copy link
Contributor Author

e4t commented Jul 4, 2023

Thanks!

We cannot call it in mod_slurm_init() or in mod_slurm_wcoll() directly as it requires slurm to be configured before use,

This looks fine to me, but could I ask why slurm_init() can't be called in mod_slurm_init() if any error is ignored (or saved for later examination if a Slurm option is utilized)? There may be a good reason, I was more curious than anything else.

@grondo Sorry for the late reply, I've missed your question.
I've tried to briefly describe this in the commit message: It was my initial approach as well, luckily Slurm was not configured on the build system so I caught it: slurm_init() requires Slurm to be configured, otherwise it will fail and exit() directly AFAIR - which it really shouldn't not do. This way, pdsh would fail every time the slurm module is found (ie during pdsh -h - even though it's not used as mod_slurm_init() is called every time the module is loaded even though it's not used. The only way around the Slurm quirkiness I saw was with this somewhat ugly 'hack'.
Meanwhile I've got bitten by my patch on older Slurm versions (< 20.11) as these don't expose slurm_init() through libslurm at all. I'll push an updated patch.

Only print message when debug is enabled: This avoids users
getting alarmed unnecessarily. pdsh will install .la files
as well and cycle thru all installed files when attempting
to load modules.
These should probably not be installed at all.

Signed-off-by: Egbert Eich <eich@suse.com>
@grondo grondo changed the title slurm_plugin: Call slurm_init() once before any call to Slurm API slurm: Call slurm_init() once before any call to Slurm API Dec 19, 2023
@grondo
Copy link
Member

grondo commented Dec 19, 2023

@e4t - sorry this one languished so long. Your solution works, and more people have been hitting this issue so I'm merging.

@grondo grondo merged commit 8767b26 into chaos:master Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants