-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shell: make kvs output limit configurable and default single-user jobs to unlimited #5732
shell: make kvs output limit configurable and default single-user jobs to unlimited #5732
Conversation
Problem: The job shell doesn't capture the owner of the enclosing instance, so there is no way to determine if the job is running in single vs multiuser mode. Capture the local broker security.owner attribute.
Problem: The shell fetches the local instance owner, but does not expose this value to plugins. Add an instance_owner key to the shell info object.
Problem: Presence of the instance_owner value in the shell info object is not documented. Add it to flux_shell_get_info(3).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There is an extra "is" in the first sentence of last paragraph of the commit message of the penultimate commit.
Any way to add a test to check that the default 10MB limit is still there for guests (or did I miss that there is one)?
34f709b
to
2299dfe
Compare
Yes, sorry that was missed. I've added a |
The 'system' instance container that runs systemd under docker has apparently stopped working. We may be hitting this issue: Though I'm not sure exactly when the underlying version of systemd would have been updated, since the system tests were working as of last Friday afaict. Anyway, the symptoms are the same as in that issue. As suggested, the tests do work under podman, so maybe that is one path forward if available on Github actions. I'll move this discussion to a separate issue, but until we figure this out we may have to disable the requirement for the |
Problem: The job shell truncates output at 10M to workaround current KVS limitations with very large eventlogs in possibly long-running system instances. However, this is unnecessarily limiting in single-user subinstances where it is most convenient to have all job output folded into the output of a batch job. Convert the KVS output limit to a configurable value set via the job shell option `output.limit`. Keep the default limit of 10M if the userid of the enclosing instance broker does not match the current userid, and disable the limit completely if the broker and shell userids match. Update shell output truncation tests that would now fail with the new behavior. Ensure output limit is not enforced in the single user test instance. Note that this allows a user to disable the limit in a multiuser instance as well via `-o output.limit=0`, but this was possible anyway since the limit is not really enforced by the instance (a user could build a custom job shell for instance).
Problem: The job shell output.limit option is not documented. Document this option in flux-shell(1) as well the common job shell options table for the cli submission commands.
00ab813
to
26194c3
Compare
Problem: There is no test that ensures the job shell truncates output to 10M by default for guest jobs in a multiuser instance. Add a test to t9000-system.t that ensures output is truncated in this case.
Problem: The use of `source` in t9000-system.t is non-POSIX and thus not supported by POSIX sh. Replace `source` with the POSIX `.`.
26194c3
to
74b4faa
Compare
Ok, I figured out why the system image tests were failing (my fault). I'll set MWP here now. |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #5732 +/- ##
==========================================
- Coverage 83.46% 83.45% -0.02%
==========================================
Files 487 487
Lines 82984 83028 +44
==========================================
+ Hits 69265 69288 +23
- Misses 13719 13740 +21
|
This PR adds support for a new
output.limit
job shell option which overrides the default KVS output limit.The limit remains
10M
for multiuser instance jobs (when broker userid != job shell userid), but the limit is removed for single-user jobs (broker userid == shell userid).This should resolve some of the issues users have hit with the KVS output limit inside batch jobs, where it is most convenient to have jobs output folded in with that of the batch output file.