add job update service and new job-update(1) command #5409
Commits on Aug 25, 2023
testsuite: fix recursive jobspec-update in job-list test plugin
Problem: In the jobspec-update-job-list jobtap plugin, a test jobspec-update event is posted from the job.state.sched callback. However, in the future, jobspec-update events will cause a job in SCHED state to transition back to PRIORITY. When the job then returns to SCHED state, the jobspec-update event will be emitted again and the job will transition back to PRIORITY again, creating an infinite loop. Ensure the test plugin only emits a jobspec-update once during the test to avoid infinite recursion. This will require the plugin to be reloaded to continue working after one job, but at this point that is not necessary.
Commit: ab3e189
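The loop described in commit ab3e189 can be illustrated with a toy model. This is only a sketch under stated assumptions: `run_job`, `emit_once`, and `max_steps` are hypothetical names, and the real plugin is a C jobtap plugin, not Python.

```python
# Toy model only: none of these names exist in flux-core.
def run_job(emit_once, max_steps=20):
    """Drive one job between PRIORITY and SCHED.

    Returns (final_state, number_of_jobspec_update_events_posted).
    max_steps bounds the simulation so the unguarded case terminates.
    """
    state = "PRIORITY"
    posted = 0
    already_emitted = False
    for _ in range(max_steps):
        if state == "PRIORITY":
            state = "SCHED"      # job becomes eligible for scheduling
            continue
        # state == "SCHED": the plugin's job.state.sched callback runs here
        if emit_once and already_emitted:
            break                # guard: post nothing, job stays in SCHED
        posted += 1              # plugin posts a jobspec-update event
        already_emitted = True
        state = "PRIORITY"       # jobspec-update kicks the job back
    return state, posted
```

With the guard, the job settles in SCHED after a single event. Without it, the job keeps bouncing until the step bound is reached.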
job-manager: move jobs from SCHED->PRIORITY on jobspec-update
Problem: When the jobspec for a job is modified by a jobspec-update event the job may need to be reprioritized, possibly held back from the scheduler, or submitted to a different queue. As with the urgency event, kick jobs in SCHED state back to PRIORITY on a jobspec-update event.
Commit: 8f07faa
Commits on Aug 31, 2023
job-manager: add jobspec update convenience functions
Problem: Jobspec updates will need to be manipulated and applied in multiple modules within the job manager, but currently the functionality to validate and apply jobspec updates is within static functions in event.c and jobtap.c. Locate some jobspec update functions centrally in job.c so they may easily be accessed from other job manager modules.
Commit: 5557cd9
job-manager: use jobspec_apply_updates() in event.c
Problem: The code to apply jobspec updates from the jobspec-update event in the job manager duplicates the job_apply_jobspec_updates() function exported from job-manager/job.c. Use job_apply_jobspec_updates() to apply jobspec updates instead of the duplicated code.
Commit: ae52e3a
job-manager: add support functions for job updates
Problem: The job update service will require the assistance of the jobtap plugin stack to validate requested updates. Add a couple of jobtap support functions for this purpose:
- jobtap_job_update(): Call the job.update.KEY callback to allow a single update for KEY.
- jobtap_validate_updates(): Apply updates to the jobspec and call the job.validate stack on the modified jobspec.
Commit: 63fa4b4
python: allow Jobspec.setattr() key to start with attributes.
Problem: The Jobspec setattr() method always prepends 'attributes.' to the key argument, but this can be inconvenient when the key already contains the 'attributes.' prefix, since that prefix must then be removed before calling jobspec.setattr(). Only prepend 'attributes.' to the key argument of setattr() if it doesn't already contain that prefix.
Commit: cc045e5
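The prefix handling described in commit cc045e5 might look roughly like the following. This is a minimal sketch on a plain nested dict, not the actual Jobspec class, and `set_dotted` is a hypothetical name.

```python
def set_dotted(jobspec, key, value):
    """Set a dotted key under 'attributes.' in a nested dict.

    The 'attributes.' prefix is added only when the key lacks it, so
    'system.duration' and 'attributes.system.duration' are equivalent.
    """
    if not key.startswith("attributes."):
        key = "attributes." + key
    *parents, leaf = key.split(".")
    node = jobspec
    for name in parents:
        # Create intermediate dicts as needed while walking the path
        node = node.setdefault(name, {})
    node[leaf] = value
    return jobspec
```

Both spellings of the key then produce the same nested structure, so callers no longer need to strip the prefix first.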
python: add Jobspec.getattr() method
Problem: There exists a Jobspec setattr() method which sets an attribute based on "dotted key" notation, but no equivalent getattr() method to get dotted keys. Add a Jobspec.getattr() method.
Commit: d2b4d7b
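A dotted-key lookup of the kind commit d2b4d7b describes can be sketched as follows. The function name `get_dotted` is hypothetical, and treating the 'attributes.' prefix as optional is an assumption made to mirror setattr(); the real method may behave differently.

```python
def get_dotted(jobspec, key):
    """Look up a dotted key under 'attributes.' in a nested dict.

    Assumption: the 'attributes.' prefix is optional, as with setattr().
    Raises KeyError if a path component is missing from a dict.
    """
    if not key.startswith("attributes."):
        key = "attributes." + key
    node = jobspec
    for name in key.split("."):
        node = node[name]
    return node
```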
job-manager: validate all job states in limit-* plugins
Problem: The limit-duration and limit-job-size plugins do not validate jobs unless they are in the NEW state, ostensibly because job.validate may be called after a plugin reload or job manager restart. However, it is no longer the case that job.validate is called in these situations, and it may be necessary to call job.validate for jobs beyond the NEW state when processing job updates. Drop the checks for FLUX_JOB_STATE_NEW in the limit-duration and limit-job-size plugins.
Commit: 996d306
job-manager: add job update service
Problem: There is no service in the job manager for requesting the update of jobspec or other job parameters. Add a new update service to the job manager. Job updates can now be requested via a job-manager.update RPC, the payload of which includes the target jobid and an "updates" object which follows the jobspec-update specification in RFC 20.
Updates for a key are only allowed if a plugin callback exists for the jobtap topic string "job.update.KEY", and the callback returns success. If multiple keys are updated in the same request, they all must be allowed or none will be applied.
Once updates have been allowed, the proposed modified jobspec is sent through the job.validate plugin call stack. If the new jobspec fails to be successfully validated, then the updates are rejected and an error is returned to the requestor. Individual plugins may request that job.validate be skipped for a given key by setting a 'validated' flag in the plugin OUT arguments. However, the job.validate call will still be made if multiple keys are being updated and not all of them set a validated flag.
Once the update is allowed and validated, a jobspec-update event is posted for the job and an empty success response is issued.
Commit: bf56891
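The all-or-nothing flow described in commit bf56891 can be modeled in a few lines. This is a sketch under stated assumptions, not the job manager's C implementation: `process_update`, `apply_updates`, `UpdateError`, and the `(allowed, validated)` callback return convention are all hypothetical stand-ins for the jobtap machinery.

```python
import copy


class UpdateError(Exception):
    """Raised when an update request is rejected (hypothetical)."""


def apply_updates(jobspec, updates):
    """Apply {dotted_key: value} updates to a nested dict in place."""
    for key, value in updates.items():
        *parents, leaf = key.split(".")
        node = jobspec
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return jobspec


def process_update(jobspec, updates, update_callbacks, validators):
    """Return a modified copy of jobspec, or raise UpdateError.

    update_callbacks maps a "job.update.KEY" topic to a callable that
    returns (allowed, validated). validators stands in for the
    job.validate stack: callables returning True when jobspec is valid.
    """
    need_validate = False
    for key, value in updates.items():
        callback = update_callbacks.get("job.update." + key)
        if callback is None:
            raise UpdateError(f"update of {key} is not supported")
        allowed, validated = callback(key, value)
        if not allowed:
            raise UpdateError(f"update of {key} was rejected")
        if not validated:
            need_validate = True  # one unvalidated key forces validation
    # Work on a copy so a failed validation leaves the original untouched
    proposed = apply_updates(copy.deepcopy(jobspec), updates)
    if need_validate:
        for validate in validators:
            if not validate(proposed):
                raise UpdateError("modified jobspec failed validation")
    return proposed
```

Because every key is checked before anything is applied, and the updates are applied to a copy, a rejected request leaves the original jobspec unchanged.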
job-manager: add update-duration builtin plugin
Problem: Once a job is submitted the duration cannot be updated. Add an update-duration plugin that adds a job.update.attributes.system.duration callback so that jobspec duration updates are supported for pending jobs. By default, users can update the duration of their own jobs up to the currently configured limit, and instance owners can update duration to any value. The ability of the instance owner to bypass limits can be disabled by reloading the plugin with the config parameter owner-allow-any=0.
Commit: bd923ea
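The default policy described in commit bd923ea reduces to a small decision function. This is a sketch only: `duration_update_allowed` and its parameters are hypothetical, and using `None` to mean "no limit configured" is an assumption.

```python
def duration_update_allowed(new_duration, limit, is_owner, owner_allow_any=True):
    """Decide whether a duration update should be permitted.

    new_duration and limit are in seconds; limit=None is assumed to
    mean that no duration limit is configured.
    """
    if is_owner and owner_allow_any:
        return True  # instance owner may bypass the limit by default
    return limit is None or new_duration <= limit
```

Passing `owner_allow_any=False` models reloading the plugin with owner-allow-any=0, in which case the instance owner is held to the same limit as other users.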
Problem: There is no command line interface to request job updates. Add the flux-update(1) command, which takes a jobid and one or more KEY=VALUE pairs on the command line, and sends an update request to the job manager. Special handling for specific keys is supported for a more convenient user interface. Currently, any key which doesn't start with `attributes.`, `resources.` or `tasks.` is assumed to be prefixed with `attributes.system.`, so `duration=10m` is translated to `attributes.system.duration=10m`, for example. Key values may also get special handling through the existence of an `update_{keystr}` method in the JobspecUpdates class, where `keystr` is the key with dots replaced by underscores. For now, an `update_attributes_system_duration()` method is provided, which accepts 'duration' values given either as an absolute FSD (Flux Standard Duration) or as a relative +/-FSD adjustment. When adjusting duration, the current jobspec is fetched with any updates applied to get the most up-to-date duration.
Commit: 7d3718b
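The key expansion and `update_{keystr}` dispatch described in commit 7d3718b could be sketched as follows. This is an illustration under stated assumptions, not the flux-update source: `UpdateArgs`, `expand_key`, and `parse_fsd` are hypothetical names, the FSD parser is deliberately simplified (a number with an optional s/m/h/d suffix), and current values are supplied as a flat dict rather than fetched from the job.

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_fsd(text):
    """Parse a simplified duration string into seconds (float)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd]?)", text)
    if not match:
        raise ValueError(f"invalid duration: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2) or "s"]


def expand_key(key):
    """Prefix bare keys with 'attributes.system.'."""
    if key.startswith(("attributes.", "resources.", "tasks.")):
        return key
    return "attributes.system." + key


class UpdateArgs:
    """Collect KEY=VALUE arguments into an updates dict (hypothetical)."""

    def __init__(self, current):
        self.current = current  # full dotted key -> current value
        self.updates = {}

    def update_attributes_system_duration(self, value):
        # A leading + or - adjusts the current duration; else absolute
        if value.startswith(("+", "-")):
            delta = parse_fsd(value[1:])
            base = self.current["attributes.system.duration"]
            return base + delta if value[0] == "+" else base - delta
        return parse_fsd(value)

    def add(self, arg):
        key, _, value = arg.partition("=")
        key = expand_key(key)
        # Look for a per-key handler named update_<key with dots as _>
        handler = getattr(self, "update_" + key.replace(".", "_"), None)
        self.updates[key] = handler(value) if handler else value
```

Dispatching on a method name keeps per-key conveniences isolated: supporting another key only requires adding one more `update_...` method.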
completions: add bash completions for flux-update(1)
Problem: There are no tab completions for the flux-update(1) command. Add a completion handler for flux-update(1) to etc/completions/flux.pre.
Commit: d9544f9
job-manager: call job.update plugin stack after jobspec-update
Problem: There is no way for a jobtap plugin to get notified of a jobspec update after the jobspec updates have been applied. Jobs only transition back to PRIORITY state from SCHED, so the job.state.priority callback will not always be sufficient, and subscribing directly to the jobspec-update event would require the plugin to manually apply updates, and may not capture other ways a jobspec or job might be updated in the future. Introduce a 'job.update' callback topic which is called after any jobspec update has been applied. If the job is transitioning back to the PRIORITY state, this callback will be called before the job.state.priority topic so that plugins may adjust internal state that would normally be established prior to the first call to job.state.priority.
Commit: 37d48ff
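The callback ordering described in commit 37d48ff can be shown with a small model. This is a sketch, not job manager code: `post_jobspec_update` is a hypothetical name, the job is a plain dict, and the jobspec uses flat dotted keys for brevity.

```python
def post_jobspec_update(job, updates, call):
    """Apply updates, then notify plugins in the order described above.

    job is a dict with "state" and "jobspec" entries; call(topic, job)
    stands in for invoking the jobtap plugin stack.
    """
    job["jobspec"].update(updates)       # updates are applied first
    call("job.update", job)              # always called after an update
    if job["state"] == "SCHED":
        job["state"] = "PRIORITY"        # only SCHED jobs are kicked back
        call("job.state.priority", job)
```

A job in another state, such as DEPEND, only sees the `job.update` call, which is why `job.state.priority` alone would not be enough for a plugin to track updates.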
testsuite: add job update tests
Problem: There are no tests of the job update support in flux. Add a new test, t2290-job-update.t, and helper jobtap plugin, job-manager/plugins/update-test.c, and add some basic testing of the job update support using `flux update`.
Commit: 4fccb67
Problem: The flux-update(1) command is not documented. Add a short manual page for flux-update(1). Update spelling dictionary as necessary.
Commit: 33892fd
job-manager: support immutable job flag
Problem: It would be useful to disable updates for individual jobs, but there is currently no way to do this. Add an 'immutable' flag to the job manager job structure. Support adding this flag via the `set-flags` event.
Commit: 61f1e77
job-manager: deny updates for jobs that bypass update validation
Problem: When the instance owner updates a guest job in order to bypass validation (e.g. to update the duration of a job beyond current limits), a future job update of a different attribute may fail because the job will be revalidated. This causes a confusing error that is unrelated to the user's request. When a job update bypasses validation, the update request was made by the instance owner, and the job user is not the instance owner, mark the job as immutable to prevent future updates by the job owner. This not only results in a less confusing error, "job is immutable due to previous instance owner update", but also avoids the need to track which attribute updates have bypassed validation in past updates, which could be complex and could introduce unintended consequences.
Commit: 518a726
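The immutability rule described in commit 518a726 can be summarized in a short model. This is a sketch under stated assumptions: `Job`, `check_update_allowed`, and `record_update` are hypothetical names, and allowing the instance owner to keep updating an immutable job is an assumption rather than something the commit message states.

```python
from dataclasses import dataclass


@dataclass
class Job:
    userid: int
    immutable: bool = False


def check_update_allowed(job, requester, owner):
    """Reject updates to an immutable job from anyone but the owner.

    Assumption: the instance owner may still update immutable jobs.
    """
    if job.immutable and requester != owner:
        raise ValueError("job is immutable due to previous instance owner update")


def record_update(job, requester, owner, bypassed_validation):
    """Mark a guest job immutable after an owner update skipped validation."""
    if bypassed_validation and requester == owner and job.userid != owner:
        job.immutable = True
```

All three conditions must hold before the flag is set, so an instance owner updating their own job, or any update that went through validation, leaves the job mutable.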
testsuite: test update immutability of jobs
Problem: There are no tests of jobs which have been updated by the instance owner and therefore have the immutable flag set. Test update of guest jobs in t2290-job-update.t before and after an instance owner update. Ensure an immutable job cannot be updated by the user.
Commit: 15fa70f
doc: add note about immutable jobs to flux-update(1)
Problem: The flux-update(1) man page does not mention that jobs updated by the instance owner may become immutable. Add an explanation of how jobs updated by the instance owner can bypass validation, and why this makes the jobs immutable.
Commit: c41d143