add operator API to reject all job registrations #11450

Closed
tgross opened this issue Nov 4, 2021 · 3 comments · Fixed by #11610

tgross commented Nov 4, 2021

During incident response, operators may find that automated processes elsewhere in the organization are generating new workloads on Nomad clusters that cannot handle them. Add an operator API endpoint that causes all job registration calls to be rejected with an error stating that the cluster is in a load-shedding mode.

@tgross tgross added type/enhancement theme/api HTTP API and SDK issues labels Nov 4, 2021

tgross commented Nov 19, 2021

We've also had an internal discussion about automatically load-shedding in cases where the Nomad servers are overloaded, but that's a good deal more involved than this issue. In any case, whatever state this option has will most likely be used for that mechanism anyways.

@tgross tgross self-assigned this Dec 2, 2021
@tgross tgross added this to the 1.2.3 milestone Dec 2, 2021

tgross commented Dec 2, 2021

My thinking on the approach for this:

  • A new field RejectJobRegistration on SchedulerConfig, which we already have wired up to get persisted to raft.
  • Job.Register and Job.Dispatch will check this field's value. If the field is true, we'll return an error unless the ACL is a management token.
  • We don't currently allow SchedulerConfig to be set via CLI... we may have a couple more upcoming features for SchedulerConfig, so let's put off adding a CLI until we know what they're all going to be so that we have a sensible/ergonomic CLI for all of them together.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2022