Backport of Multiple instances of a periodic job are run simultaneously, when prohibit_overlap is true into release/1.5.x #16661

Conversation

hc-github-team-nomad-core

Backport

This PR is auto-generated from #16583 to be assessed for backporting due to the inclusion of the label backport/1.5.x.

WARNING automatic cherry-pick of commits failed. Commits will require human attention.

The below text is copied from the body of the original PR.


This PR addresses the bug reported on #11052

When a leader change happens, the periodic dispatcher on the new leader starts by force-running all periodic jobs, without checking whether an instance of the job is already running.
A new check is introduced that skips the job if prohibit_overlap is set and there is already an instance running.
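
For illustration, a minimal Go sketch of the kind of check described, using hypothetical names (`dispatcher`, `runningChildren`) rather than Nomad's actual internals:

```go
package main

import "fmt"

type periodicJob struct {
	ID              string
	ProhibitOverlap bool
}

type dispatcher struct {
	// runningChildren reports whether a previously dispatched child of the
	// periodic job is still running (assumed helper, not Nomad's real API).
	runningChildren func(jobID string) (bool, error)
}

// forceRun launches a periodic job immediately, but skips the launch when
// prohibit_overlap is set and a child instance is already running.
func (d *dispatcher) forceRun(job periodicJob) error {
	if job.ProhibitOverlap {
		running, err := d.runningChildren(job.ID)
		if err != nil {
			return fmt.Errorf("checking children of %q: %w", job.ID, err)
		}
		if running {
			return nil // skip: an instance is already running
		}
	}
	fmt.Printf("dispatching %q\n", job.ID)
	return nil
}

func main() {
	d := &dispatcher{runningChildren: func(string) (bool, error) { return true, nil }}
	_ = d.forceRun(periodicJob{ID: "backup", ProhibitOverlap: true})
}
```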

schmichael and others added 30 commits March 1, 2023 08:09
…16217)

Nomad servers can advertise independent IP addresses for `serf` and
`rpc`. Somewhat unexpectedly, the `serf` address is also used for both Serf and
server-to-server RPC communication (including Raft RPC). The address advertised
for `rpc` is only used for client-to-server RPC. This split was introduced
intentionally in Nomad 0.8.

When clients are using Consul discovery for connecting to servers, they get an
initial discovery set from Consul and use the correct `rpc` tag in Consul to get
a list of addresses for servers. The client then makes a `Status.Peers` RPC to
get the list of those servers that are raft peers. But this endpoint is shared
between servers and clients, and provides the address used for Raft.

Most of the time this is harmless because servers will bind on 0.0.0.0 anyway,
but in topologies where servers are on a private network and clients are on
separate subnets (or even public subnets), clients will make initial contact
with the server to get the list of peers but then populate their local server
set with unreachable addresses.

Cluster administrators can work around this problem by using `server_join` with
specific IP addresses (or DNS names), because the `Node.UpdateStatus` endpoint
returns the correct set of RPC addresses when updating the node. So once a
client has registered, it will get the correct set of RPC addresses.

This changeset updates the client logic to query `Status.Members` instead of
`Status.Peers`, and then extract the correctly advertised address and port from
the response body.
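
As a rough sketch of the intended client behavior (the `serverMember` struct below is a placeholder for one entry in a `Status.Members`-style response, not Nomad's actual type):

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// serverMember is a placeholder for one entry in the members response,
// carrying the address and port the server advertises for client RPC.
type serverMember struct {
	Addr string
	Port int
}

// rpcServerList builds the dialable client-to-server RPC address list from
// the members response instead of from Status.Peers, which reports the
// addresses used for Raft.
func rpcServerList(members []serverMember) []string {
	addrs := make([]string, 0, len(members))
	for _, m := range members {
		addrs = append(addrs, net.JoinHostPort(m.Addr, strconv.Itoa(m.Port)))
	}
	return addrs
}

func main() {
	fmt.Println(rpcServerList([]serverMember{{Addr: "203.0.113.10", Port: 4647}}))
}
```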
Fixes #16288. An earlier version of `go-plugin` introduced a warning log if
`SecureConfig` is unset. For Nomad and other applications that have "internal"
`go-plugin` consumers where the application runs itself as a plugin, this causes
spurious warn-level logs. For Nomad in particular this means every task driver
and logmon invocation emits the log, which is our primary operation.

The change was reverted upstream, so this changeset picks up the reverted
version.
The signature of the `raftApply` function requires that the caller unwrap the
first returned value (the response from `FSM.Apply`) to see if it's an
error. This puts the burden on the caller to remember to check two different
places for errors, and we've done so inconsistently.

Update `raftApply` to do the unwrapping for us and return any `FSM.Apply` error
as the error value. Similar work was done in Consul in
hashicorp/consul#9991. This eliminates some boilerplate
and surfaces a few minor bugs in the process:

* job deregistrations of already-GC'd jobs were still emitting evals
* reconcile job summaries did not return scheduler errors
* node updates did not report errors associated with inconsistent service
  discovery or CSI plugin states

Note that although _most_ of the `FSM.Apply` functions return only errors (which
makes it tempting to remove the first return value entirely), there are a few that
return `bool` for some reason, and Variables relies on the response value for
proper CAS checking.
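
A minimal sketch of the resulting pattern, using placeholder types rather than Nomad's real signatures:

```go
package main

import (
	"errors"
	"fmt"
)

// apply stands in for the raw Raft apply call; its first return value is the
// FSM.Apply response, which may itself be an error.
func apply(msgType uint8, msg interface{}) (interface{}, uint64, error) {
	return errors.New("job already GC'd"), 42, nil
}

// raftApply unwraps an FSM.Apply error into the error return value so that
// callers only have one place to check for failures.
func raftApply(msgType uint8, msg interface{}) (interface{}, uint64, error) {
	resp, index, err := apply(msgType, msg)
	if err != nil {
		return nil, index, err
	}
	if respErr, ok := resp.(error); ok && respErr != nil {
		return nil, index, respErr
	}
	return resp, index, nil
}

func main() {
	if _, _, err := raftApply(0, nil); err != nil {
		fmt.Println("apply failed:", err)
	}
}
```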
* Template and styles

* @type to @color on flash messages

* Notifications service as wrapper

* Test cases updated for new notifs
)

Signed-off-by: dttung2905 <ttdao.2015@accountancy.smu.edu.sg>
When native service discovery was added, we used the node secret as the auth
token. Once Workload Identity was added in Nomad 1.4.x we needed to use the
claim token for `template` blocks, and so we allowed valid claims to bypass the
ACL policy check to preserve the existing behavior. (Invalid claims are still
rejected, so this didn't widen any security boundary.)

In reworking authentication for 1.5.0, we unintentionally removed this
bypass. For WIs without a policy attached to their job, everything works as
expected because the resulting `acl.ACL` is nil. But once a policy is attached
to the job the `acl.ACL` is no longer nil and this causes permissions errors.

Fix the regression by adding back the bypass for valid claims. In future work,
we should strongly consider turning the implicit policies into real
`ACLPolicy` objects (even if not stored in state) so that we don't have these
kinds of brittle exceptions to the auth code.
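
A hedged sketch of the restored bypass; the types and the `allowOperation` check below are placeholders, not Nomad's real auth code:

```go
package main

import "fmt"

// acl is a stand-in for acl.ACL; allowOperation represents whatever policy
// check the endpoint would normally perform.
type acl struct {
	allowOperation func(namespace string) bool
}

type authResult struct {
	// validClaim is true when the request carried a workload identity claim
	// that passed verification.
	validClaim bool
	acl        *acl
}

// allowed restores the pre-1.5.0 behavior: a valid claim bypasses the ACL
// policy check, while other callers fall through to the normal check.
func allowed(auth authResult, namespace string) bool {
	if auth.validClaim {
		return true
	}
	return auth.acl != nil && auth.acl.allowOperation(namespace)
}

func main() {
	fmt.Println(allowed(authResult{validClaim: true}, "default")) // true
}
```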
Several `nomad job` subcommands had duplicate or slightly similar logic
for resolving a job ID from a CLI argument prefix, while others did not
have this functionality at all.

This commit pulls the shared logic to the command Meta and updates all
`nomad job` subcommands to use it.
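
A simplified sketch of prefix-based job ID resolution of the sort described (the helper name and listing input are illustrative, not the actual command Meta API):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveJobID returns an exact match if one exists, the single job whose ID
// starts with the prefix, or an error listing the ambiguous candidates.
func resolveJobID(prefix string, jobIDs []string) (string, error) {
	var matches []string
	for _, id := range jobIDs {
		if id == prefix {
			return id, nil // exact match wins over prefix matches
		}
		if strings.HasPrefix(id, prefix) {
			matches = append(matches, id)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("no job found with prefix %q", prefix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("prefix %q matched multiple jobs: %s",
			prefix, strings.Join(matches, ", "))
	}
}

func main() {
	id, err := resolveJobID("example-b", []string{"example", "example-batch"})
	fmt.Println(id, err) // example-batch <nil>
}
```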
In Nomad 0.12.1 we introduced atomic job registration/deregistration, where the
new eval was written in the same raft entry. Backwards-compatibility checks were
supposed to have been removed in Nomad 1.1.0, but we missed that. This is long
safe to remove.
Some of the methods in `Allocations()` incorrectly use the `putQuery`
in API calls where `put` is more appropriate since they are not reading
information back. These methods are also not returning request metadata
such as `LastIndex` back to callers, which can be useful to have in some
scenarios.

They also provide poor developer experience as they take an
`*api.Allocation` struct when only the allocation ID is necessary. This
can lead consumers to make unnecessary API calls to fetch the full
allocation.

Fixing these problems requires updating the methods' signatures so they
take `*WriteOptions` instead of `*QueryOptions` and return `*WriteMeta`,
but this is a breaking change that requires advance notice to consumers.

This commit adds a future breaking change notice and also fixes the
`Stop` method so it properly returns request metadata in a backwards
compatible way.
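
A hedged sketch of the target shape, with simplified stand-ins for the api package types: write-style methods should accept only the allocation ID plus `*WriteOptions` and return `*WriteMeta` so callers get `LastIndex` back.

```go
package main

import "fmt"

// Simplified stand-ins for the api package structs.
type WriteOptions struct{ Namespace string }
type WriteMeta struct{ LastIndex uint64 }

type AllocStopResponse struct {
	EvalID string
	WriteMeta
}

// stopAllocation illustrates the preferred signature: only the allocation ID
// is required, and request metadata is surfaced to the caller.
func stopAllocation(allocID string, w *WriteOptions) (*AllocStopResponse, error) {
	// A real client would issue the HTTP request here; the response is
	// fabricated to show what callers can rely on.
	return &AllocStopResponse{EvalID: "eval-123", WriteMeta: WriteMeta{LastIndex: 42}}, nil
}

func main() {
	resp, _ := stopAllocation("5f3c0d2e", nil)
	fmt.Println(resp.EvalID, resp.LastIndex)
}
```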
…ient links/chart (#16274)

* Fix for wildcard DC sys/sysbatch jobs

* A few extra modules for wildcard DC in systemish jobs

* doesMatchPattern moved to its own util as match-glob

* DC glob lookup using matchGlob

* PR feedback
…16362)

Wildcard datacenters introduced a bug where a job with any wildcard datacenters
will always be treated as a destructive update when we check whether a
datacenter has been removed from the jobspec.

Includes updating the helper so that callers don't have to loop over the job's
datacenters.
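
A rough sketch of the idea (using `filepath.Match` as a stand-in for Nomad's own glob helper): an existing datacenter only counts as removed if no entry in the new list, wildcard or literal, still covers it.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// datacenterRemoved reports whether any old datacenter is no longer covered
// by the new list, treating new entries as glob patterns.
func datacenterRemoved(oldDCs, newDCs []string) bool {
	for _, old := range oldDCs {
		covered := false
		for _, pattern := range newDCs {
			if ok, _ := filepath.Match(pattern, old); ok {
				covered = true
				break
			}
		}
		if !covered {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(datacenterRemoved([]string{"dc1", "dc2"}, []string{"dc*"})) // false: not destructive
	fmt.Println(datacenterRemoved([]string{"dc1", "eu1"}, []string{"dc*"})) // true: eu1 was removed
}
```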
lgfa29 and others added 21 commits March 23, 2023 18:28
Implement the new `nomad job restart` command that allows operators to
restart an allocation's tasks or reschedule the entire allocation.

Restarts can be batched to target multiple allocations in parallel.
Between each batch the command can stop and hold for a predefined time
or until the user confirms that the process should proceed.

This implements the "Stateless Restarts" alternative from the original
RFC
(https://gist.github.com/schmichael/e0b8b2ec1eb146301175fd87ddd46180).
The original concept is still worth implementing, as it allows this
functionality to be exposed over an API that can be consumed by the
Nomad UI and other clients. But the implementation turned out to be more
complex than we initially expected so we thought it would be better to
release a stateless CLI-based implementation first to gather feedback
and validate the restart behaviour.

Co-authored-by: Shishir Mahajan <smahajan@roblox.com>
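
A rough sketch of the batching behavior (not the actual command implementation): restart allocations in fixed-size batches and wait between batches before proceeding.

```go
package main

import (
	"fmt"
	"time"
)

// restartAlloc stands in for the per-allocation restart (or reschedule) call.
func restartAlloc(allocID string) error {
	fmt.Println("restarting", allocID)
	return nil
}

// restartInBatches processes allocIDs in batches of batchSize, pausing for
// wait between batches so the operator can observe the rollout.
func restartInBatches(allocIDs []string, batchSize int, wait time.Duration) error {
	for i := 0; i < len(allocIDs); i += batchSize {
		end := i + batchSize
		if end > len(allocIDs) {
			end = len(allocIDs)
		}
		for _, id := range allocIDs[i:end] {
			if err := restartAlloc(id); err != nil {
				return err
			}
		}
		if end < len(allocIDs) {
			time.Sleep(wait)
		}
	}
	return nil
}

func main() {
	_ = restartInBatches([]string{"a1", "a2", "a3"}, 2, time.Second)
}
```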
When a disconnected client reconnects, the `allocReconciler` must find the
allocations that were created to replace the original disconnected
allocations.

This process was being done only over a subset of non-terminal untainted
allocations, meaning that, if the replacement allocations were not in
this state, the reconciler didn't stop them, leaving the job in an
inconsistent state.

This inconsistency is only resolved in a future job evaluation, but at
that point the allocation is considered reconnected, so the specific
reconnection logic is not applied, leading to unexpected outcomes.

This commit fixes the problem by running the reconnecting allocation
reconciliation logic earlier in the process, leaving the rest of the
reconciler oblivious to reconnecting allocations.

It also uses the full set of allocations to search for replacements,
stopping them even if they are not in the `untainted` set.

The `SystemScheduler` is not affected by this bug because
disconnected clients don't trigger replacements: every eligible client
is already running an allocation.
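
A hedged sketch of the search change, with placeholder types rather than the scheduler's own: replacements are located across the full allocation set instead of only the non-terminal untainted subset.

```go
package main

import "fmt"

type allocation struct {
	ID                 string
	PreviousAllocation string // set on a replacement for the original alloc
	ClientStatus       string
}

// findReplacements returns every allocation that replaced originalID,
// regardless of which reconciler bucket it was sorted into.
func findReplacements(all []allocation, originalID string) []allocation {
	var out []allocation
	for _, a := range all {
		if a.PreviousAllocation == originalID {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	all := []allocation{
		{ID: "new-1", PreviousAllocation: "orig-1", ClientStatus: "running"},
		{ID: "new-2", PreviousAllocation: "orig-1", ClientStatus: "complete"},
	}
	fmt.Println(len(findReplacements(all, "orig-1"))) // 2
}
```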
…hibit_overlap is true

Fixes #11052
When restoring periodic dispatcher, all periodic jobs are forced without checking for previous children.
…hibit_overlap is true

Fixes #11052
When restoring periodic dispatcher, all periodic jobs are forced without checking for previous children.
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>