Backport of cli: search all namespaces for node volumes into release/1.4.x #18117

Conversation

hc-github-team-nomad-core (Contributor)

Backport

This PR is auto-generated from #17925 to be assessed for backporting due to the inclusion of the label backport/1.4.x.

The below text is copied from the body of the original PR.


When looking for CSI volumes to display in the `node status` command, the CLI needs to search all namespaces. It's also helpful to display the volume namespace in the command output. This has been handled by #17911.

Closes #17923
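
A minimal sketch of the wildcard-namespace lookup described above, using the public `github.com/hashicorp/nomad/api` client rather than the CLI's internal code (the `Namespace` and `PluginID` fields on the volume list stub are assumed here):

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Query CSI volumes across every namespace with the wildcard namespace,
	// mirroring what the `node status` fix does instead of defaulting to a
	// single namespace.
	vols, _, err := client.CSIVolumes().List(&api.QueryOptions{Namespace: "*"})
	if err != nil {
		log.Fatal(err)
	}

	for _, v := range vols {
		// Printing the namespace lets operators tell identically named
		// volumes in different namespaces apart.
		fmt.Printf("%s/%s (plugin %s)\n", v.Namespace, v.ID, v.PluginID)
	}
}
```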

tgross and others added 30 commits May 19, 2023 10:27
The file path in the TSCCR repo for the `returntocorp/semgrep` action was
incorrect, so the pinning tool was not able to find the correct entry and it was
not pinned in #17238.

The repository is fixed in hashicorp/security-tsccr#431
Workload Identities have an implicit default policy. This policy can't currently
be described via HCL because it includes task interpolation for Variables and
access to the Services API (which doesn't exist as its own ACL
capability). Describe this in our WI documentation.

Fixes: #16277
* Treated same-route as sub-route and didn't cancel watchers

* Adds panel to child jobs and sub-sorts

* removed the safety check in module-for-job tests

* [ui] Adds status panel to Sysbatch jobs (#17243)

* In working out periodic/param child jobs, realized the intersection with sysbatch is high enough that it ought to be worked on now

* Further removal of jobclientstatussummary

* Explicitly making mocked jobs in no-deployment mode

* remove last remnants of job-client-status-summary component

* Screwed up my sorting order a few commits ago; this corrects it

* noActiveDeployment gonna be the death of me
…#17178)

* build(deps): bump github.com/shoenig/test from 0.6.4 to 0.6.5 in /api

* deps: update shoenig/test to v0.6.5

* deps: update again to v0.6.6

---------

Co-authored-by: Tim Gross <tgross@hashicorp.com>
Co-authored-by: Seth Hoenig <shoenig@duck.com>
The `volume register` command can update a small subset of the volume's fields
in-place, with some restrictions depending on whether the volume is currently in
use. Document these in the `volume register` command docs and the volume
specification docs.

Fixes: #17247
When resolving ACL policies, we were not using the parent ID for the policy
lookup for dispatch/periodic jobs, even though the claims were signed for that
parent ID. This prevents all calls to the Task API (and other WI-authenticated
API calls) from a periodically-dispatched job failing with 403.

Fix this by using the parent job ID whenever it's available.
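
A hypothetical sketch of the selection rule this commit describes; the helper name is made up and the real resolver lives in Nomad's ACL code, but the rule is the same: prefer the parent job ID whenever the claims belong to a dispatched or periodic child.

```go
package acl

// policyLookupJobID picks which job ID to resolve ACL policies against.
// Claims for periodic and dispatch children are signed for the parent job,
// so the parent ID must win whenever it is set; otherwise Task API calls
// from those children fail with 403.
func policyLookupJobID(jobID, parentJobID string) string {
	if parentJobID != "" {
		return parentJobID
	}
	return jobID
}
```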
The `nomad tls cert` command did not create certificates with the correct SANs for
them to work with non-default domain and region names. This changeset updates the
code to support non-default domains and regions in the certificates.
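
A sketch of the SAN construction the fix implies, not the actual `nomad tls cert` code; the helper and defaults below are assumptions based on Nomad's `server.<region>.<domain>` naming convention.

```go
package main

import "fmt"

// serverDNSNames returns the DNS SANs a server certificate needs so that
// mTLS verification works for a cluster using a non-default region and/or
// TLS domain, not only the default "server.global.nomad".
func serverDNSNames(region, domain string) []string {
	if region == "" {
		region = "global" // Nomad's default region
	}
	if domain == "" {
		domain = "nomad" // Nomad's default TLS domain
	}
	return []string{
		fmt.Sprintf("server.%s.%s", region, domain),
		"localhost",
	}
}

func main() {
	fmt.Println(serverDNSNames("eu-west-1", "cluster.internal"))
	// [server.eu-west-1.cluster.internal localhost]
}
```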
* Add UnexpectedResultError to nomad/api

This allows users to perform additional status-based behavior by rehydrating the error using `errors.As` inside of consumers.
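
The commit above adds the error type to `nomad/api`; the sketch below uses a stand-in type (its name and fields are assumptions, not the real API) purely to illustrate the `errors.As` rehydration pattern the description refers to.

```go
package main

import (
	"errors"
	"fmt"
)

// StatusError is a stand-in for the error type added to nomad/api; it only
// exists here to show how a consumer recovers the typed error from a
// wrapped chain and branches on the HTTP status.
type StatusError struct {
	StatusCode int
	Body       string
}

func (e *StatusError) Error() string {
	return fmt.Sprintf("unexpected response: %d: %s", e.StatusCode, e.Body)
}

func main() {
	err := fmt.Errorf("listing volumes: %w", &StatusError{StatusCode: 404, Body: "volume not found"})

	var se *StatusError
	if errors.As(err, &se) {
		// Status-based behavior, e.g. treat 404 as "does not exist".
		fmt.Println("status:", se.StatusCode)
	}
}
```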
The 32-bit Intel builds (aka "386") are not tested and likely have bugs
involving platform-sized integers when operated at any non-trivial scale. Remove
these builds from the upcoming Nomad 1.6.0 and provide recommendations in the
upgrade notes for those users who might have hobbyist boards running 32-bit
ARM (this will primarily be the Raspberry Pi Zero or older spins of the Raspberry Pi).

DO NOT BACKPORT TO 1.5.x OR EARLIER!
* Generate files for 1.5.6 release

* Prepare for next release

* Merge release 1.5.6 files

* manual revert bindata_assetfs because the one on main is better

---------

Co-authored-by: hc-github-team-nomad-core <github-team-nomad-core@hashicorp.com>
Bumps [github.com/docker/distribution](https://github.com/docker/distribution) from 2.8.1+incompatible to 2.8.2+incompatible.
- [Release notes](https://github.com/docker/distribution/releases)
- [Commits](distribution/distribution@v2.8.1...v2.8.2)

---
updated-dependencies:
- dependency-name: github.com/docker/distribution
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…17031)

Bumps [github.com/hashicorp/vault/sdk](https://github.com/hashicorp/vault) from 0.7.0 to 0.9.0.
- [Release notes](https://github.com/hashicorp/vault/releases)
- [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md)
- [Commits](hashicorp/vault@v0.7.0...v0.9.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/vault/sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/grpc-ecosystem/go-grpc-middleware](https://github.com/grpc-ecosystem/go-grpc-middleware) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/grpc-ecosystem/go-grpc-middleware/releases)
- [Commits](grpc-ecosystem/go-grpc-middleware@v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: github.com/grpc-ecosystem/go-grpc-middleware
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
A "readiness" check implies a failing healthcheck will not cause the
deployment of a service to stop - i.e. it is only used as a liveness
probe in the context of service discoverability.

Fix our docs example to reflect that a readiness check is created by
setting on_update to "ignore" (as opposed to "ignore_warnings").
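
For reference, a readiness-style check expressed with the Go `api` package rather than HCL; a sketch assuming the `OnUpdate` field on `api.ServiceCheck` (the jobspec equivalent is `on_update = "ignore"` inside the `check` block).

```go
package main

import (
	"fmt"
	"time"

	"github.com/hashicorp/nomad/api"
)

func main() {
	check := api.ServiceCheck{
		Name:     "readiness",
		Type:     "http",
		Path:     "/health",
		Interval: 10 * time.Second,
		Timeout:  2 * time.Second,
		// "ignore" (not "ignore_warnings") makes failures non-blocking for
		// deployments, which is what "readiness" means in this context.
		OnUpdate: "ignore",
	}
	fmt.Printf("%+v\n", check)
}
```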
…ob Status panel on steady service jobs (#17246)

* Failed or lost cell condensed

* Latest Deployment cell

* Stylistic changes and deploying state fixup

* Rewritten tooltip message and updated lost/failed tests

* failed-or-lost cell updates to job status panel acceptance tests
…sponses (#17316)

Add a nil check to constructNodeServerInfoResponse to manage an apparent race between deregister and client heartbeats. Fixes #17310
* A few variable-adding bugfixes

* Disable Delete button if only one KV is left, and remove entity warnings on Add More
The Nomad API will reject jobs with priority set to 0.
Consul v1.13.8 was released with a breaking change in the /v1/agent/self
endpoint version where a line break was being returned.

This caused the Nomad fingerprint to fail because `NewVersion` errors on
parse.

This commit removes any extra space from the Consul version returned by
the API.
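
A minimal sketch of the fix, assuming the `github.com/hashicorp/go-version` parser that the fingerprint uses: trim the whitespace Consul 1.13.8 now appends before parsing.

```go
package main

import (
	"fmt"
	"strings"

	version "github.com/hashicorp/go-version"
)

func main() {
	// Consul v1.13.8 returns the version with a trailing line break; strip
	// any surrounding whitespace so NewVersion can parse it.
	raw := "1.13.8\n"
	v, err := version.NewVersion(strings.TrimSpace(raw))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(v) // 1.13.8
}
```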
When monitoring the replacement allocation, if the
`Allocations().Info()` request fails, the `alloc` variable is `nil`, so
it should not be read.
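
A sketch of the guard (the surrounding monitor loop in the CLI is omitted): `Allocations().Info` returns a nil allocation alongside an error, so the error path must return before touching it.

```go
package monitor

import (
	"fmt"

	"github.com/hashicorp/nomad/api"
)

// monitorReplacement shows the nil-guard: when Info fails, the returned
// allocation is nil and must not be dereferenced.
func monitorReplacement(client *api.Client, replacementID string) error {
	alloc, _, err := client.Allocations().Info(replacementID, nil)
	if err != nil {
		return fmt.Errorf("querying replacement allocation %q: %w", replacementID, err)
	}
	fmt.Println("replacement status:", alloc.ClientStatus)
	return nil
}
```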
Bumps [github.com/elazarl/go-bindata-assetfs](https://github.com/elazarl/go-bindata-assetfs) from 1.0.1-0.20200509193318-234c15e7648f to 1.0.1.
- [Release notes](https://github.com/elazarl/go-bindata-assetfs/releases)
- [Changelog](https://github.com/elazarl/go-bindata-assetfs/blob/master/.goreleaser.yml)
- [Commits](https://github.com/elazarl/go-bindata-assetfs/commits/v1.0.1)

---
updated-dependencies:
- dependency-name: github.com/elazarl/go-bindata-assetfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The allocrunner sends several updates to the server during the early lifecycle
of an allocation and its tasks. Clients batch up allocation updates every 200ms,
but experiments like the C2M challenge have shown that even with this batching,
servers can be overwhelmed with client updates during high volume
deployments. Benchmarking done in #9451 has shown that client updates can easily
represent ~70% of all Nomad Raft traffic.

Each allocation sends many updates during its lifetime, but only those that
change the `ClientStatus` field are critical for progressing a deployment or
kicking off a reschedule to recover from failures.

Add a priority to the client allocation sync and update the `syncTicker`
receiver so that we only send an update if there's a high priority update
waiting, or on every 5th tick. This means when there are no high priority
updates, the client will send updates at most once every 1s instead of every
200ms. Benchmarks have shown this can reduce overall Raft traffic by 10%, as
well as reduce client-to-server RPC traffic.

This changeset also switches from a channel-based collection of updates to a
shared buffer, so as to split batching from sending and prevent backpressure
onto the allocrunner when the RPC is slow. This doesn't have a major performance
benefit in the benchmarks but makes the implementation of the prioritized update
simpler.

Fixes: #9451
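
A compressed sketch of the cadence described above; the names are hypothetical and the real allocrunner/client code is more involved, but the scheduling rule is the same: flush on the next tick when an urgent (ClientStatus-changing) update is buffered, otherwise only on every 5th 200ms tick.

```go
package allocsync

import (
	"context"
	"sync/atomic"
	"time"
)

// runSyncLoop drains a shared update buffer on a fixed 200ms ticker, but
// only sends to the server when an urgent update is pending or on every
// 5th tick, i.e. at most once per second in the steady state.
func runSyncLoop(ctx context.Context, urgent *atomic.Bool, flush func()) {
	ticker := time.NewTicker(200 * time.Millisecond)
	defer ticker.Stop()

	tick := 0
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			tick++
			// Swap clears the urgent flag so a single high-priority update
			// only forces one early flush.
			if urgent.Swap(false) || tick%5 == 0 {
				flush()
			}
		}
	}
}
```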
This reverts commit ba736e4.

This was accidentally added by fat-fingered Admin push...
lhaig and others added 23 commits July 11, 2023 08:53
Co-authored-by: James Rasell <jrasell@users.noreply.github.com>
Document and test that if a namespace does not provide an `allow` or
`deny` list then those are treated as `nil` and have a different
behaviour from an empty list (`[]string{}`).
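
A hypothetical helper illustrating why `nil` and `[]string{}` must not be conflated; the convention shown (unset means "no restriction", empty means "allow nothing") is for illustration only, and the linked docs are authoritative.

```go
package nsfilter

// allowedBy applies an allow list where nil and empty are deliberately
// different: nil means no restriction was configured, while an explicitly
// empty list allows nothing.
func allowedBy(allow []string, value string) bool {
	if allow == nil {
		return true
	}
	for _, a := range allow {
		if a == value {
			return true
		}
	}
	return false
}
```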
Cannot set a user for raw_exec tasks, because doing so does not work
with the 0700 root-owned client data directory that we set up in the e2e
cluster in accordance with the Nomad hardening guide.
* docs: add plugin docs for pledge task driver

Add pledge driver to the set of Community drivers.

* docs: cr feedback
This adds a quick smoke test of our binaries to verify we haven't exceeded the
maximum GLIBC (2.17) version during linking, which would break our ability to
execute on EL7 machines.
* e2e: setup nomad for pledge driver

* e2e: add some e2e tests for pledge task driver
When looking for CSI volumes to display in the `node status` command, the
CLI needs to search all namespaces. It's also helpful to display the
volume namespace in the command output.
@hc-github-team-nomad-core force-pushed the backport/b-fix-node-status-volumes/notably-united-moccasin branch from 49ef4c5 to d23db47 on August 1, 2023 13:56
@hc-github-team-nomad-core merged commit 0a50fe6 into release/1.4.x on August 1, 2023
@hc-github-team-nomad-core deleted the backport/b-fix-node-status-volumes/notably-united-moccasin branch on August 1, 2023 13:56

github-actions bot commented Aug 1, 2023

Ember Test Audit comparison

|          | release/1.4.x | d23db47       | change |
|----------|---------------|---------------|--------|
| passes   | 1422          | 1422          | 0      |
| failures | 0             | 0             | 0      |
| flaky    | 0             | 0             | 0      |
| duration | 11m 16s 023ms | 11m 16s 078ms | +055ms |
