Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport of ci: fix test splits that have less test packages than runner count from hanging into release/1.15.x #17085

Merged
merged 332 commits into from
Apr 21, 2023

Conversation

hc-github-team-consul-core
Copy link
Collaborator

Backport

This PR is auto-generated from #17080 to be assessed for backporting due to the inclusion of the label backport/1.15.

The below text is copied from the body of the original PR.


Description

In 1.13, there are only two compatibility test packages. So when a generate matrix job has a TOTAL_RUNNERS of 5 but there are only 2 packages, the process hangs. This PR changes the runner count to 2 in this case.

Another issue that occurs that is fixed in this PR is that when you specify TOTAL_RUNNERS = 4, you will actually split across 5 runners. (The same occurs with any number you set TOTAL_RUNNERS to. You will actually get 1 greater.)


Overview of commits

rboyer and others added 30 commits February 28, 2023 14:37
… VERSION file (#16467)

Fixes a regression from #15631 in the output of `consul version` from:

    Consul v1.16.0-dev
    +ent
    Revision 56b86ac+CHANGES

to

    Consul v1.16.0-dev+ent
    Revision 56b86ac+CHANGES
* converted main services page to services overview page

* set up services usage dirs

* added Define Services usage page

* converted health checks everything page to Define Health Checks usage page

* added Register Services and Nodes usage page

* converted Query with DNS to Discover Services and Nodes Overview page

* added Configure DNS Behavior usage page

* added Enable Static DNS Lookups usage page

* added the Enable Dynamic Queries DNS Queries usage page

* added the Configuration dir and overview page - may not need the overview, tho

* fixed the nav from previous commit

* added the Services Configuration Reference page

* added Health Checks Configuration Reference page

* updated service defaults configuraiton entry to new configuration ref format

* fixed some bad links found by checker

* more bad links found by checker

* another bad link found by checker

* converted main services page to services overview page

* set up services usage dirs

* added Define Services usage page

* converted health checks everything page to Define Health Checks usage page

* added Register Services and Nodes usage page

* converted Query with DNS to Discover Services and Nodes Overview page

* added Configure DNS Behavior usage page

* added Enable Static DNS Lookups usage page

* added the Enable Dynamic Queries DNS Queries usage page

* added the Configuration dir and overview page - may not need the overview, tho

* fixed the nav from previous commit

* added the Services Configuration Reference page

* added Health Checks Configuration Reference page

* updated service defaults configuraiton entry to new configuration ref format

* fixed some bad links found by checker

* more bad links found by checker

* another bad link found by checker

* fixed cross-links between new topics

* updated links to the new services pages

* fixed bad links in scale file

* tweaks to titles and phrasing

* fixed typo in checks.mdx

* started updating the conf ref to latest template

* update SD conf ref to match latest CT standard

* Apply suggestions from code review

Co-authored-by: Eddie Rowe <74205376+eddie-rowe@users.noreply.github.com>

* remove previous version of the checks page

* fixed cross-links

* Apply suggestions from code review

Co-authored-by: Eddie Rowe <74205376+eddie-rowe@users.noreply.github.com>

---------

Co-authored-by: Eddie Rowe <74205376+eddie-rowe@users.noreply.github.com>
Does the required dance with the local HTTP endpoint to get the required
data for the jwt based auth setup in Azure. Keeps support for 'legacy'
mode where all login data is passed on via the auth methods parameters.
Refactored check for hardcoded /login fields.
* Changed titles for services pages to sentence style cap

* missed a meta title
* add new release notes
---------

Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
* fix: return error msg if acl policy not found

* changelog

* add test
…s and policies (#16288)

* Deprecate merge-policies and add options add-policy-name/add-policy-id to improve CLI token update command

* deprecate merge-roles fields

* Fix potential flakey tests and update ux to remove 'completely' + typo fixes
* Update v1_15_x.mdx

---------

Co-authored-by: Tu Nguyen <im2nguyen@users.noreply.github.com>
* Suppress AlreadyRegisteredError to fix test retries

* Remove duplicate sink
Adds support for a jwt token in a file. Simply reads the file and sends
the read in jwt along to the vault login.

It also supports a legacy mode with the jwt string being passed
directly. In which case the path is made optional.
Adds support for Kubernetes jwt/token file based auth. Only needs to
read the file and save the contents as the jwt/token.
NET-2396: refactor test to reduce duplication
NET-2841: PART 1 - refactor NewPeeringCluster to support custom config
…tingGateway upstream timeouts configurable (#16495)

* Leverage ServiceResolver ConnectTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable

* Regenerate golden files

* Add RequestTimeout field

* Add changelog entry
…y cleaned up (#16498)

* Fix issue where terminating gateway service resolvers weren't properly cleaned up

* Add integration test for cleaning up resolvers

* Add changelog entry

* Use state test and drop integration test
- When an envoy version is out of a supported range, we now return the envoy version being used as `major.minor.x` to indicate that it is the minor version at most that is incompatible
- When an envoy version is in the list of unsupported envoy versions we return back the envoy version in the error message as `major.minor.patch` as now the exact version matters.
…ms (#16499)

* Fix resolution of service resolvers with subsets for external upstreams

* Add tests

* Add changelog entry

* Update view filter logic
* fixed broken links associated with cluster peering updates

* additional links to fix

* typos

* fixed redirect file
Adds support for the approle auth-method. Only handles using the approle
role/secret to auth and it doesn't support the agent's extra management
configuration options (wrap and delete after read) as they are not
required as part of the auth (ie. they are vault agent things).
Updated Params field to re-frame as supporting arguments specific to the
supported vault-agent auth-auth methods with links to each methods
"#configuration" section.
Included a call out limits on parameters supported.
…ds session and triggers a replacement proxycfg watcher (#16497)

Receiving an "acl not found" error from an RPC in the agent cache and the
streaming/event components will cause any request loops to cease under the
assumption that they will never work again if the token was destroyed. This
prevents log spam (#14144, #9738).

Unfortunately due to things like:

- authz requests going to stale servers that may not have witnessed the token
  creation yet

- authz requests in a secondary datacenter happening before the tokens get
  replicated to that datacenter

- authz requests from a primary TO a secondary datacenter happening before the
  tokens get replicated to that datacenter

The caller will get an "acl not found" *before* the token exists, rather than
just after. The machinery added above in the linked PRs will kick in and
prevent the request loop from looping around again once the tokens actually
exist.

For `consul-dataplane` usages, where xDS is served by the Consul servers
rather than the clients ultimately this is not a problem because in that
scenario the `agent/proxycfg` machinery is on-demand and launched by a new xDS
stream needing data for a specific service in the catalog. If the watching
goroutines are terminated it ripples down and terminates the xDS stream, which
CDP will eventually re-establish and restart everything.

For Consul client usages, the `agent/proxycfg` machinery is ahead-of-time
launched at service registration time (called "local" in some of the proxycfg
machinery) so when the xDS stream comes in the data is already ready to go. If
the watching goroutines terminate it should terminate the xDS stream, but
there's no mechanism to re-spawn the watching goroutines. If the xDS stream
reconnects it will see no `ConfigSnapshot` and will not get one again until
the client agent is restarted, or the service is re-registered with something
changed in it.

This PR fixes a few things in the machinery:

- there was an inadvertent deadlock in fetching snapshot from the proxycfg
  machinery by xDS, such that when the watching goroutine terminated the
  snapshots would never be fetched. This caused some of the xDS machinery to
  get indefinitely paused and not finish the teardown properly.

- Every 30s we now attempt to re-insert all locally registered services into
  the proxycfg machinery.

- When services are re-inserted into the proxycfg machinery we special case
  "dead" ones such that we unilaterally replace them rather that doing that
  conditionally.
* NET-2903 Normalize weight for http routes

* Update website/content/docs/connect/gateways/api-gateway/configuration/http-route.mdx

Co-authored-by: trujillo-adam <47586768+trujillo-adam@users.noreply.github.com>
* Add some basic ui improvements for api-gateway services

* Add changelog entry

* Use ternary for null check

* Update gateway doc links

* rename changelog entry for new PR

* Fix test
@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/jm/runner-count/hideously-fit-sturgeon branch 6 times, most recently from 4652b52 to 0ede8a6 Compare April 21, 2023 16:03
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto approved Consul Bot automated PR

@hc-github-team-consul-core hc-github-team-consul-core force-pushed the backport/jm/runner-count/hideously-fit-sturgeon branch from 7f0bf0d to 7ae41c5 Compare April 21, 2023 16:03
@github-actions github-actions bot added pr/dependencies PR specifically updates dependencies of project theme/agent-cache Agent Cache theme/api Relating to the HTTP API interface theme/certificates Related to creating, distributing, and rotating certificates in Consul theme/cli Flags and documentation for the CLI interface theme/config Relating to Consul Agent configuration, including reloading theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/consul-terraform-sync Relating to Consul Terraform Sync and Network Infrastructure Automation theme/contributing Additions and enhancements to community contributing materials theme/envoy/xds Related to Envoy support theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics theme/tls Using TLS (Transport Layer Security) or mTLS (mutual TLS) to secure communication theme/ui Anything related to the UI type/ci Relating to continuous integration (CI) tooling for testing or releases type/docs Documentation needs to be created/updated/clarified labels Apr 21, 2023
@jmurret jmurret disabled auto-merge April 21, 2023 16:29
@jmurret jmurret enabled auto-merge (squash) April 21, 2023 16:29
@jmurret jmurret merged commit 5493cf6 into release/1.15.x Apr 21, 2023
@jmurret jmurret deleted the backport/jm/runner-count/hideously-fit-sturgeon branch April 21, 2023 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/dependencies PR specifically updates dependencies of project theme/agent-cache Agent Cache theme/api Relating to the HTTP API interface theme/certificates Related to creating, distributing, and rotating certificates in Consul theme/cli Flags and documentation for the CLI interface theme/config Relating to Consul Agent configuration, including reloading theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies theme/consul-terraform-sync Relating to Consul Terraform Sync and Network Infrastructure Automation theme/contributing Additions and enhancements to community contributing materials theme/envoy/xds Related to Envoy support theme/internals Serf, Raft, SWIM, Lifeguard, Anti-Entropy, locking topics theme/tls Using TLS (Transport Layer Security) or mTLS (mutual TLS) to secure communication theme/ui Anything related to the UI type/ci Relating to continuous integration (CI) tooling for testing or releases type/docs Documentation needs to be created/updated/clarified
Projects
None yet
Development

Successfully merging this pull request may close these issues.