High envoy CPU when multiple ingress per group #10826

Closed
MikeN123 opened this issue Jun 28, 2021 · 3 comments · Fixed by #10883
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/consul/connect Consul Connect integration type/bug

Comments

@MikeN123
Contributor

Nomad version

Nomad v1.1.2

Consul version

Consul v1.10.0

Operating system and Environment details

Debian Linux 10.

Mostly a dev-setup (no TLS, no ACLs), so server and client and Consul are all on the same host.

Issue

When running multiple ingress gateways in one task group, only 1 gateway behaves 'normally' when it comes to CPU usage. The other gateways seem to loop and use 100% CPU.

If I place the gateways in different task groups, everything behaves normally.

Reproduction steps

Run the job file below (from the e2e tests in this repo) and monitor CPU usage.

Expected Result

All Envoys are idle and do not use any CPU.

Actual Result

2 of the 3 Envoys use 100% CPU.

Job file (if appropriate)

https://github.com/hashicorp/nomad/blob/1b68a1d067d9682c18770eef085738a2fe51e950/e2e/connect/input/multi-ingress.nomad

Nomad logs (if appropriate)

I'm not sure which logs to check. I did not see anything out of the ordinary in the Nomad or Envoy logs. I expected to see something retrying or looping, but found nothing of the kind.

@shoenig added the theme/consul/connect and stage/accepted labels Jun 28, 2021
@shoenig added this to Needs Triage in Nomad - Community Issues Triage via automation Jun 28, 2021
@shoenig self-assigned this Jun 28, 2021
@shoenig
Member

shoenig commented Jun 28, 2021

Thanks for reporting, @MikeN123! I vaguely remember looking into this in the past but didn't come to a conclusion. I'll try again this week. I suspect envoy might be silently trying to bind the default admin interface in a loop, but that's just a guess.

@shoenig
Member

shoenig commented Jul 9, 2021

So it's not the admin listener, to which we correctly assign ports 19000, 19001, 19002 (19000 + N), but rather this ready listener, which defaults to port 8443:

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.1:19000         0.0.0.0:*               LISTEN      -                   
tcp        0      0 127.0.0.1:19001         0.0.0.0:*               LISTEN      -                   
tcp        0      0 127.0.0.1:19002         0.0.0.0:*               LISTEN      -                   
tcp        0      0 127.0.0.1:8443          0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:8081            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:8082            0.0.0.0:*               LISTEN      -                   
tcp        0      0 0.0.0.0:8083            0.0.0.0:*               LISTEN      -

      "name": "envoy_ready_listener",
      "address": {
       "socket_address": {
        "address": "127.0.0.1",
        "port_value": 8443
       }

Or at least, that's what I assume, because nothing about the listener for this port shows up in logs.
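
To illustrate the suspected conflict, here is a minimal Python sketch (not Nomad or Envoy code) of several processes that all try to listen on the same loopback port. The helper names `try_listen` and `simulate_gateways` are made up for this example, and an OS-assigned free port stands in for 8443. It only shows the underlying bind collision; it does not reproduce Envoy's retry behavior or the CPU usage.

```python
import errno
import socket


def try_listen(host, port):
    """Bind and listen on host:port; return the socket, or None if the address is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
        return sock
    except OSError as err:
        sock.close()
        if err.errno == errno.EADDRINUSE:
            return None
        raise


def simulate_gateways(count, host="127.0.0.1"):
    """Model `count` listeners that all want the same default port.

    Returns a list of booleans, True where the bind succeeded.
    """
    # Port 0 lets the OS pick a free port; it stands in for the shared default (8443 in this issue).
    first = try_listen(host, 0)
    shared_port = first.getsockname()[1]
    held = [first]
    results = [True]
    for _ in range(count - 1):
        sock = try_listen(host, shared_port)
        results.append(sock is not None)
        if sock is not None:
            held.append(sock)
    # Release every socket that was opened successfully.
    for sock in held:
        sock.close()
    return results


if __name__ == "__main__":
    print(simulate_gateways(3))
```

With three simulated gateways, only the first bind succeeds and the other two are rejected with "address already in use", which would match seeing a single 8443 listener next to three admin listeners in the netstat output above.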

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2022
2 participants