What is the issue?
I upgraded our Linkerd installation from 2.11 to 2.12.1 two days ago. Since then, I've been seeing random readiness and/or liveness probe failures from the linkerd-destination pod.
The pod always seems to recover in time, but the events show up in our monitoring dashboard, and I'd like to find out whether we can safely ignore them or whether something is wrong.
How can it be reproduced?
I don't know how to reproduce it. These events happen at random times throughout the day, including at night when nothing is happening on our cluster.
Logs, error output, etc
These are the various events we're seeing:
Oct 12, 2022 @ 04:16:35.751 Readiness probe failed: Get "http://172.16.36.138:9990/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Oct 12, 2022 @ 04:02:25.753 Readiness probe failed: Get "http://172.16.36.138:9990/ready": dial tcp 172.16.36.138:9990: i/o timeout (Client.Timeout exceeded while awaiting headers)
Oct 12, 2022 @ 04:02:25.744 Liveness probe failed: Get "http://172.16.36.138:9990/live": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Oct 12, 2022 @ 02:34:15.744 Liveness probe failed: Get "http://172.16.36.138:9990/live": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Oct 12, 2022 @ 00:21:45.744 Readiness probe failed: Get "http://172.16.36.138:9990/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
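The events above are HTTP GETs against /ready and /live on port 9990 that did not answer in time. A rough way to check whether that endpoint is slow on its own is to issue the same kind of request by hand from inside the cluster. The sketch below is only an approximation of a kubelet HTTP probe, not its actual implementation: the `probe` helper is hypothetical, the 1-second timeout is the Kubernetes default for `timeoutSeconds` and may not match what the Linkerd manifests set, and the timeout here applies per socket operation rather than as one overall deadline.

```python
import http.client
import time


def probe(host, port, path, timeout_s=1.0):
    """Issue one GET and return (outcome, elapsed_seconds).

    Mirrors the kubelet rule that any status from 200 to 399 counts as
    success. timeout_s is an assumed value, not read from the pod spec.
    """
    start = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
        outcome = "ok" if 200 <= status < 400 else f"status {status}"
    except (OSError, http.client.HTTPException) as exc:
        # Covers connect timeouts, read timeouts, refused connections, resets.
        outcome = f"error: {exc!r}"
    finally:
        conn.close()
    return outcome, time.monotonic() - start
```

Calling `probe("172.16.36.138", 9990, "/ready")` in a loop and logging the elapsed time would show whether responses creep toward the timeout before a failure event appears.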
I've looked at the logs of the affected pod and only found error-like entries in the policy container. I've pasted the entire log, since it also contains 'info' entries whose messages sound like errors:
(add 2 hours to the timestamps below to match the event timestamps)
2022-10-10T13:31:18.651751Z INFO grpc{port=8090}: linkerd_policy_controller: gRPC server listening addr=0.0.0.0:8090
2022-10-10T13:35:23.608556Z INFO meshtlsauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:35:23.976584Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:35:24.076786Z INFO authorizationpolicies: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120218487 (120221055): Expired
2022-10-10T13:35:24.560669Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:35:24.728935Z INFO httproutes: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120218487 (120221058): Expired
2022-10-10T13:35:24.949501Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:35:25.566212Z INFO servers: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:35:25.840335Z INFO servers: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120218487 (120221063): Expired
2022-10-10T13:35:38.219469Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:39:29.366869Z INFO meshtlsauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:39:29.735517Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:39:34.074919Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:39:34.875888Z INFO networkauthentications: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120218487 (120222748): Expired
2022-10-10T13:39:35.303290Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:43:41.022288Z INFO meshtlsauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T13:43:44.877451Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T14:26:30.355741Z INFO meshtlsauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T14:47:23.025370Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T18:57:58.177987Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T23:05:45.769316Z INFO pods: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-10T23:09:51.423789Z INFO pods: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T01:48:46.123539Z INFO meshtlsauthentications: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120484163 (120486907): Expired
2022-10-11T02:55:48.803715Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T02:59:55.241793Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T05:16:31.956125Z INFO pods: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T05:26:21.995063Z INFO pods: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: timed out
2022-10-11T05:30:31.730465Z INFO pods: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T07:46:22.178523Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T07:50:33.695771Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T15:38:03.061691Z INFO server{port=9443}:conn{client.ip=172.16.36.138 client.port=46290}: kubert::server: Connection lost error=connection error: unexpected end of file
2022-10-11T21:47:22.726596Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:01:39.622617Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:04:19.318943Z WARN meshtlsauthentications: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:19.332348Z WARN authorizationpolicies: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.827919Z WARN pods: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.882406Z WARN meshtlsauthentications: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.892552Z WARN serverauthorizations: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.892625Z WARN authorizationpolicies: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.894460Z WARN networkauthentications: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.894648Z WARN httproutes: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.894713Z WARN servers: kube_client::client: eof in poll: error reading a body from connection: error reading a body from connection: unexpected EOF during chunk size line
2022-10-11T22:04:34.901507Z ERROR authorizationpolicies: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.901526Z INFO authorizationpolicies: kubert::errors: stream failed error=failed to start watching object: HyperError: error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.901842Z ERROR networkauthentications: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.901938Z INFO networkauthentications: kubert::errors: stream failed error=failed to start watching object: HyperError: error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.902569Z ERROR serverauthorizations: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.902847Z INFO serverauthorizations: kubert::errors: stream failed error=failed to start watching object: HyperError: error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.903652Z ERROR servers: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:34.903667Z INFO servers: kubert::errors: stream failed error=failed to start watching object: HyperError: error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:36.748702Z INFO meshtlsauthentications: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120928647 (120930940): Expired
2022-10-11T22:04:36.756656Z ERROR meshtlsauthentications: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:36.756837Z INFO meshtlsauthentications: kubert::errors: stream failed error=failed to perform initial object list: HyperError: error trying to connect: tcp connect error: Connection refused (os error 111)
2022-10-11T22:04:40.050739Z INFO networkauthentications: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120929532 (120930967): Expired
2022-10-11T22:04:40.155520Z INFO servers: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120929405 (120930854): Expired
2022-10-11T22:04:41.172888Z INFO httproutes: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120929533 (120930898): Expired
2022-10-11T22:04:41.250555Z INFO serverauthorizations: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120929406 (120930969): Expired
2022-10-11T22:04:41.372005Z INFO authorizationpolicies: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120927410 (120930873): Expired
2022-10-11T22:09:13.428716Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:09:13.542826Z INFO servers: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:09:13.857861Z INFO servers: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120931016 (120932649): Expired
2022-10-11T22:09:14.549683Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:09:14.743844Z INFO httproutes: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120931023 (120932662): Expired
2022-10-11T22:09:14.770412Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:09:14.982397Z INFO serverauthorizations: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120931025 (120932665): Expired
2022-10-11T22:13:40.120252Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:13:41.688142Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:13:46.533429Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:13:46.646938Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T22:17:53.027175Z INFO serverauthorizations: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T23:21:00.884975Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-11T23:21:06.032865Z INFO authorizationpolicies: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120956659 (120959083): Expired
2022-10-12T00:09:23.556347Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-12T00:09:59.689070Z INFO meshtlsauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-12T00:14:35.778036Z INFO meshtlsauthentications: kubert::errors: stream failed error=error returned by apiserver during watch: too old resource version: 120975018 (120978685): Expired
2022-10-12T01:21:31.361181Z INFO networkauthentications: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-12T07:14:28.606800Z INFO httproutes: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-12T08:36:40.356749Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
2022-10-12T08:41:08.717814Z INFO authorizationpolicies: kubert::errors: stream failed error=watch stream failed: Error reading events stream: error reading a body from connection: error reading a body from connection: Connection reset by peer (os error 104)
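Because the log timestamps are two hours behind the event timestamps, lining the two up by eye is error-prone. The following is a minimal sketch of how the shift and a time-window match could be automated. The helper names (`parse_line`, `lines_near`), the regular expression, and the 5-minute window are illustrative choices, and the fixed +2h offset is taken from the note above the log.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 2022-10-11T22:04:34.901842Z ERROR networkauthentications: kube_client::...
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+Z\s+"
    r"(?P<level>[A-Z]+)\s+(?P<target>[^:]+):"
)


def parse_line(line, offset_hours=2):
    """Return (shifted_time, level, target), or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S")
    # Shift the log time so it is comparable with the event timestamps.
    return ts + timedelta(hours=offset_hours), m.group("level"), m.group("target")


def lines_near(log_lines, event_time, window_minutes=5):
    """Return the log lines whose shifted time is within the window of event_time."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in log_lines:
        parsed = parse_line(line)
        if parsed is not None and abs(parsed[0] - event_time) <= window:
            hits.append(line)
    return hits
```

For example, `lines_near(log_lines, datetime(2022, 10, 12, 0, 21, 45))` returns the log entries that fall within five minutes of the 00:21:45 readiness failure, if there are any.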
output of linkerd check -o short
Linkerd core checks
===================
linkerd-version
---------------
‼ cli is up-to-date
is running version 2.11.1 but the latest stable version is 2.12.1
see https://linkerd.io/2.11/checks/#l5d-version-cli for hints
control-plane-version
---------------------
‼ viz extension proxies are up-to-date
some proxies are not running the current version:
* metrics-api-595c7b564-7ls6t (stable-2.11.4)
* prometheus-77b9558b4b-4nqjm (stable-2.11.4)
* tap-7f8f67546f-x624j (stable-2.11.4)
* tap-injector-6b6c5c86d4-cqsv5 (stable-2.11.4)
* web-6756f5956c-z4kdl (stable-2.11.4)
see https://linkerd.io/2.11/checks/#l5d-viz-proxy-cp-version for hints
‼ viz extension proxies and cli versions match
grafana-db56d7cb4-qm44p running but cli running stable-2.11.1
see https://linkerd.io/2.11/checks/#l5d-viz-proxy-cli-version for hints
Status check results are √
Environment
Kubernetes version: 1.24.3
Cluster environment: AKS
Host OS: Linux (Ubuntu 18.04)
Linkerd version: 2.12.1
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
No response
Based on those logs, it seems like the Kubernetes API is refusing connections from the policy controller:
2022-10-11T22:04:34.901842Z ERROR networkauthentications: kube_client::client::builder: failed with error error trying to connect: tcp connect error: Connection refused (os error 111)
It's unclear why this would happen and we haven't seen this in our testing. If you have concrete steps to reliably reproduce this, we can investigate. Otherwise, I would recommend increasing the log level to see if there are any more clues or using tools such as tcpdump to try to understand why the connection is being refused.
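As a lighter-weight complement to tcpdump, a small script running in the same pod network could record when plain TCP connections to the API server are refused as opposed to timing out. This is only a sketch under assumptions: `check_connect` is a hypothetical helper, the 2-second timeout is arbitrary, and it tests raw TCP reachability only, not TLS or the watch requests the policy controller actually makes.

```python
import socket


def check_connect(host, port, timeout_s=2.0):
    """Attempt one TCP connection and classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "connected"
    except ConnectionRefusedError:
        # Corresponds to "Connection refused (os error 111)" in the logs.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Inside a pod, the API server address is available in the standard `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables, so the function could be called every few seconds with those values and any result other than "connected" logged with a timestamp.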
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
@AlexGoris-KasparSolutions any luck with this using the latest stable versions? In 2.12.4 we added an improvement that makes probes behave better, particularly in clusters with lots of resources deployed. If you still experience the issue, it would also be very helpful to get more detailed information, as suggested by @adleong.