Bug Report: vtorc Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings" #15726

Closed
L3o-pold opened this issue Apr 16, 2024 · 3 comments
Labels
Component: VTorc Vitess Orchestrator integration Type: Bug

Comments

@L3o-pold
Collaborator

L3o-pold commented Apr 16, 2024

Overview of the Issue

Since upgrading from 18 to 19.0.3, I see a lot of log lines in vtorc saying:

Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings"

It logs this roughly 90 times per minute (with default vtorc flags).

I tried tweaking the flags --grpc_keepalive_time=30s and --grpc_keepalive_timeout=30s, but saw no improvement.

I also noticed an increase in memory consumption. With a 512Mi memory limit, we saw a lot of OOM kills. Prior to upgrading to 19.x we never hit an OOM.

Reproduction Steps

Running a small sharded cluster:

  • 2 vtorc
  • 2 keyspaces
  • 9 tablets
  • 2 vtgate
  • 1 vtctld
  • 3 etcd

Binary Version

vtorc version Version: 19.0.3 (Git revision cb5464edf5d7075feae744f3580f8bc626d185aa branch 'HEAD') built on Thu Apr  4 12:17:21 UTC 2024 by vitess@buildkitsandbox using go1.22.2 linux/amd64

Operating System and Environment details

docker

Log Fragments

vtorc-5dc96749c7-d7krt vtorc E0416 20:22:43.986560       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:44.988277       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:44.988471       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:45.989058       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:46.011874       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:46.984802       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:48.985185       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:48.985318       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:49.984356       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:49.984507       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:50.986921       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:50.987095       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:51.984797       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:51.984898       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:53.041251       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:54.987922       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:54.989548       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:55.986758       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:55.994977       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:56.984674       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:56.987330       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:57.985635       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:57.986287       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:22:59.981554       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:00.986283       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:00.986497       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:02.011700       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:02.011791       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:02.989077       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:02.989077       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:03.985710       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:03.985828       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:05.985778       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:07.012021       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:07.012023       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:07.986159       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:07.987660       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:08.987172       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:08.987325       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
vtorc-5dc96749c7-d7krt vtorc E0416 20:23:09.988717       1 component.go:46] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".
L3o-pold added the Type: Bug and Needs Triage labels on Apr 16, 2024
@L3o-pold
Collaborator Author

Looks like there is a memory leak somewhere.

rohit-nayak-ps added the Component: VTorc label and removed the Needs Triage label on Apr 17, 2024
@GuptaManan100
Member

Hello! First off, thank you @L3o-pold for finding this bug! I spent all day yesterday and some time today trying to find the issue and I have found it.

The changes in #15562, which were backported to v19, cause the tablet manager to create a pool of connections for FullStatus RPC calls. These connections are never closed until we hit a gRPC failure.

The problem happened because we hadn't backported the change #15356 to v19! That PR changes VTOrc to use a single TMC for all the RPC calls. Without this change in v19.0.3, VTOrc creates a new TMC connection for every FullStatus RPC, which in turn creates 8 gRPC connections that are never closed, so the number of gRPC connections keeps growing.

The fix is to backport #15356 to v19 and do a new patch release.
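
To make the failure mode described above concrete, here is a minimal, self-contained Go sketch. It is not Vitess code: `tracker`, `client`, `pollLeaky`, and `pollShared` are hypothetical names, connections are only counted rather than dialed, and the pool size of 8 is taken from the description above. It simply contrasts creating a client per call with reusing a single client.

```go
package main

import "fmt"

// poolSize mirrors the "8 gRPC connections" per client described above.
const poolSize = 8

// tracker counts connections that were opened and never closed.
// Hypothetical helper for this illustration only.
type tracker struct{ open int }

// client is a hypothetical stand-in for a tablet manager client (TMC)
// that eagerly opens a fixed-size pool of connections.
type client struct{ pool []int }

// newClient opens poolSize connections and records them in the tracker.
func (t *tracker) newClient() *client {
	c := &client{}
	for i := 0; i < poolSize; i++ {
		t.open++
		c.pool = append(c.pool, t.open)
	}
	return c
}

// fullStatus pretends to issue one status RPC against a tablet.
func (c *client) fullStatus(tablet string) string {
	return "ok:" + tablet
}

// pollLeaky creates a new client for every call and never closes it,
// so open connections grow with every poll.
func pollLeaky(tablets []string, rounds int) int {
	t := &tracker{}
	for r := 0; r < rounds; r++ {
		for _, tablet := range tablets {
			c := t.newClient()
			_ = c.fullStatus(tablet)
		}
	}
	return t.open
}

// pollShared reuses one client for all calls, so the number of open
// connections stays at poolSize regardless of how many polls happen.
func pollShared(tablets []string, rounds int) int {
	t := &tracker{}
	c := t.newClient()
	for r := 0; r < rounds; r++ {
		for _, tablet := range tablets {
			_ = c.fullStatus(tablet)
		}
	}
	return t.open
}

func main() {
	tablets := []string{"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"}
	fmt.Println("leaky: ", pollLeaky(tablets, 10))  // 9 tablets * 10 rounds * 8 = 720
	fmt.Println("shared:", pollShared(tablets, 10)) // 8
}
```

With 9 tablets, as in the cluster from the report, the per-call pattern in this toy model accumulates 72 open connections per polling round, while the shared client stays at 8. That unbounded growth would be consistent with both the rising memory usage and the volume of keepalive pings the servers reject.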

@czxin788

VTorc occupies too much memory.
