Retrying to start the Informer on AKS never succeeds and ends up in immediate ECONNRESET #589
Comments
@ivanstanev I think it should help. I just pushed a new release (0.14.0) with the fix, please try it out!
Thanks for the super quick response, going to try it out now and let it run over the weekend and let you know how it goes!
@brendandburns I think the recent |
@k8s-triage-robot: Closing this issue.
Hey 👋
We've been users of the client library in our product for a while now. Recently we noticed that connections to AKS suddenly get interrupted (roughly 5 minutes after start) and we stop getting notified of new workloads in the cluster. We noticed it is because we didn't have an error handler (as defined in the example: https://github.com/kubernetes-client/javascript/blob/master/examples/typescript/informer/informer.ts#L16-L22) and that AKS has a Load Balancer for the K8s API server that interrupts long-running connections after 5 minutes by default.
So we added the setTimeout() + informer.start() to try and fix this.
However, this does not help: the informer ends up in an infinite loop where the API server immediately returns ECONNRESET, the informer tries to restart after 5 seconds (due to setTimeout()), and our app never recovers. It stays stuck receiving ECONNRESET and retrying indefinitely. Killing our Pod and starting from scratch fixes this, until the API server drops the connection again and we end up in the same loop of ECONNRESET and informer restarts.
We are using version 0.13.2 of the library.
I noticed the recent PR #576 fixes a connection leak and ensures abort() is called on the connection. Do you think this is related, and would it help once it lands in a new release?
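For reference, here is a rough sketch of the restart-on-error pattern described above, with an exponential backoff instead of a fixed 5 second delay. It is not the library's API: RestartableInformer is a hypothetical stand-in for the two informer methods used here (on('error', ...) and start()), and backoffDelayMs and keepInformerAlive are made-up helper names. It also would not fix a connection leak inside the library itself; it only avoids hammering the API server while restarts keep failing.

```typescript
// Hypothetical minimal surface of an informer; an assumption for this sketch,
// not a type exported by the client library.
interface RestartableInformer {
  on(event: "error", handler: (err: unknown) => void): void;
  start(): Promise<void>;
}

// Exponential backoff: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 5000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Registers an error handler that restarts the informer after a growing delay.
// `schedule` is injectable so the timing can be faked; it defaults to setTimeout.
function keepInformerAlive(
  informer: RestartableInformer,
  schedule: (fn: () => void, ms: number) => unknown = (fn, ms) => setTimeout(fn, ms),
): void {
  let attempt = 0;
  let restartPending = false;

  const scheduleRestart = (err: unknown): void => {
    if (restartPending) {
      return; // a restart is already queued; do not stack another one
    }
    restartPending = true;
    const delay = backoffDelayMs(attempt);
    attempt += 1; // never reset in this sketch; a real version may reset after a healthy period
    console.error(`informer error, restarting in ${delay} ms`, err);
    schedule(() => {
      restartPending = false;
      // If start() itself rejects, queue another restart with a longer delay.
      informer.start().catch(scheduleRestart);
    }, delay);
  };

  informer.on("error", scheduleRestart);
}
```

Usage would be a single call such as keepInformerAlive(informer) before the first informer.start(). The restartPending guard is there so that several error events arriving in a burst lead to one restart rather than several concurrent ones.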