Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArgoCD-Repo-Server does not close https connections to git-server #3522

Closed
3 tasks done
Jaydee94 opened this issue Apr 30, 2020 · 7 comments · Fixed by #3531
Closed
3 tasks done

ArgoCD-Repo-Server does not close https connections to git-server #3522

Jaydee94 opened this issue Apr 30, 2020 · 7 comments · Fixed by #3531
Labels
bug/priority:high Should be fixed in the next patch release bug/severity:major Malfunction in one of the core component, impacting a majority of users bug Something isn't working

Comments

@Jaydee94
Copy link

Jaydee94 commented Apr 30, 2020

Checklist:

  • I've searched in the docs and FAQ for my answer: http://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

We are experiencing problems the argo-cd repo server. The application opens https connection to our git server and does not close them as expected. We have now traced the problem to the worker node where the pod is running. By attaching to the $pid of the pod we can watch the number of open sockets rise continuously.

watch "ls -lah /proc/$pid/fd/ | wc"

Using nsenter we can also get a more detailed overview over the open connections. Here is the output when we run netstat in the argo-repo-server namespace:

nsenter -t 3659754 -n netstat -pantu
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address                   State       PID/Program name
tcp        0      0 172.27.4.81:51880       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:60360       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:45428       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:36380       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:34510       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:58658       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:36830       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:35722       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:55286       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:40218       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:52948       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:45712       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp        0      0 172.27.4.81:59698       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
[....... 14k times ......]
tcp        0      0 172.27.4.81:38964       censored-by-our-it-sec:443        ESTABLISHED 3659754/argocd-repo
tcp6       0      0 :::8081                 :::*                              LISTEN      3659754/argocd-repo
tcp6       0      0 :::8084                 :::*                              LISTEN      3659754/argocd-repo
tcp6       0      0 172.27.4.81:8081        172.25.2.52:44604                 TIME_WAIT   -

To Reproduce

  • connect a argocd-application to a repo using https
  • watch the tcp connections for the $pid of the repo-server process (nsenter -t $pid-n netstat -pantu)

Expected behavior

The repo server opens connections to the git repository and closes the connections after processing.

Version

argocd: v1.5.2+c2c19f4
  BuildDate: 2020-04-15T16:41:59Z
  GitCommit: c2c19f42ad78ed7a6fb70e86aed117be484feb50
  GitTreeState: clean
  GoVersion: go1.14
  Compiler: gc
  Platform: linux/amd64
argocd-server: v1.5.2+c2c19f4
  BuildDate: 2020-04-15T16:43:12Z
  GitCommit: c2c19f42ad78ed7a6fb70e86aed117be484feb50
  GitTreeState: clean
  GoVersion: go1.14
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: Version: {Version:kustomize/v3.2.1 GitCommit:d89b448c745937f0cf1936162f26a5aac688f840 BuildDate:2019-09-27T00:10:52Z GoOs:linux GoArch:amd64}
  Helm Version: version.BuildInfo{Version:"v3.1.1", GitCommit:"afe70585407b420d0097d07b21c47dc511525ac8", GitTreeState:"clean", GoVersion:"go1.13.8"}
  Kubectl Version: v1.14.0

Our Workaround

We switched to ssh instead of https. Using ssh, the connections to out git server are closed correctly as we would expect.

Maybe the open connections are caused by the "customHTTPClient" from https://github.com/argoproj/argo-cd/blob/9e81c38c13be708cb7f1d280ae93ddeb59131305/util/git/client.go

@Jaydee94 Jaydee94 added the bug Something isn't working label Apr 30, 2020
@jannfis jannfis added the bug/in-triage This issue needs further triage to be correctly classified label Apr 30, 2020
@jannfis
Copy link
Member

jannfis commented Apr 30, 2020

Thanks for your brilliant bug report, @Jaydee94!

I could reproduce the issue reliably and can confirm it. The repo server is indeed leaking TCP connections for HTTPS repository requests. I'd consider this bug as severe.

@jannfis jannfis added bug/priority:high Should be fixed in the next patch release bug/severity:major Malfunction in one of the core component, impacting a majority of users and removed bug/in-triage This issue needs further triage to be correctly classified labels Apr 30, 2020
@jannfis
Copy link
Member

jannfis commented May 1, 2020

I think I found the root cause, and it was indeed the custom HTTP client implementation that was keeping HTTP connections alive.

I just submitted a PR with the fix (#3531) - with this fix applied, TCP connections to Git repositories connected via HTTPS should be closed correctly and in time now, at least they do in my reproduction environment.

@szEvEz
Copy link

szEvEz commented May 18, 2020

Hi,

as we are also running into this issue, I was just curious when this bugfix will be released? Thanks for the great work!

@Jaydee94
Copy link
Author

@szEvEz As far as i know the bugfix is already part of the newest release 1.5.4

@jannfis
Copy link
Member

jannfis commented May 18, 2020

@szEvEz @Jaydee94 Sorry, this patch didn't make it in the 1.5.x branch yet. We thought to give it a thorough test before merging it in one of the fix release.

@alexmt what is the plan for this fix? Could we collect a few metrics on latency or performance issues?

@szEvEz
Copy link

szEvEz commented May 19, 2020

@jannfis @Jaydee94 Thanks for reaching out so quickly! And really no need to apologise, just wanted to check the current status :) . Thanks

@sastorsl
Copy link

sastorsl commented May 5, 2023

We've hit the exact same issue, but with regards to helm chart updates.
A massive amount of https sessions never expire, and are (seems to be) related to keepalive.
I'll open a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug/priority:high Should be fixed in the next patch release bug/severity:major Malfunction in one of the core component, impacting a majority of users bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants