
Fix leak of connection to OpenShift #5902

Closed

sleshchenko opened this issue Aug 4, 2017 · 8 comments
Labels
kind/task Internal things, technical debt, and to-do tasks to be performed. · status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering.

@sleshchenko (Member)

When an OpenShift workspace is started, we see the following output in the Tomcat logs:

2017-08-03 15:44:15,559[aceSharedPool-0]  [INFO ] [o.e.c.a.w.s.WorkspaceManager 411]    - Workspace 'che:testx19' with id 'workspace8go3un4udnwr1pfd' started by user 'undefined'
Aug 03, 2017 3:49:01 PM okhttp3.internal.platform.Platform log
WARNING: A connection to https://192.168.42.246:8443/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
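
As the warning itself suggests, the allocation site of the leaked response can be traced by raising the OkHttpClient JUL logger to FINE early in the server's lifetime. A minimal sketch of doing that (the class name is a placeholder):

import java.util.logging.Level;
import java.util.logging.Logger;

import okhttp3.OkHttpClient;

public class EnableOkHttpLeakTracing {
  public static void main(String[] args) {
    // With the logger at FINE, each "connection was leaked" warning also
    // prints the stack trace of the call that allocated the leaked response.
    // This must run in the same JVM (the Che server) before the leak occurs.
    Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
  }
}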

Creation of every OpenShift client is placed in a try-with-resources block, so we need to investigate why the connection still leaks and fix it.
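
For reference, the pattern mentioned above looks roughly like this. This is a minimal sketch using the fabric8 client API, not the actual Che code; the namespace and the listing call are placeholders:

import io.fabric8.openshift.client.DefaultOpenShiftClient;
import io.fabric8.openshift.client.OpenShiftClient;

public class ClientUsageSketch {
  public static void main(String[] args) {
    // The client is created in a try-with-resources block, so close() is
    // always called when the block exits.
    try (OpenShiftClient client = new DefaultOpenShiftClient()) {
      // Placeholder call; the leak warning suggests that some response body
      // opened during calls like this is never closed.
      client.pods().inNamespace("che").list().getItems()
          .forEach(pod -> System.out.println(pod.getMetadata().getName()));
    }
  }
}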

@sleshchenko sleshchenko added kind/task Internal things, technical debt, and to-do tasks to be performed. team/platform labels Aug 4, 2017
@l0rd (Contributor) commented Aug 4, 2017

This issue contains some analysis of various problems (including memory leaks) in the OpenShift client. Since we don't use JIRA anymore, we are tracking it here.

@sleshchenko (Member, Author)

@l0rd As far as I understand, the described issues are mostly not in the OpenShift connector but in the fabric8 kubernetes client. Some of them have a corresponding kubernetes-client issue, some do not. Also, none of them describes the problem reported here.
Do you think this is our incorrect usage of the kubernetes client, or is it a bug on their side? If it is their bug, should I create an issue for kubernetes-client and link it to this one?

@skabashnyuk skabashnyuk added the status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. label Sep 15, 2017
@l0rd (Contributor) commented Sep 26, 2017

@sleshchenko I have no idea. @amisevsk @ibuziuk and @snjeza have worked on that and may know whether it's a kubernetes client bug or an issue with how we use it in OpenShiftConnector.

@sleshchenko (Member, Author)

@amisevsk @ibuziuk and @snjeza Do you have any thoughts about it?

@sleshchenko (Member, Author)

@snjeza I see. Thanks

@akorneta akorneta self-assigned this Nov 23, 2017
@akorneta akorneta added the status/in-progress This issue has been taken by an engineer and is under active development. label Nov 23, 2017
@akorneta (Contributor) commented Nov 24, 2017

The code that interacts with the OpenShift client produces connection leaks. We also verified that the kubernetes client version does not affect this. We tried the following combinations against an OCP v3.6.0 instance:

                                            current version
openshift-client:   v3.1.0  v3.0.3  v2.6.3     v2.2.8
kubernetes-client:  v3.1.0  v3.0.3  v2.6.3     v2.2.8
kubernetes-model:   v2.0.4  v2.0.4  v1.1.4     v1.0.67
okhttp:             v3.8.1  v3.8.1  v3.8.1     v3.6.0 

Steps to reproduce:

  • run any Che workspace
  • wait about 10 minutes
  • check the Che server logs

We also found that starting a single Che workspace creates about 16 OpenShiftClients, each of which in turn contains its own OkHttpClient instance that holds a connection pool, an executor service, etc.
We decided to change the approach and create a single OpenShiftClient for the whole application and reuse it. After that, connection leaks no longer appeared. However, another problem occurred: after several cycles of normal start/stop of Che workspaces, the bootstrapper execs freeze and fail with timeouts, and as a result workspaces cannot be started.

All of this investigation was carried out on the che6 branch; we continue to work on it there.
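
The single shared client could be wired, for example, as a singleton in a Guice module. This is a hypothetical sketch of that approach, not the actual Che wiring; the module and method names are made up for illustration:

import com.google.inject.AbstractModule;
import com.google.inject.Provides;
import com.google.inject.Singleton;
import io.fabric8.openshift.client.DefaultOpenShiftClient;
import io.fabric8.openshift.client.OpenShiftClient;

// Hypothetical sketch of the "one client for the whole application" approach
// described above.
public class SharedOpenShiftClientModule extends AbstractModule {

  @Override
  protected void configure() {
    // Nothing else to bind for this sketch; the client is provided below.
  }

  @Provides
  @Singleton
  OpenShiftClient openShiftClient() {
    // Created once and reused for every interaction with the OpenShift API,
    // instead of building ~16 short-lived clients per workspace start.
    return new DefaultOpenShiftClient();
  }
}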

@akorneta akorneta removed the status/in-progress This issue has been taken by an engineer and is under active development. label Nov 30, 2017
@akorneta (Contributor) commented Dec 1, 2017

In the che6 branch we used to create one OpenShiftClient per interaction with the OpenShift API, which means that a single workspace start created at least 16 new OpenShiftClients. With this approach the server logs were full of warnings about connection leaks.
Snjeza explained that this happens because of OpenShiftOAuthInterceptor. If a request fails with a 401 or 403 response code, the interceptor resends it with an OAuth token (the configured one, or a freshly fetched one if a username and password are configured). Before resending the request it closes the response body, but for WebSocket connections closing the body does not release the underlying connection. So when the first WebSocket connection was sent through a newly created OpenShiftClient instance configured with an OAuth token, the connection leaked. The warnings show up about 5 minutes later because the ConnectionPool (where the real connection lives) runs its cleanup job on a 5-minute schedule.
How it works now:
We create only one instance of OpenShiftClient for the whole Che server and reuse it for all workspaces. No warnings about connection leaks appear, because the OAuth token is initialized during the first HTTP (not WebSocket) connection.
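
To make the mechanism easier to follow, here is a simplified, illustrative sketch of the retry pattern described above. It is not the actual fabric8 OpenShiftOAuthInterceptor source; the class and field names are placeholders:

import java.io.IOException;

import okhttp3.Interceptor;
import okhttp3.Request;
import okhttp3.Response;

// Illustrative sketch only: retry a 401/403 response with an OAuth token.
public class OAuthRetrySketch implements Interceptor {

  private final String oauthToken; // assumed to come from configuration

  public OAuthRetrySketch(String oauthToken) {
    this.oauthToken = oauthToken;
  }

  @Override
  public Response intercept(Chain chain) throws IOException {
    Response response = chain.proceed(chain.request());
    if (response.code() == 401 || response.code() == 403) {
      // Close the body of the failed response before retrying. For plain HTTP
      // this returns the connection to the pool, but per the analysis above,
      // for a WebSocket upgrade request the underlying connection is not
      // released, which is what produced the leak warnings.
      response.body().close();
      Request authenticated = chain.request().newBuilder()
          .header("Authorization", "Bearer " + oauthToken)
          .build();
      return chain.proceed(authenticated);
    }
    return response;
  }
}

In this shape the retried request goes through the same OkHttp client, and an application-level interceptor may call chain.proceed() more than once, which is what makes the retry possible; the leak comes from the first, closed-but-not-released WebSocket upgrade response.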
