Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport fix to 17.03 #1784

Merged
merged 5 commits into from
Jun 12, 2017
Merged

Backport fix to 17.03 #1784

merged 5 commits into from
Jun 12, 2017

Conversation

fcrisciani
Copy link

Fixed cancelling method call to avoid go routine leaks

Signed-off-by: Flavio Crisciani flavio.crisciani@docker.com

@fcrisciani
Copy link
Author

fcrisciani commented May 30, 2017

Stacktrace related to 17.03:

goroutine 1140550 [chan receive, 8614 minutes]:
github.com/docker/docker/vendor/github.com/docker/libnetwork.(*controller).handleTableEvents(0xc4208d6f00, 0xc424222660, 0xc423b0ca60)
	/root/rpmbuild/BUILD/docker-ee/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/agent.go:595 +0x5c
created by github.com/docker/docker/vendor/github.com/docker/libnetwork.(*network).addDriverWatches
	/root/rpmbuild/BUILD/docker-ee/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/agent.go:555 +0x317

Stacktrace related in 17.03:

github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl.(*NetlinkSocket).Receive(0xc42570c7b0, 0xc422e72d80, 0x1, 0x1, 0x0, 0x0)
	/root/rpmbuild/BUILD/docker-ee/.gopath/src/github.com/docker/docker/vendor/github.com/vishvananda/netlink/nl/nl_linux.go:596 +0x10b
github.com/docker/docker/vendor/github.com/docker/libnetwork/drivers/overlay.(*network).watchMiss(0xc421d2a500, 0xc42570c7b0)
	/root/rpmbuild/BUILD/docker-ee/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/drivers/overlay/ov_network.go:631 +0x61
created by github.com/docker/docker/vendor/github.com/docker/libnetwork/drivers/overlay.(*network).initSandbox
	/root/rpmbuild/BUILD/docker-ee/.gopath/src/github.com/docker/docker/vendor/github.com/docker/libnetwork/drivers/overlay/ov_network.go:620 +0x564

@fcrisciani
Copy link
Author

Not ready yet, looks like there is still something missing to handle the stacktrace reported above

@aaronlehmann
Copy link
Contributor

I'm confused - is the problem with the backport, or is the stack trace in #1784 (comment) unrelated to what was fixed here?

@fcrisciani
Copy link
Author

@aaronlehmann sorry for the confusion, but I will group here all the backports for all the fixes for the goroutine leaks.
The fix are 3:

  1. (already merged in master) wrongful rewrite of the cancelList for the watch of the endpoint_table
  2. add return to the handleTableEvents
  3. close the netlink socket and return of the watchMiss goroutine

@fcrisciani
Copy link
Author

Wait for the master one to be reviewed and accepted:

Flavio Crisciani added 5 commits June 9, 2017 15:49
Fixed cancelling method call to avoid go routine leaks

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The channel ch.C is never closed.
Added the listen of the ch.Done() to guarantee
that the goroutine is exiting once the event channel
is closed

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
On linux systems bump up gc_thresholds so to lower the
probability of running with neighbor table overflow issues

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The netlink socket that was used to monitor the L2
miss was never being closed. The watchMiss goroutine
spawned was never returning. This was causing goroutine
leak in case of createNetwork/destroyNetwork

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
The feature was not getting properly triggered, move it as
first operation in the configure

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
@fcrisciani fcrisciani changed the title Backport fix for goroutine leak Backport fix to 17.03 Jun 9, 2017
@fcrisciani
Copy link
Author

This commit covers:
#1787
#1786
#1789
#1795

@fcrisciani
Copy link
Author

@mavenugo this is my side of the backport, you will notice that I mention 4 PR but I have 5 commits, the reason is that the logic for the coreCancelFuncs needed for avoiding the leak was part of this PR: #1735 that we are not backporting

@mavenugo
Copy link
Contributor

LGTM

@mavenugo mavenugo merged commit 676c570 into moby:release/v0.9 Jun 12, 2017
@fcrisciani fcrisciani deleted the leak branch June 12, 2017 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants