panic: sync: negative WaitGroup counter in k8s.io/kubernetes/pkg/controller/node #16404

Closed
soltysh opened this issue Sep 18, 2017 · 5 comments · Fixed by #16411
Labels
component/kubernetes kind/test-flake Categorizes issue or PR as related to test flakes. priority/P0 vendor-update Touching vendor dir or related files

Comments

@soltysh
Member

soltysh commented Sep 18, 2017

Seen today here: https://storage.googleapis.com/origin-ci-test/pr-logs/pull/16390/test_pull_request_origin_unit/2793/build-log.txt

=== RUN   TestCancel
panic: sync: negative WaitGroup counter
goroutine 400 [running]:
sync.(*WaitGroup).Add(0xc420392540, 0xffffffffffffffff)
	/usr/lib/golang/src/sync/waitgroup.go:75 +0x255
sync.(*WaitGroup).Done(0xc420392540)
	/usr/lib/golang/src/sync/waitgroup.go:100 +0x42
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.TestCancel.func1(0xc420318d80, 0x1, 0x1c)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/timed_workers_test.go:88 +0x61
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.(*TimedWorkerQueue).getWrappedWorkerFunc.func1(0xc420318d80, 0x0, 0x0)
	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/_test/_obj_test/timed_workers.go:81 +0xb6
github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node.CreateWorker.func1()
	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node/_test/_obj_test/timed_workers.go:44 +0x72
created by time.goFunc
	/usr/lib/golang/src/time/sleep.go:170 +0x52
FAIL	github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/node	37.948s
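For readers unfamiliar with this panic: the trace above shows `Done` calling `Add` with -1 and the counter dropping below zero, which happens whenever `Done` is called more times than the total passed to `Add`. Below is a minimal, standalone sketch of that condition. It is not the code from `timed_workers_test.go`; the `extraDone` helper is hypothetical and exists only to capture the panic message.

```go
package main

import (
	"fmt"
	"sync"
)

// extraDone (hypothetical helper) calls Done once more than Add and
// returns the recovered panic message instead of crashing the program.
func extraDone() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()

	var wg sync.WaitGroup
	wg.Add(1)
	wg.Done() // counter: 1 -> 0
	wg.Done() // counter: 0 -> -1, which panics
	return "no panic"
}

func main() {
	fmt.Println(extraDone()) // prints "sync: negative WaitGroup counter"
}
```

In a test, the same thing can happen if a worker that was expected to be cancelled runs anyway and calls `Done` one extra time.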
@soltysh soltysh added kind/test-flake Categorizes issue or PR as related to test flakes. priority/P1 labels Sep 18, 2017
@soltysh soltysh added the vendor-update Touching vendor dir or related files label Sep 18, 2017
@enj
Contributor

enj commented Sep 18, 2017

We just need to disable these tests in origin:

#16077
kubernetes/kubernetes#51704
kubernetes/kubernetes#51705

@soltysh
Member Author

soltysh commented Sep 18, 2017

I don't feel that's a good approach. Disabling tests is our last resort, followed immediately by a proper/temporary fix (if it's big). Why not increase those timeouts even more?

@enj
Contributor

enj commented Sep 18, 2017

I suppose we can carry a patch for a 10 second sleep. The whole thing is a hack anyway :/

@soltysh
Member Author

soltysh commented Sep 18, 2017

SGTM, will you open a PR or shall I?

@mfojtik
Contributor

mfojtik commented Sep 20, 2017

Seen this 4x today, raising to P0.

openshift-merge-robot added a commit that referenced this issue Sep 20, 2017
Automatic merge from submit-queue

UPSTREAM: <carry>: increase timeout in TestCancelAndReadd even more

Fixes #16404.

Apparently #16077 didn't fix the problem. #16404 is showing that we're hitting this more and more.

@mfojtik @Kargakis @enj ptal

4 participants