
dind deployment flake: Timeout waiting for 3 nodes to report readiness #11315

Closed
0xmichalis opened this issue Oct 11, 2016 · 10 comments
Labels: area/tests, component/networking, kind/test-flake, lifecycle/rotten, priority/P2

0xmichalis (Contributor) commented Oct 11, 2016

Any dind deployment failure in CI results in an error message similar to the following:

............................................................
[ERROR] Timeout waiting for 3 nodes to report readiness
[ERROR] PID 30612: hack/dind-cluster.sh:45: `return 1` exited with status 1.
[INFO]      Stack Trace: 
[INFO]        1: hack/dind-cluster.sh:45: `return 1`
[INFO]        2: hack/dind-cluster.sh:114: wait-for-cluster
[INFO]        3: hack/dind-cluster.sh:344: start
[INFO]   Exiting with code 1.
[INFO] Saving cluster configuration
[ERROR] Failed to deploy cluster for plugin: {multitenant}

Diagnosing the problem will require looking at the dind cluster artifacts saved by the job.
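
For context, the failing step is just a bounded polling loop that waits until the API server lists the expected number of Ready nodes. A minimal sketch of that pattern is below; the function name, flags, and timeout are illustrative assumptions, not the actual wait-for-cluster code in hack/dind-cluster.sh:

# Illustrative sketch only; not the real hack/dind-cluster.sh implementation.
wait_for_node_readiness() {
  local expected_nodes="$1"        # e.g. 3
  local max_attempts="${2:-120}"   # assumed bound; each failed check prints one '.'
  local attempt=0
  while [[ "${attempt}" -lt "${max_attempts}" ]]; do
    # Count nodes whose STATUS column reads exactly "Ready".
    local ready
    ready="$(oc get nodes --no-headers 2>/dev/null | awk '$2 == "Ready"' | wc -l)"
    if [[ "${ready}" -ge "${expected_nodes}" ]]; then
      echo "Done"
      return 0
    fi
    echo -n "."
    sleep 1
    attempt=$(( attempt + 1 ))
  done
  echo "[ERROR] Timeout waiting for ${expected_nodes} nodes to report readiness"
  return 1   # the `return 1` that shows up in the stack trace above
}

When the loop runs out of attempts it returns 1 and the callers (wait-for-cluster, then start) propagate the failure, which is what the stack trace above shows; the useful information is therefore in the saved per-node artifacts rather than in this output.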

@0xmichalis added the component/networking, area/tests, and kind/test-flake labels on Oct 11, 2016
@marun self-assigned this and unassigned knobunc on Oct 11, 2016
@marun changed the title from "[ERROR] Failed to deploy cluster for plugin: {multitenant}" to "dind deployment flake: Failed to deploy cluster for plugin" on Oct 11, 2016
@marun changed the title from "dind deployment flake: Failed to deploy cluster for plugin" to "dind deployment flake: Timeout waiting for 3 nodes to report readiness" on Oct 11, 2016
@0xmichalis reopened this on Jul 10, 2017
@marun assigned knobunc and unassigned marun on Sep 7, 2017
tnozicka (Contributor) commented Oct 5, 2017

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/14910/test_pull_request_origin_extended_networking_minimal/8761/

Stopping dind cluster 'nettest'
Starting dind cluster 'nettest' with plugin 'redhat/openshift-ovs-multitenant' and runtime 'dockershim'
Waiting for ok
............................................................................
Done
Waiting for 3 nodes to report readiness
........................................................................................................................
[ERROR] Timeout waiting for 3 nodes to report readiness
[ERROR] PID 31630: hack/dind-cluster.sh:41: `return 1` exited with status 1.
... skipping 7 lines ...
[ERROR] Failed to deploy cluster for plugin: {multitenant}
... skipping 10712 lines ...
[ERROR] 1 plugin(s) failed one or more tests
[ERROR] PID 29201: test/extended/networking-minimal.sh:6: `NETWORKING_E2E_MINIMAL=1 "${OS_ROOT}/test/extended/networking.sh"` exited with status 1.
... skipping 7 lines ...
########## FINISHED STAGE: FAILURE: RUN EXTENDED TESTS [00h 08m 10s] ##########
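
When the flake reproduces locally (in CI the cluster is torn down, so only the saved artifacts are available), a rough set of diagnostic steps is sketched below; the container-name filter is an assumption about how the dind node containers are named, so check docker ps for the real names:

# Illustrative diagnostic commands for a local reproduction; not part of the CI job.
# 1. See which of the 3 nodes never reported Ready.
oc get nodes -o wide

# 2. Inspect the unready node's conditions and events for the reason
#    (network plugin not ready, kubelet not posting status, etc.).
oc describe node <node-name>

# 3. The dind "nodes" are docker containers, so their logs can be read directly.
#    The name filter is a guess; adjust to whatever `docker ps` shows.
docker ps --filter "name=node"
docker logs <node-container-id> 2>&1 | tail -n 100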

@sosiouxme (Member)

gabemontero (Contributor) commented Feb 22, 2018

@danwinship self-assigned this and unassigned knobunc on Feb 27, 2018
danwinship (Contributor) commented

(The recent burst of failures was caused by the CA serial number bug that was fixed by #18713)

danwinship (Contributor) commented

Actually, the recent (and still continuing) burst of failures is caused by an actual docker-in-docker setup bug revealed by the new CA-serial-number-handling code (#18765), not related to #18713.

openshift-bot (Contributor) commented

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on May 28, 2018
openshift-bot (Contributor) commented

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 27, 2018
openshift-bot (Contributor) commented

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
