Use Calico as CNI for e2e and disable node readiness checks #1459

Merged
merged 13 commits into from
Oct 17, 2022

embik
Member

@embik embik commented Oct 11, 2022

What this PR does / why we need it:

This is a bit complicated to explain, but essentially the e2e tests running out of kind were only somewhat working, and flannel "covered" that up a bit. The problem lies between the kind control plane and the joining nodes: CNI pods cannot talk to the Kubernetes API endpoint because the Endpoints of the "kubernetes" Service point to an internal IP that the nodes cannot reach. For the same reason, kube-proxy is never able to start properly.

For some reason, flannel still marked the nodes as Ready (although technically they weren't), and so did Cilium, but neither really worked as a CNI. Flannel then broke during the CI cluster upgrade (machine-controller-webhook was no longer able to resolve DNS because service IPs were no longer routed properly).

This PR moves to Calico to bring back the kind control plane, but removes the readiness check for nodes (since no node can ever really be initialised by CNI in our test setup) and resorts to just making sure a Node object exists for each machine. A follow-up issue was raised at #1462.
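To illustrate the relaxed check described above, here is a minimal sketch rather than the actual machine-controller test code. It assumes Node objects are available as plain dicts shaped like the Kubernetes JSON representation. The helper names, the name-based matching of a Node to a machine, and the choice of the pressure conditions as the ones still checked are all assumptions made for illustration.

```python
# Sketch only: plain dicts shaped like the JSON form of a Kubernetes Node.
# Helper names and name-based matching are hypothetical, not taken from
# machine-controller.

# Conditions still checked. "Ready" is deliberately absent, because CNI
# never initialises the nodes in this test setup.
PRESSURE_CONDITIONS = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def find_node(nodes, machine_name):
    """Return the Node whose metadata.name equals machine_name, or None."""
    for node in nodes:
        if node.get("metadata", {}).get("name") == machine_name:
            return node
    return None


def failing_conditions(node):
    """List checked condition types whose status is not "False"."""
    failing = []
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") not in PRESSURE_CONDITIONS:
            continue  # skips Ready and any other unchecked condition
        if cond.get("status") != "False":
            failing.append(cond["type"])
    return failing


def machine_has_usable_node(nodes, machine_name):
    """True if a Node exists for the machine and no checked condition fails."""
    node = find_node(nodes, machine_name)
    return node is not None and not failing_conditions(node)
```

With this shape, a Node that reports `Ready` as `"False"` still passes as long as it exists and reports no pressure, which mirrors the idea of verifying existence instead of readiness.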

Which issue(s) this PR fixes:

Fixes #

What type of PR is this?
/kind cleanup
/kind regression

Special notes for your reviewer:

Does this PR introduce a user-facing change? Then add your Release Note here:

NONE

Documentation:

NONE

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
@kubermatic-bot kubermatic-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. docs/none Denotes a PR that doesn't need documentation (changes). labels Oct 11, 2022
@kubermatic-bot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubermatic-bot kubermatic-bot added kind/regression Categorizes issue or PR as related to a regression from a prior release. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 11, 2022
@embik
Member Author

embik commented Oct 11, 2022

/test pull-machine-controller-e2e-vsphere

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
@embik
Member Author

embik commented Oct 13, 2022

/test pull-machine-controller-e2e-vsphere

@embik embik changed the title Use Calico as CNI for machine-controller e2e jobs Use Calico as CNI for e2e and disable node readiness checks Oct 13, 2022
@embik embik marked this pull request as ready for review October 13, 2022 12:20
@kubermatic-bot kubermatic-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 13, 2022
Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
@kubermatic-bot
Contributor

kubermatic-bot commented Oct 14, 2022

@embik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-machine-controller-e2e-deployment-upgrade | e34e060 | link | true | /test pull-machine-controller-e2e-deployment-upgrade |

Full PR test history

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@embik
Member Author

embik commented Oct 14, 2022

/retest

Member

@xmudrii xmudrii left a comment

/lgtm
/approve

@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2022
@kubermatic-bot
Contributor

LGTM label has been added.

Git tree hash: 2cac2634fe77bc81b9c71fe8ee4918fd40fd47ae

@kubermatic-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: embik, xmudrii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-bot kubermatic-bot merged commit cbb247f into kubermatic:main Oct 17, 2022
@embik embik deleted the calico-cni branch October 17, 2022 12:01
embik added a commit to embik/machine-controller that referenced this pull request Oct 18, 2022
…ic#1459)

* Update containerized check

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Test if kindnet CNI works

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Revert "Test if kindnet CNI works"

This reverts commit 35ad25c.

* Replace Flannel with Cilium

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Set operator.replicas=1 for cilium

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Remove CentOS 7 tests

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Add Calico as CNI

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Do not check node readiness, just verify node exists

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Revert "Remove CentOS 7 tests"

This reverts commit 72b0d18.

* Add a test for node conditions except NodeReady

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Fix yamllint issues

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Fix linting issue

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Disable upgrade Prow job

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
kubermatic-bot pushed a commit that referenced this pull request Oct 18, 2022
…d use KKP `main` branch (#1465)

* Use Calico as CNI for e2e and disable node readiness checks (#1459)

* Update containerized check

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Test if kindnet CNI works

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Revert "Test if kindnet CNI works"

This reverts commit 35ad25c.

* Replace Flannel with Cilium

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Set operator.replicas=1 for cilium

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Remove CentOS 7 tests

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Add Calico as CNI

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Do not check node readiness, just verify node exists

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Revert "Remove CentOS 7 tests"

This reverts commit 72b0d18.

* Add a test for node conditions except NodeReady

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Fix yamllint issues

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Fix linting issue

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Disable upgrade Prow job

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

* Use 'main' as KKP branch instead of 'master' (#1464)

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
@embik embik added the backport-complete Denotes a PR or issue which has been fully backported to all required release branches. label Oct 18, 2022
lucakuendig pushed a commit to lucakuendig/machine-controller that referenced this pull request Oct 28, 2022
…ic#1459)
mate4st pushed a commit to mate4st/machine-controller that referenced this pull request Mar 13, 2023
…ic#1459)
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-complete Denotes a PR or issue which has been fully backported to all required release branches. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. docs/none Denotes a PR that doesn't need documentation (changes). kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/regression Categorizes issue or PR as related to a regression from a prior release. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

3 participants