Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use kind for E2E tests #1304

Merged
merged 10 commits into from
Jun 1, 2022
Merged

Use kind for E2E tests #1304

merged 10 commits into from
Jun 1, 2022

Conversation

ahmedwaleedmalik
Copy link
Member

@ahmedwaleedmalik ahmedwaleedmalik commented May 20, 2022

Signed-off-by: Waleed Malik ahmedwaleedmalik@gmail.com

What this PR does / why we need it:
Alt Text

Which issue(s) this PR fixes (optional, in fixes #<issue number> format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Optional Release Note:

NONE

@kubermatic-bot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubermatic-bot kubermatic-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. approved Indicates a PR has been approved by an approver from all required OWNERS files. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 20, 2022
@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

@ahmedwaleedmalik ahmedwaleedmalik marked this pull request as ready for review May 20, 2022 11:56
@kubermatic-bot kubermatic-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 20, 2022
@ahmedwaleedmalik ahmedwaleedmalik marked this pull request as draft May 20, 2022 12:21
@kubermatic-bot kubermatic-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 20, 2022
@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

2 similar comments
@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

2 similar comments
@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

@ahmedwaleedmalik
Copy link
Member Author

/test pull-machine-controller-e2e-aws

@ahmedwaleedmalik ahmedwaleedmalik marked this pull request as ready for review May 28, 2022 07:49
@kubermatic-bot kubermatic-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2022
@ahmedwaleedmalik
Copy link
Member Author

/hold

@kubermatic-bot kubermatic-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 28, 2022
@kubermatic-bot kubermatic-bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 1, 2022
@ahmedwaleedmalik
Copy link
Member Author

Why do we do that? What do we get with Flannel that's not available with Kindnet?

I opted for flannel because kindnet is a very minimalistic CNI plugin and network plugins are not shipped with it. But eventually, I realized that even flannel doesn't ship them. But yes I don't have strong opinions regarding which one we should choose for e2e tests but kindnet didn't work on my initial tries.

I have switched to kindnet for now. Let's see, how the e2e tests react.

@ahmedwaleedmalik
Copy link
Member Author

/retest

@ahmedwaleedmalik
Copy link
Member Author

So kindnet didn't work, as I anticipated. @xmudrii @xrstf

Signed-off-by: Waleed Malik <ahmedwaleedmalik@gmail.com>
@ahmedwaleedmalik
Copy link
Member Author

Reverted back to flannel. Maybe we can see switching back to kindnet as an improvement and do that in a separate follow-up PR, if required/possible.

@xrstf
Copy link
Contributor

xrstf commented Jun 1, 2022

So kindnet didn't work, as I anticipated.

Can we see the logs or failed pods somewhere, so we have more than "didn't work" for the future in case this gets important for kind 0.13+ ?

Presumably, https://public-prow.loodse.com/view/gs/prow-dev-public-data/pr-logs/pull/kubermatic_machine-controller/1304/pull-machine-controller-e2e-aws-arm/1531939052419289088 is one of the jobs that failed due to CNI?

@xmudrii
Copy link
Member

xmudrii commented Jun 1, 2022

TBH, I'm not sure do we want to block on this, but kindnet not working makes me think that we're doing something wrong. I don't have strong feelings, but I think we should use kindnet in a kind cluster. I would like to better understand why is this happening at all.

@ahmedwaleedmalik
Copy link
Member Author

ahmedwaleedmalik commented Jun 1, 2022

Presumably, https://public-prow.loodse.com/view/gs/prow-dev-public-data/pr-logs/pull/kubermatic_machine-controller/1304/pull-machine-controller-e2e-aws-arm/1531939052419289088 is one of the jobs that failed due to CNI?

Yes, how about we open a new issue for that and investigate? :) I'd really like for this PR to be merged and we can always update/improve/refactor the CNI stuff later, against a separate issue.

@xrstf
Copy link
Contributor

xrstf commented Jun 1, 2022

Can you provide logs that explain why

helper.go:237: verify failed due to error=failed to verify creation of node for MachineDeployment: failed waiting for MachineDeployment ubuntu-1-23-5-aws-1531939052419289088 to get a node in ready state: timed out waiting for the condition

failed?

@ahmedwaleedmalik
Copy link
Member Author

I'm focused on some other ticket for now. So I don't have the bandwidth to do that. If you want, I can revert my change and you can check the logs accordingly.

Signed-off-by: Waleed Malik <ahmedwaleedmalik@gmail.com>
@ahmedwaleedmalik
Copy link
Member Author

@xrstf reverted and switched back to kindnet. You can debug and go through the logs now.

@ahmedwaleedmalik
Copy link
Member Author

ahmedwaleedmalik commented Jun 1, 2022

@xrstf @xmudrii

Jun 01 13:18:23 ip-172-31-8-126 kubelet[11017]: E0601 13:18:23.556129   11017 kubelet.go:2344] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Jun 01 13:18:28 ip-172-31-8-126 kubelet[11017]: I0601 13:18:28.336312   11017 scope.go:110] "RemoveContainer" containerID="991ff73aab78179ffa1da5ca196ecb307d80884612a3f34c923262bd665fdef7"
Jun 01 13:18:28 ip-172-31-8-126 kubelet[11017]: E0601 13:18:28.336596   11017 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kindnet-cni\" with CrashLoopBackOff: \"back-off 2m40s restarting failed container=kindnet-cni pod=kindnet-jbsqb_kube-system(a6c88456-1ad0-4412-8223-c7ed9163d80c)\"" pod="kube-system/kindnet-jbsqb" podUID=a6c88456-1ad0-4412-8223-c7ed9163d80c

Signed-off-by: Waleed Malik <ahmedwaleedmalik@gmail.com>
@ahmedwaleedmalik
Copy link
Member Author

/retest

@xmudrii
Copy link
Member

xmudrii commented Jun 1, 2022

@ahmedwaleedmalik It's mostly like that kindnet doesn't work with non-kind nodes. I guess this is sort of expected because kindnet is supposed to be a very basic CNI implementation. In this case, I think resorting to Flannel is a good decision.

@ahmedwaleedmalik
Copy link
Member Author

/retest

@ahmedwaleedmalik
Copy link
Member Author

/retest

@xmudrii then it's good to go from my side. This azure testing is just flaking out.

Copy link
Member

@xmudrii xmudrii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing PR, I'm very happy that we finally use Kind instead of Hetzner. ⚡ ⚡ ⚡

Here are some approves for this amazing PR:

/approve
/approve
/approve

@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 1, 2022
@kubermatic-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 3dd3c405cb23b56e83529a302bbcaf8488132f9a

@kubermatic-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahmedwaleedmalik, xmudrii, xrstf

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ahmedwaleedmalik,xmudrii,xrstf]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahmedwaleedmalik
Copy link
Member Author

/unhold
;)

@kubermatic-bot kubermatic-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 1, 2022
@kubermatic-bot kubermatic-bot merged commit 7fcde86 into kubermatic:master Jun 1, 2022
@ahmedwaleedmalik ahmedwaleedmalik deleted the use-kind-e2e branch June 1, 2022 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants