[BUG] Cannot setup openyurt with yurtctl convert --provider kind #484

Closed
Congrool opened this issue Sep 21, 2021 · 30 comments · Fixed by #506

@Congrool
Member

What happened:
Hello, I'd like to deploy an OpenYurt cluster with yurtctl and kind. It seems that yurtctl supports kind through the option --provider kind. However, the following command resulted in an error.

yurtctl convert -t --provider kind --cloud-nodes ${cloudnodes}

F0921 08:08:07.173618   12871 convert.go:98] fail to complete the convert option: failed to read file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, open /etc/systemd/system/kubelet.service.d/10-kubeadm.conf: no such file or directory

I read the code and found that when yurtctl starts, it reads 10-kubeadm.conf (by default at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf) to get the pod manifest path. However, neither the file nor the directory exists on the host when using kind. Maybe we should come up with a better way to handle this.

What you expected to happen:
We can use yurtctl to deploy openyurt with kind.

How to reproduce it (as minimally and precisely as possible):
Use yurtctl convert -t --provider kind --cloud-nodes ${cloudnodes} to deploy openyurt with kind.

Environment:

  • OpenYurt version: commit: 797c43d
  • Kubernetes version (use kubectl version): 1.20

Congrool added the kind/bug label Sep 21, 2021

@rambohe-ch
Member

@Peeknut Would you be able to help fix this problem?

@adamzhoul
Member

what about using --kubeadm-conf-path for now?

@Congrool
Member Author

Congrool commented Sep 22, 2021

@adamzhoul Hello, I found several 10-kubeadm.conf files on my local file system.

locate 10-kubeadm.conf
${GOPATH}/pkg/mod/sigs.k8s.io/kind@v0.11.1/images/base/files/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
${GOPATH}/src/github.com/kubernetes/build/debs/10-kubeadm.conf
${GOPATH}/src/github.com/kubernetes/build/rpms/10-kubeadm.conf

However, none of them looks like the right file.

I tried the first one, and a new error occurred. Maybe this is not the right approach.

get kubeadm-conf-path: ${GOPATH}/pkg/mod/sigs.k8s.io/kind@v0.11.1/images/base/files/etc/systemd/system/kubelet.service.d/10-kubeadm.conffailed to get pod manifest path
F0922 09:44:40.800540   64458 convert.go:98] fail to complete the convert option: failed to read file /var/lib/kubelet/config.yaml, open /var/lib/kubelet/config.yaml: no such file or directory

Congrool changed the title from "[BUG] Cannot setup openyurt wich yurtctl convert --provider kind" to "[BUG] Cannot setup openyurt with yurtctl convert --provider kind" on Sep 22, 2021

@adamzhoul
Member

Try running yurtctl inside the kind node.

@adamzhoul
Member

Currently we read the podManifests path from the kubeadm conf file, and then:

  1. pass the podManifests path to the disable-node job. (This requires the kubeadm conf file to exist locally, so this is a problem.)
  2. pass the kubeadm conf path to the revert edgenode job. (This doesn't require the kubeadm conf file to exist locally, so this is not a problem.)

In short: this causes a problem when yurtctl is not run on a cluster node.

Solutions:

  1. Add --podManifests back, which I personally don't recommend.
  2. Use a default podManifest value when the kubeadm conf file does not exist. This may cause the disable-node job to fail when the podManifest dir is not right.
  3. Pass --kubeadm to the disable-node job, which may lead to a more complicated implementation:
    config_path=`grep -o -E '\-\-config=.*\.yaml' {{.kubeadm_conf_path}} | awk -F'=' '{print $2}'`
    pod_manifest_path=`grep -o -E 'staticPodPath: .*' $config_path | awk '{print $2}'`
    nsenter -t 1 -m -u -n -i -- sed -i 's/--controllers=/--controllers=-nodelifecycle,/g' $pod_manifest_path/kube-controller-manager.yaml
  4. Others.

@rambohe-ch @Congrool @Peeknut what do you think?

@Peeknut
Member

Peeknut commented Sep 22, 2021

Does 10-kubeadm.conf exist in the directory /usr/lib/systemd/system/kubelet.service.d/?

@Congrool
Member Author

@Peeknut There's no /usr/lib/systemd/system/kubelet.service.d/ on my local file system either.

ls /usr/lib/systemd/system/kubelet.service.d/
ls: cannot access '/usr/lib/systemd/system/kubelet.service.d/': No such file or directory

@Peeknut
Member

Peeknut commented Sep 22, 2021

Could you show the content of the 10-kubeadm.conf file?

@Congrool
Member Author

@Peeknut
The content of the file is as follows:

cat ${GOPATH}/pkg/mod/sigs.k8s.io/kind@v0.11.1/images/base/files/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# https://github.com/kubernetes/kubernetes/blob/ba8fcafaf8c502a454acd86b728c857932555315/build/debs/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
# On cgroup v1, the /kubelet cgroup is created in the entrypoint script before running systemd.
# On cgroup v2, the /kubelet cgroup is created here. (See the comments in the entrypoint script for the reason.)
ExecStartPre=/bin/sh -euc "if [ -f /sys/fs/cgroup/cgroup.controllers ]; then create-kubelet-cgroup-v2; fi"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --cgroup-root=/kubelet

It's the same as the file in the kind repo.

@adamzhoul
I tried to run yurtctl in the kind node.

yurtctl convert --cloud-nodes openyurt-e2e-test-control-plane -t --provider kubeadm

It works.

root@openyurt-e2e-test-control-plane:~/openyurt# ./yurtctl convert --cloud-nodes openyurt-e2e-test-control-plane -t --provider kubeadm
get kubeadm-conf-path: /etc/systemd/system/kubelet.service.d/10-kubeadm.confget PodManifestPath: /etc/kubernetes/manifestsI0922 03:38:39.504680    2808 convert.go:318] mark openyurt-e2e-test-control-plane as the cloud-node
I0922 03:40:00.021463    2808 util.go:492] servant job(yurtctl-disable-node-controller-openyurt-e2e-test-control-plane) has succeeded
I0922 03:40:00.021570    2808 convert.go:343] complete disabling node-controller
I0922 03:40:00.489607    2808 convert.go:354] yurt-tunnel-server is deployed
I0922 03:40:00.525977    2808 convert.go:362] yurt-tunnel-agent is deployed
I0922 03:40:00.600863    2808 convert.go:433] kube-public/cluster-info configmap already exists, skip to prepare it
I0922 03:40:00.682505    2808 convert.go:390] deploying the yurt-hub and resetting the kubelet service...
E0922 03:42:00.797153    2808 util.go:489] fail to run servant job(yurtctl-servant-convert-openyurt-e2e-test-worker): wait for job to be complete timeout
I0922 03:42:00.797216    2808 convert.go:413] complete deploying yurt-hub

I think it would be better to let users run yurtctl locally, without exec'ing into the kind node, when --provider kind is set.

@Peeknut
Member

Peeknut commented Sep 22, 2021

Currently the value of the --provider parameter (kind or kubeadm) has no effect. So if the command yurtctl convert --cloud-nodes openyurt-e2e-test-control-plane -t --provider kubeadm works, then so does yurtctl convert -t --provider kind --cloud-nodes ${cloudnodes}.

@Peeknut
Member

Peeknut commented Sep 22, 2021

The error message says open /var/lib/kubelet/config.yaml: no such file or directory, while 10-kubeadm.conf is configured with Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml". Please check whether /var/lib/kubelet/config.yaml really exists.

@adamzhoul
Member

adamzhoul commented Sep 22, 2021

@Peeknut
I think the core problem is that yurtctl is being run on the local PC, not on a cluster node.
The kubeadm conf file and config.yaml may not exist there, and it's fine that they don't.

@Congrool
Member Author

> Currently the value of the --provider parameter (kind or kubeadm) has no effect. So if the command yurtctl convert --cloud-nodes openyurt-e2e-test-control-plane -t --provider kubeadm works, then so does yurtctl convert -t --provider kind --cloud-nodes ${cloudnodes}.

@Peeknut Yes. Both --provider kubeadm and --provider kind work inside the kind node (through docker exec -it openyurt-e2e-test-control-plane bash), but neither works on the local host (where the kind cluster is running).

What I'm asking is whether we can support deploying OpenYurt without exec'ing into the docker container when --provider kind is set.
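
For anyone hitting this in the meantime, the workaround discussed above (running yurtctl inside the kind node) could be scripted roughly as follows. This is only a sketch: the helper names, the DRY_RUN switch, and the /usr/local/bin/yurtctl destination are assumptions made for illustration, and it presumes a yurtctl binary built for the node's OS and architecture.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch for this sketch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1: kind node container name, e.g. openyurt-e2e-test-control-plane
# $2: path to a locally built yurtctl binary
convert_in_kind_node() {
  # The destination path inside the node is an assumption, not an OpenYurt convention.
  run docker cp "$2" "$1:/usr/local/bin/yurtctl" || return 1
  # Same flags as in the original report, executed inside the node container.
  run docker exec "$1" /usr/local/bin/yurtctl convert -t --provider kind --cloud-nodes "$1"
}
```

Calling DRY_RUN=1 convert_in_kind_node openyurt-e2e-test-control-plane ./yurtctl prints the two docker commands without running them, which is handy for checking the arguments first.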

@Peeknut
Member

Peeknut commented Sep 22, 2021

Oh, I get it now.
Yes, we can consider supporting this feature.

@Peeknut
Member

Peeknut commented Sep 22, 2021

For the disable node-controller job, consider passing in "kubeadm_conf_path" instead of "pod_manifest_path", because:
1) It supports the case where yurtctl is not run on a cluster node.
2) It is better to obtain the pod-manifest-path from the actual state of each node than to pass the same pod-manifest-path parameter down from the cloud for all of them.

Solution 2 (using the default value) has the same effect as the parameter passed in now (that is, we assume the pod-manifest-path is the same on every node).
Solution 3 is really complicated. How about adding a yurtctl subcommand to disable the node controller?

@adamzhoul
Member

what do you mean by adding a subcommand?

@Peeknut
Member

Peeknut commented Sep 22, 2021

Maybe something like yurtctl convert controller-plane, which the disable node-controller job could then use?

@adamzhoul
Member

Do you mean the following?

  1. remove disable node-controller from yurtctl convert
  2. run yurtctl disable node-controller

I think this may lead to:

  1. a large code change
  2. the user having to run two commands to do one convert job.

@Peeknut
Member

Peeknut commented Sep 22, 2021

Yes. Maybe we should compare the solutions or look for other ones.

@adamzhoul
Member

For now, solution 3 is probably the quickest way and needs only a small code change.

  3. Pass --kubeadm to the disable-node job, which may lead to a more complicated implementation:
    config_path=`grep -o -E '\-\-config=.*\.yaml' {{.kubeadm_conf_path}} | awk -F'=' '{print $2}'`
    pod_manifest_path=`grep -o -E 'staticPodPath: .*' $config_path | awk '{print $2}'`
    nsenter -t 1 -m -u -n -i -- sed -i 's/--controllers=/--controllers=-nodelifecycle,/g' $pod_manifest_path/kube-controller-manager.yaml

After all, being able to run yurtctl outside of the cluster is really convenient and attractive, especially when debugging code.

what do you think? @Peeknut @Congrool @rambohe-ch
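
To make the three steps of solution 3 easier to try out in isolation, here is a rough sketch of them as shell functions. The function names are hypothetical, the sed call assumes GNU sed, and unlike the real job this does not go through nsenter; it just operates on whatever paths it is given.

```shell
# Step 1: find the kubelet --config file named in a kubeadm drop-in.
# $1: path to 10-kubeadm.conf
kubelet_config_path() {
  grep -o -E -e '--config=[^" ]+' "$1" | head -n 1 | cut -d= -f2
}

# Step 2: read staticPodPath from that kubelet config file.
# $1: path to the kubelet config.yaml
static_pod_path() {
  awk '$1 == "staticPodPath:" {print $2; exit}' "$1"
}

# Step 3: disable the nodelifecycle controller in the static pod manifest.
# $1: static pod manifest directory
disable_nodelifecycle() {
  manifest="$1/kube-controller-manager.yaml"
  # Skip if already disabled, so repeated runs do not stack the prefix.
  if grep -q -e '--controllers=-nodelifecycle,' "$manifest"; then
    return 0
  fi
  sed -i 's/--controllers=/--controllers=-nodelifecycle,/' "$manifest"
}
```

On a node, the functions would be chained as disable_nodelifecycle "$(static_pod_path "$(kubelet_config_path /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)")". The guard in step 3 is an addition of this sketch; the one-liner above would prepend -nodelifecycle again on every run.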

@Peeknut
Member

Peeknut commented Sep 26, 2021

Yes, this can solve the problem, but implementing it in code may be more elegant than relying on Linux commands.
What do you think, @rambohe-ch?

@rambohe-ch
Member

rambohe-ch commented Sep 27, 2021

@adamzhoul @Peeknut How about defining staticPodPath as /etc/kubernetes/manifests?
The reasons are as follows:

  1. All tools (kubeadm/minikube/kind) use /etc/kubernetes/manifests as staticPodPath, and users rarely modify this setting.
  2. The code will be simpler and more readable. We don't need to work out how to get staticPodPath from config.yaml or pass staticPodPath from cloud to edge.
  3. We can add a "before you begin" note to the tutorials, such as: make sure that kubelet uses /etc/kubernetes/manifests as staticPodPath.

If staticPodPath is not /etc/kubernetes/manifests, users are advised to install OpenYurt manually.
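
A pre-flight check along these lines could back up that "before you begin" note. It is only a sketch: the function name is made up for illustration, and it assumes the kubelet config file (commonly /var/lib/kubelet/config.yaml on kubeadm-style nodes) carries a plain top-level staticPodPath entry.

```shell
DEFAULT_STATIC_POD_PATH="/etc/kubernetes/manifests"

# $1: kubelet config file to inspect
uses_default_static_pod_path() {
  # Take the value of the first staticPodPath entry, if there is one.
  configured=$(awk '$1 == "staticPodPath:" {print $2; exit}' "$1")
  if [ "$configured" = "$DEFAULT_STATIC_POD_PATH" ]; then
    echo "ok: staticPodPath is $configured"
  else
    echo "mismatch: staticPodPath is '${configured:-unset}', expected $DEFAULT_STATIC_POD_PATH" >&2
    return 1
  fi
}
```

Run on a node as uses_default_static_pod_path /var/lib/kubelet/config.yaml; a non-zero exit status means the node does not match the hard-coded default.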

@adamzhoul
Member

Do you mean hard-coding it? Something like:

   const staticPodPath = "/etc/kubernetes/manifests"

I agree with that.

> we can add some before you begin introductions in tutorials like make sure that kubelet use /etc/kubernetes/manifests as staticPodPath.

Putting this only into "before you begin" looks a little strange.

> if staticPodPath is not /etc/kubernetes/manifests, users are advised to install OpenYurt manually.

What about creating a link instead? Installing manually is really difficult.

ln -s realPath /etc/kubernetes/manifests
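
A slightly more defensive version of that ln -s could look like the sketch below. The function name and its optional second argument are assumptions added so it can be tried against a scratch directory; it refuses to touch anything that already exists at the link location.

```shell
# $1: the node's real static pod directory
# $2: link location (defaults to /etc/kubernetes/manifests)
link_static_pod_path() {
  real_path="$1"
  link_path="${2:-/etc/kubernetes/manifests}"
  if [ ! -d "$real_path" ]; then
    echo "real path $real_path is not a directory" >&2
    return 1
  fi
  # Never overwrite an existing directory, file, or (possibly dangling) link.
  if [ -e "$link_path" ] || [ -L "$link_path" ]; then
    echo "$link_path already exists, leaving it untouched" >&2
    return 1
  fi
  mkdir -p "$(dirname "$link_path")"
  ln -s "$real_path" "$link_path"
}
```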

@rambohe-ch
Member

rambohe-ch commented Sep 27, 2021

@adamzhoul Thank you for your feedback.

> Do you mean hard code? Something like:
>
>    const staticPodPath = "/etc/kubernetes/manifests"

Yes, hard-code staticPodPath.

> what about creating a link? manually install is really difficult.
>
> ln -s realPath /etc/kubernetes/manifests

Good idea. We can also mention creating a soft link for staticPodPath in the "before you begin" section of the tutorial docs.

@rambohe-ch
Member

@adamzhoul Would you like to take over and fix this bug?

@adamzhoul
Member

sure

@rambohe-ch
Member

/assign @adamzhoul

@adamzhoul
Member

Still, I don't think we need to put anything in "Before you begin":

  1. This dir is rarely changed.
  2. We only support clusters installed by minikube, kubeadm, or kind, and we are only following their config.
  3. Putting this in "Before you begin" makes it look like we have some special design, which we don't.
  4. The less there is in "Before you begin", the less pressure on beginners.

@rambohe-ch

@rambohe-ch
Member

@adamzhoul How about adding a note to the troubleshooting section of the tutorial? https://github.com/openyurtio/openyurt/blob/master/docs/tutorial/yurtctl.md#troubleshooting

@adamzhoul
Member

good idea
