fix and enhancement during my first try, hope it helps #306

Closed
wants to merge 4 commits
5 changes: 4 additions & 1 deletion config/setup/yurt-controller-manager.yaml
@@ -105,7 +105,9 @@ spec:
serviceAccountName: yurt-controller-manager
hostNetwork: true
tolerations:
- operator: "Exists"
- key: "node-role.kubernetes.io/master"
effect: ""
operator: "Exists"
Member:

I think the original operator: "Exists" makes yurt-controller-manager tolerate all taints. Could you explain the reason for adding key: "node-role.kubernetes.io/master"?

Author:

Without an explicit blank assignment to effect, it cannot be scheduled to the master node, as I verified. Maybe the original intention was to tolerate any taint; in that case we could set key to "", but in my opinion the strict and explicit way is better.
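For reference, a minimal sketch of the two toleration styles under discussion, with comments describing the standard Kubernetes matching semantics; the values mirror the diff above and nothing here is project-specific:

tolerations:
# Original form: an empty key with operator "Exists" matches every taint,
# including node-role.kubernetes.io/master:NoSchedule.
- operator: "Exists"

tolerations:
# Proposed form: matches only the master taint; the empty effect means
# "any effect" (NoSchedule, PreferNoSchedule, NoExecute) for that key.
- key: "node-role.kubernetes.io/master"
  effect: ""
  operator: "Exists"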

Member:

In my cluster, with operator: "Exists" the yurt-controller-manager can be scheduled to the master node:

[root@n80 ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE    VERSION
master   Ready    master   124d   v1.16.0
n80      Ready    <none>   124d   v1.16.0

[root@n80 ~]# kubectl get pod -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-58cc8c89f4-c2z88                   1/1     Running   2          84d
kube-system   coredns-58cc8c89f4-m6v2b                   1/1     Running   2          84d
kube-system   etcd-master                                1/1     Running   2          124d
kube-system   kube-apiserver-master                      1/1     Running   2          124d
kube-system   kube-controller-manager-master             1/1     Running   3          124d
kube-system   kube-flannel-ds-79ckt                      1/1     Running   2          124d
kube-system   kube-flannel-ds-q886f                      1/1     Running   0          3d9h
kube-system   kube-proxy-44cfx                           1/1     Running   0          3d9h
kube-system   kube-proxy-rk49h                           1/1     Running   2          124d
kube-system   kube-scheduler-master                      1/1     Running   2          124d
kube-system   yurt-controller-manager-5b67549d9b-k6lhb   1/1     Running   0          8s
kube-system   yurt-hub-n80                               1/1     Running   0          5s
kube-system   yurt-tunnel-server-d84666f6c-nvrb8         1/1     Running   0          7s
kube-system   yurtctl-servant-convert-n80-4r6vg          1/1     Running   0          7s


[root@n80 ~]# kubectl describe pod -n kube-system yurt-controller-manager-5b67549d9b-k6lhb 
Name:         yurt-controller-manager-5b67549d9b-k6lhb
Namespace:    kube-system
Priority:     0
Node:         master/10.10.102.78
Start Time:   Tue, 25 May 2021 20:19:43 +0800
Labels:       app=yurt-controller-manager
              pod-template-hash=5b67549d9b
Annotations:  <none>
Status:       Running
IP:           10.10.102.78
IPs:
  IP:           10.10.102.78
Controlled By:  ReplicaSet/yurt-controller-manager-5b67549d9b
Containers:
  yurt-controller-manager:
    Container ID:  docker://5a0c2e724d5998d73cb077468351d250252e2432851e0971a80e0d82b87c62a6
    Image:         registry.cn-hangzhou.aliyuncs.com/openyurttest/yurt-controller-manager:v0.4.0-amd64
    Image ID:      docker-pullable://registry.cn-hangzhou.aliyuncs.com/openyurttest/yurt-controller-manager@sha256:c13082cb9171b82a698ccc96dd710235f14227e47bec5a050c6258a8e269f80b
    Port:          <none>
    Host Port:     <none>
    Command:
      yurt-controller-manager
    State:          Running
      Started:      Tue, 25 May 2021 20:19:45 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from yurt-controller-manager-token-2ntpx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  yurt-controller-manager-token-2ntpx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  yurt-controller-manager-token-2ntpx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     
Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned kube-system/yurt-controller-manager-5b67549d9b-k6lhb to master
  Normal  Pulled     87s        kubelet, master    Container image "registry.cn-hangzhou.aliyuncs.com/openyurttest/yurt-controller-manager:v0.4.0-amd64" already present on machine
  Normal  Created    87s        kubelet, master    Created container yurt-controller-manager
  Normal  Started    87s        kubelet, master    Started container yurt-controller-manager


[root@n80 ~]# kubectl describe node master
Name:               master
Roles:              master
Labels:             alibabacloud.com/is-edge-worker=false
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=master
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
                    openyurt.io/is-edge-worker=false
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"96:e7:51:e3:0e:59"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.10.102.78
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 21 Jan 2021 12:34:52 +0800
Taints:             node-role.kubernetes.io/master:NoSchedule
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 07 May 2021 16:33:05 +0800   Fri, 07 May 2021 16:33:05 +0800   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Tue, 25 May 2021 20:20:06 +0800   Fri, 22 Jan 2021 22:36:24 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 25 May 2021 20:20:06 +0800   Tue, 09 Mar 2021 20:34:17 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 25 May 2021 20:20:06 +0800   Fri, 22 Jan 2021 22:36:24 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 25 May 2021 20:20:06 +0800   Tue, 25 May 2021 18:45:23 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.10.102.78
  Hostname:    master
Capacity:
 cpu:                2
 ephemeral-storage:  17394Mi
 hugepages-2Mi:      0
 memory:             3882072Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  16415037823
 hugepages-2Mi:      0
 memory:             3779672Ki
 pods:               110
System Info:
 Machine ID:                 a60cd88e65b74f27920dc57fc8b1f9da
 System UUID:                4237B2C9-A1A7-1A01-8A0A-6B468BE3B652
 Boot ID:                    df0415ce-5ff6-4f1f-a5b8-27df5135fa62
 Kernel Version:             3.10.0-693.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://19.3.8
 Kubelet Version:            v1.16.0
 Kube-Proxy Version:         v1.16.0
PodCIDR:                     10.244.0.0/24
PodCIDRs:                    10.244.0.0/24
Non-terminated Pods:         (10 in total)
  Namespace                  Name                                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                        ------------  ----------  ---------------  -------------  ---
  kube-system                coredns-58cc8c89f4-c2z88                    100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     84d
  kube-system                coredns-58cc8c89f4-m6v2b                    100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     84d
  kube-system                etcd-master                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         124d
  kube-system                kube-apiserver-master                       250m (12%)    0 (0%)      0 (0%)           0 (0%)         124d
  kube-system                kube-controller-manager-master              200m (10%)    0 (0%)      0 (0%)           0 (0%)         124d
  kube-system                kube-flannel-ds-79ckt                       100m (5%)     100m (5%)   50Mi (1%)        50Mi (1%)      124d
  kube-system                kube-proxy-rk49h                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         124d
  kube-system                kube-scheduler-master                       100m (5%)     0 (0%)      0 (0%)           0 (0%)         124d
  kube-system                yurt-controller-manager-5b67549d9b-k6lhb    0 (0%)        0 (0%)      0 (0%)           0 (0%)         2m15s
  kube-system                yurt-tunnel-server-d84666f6c-nvrb8          0 (0%)        0 (0%)      0 (0%)           0 (0%)         2m14s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                850m (42%)  100m (5%)
  memory             190Mi (5%)  390Mi (10%)
  ephemeral-storage  0 (0%)      0 (0%)
Events:              <none>

Author:

Here are my previous logs; after applying the change, the issue was fixed.

box@joez-work-op-vm-2:~/share/repo/openyurt/openyurt-images$ kubectl get po -A -o wide
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE    IP              NODE                NOMINATED NODE   READINESS GATES
kube-system   coredns-7ff77c879f-fsgjp                          1/1     Running   0          20m    10.244.0.2      joez-work-op-vm-2   <none>           <none>
kube-system   coredns-7ff77c879f-t54gh                          1/1     Running   0          20m    10.244.0.3      joez-work-op-vm-2   <none>           <none>
kube-system   etcd-joez-work-op-vm-2                            1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-apiserver-joez-work-op-vm-2                  1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-controller-manager-joez-work-op-vm-2         1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-flannel-ds-d9r8c                             1/1     Running   0          20m    10.67.103.191   joez-work-op-vm-3   <none>           <none>
kube-system   kube-flannel-ds-kzpkp                             1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-proxy-v2r5z                                  1/1     Running   0          20m    10.67.103.191   joez-work-op-vm-3   <none>           <none>
kube-system   kube-proxy-zqnbl                                  1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-scheduler-joez-work-op-vm-2                  1/1     Running   0          20m    10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   yurt-controller-manager-5d4b5ffb89-q2tbl          1/1     Running   0          4m5s   10.67.103.191   joez-work-op-vm-3   <none>           <none>
kube-system   yurt-hub-joez-work-op-vm-3                        1/1     Running   0          4m5s   10.67.103.191   joez-work-op-vm-3   <none>           <none>
kube-system   yurtctl-servant-convert-joez-work-op-vm-3-4rhfs   1/1     Running   0          4m5s   10.67.103.191   joez-work-op-vm-3   <none>           <none>

Warning  FailedScheduling  <unknown>  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

But with the latest code, there is no such issue:

box@joez-work-op-vm-2:~/share/repo/openyurt/openyurt-release$ kubectl get node -o wide
NAME                STATUS   ROLES    AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
joez-work-op-vm-2   Ready    master   2m6s   v1.18.0   10.67.103.75    <none>        Ubuntu 18.04.5 LTS   4.15.0-143-generic   docker://20.10.2
joez-work-op-vm-3   Ready    <none>   95s    v1.18.0   10.67.103.191   <none>        Ubuntu 18.04.5 LTS   4.15.0-112-generic   docker://20.10.2
box@joez-work-op-vm-2:~/share/repo/openyurt/openyurt-release$ ./yurtctl convert -c joez-work-op-vm-2 -p kubeadm
I0526 10:48:50.203562   17897 convert.go:273] mark joez-work-op-vm-2 as the cloud-node
I0526 10:48:50.230939   17897 convert.go:466] kube-public/cluster-info configmap already exists, skip to prepare it
I0526 10:48:50.230959   17897 convert.go:353] deploying the yurt-hub and resetting the kubelet service...
I0526 10:49:20.256159   17897 util.go:320] servant job(yurtctl-servant-convert-joez-work-op-vm-3) has succeeded
I0526 10:49:20.256194   17897 convert.go:377] the yurt-hub is deployed
box@joez-work-op-vm-2:~/share/repo/openyurt/openyurt-release$ kubectl get po -A -o wide
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE     IP              NODE                NOMINATED NODE   READINESS GATES
kube-system   coredns-7ff77c879f-7qm58                    1/1     Running   0          7m28s   10.244.0.3      joez-work-op-vm-2   <none>           <none>
kube-system   coredns-7ff77c879f-nd6t9                    1/1     Running   0          7m28s   10.244.0.2      joez-work-op-vm-2   <none>           <none>
kube-system   etcd-joez-work-op-vm-2                      1/1     Running   0          7m37s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-apiserver-joez-work-op-vm-2            1/1     Running   0          7m37s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-controller-manager-joez-work-op-vm-2   1/1     Running   0          7m37s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-proxy-dfrcj                            1/1     Running   0          7m15s   10.67.103.191   joez-work-op-vm-3   <none>           <none>
kube-system   kube-proxy-w8z7t                            1/1     Running   0          7m28s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   kube-scheduler-joez-work-op-vm-2            1/1     Running   0          7m37s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
kube-system   yurt-controller-manager-9d749b975-kvk6q     1/1     Running   0          5m23s   10.67.103.75    joez-work-op-vm-2   <none>           <none>
box@joez-work-op-vm-2:~/share/repo/openyurt/openyurt-release$ kubectl get -n kube-system -o yaml Deployment/yurt-controller-manager | grep -A4 ' tolerations'
      tolerations:
      - operator: Exists

Member:

I think the operator: "Exists" toleration may have been added recently, so the above error is fixed. And operator: "Exists" can tolerate all taints, so it's more suitable for yurt-controller-manager.

affinity:
nodeAffinity:
# we prefer allocating ecm on cloud node
@@ -120,5 +122,6 @@ spec:
containers:
- name: yurt-controller-manager
image: openyurt/yurt-controller-manager:latest
imagePullPolicy: IfNotPresent
command:
- yurt-controller-manager
2 changes: 1 addition & 1 deletion config/setup/yurthub.yaml
@@ -22,7 +22,7 @@ spec:
containers:
- name: yurt-hub
image: openyurt/yurthub:latest
imagePullPolicy: Always
imagePullPolicy: IfNotPresent
volumeMounts:
- name: kubernetes
mountPath: /etc/kubernetes
4 changes: 3 additions & 1 deletion config/yaml-template/yurt-controller-manager.yaml
@@ -104,7 +104,9 @@ spec:
serviceAccountName: __project_prefix__-controller-manager
hostNetwork: true
tolerations:
- operator: "Exists"
- key: "node-role.kubernetes.io/master"
effect: ""
operator: "Exists"
Member:

Could you explain the reason for adding key: "node-role.kubernetes.io/master"?

Author:

Same as the previous one.

affinity:
nodeAffinity:
# we prefer allocating ecm on cloud node
5 changes: 5 additions & 0 deletions pkg/yurtctl/cmd/convert/convert.go
@@ -280,6 +280,11 @@ func (co *ConvertOptions) RunConvert() (err error) {
edgeNodeNames = append(edgeNodeNames, node.GetName())
}

if len(co.CloudNodes) < 1 {
klog.Errorf("At least one cloud node should be provided!")
return
}

Member:

Maybe it would be better to check co.CloudNodes in func Complete()?

Author:

Yes, you are right. My first attempt was to find a master node automatically when none was provided, but yes, keep it simple and explicit, and fail early.
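For illustration, a minimal sketch of how the check could move into Complete(), assuming ConvertOptions already has a Complete method wired to the command's flags; the signature and the error message here are my assumptions, not code from this PR:

// Sketch only: the ConvertOptions type, the rest of the flag handling, and the
// imports ("fmt", "github.com/spf13/pflag") come from the existing convert.go.
func (co *ConvertOptions) Complete(flags *pflag.FlagSet) error {
	// ... existing flag parsing ...

	// Fail early: converting a cluster needs at least one cloud node.
	if len(co.CloudNodes) < 1 {
		return fmt.Errorf("at least one cloud node should be provided")
	}
	return nil
}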

// 3. deploy yurt controller manager
// create a service account for yurt-controller-manager
err = kubeutil.CreateServiceAccountFromYaml(co.clientSet,
9 changes: 6 additions & 3 deletions pkg/yurtctl/constants/constants.go
@@ -144,7 +144,9 @@ spec:
serviceAccountName: yurt-controller-manager
hostNetwork: true
tolerations:
- operator: "Exists"
- key: "node-role.kubernetes.io/master"
effect: ""
operator: "Exists"
Member:

Could you explain the reason for adding key: "node-role.kubernetes.io/master"?

Author:

Same as the previous one.

affinity:
nodeAffinity:
# we prefer allocating ecm on cloud node
@@ -159,6 +161,7 @@ spec:
containers:
- name: yurt-controller-manager
image: {{.image}}
imagePullPolicy: IfNotPresent
command:
- yurt-controller-manager
`
@@ -184,7 +187,7 @@ spec:
containers:
- name: yurtctl-servant
image: {{.yurtctl_servant_image}}
imagePullPolicy: Always
imagePullPolicy: IfNotPresent
command:
- /bin/sh
- -c
@@ -229,7 +232,7 @@ spec:
containers:
- name: yurtctl-servant
image: {{.yurtctl_servant_image}}
imagePullPolicy: Always
imagePullPolicy: IfNotPresent
command:
- /bin/sh
- -c
62 changes: 62 additions & 0 deletions release-openyurt
@@ -0,0 +1,62 @@
#!/usr/bin/env bash
Member:

You can run the make release command in the openyurt repository to generate the images of the OpenYurt components. It looks like this file is a local test script; how about removing it from this PR?

Author:

OK, I will remove it from this PR. It is kind of an enhancement, especially for local development: you can build and deliver, then load and deploy, with one command (see the usage sketch after the script below).

# author: joe.zheng
# version: 21.5.17

set -e

repo="${REL_REPO:-openyurt}"
arch="${REL_ARCH:-amd64}"
tag="${REL_TAG:-latest}"
out="_output/local/bin/linux"
saved="$repo-release"
loads="$saved/load"
images="
yurthub
yurt-controller-manager
yurtctl-servant
yurt-tunnel-server
yurt-tunnel-agent
"

echo "build images"
make release ARCH=$arch REPO=$repo GIT_VERSION=$tag

echo "clear $saved"
rm -rf $saved
mkdir -p $saved

echo "copy yurtctl"
cp $out/$arch/yurtctl $saved

echo "save images"
for i in $images; do
old="$repo/$i:$tag-$arch"
new="$repo/$i:$tag"
echo "tag $old to $new"
docker tag $old $new
echo "saving $new"
docker save $new | gzip > $saved/$i.tar.gz
done

echo "create $loads"
cat <<'EOF' > $loads
#!/usr/bin/env bash
# author: joe.zheng
# version: 21.5.17

set -e

cd $(dirname $0)

images="$(ls *.tar.gz)"
echo "load images:"
for i in $images; do
echo "loading $i"
docker load -i $i
done
echo "done"
EOF

chmod a+x $loads

echo "done"