
PD blocking at Pending status #1281

Closed
guiyang opened this issue Dec 4, 2019 · 5 comments
Labels
type/bug Something isn't working

Comments

@guiyang

guiyang commented Dec 4, 2019

Bug Report

What version of Kubernetes are you using?

Client Version: v1.16.3
Server Version: v1.16.3

What version of TiDB Operator are you using?

TiDB Operator Version: v1.0.3

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?
Storage class:

NAME            PROVISIONER                    AGE
local-storage   kubernetes.io/no-provisioner   125m

NAME                   STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
pd-tidb-cluster-pd-0   Pending                                      local-storage   97m
pd-tidb-cluster-pd-1   Pending                                      local-storage   97m
pd-tidb-cluster-pd-2   Pending                                      local-storage   97m

What's the status of the TiDB cluster pods?

NAME                                     READY   STATUS    RESTARTS   AGE
tidb-cluster-discovery-dd8458b6f-krkk8   1/1     Running   0          100m
tidb-cluster-monitor-769579b84-5bk5r     3/3     Running   0          100m
tidb-cluster-pd-0                        0/1     Pending   0          55m
tidb-cluster-pd-1                        0/1     Pending   0          55m
tidb-cluster-pd-2                        0/1     Pending   0          55m
~$ kubectl describe po -n db-cluster tidb-cluster-pd-0
Name:           tidb-cluster-pd-0
Namespace:      db-cluster
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=pd
                app.kubernetes.io/instance=tidb-cluster
                app.kubernetes.io/managed-by=tidb-operator
                app.kubernetes.io/name=tidb-cluster
                controller-revision-hash=tidb-cluster-pd-658997f6cd
                statefulset.kubernetes.io/pod-name=tidb-cluster-pd-0
Annotations:    pingcap.com/last-applied-configuration:
                  {"volumes":[{"name":"annotations","downwardAPI":{"items":[{"path":"annotations","fieldRef":{"fieldPath":"metadata.annotations"}}]}},{"name..."}
                prometheus.io/path: /metrics
                prometheus.io/port: 2379
                prometheus.io/scrape: true
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/tidb-cluster-pd
Containers:
  pd:
    Image:       registry.securitycloud.com/pingcap/pd:v3.0.5
    Ports:       2380/TCP, 2379/TCP
    Host Ports:  0/TCP, 0/TCP
    Command:
      /bin/sh
      /usr/local/bin/pd_start_script.sh
    Environment:
      NAMESPACE:          db-cluster (v1:metadata.namespace)
      PEER_SERVICE_NAME:  tidb-cluster-pd-peer
      SERVICE_NAME:       tidb-cluster-pd
      SET_NAME:           tidb-cluster-pd
      TZ:                 UTC
    Mounts:
      /etc/pd from config (ro)
      /etc/podinfo from annotations (ro)
      /usr/local/bin from startup-script (ro)
      /var/lib/pd from pd (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vtwzp (ro)
Volumes:
  pd:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pd-tidb-cluster-pd-0
    ReadOnly:   false
  annotations:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tidb-cluster-pd-cfa0d77a
    Optional:  false
  startup-script:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tidb-cluster-pd-cfa0d77a
    Optional:  false
  default-token-vtwzp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vtwzp
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

PV status

NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
local-pv-13042f9f   1951Mi     RWO            Delete           Available           local-storage            107m
local-pv-443e918c   1951Mi     RWO            Delete           Available           local-storage            108m
local-pv-6aab1cfd   1951Mi     RWO            Delete           Available           local-storage            108m
local-pv-d22c0a2a   1951Mi     RWO            Delete           Available           local-storage            109m
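Since the PVs are all Available while the PVCs stay Pending, it is the binding step that is stuck. Diagnostics along these lines (namespace and object names taken from the output above) would usually surface the scheduler-side reason:

```shell
# Describe a Pending PVC; its Events section typically names the binding problem
kubectl describe pvc -n db-cluster pd-tidb-cluster-pd-0

# Events for the Pending pod itself (scheduling failures show up here)
kubectl get events -n db-cluster \
  --field-selector involvedObject.name=tidb-cluster-pd-0
```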

Helm version

v3.0.0

What did you do?

~$ helm install tidb-operation tidb-operator-v1.0.4.tgz -n db-admin
~$ helm install tidb-cluster tidb-cluster-v1.0.4.tgz -n db-cluster
@guiyang guiyang changed the title PD blocking on Pending status PD blocking at Pending status Dec 4, 2019
@tennix

tennix commented Dec 4, 2019

@guiyang Is tidb-scheduler running in the db-admin namespace? Also, what is the volumeBindingMode of the local-storage storage class? If you've checked these and still can't find the cause, please provide the logs of both containers of the tidb-scheduler pod so we can help diagnose the problem.
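These checks could look roughly like the following (the pod name is a placeholder, and the container names assume the standard tidb-scheduler deployment, which runs a tidb-scheduler container alongside a kube-scheduler container):

```shell
# Is the tidb-scheduler pod up in the operator namespace?
kubectl get pods -n db-admin | grep tidb-scheduler

# What binding mode does the local-storage class use?
# (WaitForFirstConsumer defers binding until the pod is scheduled)
kubectl get storageclass local-storage -o jsonpath='{.volumeBindingMode}'

# Logs from both containers of the scheduler pod (replace the pod name)
kubectl logs -n db-admin tidb-scheduler-<pod-suffix> -c tidb-scheduler
kubectl logs -n db-admin tidb-scheduler-<pod-suffix> -c kube-scheduler
```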

@guiyang

guiyang commented Dec 4, 2019

@tennix Thanks. I checked kube-scheduler's logs and found that the problem was caused by a missing RBAC rule for CSINode. It worked fine after I updated the Helm template and reinstalled tidb-operator.

- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses", "csinodes"]
  verbs: ["get", "list", "watch"]
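For context, this rule belongs in the ClusterRole that the tidb-scheduler's kube-scheduler container runs under; in Kubernetes 1.16, kube-scheduler consults CSINode objects during scheduling, so without the `csinodes` entry scheduling (and therefore WaitForFirstConsumer volume binding) stalls. A minimal sketch of where the rule sits (ClusterRole name and surrounding manifest are assumed, only the rule itself is from this thread):

```yaml
# Sketch: ClusterRole fragment for the scheduler (name assumed for illustration)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tidb-scheduler
rules:
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses", "csinodes"]  # "csinodes" is the fix from this thread
  verbs: ["get", "list", "watch"]
```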

@cofyc

cofyc commented Dec 4, 2019

This change should be backported to release-1.0.

@tennix

tennix commented Dec 4, 2019

> @tennix Thanks. I checked kube-scheduler's logs and found that the problem was caused by a missing RBAC rule for CSINode. It worked fine after I updated the Helm template and reinstalled tidb-operator.
>
> - apiGroups: ["storage.k8s.io"]
>   resources: ["storageclasses", "csinodes"]
>   verbs: ["get", "list", "watch"]

Cool! Thank you for your feedback. @cofyc Could you send a PR to backport this into release-1.0?

@tennix tennix added the type/bug Something isn't working label Dec 4, 2019
@aylei

aylei commented Dec 25, 2019

closed via #1282
