Skip to content

Latest commit

 

History

History
661 lines (554 loc) · 23.5 KB

postgres.md

File metadata and controls

661 lines (554 loc) · 23.5 KB

New to KubeDB? Please start here.

Running PostgreSQL

This tutorial will show you how to use KubeDB to run a PostgreSQL database.

Before You Begin

At first, you need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using Minikube.

Now, install KubeDB cli on your workstation and KubeDB operator in your cluster following the steps here.

To keep things isolated, this tutorial uses a separate namespace called demo throughout this tutorial. This tutorial will also use a PGAdmin to connect and test PostgreSQL database, once it is running. Run the following command to prepare your cluster for this tutorial:

$ kubectl create -f ./docs/examples/postgres/demo-0.yaml
namespace "demo" created
deployment "pgadmin" created
service "pgadmin" created

$ kubectl get pods -n demo --watch
NAME                      READY     STATUS              RESTARTS   AGE
pgadmin-538449054-s046r   0/1       ContainerCreating   0          13s
pgadmin-538449054-s046r   1/1       Running   0          1m
^C⏎                                                                                                                                                             

$ kubectl get service -n demo
NAME      CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
pgadmin   10.0.0.92    <pending>     80:31188/TCP   1m

$ minikube ip
192.168.99.100

Now, open your browser and go to the following URL: http://{minikube-ip}:{pgadmin-svc-nodeport}. According to the above example, this URL will be http://192.168.99.100:31188. To log into the PGAdmin, use username admin and password admin.

Create a PostgreSQL database

KubeDB implements a Postgres CRD to define the specification of a PostgreSQL database. Below is the Postgres object created in this tutorial.

apiVersion: kubedb.com/v1alpha1
kind: Postgres
metadata:
  name: p1
  namespace: demo
spec:
  version: 9.5
  doNotPause: true
  storage:
    storageClassName: "standard"
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Mi
  init:
    scriptSource:
      scriptPath: "postgres-init-scripts/run.sh"
      gitRepo:
        repository: "https://github.com/k8sdb/postgres-init-scripts.git"

$ kubedb create -f ./docs/examples/postgres/demo-1.yaml
validating "./docs/examples/postgres/demo-1.yaml"
postgres "p1" created

Here,

  • spec.version is the version of PostgreSQL database. In this tutorial, a PostgreSQL 9.5 database is going to be created.

  • spec.doNotPause tells KubeDB operator that if this tpr is deleted, it should be automatically reverted. This should be set to true for production databases to avoid accidental deletion.

  • spec.storage specifies the StorageClass of PVC dynamically allocated to store data for this database. This storage spec will be passed to the StatefulSet created by KubeDB operator to run database pods. You can specify any StorageClass available in your cluster with appropriate resource requests. If no storage spec is given, an emptyDir is used.

  • spec.init.scriptSource specifies a bash script used to initialize the database after it is created. In this tutorial, run.sh script from the git repository https://github.com/k8sdb/postgres-init-scripts.git is used to create a dashboard table in data schema.

KubeDB operator watches for Postgres objects using Kubernetes api. When a Postgres object is created, KubeDB operator will create a new StatefulSet and a ClusterIP Service with the matching tpr name. KubeDB operator will also create a governing service for StatefulSets with the name kubedb, if one is not already present. If RBAC is enabled, a ClusterRole, ServiceAccount and ClusterRoleBinding with the matching tpr name will be created and used as the service account name for the corresponding StatefulSet.

$ kubedb describe pg -n demo p1
Name:		p1
Namespace:	demo
StartTimestamp:	Mon, 17 Jul 2017 15:31:34 -0700
Status:		Running
Volume:
  StorageClass:	standard
  Capacity:	50Mi
  Access Modes:	RWO

Service:	
  Name:		p1
  Type:		ClusterIP
  IP:		10.0.0.161
  Port:		db	5432/TCP

Database Secret:
  Name:	p1-admin-auth
  Type:	Opaque
  Data
  ====
  .admin:	35 bytes

No Snapshots.

Events:
  FirstSeen   LastSeen   Count     From                Type       Reason               Message
  ---------   --------   -----     ----                --------   ------               -------
  2m          2m         1         Postgres operator   Normal     SuccessfulValidate   Successfully validate Postgres
  2m          2m         1         Postgres operator   Normal     SuccessfulCreate     Successfully created Postgres
  2m          2m         1         Postgres operator   Normal     SuccessfulCreate     Successfully created StatefulSet
  3m          3m         1         Postgres operator   Normal     SuccessfulValidate   Successfully validate Postgres
  3m          3m         1         Postgres operator   Normal     Creating             Creating Kubernetes objects


$ kubectl get statefulset -n demo
NAME      DESIRED   CURRENT   AGE
p1        1         1         1m

$ kubectl get pvc -n demo
NAME        STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
data-p1-0   Bound     pvc-e90b87d4-6b5a-11e7-b9ca-080027f73ab7   50Mi RWO           standard       1m

$ kubectl get pv -n demo
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM            STORAGECLASS   REASON    AGE
pvc-e90b87d4-6b5a-11e7-b9ca-080027f73ab7   50Mi RWO           Delete          Bound     demo/data-p1-0   standard                 1m

$ kubectl get service -n demo
NAME      CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
kubedb    None         <none>                       3m
p1        10.0.0.143   <none>        5432/TCP       3m
pgadmin   10.0.0.120   <pending>     80:30576/TCP   6m

KubeDB operator sets the status.phase to Running once the database is successfully created. Run the following command to see the modified tpr:

$ kubedb get pg -n demo p1 -o yaml
apiVersion: kubedb.com/v1alpha1
kind: Postgres
metadata:
  creationTimestamp: 2017-07-17T22:31:34Z
  name: p1
  namespace: demo
  resourceVersion: "2677"
  selfLink: /apis/kubedb.com/v1alpha1/namespaces/demo/postgreses/p1
  uid: b02ccec1-6b3f-11e7-bdc0-080027aa4456
spec:
  databaseSecret:
    secretName: p1-admin-auth
  doNotPause: true
  init:
    scriptSource:
      gitRepo:
        repository: https://github.com/k8sdb/postgres-init-scripts.git
      scriptPath: postgres-init-scripts/run.sh
  resources: {}
  storage:
    accessModes:
    - ReadWriteOnce
    storageClassName: standard
    resources:
      requests:
        storage: 50Mi
  version: "9.5"
status:
  creationTime: 2017-07-17T22:31:34Z
  phase: Running

Please note that KubeDB operator has created a new Secret called p1-admin-auth (format: {tpr-name}-admin-auth) for storing the password for postgres superuser. This secret contains a .admin key with a ini formatted key-value pairs. If you want to use an existing secret please specify that when creating the tpr using spec.databaseSecret.secretName.

Now, you can connect to this database from the PGAdmin dashboard using the database pod IP and postgres user password. Now, open your browser and go to the following URL: http://{minikube-ip}:{pgadmin-svc-nodeport}. To log into the PGAdmin, use username admin and password admin.

$ kubectl get pods p1-0 -n demo -o yaml | grep IP
  hostIP: 192.168.99.100
  podIP: 172.17.0.6

$ kubectl get secrets -n demo p1-admin-auth -o jsonpath='{.data.\.admin}' | base64 -d
POSTGRES_PASSWORD=R9keKKRTqSJUPtNC

Using p1 from PGAdmin4

Database Snapshots

Instant Backups

Now, you can easily take a snapshot of this database by creating a Snapshot tpr. When a Snapshot tpr is created, KubeDB operator will launch a Job that runs the pg_dump command and uploads the output sql file to various cloud providers S3, GCS, Azure, OpenStack Swift and/or locally mounted volumes using osm.

In this tutorial, snapshots will be stored in a Google Cloud Storage (GCS) bucket. To do so, a secret is needed that has the following 2 keys:

Key Description
GOOGLE_PROJECT_ID Required. Google Cloud project ID
GOOGLE_SERVICE_ACCOUNT_JSON_KEY Required. Google Cloud service account json key
$ echo -n '<your-project-id>' > GOOGLE_PROJECT_ID
$ mv downloaded-sa-json.key > GOOGLE_SERVICE_ACCOUNT_JSON_KEY
$ kubectl create secret generic pg-snap-secret -n demo \
    --from-file=./GOOGLE_PROJECT_ID \
    --from-file=./GOOGLE_SERVICE_ACCOUNT_JSON_KEY
secret "pg-snap-secret" created
$ kubectl get secret pg-snap-secret -n demo -o yaml

apiVersion: v1
data:
  GOOGLE_PROJECT_ID: PHlvdXItcHJvamVjdC1pZD4=
  GOOGLE_SERVICE_ACCOUNT_JSON_KEY: ewogICJ0eXBlIjogInNlcnZpY2VfYWNjb3V...9tIgp9Cg==
kind: Secret
metadata:
  creationTimestamp: 2017-07-17T18:06:51Z
  name: pg-snap-secret
  namespace: demo
  resourceVersion: "5461"
  selfLink: /api/v1/namespaces/demo/secrets/pg-snap-secret
  uid: a6983b00-5c02-11e7-bb52-08002711f4aa
type: Opaque

To lean how to configure other storage destinations for Snapshots, please visit here. Now, create the Snapshot tpr.

$ kubedb create -f ./docs/examples/postgres/demo-2.yaml
validating "./docs/examples/postgres/demo-2.yaml"
snapshot "p1-xyz" created

$ kubedb get snap -n demo
NAME      DATABASE   STATUS    AGE
p1-xyz    pg/p1      Running   22s
$ kubedb get snap -n demo p1-xyz -o yaml
apiVersion: kubedb.com/v1alpha1
kind: Snapshot
metadata:
  creationTimestamp: 2017-07-18T02:18:00Z
  labels:
    kubedb.com/kind: Postgres
    kubedb.com/name: p1
  name: p1-xyz
  namespace: demo
  resourceVersion: "2973"
  selfLink: /apis/kubedb.com/v1alpha1/namespaces/demo/snapshots/p1-xyz
  uid: 5269701f-6b5f-11e7-b9ca-080027f73ab7
spec:
  databaseName: p1
  gcs:
    bucket: restic
  resources: {}
  storageSecretName: snap-secret
status:
  completionTime: 2017-07-18T02:19:11Z
  phase: Succeeded
  startTime: 2017-07-18T02:18:00Z

Here,

  • metadata.labels should include the type of database kubedb.com/kind: Postgres whose snapshot will be taken.

  • spec.databaseName points to the database whose snapshot is taken.

  • spec.storageSecretName points to the Secret containing the credentials for snapshot storage destination.

  • spec.gcs.bucket points to the bucket name used to store the snapshot data.

You can also run the kubedb describe command to see the recent snapshots taken for a database.

$ kubedb describe pg -n demo p1
Name:		p1
Namespace:	demo
StartTimestamp:	Mon, 17 Jul 2017 18:46:24 -0700
Status:		Running
Volume:
  StorageClass:	standard
  Capacity:	50Mi
  Access Modes:	RWO

Service:	
  Name:		p1
  Type:		ClusterIP
  IP:		10.0.0.143
  Port:		db	5432/TCP

Database Secret:
  Name:	p1-admin-auth
  Type:	Opaque
  Data
  ====
  .admin:	35 bytes

Snapshots:
  Name     Bucket      StartTime                         CompletionTime                    Phase
  ----     ------      ---------                         --------------                    -----
  p1-xyz   gs:restic   Mon, 17 Jul 2017 19:18:00 -0700   Mon, 17 Jul 2017 19:19:11 -0700   Succeeded

Events:
  FirstSeen   LastSeen   Count     From                  Type       Reason               Message
  ---------   --------   -----     ----                  --------   ------               -------
  1m          1m         1         Snapshot Controller   Normal     SuccessfulSnapshot   Successfully completed snapshot
  2m          2m         1         Snapshot Controller   Normal     Starting             Backup running
  33m         33m        1         Postgres operator     Normal     SuccessfulValidate   Successfully validate Postgres
  33m         33m        1         Postgres operator     Normal     SuccessfulCreate     Successfully created StatefulSet
  33m         33m        1         Postgres operator     Normal     SuccessfulCreate     Successfully created Postgres
  34m         34m        1         Postgres operator     Normal     Creating             Creating Kubernetes objects
  34m         34m        1         Postgres operator     Normal     SuccessfulValidate   Successfully validate Postgres

Once the snapshot Job is complete, you should see the output of the pg_dump command stored in the GCS bucket.

snapshot-console

From the above image, you can see that the snapshot output is stored in a folder called {bucket}/kubedb/{namespace}/{tpr}/{snapshot}/.

Scheduled Backups

KubeDB supports taking periodic backups for a database using a cron expression. To take periodic backups, edit the Postgres tpr to add spec.backupSchedule section.

$ kubedb edit pg p1 -n demo

apiVersion: kubedb.com/v1alpha1
kind: Postgres
metadata:
  name: p1
  namespace: demo
spec:
  version: 9.5
  doNotPause: true
  storage:
    storageClassName: "standard"
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Mi
  init:
    scriptSource:
      scriptPath: "postgres-init-scripts/run.sh"
      gitRepo:
        repository: "https://github.com/k8sdb/postgres-init-scripts.git"
  backupSchedule:
    cronExpression: "@every 1m"
    storageSecretName: snap-secret
    gcs:
      bucket: restic

Once the spec.backupSchedule is added, KubeDB operator will create a new Snapshot tpr on each tick of the cron expression. This triggers KubeDB operator to create a Job as it would for any regular instant backup process. You can see the snapshots as they are created using kubedb get snap command.

$ kubedb get snap -n demo
NAME                 DATABASE   STATUS      AGE
p1-20170718-030836   pg/p1      Succeeded   1m
p1-20170718-030956   pg/p1      Running     2s
p1-xyz               pg/p1      Succeeded   51m

Restore from Snapshot

You can create a new database from a previously taken Snapshot. Specify the Snapshot name in the spec.init.snapshotSource field of a new Postgres tpr. See the example recovered tpr below:

$ cat ./docs/examples/postgres/demo-4.yaml
apiVersion: kubedb.com/v1alpha1
kind: Postgres
metadata:
  name: recovered
  namespace: demo
spec:
  version: 9.5
  doNotPause: true
  storage:
    storageClassName: "standard"
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 50Mi
  init:
    snapshotSource:
      name: p1-xyz

$ kubectl create -f ./docs/examples/postgres/demo-4.yaml
postgres "recovered" created

Here,

  • spec.init.snapshotSource.name refers to a Snapshot tpr for a Postgres database in the same namespaces as this new recovered Postgres tpr.

Now, wait several seconds. KubeDB operator will create a new StatefulSet. Then KubeDB operator launches a Kubernetes Job to initialize the new database using the data from p1-xyz Snapshot.

$ kubedb get pg -n demo
NAME        STATUS    AGE
p1          Running   10m
recovered   Running   6m

$ kubedb describe pg -n demo recovered
Name:		recovered
Namespace:	demo
StartTimestamp:	Tue, 18 Jul 2017 16:07:18 -0700
Status:		Running
Volume:
  StorageClass:	standard
  Capacity:	50Mi
  Access Modes:	RWO

Service:
  Name:		recovered
  Type:		ClusterIP
  IP:		10.0.0.234
  Port:		db	5432/TCP

Database Secret:
  Name:	recovered-admin-auth
  Type:	Opaque
  Data
  ====
  .admin:	35 bytes

No Snapshots.

Events:
  FirstSeen   LastSeen   Count     From                Type       Reason               Message
  ---------   --------   -----     ----                --------   ------               -------
  3m          3m         1         Postgres operator   Normal     SuccessfulValidate   Successfully validate Postgres
  3m          3m         1         Postgres operator   Warning    Failed               Failed to complete initialization
  3m          3m         1         Postgres operator   Normal     SuccessfulCreate     Successfully created Postgres
  5m          5m         1         Postgres operator   Normal     SuccessfulCreate     Successfully created StatefulSet
  5m          5m         1         Postgres operator   Normal     Initializing         Initializing from Snapshot: "p1-xyz"
  5m          5m         1         Postgres operator   Normal     Creating             Creating Kubernetes objects
  5m          5m         1         Postgres operator   Normal     SuccessfulValidate   Successfully validate Postgres

Pause Database

Since the Postgres tpr created in this tpr has spec.doNotPause set to true, if you delete the tpr, KubeDB operator will recreate the tpr and essentially nullify the delete operation. You can see this below:

$ kubedb delete pg p1 -n demo
error: Postgres "p1" can't be paused. To continue delete, unset spec.doNotPause and retry.

Now, run kubedb edit pg p1 -n demo to set spec.doNotPause to false or remove this field (which default to false). Then if you delete the Postgres tpr, KubeDB operator will delete the StatefulSet and its pods, but leaves the PVCs unchanged. In KubeDB parlance, we say that p1 PostgreSQL database has entered into dormant state. This is represented by KubeDB operator by creating a matching DormantDatabase tpr.

$ kubedb delete pg -n demo p1
postgres "p1" deleted

$ kubedb get drmn -n demo p1
NAME      STATUS    AGE
p1        Pausing   20s

$ kubedb get drmn -n demo p1
NAME      STATUS    AGE
p1        Paused    3m

$ kubedb get drmn -n demo p1 -o yaml
apiVersion: kubedb.com/v1alpha1
kind: DormantDatabase
metadata:
  creationTimestamp: 2017-07-18T03:23:08Z
  labels:
    kubedb.com/kind: Postgres
  name: p1
  namespace: demo
  resourceVersion: "8004"
  selfLink: /apis/kubedb.com/v1alpha1/namespaces/demo/dormantdatabases/p1
  uid: 6ba8d3c9-6b68-11e7-b9ca-080027f73ab7
spec:
  origin:
    metadata:
      creationTimestamp: null
      name: p1
      namespace: demo
    spec:
      postgres:
        backupSchedule:
          cronExpression: '@every 1m'
          gcs:
            bucket: restic
          resources: {}
          storageSecretName: snap-secret
        databaseSecret:
          secretName: p1-admin-auth
        init:
          scriptSource:
            gitRepo:
              repository: https://github.com/k8sdb/postgres-init-scripts.git
            scriptPath: postgres-init-scripts/run.sh
        resources: {}
        storage:
          accessModes:
          - ReadWriteOnce
          storageClassName: standard
          resources:
            requests:
              storage: 50Mi
        version: "9.5"
status:
  creationTime: 2017-07-18T03:23:08Z
  pausingTime: 2017-07-18T03:23:48Z
  phase: Paused

Here,

  • spec.origin is the spec of the original spec of the original Postgres tpr.

  • status.phase points to the current database state Paused.

Resume Dormant Database

To resume the database from the dormant state, set spec.resume to true in the DormantDatabase tpr.

$ kubedb edit drmn -n demo p1

apiVersion: kubedb.com/v1alpha1
kind: DormantDatabase
metadata:
  creationTimestamp: 2017-07-18T03:23:08Z
  labels:
    kubedb.com/kind: Postgres
  name: p1
  namespace: demo
  resourceVersion: "8004"
  selfLink: /apis/kubedb.com/v1alpha1/namespaces/demo/dormantdatabases/p1
  uid: 6ba8d3c9-6b68-11e7-b9ca-080027f73ab7
spec:
  resume: true
  origin:
    metadata:
      creationTimestamp: null
      name: p1
      namespace: demo
    spec:
      postgres:
        backupSchedule:
          cronExpression: '@every 1m'
          gcs:
            bucket: restic
          resources: {}
          storageSecretName: snap-secret
        databaseSecret:
          secretName: p1-admin-auth
        init:
          scriptSource:
            gitRepo:
              repository: https://github.com/k8sdb/postgres-init-scripts.git
            scriptPath: postgres-init-scripts/run.sh
        resources: {}
        storage:
          accessModes:
          - ReadWriteOnce
          storageClassName: standard
          resources:
            requests:
              storage: 50Mi
        version: "9.5"
status:
  creationTime: 2017-07-18T03:23:08Z
  pausingTime: 2017-07-18T03:23:48Z
  phase: Paused

KubeDB operator will notice that spec.resume is set to true. KubeDB operator will delete the DormantDatabase tpr and create a new Postgres tpr using the original spec. This will in turn start a new StatefulSet which will mount the originally created PVCs. Thus the original database is resumed.

Wipeout Dormant Database

You can also wipe out a DormantDatabase by setting spec.wipeOut to true. KubeDB operator will delete the PVCs, delete any relevant Snapshot tprs for this database and also delete snapshot data stored in the Cloud Storage buckets. There is no way to resume a wiped out database. So, be sure before you wipe out a database.

$ kubedb edit drmn -n demo p1
# set spec.wipeOut: true

$ kubedb get drmn -n demo p1 -o yaml
apiVersion: kubedb.com/v1alpha1
kind: DormantDatabase
metadata:
  creationTimestamp: 2017-07-18T03:23:08Z
  labels:
    kubedb.com/kind: Postgres
  name: p1
  namespace: demo
  resourceVersion: "15223"
  selfLink: /apis/kubedb.com/v1alpha1/namespaces/demo/dormantdatabases/p1
  uid: 6ba8d3c9-6b68-11e7-b9ca-080027f73ab7
spec:
  origin:
    metadata:
      creationTimestamp: null
      name: p1
      namespace: demo
    spec:
      postgres:
        backupSchedule:
          cronExpression: '@every 1m'
          gcs:
            bucket: restic
          resources: {}
          storageSecretName: snap-secret
        databaseSecret:
          secretName: p1-admin-auth
        init:
          scriptSource:
            gitRepo:
              repository: https://github.com/k8sdb/postgres-init-scripts.git
            scriptPath: postgres-init-scripts/run.sh
        resources: {}
        storage:
          accessModes:
          - ReadWriteOnce
          storageClassName: standard
          resources:
            requests:
              storage: 50Mi
        version: "9.5"
  wipeOut: true
status:
  creationTime: 2017-07-18T03:23:08Z
  pausingTime: 2017-07-18T03:23:48Z
  phase: WipedOut
  wipeOutTime: 2017-07-18T05:09:59Z

$ kubedb get drmn -n demo
NAME      STATUS     AGE
p1        WipedOut   1h

Delete Dormant Database

You still have a record that there used to be a Postgres database p1 in the form of a DormantDatabase database p1. Since you have already wiped out the database, you can delete the DormantDatabase tpr.

$ kubedb delete drmn p1 -n demo
dormantdatabase "p1" deleted

Cleaning up

To cleanup the Kubernetes resources created by this tutorial, run:

$ kubectl delete ns demo

If you would like to uninstall KubeDB operator, please follow the steps here.

Next Steps

  • Learn about the details of Postgres tpr here.
  • See the list of supported storage providers for snapshots here.
  • Thinking about monitoring your database? KubeDB works out-of-the-box with Prometheus.
  • Learn how to use KubeDB in a RBAC enabled cluster.
  • Wondering what features are coming next? Please visit here.
  • Want to hack on KubeDB? Check our contribution guidelines.