Cloud backups for a group of PVCs


This document will show you how to create group cloud snapshots of Portworx volumes and how you can clone those snapshots to use them in pods.

Pre-requisites

Installing Stork

Group cloud snapshots require Stork to be installed and running on your Kubernetes cluster. If you fetched the Portworx specs from the Portworx spec generator in PX-Central and used the default options, Stork is already installed.

Kubernetes Version

Group snapshots are supported in the following Kubernetes versions:

  • 1.10 and above
  • 1.9.4 and above
  • 1.8.9 and above

Configuring cloud secrets

To create cloud snapshots, you need to set up secrets with Portworx. Portworx uses these secrets to connect to and authenticate with the configured cloud provider.

Follow the instructions on the pxctl credentials page to set up secrets.

Portworx and Stork version

Group cloud snapshots using Stork are supported in Portworx and Stork 2.0.2 and above. If you are running a lower version, refer to Upgrade on Kubernetes to upgrade Portworx to 2.0.2 or above.
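
As a quick check of the running Stork version, the image tag of the Stork deployment can be inspected. The sketch below is only an illustration: stork_image is a hypothetical helper name, and it assumes that Stork runs as a deployment named stork in the kube-system namespace, which may not match your installation.

```shell
# Print the container image (and thus the version tag) of the Stork deployment.
# Assumption: the deployment is named "stork"; the namespace defaults to
# kube-system and can be overridden with the first argument.
stork_image() {
  kubectl get deployment stork \
    --namespace "${1:-kube-system}" \
    --output jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Calling stork_image with no arguments queries kube-system; pass a namespace as the first argument if Stork is installed elsewhere.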

Creating group cloud snapshots

To take group snapshots, you need to use the GroupVolumeSnapshot CRD object and set portworx/snapshot-type to cloud. Here is a simple example:

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud

The above spec takes a group snapshot of all PVCs that match the label app=cassandra.

The Examples section has a more detailed end-to-end example.

The above spec backs up the snapshots to a cloud S3 endpoint. If you intend to take snapshots that stay local to the cluster, refer to Create local group snapshots.

The GroupVolumeSnapshot object also supports specifying pre and post rules that are run on the application pods using the volumes being snapshotted. This allows users to quiesce the applications before the snapshot is taken and resume I/O after the snapshot is taken. Refer to 3D Snapshots for more detailed documentation on that.

Checking status of group cloud snapshots

A new VolumeSnapshot object is created for each PVC that matches the given pvcSelector. For example, if the label selector app: cassandra matches 3 PVCs, you will have 3 VolumeSnapshot objects.

You can track the status of the group volume snapshots using:

kubectl describe groupvolumesnapshot <group-snapshot-name>

This will show the latest status and will also list the VolumeSnapshot objects once it’s complete. Below is an example of the status section of the cassandra group snapshot.

Status:
  Stage:   Final
  Status:  Successful
  Volume Snapshots:
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/763613271174793816-922960401583326548
        Snapshot Type:     cloud
    Parent Volume ID:      763613271174793816
    Task ID:               d0b4b798-319b-4c2e-a01c-66490f4172c7
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-31d9e5df-183b-11e9-a9a4-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/1081147806034223862-518034075073409747
        Snapshot Type:     cloud
    Parent Volume ID:      1081147806034223862
    Task ID:               44da0d6d-b33f-48da-82f6-b62951dcca0e
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/237262101530372284-299546281563771622
        Snapshot Type:     cloud
    Parent Volume ID:      237262101530372284
    Task ID:               915d08e1-c2fd-45a5-940f-ee3b13f7c03f
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-31d9e5df-183b-11e9-a9a4-080027ee1df7
You can see 3 volume snapshots that are part of the group snapshot. The name of each volume snapshot is in the Volume Snapshot Name field. For more details on a volume snapshot, run:

kubectl get volumesnapshot <volume-snapshot-name> -o yaml
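
Rather than re-running describe by hand, the stage can be polled in a loop. The following is a minimal sketch, not an official tool: wait_for_group_snapshot is a hypothetical helper, and the JSON paths .status.stage and .status.status are assumptions inferred from the Stage and Status fields in the describe output above.

```shell
# Poll a GroupVolumeSnapshot until it reaches the Final stage or a timeout expires.
# Assumption: the describe fields "Stage" and "Status" map to .status.stage
# and .status.status in the object's JSON.
# Usage: wait_for_group_snapshot NAME [TIMEOUT_SECONDS]
wait_for_group_snapshot() {
  name="$1"
  timeout="${2:-300}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    stage=$(kubectl get groupvolumesnapshot "$name" -o jsonpath='{.status.stage}')
    if [ "$stage" = "Final" ]; then
      # Print the reported status and stop polling.
      kubectl get groupvolumesnapshot "$name" -o jsonpath='{.status.status}'
      echo
      return 0
    fi
    sleep 5
    elapsed=$((elapsed + 5))
  done
  echo "timed out waiting for $name" >&2
  return 1
}
```

For example, wait_for_group_snapshot cassandra-group-cloudsnapshot 600 would wait up to ten minutes and then print the reported status.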

Retries of group cloud snapshots

If a cloud groupvolumesnapshot fails to trigger, it will be retried. However, by default, if a cloud groupvolumesnapshot fails after it has been triggered successfully, it is marked as Failed and is not retried.

To change this behavior, set the maxRetries field in the spec. In the example below, up to 3 retries are performed on failure.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  maxRetries: 3
  options:
    portworx/snapshot-type: cloud

When maxRetries is set, NumRetries in the status of the groupvolumesnapshot indicates the number of retries performed.

Snapshots across namespaces

When creating a group snapshot, you can specify a list of namespaces to which the group snapshot can be restored. Below is an example of a group cloud snapshot which can be restored into prod-01 and prod-02 namespaces.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-groupsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud
  restoreNamespaces:
   - prod-01
   - prod-02

Restoring from group cloud snapshots

The previous sections describe how to list the volume snapshots that are part of a group snapshot. Once you have the names of the VolumeSnapshot objects, you can create PVCs from them.
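
If you prefer not to copy the names by hand, they can be pulled out of the describe output. This is a rough sketch that relies only on the Volume Snapshot Name lines shown in the status output earlier; group_snapshot_names is a hypothetical helper name.

```shell
# Print the VolumeSnapshot names that belong to a GroupVolumeSnapshot,
# one per line, by reading the "Volume Snapshot Name:" lines of describe.
group_snapshot_names() {
  kubectl describe groupvolumesnapshot "$1" |
    awk '/Volume Snapshot Name:/ { print $NF }'
}
```

For example, group_snapshot_names cassandra-group-cloudsnapshot would print one name for each matching PVC.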

When you install Stork, it also creates a storage class called stork-snapshot-sc. This storage class can be used to create PVCs from snapshots.

To create a PVC from a snapshot, you would add the snapshot.alpha.kubernetes.io/snapshot annotation to refer to the snapshot name.

If the snapshot exists in another namespace, the snapshot namespace should be specified with stork/snapshot-source-namespace annotation in the PVC.

Note that the storageClassName needs to be the Stork StorageClass stork-snapshot-sc as in the example below.

For example, for a snapshot named mysql-snapshot, the spec would look like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-snap-clone
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: mysql-snapshot
spec:
  accessModes:
     - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 2Gi

Once you apply the above spec, you will see a PVC created by Stork. This PVC is backed by a Portworx volume clone of the snapshot referenced in the annotation.

kubectl get pvc --all-namespaces
NAMESPACE   NAME                                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                AGE
default     mysql-data                             Bound     pvc-f782bf5c-20e7-11e8-931d-0214683e8447   2Gi        RWO            px-mysql-sc                 2d
default     mysql-snap-clone                       Bound     pvc-05d3ce48-2280-11e8-98cc-0214683e8447   2Gi        RWO            stork-snapshot-sc           2s
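
To use the restored data, a pod mounts the new PVC like any other claim. The sketch below writes an illustrative pod manifest to a file; the pod name, the busybox image, the mount path and the file name are all assumptions chosen for the example, and only the claim name mysql-snap-clone comes from the spec above.

```shell
# Write a pod manifest that mounts the restored PVC and lists its contents.
# Everything except the claim name is a placeholder for illustration.
cat > mysql-clone-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mysql-snap-clone-reader
spec:
  containers:
  - name: reader
    image: busybox
    command: ["sh", "-c", "ls -l /data && sleep 3600"]
    volumeMounts:
    - name: restored-data
      mountPath: /data
  volumes:
  - name: restored-data
    persistentVolumeClaim:
      claimName: mysql-snap-clone
EOF
```

The file can then be applied with kubectl apply -f mysql-clone-pod.yaml, and kubectl logs mysql-snap-clone-reader should show the files that were present in the snapshot.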

Examples

Group cloud snapshot for all cassandra PVCs

In the example below, we take a group snapshot of all PVCs in the default namespace that have the label app: cassandra and back it up to the cloud S3 endpoint configured in the Portworx cluster.

Step 1: Deploy cassandra statefulset and PVCs

The following spec creates a cassandra StatefulSet with 3 replicas. Each replica pod uses its own PVC.

##### Portworx storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: portworx-repl2
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "2"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  clusterIP: None
  ports:
    - port: 9042
  selector:
    app: cassandra

---

apiVersion: "apps/v1"
kind: StatefulSet
metadata:
  name: cassandra
spec:
  selector:
    matchLabels:
      app: cassandra
  serviceName: cassandra
  replicas: 3
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v12
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "PID=$(pidof java) && kill $PID && while ps -p $PID > /dev/null; do sleep 1; done"]
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: CASSANDRA_AUTO_BOOTSTRAP
            value: "false"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
      labels:
        app: cassandra
      annotations:
        volume.beta.kubernetes.io/storage-class: portworx-repl2
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi

Step 2: Wait for all cassandra pods to be running

List the cassandra pods:

kubectl get pods -l app=cassandra
NAME          READY     STATUS    RESTARTS   AGE
cassandra-0   1/1       Running   0          3m
cassandra-1   1/1       Running   0          2m
cassandra-2   1/1       Running   0          1m

Once all 3 pods are running, you can also list the cassandra PVCs:

kubectl get pvc -l app=cassandra
NAME                         STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
cassandra-data-cassandra-0   Bound     pvc-ff752ad9-1607-11e9-a9a4-080027ee1df7   2Gi        RWO            portworx-repl2      3m
cassandra-data-cassandra-1   Bound     pvc-ff767dcf-1607-11e9-a9a4-080027ee1df7   2Gi        RWO            portworx-repl2      2m
cassandra-data-cassandra-2   Bound     pvc-ff78173c-1607-11e9-a9a4-080027ee1df7   2Gi        RWO            portworx-repl2      1m

Step 3: Take the group cloud snapshot

Apply the following spec to take the cassandra group snapshot. Portworx will quiesce I/O on all volumes before triggering their snapshots.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-cloudsnapshot
spec:
  pvcSelector:
    matchLabels:
      app: cassandra
  options:
    portworx/snapshot-type: cloud

Once you apply the above object, you can check the status of the snapshots using kubectl:

kubectl describe groupvolumesnapshot cassandra-group-cloudsnapshot

While the group snapshot is in progress, the status is reported as InProgress. Once it completes, the stage should be Final and the status Successful.

Name:         cassandra-group-cloudsnapshot
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"stork.libopenstorage.org/v1alpha1","kind":"GroupVolumeSnapshot","metadata":{"annotations":{},"name":"cassandra-group-cloudsnapshot","nam...
API Version:  stork.libopenstorage.org/v1alpha1
Kind:         GroupVolumeSnapshot
Metadata:
  Cluster Name:
  Creation Timestamp:  2019-01-14T20:30:13Z
  Generation:          0
  Resource Version:    18212101
  Self Link:           /apis/stork.libopenstorage.org/v1alpha1/namespaces/default/groupvolumesnapshots/cassandra-group-cloudsnapshot
  UID:                 31d9e5df-183b-11e9-a9a4-080027ee1df7
Spec:
  Options:
    Portworx / Snapshot - Type:  cloud
  Post Snapshot Rule:
  Pre Snapshot Rule:
  Pvc Selector:
    Match Labels:
      App:  cassandra
Status:
  Stage:   Final
  Status:  Successful
  Volume Snapshots:
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/763613271174793816-922960401583326548
        Snapshot Type:     cloud
    Parent Volume ID:      763613271174793816
    Task ID:               d0b4b798-319b-4c2e-a01c-66490f4172c7
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-31d9e5df-183b-11e9-a9a4-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/1081147806034223862-518034075073409747
        Snapshot Type:     cloud
    Parent Volume ID:      1081147806034223862
    Task ID:               44da0d6d-b33f-48da-82f6-b62951dcca0e
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
    Conditions:
      Last Transition Time:  2019-01-14T20:30:49Z
      Message:               Snapshot created successfully and it is ready
      Reason:
      Status:                True
      Type:                  Ready
    Data Source:
      Portworx Volume:
        Snapshot Id:       a7843d0c-da4b-4f8c-974f-4b6f09463a98/237262101530372284-299546281563771622
        Snapshot Type:     cloud
    Parent Volume ID:      237262101530372284
    Task ID:               915d08e1-c2fd-45a5-940f-ee3b13f7c03f
    Volume Snapshot Name:  cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-31d9e5df-183b-11e9-a9a4-080027ee1df7
Events:                    <none>

The output above shows that creating cassandra-group-cloudsnapshot created 3 volumesnapshots:

  1. cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
  2. cassandra-group-cloudsnapshot-cassandra-data-cassandra-1-31d9e5df-183b-11e9-a9a4-080027ee1df7
  3. cassandra-group-cloudsnapshot-cassandra-data-cassandra-2-31d9e5df-183b-11e9-a9a4-080027ee1df7

These correspond to the PVCs cassandra-data-cassandra-0, cassandra-data-cassandra-1 and cassandra-data-cassandra-2 respectively.

You can also describe these individual volume snapshots using:

kubectl describe volumesnapshot cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
Name:         cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  volumesnapshot.external-storage.k8s.io/v1
Kind:         VolumeSnapshot
Metadata:
  Cluster Name:
  Creation Timestamp:  2019-01-14T20:30:49Z
  Owner References:
    API Version:     stork.libopenstorage.org/v1alpha1
    Kind:            GroupVolumeSnapshot
    Name:            cassandra-group-cloudsnapshot
    UID:             31d9e5df-183b-11e9-a9a4-080027ee1df7
  Resource Version:  18212097
  Self Link:         /apis/volumesnapshot.external-storage.k8s.io/v1/namespaces/default/volumesnapshots/cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
  UID:               47949666-183b-11e9-a9a4-080027ee1df7
Spec:
  Persistent Volume Claim Name:  cassandra-data-cassandra-0
  Snapshot Data Name:            cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7
Status:
  Conditions:
    Last Transition Time:  2019-01-14T20:30:49Z
    Message:               Snapshot created successfully and it is ready
    Reason:
    Status:                True
    Type:                  Ready
  Creation Timestamp:      <nil>
Events:                    <none>
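
To restore from these snapshots, each one needs its own PVC, as described in Restoring from group cloud snapshots. The helper below is a minimal sketch for generating such PVC manifests; restore_pvc_manifest and the PVC names passed to it are hypothetical, while the annotations and the stork-snapshot-sc storage class are the ones described earlier in this document.

```shell
# Print a PVC manifest that restores from a VolumeSnapshot.
# Usage: restore_pvc_manifest PVC_NAME SNAPSHOT_NAME SIZE [SOURCE_NAMESPACE]
restore_pvc_manifest() {
  pvc_name="$1"
  snapshot_name="$2"
  size="$3"
  source_ns="${4:-}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc_name}
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: ${snapshot_name}
EOF
  # Only needed when the snapshot lives in a different namespace than the PVC.
  if [ -n "$source_ns" ]; then
    echo "    stork/snapshot-source-namespace: ${source_ns}"
  fi
  cat <<EOF
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: ${size}
EOF
}
```

For example, restore_pvc_manifest cassandra-restore-0 cassandra-group-cloudsnapshot-cassandra-data-cassandra-0-31d9e5df-183b-11e9-a9a4-080027ee1df7 2Gi | kubectl apply -f - would create a PVC from the first snapshot. When restoring into a namespace listed in restoreNamespaces, pass the namespace that holds the snapshot as the fourth argument.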

Deleting group snapshots

To delete group snapshots, you need to delete the GroupVolumeSnapshot that was used to create the group snapshots. Stork will delete all other volumesnapshots that were created for this group snapshot.

kubectl delete groupvolumesnapshot cassandra-group-cloudsnapshot
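
To confirm that the individual snapshots are gone as well, you can look for any VolumeSnapshot whose name still starts with the group snapshot name. This sketch assumes the naming pattern seen in the example output, where each volume snapshot name begins with the group snapshot name followed by a hyphen; remaining_group_snapshots is a hypothetical helper, and it only looks in the current namespace.

```shell
# List VolumeSnapshots whose names start with the given group snapshot name.
# Prints nothing when no such snapshots remain.
remaining_group_snapshots() {
  kubectl get volumesnapshot -o name | grep -F -- "/$1-" || true
}
```

Running remaining_group_snapshots cassandra-group-cloudsnapshot shortly after the delete should eventually print nothing.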


Last edited: Thursday, May 7, 2020