As of version 0.9.0, Ark supports backing up and restoring Kubernetes volumes using restic, a free, open-source backup tool.
Ark has always allowed you to take snapshots of persistent volumes as part of your backups if you’re using one of the supported cloud providers’ block storage offerings (Amazon EBS Volumes, Azure Managed Disks, Google Persistent Disks). Starting with version 0.6.0, we provide a plugin model that enables anyone to implement additional object and block storage backends, outside the main Ark repository.
We integrated restic with Ark so that users have an out-of-the-box solution for backing up and restoring almost any type of Kubernetes volume*. This is a new capability for Ark, not a replacement for existing functionality. If you’re running on AWS, and taking EBS snapshots as part of your regular Ark backups, there’s no need to switch to using restic. However, if you’ve been waiting for a snapshot plugin for your storage platform, or if you’re using EFS, AzureFile, NFS, emptyDir, local, or any other volume type that doesn’t have a native snapshot concept, restic might be for you.
Restic is not tied to a specific storage platform, which means that this integration also paves the way for future work to enable cross-volume-type data migrations. Stay tuned as this evolves!
* hostPath volumes are not supported, but the new local volume type is supported.
Ensure you’ve downloaded and extracted the latest release.
In the Ark directory (i.e. where you extracted the release tarball), run the following to create new custom resource definitions:
kubectl apply -f config/common/00-prereqs.yaml
Run one of the following for your platform to create the daemonset:
kubectl apply -f config/aws/20-restic-daemonset.yaml
kubectl apply -f config/azure/20-restic-daemonset.yaml
kubectl apply -f config/gcp/20-restic-daemonset.yaml
kubectl apply -f config/minio/30-restic-daemonset.yaml
You’re now ready to use Ark with restic.
Run the following for each pod that contains a volume to back up:
kubectl -n YOUR_POD_NAMESPACE annotate pod/YOUR_POD_NAME backup.ark.heptio.com/backup-volumes=YOUR_VOLUME_NAME_1,YOUR_VOLUME_NAME_2,...
where the volume names are the names of the volumes in the pod spec.
For example, for the following pod:
apiVersion: v1
kind: Pod
metadata:
  name: sample
  namespace: foo
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-webserver
    volumeMounts:
    - name: pvc-volume
      mountPath: /volume-1
    - name: emptydir-volume
      mountPath: /volume-2
  volumes:
  - name: pvc-volume
    persistentVolumeClaim:
      claimName: test-volume-claim
  - name: emptydir-volume
    emptyDir: {}
You’d run:
kubectl -n foo annotate pod/sample backup.ark.heptio.com/backup-volumes=pvc-volume,emptydir-volume
This annotation can also be provided in a pod template spec if you use a controller to manage your pods.
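As a rough sketch of the pod-template case, the helper below builds a JSON merge patch that sets the annotation under spec.template.metadata.annotations. The function name restic_annotation_patch and the Deployment named sample are assumptions made for illustration; they are not part of Ark.

```shell
# Hypothetical helper (not part of Ark): print a JSON merge patch that sets
# the restic annotation on a controller's pod template.
# $1: comma-separated volume names, as they appear in the pod spec.
restic_annotation_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"backup.ark.heptio.com/backup-volumes":"%s"}}}}}\n' "$1"
}

# Example use, assuming a Deployment named "sample" in namespace "foo":
#   kubectl -n foo patch deployment sample --type merge \
#     -p "$(restic_annotation_patch pvc-volume,emptydir-volume)"
```

Changing the pod template of a Deployment triggers a rollout, so the annotation lands on the replacement pods rather than on the ones already running.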
Take an Ark backup:
ark backup create NAME OPTIONS...
When the backup completes, view information about the backups:
ark backup describe YOUR_BACKUP_NAME
kubectl -n heptio-ark get podvolumebackups -l ark.heptio.com/backup-name=YOUR_BACKUP_NAME -o yaml
Restore from your Ark backup:
ark restore create --from-backup BACKUP_NAME OPTIONS...
When the restore completes, view information about your pod volume restores:
ark restore describe YOUR_RESTORE_NAME
kubectl -n heptio-ark get podvolumerestores -l ark.heptio.com/restore-name=YOUR_RESTORE_NAME -o yaml
hostPath volumes are not supported. Local persistent volumes are supported.
If you run into problems, run the following checks:
Are your Ark server and daemonset pods running?
kubectl get pods -n heptio-ark
Does your restic repository exist, and is it ready?
ark restic repo get
ark restic repo get REPO_NAME -o yaml
Are there any errors in your Ark backup/restore?
ark backup describe BACKUP_NAME
ark backup logs BACKUP_NAME
ark restore describe RESTORE_NAME
ark restore logs RESTORE_NAME
What is the status of your pod volume backups/restores?
kubectl -n heptio-ark get podvolumebackups -l ark.heptio.com/backup-name=BACKUP_NAME -o yaml
kubectl -n heptio-ark get podvolumerestores -l ark.heptio.com/restore-name=RESTORE_NAME -o yaml
Is there any useful information in the Ark server or daemon pod logs?
kubectl -n heptio-ark logs deploy/ark
kubectl -n heptio-ark logs DAEMON_POD_NAME
NOTE: You can increase the verbosity of the pod logs by adding --log-level=debug as an argument to the container command in the deployment/daemonset pod template spec.
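The backup-related checks above can be bundled into one small script. This is only a convenience sketch: the names run, check_backup, and DRY_RUN are hypothetical, and the commands themselves are the ones listed above.

```shell
# Print each command before running it. With DRY_RUN=1, only print.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Run the backup-side troubleshooting checks for one Ark backup.
# $1: name of the Ark backup to inspect.
check_backup() {
  backup_name=$1
  run kubectl get pods -n heptio-ark
  run ark restic repo get
  run ark backup describe "$backup_name"
  run ark backup logs "$backup_name"
  run kubectl -n heptio-ark get podvolumebackups -l "ark.heptio.com/backup-name=$backup_name" -o yaml
  run kubectl -n heptio-ark logs deploy/ark
}
```

Call it as check_backup YOUR_BACKUP_NAME, or prefix the call with DRY_RUN=1 to see the commands without touching the cluster.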
We introduced three custom resource definitions and associated controllers:
ResticRepository - represents/manages the lifecycle of Ark’s restic repositories. Ark creates a restic repository per namespace when the first restic backup for a namespace is requested. The controller for this custom resource executes restic repository lifecycle commands: restic init, restic check, and restic prune.
You can see information about your Ark restic repositories by running ark restic repo get.
PodVolumeBackup - represents a restic backup of a volume in a pod. The main Ark backup process creates one or more of these when it finds an annotated pod. Each node in the cluster runs a controller for this resource (in a daemonset) that handles the PodVolumeBackups for pods on that node. The controller executes restic backup commands to back up pod volume data.
PodVolumeRestore - represents a restic restore of a pod volume. The main Ark restore process creates one or more of these when it encounters a pod that has associated restic backups. Each node in the cluster runs a controller for this resource (in the same daemonset as above) that handles the PodVolumeRestores for pods on that node. The controller executes restic restore commands to restore pod volume data.
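To get a feel for the restic commands these controllers run, the sketch below walks through them against a throwaway local repository. It assumes restic is installed and RESTIC_PASSWORD is set. Ark's own repositories live in object storage, and the exact arguments Ark passes are not shown here. The function name restic_lifecycle_demo is hypothetical.

```shell
# Hypothetical walkthrough (not Ark's code) of the restic commands named above.
# Requires restic on PATH and RESTIC_PASSWORD in the environment.
# $1: path for a local restic repository
# $2: directory standing in for a pod volume
# $3: directory to restore into
restic_lifecycle_demo() {
  repo=$1
  data_dir=$2
  restore_dir=$3

  restic -r "$repo" init &&                                    # create the repository
  restic -r "$repo" check &&                                   # verify repository integrity
  restic -r "$repo" backup "$data_dir" &&                      # snapshot the volume data
  restic -r "$repo" restore latest --target "$restore_dir" &&  # restore the newest snapshot
  restic -r "$repo" prune                                      # drop data no snapshot references
}
```

For example: RESTIC_PASSWORD=example restic_lifecycle_demo /tmp/demo-repo /tmp/demo-data /tmp/demo-restore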
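A restic backup proceeds as follows:
1. The main Ark backup process checks each pod that it’s backing up for the annotation specifying that a restic backup should be taken (backup.ark.heptio.com/backup-volumes).
2. When the annotation is found, Ark first ensures that a restic repository exists for the pod’s namespace, by:
   - checking whether a ResticRepository custom resource already exists
   - if not, creating a new one and waiting for the ResticRepository controller to init/check it
3. Ark then creates a PodVolumeBackup custom resource per volume listed in the pod annotation.
4. The main Ark process waits for the PodVolumeBackup resources to complete or fail.
5. Meanwhile, each PodVolumeBackup is handled by the controller on the appropriate node, which:
   - has a hostPath volume mount of /var/lib/kubelet/pods to access the pod volume data
   - runs restic backup
   - updates the status of the custom resource to Completed or Failed
6. As each PodVolumeBackup finishes, the main Ark process captures its restic snapshot ID and adds it as an annotation to the copy of the pod JSON that’s stored in the Ark backup. This will be used for restores, as seen in the next section.

A restic restore proceeds as follows:
1. The main Ark restore process checks each pod that it’s restoring for annotations specifying that a restic backup exists for a volume in the pod (snapshot.ark.heptio.com/<volume-name>).
2. When such an annotation is found, Ark first ensures that a restic repository exists for the pod’s namespace, by:
   - checking whether a ResticRepository custom resource already exists
   - if not, creating a new one and waiting for the ResticRepository controller to init/check it (note that in this case, the actual repository should already exist in object storage, so the Ark controller will simply check it for integrity)
3. Ark creates a PodVolumeRestore custom resource for each volume to be restored in the pod.
4. The main Ark process waits for each PodVolumeRestore resource to complete or fail.
5. Meanwhile, each PodVolumeRestore is handled by the controller on the appropriate node, which:
   - has a hostPath volume mount of /var/lib/kubelet/pods to access the pod volume data
   - runs restic restore
   - on success, writes a file into the pod volume, in an .ark subdirectory, whose name is the UID of the Ark restore that this pod volume restore is for
   - updates the status of the custom resource to Completed or Failed
6. An init container that Ark adds to the restored pod waits until it finds a file in each restored volume, under .ark, whose name is the UID of the Ark restore being run.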
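The marker-file handshake above, where a file named after the Ark restore’s UID appears under .ark in each restored volume, can be sketched as a small polling loop. This is an illustration of the idea only, not Ark’s actual implementation. The function names and the one-second polling interval are assumptions.

```shell
# Hypothetical sketch (not Ark's code): succeed only if every volume directory
# contains the marker file .ark/<restore-uid>.
# $1: UID of the Ark restore; remaining args: volume mount directories.
all_volumes_restored() {
  restore_uid=$1
  shift
  for volume_dir in "$@"; do
    [ -e "$volume_dir/.ark/$restore_uid" ] || return 1
  done
  return 0
}

# Block until every volume carries its marker file, polling once per second.
wait_for_restores() {
  until all_volumes_restored "$@"; do
    sleep 1
  done
}
```

For example, wait_for_restores RESTORE_UID /volume-1 /volume-2 returns once both volumes contain .ark/RESTORE_UID.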