So we decided to create a new project, Gemini, to automate the backup and restoration of PersistentVolumes. Gemini consists of a new CRD - the SnapshotGroup - as well as an operator that creates, deletes, and restores VolumeSnapshots based on SnapshotGroup specifications. Here’s how it works.
We start with a SnapshotGroup definition, which looks something like this:
- every: 10 minutes
Here we tell Gemini to find the existing postgres-data PVC and to schedule a backup every 10 minutes - overkill, maybe, but better safe than sorry. In addition to the latest backup, we’ve told Gemini to hold onto the three most recent backups, so we always have at least 30 minutes’ worth of coverage.
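For reference, here is a minimal sketch of how such a definition could be generated from the shell. The field names (`persistentVolumeClaim.claimName`, `schedule`, `every`, `keep`) and the `v1beta1` API version follow the Gemini README as we understand it and may differ between releases, so treat them as assumptions. The helper `snapshotgroup_manifest` is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a SnapshotGroup manifest for a single PVC.
# usage: snapshotgroup_manifest NAME PVC_NAME EVERY KEEP
# Assumption: apiVersion and field names match the installed Gemini CRD.
snapshotgroup_manifest() {
  cat <<EOF
apiVersion: gemini.fairwinds.com/v1beta1
kind: SnapshotGroup
metadata:
  name: $1
spec:
  persistentVolumeClaim:
    claimName: $2
  schedule:
    - every: $3
      keep: $4
EOF
}

# Back up the postgres-data PVC every 10 minutes, keeping three snapshots.
snapshotgroup_manifest postgres-backups postgres-data "10 minutes" 3
```

Piping the output to `kubectl apply -f -` would create the SnapshotGroup, provided the Gemini CRD is already installed in the cluster.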
But we can go further! We can also tell Gemini to keep hourly, daily, weekly, monthly, and yearly snapshots:
- every: 10 minutes
- every: hour
- every: day
- every: week
- every: month
- every: year
Gemini will still only run a single backup every 10 minutes, but it will preserve additional backups to fulfill the longer-term backup schedule.
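To picture how one stream of 10-minute snapshots can satisfy several retention rules at once, here is a simplified sketch. It is not Gemini's actual algorithm: it assumes snapshots are identified by Unix timestamps, treats every interval as a fixed number of seconds, and keeps the newest snapshot in each of the most recent time buckets. The function name `keep_for_interval` is hypothetical.

```shell
#!/bin/sh
# Simplified retention sketch (an assumption, not Gemini's real logic).
# usage: keep_for_interval INTERVAL_SECONDS KEEP TIMESTAMP...
# Prints the newest timestamp from each of the KEEP most recent buckets,
# where a bucket is a window of INTERVAL_SECONDS.
keep_for_interval() {
  interval=$1
  keep=$2
  shift 2
  printf '%s\n' "$@" | sort -rn | awk -v i="$interval" -v k="$keep" '
    { b = int($1 / i) }                         # bucket index for this snapshot
    !(b in seen) && n < k { seen[b] = 1; n++; print $1 }'
}

# Thirteen snapshots taken every 600 seconds, from t=0 to t=7200.
snaps="0 600 1200 1800 2400 3000 3600 4200 4800 5400 6000 6600 7200"

# Union of a "10 minutes, keep 3" rule and an "hour, keep 2" rule.
{
  keep_for_interval 600 3 $snaps
  keep_for_interval 3600 2 $snaps
} | sort -un
```

Snapshots that no rule selects would be candidates for deletion. Because the longer-term rules pick from the same 10-minute stream, no extra backups need to run.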
$ kubectl get volumesnapshot
NAME                          AGE
postgres-backups-1585945609   15m
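The numeric suffix on the snapshot name looks like a Unix epoch timestamp in seconds; that reading is an assumption, though it fits the snapshot's age. If it holds, a small helper can turn the suffix into a readable UTC time before we commit to a restore point. The sketch below assumes GNU `date`; on BSD or macOS, `date -u -r "$epoch"` is the rough equivalent. The function name `snapshot_time` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: decode the trailing epoch suffix of a snapshot name.
# Assumes GNU date (the -d "@SECONDS" form).
snapshot_time() {
  epoch=${1##*-}   # strip everything up to and including the last "-"
  date -u -d "@$epoch" +%Y-%m-%dT%H:%M:%SZ
}

snapshot_time postgres-backups-1585945609   # prints 2020-04-03T20:26:49Z
```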
Take note of the timestamp, 1585945609 - that’s our target restore point: 15 minutes ago. Next, we’d scale down the application:
kubectl scale all --all --replicas=0
We’ll want to move quickly now, as our application is offline. To swap out the PVC, we simply annotate our SnapshotGroup with the desired restore point:
kubectl annotate snapshotgroup/postgres-backups --overwrite \
  "gemini.fairwinds.com/restore=1585945609"
Once Gemini sees this annotation, it will trigger a one-off backup, delete the old PVC, and replace it with a new one (with the same name) using data from the specified snapshot. The restoration should only take 30 seconds or so. In the meantime, we can scale back up, and our pods will come online once the new PVC is ready.
kubectl scale all --all --replicas=1
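Since the application is offline for the duration, it can help to have the three steps in one place. The sketch below strings together the same commands shown above; the wrapper function `restore_snapshot` and the `KUBECTL` variable are hypothetical conveniences. By default it only prints the commands (a dry run), and like the walkthrough it scales every resource in the current namespace.

```shell
#!/bin/sh
# Dry run by default: commands are printed rather than executed.
# Set KUBECTL=kubectl to run them against the current cluster context.
KUBECTL=${KUBECTL:-echo kubectl}

# usage: restore_snapshot SNAPSHOTGROUP TIMESTAMP [REPLICAS]
restore_snapshot() {
  group=$1
  timestamp=$2
  replicas=${3:-1}

  # Refuse anything that is not a plain numeric timestamp.
  case $timestamp in
    ''|*[!0-9]*) echo "timestamp must be numeric" >&2; return 2 ;;
  esac

  # 1. Take the application offline so nothing writes to the old PVC.
  $KUBECTL scale all --all --replicas=0 || return 1

  # 2. Ask Gemini to replace the PVC using the chosen snapshot.
  $KUBECTL annotate "snapshotgroup/$group" --overwrite \
    "gemini.fairwinds.com/restore=$timestamp" || return 1

  # 3. Scale back up; pods start once the new PVC is ready.
  $KUBECTL scale all --all --replicas="$replicas"
}

restore_snapshot postgres-backups 1585945609
```

Reviewing the dry-run output first is a cheap way to confirm the SnapshotGroup name and timestamp before running it for real with `KUBECTL=kubectl`.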
It’s unfortunate that restoring data involves a bit of downtime, and that there doesn’t seem to be any reasonable way around this. If you have ideas on how to improve this process, let us know by opening an issue!
It’s also worth noting that you can spin up a second PVC from one of your backups, and attach it to a separate instance of your application. I’ve used this mechanism to recover old photos without having to revert my entire photo app to a particular point in time.
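As a rough illustration of that second-PVC approach, Kubernetes lets a PersistentVolumeClaim name a VolumeSnapshot as its `dataSource`. The sketch below prints such a claim. The PVC name, access mode, and size are hypothetical placeholders; it also assumes the cluster's default storage class supports CSI snapshots, and the requested size generally needs to be at least as large as the snapshot's source volume.

```shell
#!/bin/sh
# Hypothetical helper: print a PVC that is populated from a VolumeSnapshot.
# usage: pvc_from_snapshot NEW_PVC_NAME SNAPSHOT_NAME SIZE
pvc_from_snapshot() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
spec:
  dataSource:
    name: $2
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: $3
EOF
}

# Placeholder name and size; adjust to match the original volume.
pvc_from_snapshot postgres-data-restored postgres-backups-1585945609 10Gi
```

Applying the output with `kubectl apply -f -` and mounting the new claim in a separate instance of the application leaves the live PVC untouched.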