Cartika Blog

Backing Up Your Cloud – Are Snapshots Really Backups?

Backing up and recovering data hosted in “The Cloud” is becoming an increasingly hot topic in the market.  What is the proper way to do this?  Most platforms (OpenStack, VMware, OnApp, CloudStack, etc.) and many Public/Private Cloud Providers list “snapshot” technology as the recommended mechanism to back up and restore your cloud data. But are Snapshots really backups?

A snapshot is a copy of a virtual machine's disk at a given point in time.  It can be used to quickly and easily revert that virtual machine to that point in time, or to load a clone of that virtual machine onto a new instance.  Snapshots are absolutely a useful and critical tool, and they certainly represent some sort of backup. But should they really be counted on as your backup?  The answer is likely not, and certainly not in the most common use cases.

Advantages to Snapshots

  • Quickly create a copy of a Virtual Machine
  • Snapshots can be used to quickly restore a Virtual Machine back to the point in time at which they were taken
  • Snapshots can be used to clone and launch new, identical Virtual Machines (usually by converting the snapshot image into a custom template)
  • Snapshots work at the hypervisor level and do not require any agents to be installed or run within guest Virtual Machines to operate

Disadvantages to Snapshots

  • Snapshot data, even though it should be, is not always written to storage devices separate from where production data resides, making it difficult to use for recovery in disaster scenarios
  • Taking Snapshots of all Virtual Machines on a reasonably busy hypervisor or hypervisor pool often causes performance issues on the hypervisors, impacting production Virtual Machines
  • As a result, taking regular snapshots of all Virtual Machines is impractical in a typical public/private cloud scenario, especially where frequent restore points are required
  • Retrieving individual files, databases or database tables from within a snapshot for granular restoration is often difficult, time-consuming and/or impractical
  • Recovering snapshots to alternate hypervisor platforms is typically very difficult, if it is possible at all

So, what's the answer?

Utilize Bacula4 + Snapshots, strategically and symbiotically, based on their respective strengths and benefits

To be fair, any proper backup and recovery solution is adequate if it lets users schedule backups whenever they want, with whatever frequency they want, and restore individual files, email boxes, databases and database tables in a granular manner from any available restore point. If the solution also offers Bare Metal Restore (BMR), so that the Server/Virtual Machine can be restored to a full system state from any available restore point, it's even more advantageous. However, since we built and developed Bacula4 specifically to address this requirement, both for our own hosting/infrastructure business at Cartika and for that of other service providers, I will refer specifically to Bacula4 here.

For a full and comprehensive Backup and DR strategy, we at Cartika and Bacula4 recommend the following:

  1. Utilize on-demand Snapshots strategically to get a quickly recoverable image of your Virtual Machine any time core Operating System and/or core Application Stack changes are made (just before and just after any such changes, for example). Save these Snapshots on our storage devices (which are separate from the devices that house our production cloud server data) until future core changes are made to the Operating System and/or Application Stack
  2. Utilize Bacula4 to take file/database level backups of your Virtual Machine with whatever frequency your business requirements mandate: anything from a daily backup to a near-continuous data protection model with backup/recovery points generated multiple times an hour or day
  3. For additional Disaster Recovery scenarios (i.e. complete loss of a facility), utilize our optional Bacula4 offsite backup job replication feature to also back up all of your Virtual Machines' data and BMR image to one of our alternate facilities
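To make the trade-off behind these three recommendations concrete, here is a minimal Python sketch that models a protection plan and works out how many restore points a given backup frequency yields per day. The `ProtectionPlan` class and its field names are hypothetical and exist only for illustration; this is not Bacula4 configuration or any provider's API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ProtectionPlan:
    """Hypothetical model of one Virtual Machine's protection plan."""

    snapshot_on_core_change: bool    # recommendation 1: snapshots around OS/stack changes
    file_backup_interval: timedelta  # recommendation 2: file/database backup frequency
    offsite_replication: bool        # recommendation 3: copy backups to another facility

    def worst_case_data_loss(self) -> timedelta:
        # Data written right after a backup stays unprotected until the next
        # backup runs, so the interval bounds the data that can be lost.
        return self.file_backup_interval

    def restore_points_per_day(self) -> int:
        # Dividing two timedeltas gives a float; truncate to whole backups.
        return int(timedelta(days=1) / self.file_backup_interval)

    def survives_facility_loss(self) -> bool:
        # Only an offsite copy helps when the whole facility is gone.
        return self.offsite_replication


daily = ProtectionPlan(True, timedelta(days=1), False)
near_continuous = ProtectionPlan(True, timedelta(minutes=15), True)

print(daily.restore_points_per_day())            # 1
print(near_continuous.restore_points_per_day())  # 96
print(near_continuous.worst_case_data_loss())    # 0:15:00
```

The point of the sketch is that the file/database backup interval, not the snapshot schedule, determines how much recent data is at risk, which is why snapshots are reserved for core changes and frequent restore points are left to the backup tool.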

When to use what for Recovery

  1. Core Operating System or Application Stack upgrades that go terribly wrong can be recovered from almost instantly utilizing a strategic, on-demand snapshot
  2. Day-to-day issues involving human errors, intrusions, injections, accidents or a simple change of opinion about work done by web designers, developers, DBAs and/or sysadmins can be recovered and resolved in seconds utilizing Bacula4
  3. In scenarios of complete disaster, a Virtual Machine can be recovered to the last stable Operating System and Application Stack version utilizing the most recent snapshot available, and the latest file/database data set can then be restored to the most recent recovery point utilizing Bacula4. Even in the worst-case disaster scenario, companies can restore entire Virtual Machines, back to a data set as recent as minutes old, with a few clicks and within a few minutes
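The three cases above amount to a small decision table. The Python sketch below writes it out as a lookup from incident type to ordered recovery steps. The `Incident` names and the `recovery_steps` function are hypothetical labels chosen for this illustration, and the steps are plain descriptions rather than calls into any real backup tool.

```python
from enum import Enum
from typing import List


class Incident(Enum):
    """Hypothetical incident categories matching the three cases above."""

    FAILED_CORE_UPGRADE = 1  # OS or Application Stack change went wrong
    DAY_TO_DAY_ERROR = 2     # human error, intrusion, injection, unwanted change
    COMPLETE_DISASTER = 3    # the Virtual Machine or facility is lost


def recovery_steps(incident: Incident) -> List[str]:
    """Return the ordered recovery steps suggested for an incident type."""
    if incident is Incident.FAILED_CORE_UPGRADE:
        return ["revert to the on-demand snapshot taken before the change"]
    if incident is Incident.DAY_TO_DAY_ERROR:
        return ["restore the affected files or databases from a file-level backup"]
    if incident is Incident.COMPLETE_DISASTER:
        # Order matters: rebuild the system image first, then bring the
        # data forward to the most recent recovery point.
        return [
            "restore the Virtual Machine from the most recent snapshot",
            "restore the latest file/database data from the most recent backup",
        ]
    raise ValueError(f"unknown incident: {incident!r}")


for step in recovery_steps(Incident.COMPLETE_DISASTER):
    print(step)
```

Only the complete disaster case needs both tools, and in a fixed order: the snapshot supplies the last stable system state, and the file/database backup supplies the recent data.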

The above is admittedly very Cartika/Bacula4 centric.  However, users on other public/private IaaS platforms face similar requirements when it comes to backing up and restoring their cloud data and cloud services.  The above protocol and recommendations (even if alternate solutions are used) represent best and safest practices for all users of Cloud Computing platforms.  The approach is cost-effective, simple, and easy to set up and configure using tools that most Cloud providers already have in place and/or that Cloud Computing users can easily implement. Most importantly, it will give you peace of mind knowing that your data is backed up and safe, and that if one day everything goes wrong, recovering from even the worst disaster is a relatively simple and efficient task.