Part 2: Backup Solutions and Restore Policies
Posted on June 19, 2019 by Steve Pelletier

From High Availability to Archive: Enhancing Disaster Recovery, Backup and Archive with the Cloud

I am going to start this post with the most important thing I can say about backup solutions. Backup solutions are useless if you can’t restore from them. Backup solutions need to be monitored to make sure there are no serious errors that need to be fixed, and test restores need to be conducted routinely. If an organization’s IT department cannot commit to monitoring and testing their backup solution, they should keep their resumes up-to-date and stored someplace other than on the servers that their backup solution is attempting to protect.
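As a rough illustration of what a routine test restore check might look like, the sketch below compares a restored directory against a reference copy by file checksum. The function names and directory layout are hypothetical and not tied to any backup product. It also assumes the reference copy has not changed since the backup was taken, for example a fixed set of sample files kept for this purpose.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(reference_dir: Path, restored_dir: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy.

    An empty list means every reference file was restored intact.
    """
    problems = []
    for ref in sorted(reference_dir.rglob("*")):
        if not ref.is_file():
            continue
        rel = ref.relative_to(reference_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(ref) != file_digest(restored):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

A scheduled job could run a check like this after each test restore and raise an alert whenever the returned list is not empty.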

Now that that is out of the way, it’s time to ask the question “Why do we need backups?” The simple answer is so that we can restore the data to certain points in time. The next question is “Why would we need to restore data?” There is a long list of reasons to restore data, but here are a few of the more common:

  • Hardware failure that causes data loss
  • Accidentally deleted data
  • Need to recover a previous version of a file
  • Disaster recovery

[Figure: 3-2-1 diagram for data backup and recovery]

Based on these needs, a backup schedule and backup retention policy should be developed. Current best practice for backups is known as the 3-2-1 rule. This rule states:

  • There should be three copies of all data
    • Production copy
    • Local backup copy
    • Remote backup copy
  • At least two types of media should be used
    • Disk
    • Tape / VTL
  • One copy should be remote
    • To protect against catastrophic failures at the production site. This can play a role in disaster recovery.
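The three conditions of the rule are simple enough to check mechanically. The sketch below is one possible way to do that. The `DataCopy` type and its fields are assumptions made for illustration, not part of any backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str      # e.g. "production", "local backup", "remote backup"
    media: str     # e.g. "disk", "tape"
    offsite: bool  # True if stored away from the production site


def check_3_2_1(copies: list[DataCopy]) -> list[str]:
    """Return the 3-2-1 requirements that are NOT met (empty list = compliant)."""
    failures = []
    if len(copies) < 3:
        failures.append("fewer than three copies")
    if len({c.media for c in copies}) < 2:
        failures.append("fewer than two media types")
    if not any(c.offsite for c in copies):
        failures.append("no remote copy")
    return failures


copies = [
    DataCopy("production", "disk", offsite=False),
    DataCopy("local backup", "disk", offsite=False),
    DataCopy("remote backup", "tape", offsite=True),
]
print(check_3_2_1(copies))  # an empty list means the rule is satisfied
```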

Grandfather-Father-Son Backups

[Figure: Grandfather-Father-Son diagram for data backups]

A typical backup and retention schedule is known as the Grandfather-Father-Son rotation. In this rotation, a full backup of a server is run on either the first or last day of each month. This full backup is the Grandfather backup, and it is typically retained for one full year. Another full backup is run on either the first or last day of each week. This is the Father backup, and it is typically retained for one full month. On every day that does not have a full backup scheduled, an incremental backup is run. This is the Son backup, and it is retained for one full week. An incremental backup is a backup of everything that has changed since the last backup, so if incremental backups are run daily, each one is essentially a backup of everything that changed in the last day.

With this backup and retention schedule, daily copies of any file can easily be recovered for the last week. This is useful if a file was deleted, or if a previous version of a file from earlier in the week is needed. A weekly version of any file can be recovered from a Father backup for up to a month, and a monthly version can be recovered from a Grandfather backup for up to a year.

In the event of a complete server failure that requires a full data restore, the most recent full backup and the Son backups taken since then are used to recover the server. In a worst-case scenario, where the server fails on the day that a full backup (Grandfather or Father) is scheduled but before it runs, the last full backup and the last six incremental (Son) backups are needed to restore the server to the most recently backed up state.
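To make the rotation concrete, here is a minimal sketch of the schedule and of the restore chain it implies. It assumes the monthly full runs on the 1st, the weekly full runs on Sunday, the failure happens before that day's backup has run, and retention is approximated as 365, 31, and 7 days. These specifics are illustrative choices, not requirements of the rotation.

```python
from datetime import date, timedelta

# Assumed retention periods, in days, for each backup type.
RETENTION_DAYS = {"grandfather": 365, "father": 31, "son": 7}


def backup_type(day: date) -> str:
    """Classify the backup that runs on a given day."""
    if day.day == 1:
        return "grandfather"  # monthly full
    if day.weekday() == 6:    # Monday is 0, Sunday is 6
        return "father"       # weekly full
    return "son"              # daily incremental


def expires_on(day: date) -> date:
    """Date on which the backup taken on `day` may be discarded."""
    return day + timedelta(days=RETENTION_DAYS[backup_type(day)])


def restore_chain(failure_day: date) -> list[date]:
    """Backups needed for a full restore, oldest first.

    Walks back from the day before the failure until it reaches a full backup.
    """
    day = failure_day - timedelta(days=1)
    chain = [day]
    while backup_type(day) == "son":
        day -= timedelta(days=1)
        chain.append(day)
    return list(reversed(chain))


# Failure on Sunday, June 9, 2019, before that day's Father backup runs:
chain = restore_chain(date(2019, 6, 9))
print([f"{d.isoformat()} {backup_type(d)}" for d in chain])
```

For that date the chain is the Father backup from June 2 followed by the six Son backups from June 3 through June 8, which is the worst case described above.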

Incremental Forever Backups

The Grandfather-Father-Son method works well for many companies; however, there are some issues with this method of backing up data. What if you have a large number of servers, or large amounts of data to back up? Running full backups of every server on the same day could make the backups run longer than is desirable. Most organizations have a defined “backup window.” A backup window is a time frame in which the backup needs to complete so it doesn’t interfere with normal work hours. The Grandfather-Father-Son schedule can be modified to spread the full backups of different servers across different days, but a newer backup method known as Incremental Forever has also been developed to address this.
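One way to spread full backups across days is sketched below: assign the largest servers first, each to whichever day currently has the least data scheduled. This greedy approach is only one possible heuristic, and the server names and sizes are made up for the example.

```python
def stagger_full_backups(sizes_tb: dict[str, float], days: list[str]) -> dict[str, list[str]]:
    """Assign each server a full-backup day, balancing total data per day.

    Greedy heuristic: place the largest server first, always on the
    day with the smallest load so far.
    """
    load = {day: 0.0 for day in days}
    schedule: dict[str, list[str]] = {day: [] for day in days}
    for server, size in sorted(sizes_tb.items(), key=lambda kv: kv[1], reverse=True):
        day = min(days, key=lambda d: load[d])
        schedule[day].append(server)
        load[day] += size
    return schedule


# Hypothetical servers and data sizes in TB.
sizes = {"db1": 8.0, "files1": 5.0, "web1": 2.0, "web2": 1.0}
print(stagger_full_backups(sizes, ["Sat", "Sun"]))
```

With these numbers, each of the two nights ends up with 8 TB of full backups instead of a single 16 TB night.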

With Incremental Forever backups, an initial full backup is performed and incremental backups are used for all future backups. The obvious issue is that to do a complete restore of a server, you would need the initial full backup and ALL of the incremental backups taken since. The Incremental Forever method addresses this by creating periodic “Synthetic Full Backups.” A Synthetic Full Backup is created by combining the data from the last full (or synthetic full) backup with all of the incremental backups taken since then, to build the equivalent of a full backup. This allows you to simulate the Grandfather-Father-Son method while only doing incremental backups each night, keeping the backup window manageable. The Synthetic Full Backups can be created outside of the backup window by the backup server, without impacting production on the servers being backed up.

Depending on the configuration of the Incremental Forever policy, you can also have the additional advantage of being able to restore a file from any of the historical incremental backups. This means you could restore a file as it existed on any given date in the data retention period. This differs from the traditional Grandfather-Father-Son rotation, which offers day-by-day file versions only for the last week; older versions are available only at weekly or monthly granularity.
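The idea behind a synthetic full can be shown with a toy model in which a backup is just a mapping from file path to file content. Real backup products work on blocks or catalog entries and differ in the details, so the `Incremental` type below is purely an illustrative assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Incremental:
    changed: dict[str, str]                          # path -> new content
    deleted: set[str] = field(default_factory=set)   # paths removed since the last backup


def synthetic_full(last_full: dict[str, str], incrementals: list[Incremental]) -> dict[str, str]:
    """Build the equivalent of a full backup without touching the source server.

    Starts from the last full (or synthetic full) and replays each
    incremental in order, oldest first.
    """
    result = dict(last_full)  # copy so the original full is left intact
    for inc in incrementals:
        for path in inc.deleted:
            result.pop(path, None)
        result.update(inc.changed)
    return result


full = {"a.txt": "v1", "b.txt": "v1"}
monday = Incremental(changed={"a.txt": "v2", "c.txt": "v1"})
tuesday = Incremental(changed={"c.txt": "v2"}, deleted={"b.txt"})
print(synthetic_full(full, [monday, tuesday]))
```

Because the merge only reads existing backup data, it can run on the backup server at any time, which is why it does not consume the backup window.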

OK, I’ve touched on a couple of popular schedules for backing up data, but you may have noticed that I didn’t mention what the data is getting backed up to. I never said tape backup. While tape is still a very viable medium to back up to, it has some limitations: a limited life span, the cost of tape management and of tape libraries capable of holding multiple tapes, wasted capacity, and the cost of taking tapes off site and returning them when they are needed. In the Grandfather-Father-Son schedule, the Grandfather and Father tapes need to be taken off site for safekeeping, and returned once their retention period has expired so the tapes can be reused.

Tapes also have a fixed capacity. Let’s use LTO-6 as an example. An LTO-6 tape holds 2.5 TB of uncompressed data. Unless your nightly backup is an exact multiple of 2.5 TB, you will have at least one tape that is not completely filled every time you run backups. To address some of these weaknesses, several options are available, including:

  • Disk-to-Disk backups
    • Disk-to-Disk-to-Tape backups
    • Disk-to-Disk-to-Cloud backups
  • Virtual tape libraries (VTLs)
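Returning to the fixed-capacity point about tape, the wasted space is easy to estimate. The sketch below assumes each backup run starts on fresh tapes and that the data does not compress; the 6 TB backup size is a made-up figure.

```python
import math

LTO6_NATIVE_TB = 2.5  # uncompressed capacity of one LTO-6 tape


def tape_usage(backup_tb: float, tape_tb: float = LTO6_NATIVE_TB) -> tuple[int, float]:
    """Return (tapes needed, unused capacity in TB) for one backup run."""
    tapes = math.ceil(backup_tb / tape_tb)
    unused = tapes * tape_tb - backup_tb
    return tapes, unused


tapes, unused = tape_usage(6.0)
print(f"{tapes} tapes, {unused} TB unused")  # 3 tapes, 1.5 TB unused
```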

[Figure: Data backup chart]

There are many other techniques that can help with backups, such as disk snapshots and VM snapshots, but these are beyond the scope of this article and may be addressed in a future post.

Next up, Part 3: Backup and Restore Media.

This is part 2 of 10 in the From High Availability to Archive: Enhancing Disaster Recovery, Backup and Archive with the Cloud series. To read them all right now, download our free eBook.