Consider a financial application performing a transfer of funds from one account to another. Logically, this operation consists of two steps: debit one account and credit another. If a momentary snapshot copy of the database is taken after the debit operation but before the credit operation, that snapshot reflects the debit but not the credit. The transfer transaction is incomplete, and the dataset in the snapshot is deemed ‘inconsistent’. This case illustrates the need for backup and application consistency. When dealing with financial services, however, one might also need to consider other state, such as in-memory queues and files that are still being written. In such cases, you might consider a complete and consistent replication of the entire stack.
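The funds transfer scenario can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the accounts live in a plain dictionary, and the names `transfer`, `snapshot_hook`, and `is_consistent` are hypothetical helpers invented for this example, not part of any backup product.

```python
def transfer(accounts, src, dst, amount, snapshot_hook=None):
    """Move funds in two steps; the hypothetical snapshot_hook fires between them."""
    accounts[src] -= amount              # step 1: debit
    if snapshot_hook is not None:
        snapshot_hook(accounts)          # a backup taken here sees only the debit
    accounts[dst] += amount              # step 2: credit


def is_consistent(accounts, expected_total):
    """A transfer must never create or destroy money."""
    return sum(accounts.values()) == expected_total


live = {"alice": 100, "bob": 50}
expected_total = sum(live.values())

snapshots = []
transfer(live, "alice", "bob", 30,
         snapshot_hook=lambda state: snapshots.append(dict(state)))

mid_transfer = snapshots[0]
print(mid_transfer, is_consistent(mid_transfer, expected_total))
print(live, is_consistent(live, expected_total))
```

The copy taken in the middle of the transfer is missing 30 units, while the live dataset is correct again once the credit completes.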
One thing to keep in mind is that ‘consistency’ refers to the state of a stationary backup copy. The live system’s datasets may be in transient, inconsistent states at any given time (as the funds transfer example illustrates). This is an inevitable and legitimate situation. However, a backup copy (the stationary dataset kept for possible eventual restoration) is where consistency matters. When we say that we want a consistent backup, we mean that using the backup dataset to restore a system results in a system that is healthy, usable, and in a correct state. To better understand the concept of backup consistency, we’ll list four notions of consistency and explain how each one manifests.
Crash-consistent backup
Crash-consistent backup means that all of the machine’s data was captured at exactly the same time. In some situations, particularly after a system crash, disks reach a point where the media becomes unusable. In such situations, the disk may be repaired and then restored from the backup, and the restored resource and data are in the identical state as they were at the time of the backup. One example of crash-consistent backup is AWS EBS snapshots, which are guaranteed to be crash-consistent. After a crash, restoring the saved EBS snapshot brings the system back to a healthy state in terms of the integrity of the EBS volume(s) within the snapshot(s).
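As a rough sketch, requesting snapshots of all EBS volumes attached to an instance might look like the following. It assumes the boto3 library and its EC2 `create_snapshots` call; the function name `snapshot_instance_volumes`, the instance ID, and the description text are placeholders chosen for illustration. The client is passed in as an argument so the function can also be exercised with a stub.

```python
def snapshot_instance_volumes(ec2_client, instance_id, description):
    """Request point-in-time snapshots of every EBS volume attached to an instance.

    ec2_client is expected to behave like boto3.client("ec2").
    """
    response = ec2_client.create_snapshots(
        InstanceSpecification={
            "InstanceId": instance_id,
            "ExcludeBootVolume": False,   # include the root volume as well
        },
        Description=description,
    )
    return [snap["SnapshotId"] for snap in response["Snapshots"]]


# Example usage (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(snapshot_instance_volumes(ec2, "i-0123456789abcdef0", "nightly backup"))
```

Note that this only requests the snapshots. Nothing here flushes application buffers or waits for open transactions, which is why the result is crash-consistent rather than application consistent.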
Logical volume level consistency
When a backup copy of a volume is made, it is necessary to ensure the consistency of all volume metadata. A volume-consistent backup must be performed in such a way that metadata transactions are momentarily frozen and all pending operations are completed (flushing “dirty memory” to disk, for example). For example, Linux LVM is a logical volume management system embedded in the OS kernel. One advantage of LVM is that it provides the means for taking a complete and consistent snapshot of entire logical volumes. Backing up an LVM volume is done by taking a snapshot and copying it. The snapshot itself is immediate; that is, it is a copy that is “frozen in time”. This makes it possible to copy the snapshot safely for backup, a process that takes time and might introduce inconsistencies if done directly on a volume that is being accessed by applications.
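The snapshot-then-copy sequence for LVM can be sketched as below. This is a minimal illustration, not a hardened backup script: it assumes the standard `lvcreate`, `dd`, and `lvremove` command-line tools, root privileges, and enough free space in the volume group for the snapshot. The volume group, volume, and destination names in the usage comment are placeholders.

```python
import subprocess


def lvm_backup_commands(vg, lv, snap_name, snap_size, dest_image):
    """Build the three commands: create snapshot, copy it, remove it."""
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap_name}"
    return [
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap_name, origin],
        ["dd", f"if={snapshot}", f"of={dest_image}", "bs=4M"],
        ["lvremove", "--force", snapshot],
    ]


def run_lvm_backup(commands, run=subprocess.run):
    """Run the commands in order; always remove the snapshot once it exists."""
    create, copy, remove = commands
    run(create, check=True)
    try:
        run(copy, check=True)        # the slow part reads the frozen snapshot
    finally:
        run(remove, check=True)      # snapshots consume space while they exist


# Example usage (placeholders, needs root):
#   run_lvm_backup(lvm_backup_commands("vg0", "data", "data_snap", "1G",
#                                      "/backup/data.img"))
```

The slow copy reads from the snapshot device rather than from the live volume, so applications can keep writing to the origin while the backup runs.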
File system consistency
Beyond the physical-level consistency discussed above, the file system introduces an additional level of information used to maintain files and directories. A file backup copy is considered file system consistent if and only if it can be restored to a state where all file operations (CRUD) can be performed, all file system metadata is intact, and file system operations are fully available. Note that a file system consistent backup does not guarantee that the data within the files is complete, or that it is consistent with regard to the requirements of other systems. Only the health of the file system is of concern. For example, the XFS file system, which is available in most Linux kernel implementations, provides a utility named xfsdump that can create a fully consistent copy of an entire file system. xfsdump can operate while the file system is being accessed, without interrupting normal I/O operations.
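A full (level 0) xfsdump run could be driven from Python roughly as follows. This is a sketch under assumptions: it relies on the `xfsdump` options `-l` (dump level), `-L` (session label), `-M` (media label), and `-f` (destination file), and the helper names, label, and paths are placeholders rather than anything prescribed by XFS.

```python
import subprocess


def xfsdump_command(mount_point, dump_file, label="full-backup"):
    """Build a level 0 (full) xfsdump command for a mounted XFS file system."""
    return [
        "xfsdump",
        "-l", "0",          # level 0 = full dump
        "-L", label,        # session label (avoids an interactive prompt)
        "-M", label,        # media label (avoids an interactive prompt)
        "-f", dump_file,    # destination file
        mount_point,
    ]


def run_xfsdump(mount_point, dump_file, run=subprocess.run):
    """Execute the dump; raises if xfsdump exits with an error."""
    run(xfsdump_command(mount_point, dump_file), check=True)


# Example usage (placeholders, needs root):
#   run_xfsdump("/data", "/backup/data.xfsdump")
```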
Application consistency
Application logic establishes a notion of consistency that is higher level than that of the file system. While at the file level a single write or read operation is the scope of data integrity, the application may require a bundle of I/O operations to be completed for the state of the data to be consistent. We illustrated this notion of consistency with the funds transfer example in the introduction. A backup copy that is “application consistent” is a copy that reflects a state of the dataset where all application transactions are completed (no open transactions). For a backup copy of a dataset to be considered ‘application consistent’, it must be guaranteed that the application is brought to quiescence prior to the backup copy operation. All pending I/O operations must be completed (and flushed to persistent storage), and all transactions must be committed or aborted. For example, the Oracle DBMS supports a ‘backup mode’. Backup copies that are taken while the DBMS is in backup mode can be restored safely and, together with the redo logs generated during the backup, recovered to a consistent state.
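The Oracle backup mode example can be sketched as a small context manager. It assumes a DB-API style cursor connected with sufficient privileges to a database running in ARCHIVELOG mode, and it uses Oracle's `ALTER DATABASE BEGIN BACKUP` and `ALTER DATABASE END BACKUP` statements. The names `oracle_backup_mode` and `backup_datafiles`, and the `copy_datafiles` callable, are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def oracle_backup_mode(cursor):
    """Keep the database in backup mode for the duration of the with-block."""
    cursor.execute("ALTER DATABASE BEGIN BACKUP")
    try:
        yield
    finally:
        # Always leave backup mode, even if the copy step fails.
        cursor.execute("ALTER DATABASE END BACKUP")


def backup_datafiles(cursor, copy_datafiles):
    """Run the caller-supplied copy step while the database is in backup mode."""
    with oracle_backup_mode(cursor):
        copy_datafiles()
```

Wrapping the copy step in `try`/`finally` matters here: a database left in backup mode after a failed copy keeps generating extra redo until someone ends the backup manually.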
We have described four different but complementary concepts of backup consistency. We saw that ensuring the consistency of backup datasets depends on the point of view one takes. The physical disk may crash and become corrupt, and a crash-consistent backup copy can be used to restore its contents. Protecting multi-disk logical volumes and file systems requires additional care. Applications have their own view of consistency, and backing up application data is more complicated still. It is the notion of multi-layer backup consistency that facilitates a fully protected system. Backup systems that guarantee all layers of consistency are deemed safe, whereas partial consistency is as good as no consistency.
Final note on backup consistency
In this article, we reviewed four concepts of backup consistency. An advanced enterprise backup solution should take all four shades of backup consistency into account. Moreover, the solution should incorporate the notion of application backup consistency to fully protect your IT operations. Do your backup system and practices support sufficient backup consistency? Are you confident that, when you need to restore all or parts of your IT datasets, the restored data will be usable and viable? We recommend that you look at N2WS Backup & Recovery, a complete data protection management system that can guarantee cloud backup consistency and provides IT managers with peace of mind (try it free for 30 days; after that, it automatically converts to our Free Edition, which protects up to 5 EC2 instances).