At AWS re:Invent 2014, Werner Vogels announced that AWS will soon offer new, larger EBS volumes of up to 16 TB. Since EBS was introduced, the maximum size of an EBS volume has been 1 TB (1,024 GB), so volumes will be up to 16 times larger with this improvement. The performance of these volumes will also improve: AWS will provide up to 10,000 IOPS for a General Purpose (SSD) volume (up from 1 TB and 3,000 baseline IOPS) and up to 20,000 IOPS for a Provisioned IOPS (PIOPS) SSD volume (up from 1 TB and 4,000 PIOPS). Note that this option will not apply to magnetic volumes.
In this post, I would like to discuss the implications of this change from our point of view at N2WS.
Data Consistency Issues in Multiple EBS Volume Striping
The 1 TB volume size limit is a burden because many applications and companies, including CPM customers, need volumes larger than 1 TB. To get more capacity, multiple EBS volumes are created and then combined into one logical volume. This can be done with a spanned volume on dynamic disks in Windows, and with LVM and other tools in Linux. Users can also build volumes larger than 1 TB by combining EBS volumes with RAID:
- Users can increase redundancy and resiliency, and so achieve better HA, with RAID 1. In the case of EBS, however, RAID 1 does not completely eliminate the risk of disk failure, because two separate EBS volumes that are mirrored together may be created on a single piece of hardware. If that underlying AWS hardware fails, all of the volumes on it fail.
- Users can increase disk performance by striping EBS volumes together using RAID 0 (AWS does not recommend RAID 5 because its parity write operations consume IOPS). This helps users achieve better I/O throughput, but if the need is for higher IOPS rather than disk size, better alternatives exist, such as PIOPS or General Purpose volumes with burstable IOPS. Still, striping can deliver higher performance than the PIOPS of any single volume. RAID 0 can produce a volume larger than 1 TB, but it does not solve the backup consistency issue explained below.
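To make the striping idea concrete, here is a minimal sketch of how a RAID 0 layout maps a logical block onto its member volumes. The function name `stripe_location` and its parameters are hypothetical and chosen for illustration only; real RAID implementations (mdadm, LVM, Windows dynamic disks) add metadata and other details that are not modeled here.

```python
from typing import Tuple


def stripe_location(logical_block: int, num_volumes: int, blocks_per_stripe: int) -> Tuple[int, int]:
    """Map a logical block to (volume index, block offset on that volume).

    Illustrative RAID 0 layout: consecutive stripes are placed on
    consecutive volumes in round-robin order.
    """
    if logical_block < 0 or num_volumes < 1 or blocks_per_stripe < 1:
        raise ValueError("arguments must be non-negative / positive")
    # Which stripe the block falls in, and its position inside that stripe.
    stripe_index, within_stripe = divmod(logical_block, blocks_per_stripe)
    # Stripes rotate across the volumes; each full rotation is one "row".
    volume = stripe_index % num_volumes
    row = stripe_index // num_volumes
    return volume, row * blocks_per_stripe + within_stripe


# A write covering logical blocks 0..7 on a two-volume stripe set
# touches both volumes, which is where the parallelism comes from.
touched = {stripe_location(block, 2, 4)[0] for block in range(8)}
```

Because a single large write is spread over every member volume, the volumes can serve it in parallel, but the same property means one file can live on several EBS volumes at once.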
A very large logical volume built from multiple EBS volumes requires more management. Backing up such a logical volume consistently is challenging when a file is being written across both volumes at the same time. The snapshots of the two EBS volumes won’t begin at exactly the same moment, and a write that lands in this short gap between the two snapshots can leave the backup corrupted.
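The time gap can be illustrated with a small simulation. This is a toy model, not real EBS behavior: the `StripedPair` class and `read_file` helper are hypothetical, and each "volume" is just an in-memory dictionary holding half of every file.

```python
import copy


class StripedPair:
    """Two simulated volumes; every file is split in half across them."""

    def __init__(self):
        self.volumes = [{}, {}]

    def write_file(self, name, first_half, second_half):
        self.volumes[0][name] = first_half
        self.volumes[1][name] = second_half

    def snapshot(self, index):
        # Point-in-time copy of a single volume.
        return copy.deepcopy(self.volumes[index])


def read_file(snap_a, snap_b, name):
    """Reassemble a file from one snapshot of each volume."""
    return snap_a.get(name, "") + snap_b.get(name, "")


pair = StripedPair()
pair.write_file("record", "AAAA", "aaaa")  # version 1 of the file
snap_a = pair.snapshot(0)                  # snapshot of volume 0 starts
pair.write_file("record", "BBBB", "bbbb")  # version 2 lands in the gap
snap_b = pair.snapshot(1)                  # snapshot of volume 1 starts later
restored = read_file(snap_a, snap_b, "record")
```

Here `restored` is `"AAAAbbbb"`: the first half comes from version 1 and the second half from version 2, a file that never existed on the live volume.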
N2WS Backup & Recovery (CPM) Consistent Backups
This issue can be resolved with CPM’s consistent backups. In Windows, the Volume Shadow Copy Service (VSS) is used to back up logical volumes consistently, including dynamic disks and the NTFS file system. In Linux, it’s possible to achieve consistency by having CPM run a script that creates LVM snapshots. Thus, CPM can perform consistent backups of these complicated multi-volume setups, but users will need to do more work.
The ability to create a single EBS volume of up to 16 TB, which is a considerable size, makes life easier for those who want to use large disks and back them up with snapshots. Each backup is then a snapshot of one EBS volume, so the inconsistency caused by snapshotting multiple volumes is avoided.
However, you may still want to create a logical volume out of multiple EBS volumes, especially when you need very high performance. Striping (e.g. RAID 0) across several volumes exploits concurrency, since I/O commands run in parallel on the member volumes, and so can deliver more IOPS and throughput than AWS offers on a single volume. If ultra-high performance is needed, striped disks remain an option.
As described above, an EC2 instance with more than one volume can run into problems during backup and restore. Other factors can also cause snapshot inconsistency, such as applications that keep buffers in memory or hold transactions and files open, and this becomes more complex when a volume is composed of two or more EBS volumes. CPM addresses the need for consistency and offers a very handy solution. In Linux, it provides an option to run one script just before the snapshot is taken and another right after the snapshot starts. The first script initiates a lock or freeze operation so that the data on disk is consistent when the snapshot starts, and the second script releases all locks.
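The before/after pattern can be sketched as a freeze that is always paired with a thaw. This is not CPM's actual script interface: the function names and the `start_snapshots` callback are hypothetical, and the sketch assumes a Linux host where the `fsfreeze` utility is available and the caller has the privileges to run it.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen_filesystem(mount_point, run=subprocess.run):
    """Freeze a file system for the duration of the with-block, then thaw it."""
    # "Before" step: block new writes and flush pending data to disk.
    run(["fsfreeze", "--freeze", mount_point], check=True)
    try:
        yield
    finally:
        # "After" step: always thaw, even if starting the snapshots failed.
        run(["fsfreeze", "--unfreeze", mount_point], check=True)


def snapshot_consistently(mount_point, start_snapshots, run=subprocess.run):
    """Call start_snapshots() (an assumed callback) while the file system is frozen."""
    with frozen_filesystem(mount_point, run=run):
        return start_snapshots()
```

The `run` parameter exists only so the command runner can be replaced in a dry run or test. Putting the thaw in a `finally` clause matters because a file system left frozen blocks every writer on the instance.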
The Database Example
Data consistency is especially important for databases because some of their data is held in buffers or caches. For example, a user can request a flush and lock from the MySQL database (with FLUSH TABLES WITH READ LOCK), which temporarily blocks write operations on the database. It also flushes pending data and caches to disk, including any open files. The file system can also be frozen, but that alone may be less desirable if you have an application with a database, because the file system isn’t aware of data or caches that are held at the application or OS level.
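As a rough sketch of the MySQL lock-and-flush step, the helper below wraps the snapshot in a global read lock. It assumes `connection` is a DB-API 2.0 style connection from whichever MySQL driver you use; the helper name `mysql_read_lock` is hypothetical. The lock belongs to the session, so the same connection has to stay open until the snapshot has started.

```python
from contextlib import contextmanager


@contextmanager
def mysql_read_lock(connection):
    """Hold a global read lock on MySQL for the duration of the with-block."""
    cursor = connection.cursor()
    # Close open tables and block writes from all other sessions.
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        yield
    finally:
        # Release the lock whether or not the snapshot started successfully.
        cursor.execute("UNLOCK TABLES")
        cursor.close()
```

A caller would start the EBS snapshots inside the `with mysql_read_lock(connection):` block and let the lock drop as soon as the block exits, which keeps the write pause short.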
Where LVM is used, it is possible to ask LVM to create a snapshot, which is stored on the same disks. Right after the LVM snapshot is created, we can start the EBS snapshots and then delete the LVM snapshot. Because the EBS snapshots contain the LVM snapshot, we can revert to it when recovering.
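The LVM sequence can be sketched as three commands, one per phase. The function `lvm_snapshot_plan`, the snapshot name `ebs_backup`, and the 1G copy-on-write size are all assumptions made for illustration; the function only builds the command lines and does not run them.

```python
def lvm_snapshot_plan(volume_group, logical_volume, snapshot_name="ebs_backup", cow_size="1G"):
    """Return the LVM commands for each phase of an LVM-assisted EBS backup."""
    origin = f"/dev/{volume_group}/{logical_volume}"
    snapshot = f"{volume_group}/{snapshot_name}"
    return {
        # 1. Create the LVM snapshot, then start the EBS snapshots.
        "before_ebs_snapshot": [
            "lvcreate", "--snapshot", "--name", snapshot_name,
            "--size", cow_size, origin,
        ],
        # 2. Once the EBS snapshots have started, drop the LVM snapshot.
        "after_ebs_snapshot": ["lvremove", "--force", snapshot],
        # 3. On volumes restored from the EBS snapshots, roll the origin
        #    back to the consistent LVM snapshot they contain.
        "on_recovery": ["lvconvert", "--merge", snapshot],
    }


plan = lvm_snapshot_plan("vg0", "data")
```

The copy-on-write size must be large enough to absorb the writes that happen while the LVM snapshot exists, so the right value depends on the workload.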
Windows VSS is a very powerful infrastructure that informs applications, such as SQL Server and Exchange, that they’re about to be backed up, and takes care of the file system and all tiers of the I/O stack to make sure that the backup is completely consistent.
With more IOPS available on the new, larger EBS volumes, we can assume that the majority of EC2 users who need larger volumes will prefer to use a single EBS volume.