Amazon offers various storage options to support static content, databases, archives, and many other use cases. Its pay-as-you-go model offers flexible options for storage, including what might be the most important building block: the EBS volume. EBS volumes are durable, block-level storage volumes that can be easily attached to an EC2 instance. EBS is best suited as primary storage for workloads that require persistent data, such as databases and file systems. In this article, we discuss the journey of the AWS EBS volume, which started with one standard, HDD-backed storage option and has evolved into five different EBS volume types. We’ll consider its management and provide tips on how to enhance performance and availability.
SSD and HDD

AWS EBS launched in the summer of 2008 as standard, HDD-based storage that could range from 1 GB to 1 TB in size and offered about 100 IOPS. Over time, AWS introduced new volume types to meet the growing demand for storage. In August 2012, it added Provisioned IOPS (PIOPS) volumes, which could support 1,000 IOPS. In June 2014, it introduced SSD-based general purpose storage, renamed the standard storage to magnetic, and increased the IOPS limit for provisioned storage to 4,000.

SSD generally performs better but costs more, whereas HDD is cheaper but delivers fewer IOPS. To address this trade-off, in April 2016 EBS introduced two new HDD-backed storage types as alternatives to, or replacements for, magnetic storage: Throughput Optimized HDD and Cold HDD. Both provide higher IOPS and throughput than magnetic storage even though they are HDD offerings, which makes them very helpful for applications that require high-performing storage at low cost.

Each class of volume has its own advantages and disadvantages. SSD-backed volumes are better optimized for transactional workloads that need high IOPS. HDD-backed volumes, on the other hand, suit large-scale data streaming where performance is measured as throughput in MiB/s. The following table provides a rough comparison of the various Amazon EBS volume types:
| Volume Type | Provisioned IOPS (io1) | General Purpose (gp2) | Throughput Optimized (st1) | Cold HDD (sc1) | Magnetic Volume |
|---|---|---|---|---|---|
| Backing | SSD | SSD | HDD | HDD | HDD |
| Description | High-performing SSD volume suitable for mission-critical applications such as RDBMS and NoSQL databases. | A balance between performance and price. Larger volumes provide higher IOPS. Suitable for use cases such as system boot volumes, virtual desktops, and Dev & Test. | Low-cost HDD, useful for frequently accessed, throughput-intensive workloads such as streaming, Big Data, and Data Warehousing. It cannot be a boot volume. | Lowest-cost HDD, suitable for less frequently accessed workloads. Most helpful when data is rarely accessed. It cannot be a boot volume. | The original standard volume, now considered previous generation. Suitable for infrequent data access. It can be used as a boot volume. |
| Size Supported | 4 GiB – 16 TiB | 1 GiB – 16 TiB | 500 GiB – 16 TiB | 500 GiB – 16 TiB | 1 GiB – 1 TiB |
| Max IOPS Supported | 20,000 | 10,000 | 500 | 250 | 40–200 |
| Max Throughput per Volume | 320 MiB/s | 160 MiB/s | 500 MiB/s | 250 MiB/s | 40–90 MiB/s |
| Price | $0.125/GB-month + $0.065/provisioned IOPS-month | $0.100/GB-month | $0.045/GB-month | $0.025/GB-month | $0.05/GB-month + $0.05/million I/O requests |
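To make the price row concrete, here is a minimal Python sketch that estimates a monthly bill using only the prices listed in the table above. The function name `monthly_cost` and its parameters are illustrative, not part of any AWS tool, and the prices are the snapshot quoted in this article rather than current AWS pricing.

```python
# Prices copied from the comparison table above (USD). They are the figures
# quoted in this article, not current AWS pricing.
PRICE_PER_GB_MONTH = {
    "io1": 0.125,
    "gp2": 0.100,
    "st1": 0.045,
    "sc1": 0.025,
    "magnetic": 0.05,
}
IO1_PRICE_PER_PROVISIONED_IOPS = 0.065  # per provisioned IOPS per month
MAGNETIC_PRICE_PER_MILLION_IO = 0.05    # per million I/O requests


def monthly_cost(volume_type, size_gb, provisioned_iops=0, million_io_requests=0.0):
    """Estimate the monthly cost of one volume (hypothetical helper)."""
    if volume_type not in PRICE_PER_GB_MONTH:
        raise ValueError(f"unknown volume type: {volume_type}")

    # Every type is billed per GB of provisioned storage.
    cost = PRICE_PER_GB_MONTH[volume_type] * size_gb

    # io1 adds a charge per provisioned IOPS; magnetic adds a charge per I/O.
    if volume_type == "io1":
        cost += IO1_PRICE_PER_PROVISIONED_IOPS * provisioned_iops
    elif volume_type == "magnetic":
        cost += MAGNETIC_PRICE_PER_MILLION_IO * million_io_requests

    return round(cost, 2)


if __name__ == "__main__":
    for vtype in ("gp2", "st1", "sc1"):
        print(vtype, monthly_cost(vtype, 500))
    print("io1", monthly_cost("io1", 500, provisioned_iops=5000))
```

At these prices, a 500 GB io1 volume with 5,000 provisioned IOPS is dominated by the IOPS charge rather than the storage charge, which is worth checking before choosing io1 over a larger gp2 volume.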
Availability and Security

AWS EBS lets you back up a volume with a point-in-time, persistent snapshot that is stored in Amazon S3. EBS snapshots are block-level, incremental backups that help you recover a whole volume in case of a disaster. However, you should also consider a solution that supports file-level restoration from the EBS volume for routine maintenance. An EBS volume is specific to an availability zone (AZ), but its snapshot can be used to recreate the volume in a different AZ to support cross-zone failover. A snapshot can also be copied to a separate region as well as to separate AWS accounts.

In addition, it is important to define, plan, and automate your backup policy so that it is in line with your RPO and RTO requirements. For backup and disaster recovery (DR), do not rely on a cumbersome collection of scripts that meet ad-hoc requirements; instead, use a robust and flexible cloud solution that facilitates automated AWS backup and recovery.

EBS has several encryption options. The out-of-the-box AWS encryption eliminates the need to maintain and secure your own key management infrastructure. It encrypts data at rest as well as in transit, for example when moving data between volumes or when copying snapshots. You can always migrate from an unencrypted volume to an encrypted one.
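The snapshot and cross-region copy workflow described in this section can be sketched in Python. The function below is a hypothetical helper, not an official AWS recipe. It assumes you pass in two boto3 EC2 clients (one per region) and relies on the boto3 calls `create_snapshot`, `get_waiter("snapshot_completed")`, and `copy_snapshot`. Region names and the volume ID in the usage comment are placeholders.

```python
from typing import Any, Dict


def snapshot_and_copy(src_ec2: Any, dst_ec2: Any, volume_id: str, src_region: str) -> Dict[str, str]:
    """Snapshot a volume, wait for completion, then copy it to another region.

    src_ec2 and dst_ec2 are assumed to be boto3 EC2 clients for the source
    and destination regions respectively.
    """
    # 1. Take a point-in-time snapshot of the volume in the source region.
    snapshot = src_ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Backup of {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]

    # 2. Block until the snapshot reaches the "completed" state.
    src_ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    # 3. Copy the snapshot into the destination region, encrypting the copy.
    #    With no KmsKeyId given, the account's default EBS key is used.
    copied = dst_ec2.copy_snapshot(
        SourceRegion=src_region,
        SourceSnapshotId=snapshot_id,
        Description=f"Cross-region copy of {snapshot_id}",
        Encrypted=True,
    )
    return {"source_snapshot": snapshot_id, "copied_snapshot": copied["SnapshotId"]}


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   src = boto3.client("ec2", region_name="us-east-1")
#   dst = boto3.client("ec2", region_name="us-west-2")
#   print(snapshot_and_copy(src, dst, "vol-0123456789abcdef0", "us-east-1"))
```

Passing the clients in as arguments keeps the helper easy to test with stand-in objects and leaves credential and region handling to the caller. A production backup policy would still need scheduling, retention, and error handling on top of this.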
Performance

Several factors affect the performance of an EBS volume, such as instance size, I/O configuration, and your workload. Consider the following points to get the best performance from your EBS volumes.
- Use EBS-optimized instances. EBS is network-attached storage (though within the same availability zone), so on a standard instance the same network channel carries both regular network traffic and EBS traffic. An EBS-optimized instance keeps the two separate and gives EBS dedicated throughput to your EC2 instance, which results in much improved performance. Some instance types (C3, R3, M3) support enabling the EBS-optimized option, and some instance types have it enabled by default (M4, C4, D2). Use these instance types for better performance.
- Choose the volume configuration carefully, as it also determines EBS performance. PIOPS volumes deliver the IOPS you provision, whereas a general purpose volume offers 3 IOPS per GB of storage, with a minimum of 100 IOPS. Thus, the larger the general purpose volume, the more IOPS it offers without any additional charge for IOPS. For example, a 300 GB general purpose volume gives 900 IOPS.
- When choosing between HDD and SSD volumes, decide according to your storage requirements and costs. If the I/O operations are sequential and predictable, you might want to consider HDD. On the other hand, for random, high-IOPS activity, you should look at SSD to get your performance boost.
- Manage the queue length. Volume queue length is the number of pending I/O requests for an EBS device. You should plan the queue length based on latency and I/O size. If the volume cannot keep up with the I/O your workload issues, the queue length grows and EBS performance suffers. AWS recommends a queue length of 1 for every 500 IOPS on SSD-backed storage. For HDD-backed storage, a queue length of 4 is recommended when performing 1 MiB sequential I/Os.
- Always monitor your I/O characteristics with CloudWatch and optimize your volume for future use.
- Pre-warm your EBS volume. When you create an EBS volume from a snapshot, accessing each block of data for the first time may incur added latency. If your application demands full performance from the first read, you should read every block before putting the volume into production. This is done by initializing the EBS volume. Note that you will have to wait until the volume is completely restored, which is not ideal in an outage scenario where you are looking for a quick recovery.
- When you take a snapshot of an HDD-backed volume, its performance may drop while the snapshot is in progress. It is recommended that you schedule snapshots for off-peak hours if possible.
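The general purpose sizing rule and the queue length guidance from the list above reduce to simple arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, the 10,000 IOPS cap is taken from the comparison table earlier in this article, and rounding the SSD queue length up to the next whole number is an assumption.

```python
import math

GP2_IOPS_PER_GB = 3     # general purpose volumes offer 3 IOPS per GB
GP2_MIN_IOPS = 100      # floor applied to small volumes
GP2_MAX_IOPS = 10_000   # cap taken from the comparison table in this article

SSD_IOPS_PER_QUEUE_SLOT = 500  # queue length of 1 for every 500 IOPS
HDD_QUEUE_LENGTH = 4           # for 1 MiB sequential I/O on HDD-backed volumes


def gp2_baseline_iops(size_gb):
    """Baseline IOPS of a general purpose volume of the given size."""
    return max(GP2_MIN_IOPS, min(GP2_IOPS_PER_GB * size_gb, GP2_MAX_IOPS))


def recommended_queue_length(target_iops=0, hdd_backed=False):
    """Queue length suggested by the guidance quoted in this article."""
    if hdd_backed:
        return HDD_QUEUE_LENGTH
    # Round up (an assumption) so a partially filled 500-IOPS step still
    # gets a queue slot, and never go below 1.
    return max(1, math.ceil(target_iops / SSD_IOPS_PER_QUEUE_SLOT))


if __name__ == "__main__":
    iops = gp2_baseline_iops(300)
    print(iops, recommended_queue_length(iops))
```

For the 300 GB example in the list, this gives 900 IOPS and a suggested queue length of 2. Treat the result as a starting point and adjust it against the latency you observe in CloudWatch.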