AWS offers a wide variety of building blocks for cloud data storage. However, each storage option has its own advantages and disadvantages when it comes to cost, performance and backup, so it may not be readily apparent which solution is best for an enterprise and its customers. The good news is that with the comprehensive storage options AWS now offers, operating a highly available and reliable IT environment has become more scalable and cost-efficient than ever before.
In this article, we’ll cover the five AWS storage options you should be using today, and offer some use cases and backup-related best practices for each. Our focus will be on the basic infrastructure building blocks AWS offers, rather than on data store engines such as DynamoDB.
1. S3 – Simple Storage Service
S3 (Simple Storage Service) is the most well-known AWS storage option. It was one of the first cloud services Amazon offered, launching back in 2006. S3 is a highly scalable storage option that uses objects rather than blocks or files. Objects are arranged into collections called buckets, which are commonly used to hold static web content and media, although they can store any kind of data. Today, S3 stores tens of trillions of objects, and a single object can be up to 5TB in size. S3 is designed for 99.999999999% durability, and given its long uptime history, you can be highly confident in its availability.
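To put the durability figure in perspective, here is a minimal sketch of the arithmetic. It assumes the percentage is an annual, per-object figure and that losses are independent; the function name is ours, not part of any AWS API.

```python
from decimal import Decimal


def expected_annual_losses(object_count: int,
                           durability_percent: str = "99.999999999") -> Decimal:
    """Expected number of objects lost per year at the given annual durability.

    Decimal is used instead of float so the tiny loss probability
    (1e-11 for eleven nines) is computed exactly.
    """
    loss_probability = Decimal(1) - Decimal(durability_percent) / Decimal(100)
    return Decimal(object_count) * loss_probability


if __name__ == "__main__":
    # Ten million stored objects at eleven nines of durability.
    print(expected_annual_losses(10_000_000))
```

Under these assumptions, ten million objects yield an expected loss of 0.0001 objects per year, or roughly one object every 10,000 years.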
2. EBS Volume
Amazon’s Elastic Block Store (EBS) provides block-level storage that can be attached to Amazon EC2 instances. The EBS service is both flexible and cost-effective, and allows users to back up their data for long-term durability by taking frequent snapshots of their EBS volumes. Although there is a free tier, pricing beyond it is generally set on a per-use basis. We invite you to check previous articles for all you need to know about EBS volumes, including the different types, costs, and snapshot mechanism.
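As a rough illustration of taking a snapshot programmatically, the sketch below wraps the EC2 `create_snapshot` call. It assumes you pass in a boto3 EC2 client (for example, `boto3.client("ec2")`) with credentials already configured; the helper name `snapshot_volume` is hypothetical.

```python
def snapshot_volume(ec2_client, volume_id: str, description: str) -> str:
    """Request a snapshot of one EBS volume and return the new snapshot's ID.

    ec2_client is expected to behave like a boto3 EC2 client; it is passed
    in rather than created here so the function can be exercised with a
    stub and without network access.
    """
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]
```

The call returns as soon as the snapshot is requested, not when it completes, so a real backup script would also need to poll or wait for the snapshot to finish.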
3. Glacier
Glacier offers a long-term, cost-efficient archiving service. It provides flexible, highly secure storage for data backup and archiving, with no limit on the amount of data you can store in the service. With Glacier, AWS users can reliably store their data for as little as $0.007 per gigabyte per month. Availability in the usual sense does not apply to Glacier, since access is asynchronous and data retrieval can take up to four hours.
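For a quick feel of what that rate means for an archive, here is a small sketch of the storage-only arithmetic. It uses the $0.007 figure quoted above as an assumption and ignores retrieval and request fees; check current pricing before relying on it.

```python
from decimal import Decimal

# Rate quoted in this article; an assumption, not a current price list.
GLACIER_USD_PER_GB_MONTH = Decimal("0.007")


def monthly_storage_cost(gigabytes: int,
                         usd_per_gb_month: Decimal = GLACIER_USD_PER_GB_MONTH) -> Decimal:
    """Storage-only monthly cost in USD; retrieval and request fees are excluded."""
    return Decimal(gigabytes) * usd_per_gb_month


if __name__ == "__main__":
    # A 10 TB archive, counted here as 10,000 GB.
    print(monthly_storage_cost(10_000))
```

At the quoted rate, a 10,000 GB archive costs about $70 per month to store.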
4. EC2 Instance Store
The Amazon EC2 Instance Store provides temporary block storage volumes for Amazon EC2 virtual machines. The Instance Store should only be used to store temporary data that you can afford to lose, as all data is automatically deleted when an EC2 instance fails, stops, or is terminated. These local instance store volumes can be useful, however, for temporary storage of information that is continually changing, such as caches, buffers, and other temporary content.
5. Amazon ElastiCache
ElastiCache is an in-memory caching service. It helps software vendors improve their web application performance by enabling them to retrieve information from managed in-memory caches, instead of relying on slower disk-based databases. ElastiCache supports both Memcached and Redis, two of the most popular open source in-memory engines today. ElastiCache integrates natively with other AWS services such as EC2 and RDS. It’s often used to manage web session data, or to cache dynamically generated web pages.
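The usual way applications use such a cache is the cache-aside pattern: look in the cache first, and only fall back to the database on a miss. The sketch below illustrates the idea with a plain Python dictionary standing in for Memcached or Redis; the class and parameter names are hypothetical and nothing here talks to ElastiCache itself.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside lookup with a per-entry time-to-live (TTL)."""

    def __init__(self, load_from_database: Callable[[str], str],
                 ttl_seconds: float = 60.0) -> None:
        self._load = load_from_database
        self._ttl = ttl_seconds
        # key -> (expiry time on the monotonic clock, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the slow store is not touched
        value = self._load(key)  # cache miss or expired entry
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a real deployment, the dictionary would be replaced by a Memcached or Redis client pointed at the ElastiCache endpoint, while the hit/miss logic stays the same.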
Bonus Storage Service: AWS Storage Gateway
AWS Storage Gateway integrates on-premises IT environments with cloud storage. With Storage Gateway, data is stored in Amazon S3, with frequently accessed files kept locally. Once the gateway is connected to AWS, you can create three types of storage gateway volumes: gateway-cached volumes, which use S3 to store your primary data while keeping frequently accessed data locally; gateway-stored volumes, which store the primary data locally in your data center and in parallel back it up to Amazon S3; and the gateway-virtual tape library (VTL), which replaces your local physical tape library with a virtual one. Storage Gateway is best used for corporate file sharing, and for enabling existing on-premises backup applications to store primary backups on Amazon S3.
The AWS storage options above are not without their challenges. Just as on any other public cloud, performance on AWS varies. To check your storage requirements and utilization, you can use Perfmon, which is built into Windows, or iostat in a Linux environment. Both tools log activity on the physical disk and monitor the number of I/O operations per second (IOPS).
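Tools of this kind typically derive per-second rates from cumulative operation counters sampled at two points in time. The sketch below shows that calculation on its own; the `DiskSample` type and its field names are hypothetical, and the counter values would come from whatever monitoring source you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskSample:
    """One reading of a disk's cumulative I/O counters."""
    timestamp_seconds: float
    reads_completed: int
    writes_completed: int


def iops_between(earlier: DiskSample, later: DiskSample) -> float:
    """Average I/O operations per second between two samples."""
    elapsed = later.timestamp_seconds - earlier.timestamp_seconds
    if elapsed <= 0:
        raise ValueError("the later sample must be taken after the earlier one")
    operations = ((later.reads_completed - earlier.reads_completed)
                  + (later.writes_completed - earlier.writes_completed))
    return operations / elapsed
```

Comparing the measured rate against the IOPS your volume type is expected to deliver is a simple way to spot a storage bottleneck.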
In addition, AWS storage options provide the ability to maintain robust service operations, including data protection. With regard to backup and DR, it’s important not to rely only on ad-hoc backup automation scripts (for example, for scheduling EBS snapshots or migrating data to Glacier), but to define, implement and maintain comprehensive policies. This allows you to align the uptime of your cloud environment with your SLAs and with specific compliance standards.
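To make the idea of a policy concrete, here is a minimal sketch of one possible retention rule: delete snapshots older than a given number of days, but always keep the newest few. The rule, the function name and the parameters are illustrative assumptions, not a recommendation from AWS or any particular tool.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def snapshots_to_delete(snapshot_times: Dict[str, datetime], now: datetime,
                        keep_days: int, keep_minimum: int) -> List[str]:
    """Return snapshot IDs that the retention rule allows deleting.

    A snapshot is deletable when it is older than keep_days, unless it is
    among the keep_minimum most recent snapshots, which are always kept.
    The result is ordered newest first.
    """
    newest_first = sorted(snapshot_times,
                          key=lambda snapshot_id: snapshot_times[snapshot_id],
                          reverse=True)
    protected = set(newest_first[:keep_minimum])
    cutoff = now - timedelta(days=keep_days)
    return [snapshot_id for snapshot_id in newest_first
            if snapshot_id not in protected and snapshot_times[snapshot_id] < cutoff]
```

Keeping the decision logic separate from the code that actually deletes snapshots makes the policy easy to review and test before it touches real backups.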
One solution for resolving these challenges is Cloud Protection Manager (CPM). CPM provides a simple web interface for managing your EC2 backup operations. It is available in a service model that allows users to manage multiple AWS accounts and configure policies and schedules for automated snapshot backups. It also has a Windows agent for taking consistent backups of Windows applications. CPM allows you to recover a volume from a snapshot, increase its size and switch it with an existing attached volume in a single step.