As you may know, Amazon Elastic Block Store (Amazon EBS) provides persistent block storage volumes that offer low-latency performance for EC2 instances. It offers multiple volume types to fit various use cases (such as log processing, cold data storage, and I/O-intensive database workloads), and it automatically replicates each volume within its Availability Zone (AZ). This provides much-needed high availability as well as durability (the annual failure rate is only 0.1%–0.2%). Still, EBS volumes can fail, and sometimes even become unrecoverable, so backing up primary storage that holds potentially business-critical data is always recommended.
With Amazon EBS, you can do that by using snapshots (point-in-time incremental backups that are stored in Amazon S3), which can be used to quickly recreate volumes when necessary. The only downside is that snapshots are priced at $0.05 per GB per month, which is higher than the regular price of storing data in Amazon S3. Still, the benefits outweigh the cost.
In this article, we will go over seven use cases for backing up EBS volumes to Amazon S3 buckets. But first, let’s take a look at why you would want to store your backups in Amazon S3 in the first place.
Amazon S3 Storage Classes
Amazon S3 (Amazon Simple Storage Service) is one of the first services Amazon ever introduced, and is one of the reasons the AWS cloud has been so successful. It is an object storage solution that is secure (it complies with PCI-DSS, HIPAA/HITECH, and many other standards), highly available and durable (99.999999999%), scalable, flexible, and low cost ($0.023 per GB stored per month).
In addition, Amazon S3 provides multiple storage classes, so whatever your use case is, your demands will be met. These include:
- S3 Standard: A general purpose storage suitable for a wide variety of use cases; the most commonly used.
- S3 Standard-Infrequent Access (S3 Standard-IA): Used for data that is not often needed, but can be retrieved quickly when required.
- S3 One Zone-Infrequent Access (S3 One Zone-IA): For customers who do not require multiple AZ data resilience.
- S3 Intelligent-Tiering: Recently introduced, this new class helps with unknown or changing access patterns by moving data between access tiers, always keeping it in the tier that is most cost-effective at that time.
- Amazon S3 Glacier: Cheap cold storage used for archiving data that is not often accessed, but has a somewhat slow retrieval time (unless more expensive options for retrieval are chosen).
- Reduced Redundancy Storage (RRS): Was used as another low-cost option in the past, with slightly lower durability, but has been slowly phased out by Amazon. You can still choose it, but it is not advisable, and it doesn’t provide actual cost savings anymore.
Backing Up Your EBS Volumes to Amazon S3: 7 Use Cases
Now that we know why Amazon S3 is such a great option for storing backups, let’s take a look at some use cases.
1. Long-Term Logs and Metrics Storage
Logs and metrics usually require long retention periods, and often have to be stored away, as their size makes them too expensive to keep in hot storage. If your application has logs or metrics that go back many years, taking a snapshot is a good way to put away a long-standing volume containing data that you might need in the future. Since retrieving these logs and metrics is rarely urgent, Amazon S3 Glacier is a great low-cost option. If you need your data again, you can use the snapshot to recreate the volume, and look at the data using whatever tools you were relying on (Elasticsearch, Prometheus, etc.).
2. Disaster Recovery Data
When looking at business continuity, one of the first things that comes to mind is disaster recovery. Having backups of your mission-critical data that you can retrieve on demand is crucial in order to bounce back from a disastrous event, whether it’s a malicious attack on your business, a human error made by an employee, or a regional cloud outage. With your snapshots backed up in Amazon S3, you can easily provision new volumes and restore your services to reduce downtime. Just make sure to copy your snapshots to another Region, so that they are safely stored away and ready when disaster hits.
Your recovery time objective (RTO) and recovery point objective (RPO) will be the deciding factors regarding what class of storage you will need, so make sure you look at this from both the operations and business perspectives.
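The cross-Region copy mentioned above can be sketched with boto3, the AWS SDK for Python. This is a minimal sketch, not a complete disaster recovery setup: the helper names `build_copy_request` and `copy_snapshot_to_region` are hypothetical, the snapshot ID and Regions in the usage note are placeholders, and credentials, encryption settings, and error handling are left out.

```python
from typing import Any, Dict, Optional


def build_copy_request(snapshot_id: str, source_region: str,
                       description: Optional[str] = None) -> Dict[str, Any]:
    """Build the keyword arguments for the EC2 CopySnapshot call."""
    if not snapshot_id.startswith("snap-"):
        raise ValueError(f"not a snapshot ID: {snapshot_id!r}")
    return {
        "SourceSnapshotId": snapshot_id,
        "SourceRegion": source_region,
        "Description": description
        or f"DR copy of {snapshot_id} from {source_region}",
    }


def copy_snapshot_to_region(snapshot_id: str, source_region: str,
                            dest_region: str) -> str:
    """Copy a snapshot into dest_region and return the ID of the new copy."""
    if source_region == dest_region:
        raise ValueError("destination Region must differ from the source Region")

    # Imported lazily so build_copy_request works without boto3 installed.
    import boto3

    # CopySnapshot is issued against the destination Region's endpoint.
    ec2 = boto3.client("ec2", region_name=dest_region)
    response = ec2.copy_snapshot(**build_copy_request(snapshot_id, source_region))
    return response["SnapshotId"]
```

A call such as `copy_snapshot_to_region("snap-0123456789abcdef0", "us-east-1", "eu-west-1")` would start the copy. The copy runs asynchronously, so the returned snapshot may still be pending when the function returns.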
3. Customer Data Storage
Many companies store large amounts of their clients’ data, usually indefinitely. Thanks to the performance of Amazon EBS (which offers higher IOPS and lower latency than Amazon S3), and to the fact that retrieving data from Amazon S3 results in additional costs (even though data transfer from Amazon S3 to Amazon EC2 in the same Region is free, you still pay for the requests made), many opt to keep the data on those volumes.
For example, you might be in the business of storing customer emails, and while only the more recent emails need to be available at all times for regular searches, you still store the older ones going back five years or more. These can easily be put in cold storage, and depending on the agreement with the customer, the data might not need to be available quickly when requested, leaving you with the simple solution of Amazon S3 Glacier for storing it long term.
If the data does need to be accessed a bit more quickly (the standard retrieval time for Amazon S3 Glacier is 3–5 hours), you can opt for S3 Standard-IA. Alternatively, you can keep the data in Amazon S3 Glacier, but be prepared to pay for the more expensive expedited retrieval.
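The trade-off described above can be written down as a small decision helper. This is only a sketch under stated assumptions: the function name `pick_storage_class` is hypothetical, the five-hour threshold is simply the upper end of the 3–5 hour range quoted in this article, and the returned strings follow the storage class and retrieval tier names used by the S3 API.

```python
from typing import Optional, Tuple

# Upper end of the 3-5 hour standard retrieval window quoted in the article.
GLACIER_STANDARD_RETRIEVAL_HOURS = 5


def pick_storage_class(max_wait_hours: float,
                       allow_expedited: bool = False) -> Tuple[str, Optional[str]]:
    """Return (storage class, Glacier retrieval tier or None).

    max_wait_hours is how long the customer agreement lets a request wait.
    allow_expedited says whether paying for expedited retrieval is acceptable.
    """
    if max_wait_hours >= GLACIER_STANDARD_RETRIEVAL_HOURS:
        # Standard Glacier retrieval is fast enough and the cheapest option.
        return ("GLACIER", "Standard")
    if allow_expedited:
        # Stay in Glacier, but pay more per retrieval.
        return ("GLACIER", "Expedited")
    # Data must come back quickly without per-retrieval premiums.
    return ("STANDARD_IA", None)
```

For the email archive example, a contract that allows a day of waiting would map to `("GLACIER", "Standard")`, while a one-hour limit without expedited retrieval would map to `("STANDARD_IA", None)`.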
4. Public Sector Archives
The public sector usually stores huge amounts of data, and in order to reduce the cost of this (while keeping everything somewhat organized), it has to be stored away properly. For example, in academia, access to attendance records is needed often, so storing backups in S3 Standard would make the most sense. Other records, like academic records, can be stored in S3 Standard-IA, as they are rarely accessed, but still need to be retrieved quickly on demand. Of course, Amazon S3 Glacier is always the preferred storage class (when applicable) due to its low cost, so make sure you identify the proper storage solution in cases like these.
5. Regulatory Compliance
Various regulations and compliance requirements can demand extremely long data retention periods. Banks are especially sensitive to this, as audits can happen, and information from years back may be requested. Since this data is rarely accessed, and audits are usually scheduled upfront, Amazon S3 Glacier is the perfect choice due to its low cost. The medical industry also needs to keep data for very long periods of time, and the data needs to be secured in order to comply with laws that protect patient confidentiality. Depending on the requirements, any of the Amazon S3 storage classes might be of use here.
6. Project Storage
Media companies have numerous projects, and each of them can contain gigabytes, or even terabytes, of data. When a project is over, a disk containing an entire set of raw data can be stored away as a snapshot, and easily accessed later. Most production houses that work on commercials, music videos, or even movie post-production tend to have large projects with lots of raw footage that might be needed in the future, so it can’t simply be discarded. Storing a whole project in Amazon S3 Glacier ensures that the data is not only safely stored away, but also that the cost of keeping it long term is as low as possible.
7. Mass Content Distribution
When working with data, you often need to share it across teams or departments within a company. Whether at big enterprises or smaller startups, many require access to the original files for research, analytics, or other purposes. Often, this data can be kept in Amazon S3, but thanks to the performance of Amazon EBS, some might opt to keep it on the actual volumes instead. In order to avoid long retention in hot storage, which can be costly, these volumes can be easily stored in Amazon S3 as snapshots; and later, when required, a volume can be recreated and attached to an EC2 instance to be consumed by those who need it. Depending on the requirements, as well as the cost factor, all Amazon S3 classes can be a viable choice here.
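Recreating a volume from a snapshot and attaching it to an instance can be sketched as follows. This is a minimal illustration, not a production workflow: `restore_and_attach` is a hypothetical helper, the `ec2` argument is assumed to be a boto3 EC2 client (or any object exposing the same three methods), and the default device name is an assumption that depends on the instance type and operating system.

```python
def restore_and_attach(ec2, snapshot_id: str, availability_zone: str,
                       instance_id: str, device: str = "/dev/sdf") -> str:
    """Create a volume from a snapshot, wait for it, and attach it.

    The volume must be created in the same Availability Zone as the
    instance it will be attached to. Returns the new volume ID.
    """
    volume = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    volume_id = volume["VolumeId"]

    # Block until the volume leaves the "creating" state.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=volume_id)
    return volume_id
```

With boto3 installed, the client would typically be created with `boto3.client("ec2", region_name="us-east-1")` and passed in as the first argument. Passing the client in, rather than creating it inside the function, keeps the helper easy to exercise with a stub.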
Simplifying Backups With N2WS Backup & Recovery
Third-party products can perfectly complement existing AWS capabilities. N2WS Backup & Recovery was built specifically for AWS. By leveraging native AWS technologies to utilize block-level and incremental snapshots, it allows for very efficient backups which can be easily automated. And, when necessary, everything can be recovered quickly and in any region or account in the cloud.
N2WS Backup & Recovery now offers a feature called “Store Snapshots in Amazon S3,” as well as “Archive to Glacier.” These features reduce the overall cost of long-term archival. Now, instead of paying $0.05 per GB for your snapshots, you pay the regular cost of using Amazon S3, with the most expensive class being S3 Standard (priced at $0.023 per GB stored) and Glacier being even cheaper ($0.004 per GB).
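To make the price difference concrete, the arithmetic can be sketched in a few lines. The rates are the ones quoted in this article and are assumed to be per GB per month. The sketch also assumes the same amount of data is billed in every tier, which ignores the incremental nature of snapshots as well as request and retrieval charges, so treat the results as a rough comparison only.

```python
# Rates quoted in the article, assumed to be USD per GB per month.
PRICE_PER_GB = {
    "ebs_snapshot": 0.05,
    "s3_standard": 0.023,
    "s3_glacier": 0.004,
}


def monthly_cost(size_gb: float, tier: str) -> float:
    """Storage cost in USD per month for size_gb of data in the given tier."""
    return round(size_gb * PRICE_PER_GB[tier], 2)


def monthly_savings(size_gb: float, tier: str) -> float:
    """Monthly saving in USD versus keeping the data as an EBS snapshot."""
    return round(monthly_cost(size_gb, "ebs_snapshot") - monthly_cost(size_gb, tier), 2)


if __name__ == "__main__":
    for tier in PRICE_PER_GB:
        print(tier, monthly_cost(1000, tier), monthly_savings(1000, tier))
```

For 1,000 GB, this works out to $50 per month as snapshots, $23 in S3 Standard, and $4 in Glacier, so a saving of $27 or $46 per month respectively.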
Archive snapshots to S3, Glacier and Deep Archive
Amazon S3 provides a great storage solution at a very good price, making it a primary choice for various use cases on AWS. Creating snapshots of EBS volumes allows you to rely on all of Amazon S3’s benefits, but unfortunately, at a significantly higher cost—which can be a huge problem for companies that are trying to keep their spending as low as possible.
Using N2WS Backup & Recovery, you can benefit from all the upsides of Amazon S3 while keeping the price where you want it, so it’s very worthwhile to look into it.