In the world of IT, storage has always made up one of the core pieces of the infrastructure. After all, everything revolves around data, so storing it properly is of the utmost importance. But not all data has the same requirements, and while some data needs to be available to access most of the time, some data does not. And when this mostly inactive data is accompanied by long retention periods, properly maintaining it means putting it in cold storage (such as Amazon Glacier).
On premises, cold storage has usually meant tape, but now that most companies are looking at public clouds like AWS, there are managed services that fulfill that role. In this article we will give an overview of Amazon Glacier, a cold storage solution offered by Amazon, and look at why it might be a good solution for your company’s use case.
Amazon Glacier is a durable (99.999999999% durability) and scalable cold storage service that comes at a very low cost, especially when compared to on-premises solutions. It is also very secure (supporting a wide range of compliance standards and regulations such as HIPAA and the EU GDPR), and it provides multiple options not only for storing objects but also for retrieving them. This helps keep the overall cost to a minimum while still fulfilling clients’ needs.
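To put the eleven-nines figure into perspective, here is a small Python sketch of the arithmetic. It assumes the durability figure applies per object per year and that losses are independent; both are simplifying assumptions for illustration, and the function name `expected_annual_losses` is made up for this example.

```python
from decimal import Decimal

# Durability quoted above: 99.999999999% (eleven nines), written as a fraction.
DURABILITY = Decimal("0.99999999999")


def expected_annual_losses(object_count: int) -> Decimal:
    """Expected number of objects lost per year.

    Assumes the durability figure is an annual, per-object probability
    and that object losses are independent of each other.
    """
    annual_loss_probability = Decimal(1) - DURABILITY
    return annual_loss_probability * object_count


if __name__ == "__main__":
    losses = expected_annual_losses(10_000_000)
    print(f"Expected objects lost per year: {losses}")
    print(f"Average years between single-object losses: {Decimal(1) / losses}")
```

Under these assumptions, an archive of ten million objects would be expected to lose about one object every ten thousand years.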
What Is the Difference between Amazon Glacier and S3?
When first moving to the cloud, it is easy to get overwhelmed by the number of services and options available. This is also true when looking for the best storage solution for your use case, as you might be considering both S3 and Glacier. So how are they different, and when would you use one and not the other?
Both S3 and Glacier are designed for high durability, and both replicate data across multiple Availability Zones to protect you from single-AZ failures. Both also let you store virtually unlimited amounts of data. Where they differ is in how the data is accessed. S3 is mostly used for frequently accessed objects (or somewhat infrequently accessed ones when relying on the S3 IA (Infrequent Access) storage class). AWS publishes a detailed list of use cases for S3. Glacier, on the other hand, is used for storing objects that you won’t need to access for extended periods of time: weeks, months, or even years. This is the intended use case for Glacier, and its entire cost structure revolves around it. So let’s see what retrieval options are provided and how everything is priced.
Amazon Glacier Pricing and File Retrieval Options
Storing data in Glacier costs only $0.004 per GB per month (for comparison, storing data in S3 Standard costs $0.023 per GB per month), which is a bargain, but object retrieval does come with a price. There are three options to consider when you need to get your data out of Glacier. The cheapest is Bulk retrieval, which will set you back only $0.025 per 1,000 requests and $0.0025 per GB retrieved, but as the lowest-cost option it makes you wait between 5 and 12 hours before your data is ready. If this is too slow for your needs (maybe you are in the middle of disaster recovery, and your RTO demands more speed), Standard retrieval takes between 3 and 5 hours, but the price goes up to $0.05 per 1,000 requests and $0.01 per GB retrieved.
And if you need your data even faster, the Expedited retrieval option can make it available within 1 to 5 minutes, but it comes at a significant cost. Expedited retrievals can be On-Demand or Provisioned. On-Demand retrievals cost $10.00 per 1,000 requests, as well as $0.03 per GB retrieved, but the caveat is that Expedited On-Demand requests are not available at all times (just like On-Demand EC2 instances are not always available). If you have purchased Provisioned capacity, then your Expedited retrievals are Provisioned too and are available at all times, but the cost can skyrocket at $100.00 per capacity unit per month, so make sure you know what you are doing to prevent unnecessary spending.
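To make the retrieval pricing above concrete, here is a minimal Python sketch that estimates the cost of a single retrieval job under each tier. It uses only the per-request and per-GB figures quoted in this article; the names `RETRIEVAL_PRICES` and `retrieval_cost` are made up for illustration, actual AWS prices vary by region and change over time, and Provisioned capacity is not modeled.

```python
# Retrieval prices quoted in this article (USD). Treat them as
# illustrative inputs, not as current AWS pricing.
RETRIEVAL_PRICES = {
    "bulk": {"per_1000_requests": 0.025, "per_gb": 0.0025},
    "standard": {"per_1000_requests": 0.05, "per_gb": 0.01},
    "expedited": {"per_1000_requests": 10.00, "per_gb": 0.03},  # On-Demand
}


def retrieval_cost(tier: str, requests: int, gigabytes: float) -> float:
    """Estimate one retrieval job: request fee plus per-GB data fee."""
    price = RETRIEVAL_PRICES[tier]
    request_fee = requests / 1000 * price["per_1000_requests"]
    data_fee = gigabytes * price["per_gb"]
    return round(request_fee + data_fee, 4)


if __name__ == "__main__":
    # Example: restore 500 GB spread across 1,000 archive requests.
    for tier in RETRIEVAL_PRICES:
        cost = retrieval_cost(tier, requests=1000, gigabytes=500)
        print(f"{tier:>9}: ${cost:.3f}")
```

For large restores the per-GB fee dominates, while for many small archives the per-request fee of the Expedited tier quickly becomes the larger part of the bill.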
Amazon Glacier Use Cases
As a low-cost cold storage service, Amazon Glacier is a practical fit for a range of scenarios. Let’s look at some of them.
Magnetic Tape Replacements
Various tape libraries can still be found in use today. Whether they are stored off-site or on-premises, they are not only costly but also create a lot of maintenance overhead. Amazon Glacier is a great replacement: it is a managed service, which relieves you of the management overhead, and thanks to its pay-for-what-you-use model (operating expenses) there is no upfront cost.
Storage Gateway Virtual Tape Library can also be used to store archives when you have a hybrid cloud solution in place. It can send data to both S3 and Glacier, and the important factor is that it allows you to rely on your existing workflows, making the entire process seamless.
Archiving Data for Regulatory Purposes
A lot of businesses are required to keep long-term archives in order to comply with various regulations. This is especially true in healthcare (HIPAA, as well as many other regulations), where vast amounts of patient data need to be stored for decades. This storage must be reliable and secure, but keeping data of that magnitude for so long can be an issue in terms of cost. Amazon Glacier fulfills all of these requirements and is a good fit here.
Digital Media Asset Archival
When working with digital media, there are usually large files involved. Whether you are looking at news footage or digital assets for a large movie or gaming project, video files can be gigabytes or even terabytes in size. These need to be stored long term, which can become expensive. With Glacier, you not only get a low-cost option for storing them, but you can also retrieve your files easily when you need them, which allows you to have a simple, cheap, and efficient workflow in place.
Backup and Restore for Disaster Recovery Scenarios
Disaster recovery planning is a crucial part of business continuity for any company today, and making sure you are protected when an undesired event occurs is of the utmost importance.
Whether you are running your environment in the AWS cloud or relying on a hybrid cloud solution, Amazon Glacier provides you with durable and secure backup storage at a low cost. If you do end up in a situation where you have to restore your data, the different retrieval options let you match the restore speed to your Recovery Time Objective (RTO), which determines how quickly you need to recover your services in order to maintain business continuity.
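One way to reason about matching a retrieval option to an RTO is sketched below in Python. It picks the cheapest tier whose worst-case retrieval time, as quoted earlier in this article, fits within the time budget. The helper name `cheapest_tier_for_rto` is hypothetical, and the sketch deliberately ignores everything else that counts against an RTO (data transfer, restoring services, and so on), so treat the budget as the portion of the RTO you can spend waiting on Glacier.

```python
from typing import Optional

# Worst-case retrieval times quoted in this article, in minutes,
# ordered from cheapest to most expensive tier.
TIERS_CHEAPEST_FIRST = [
    ("bulk", 12 * 60),      # 5 to 12 hours
    ("standard", 5 * 60),   # 3 to 5 hours
    ("expedited", 5),       # 1 to 5 minutes
]


def cheapest_tier_for_rto(budget_minutes: float) -> Optional[str]:
    """Return the cheapest tier whose worst-case time fits the budget."""
    for tier, worst_case_minutes in TIERS_CHEAPEST_FIRST:
        if worst_case_minutes <= budget_minutes:
            return tier
    return None  # no retrieval tier is fast enough for this budget


if __name__ == "__main__":
    for budget in (24 * 60, 6 * 60, 30, 2):
        print(f"{budget:>5} min budget -> {cheapest_tier_for_rto(budget)}")
```

Using the worst-case time rather than the typical time is a conservative design choice: a tier is only selected if it would meet the budget even at the slow end of its quoted range.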
If your business relies on a hybrid cloud, File Gateway and Volume Gateway are alternatives. The former gives you S3 object access using protocols such as NFS, and the latter provides block storage for your applications using iSCSI.
Long Term Data Libraries
Many libraries (as well as government agencies and similar institutions) need to store large amounts of data long term. For these institutions cost is less of an issue, but the durability of the objects is quite important. And with such large quantities of data, maintenance becomes an almost impossible task. Glacier, being a fully managed storage service, is built to be self-healing. This means it performs regular data integrity checks on all files and repairs any object that fails verification.
Amazon Glacier Deep Archive
Amazon is always looking to improve its products and is constantly introducing new features. Just recently it announced Glacier Deep Archive, a new storage class for Glacier. Deep Archive is meant as a very long-term storage solution that offers even lower prices than standard Glacier storage, and it is a good fit for keeping data sets for 7–10 years or longer (often to meet regulatory compliance requirements).
Deep Archive provides retrieval of data within 12 hours (for now this is the only option), and costs just $0.00099 per GB stored per month.
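To see how the storage prices quoted in this article compare at scale, here is a small Python sketch that computes the monthly bill for the same archive in each class. The dictionary keys and the function name `monthly_storage_cost` are made up for illustration, 1 TB is taken as 1,000 GB, and retrieval and request fees are left out.

```python
# Storage prices quoted in this article, in USD per GB per month.
# Illustrative inputs only; real prices vary by region and over time.
STORAGE_PRICE_PER_GB_MONTH = {
    "s3_standard": 0.023,
    "glacier": 0.004,
    "deep_archive": 0.00099,
}


def monthly_storage_cost(storage_class: str, gigabytes: float) -> float:
    """Monthly storage cost in USD, excluding retrieval and request fees."""
    return round(gigabytes * STORAGE_PRICE_PER_GB_MONTH[storage_class], 2)


if __name__ == "__main__":
    archive_gb = 10 * 1000  # a 10 TB archive, assuming 1 TB = 1,000 GB
    for storage_class in STORAGE_PRICE_PER_GB_MONTH:
        cost = monthly_storage_cost(storage_class, archive_gb)
        print(f"{storage_class:>12}: ${cost:.2f} per month")
```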
Amazon Glacier is a top-tier cold storage solution, providing a secure, durable, and very cost-efficient service for those who need to offload their data long term. As we have shown, it fits numerous use cases, and with the release of Deep Archive its capabilities will only continue to grow. Whether you are an existing AWS customer or not, if your data calls for cold storage, make sure you consider Glacier.
As cold storage sees more and more use on AWS, its storage services are also becoming some of the most widely supported by third-party solutions. Amazon S3, for example, is highly durable, highly scalable, low cost, and integrates with the majority of AWS services. Further, you can experiment with Amazon S3 and Glacier by signing up for the AWS Free Tier, which includes 5 GB of free storage space and up to 20,000 GET and 2,000 PUT requests for 12 months. Usage above the AWS Free Tier limit is charged at standard rates. Looking for an AWS Data Protection solution? Try N2WS Backup & Recovery (CPM) for FREE!