Unsurprisingly, the biggest headlines to come out of AWS re:Invent this past December centered on innovations related to machine learning and AI – like an overhaul of SageMaker, a new generation of foundation models known as Nova, and a slew of feature updates for Bedrock, Amazon’s platform for creating generative AI applications.
But if you were paying close attention, you might also have noticed that Amazon used re:Invent to debut several major updates on the cloud storage front. And while storage may not be quite as trendy at the moment as AI, there’s plenty of reason to be deeply excited about the re:Invent storage announcements, too.
So, in case you missed it, keep reading for a look at six game-changing cloud storage innovations announced during (or in close proximity to) re:Invent, along with explanations of why they’re a big deal for anyone who cares about managing and protecting data in the cloud.
1. Amazon S3 Tables
The S3 object storage service has traditionally been a less-than-ideal option for certain types of use cases – including, notably, data analytics. S3 was originally designed as a low-cost, highly scalable way to store large volumes of data. But because traditional S3 buckets don’t organize data in a particular way, they’re not optimal for parsing data quickly as part of analytics workloads.
The introduction at re:Invent of S3 Tables changes this. The Tables feature creates a new type of bucket, the table bucket, that you can configure on S3. Unlike general-purpose S3 buckets, table buckets function as an “analytics warehouse,” in Amazon’s words, which translates to fast queries when you use them to host tabular data – like purchase transaction records or user login events. Amazon says queries run up to three times faster, at least when using query engines hosted by Amazon (it hasn’t said how query performance compares for customers who use third-party tools to analyze S3 data). And you still enjoy the same scalability and reliability that you’d get from standard S3 buckets.
By optimizing S3 for analytics, Tables opens the door to a whole range of interesting new use cases for object storage in the Amazon cloud. No longer will S3 buckets merely be a convenient place to dump a lot of data that is not organized in a particular way; they can now also serve as a place to host tabular data that you need to analyze quickly and at scale.
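If you want to kick the tires, the workflow looks roughly like the sketch below, which uses boto3’s s3tables client (the bucket, namespace, and table names are placeholders, and the exact method shapes assume a boto3 release recent enough to include S3 Tables support, so check the current SDK docs):

```python
import boto3

# The s3tables client is separate from the regular s3 client.
s3tables = boto3.client("s3tables", region_name="us-east-1")

# Create a table bucket -- the new bucket type optimized for tabular data.
bucket = s3tables.create_table_bucket(name="analytics-demo-bucket")
bucket_arn = bucket["arn"]

# Tables are grouped into namespaces within a table bucket.
s3tables.create_namespace(tableBucketARN=bucket_arn, namespace=["sales"])

# Create an Apache Iceberg table to hold tabular data, such as
# purchase transaction records.
table = s3tables.create_table(
    tableBucketARN=bucket_arn,
    namespace="sales",
    name="transactions",
    format="ICEBERG",
)
print(table["tableARN"])
```

From there, you can point an Amazon-hosted query engine at the table and query it like any other Iceberg table.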
2. Amazon S3 Storage Browser
In a personal favorite announcement of ours, Amazon opened up another compelling use case for S3 by debuting Storage Browser, which lets end users browse data hosted in S3 buckets. Storage Browser is an open source interface component that developers can include in their apps. Once present, the component exposes S3 bucket contents to approved application users. Developers can configure S3 data to be browsable in read-only mode, or they can grant users upload, download, copy, and delete privileges if they wish.
While it was possible to build features like this before, doing so required a lot of effort because developers had to implement their own code to make S3 storage buckets browsable by third parties. Now, they can add this capability to apps with ease.
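To give a sense of the plumbing Storage Browser saves you from writing, here’s a rough sketch of the old do-it-yourself approach – listing a bucket’s contents and minting short-lived presigned download links for an application user. The bucket name and the browse helper are hypothetical; the boto3 calls themselves are standard:

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "example-customer-files"  # hypothetical bucket name

def browse(prefix: str = "") -> list[dict]:
    """Return browsable entries plus temporary download links for a
    given 'folder' prefix -- the kind of glue code that Storage
    Browser now handles for you."""
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix, Delimiter="/")
    entries = []
    for obj in resp.get("Contents", []):
        entries.append({
            "key": obj["Key"],
            "size": obj["Size"],
            # A presigned URL lets an end user download the object
            # without holding AWS credentials of their own.
            "url": s3.generate_presigned_url(
                "get_object",
                Params={"Bucket": BUCKET, "Key": obj["Key"]},
                ExpiresIn=3600,
            ),
        })
    return entries
```

And that’s before you’ve built any UI, pagination, or permission handling on top of it.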
To be sure, not every S3 bucket hosts data that developers would want their end-users to be able to browse or modify. But by making it simple to expose S3 data for this purpose where it makes sense, Amazon seems to be encouraging organizations to think of S3 buckets not just as a place to store internal data, but also data that customers need to be able to access directly.
3. New storage-optimized EC2 instances
In more good news for those who want to process vast quantities of data efficiently, Amazon announced I7ie, a new generation of storage-optimized EC2 instances. Compared to the older-generation I3en instances, the I7ie class delivers up to 50 percent lower I/O latency, 65 percent better real-time storage performance, and 65 percent lower I/O latency variability, according to Amazon.
The company also says that the new generation of storage-optimized instances provides 20 percent better price performance, although it didn’t detail how it calculates that metric.
The new instance types are great news if you need to create cloud servers with high-performance local storage. This update is another step toward simplifying workloads like I/O-intensive analytics.
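Adopting the new instances is just a matter of selecting an i7ie instance type at launch time. A minimal boto3 sketch follows; the AMI ID is a placeholder, and the specific instance size and its regional availability are assumptions worth verifying:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a storage-optimized I7ie instance. The local NVMe
# instance-store volumes that provide the low-latency storage
# come bundled with the instance type automatically.
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="i7ie.2xlarge",      # assumed size; confirm availability
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```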
4. Time-based snapshots for EBS
If you store data on Amazon Elastic Block Storage (EBS), you may create EBS snapshots as a way of backing up your data. Unfortunately, you may also sometimes find that EBS snapshots take longer to generate than you expected. When that happens, you may fall short of your Recovery Point Objective (RPO) goals because by the time your snapshot is complete, the data is too old to be relevant for RPO purposes.
With Amazon’s new time-based EBS snapshot copy feature (which was announced a few days before re:Invent began), you no longer have to cross your fingers that snapshots will finish in the time you need. Instead, you can specify a desired completion duration when you start a snapshot copy. Amazon automatically determines whether the requested duration is feasible by tracking your current throughput rates.
To be clear, this feature doesn’t mean you can somehow cause an EBS snapshot to be created as quickly as you want. It’s not magic. But by calculating completion time before the snapshotting process begins, it helps prevent scenarios where you expect a snapshot to go faster than it does. With the new feature, you can be confident that a planned snapshot duration will actually be achievable.
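In practice, the feature surfaces as a completion-duration parameter on the EC2 snapshot copy operation. Here’s a minimal boto3 sketch; the snapshot ID is a placeholder, and the CompletionDurationMinutes parameter reflects our reading of the announcement, so verify it against the current EC2 API reference:

```python
import boto3

# The destination region is the region the client is created in.
ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.copy_snapshot(
    SourceRegion="us-west-2",
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="Time-based copy for RPO-sensitive backup",
    # Target duration for the copy; per the announcement, AWS checks
    # feasibility against your available throughput up front.
    CompletionDurationMinutes=60,
)
print(resp["SnapshotId"])
```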
5. FSx Intelligent Storage Tiering
Amazon FSx is a cloud storage service that lets you create file systems in the cloud. It supports several file system types that businesses commonly use to share data across the network, such as Windows File Server and OpenZFS.
In essence, FSx offers a twist on traditional network file sharing protocols, like NFS and SMB: instead of having to set up and manage your own network-attached file shares, you can store your data in the cloud via FSx and connect to it from any on-prem or cloud-based endpoint using the standard network file sharing tools you already use.
Traditionally, FSx data was hosted using high-performance Solid State Drives (SSDs). That was great for businesses that wanted to achieve high I/O rates for their network-attached file systems. It was less great for those seeking cost-effective commodity storage in file system form.
With the announcement of the new intelligent tiering storage class for FSx, low-cost network-attached storage in the Amazon cloud has become an option – although how much ‘intelligent tiering’ actually saves will depend on your access patterns. Amazon has priced storage using the new feature (which currently supports only the OpenZFS version of FSx) as much as 85 percent lower than traditional FSx storage. In addition, the new storage class automatically cycles your data through different storage tiers based on how frequently you access it; you don’t need to manage storage tiers manually.
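For the curious, creating an OpenZFS file system on the new storage class looks roughly like the following in boto3. Treat this as a sketch rather than a verified recipe: the INTELLIGENT_TIERING storage type value is our best reading of the announcement, and the subnet IDs and throughput figure are placeholders.

```python
import boto3

fsx = boto3.client("fsx", region_name="us-east-1")

resp = fsx.create_file_system(
    FileSystemType="OPENZFS",
    StorageType="INTELLIGENT_TIERING",  # the new storage class (assumed value)
    SubnetIds=["subnet-0aaa1111", "subnet-0bbb2222"],  # placeholder subnets
    OpenZFSConfiguration={
        "DeploymentType": "MULTI_AZ_1",
        "ThroughputCapacity": 1280,          # MB/s; placeholder value
        "PreferredSubnetId": "subnet-0aaa1111",
    },
    # Note: no StorageCapacity -- with intelligent tiering, storage is
    # meant to grow and shrink elastically as data is added and removed.
)
print(resp["FileSystem"]["FileSystemId"])
```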
While FSx is not as widely used as other AWS storage services (like S3), this new storage option for FSx could be significant for anyone who wants to make data easily accessible via traditional network-attached storage, but without paying the pricing premium that comes with SSD infrastructure.
6. N2W enhancements
The various storage service updates that Amazon introduced during re:Invent make it easier than ever to store and process data in the cloud while achieving high rates of performance and minimizing costs. But if you want to get even more value out of cloud-based data infrastructure, you’ll want to take advantage of the N2W feature enhancements that we announced during re:Invent.
Key updates include:
- Cost-effective long-term Azure storage: N2W v4.4 now supports Azure vault storage for long-term cold storage retention. This means major cost savings, as the feature uses optimal archive tiering with no need for any Azure Backup licensing costs.
- Support for Wasabi S3-compatible storage: By making it possible to store data using Wasabi’s low-cost S3-compatible storage service as part of a cross-cloud data protection strategy, N2W can help reduce storage bills.
- Smarter S3 compliance locking: A new algorithm for S3 compliance locking (which automatically preserves S3 data for compliance purposes) improves efficiency by reducing API requests and lowering operating costs. The result is the same intelligent compliance benefits, but at a lower cost.
- Enhanced snapshot functionality: N2W has long offered advanced snapshot functionality, and it’s now better than ever thanks to native integration with Azure APIs. This complements existing support for other clouds’ snapshotting capabilities, making it easier than ever to generate snapshots quickly and reliably across multiple clouds.
- Enhanced recovery scenarios functionality: N2W customers now enjoy even more control over recovery scenarios through features like the ability to retain fixed tags during recovery events.
- Targeted data backup retries: Now, when a backup doesn’t succeed, N2W can automatically retry only the parts of the operation that failed, rather than restarting the backup from scratch. This approach saves time and money while also increasing the chances of successful backups.
- Per-VM pricing for Azure: A new pricing option in N2W allows customers to pay per VM that they want to back up in Azure, rather than paying based on VM size. We estimate that this option (which complements an equivalent pricing feature already available for N2W in AWS environments), combined with our DLM in Azure feature, can cut costs as much as fivefold in many scenarios.
These N2W enhancements double down on AWS’s recent updates by providing even more opportunities to reduce cloud storage costs while simultaneously boosting the reliability of cloud-based storage. To learn more, you’re welcome to request a free N2W trial.
Chris Tozzi
Chris, who has worked as a journalist and Linux systems administrator, is a freelance writer specializing in areas such as DevOps, cybersecurity, cloud computing, and AI and machine learning. He is also an adviser for Fixate IO, an adjunct research adviser for IDC, and a professor of IT and society at a polytechnic university in upstate New York.