
Archiving on Amazon S3: A Practical Guide

How to use Amazon S3 to archive snapshots (and why you should)
By archiving older data, you can optimize storage, reducing operational costs without sacrificing data retention needs.

Why Use Amazon S3 to Archive Your Snapshots? 

Archiving snapshots on Amazon S3 is a critical practice for organizations that manage large volumes of data and retain it for more than 30 days. As datasets grow, keeping all of that data in high-cost, high-performance storage becomes impractical. By archiving older or less frequently accessed data, organizations can optimize their storage strategies, reducing operational costs without sacrificing data retention needs.

Snapshots, used to capture the state of systems or datasets at a specific point in time, are useful for backups, disaster recovery, and compliance purposes. However, these snapshots can accumulate quickly, consuming significant amounts of expensive storage space. Moving these snapshots to a cost-efficient archival solution like Amazon S3 (or one of its Glacier tiers) helps ensure data is preserved securely and affordably while freeing up primary storage resources for more active data.

Here are some of the benefits of using Amazon S3 for archiving:

  • Cost efficiency: S3 Glacier and S3 Glacier Deep Archive provide low-cost options for storing infrequently accessed data, helping reduce storage expenses.
  • Scalability: S3 is designed to handle massive datasets, allowing organizations to store as much data as needed without worrying about capacity limitations.
  • High durability: S3 is designed for 99.999999999% (11 nines) durability of stored data, minimizing the risk of data loss over time.
  • Automated lifecycle policies: Users can automate data transitions from standard storage to archival tiers, simplifying the process of managing large data sets.
  • Security: Data is encrypted both at rest and in transit, ensuring that sensitive information remains protected from unauthorized access.
  • Compliance and governance: S3 supports various compliance requirements, making it suitable for industries with strict regulatory standards, such as healthcare and finance.

This is part of a series of articles about S3 backup


What Is S3 Glacier and Deep Archive? 

Amazon S3 Glacier is a low-cost storage service for data archiving and long-term backup. It offers secure, durable storage for data that is infrequently accessed. The service is optimized for cost, providing a cheaper alternative to traditional storage for archival needs. 

S3 Glacier is suitable for many use cases, including regulatory and compliance archiving, media asset storage, and preserving scientific and research data. The service supports multipart upload to handle large datasets and integrates with data management and lifecycle policies in Amazon S3. Data in S3 Glacier is encrypted by default, ensuring a high level of security.

Amazon S3 Glacier Deep Archive is an even lower-cost storage class within S3 for long-term data retention and digital preservation. This service is intended for data that is retained for 7-10 years or longer and is rarely, if ever, accessed. It allows organizations to store large sets of data at very low costs, making it suitable for infrequently accessed backups and archives.

S3 Glacier Deep Archive offers the same level of durability and security as S3 Glacier. Retrieval takes longer than with regular Glacier (typically within 12 hours for Standard retrievals and within 48 hours for Bulk retrievals), which suits deep archival storage where immediate access is not a priority. S3 Glacier Deep Archive is a strong choice for enterprises looking to minimize storage expenditure for dormant data.

✅ Pro Tip: Using a tool like N2W, you can automate these lifecycle transitions, ensuring data is archived to the right tier based on your organization’s retention and access requirements.

Learn more in our detailed guide to glacier backup

Tips from the Expert
Sebastian Straub
Sebastian is the Principal Solutions Architect at N2WS with more than 20 years of IT experience. With his charismatic personality, sharp sense of humor, and wealth of expertise, Sebastian effortlessly navigates the complexities of AWS and Azure to break things down in an easy-to-understand way.

How to Archive Snapshots into S3 (the hard way)

Archiving snapshots into Amazon S3 involves configuring the appropriate storage lifecycle and moving the snapshot data to a cost-efficient storage class, such as S3 Glacier or S3 Glacier Deep Archive.

Step 1: Set Up S3 Buckets and Permissions
Begin by creating the S3 bucket where the snapshots will be archived, or confirming that it already exists. Also confirm that the AWS Identity and Access Management (IAM) roles and policies needed to access and manage the snapshot data are in place. These roles need permission to read from your source storage (such as EC2 snapshots or RDS snapshots) and write to the designated S3 bucket.

NOTE: IAM roles should follow the principle of least privilege, ensuring only necessary actions are permitted.
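
A minimal sketch of what a least-privilege policy for the S3 side of this setup might look like, built in Python so it can be printed as JSON. The bucket name and the snapshots/ prefix are placeholders taken from the examples in this guide, and the chosen actions are an assumption about what an archiving role needs. Permissions for reading the source snapshots (EC2 or RDS) are not covered here.

```python
import json


def build_archive_policy(bucket: str, prefix: str = "snapshots/") -> dict:
    """Return an IAM policy document scoped to one prefix of one bucket.

    The action list is an illustrative assumption, not a complete or
    authoritative policy; adjust it to what your workflow really does.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, limited here to the prefix.
                "Sid": "ListArchivePrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Object-level actions: upload, read back, and request restores.
                # Delete is deliberately left out.
                "Sid": "WriteReadRestoreArchiveObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:RestoreObject"],
                "Resource": f"{bucket_arn}/{prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_archive_policy("my-bucket-name"), indent=2))
```

Leaving s3:DeleteObject out means the archiving role can add and restore snapshots but cannot remove them; deletion can then be granted to a separate, more tightly controlled role.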

Step 2: Create a Lifecycle Policy
Lifecycle policies allow you to automatically transition objects between different storage classes based on object age or other conditions such as prefix and tags. In this case, you’ll configure a policy to move snapshots from standard S3 storage to a more cost-effective class like S3 Glacier or S3 Glacier Deep Archive after a specified period (e.g., 30 or 60 days after the object was created). Note that lifecycle rules count days since creation, not days since last access.

You can create a lifecycle rule using the AWS Management Console or through the AWS CLI:

aws s3api put-bucket-lifecycle-configuration --bucket my-bucket-name --lifecycle-configuration file://lifecycle.json

In the lifecycle.json, define rules for transitioning objects to S3 Glacier after a certain number of days:

{
  "Rules": [
    {
      "ID": "MoveToGlacier",
      "Filter": {
        "Prefix": "snapshots/"
      },
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}
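
For teams that script this step instead of using the CLI, the same rule can be sketched in Python. This assumes a boto3 S3 client is created elsewhere and passed in as `client`; the helper names are illustrative and not part of any AWS library.

```python
# Storage classes that a lifecycle transition can target.
TRANSITION_CLASSES = {
    "STANDARD_IA", "ONEZONE_IA", "INTELLIGENT_TIERING",
    "GLACIER", "GLACIER_IR", "DEEP_ARCHIVE",
}


def build_lifecycle_configuration(prefix="snapshots/", days=30, storage_class="GLACIER"):
    """Build the same structure as lifecycle.json above as a Python dict."""
    if not isinstance(days, int) or days < 0:
        raise ValueError("days must be a non-negative integer")
    if storage_class not in TRANSITION_CLASSES:
        raise ValueError(f"unsupported storage class: {storage_class}")
    return {
        "Rules": [
            {
                "ID": "MoveToGlacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": storage_class}],
            }
        ]
    }


def apply_lifecycle(client, bucket, configuration):
    """Apply the configuration; `client` is assumed to be a boto3 S3 client."""
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )


# Example (requires boto3 and AWS credentials):
#   import boto3
#   apply_lifecycle(boto3.client("s3"), "my-bucket-name",
#                   build_lifecycle_configuration())
```

Keep in mind that this call replaces the bucket's entire lifecycle configuration, so any existing rules need to be included in the document you send.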

Step 3: Move Snapshots to S3
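Once your lifecycle policies are in place, you can begin uploading snapshot data to the designated S3 bucket, under the prefix that the lifecycle rule filters on (snapshots/ in the example above). For EBS snapshots, AWS Data Lifecycle Manager (DLM) can automate snapshot creation and retention. Note, however, that native EBS snapshots are not stored as objects in a bucket you own, and DLM's archival option moves them to the EBS Snapshots Archive tier instead of your S3 bucket. Getting snapshot data into your own bucket therefore requires an export or copy step (for example, RDS supports exporting snapshots to S3) or a third-party tool.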
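
As a sketch of the upload step, assuming the snapshot data has already been exported to a local file, the helpers below build an object key under the snapshots/ prefix and upload the file. The key layout, the `.img` extension, and the function names are hypothetical choices for illustration; `client` is assumed to be a boto3 S3 client.

```python
from datetime import datetime, timezone


def snapshot_key(source_id: str, taken_at: datetime, prefix: str = "snapshots/") -> str:
    """Build an object key under the prefix that the lifecycle rule filters on."""
    if taken_at.tzinfo is None:
        raise ValueError("taken_at must be timezone-aware")
    utc = taken_at.astimezone(timezone.utc)
    # Date-based folders keep listings manageable and make manual cleanup easier.
    return f"{prefix}{source_id}/{utc:%Y/%m/%d}/{source_id}-{utc:%Y%m%dT%H%M%SZ}.img"


def upload_snapshot(client, local_path: str, bucket: str, key: str) -> None:
    """Upload one exported snapshot file to the archive bucket."""
    # boto3's upload_file switches to multipart upload for large files.
    client.upload_file(local_path, bucket, key)
```

Because the key starts with the same prefix as the lifecycle rule's filter, objects uploaded this way are picked up by the transition rule without further configuration.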

Step 4: Monitor and Verify
After implementing the lifecycle policies, monitor the transition of snapshots to confirm that they are archived correctly. Amazon CloudWatch's daily S3 storage metrics (such as BucketSizeBytes, broken down by storage type) show how much data sits in each storage class. For object-level checks, S3 Inventory reports or a listing of the bucket show the storage class of each object, confirming whether individual snapshots have transitioned to archival storage.
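
One way to spot-check transitions is to list the bucket and tally objects by storage class. The sketch below assumes a boto3 S3 client is passed in as `client`; the helper names and the 30-day threshold are illustrative.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def iter_objects(client, bucket, prefix="snapshots/"):
    """Yield object entries from list_objects_v2, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        yield from page.get("Contents", [])


def count_by_storage_class(objects):
    """Count objects per storage class (e.g. STANDARD, GLACIER, DEEP_ARCHIVE)."""
    return Counter(obj.get("StorageClass", "STANDARD") for obj in objects)


def overdue_for_transition(objects, transition_days=30, now=None):
    """Return keys older than the transition age that are still in STANDARD."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=transition_days)
    return [
        obj["Key"]
        for obj in objects
        if obj.get("StorageClass", "STANDARD") == "STANDARD"
        and obj["LastModified"] < cutoff
    ]
```

Lifecycle transitions run asynchronously, so an object that has only just passed the threshold may legitimately still show as STANDARD; treat a short overdue list as a prompt to check again later, not as proof of a misconfiguration.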

Managing Archived Snapshots 

Once snapshots have been archived into S3 Glacier or S3 Glacier Deep Archive, it’s essential to manage them effectively to ensure that data remains accessible when needed and storage costs are optimized.

Accessing Archived Snapshots
Archived snapshots in S3 Glacier or S3 Glacier Deep Archive are not immediately available for retrieval. To access them, you must initiate a retrieval request, which can take from minutes to as long as 48 hours depending on the storage class and the retrieval tier chosen (Expedited, Standard, or Bulk; Expedited is not available for Deep Archive). These tiers let you balance cost against speed.
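
To make the cost and speed trade-off concrete, here is a small sketch that picks the slowest (and generally cheapest) retrieval tier that still meets a deadline. The hour values are approximate upper bounds for typical retrieval times and should be treated as assumptions to check against current AWS documentation; the function name is illustrative.

```python
# Approximate upper bounds, in hours, for typical retrieval times.
# Listed from slowest (generally cheapest) to fastest.
TYPICAL_MAX_HOURS = {
    "GLACIER": [("Bulk", 12), ("Standard", 5), ("Expedited", 5 / 60)],
    "DEEP_ARCHIVE": [("Bulk", 48), ("Standard", 12)],  # no Expedited tier
}


def cheapest_tier(storage_class: str, max_wait_hours: float):
    """Return the slowest tier that fits the deadline, or None if none does."""
    for tier, hours in TYPICAL_MAX_HOURS[storage_class]:
        if hours <= max_wait_hours:
            return tier
    return None
```

For example, a restore from Glacier that can wait a day would use Bulk, while the same deadline for Deep Archive would need Standard, because a Bulk retrieval there can take up to 48 hours.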

Retrieval Requests
To restore a snapshot from an archival tier, you can issue a retrieval request via the AWS Management Console, AWS CLI, or SDKs. For example, using the AWS CLI, you can initiate a restore from Glacier:

aws s3api restore-object --bucket my-bucket-name --key snapshots/my-snapshot --restore-request 'Days=7,GlacierJobParameters={Tier=Standard}'

This command requests a temporary copy of the snapshot that stays available for 7 days; the object itself remains in its archival storage class. You can specify different retrieval tiers to adjust the speed and cost of the operation.
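
The same request can be sketched in Python, together with a helper for checking whether a restore has finished. This assumes a boto3 S3 client passed in as `client`; the helper names are illustrative. The status check parses the `Restore` value that `head_object` returns for an object with a restore in progress or completed.

```python
import re

VALID_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(days: int = 7, tier: str = "Standard") -> dict:
    """Build the RestoreRequest structure used by restore_object."""
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def request_restore(client, bucket: str, key: str, days: int = 7, tier: str = "Standard"):
    """Start a restore; `client` is assumed to be a boto3 S3 client."""
    client.restore_object(
        Bucket=bucket, Key=key, RestoreRequest=build_restore_request(days, tier)
    )


def parse_restore_header(header):
    """Parse a value such as: ongoing-request="false", expiry-date="..."

    Returns None when the object has no restore header at all.
    """
    if not header:
        return None
    ongoing = re.search(r'ongoing-request="(true|false)"', header)
    expiry = re.search(r'expiry-date="([^"]+)"', header)
    return {
        "ongoing": (ongoing.group(1) == "true") if ongoing else None,
        "expiry": expiry.group(1) if expiry else None,
    }


# Example status check (requires boto3 and AWS credentials):
#   head = client.head_object(Bucket="my-bucket-name", Key="snapshots/my-snapshot")
#   status = parse_restore_header(head.get("Restore"))
```

Once `ongoing` is false and an expiry date is present, the temporary copy can be downloaded until that date.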

Cost Management
Regularly review your archived snapshots and lifecycle policies to ensure that you’re balancing cost and access requirements. AWS Cost Explorer can help you analyze how much you’re spending on different S3 storage classes and identify opportunities to further optimize costs by adjusting retention periods or using deeper archival options.
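
For a scripted version of this review, the sketch below builds a Cost Explorer query for S3 spend grouped by usage type and sums the response. It assumes the parameters would be passed to a boto3 Cost Explorer client's `get_cost_and_usage` call; the helper names are illustrative, and no real cost figures are implied.

```python
from collections import defaultdict


def build_s3_cost_query(start: str, end: str) -> dict:
    """Build get_cost_and_usage parameters; dates are YYYY-MM-DD, end exclusive."""
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "Filter": {
            "Dimensions": {
                "Key": "SERVICE",
                "Values": ["Amazon Simple Storage Service"],
            }
        },
        # Usage types separate storage classes, requests, and retrievals.
        "GroupBy": [{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    }


def cost_by_usage_type(response: dict) -> dict:
    """Sum the UnblendedCost amounts per usage type across all periods."""
    totals = defaultdict(float)
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            usage_type = group["Keys"][0]
            totals[usage_type] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return dict(totals)


# Example (requires boto3 and AWS credentials):
#   import boto3
#   ce = boto3.client("ce")
#   response = ce.get_cost_and_usage(**build_s3_cost_query("2024-01-01", "2024-02-01"))
#   print(cost_by_usage_type(response))
```

Comparing the totals for standard and archival usage types month over month is one way to see whether lifecycle rules are shifting spend as intended.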

Retention and Deletion
For compliance or operational reasons, you may need to retain snapshots for specific periods. Use S3 Object Lock to enforce retention periods and prevent deletions before required timelines are met. When snapshots are no longer needed, ensure that they are deleted to free up storage and minimize ongoing costs.
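
A minimal sketch of setting a retention period on one archived object, assuming Object Lock is already enabled on the bucket (which also requires versioning) and that `client` is a boto3 S3 client. The helper names and the default mode are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

VALID_MODES = {"GOVERNANCE", "COMPLIANCE"}


def build_retention(retain_days: int, mode: str = "GOVERNANCE", now=None) -> dict:
    """Build the Retention structure used by put_object_retention."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown Object Lock mode: {mode}")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    now = now or datetime.now(timezone.utc)
    return {"Mode": mode, "RetainUntilDate": now + timedelta(days=retain_days)}


def lock_snapshot(client, bucket: str, key: str, retention: dict) -> None:
    """Apply the retention period to one object."""
    client.put_object_retention(Bucket=bucket, Key=key, Retention=retention)
```

GOVERNANCE mode lets specially permitted users shorten or remove the lock, while COMPLIANCE mode cannot be shortened or removed by anyone until the date passes, so it is worth testing with GOVERNANCE first.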

✅ Pro Tip: N2W supports fine-grained retention policies, enabling organizations to enforce compliance rules without manual effort.

Best Practices for Archiving on Amazon S3 

To maximize efficiency and control costs when archiving snapshots in Amazon S3, it’s essential to implement best practices tailored to your data’s retention and access needs. 

These strategies can help ensure that archived data remains secure, compliant, and accessible when needed, while keeping expenses under control:

  1. Use tags for enhanced snapshot management: Apply tags to archived snapshots to categorize and organize data by attributes like project, retention period, or compliance requirements. Tags enable easier tracking, filtering, and management of snapshots for retrieval, review, or deletion.
  2. Implement layered lifecycle policies: Use multiple lifecycle transitions to manage storage costs. For example, move snapshots first to S3 Standard-IA, then to Glacier or Deep Archive as they age. Keep each tier's minimum storage duration in mind (30 days for Standard-IA, 90 days for Glacier, 180 days for Deep Archive), because data moved on sooner is still billed for the remainder of that minimum. This approach provides flexibility and cost savings by aligning storage class changes with expected data usage patterns.
  3. Set up alerts for unexpected retrievals: Configure Amazon CloudWatch alarms, or S3 event notifications for restore events (s3:ObjectRestore:Post), to notify you of any unexpected or unauthorized retrieval requests from Glacier or Deep Archive. This helps manage retrieval costs and detect any potential security issues.
  4. Plan retrieval windows strategically: When restoring data from Deep Archive or Glacier, use Bulk retrievals for non-urgent data and schedule retrievals during off-peak hours. Grouping retrieval requests helps to avoid high costs associated with Expedited retrievals.
  5. Regularly reassess retention policies: Periodically review retention policies to ensure archived data aligns with current business or regulatory needs. AWS Config can help monitor compliance and identify snapshots eligible for deletion or deeper archival, optimizing storage costs.
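  6. Enable MFA for deletions: Protect critical snapshots from unauthorized deletion by enabling MFA Delete on versioned S3 buckets. This extra layer of security helps prevent accidental or malicious deletions, keeping important data safe. Note that S3 does not support lifecycle configurations on buckets with MFA Delete enabled, so reserve it for buckets that do not rely on automated transitions.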
  7. Use Amazon S3 Analytics for data insights: Leverage S3 Storage Class Analysis to track access patterns and make informed archival decisions. Analytics provide insights into snapshot usage, helping avoid premature archiving of data still in demand, which balances cost efficiency with accessibility.
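
Practices 1 and 2 above can be combined in a single lifecycle rule that applies only to tagged objects and steps them through several storage classes. The sketch below builds such a configuration as a Python dict; the tag key and value, the day counts, and the function name are hypothetical examples, not recommendations.

```python
def build_layered_rule(prefix, tag_key, tag_value, transitions, expire_days=None):
    """Build a lifecycle configuration with several transitions.

    `transitions` is a list of (days, storage_class) pairs in ascending order.
    """
    days = [d for d, _ in transitions]
    if days != sorted(set(days)):
        raise ValueError("transition days must be strictly increasing")
    rule = {
        "ID": "LayeredArchive",
        # "And" is required when a filter combines a prefix with tags.
        "Filter": {
            "And": {
                "Prefix": prefix,
                "Tags": [{"Key": tag_key, "Value": tag_value}],
            }
        },
        "Status": "Enabled",
        "Transitions": [{"Days": d, "StorageClass": c} for d, c in transitions],
    }
    if expire_days is not None:
        if days and expire_days <= days[-1]:
            raise ValueError("expiration must come after the last transition")
        rule["Expiration"] = {"Days": expire_days}
    return {"Rules": [rule]}


# Hypothetical example: Standard-IA at 30 days, Glacier at 90, Deep Archive
# at 180, and deletion after roughly seven years.
config = build_layered_rule(
    "snapshots/", "retention", "7y",
    [(30, "STANDARD_IA"), (90, "GLACIER"), (180, "DEEP_ARCHIVE")],
    expire_days=2555,
)
```

The resulting dict has the same shape as the lifecycle.json file used earlier, so it can be written to a file with `json.dump` and applied with the same CLI command.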

How to Archive Snapshots into S3 (the easy way)

For AWS users looking for an easy, automated way to manage archiving, N2W offers seamless integration with Amazon S3, simplifying the process of transitioning snapshots to archival storage.

Instead of relying solely on manual lifecycle configuration, you can use N2W to schedule snapshots to archive into any S3 tier you want, with no need for the CLI or JSON lifecycle files. This is especially useful for non-technical users. Automating these archival processes not only reduces human error but also helps the process scale.

Here’s how you can make the process ridiculously easy:

1. Set Up Your Backup Policy

With N2W, you can define policies that determine how and when snapshots are taken. Within the same policy, you can configure automatic archival to Amazon S3, including Glacier Instant Retrieval and Glacier Deep Archive, without needing to write a single line of code.

  • Select Storage Tier: From the intuitive N2W dashboard, choose which S3 storage class (Standard, Standard-IA, S3 Intelligent-Tiering, Glacier, Glacier Instant Retrieval, or Glacier Deep Archive) you want for your archived snapshots.
  • Define Retention: Specify how long snapshots should remain in S3 before being transitioned to deeper archival or deleted.

2. Automate Lifecycle Management

Once you’ve set your policy, the system automatically moves snapshots to the appropriate storage tier based on your requirements. This eliminates the need for manual configuration or maintaining complex lifecycle JSON files.

✅ Pro Tip: You can automatically add resources to your policies based on tags. This ensures that new instances or volumes are included in your backup and archival workflows without the need to manually update policies for every new resource. 

3. Easily Restore Archived Snapshots

When you need to restore data, N2W makes it simple. Archived snapshots can be retrieved with just a few clicks, no matter the storage class they reside in.

Why Choose N2W for Archiving?

  • Ease of Use: Manage all aspects of backup and archiving from a single, intuitive console.
  • Time Savings: Automate complex workflows, freeing up your team for higher-priority tasks.
  • Cost Optimization: Save up to 98% on storage costs by archiving snapshots to low-cost Amazon S3 tiers like Glacier Deep Archive. With the N2W AnySnap Archiver, you can also import and archive any existing AWS snapshots—even if they weren’t originally taken with N2W—for immediate cost savings and better cost management.
  • Compliance Assurance: Stay audit-ready with detailed logging and reporting features, ensuring your archival strategy aligns with regulatory requirements.

Looking for a smarter way to archive your snapshots? Try N2W for free and see how you can automate S3 archival and optimize storage costs with ease.
