If your EC2 backup solution uses EBS snapshots, then besides automating and managing snapshots, you will need a way to estimate the cost of storing them. Currently, AWS does not expose the exact size of EBS snapshots. In part 1 of this two-part series, we will try to better understand how EBS snapshots work. This blog is part of a series of posts on AWS backup.
Incremental Backup
AWS documentation specifies that EBS snapshots are incremental. But what does that mean exactly? It means that the first snapshot you take of a volume contains all of its data and is known as a full snapshot. Every subsequent snapshot stores only the changes made since the previous one. This is a very efficient method: nothing needs to be calculated when the snapshot starts, and only the minimum required data is copied. The AWS backup mechanism only needs to track the changes made to EBS volumes between snapshots (probably using a bitmap).
Block-level
The AWS “incremental” snapshot backup actually works at the block level. The changes tracked are not changes to files but changes at the disk level. The disk is divided into “blocks”, and every modified block (a block in which a write occurred) is marked to be copied at the next snapshot. Even if only part of a block has changed, all of it will be copied. It is not so important to know what block size AWS uses, though it is probably somewhere between 64KB and 4MB (like most block-level snapshot solutions). Smaller blocks provide a more “accurate” backup, as only the minimal changes are copied. If the block is small, however, the metadata needed to manage it grows.
There are more blocks per volume and the bitmaps are bigger, so there’s a trade-off. With large amounts of data, larger blocks are usually preferred. The first full EBS snapshot is content-aware, which means that only the blocks that actually contain data are copied. If you have a 1TB EBS volume that is only half full, then the snapshot will only include that half, and that is what you’ll pay for. How does the AWS snapshot mechanism know whether a volume’s blocks actually contain data? The answer is quite simple and is related to the bitmap we mentioned earlier. AWS’s snapshot mechanism keeps a bit (it doesn’t have to be exactly one bit, but a bit in concept) for each block on the volume.
Every time there is a write request on an EBS volume, the corresponding bit(s) in the bitmap are turned on. Changes are tracked from the “birth” of the EBS volume, so every block that has ever been written on the volume will be included in the first full snapshot. After each snapshot is taken, a new bitmap is created that tracks all writes from the point in time of the previous snapshot. In the following diagram, we can see the changes between snapshots.
The first one is the full snapshot. The blue area represents data on the volume and is what will be copied to the snapshot. Subsequent snapshots will only copy the green areas, which are the areas that were modified. The green areas can represent newly added data or existing data that has been updated or changed.
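To make the mechanism concrete, here is a minimal sketch of the dirty-block bitmap idea described above. This is an illustration, not AWS’s actual implementation, and the 1MB block size is a hypothetical value (AWS does not publish the real one).

```python
# Conceptual sketch of dirty-block tracking, not AWS's actual code.
# BLOCK_SIZE is a hypothetical value chosen for illustration only.

BLOCK_SIZE = 1024 * 1024  # 1MB, an assumption

class DirtyBlockTracker:
    def __init__(self, volume_size):
        self.num_blocks = volume_size // BLOCK_SIZE
        self.bitmap = [False] * self.num_blocks  # one "bit" per block

    def on_write(self, offset, length):
        # Mark every block touched by the write, even partially.
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for block in range(first, last + 1):
            self.bitmap[block] = True

    def take_snapshot(self):
        # Copy only the marked blocks, then start a fresh bitmap that
        # tracks changes from this snapshot's point in time onward.
        dirty = [i for i, is_dirty in enumerate(self.bitmap) if is_dirty]
        self.bitmap = [False] * self.num_blocks
        return dirty

tracker = DirtyBlockTracker(volume_size=10 * BLOCK_SIZE)
tracker.on_write(offset=BLOCK_SIZE // 2, length=BLOCK_SIZE)  # straddles 2 blocks
print(tracker.take_snapshot())  # -> [0, 1]: both blocks copied in full
```

Note how a 1MB write that straddles a block boundary marks two blocks, so 2MB gets copied at the next snapshot. That is the partial-block overhead mentioned above.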
Rough estimation
In order to estimate how large your EBS snapshots will be, you need to know how much your volumes are changing. One way is to guesstimate, using a simple rule of thumb often used in backup planning: a typical data volume on a production server changes about 3% a day. Let’s try to calculate the cost. Assume a 1TB EBS volume that is 70% full at first, and that we take daily snapshots and keep them for 30 days. The first full snapshot will take 700GB (70% of 1TB). For the incremental snapshots, we multiply 30 (days) by 30GB (3% of 1TB) and reach 900GB. Add them together and we get about 1.6TB of total snapshot storage. AWS compresses snapshots when they are stored in S3, and it is hard to estimate how much the data will be reduced by compression.
If compression is zip-like and the data on the EBS volume consists mostly of text files, it will compress very well. On the other hand, if the data on the volume is already compressed (e.g. a compressed file system, media files), it will hardly compress at all. You can decide not to factor compression into your calculation, or assume roughly a 2:1 ratio. The cloud cost of storing 1GB of EBS snapshot data is $0.095/month (Virginia region, February 2013). For 1600GB the price will be $152/month. If we assume compression is effective, it will be half: $76/month. Accurate? No. Something we can work with? Maybe…
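Putting that rule of thumb into code, here is a small back-of-the-envelope calculator. The 3% daily change rate, 2:1 compression ratio, and $0.095/GB-month price are the assumptions from the text above, not constants you should rely on.

```python
# Back-of-the-envelope snapshot cost estimate using the assumptions above.

def estimate_monthly_cost(volume_gb, used_fraction, daily_change_rate,
                          retention_days, price_per_gb_month,
                          compression_ratio=1.0):
    full_gb = volume_gb * used_fraction                      # first full snapshot
    incremental_gb = volume_gb * daily_change_rate * retention_days
    total_gb = (full_gb + incremental_gb) / compression_ratio
    return total_gb, total_gb * price_per_gb_month

# 1TB volume, 70% full, 3% daily change, 30-day retention, Feb 2013 pricing
for ratio in (1.0, 2.0):
    gb, cost = estimate_monthly_cost(1000, 0.70, 0.03, 30, 0.095, ratio)
    print(f"{ratio:.0f}:1 compression -> {gb:.0f}GB, ${cost:.0f}/month")
```

Running this reproduces the numbers above: 1600GB ($152/month) with no compression, 800GB ($76/month) at 2:1.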
A more accurate method
For a more accurate EBS snapshot cost estimate, you need a more accurate way of knowing how much your EBS volumes are changing. To do this, you can sample them: install software that monitors your disk changes and reports them to you. Take a large enough sample at typical times, and you can get a very good idea of how much any specific EBS volume is changing. On Windows instances you can use the built-in Performance Monitor (simply type run > perfmon). `Perfmon` can give you the average number of bytes written per second; just add the logical-disk related counters.
Another tool is Disk Monitor (originally written by Sysinternals), which you can download from Microsoft’s site; it monitors writes to disk and creates a log file that can later be imported into a spreadsheet. On Linux instances, you can use iotop, a command-line, Python-based open source tool.
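On Linux you don’t strictly need extra software; a rough write-rate sample can be taken straight from `/proc/diskstats`. The sketch below is an approximation under stated assumptions: it counts total bytes written, which over-counts the unique block changes a snapshot would copy when the same blocks are rewritten, and the device name (`xvda` here) varies by instance type.

```python
# Rough write-rate sampler for Linux, reading /proc/diskstats.
# After the major/minor/name fields, the 7th stat field is sectors
# written; a sector is 512 bytes. This is an upper bound on the unique
# block changes an incremental snapshot would actually have to copy.

import time

def sectors_written(device):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[9])  # sectors written
    raise ValueError(f"device {device!r} not found")

def sample_write_rate(device, interval_seconds=60):
    before = sectors_written(device)
    time.sleep(interval_seconds)
    after = sectors_written(device)
    return (after - before) * 512 / interval_seconds  # bytes per second

rate = sample_write_rate("xvda", interval_seconds=60)  # device name varies
print(f"~{rate * 86400 / 1024**3:.2f} GB written per day at this rate")
```

Sample at several typical times of day (and during batch jobs) before trusting the daily extrapolation.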
Write patterns and how they affect snapshot size
Write IO patterns affect the amount of space your snapshots will take. Let’s take an example: an EBS volume with 1GB of data, where every day there is a 1GB change on the volume. The first full snapshot will take 1GB of snapshot storage space, and every daily incremental will also take 1GB. Now let’s assume we keep snapshots for 10 days and delete any older ones.
If every daily 1GB is written to new, unused blocks on the volume (e.g. new static files are written and older ones don’t change), then your snapshot data will grow by 1GB every day forever (or until the EBS volume is full).
Deleting old snapshots won’t matter, because the blocks they contain are still referenced by newer snapshots and must be kept. So after 10 days you will have 10GB of snapshot data, and after 100 days, 100GB. Now let’s assume the other extreme: there is only 1GB of occupied space on the EBS volume, and every day that same 1GB is overwritten (a bit like a database file that changes a lot but doesn’t necessarily grow).
In this case, you will have 10GB of snapshot data after 10 days, but after 100 days you will still have only 10GB, because when older snapshots are deleted, the block versions that only they referenced are freed.
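These two extremes are easy to check with a toy model. The sketch below treats a snapshot as a set of (block, version) pairs, where a block version is stored once and shared by every snapshot that references it; storage is the count of distinct pairs across retained snapshots. This is an illustration of the reference-counting idea, not AWS’s actual implementation.

```python
# Toy model of incremental snapshot storage. One "block" == 1GB here,
# purely for readability; the unit is arbitrary.

def simulate(days, retention_days, write_pattern):
    versions = {}   # block index -> current version (bumped on each write)
    snapshots = []  # each snapshot is a frozenset of (block, version) pairs
    for day in range(days):
        for block in write_pattern(day):
            versions[block] = versions.get(block, 0) + 1
        snapshots.append(frozenset(versions.items()))
        snapshots = snapshots[-retention_days:]      # delete old snapshots
    referenced = set().union(*snapshots)             # shared block versions
    return len(referenced)                           # GB of snapshot storage

append_only = lambda day: [day]   # each day writes a brand-new 1GB block
overwrite   = lambda day: [0]     # each day rewrites the same 1GB block

for days in (10, 100):
    print(f"{days} days: "
          f"{simulate(days, 10, append_only)}GB (append-only) vs "
          f"{simulate(days, 10, overwrite)}GB (overwrite)")
```

This prints 10GB vs 10GB after 10 days, and 100GB vs 10GB after 100 days, matching the two extremes described above.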
The number of snapshots doesn’t necessarily matter
We keep talking about a daily change. How does the frequency of snapshot-taking fit into that? Well, that depends. You can take one snapshot a day or take six. If blocks aren’t written and then rewritten within the same day, it doesn’t matter: one or six snapshots will use the same amount of storage space, and therefore will cost the same.
This is a very significant conclusion when configuring your EBS volume backup solution: you can actually take snapshots at a higher resolution without increasing the cost, giving you a better RPO (Recovery Point Objective). In reality, things will probably not be that “clean,” but in a typical application most data will not be rewritten all the time, and in most cases you will be able to take more frequent snapshots without affecting your AWS bill by much.
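The same toy model confirms this. Under the assumption that no block is rewritten within the day, 6GB of daily change costs the same whether captured as one daily snapshot or six smaller ones:

```python
# Same toy model as above: 6GB of new blocks per day, captured either as
# one daily snapshot or as six snapshots per day (1GB of new blocks each).
# Assumes no block is rewritten within the day.

def run(periods, retention_periods, new_blocks_per_period):
    versions, snapshots, next_block = {}, [], 0
    for _ in range(periods):
        for _ in range(new_blocks_per_period):
            versions[next_block] = 1
            next_block += 1
        snapshots.append(frozenset(versions.items()))
        snapshots = snapshots[-retention_periods:]
    return len(set().union(*snapshots))  # GB of snapshot storage

print("1 snapshot/day: ", run(10, 10, 6), "GB")   # 10 days x 6GB/day
print("6 snapshots/day:", run(60, 60, 1), "GB")   # same writes, same 10 days
```

Both print 60GB: the extra snapshots only add references to blocks that are stored once anyway.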
Tips for controlling snapshot costs
- Tag snapshots for cost tracking: Implement strict tagging for each snapshot based on environment, team, or project. This helps track snapshot costs per department using AWS Cost Explorer, avoiding surprises in billing (a minimal tagging sketch follows this list).
- Use gp3 volumes to reduce IOPS costs: If you require higher IOPS for your volumes but want to keep snapshot costs lower, consider migrating to gp3 volumes, which provide better cost control for both IOPS and throughput while maintaining snapshot compatibility.
- Monitor snapshot storage metrics: Set up custom CloudWatch metrics to track snapshot creation, retention, and usage patterns. These insights can reveal inefficiencies or help detect unusual write behavior leading to higher costs.
- Utilize compression in the application layer: If your application involves highly compressible data, applying compression before writing to the volume can reduce the size of the snapshots, as EBS captures compressed data without further reducing it.
- Periodically review and adjust RPO: Reassess your Recovery Point Objective (RPO) over time. As workloads evolve, you may find you can reduce snapshot frequency, cutting costs without compromising the integrity of your backup strategy.
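As a starting point for the tagging tip above, here is a minimal boto3 sketch that creates a snapshot with cost-allocation tags. The volume ID, region, and tag keys/values are placeholders for your own conventions, not real resources.

```python
# Minimal sketch: create an EBS snapshot tagged for cost tracking (boto3).
# Volume ID and tag values below are placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",        # placeholder volume ID
    Description="Daily backup",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [
            {"Key": "Environment", "Value": "production"},
            {"Key": "Team", "Value": "platform"},
            {"Key": "Project", "Value": "billing-demo"},
        ],
    }],
)
print("snapshot id:", response["SnapshotId"])
```

Once tags like these are activated as cost allocation tags, Cost Explorer can break snapshot storage costs down per environment, team, or project.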
Concluding our EBS snapshot pricing series
Currently, you can only estimate how much S3 storage space your snapshots take. To help plan your budget and use your EC2 backup solution more effectively, you can estimate the amount and pattern of changes on your EBS volumes, either by making assumptions or by sampling them.