If your EC2 backup solution uses EBS snapshots, then besides working out how to automate and manage those snapshots, you will need a way to estimate the cost of storing them. Currently, the exact size of an EBS snapshot is not available. In part 1 of this two-part series, we will try to better understand how EBS snapshots work.
AWS documentation specifies that EBS snapshots are incremental. But what does that mean exactly? It means that the first snapshot you take of a volume contains all the data on it and is known as a full snapshot. Every subsequent snapshot stores only the changes that were made since the previous one. This is a very efficient method: nothing needs to be calculated when the snapshot starts, and only the minimum required data is copied. The AWS backup mechanism only needs to track the changes that were made to the EBS volume between snapshots (probably using a bitmap).
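To see why this matters for storage, here is a rough back-of-the-envelope comparison. The figures used (100 GiB of data on the volume, 2 GiB changed between snapshots, 7 snapshots) are made-up illustration values rather than AWS measurements, and the function name `stored_gib` is hypothetical.

```python
def stored_gib(data_gib, changed_gib_per_snapshot, snapshot_count, incremental=True):
    """Rough total storage for a chain of snapshots (illustrative model only)."""
    if snapshot_count == 0:
        return 0
    if incremental:
        # The first snapshot is full; each later one stores only the changes.
        return data_gib + changed_gib_per_snapshot * (snapshot_count - 1)
    # Naive alternative for comparison: every snapshot is a full copy.
    return data_gib * snapshot_count


print(stored_gib(100, 2, 7))                     # 112
print(stored_gib(100, 2, 7, incremental=False))  # 700
```

This simple model ignores the fact that changes are copied in whole blocks, so real incremental snapshots can be somewhat larger than the amount of data actually written.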
The AWS “incremental” snapshot backup is actually incremental at the block level. The changes monitored are not changes in files but rather changes at the disk level. The disk is divided into “blocks,” and every modified block (one in which a write happened) is marked to be copied at the next snapshot. Even if only part of a block has been changed, all of it will be copied. It is not so important to know which block size AWS uses, though it is probably somewhere between 64KB and 4MB (like most block-level snapshot solutions). Smaller blocks provide a more “accurate” backup, as only the minimal changes will be copied. If the block is small, however, the metadata needed to manage the blocks increases: there are more blocks per volume and the bitmaps are bigger, so there’s a trade-off. With large amounts of data, larger blocks are usually preferred.
The first full EBS snapshot is content-aware, which means that only the blocks that actually contain data are copied. If you have a 1TB EBS volume that is only half full, then the snapshot will only include that half, and that is what you’ll pay for. How does the AWS snapshot mechanism know if a volume’s blocks actually contain data? The answer is quite simple and is related to the bitmap we mentioned earlier. AWS’s snapshot mechanism keeps a bit (it doesn’t have to be exactly one bit, but a bit in concept) for each block on the volume.
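The block-size trade-off described above can be made concrete with a little arithmetic. The sketch below assumes exactly one bit of tracking metadata per block and tries three block sizes from the range guessed at earlier. The real AWS block size and metadata format are not known here, so the function names and values are purely illustrative.

```python
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB


def bitmap_bytes(volume_bytes, block_bytes):
    """Size of a one-bit-per-block bitmap (an assumed metadata format)."""
    blocks = (volume_bytes + block_bytes - 1) // block_bytes  # round up
    return (blocks + 7) // 8                                  # 8 bits per byte


def copied_bytes(write_offset, write_len, block_bytes):
    """Bytes copied at the next snapshot for one write: whole blocks only."""
    first = write_offset // block_bytes
    last = (write_offset + write_len - 1) // block_bytes
    return (last - first + 1) * block_bytes


for block in (64 * KIB, 512 * KIB, 4 * MIB):
    print(block // KIB, "KiB blocks:",
          bitmap_bytes(TIB, block) // KIB, "KiB bitmap,",
          copied_bytes(0, 4 * KIB, block) // KIB, "KiB copied for a 4 KiB write")
```

Under these assumptions, a 1 TiB volume needs a 2 MiB bitmap with 64 KiB blocks but only a 32 KiB bitmap with 4 MiB blocks, while a single 4 KiB write causes 64 KiB versus 4 MiB to be copied.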
Every time there’s a write request on an EBS volume, the corresponding bit or bits in the bitmap are turned on. Changes are tracked from the “birth” of the EBS volume, so every block that has ever been written on the volume will be included in the first full snapshot. After each snapshot is taken, a new bitmap is created that tracks all the writes made from that snapshot’s point in time onward.
In the following diagram, we can see the changes between snapshots. The first one is the full snapshot. The blue area represents data on the volume and is what will be copied to the snapshot. Subsequent snapshots will only copy the green areas, which are the areas that were modified. The green areas can represent newly added data or existing data that has been updated or changed.
In part 2, I will delve deeper into how to assess the size of snapshots and, from that, determine their cost.
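As a recap of the mechanism described in this part, the bitmap life cycle can be modeled in a few lines of Python. This is only a toy sketch. The class name `TrackedVolume`, the 1 KiB block size, and the list-of-booleans bitmap are assumptions made for illustration, not AWS's actual implementation.

```python
class TrackedVolume:
    """Toy model of a volume with per-block change tracking."""

    def __init__(self, size_bytes, block_bytes):
        self.block_bytes = block_bytes
        self.block_count = (size_bytes + block_bytes - 1) // block_bytes
        # One flag per block; all clear at the "birth" of the volume.
        self.dirty = [False] * self.block_count

    def write(self, offset, length):
        """Mark every block touched by a write, even if only partly changed."""
        first = offset // self.block_bytes
        last = (offset + length - 1) // self.block_bytes
        for i in range(first, last + 1):
            self.dirty[i] = True

    def snapshot(self):
        """Return the indexes of blocks to copy, then start a fresh bitmap."""
        to_copy = [i for i, flag in enumerate(self.dirty) if flag]
        self.dirty = [False] * self.block_count
        return to_copy


vol = TrackedVolume(size_bytes=16 * 1024, block_bytes=1024)  # 16 blocks
vol.write(0, 4096)            # initial data fills blocks 0-3
full = vol.snapshot()         # first (full) snapshot: only written blocks
vol.write(1500, 100)          # small update inside block 1
vol.write(4096, 2048)         # new data lands in blocks 4-5
incremental = vol.snapshot()  # second snapshot: only blocks changed since

print(full)         # [0, 1, 2, 3]
print(incremental)  # [1, 4, 5]
```

The first snapshot copies only the four blocks that were ever written, not all sixteen, which mirrors the content-aware behavior of the full snapshot. The second copies both the updated block and the newly written ones, which correspond to the modified areas in the diagram.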