The tradeoff between I/O performance and recovery time
Today, more than ever, users need instant access to their applications and data. For that to happen, applications must be able to retrieve data from storage blocks quickly. For Amazon Elastic Block Store (Amazon EBS), the biggest lag occurs when first accessing storage blocks on a volume created from an EBS snapshot. According to AWS, accessing a block on a Provisioned IOPS (SSD) EBS volume for the first time can cost up to 50% of IOPS, which can be detrimental for databases that require fast, consistent access. One way to overcome this performance hit is to pre-warm the EBS volume.
EBS Snapshot Restoration
An EBS snapshot is an incremental backup that only saves the blocks that have changed since the previous snapshot. The snapshot restoration process is designed in such a way that each snapshot can restore the entire volume.
To restore a volume from a snapshot, you need little more than the snapshot ID (plus the Availability Zone in which to create the volume), and the newly created volume loads in the background. The EBS restoration process assumes the application can start read operations before the entire volume is fully recovered. If a data block required by the application hasn’t loaded yet, EBS fetches that block immediately, ahead of the background load. The advantage of this process is that data that is not regularly accessed, or not expected to be accessed for a long period of time, does not hold up the restore and is loaded at a later point in time. This results in an efficient and fast recovery process, especially for volumes with large amounts of data.
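As a rough illustration of the restore step described above, here is a minimal Python sketch. It assumes a boto3-style EC2 client is passed in; the function name restore_volume and the example snapshot ID and Availability Zone are hypothetical placeholders, not values from this article.

```python
def restore_volume(ec2_client, snapshot_id, availability_zone):
    """Create a volume from a snapshot and wait until it is available.

    `ec2_client` is assumed to behave like boto3.client("ec2").
    The volume is usable as soon as it is available; its blocks keep
    loading from the snapshot in the background.
    """
    response = ec2_client.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    volume_id = response["VolumeId"]

    # Block until the volume reaches the "available" state.
    ec2_client.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    return volume_id


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   volume_id = restore_volume(boto3.client("ec2"), "snap-EXAMPLE", "us-east-1a")
```

Passing the client in as a parameter keeps the sketch independent of how credentials and regions are configured.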
However, as mentioned above, in rare cases the higher first-access latency of a “cold” EBS volume directly slows database operations and harms the end-user experience. In these rare cases, EBS pre-warming is necessary.
Pre-Warming from an EBS Snapshot
EBS pre-warming is separate from standard snapshot restoration. It involves touching every block in a volume before the application instances use it, in order to avoid any chance of a performance hit: writing to all blocks of a new, blank volume, or reading all blocks of a volume restored from a snapshot.
For a new EBS volume attached to a Linux instance, use the dd command to write to all blocks. The instance issues a write to every block of the volume, and once the command finishes, the volume can deliver its desired IOPS to workloads. For a volume restored from a snapshot, use dd to read all blocks instead, since overwriting them would destroy the restored data.
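To make the idea concrete, the following Python sketch does the read-only variant of pre-warming: it reads a device or file from start to end and discards the data, roughly what dd does when reading a device into /dev/null. The function name read_all_blocks, the 1 MiB chunk size, and the device path in the usage comment are assumptions for illustration.

```python
def read_all_blocks(path, chunk_size=1024 * 1024):
    """Sequentially read every byte of `path` and return the total read.

    Reading touches each block once, which is what pre-warming a
    snapshot-restored volume requires. The data itself is discarded.
    """
    total = 0
    # buffering=0 gives unbuffered binary reads of up to chunk_size bytes.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty read means end of device/file
                break
            total += len(chunk)
    return total


# Example usage (hypothetical device name; typically requires root):
#   read_all_blocks("/dev/xvdf")
```

This sketch only reads, so it is safe for restored volumes; it does not cover the write-based pre-warming of blank volumes.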
The key concept of this process is to prevent latency at the cost of a longer wait period, which depends on the amount of data in the snapshot. For instance, pre-warming a large database of multiple terabytes could take hours to complete. While you can run pre-warming in parallel with normal snapshot recovery, it is not common practice. To estimate how long this may take, consider a 1 TB database on a 1 Gb network, read at 50 MB/sec: 1 TB = ~1,050,000 MB; 1,050,000 MB / 50 MB/sec = ~21,000 secs = ~6 hrs.
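The estimate above can be reproduced for other volume sizes and read rates with a small calculation. This is a back-of-the-envelope sketch; the function name prewarm_seconds is hypothetical, and it assumes a constant read rate with 1 GiB counted as 1,024 MB.

```python
def prewarm_seconds(volume_gib, throughput_mb_per_s):
    """Estimate the time to read a whole volume at a constant rate."""
    volume_mb = volume_gib * 1024  # 1 GiB treated as 1,024 MB
    return volume_mb / throughput_mb_per_s


# The article's example: a 1 TB (1,024 GiB) volume read at 50 MB/sec.
seconds = prewarm_seconds(1024, 50)
hours = seconds / 3600
print(f"{seconds:,.0f} s = {hours:.1f} h")  # prints "20,972 s = 5.8 h"
```

Real throughput varies during the read, so the result is best treated as a lower bound on the wait.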
To be clear, the 6-hour figure calculated in the example above is downtime. As a result, pre-warming should only be used in very special cases. In most cases, applications benefit far more from immediate snapshot restoration, and in practice, restoring an application that holds multiple terabytes of data in AWS should take a few minutes. This is a great advantage when restoring a whole application stack in the cloud, compared with waiting for hours with the pre-warming method.