AWS Disaster Recovery Scenarios: Part 2 – AWS Warm Standby and Multi-Site

In Part 1 of this article, we looked at two of the four AWS disaster recovery scenarios: backup and restore, and pilot light. In this second part, we’ll cover the other two: warm standby and multi-site. These two scenarios are generally the best choices for companies that need a much quicker recovery time or already run a dual-region setup, assuming cost is not their primary concern. We’ll also examine N2WS Backup and Recovery, a cloud native tool that can greatly simplify the AWS backup and recovery processes these disaster recovery scenarios require.

Scenario 3: AWS Warm Standby

Warm standby is similar to pilot light, since it makes use of another environment (typically in another region) running in reduced capacity that is ready to be quickly expanded in case of emergency. The main difference between the two scenarios lies in the size and complexity of the second environment. While a pilot light contains only scaled-down core systems (usually just the database and a few instances), a warm standby environment includes everything that your primary production environment uses. In addition, a warm standby environment is always running, so you will never have it completely scaled down to zero.

Like pilot light, AWS warm standby is an active-passive solution that offers fast recovery times. The preparation phase for warm standby is very similar to pilot light’s preparation phase. Databases need to be brought up and constantly replicated, and workload instances need to be provisioned from the latest AMI and kept up to date. All additional services need to be ready as well, so if you rely on data warehousing (Redshift), serverless (Lambda), machine learning (services like SageMaker and Forecast), or anything else, these services need to be ready to go at a moment’s notice. Running these services in a second environment can, of course, carry additional costs, especially if you work with a lot of data. Workload machines can easily be kept at low numbers and scaled up when needed, but maintaining a copy of all your data in another region will not only duplicate your storage expenses, it will also increase your data transfer costs.

In the warm standby scenario, recovery from an undesirable event is fairly simple. All you need to do is scale up (whether horizontally or vertically) and switch your DNS to point to your new environment, making it the active one. When everything goes back to normal, you might even consider keeping this new active environment as is by switching your original environment to be the passive one.
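The DNS switch described above can be sketched in Python. This is a minimal illustration, not the only way to fail over: it assumes the record is a CNAME managed in Route 53, and the host names, TTL, and function name are hypothetical. The function only builds the change request. In a real runbook, the resulting dictionary would be passed as the `ChangeBatch` argument to boto3’s `change_resource_record_sets` call after the standby environment has been scaled up.

```python
def build_dns_switch(record_name, standby_dns_name, ttl=60):
    """Build a Route 53 ChangeBatch that repoints record_name at the standby.

    record_name and standby_dns_name are hypothetical example values
    supplied by the caller; a short TTL keeps the cutover time low.
    """
    return {
        "Comment": "Fail over to warm standby environment",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise overwrites it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": standby_dns_name}],
                },
            }
        ],
    }


# Example: point the application host name at the standby load balancer.
batch = build_dns_switch("app.example.com.", "standby-lb.example.net.")
print(batch["Changes"][0]["ResourceRecordSet"]["ResourceRecords"])
```

Keeping the request construction separate from the API call makes the failover step easy to review and test ahead of a real incident.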

Warm standby has another use, which you might as well exploit since you are paying for it. You can justify the additional expense of this limited capacity environment by using it for various non-production workloads, like quality assurance and testing.

Scenario 4: AWS Multi-site

Among the four scenarios, multi-site is the odd one out. It is the only one that works as an active-active configuration. In the AWS multi-site disaster recovery scenario, a company runs a second, completely functional environment in a different AWS region. This requires a great deal more preparation and expenditure than warm standby, but when it comes to recovery speed, multi-site is unmatched. There is usually no downtime at all, and even very strict RTO and RPO requirements can typically be met.

Of course, all this comes with a hefty price, as you will be running a complete duplicate of your existing environment along with performing constant replication between the two. Not everyone is willing to pay for this. With smaller startups, each dollar counts, and large enterprises might be reluctant to increase their already high costs so drastically. However, there are companies with business requirements that dictate the use of the multi-site scenario, as they simply can’t afford not to have this kind of insurance.

The preparation phase for a multi-site scenario is where most of the work happens, and it can take as long to create this environment as it did to completely bring up and prepare the primary region. After you have successfully duplicated your environment in another region, you need to set up the necessary replication and ensure that everything is working properly. That way, when the secondary environment is needed, a quick switch can be made.

Amazon Route 53 (the AWS DNS service) is very helpful here, not only for switching workloads between the two active environments, but also for distributing load between them. Just as the warm standby environment is often used for QA, testing, or other internal processes, both environments in a multi-site configuration can serve production workloads at all times. This gives you the opportunity to use weighted routing, sending a set percentage of traffic to each region. When health checks are attached to the routing records, Route 53 stops sending traffic to an environment that fails its checks, so if a disaster takes one region down, the other takes over automatically. Of course, you can always opt for another routing method, such as failover routing, and switch from one environment to the other only when a disaster happens.
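To make the weighted routing behavior concrete, here is a small Python model of how traffic is split. It is a simplified sketch, not Route 53 itself: it assumes each region gets traffic in proportion to its weight among the regions currently passing health checks, and it does not model what the service does when no region is healthy. The region names and weights are example values.

```python
def traffic_share(weights, healthy):
    """Return the fraction of traffic each region receives.

    weights: dict mapping region name -> non-negative weight
    healthy: set of region names currently passing health checks
    """
    # Only healthy regions with a positive weight are eligible for traffic.
    active = {r: w for r, w in weights.items() if r in healthy and w > 0}
    total = sum(active.values())
    if total == 0:
        # Simplification: this sketch reports no traffic when nothing is eligible.
        return {r: 0.0 for r in weights}
    return {r: active.get(r, 0) / total for r in weights}


weights = {"us-east-1": 70, "eu-west-1": 30}

# Normal operation: both regions healthy, traffic split 70/30.
print(traffic_share(weights, {"us-east-1", "eu-west-1"}))

# Disaster in us-east-1: the surviving region absorbs all traffic.
print(traffic_share(weights, {"eu-west-1"}))
```

The second call shows why each region in a multi-site setup needs enough capacity, or fast enough scaling, to carry the full production load on its own.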

The multi-site scenario is by far the best and the safest one, but it does come at the highest possible cost.

N2WS Backup and Recovery for quick recovery and more

N2WS Backup and Recovery is a cloud native backup and recovery tool created specifically with AWS in mind, and it can help you with all four of these disaster recovery scenarios by providing a quick and easy way to centralize your company’s backup and recovery needs. N2WS Backup and Recovery was recently upgraded to version 3.0, and the new dashboard and UI make it easier than ever to navigate the platform and configure it to your requirements.

All four disaster recovery scenarios require provisioning new infrastructure. N2WS Backup and Recovery’s VPC Capture and Clone feature allows you to replicate your entire networking setup very quickly with just a few clicks. When a disaster occurs, there is no room for mistakes, and being able to confidently manage such an important piece of the process is incredibly valuable.

Of course, N2WS Backup and Recovery also allows you to back up various elements of your environment, such as EBS volumes, RDS databases, Aurora or Redshift clusters, and DynamoDB tables. On top of that, you can back up your EFS volumes without the need for custom scripts and manual processes. Storing your backups comes with a cost, in some cases a high one. N2WS Backup and Recovery offers a feature to store backups in S3, potentially reducing your backup storage spending by up to 98%. Our tool also allows you to easily recover this data to any region, or even to another AWS account if your original account is compromised (to avoid being the next Code Spaces). You can save even more by archiving backups to Glacier cold storage; just keep in mind the retrieval times associated with this option.

N2WS Backup and Recovery is less relevant for a multi-site scenario than it is for the other three scenarios described in this series. This is because multi-site involves running another active environment, so the restore process for it is slightly different. However, even in this scenario, you need to have backups in place, and N2WS Backup and Recovery can absolutely help with that. For the other three scenarios, the benefits of using our solution are obvious.

The four AWS Disaster Recovery scenarios and the N2WS option

In this two-part series, we examined the four AWS disaster recovery scenarios in depth, considering use cases, complexities, and costs. Each of these four scenarios has its purpose, and together they provide a well-rounded protection plan designed by AWS to meet the diverse needs of its clients. To determine which scenario is best for your company, you will need to look not only at your requirements, but also at your financials. As usual, the more performance and flexibility you want, the more you have to pay.

This article also introduced you to N2WS Backup and Recovery, a cloud native backup tool that can help you when working with these scenarios and can make your life easier when dealing with cloud backups in general. N2WS sets up complex backup schedules at the click of a button and customizes backups for each AWS account with a few checkboxes. It removes the risk of having a single employee or team depend on manual backups or scripts. It is easy to maintain, adds little overhead, and runs without constant attention. A free 30-day trial of the tool is available.
