Amazon RDS, released in 2009, offered great promise for developers using MySQL. For those running and managing instances within the AWS cloud, its database availability and consistency support have been highly beneficial features. Today it is also compatible with Oracle, Microsoft SQL Server, PostgreSQL and MariaDB.
Then along came Amazon Aurora. Aurora, a proprietary database service created by AWS that provides higher levels of performance and scalability, joined the relational database portfolio in 2014 at AWS re:Invent. According to AWS SVP Andy Jassy, Aurora has the capabilities of “…proprietary database engines at one tenth of the cost.” Compatible with MySQL, Aurora aims to be an enterprise-class database solution.
Key features of Amazon Aurora
According to AWS, Aurora is not only cheaper to run than other large-scale commercial databases, but it is also much faster than the popular open source database, MySQL. The service improves on MySQL's scalability by provisioning storage automatically as you go, which is a major advantage in a world where databases are still a main cause of performance bottlenecks.
Scalability: Go Big Anytime
According to Amazon, Aurora is up to five times (5x) faster than a standard MySQL deployment, making it ideal for large amounts of data and environments with high performance requirements.
You can start with 10 GB of provisioned storage, and as you reach the capacity limit it will automatically increase in 10 GB increments, scaling all the way up to the size of a very large database with tens of TBs. In general, a database cluster architecture can support an “active/active” configuration, where it is possible to have more than one writer.
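To make the storage growth described above concrete, here is a minimal sketch of the arithmetic. The function name and the rounding rule (allocation is the used space rounded up to the next 10 GB step) are assumptions made for illustration, not a description of how Aurora accounts for storage internally.

```python
import math

INITIAL_GB = 10    # starting allocation mentioned in the text
INCREMENT_GB = 10  # growth step mentioned in the text


def allocated_storage_gb(used_gb: float) -> int:
    """Estimate allocated storage for a given amount of stored data.

    Simplified model (an assumption): allocation is the used space
    rounded up to the next 10 GB step, never below the 10 GB start.
    """
    if used_gb < 0:
        raise ValueError("used_gb must be non-negative")
    steps = math.ceil(used_gb / INCREMENT_GB)
    return max(INITIAL_GB, steps * INCREMENT_GB)
```

Under this model, a database holding 95 GB of data would sit on 100 GB of allocated storage, with no manual resize step along the way.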
Limitations and challenges
Although an active/active architecture allows for higher levels of scalability, it also produces challenges in terms of coordination and synchronization. The more classic architecture, and the one Aurora uses, is what we call “passive/active”, where only one entity at a time can write to the storage.
You can scale out the Aurora DB cluster by adding readers (Aurora Replicas, up to 15 per cluster) as required, and read performance scales with them. In terms of writing, however, Aurora is limited to just one machine (the primary instance), and in that sense it is similar to RDS, as both require the provisioning of a specific instance for that purpose. You can always scale up your instance size to try to keep up with write demand.
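The single-writer, many-readers split described above can be illustrated with a toy router. This is only a sketch under stated assumptions: the class name, the endpoint strings, and the "statement starts with SELECT or SHOW" rule are all hypothetical, and a real application would more likely rely on the cluster and reader endpoints that Aurora exposes.

```python
from itertools import cycle


class ClusterRouter:
    """Toy router: every write goes to one writer, reads rotate over readers."""

    def __init__(self, writer: str, readers: list[str]):
        self.writer = writer
        # With no readers, the writer serves reads as well.
        self._readers = cycle(readers or [writer])

    def endpoint_for(self, sql: str) -> str:
        # Naive classification (an assumption): only SELECT/SHOW count as reads.
        is_read = sql.lstrip().lower().startswith(("select", "show"))
        return next(self._readers) if is_read else self.writer


router = ClusterRouter(
    writer="writer.example.internal",  # hypothetical endpoint names
    readers=["reader-1.example.internal", "reader-2.example.internal"],
)
```

Adding a reader to the list spreads the read traffic further, while every `INSERT` or `UPDATE` still lands on the same writer, which is why write throughput depends on the size of that one instance.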
Aurora vs SAN data centers
In terms of architecture, as we already mentioned, Aurora uses the classic DB cluster architecture which is typically used in large, multiple database environments. A key principle is its single central storage for the database. Because the storage the database employs is different from AWS EBS disks, it can scale dynamically.
AWS has developed a special storage backend for Aurora, which enables durability and replication across Availability Zones (AZs). In comparison, traditional SAN datacenters store all of the databases on a disk, or on logical disks held in a large storage array that can be logically connected to different servers.
Fault-tolerant by design for high availability
An Aurora DB cluster is fault tolerant by design. The cluster volume spans multiple Availability Zones in a single region, and each Availability Zone contains a copy of the cluster volume data. This functionality means that your DB cluster can tolerate the failure of an Availability Zone without any loss of data and with only a brief interruption of service.
As mentioned, in an Aurora cluster there is a single writer instance and multiple readers that read from the disk. If an error occurs and the writer fails or crashes, a simple automatic failover process will take one of the readers and assign it a new role as a writer.
Because all instances are attached to the same storage within the same network, no data needs to be copied to another location during recovery, which keeps recovery downtime minimal and makes the cluster highly available. In addition, for workloads with a lot of reads and queries, having many readers enables higher performance levels, since queries can be executed concurrently on different machines.
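The failover step described above can be modeled in a few lines. This is a simplified sketch, not Aurora's actual selection logic: the `Cluster` class is hypothetical, and promoting the first reader in the list is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster with one writer and any number of readers."""

    writer: str
    readers: list[str] = field(default_factory=list)

    def fail_over(self) -> str:
        """Promote a reader to writer after the current writer fails."""
        if not self.readers:
            raise RuntimeError("no reader available to promote")
        # No data copy is modeled here: every instance shares the same
        # cluster volume, so promotion is only a change of role.
        self.writer = self.readers.pop(0)
        return self.writer


cluster = Cluster(writer="db-1", readers=["db-2", "db-3"])
```

Calling `cluster.fail_over()` turns `db-2` into the writer and leaves `db-3` as the remaining reader, with the stored data untouched.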
Amazon Aurora vs RDS
Regular Amazon RDS deploys what we call a “DB instance”, a DB server that needs to be provisioned in advance by specifying the instance type and size of storage. Snapshots can be used to migrate to a larger scale, although this process is not a seamless autoscale.
You can have a Multi-AZ deployment, but since RDS needs to perform DB-level replication, it is less efficient than the Aurora cluster option. This limitation is one of the key reasons why Aurora is more efficient and scalable than RDS, and therefore a preferable option.
Any use case where you have a lot of queries (BI, for example) is a good use case for Amazon Aurora, since you have multiple data sources and data points, and many queries being performed in parallel. In such cases, you can utilize multiple readers, which relieves read bottlenecks.
Aurora Updates: Backtrack and GovCloud
We’ve all been in situations in which we wished there was an ‘Undo’ button to fix something we accidentally broke. Amazon Aurora now has this feature, called Backtrack, which allows you to go back to a certain point in time without restoring data from a backup. This functionality can be enabled for all newly-deployed MySQL-compatible Aurora database clusters and MySQL-compatible clusters restored from a backup.
Amazon also recently announced that customers who use GovCloud to back up sensitive data and to meet compliance needs can now launch an Aurora instance within the GovCloud region.
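As a rough illustration of the ‘Undo’ idea, the sketch below rewinds a cluster by a number of minutes. It assumes `rds_client` is a boto3 RDS client (for example `boto3.client("rds")`) and that Backtrack was enabled when the cluster was created; the helper name and the cluster identifier are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def backtrack_minutes(rds_client, cluster_id: str, minutes: int) -> datetime:
    """Rewind an Aurora MySQL cluster by `minutes`; return the target time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    target = datetime.now(timezone.utc) - timedelta(minutes=minutes)
    # backtrack_db_cluster is the boto3 RDS call for this feature; it only
    # succeeds on clusters that have a backtrack window configured.
    rds_client.backtrack_db_cluster(
        DBClusterIdentifier=cluster_id,
        BacktrackTo=target,
    )
    return target
```

A call such as `backtrack_minutes(rds, "demo-cluster", 15)` would ask Aurora to return the cluster to its state of 15 minutes ago, without creating a new cluster from a backup.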
The Importance of Automating your Aurora Backup
In terms of functionality, Aurora is formally part of the AWS Relational Database Service (RDS). Aurora supports almost all backup functionalities that are available with RDS, such as point-in-time recovery and automatic backup. It also supports manual snapshots; however, the snapshot mechanism operates slightly differently on Aurora.
Instead of a regular disk snapshot, as with RDS, the snapshot is taken of the backend storage. While not a huge difference by any means, you will notice that a few extra steps are needed in order to recover a fully operating cluster from a snapshot. Therefore, it is recommended to automate your Aurora recovery processes.
When an Aurora DB cluster is created from a snapshot, only the backend database will be created, meaning that additional operations will be required to recover the readers and writer. Therefore, you have a multi-step process, rather than the single-step process that is possible with RDS.
✅ TIP: If you are carrying out this process through the console, or via an automation tool such as N2WS that already provides this functionality, then you don’t need to worry about this issue, as recovery is just a click away.
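For those scripting it themselves, the multi-step recovery described above might look roughly like the sketch below. It assumes `rds_client` is a boto3 RDS client (for example `boto3.client("rds")`); the helper name, the instance naming scheme, and the default engine and instance class are illustrative assumptions, and a production script would also wait for each resource to become available.

```python
def restore_cluster(rds_client, snapshot_id, cluster_id,
                    reader_count=1,
                    engine="aurora-mysql",           # assumed engine
                    instance_class="db.r5.large"):   # assumed instance size
    """Restore an Aurora cluster from a snapshot, then add its instances."""
    # Step 1: recreate the cluster and its storage. No DB instances exist yet.
    rds_client.restore_db_cluster_from_snapshot(
        DBClusterIdentifier=cluster_id,
        SnapshotIdentifier=snapshot_id,
        Engine=engine,
    )
    # Step 2: add one instance to act as the writer plus `reader_count`
    # more as readers (hypothetical naming: <cluster>-0, <cluster>-1, ...).
    instance_ids = [f"{cluster_id}-{i}" for i in range(reader_count + 1)]
    for instance_id in instance_ids:
        rds_client.create_db_instance(
            DBInstanceIdentifier=instance_id,
            DBInstanceClass=instance_class,
            Engine=engine,
            DBClusterIdentifier=cluster_id,
        )
    return instance_ids
```

The point of the sketch is the shape of the process: one call brings back the storage, and separate calls bring back the writer and readers, which is exactly the part worth automating.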
Avoiding Vendor Lock-in
When migrating data to the cloud, there is always the vendor lock-in consideration. Even though Aurora claims to be 100% compatible with MySQL, there are no guarantees that it will stay this way forever. Enterprises on Amazon that are looking to move their Oracle databases, for example, and wish to leverage the benefits of a managed Database-as-a-Service (DBaaS), may find that Aurora is a valuable solution for them. AWS provides a variety of migration tools to help implement the switchover.
N2WS supports Disaster Recovery for Amazon Aurora
The good news is you can start protecting your cloud deployment properly, with full cross-region and cross-account disaster recovery now available for Amazon Aurora clusters. We’re extremely excited about supporting Amazon Aurora because a full backup and recovery that might traditionally take about 2 hours can now be done in about 2 minutes. Start your free trial today to implement an automated, robust, scalable, enterprise-class cloud backup and recovery solution.