Redshift, the AWS petabyte-scale data warehouse solution, is designed to deliver fast query performance through columnar storage, and it works with a wide range of SQL clients.
Redshift has two snapshot types, automated and manual, and both are stored in Amazon S3. In this article, I explain why you should use manual snapshots for your Redshift backups, and why you should automate them.
1. Longer Retention
Redshift’s automated backups are retained for up to 35 days and are then automatically deleted. Manual snapshots, on the other hand, can be taken at any point in time and kept for as long as needed; they are not deleted automatically. This can be crucial for complying with strict regulations, which generally require data to be kept for a number of years. Manual snapshots are also particularly useful when, for example, you load a large amount of data into your Redshift cluster and want to back it up immediately.
Keep in mind, however, that AWS limits the number of manual snapshots to 20 per account. If you have multiple databases, you might need a higher limit: either contact AWS to raise it, or carefully refine your backup policy to take snapshots at longer intervals.
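As an illustration of taking a snapshot right after a large load, here is a minimal sketch in Python. It assumes the caller passes in a boto3 Redshift client (for example, one created with boto3.client("redshift")); the function names and the naming scheme for snapshot identifiers are hypothetical choices made for this example, not something prescribed by AWS.

```python
from datetime import datetime, timezone


def snapshot_identifier(cluster_id, now=None):
    """Build a timestamped name such as 'sales-dw-manual-20240131-0930'.

    The '<cluster>-manual-<timestamp>' scheme is an assumption of this sketch.
    """
    now = now or datetime.now(timezone.utc)
    return f"{cluster_id}-manual-{now:%Y%m%d-%H%M}"


def take_manual_snapshot(client, cluster_id, retention_days=-1, now=None):
    """Request a manual snapshot of cluster_id and return its identifier.

    client is expected to be a boto3 Redshift client.
    retention_days=-1 asks Redshift to keep the snapshot indefinitely.
    """
    snap_id = snapshot_identifier(cluster_id, now)
    client.create_cluster_snapshot(
        SnapshotIdentifier=snap_id,
        ClusterIdentifier=cluster_id,
        ManualSnapshotRetentionPeriod=retention_days,
    )
    return snap_id
```

Calling take_manual_snapshot(client, "sales-dw") at the end of a load job would request a snapshot that is kept until you delete it. The call only starts the snapshot; it does not wait for it to complete.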
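One way to stay under a snapshot limit is to prune the oldest manual snapshots on a schedule. The sketch below shows the idea in Python, assuming a boto3 Redshift client is passed in. The function names and the "keep the newest N" policy are hypothetical, and the listing reads only the first page of results, which is enough for a small number of snapshots.

```python
def snapshots_to_prune(snapshots, keep):
    """Return identifiers of every snapshot older than the newest `keep`.

    snapshots is a list of dicts shaped like the entries boto3 returns,
    each with 'SnapshotIdentifier' and 'SnapshotCreateTime' keys.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    newest_first = sorted(
        snapshots, key=lambda s: s["SnapshotCreateTime"], reverse=True
    )
    return [s["SnapshotIdentifier"] for s in newest_first[keep:]]


def prune_manual_snapshots(client, cluster_id, keep):
    """Delete all but the newest `keep` manual snapshots of cluster_id."""
    # Only the first page of results is read here; a cluster with many
    # snapshots would need pagination, which this sketch omits.
    response = client.describe_cluster_snapshots(
        ClusterIdentifier=cluster_id, SnapshotType="manual"
    )
    doomed = snapshots_to_prune(response["Snapshots"], keep)
    for snap_id in doomed:
        client.delete_cluster_snapshot(SnapshotIdentifier=snap_id)
    return doomed
```

Keeping the selection logic in snapshots_to_prune, separate from the API calls, makes it easy to review which snapshots would be deleted before anything is actually removed.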
2. Surviving an Altered or Terminated Cluster
According to this Stack Overflow discussion, automated Redshift backups are taken every eight hours or after every 5 GB of inserted data, whichever happens first. If a database is altered between two automated backups, there is a good chance you won’t be able to restore its most recent changes, so you should make sure that a manual snapshot has been taken.
Redshift deletes automated snapshots when you delete the cluster, which means the database can no longer be restored from them. At the moment of deletion, you are prompted to take a final manual snapshot, but the significance of this step is easy to miss; if it is skipped, there will be no database backup. Manual Redshift snapshots, on the other hand, are stored separately, and are the only way to restore a deleted cluster.
In both cases, automating your manual snapshots lets you set your own backup policy, including high-frequency snapshots. Aside from giving you control over the policy, automation makes backups seamless and transparent, which increases confidence in Redshift operations and lets you restore to any point in time at which a snapshot was taken.
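When clusters are deleted from a script rather than the console, there is no prompt, so the final snapshot has to be requested explicitly. A minimal Python sketch follows, assuming a boto3 Redshift client is passed in; the wrapper function and its name are hypothetical, and it simply refuses to delete a cluster unless a final snapshot name is supplied.

```python
def delete_cluster_with_final_snapshot(client, cluster_id, final_snapshot_id):
    """Delete cluster_id only after requesting a final manual snapshot.

    client is expected to be a boto3 Redshift client.
    """
    if not final_snapshot_id:
        # Guard against silently deleting a cluster with no backup.
        raise ValueError("a final snapshot identifier is required")
    return client.delete_cluster(
        ClusterIdentifier=cluster_id,
        SkipFinalClusterSnapshot=False,
        FinalClusterSnapshotIdentifier=final_snapshot_id,
    )
```

Routing every scripted deletion through a wrapper like this one makes the final snapshot the default rather than something an operator has to remember.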
3. Cross-Region Backup
The retention period for automated snapshots copied to a destination region is set separately from the one in the cluster’s source region, so the two might differ. In addition, automated snapshots copied across regions are kept for only seven days by default. If you wish to maintain a comprehensive backup of your Redshift cluster in another region for high availability purposes (specifically for mission-critical production environments), manual snapshots are the better option.
AWS automated snapshots offer great capabilities that might suit your needs. The catch is that you never know when the next recovery will be needed, which is where manual snapshots come into play. Maintaining a comprehensive backup of your production environment and keeping that environment compliant can, and should, be done by automating manual snapshots. They give you the flexibility and control you need to support your own custom backup policy and to build robust, reliable cloud operations.
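To make the cross-region settings concrete, the following Python sketch enables snapshot copy to a second region with separate retention periods for automated and manual snapshots. It assumes a boto3 Redshift client created in the cluster's source region; the function name and default values are illustrative. KMS-encrypted clusters additionally need a snapshot copy grant, which this sketch leaves out.

```python
def enable_cross_region_copy(client, cluster_id, destination_region,
                             automated_days=7, manual_days=-1):
    """Copy the cluster's snapshots to destination_region.

    client is expected to be a boto3 Redshift client for the source region.
    automated_days: retention of copied automated snapshots (1 to 35 days).
    manual_days: retention of copied manual snapshots; -1 keeps them
    indefinitely.
    """
    if not 1 <= automated_days <= 35:
        raise ValueError("automated_days must be between 1 and 35")
    return client.enable_snapshot_copy(
        ClusterIdentifier=cluster_id,
        DestinationRegion=destination_region,
        RetentionPeriod=automated_days,
        ManualSnapshotRetentionPeriod=manual_days,
    )
```

With these defaults, copied automated snapshots expire after a week in the destination region, while copied manual snapshots stay there until they are deleted.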
How to Automate Your Redshift Manual Snapshots
Redshift snapshots can be taken and managed through the AWS console or APIs. In order to define your backup policy and to automate your Redshift snapshots accordingly, you can use an AWS backup solution such as Cloud Protection Manager (CPM).
CPM is a native cloud backup and recovery solution for Amazon EC2 instances, EBS volumes, RDS databases and Redshift clusters.