How does N2WS Backup & Recovery enable application-consistent backups for AWS-hosted PostgreSQL databases?

N2WS Backup & Recovery enables application-consistent backups for PostgreSQL databases hosted on AWS EC2 by allowing users to run custom scripts before and after EBS snapshots. The 'before' script can execute the 'pg_start_backup' command to mark the start of a backup, and the 'after' script can run 'pg_stop_backup' to finalize the backup process. This ensures that the backup captures a consistent state of the database, even during active transactions. Note: PostgreSQL does not support freezing I/O or locking the database for backup, so consistency relies on these scripts and proper configuration. Detailed limitations are not publicly documented; contact N2WS sales for specifics.

What types of PostgreSQL backups are supported, and how does N2WS fit into these strategies?

PostgreSQL supports two main backup types: physical (file system level and continuous archiving with Point-in-Time Recovery) and logical (using pg_dump or pg_dumpall). N2WS fits into the physical backup strategy by automating EBS snapshots at the block level, which are faster than traditional file system backups. N2WS also enables application-consistent snapshots by integrating with PostgreSQL's backup commands via pre- and post-snapshot scripts. Note: Logical backups (pg_dump) are not directly managed by N2WS; users must handle those separately.

Can N2WS automate backup and recovery for AWS EC2-hosted PostgreSQL databases?

Yes, N2WS automates backup and recovery for AWS EC2-hosted PostgreSQL databases by scheduling EBS snapshots and allowing the execution of custom scripts to ensure application consistency. Users can configure 'before', 'after', and 'complete' scripts to manage PostgreSQL backup commands and post-processing tasks, such as deleting temporary files. Note: Manual configuration of scripts and permissions is required for full automation.

What are application-consistent snapshots, and why are they important for PostgreSQL backups?

Application-consistent snapshots ensure that all in-memory and pending I/O operations are properly flushed to disk before a backup is taken, resulting in a consistent database state. For PostgreSQL, this is achieved by running 'pg_start_backup' before and 'pg_stop_backup' after the snapshot. N2WS facilitates this process by allowing these scripts to be executed automatically as part of the backup policy. Note: Without application-consistent snapshots, backups may be crash-consistent only, potentially leading to data inconsistencies if transactions are in progress.

Does N2WS support backup and recovery for both AWS RDS and EC2-hosted PostgreSQL databases?

N2WS primarily supports backup and recovery for EC2-hosted PostgreSQL databases using EBS snapshots and custom scripts for application consistency. While AWS RDS provides its own managed backup features, N2WS is designed for environments where users manage their own PostgreSQL instances on EC2. Note: Direct integration with AWS RDS PostgreSQL backup is not documented; users should verify compatibility for managed RDS environments.

What scripting capabilities does N2WS provide for PostgreSQL backup automation?

N2WS allows users to configure 'before', 'after', and 'complete' scripts as part of backup policies. For PostgreSQL, these scripts can execute commands such as 'pg_start_backup' and 'pg_stop_backup' on the database server via SSH. The output of these scripts is logged and can be reviewed in the N2WS recovery panel. Note: Proper permissions and SSH access must be configured for scripts to execute successfully.

How can I view the results of backup scripts executed by N2WS?

The output from backup scripts executed by N2WS is available in the recovery panel, provided that N2WS is configured to collect script output. This allows users to verify that backup commands ran successfully and to troubleshoot any issues. Note: Script output collection must be enabled in N2WS settings for this feature to work.

What are the limitations of using N2WS for PostgreSQL backups?

One limitation is that PostgreSQL does not support freezing database I/O or locking the database for backup, unlike some other databases. As a result, consistency relies on the correct use of the 'pg_start_backup' and 'pg_stop_backup' commands and proper script configuration. Additionally, N2WS does not directly manage logical backups (pg_dump), and users must ensure that scripts have the necessary permissions and access. N2WS is the best fit for EC2-hosted PostgreSQL; teams needing direct RDS integration or logical backup management may want to consider alternatives.

What security and compliance certifications does N2WS hold?

N2WS is independently certified for ISO/IEC 27001:2022 and is SOC compliant by inheritance, leveraging AWS and Azure platform compliance. N2WS also supports regulatory frameworks such as HIPAA, GDPR, FedRAMP, ITAR, and CJIS. Customers can request a copy of the ISO certificate by contacting customer.success@n2ws.com. Note: For more details, visit the N2WS Trust Center.

How does N2WS ensure data protection and security for backups?

N2WS provides immutable, air-gapped backups to protect against ransomware and accidental deletion. All connections are protected with TLS/HTTPS, and multi-factor authentication (MFA) is available for user access. Backups never leave your cloud environment, ensuring data sovereignty. Note: Security features depend on correct configuration and cloud provider settings.

What integrations does N2WS offer for automation and monitoring?

N2WS offers a RESTful API for custom integrations and automation of tasks such as user onboarding and backup management. It also integrates with third-party monitoring tools like Datadog, Splunk, and Bocada for enhanced observability and compliance tracking. CLI access is available for advanced management. Note: Integration setup may require technical expertise; see the API documentation for details.

Is there technical documentation available for N2WS integrations and automation?

Yes, N2WS provides extensive technical documentation, including a RESTful API guide, user guides, release notes, and upgrade instructions. The API documentation is available for download, and user guides are available at docs.n2ws.com/user-guide. Note: Some documentation may require registration or a support account.

Who can benefit from using N2WS for PostgreSQL backups on AWS?

N2WS is ideal for IT managers, cloud directors, and managed service providers (MSPs) who manage PostgreSQL databases on AWS EC2 and require automated, application-consistent backups. It is also suitable for enterprises, public sector organizations, and regulated industries needing compliance and data sovereignty. Note: Organizations using AWS RDS for PostgreSQL should verify compatibility, as N2WS is primarily designed for EC2-hosted databases.

What business impact can organizations expect from using N2WS for PostgreSQL backup and recovery?

Organizations can expect reduced disaster recovery costs (up to 92% savings on long-term backup storage), improved data protection through immutable backups, and minimized downtime with near-instant recovery. N2WS also simplifies compliance reporting and audit readiness for regulated industries. Note: Actual savings and impact depend on environment size and configuration.

How does N2WS compare to AWS Backup for PostgreSQL backup and recovery?

N2WS offers several capabilities not available in AWS Backup, including application-consistent backups for EC2-hosted PostgreSQL via custom scripts, immutable air-gapped backups, granular restore options, and cross-cloud recovery (AWS and Azure). AWS Backup is limited to AWS environments and does not support file/folder-level recovery or multi-tenancy. However, AWS Backup is natively integrated with AWS services and may be simpler for basic backup needs. Choose N2WS for advanced automation, compliance, and multi-cloud support; choose AWS Backup for basic, AWS-only workloads. Note: N2WS requires manual script configuration for PostgreSQL consistency.

How to Back Up Your AWS Cloud-Based PostgreSQL Database

PostgreSQL is an Object Relational Database Management System (ORDBMS) that is considered to be one of the most advanced open source relational database management systems around. Like all relational databases, it is an ACID compliant system and supports transactional queries, DDL statements and Master-Slave replication architecture.

Additionally, PostgreSQL offers an easy way to use data importing tools with Postgres Plus, allowing users to import data from enterprise databases, like Oracle.

What is PostgreSQL?

PostgreSQL is managed by the open source community, and the most recent version was released in June 2024. With the advancements in cloud computing, cloud-managed databases are becoming more and more popular. They offer various advantages, such as a pay-as-you-go pricing model, scalability, and easy management.

AWS hosts PostgreSQL in two different ways

The first is via the AWS-managed database service, RDS, and the second is by self hosting your database on AWS EC2 infrastructure. While some users prefer using RDS because it is a managed service, many others still prefer managing their own databases allowing them to:

Achieve inter-region replication (read replica)
Set up replicas that have write capacity (e.g. reporting databases that may generate data at the end of the day)
Set up automatic failover to separate regions
Manually fine tune all database-level parameters, if you have very good admin capabilities, since RDS may not allow you to modify all parameters

In this article, we will use the example of an EC2-hosted PostgreSQL database. It is also very important to have proper backup and restore mechanisms when it comes to reliable database management. These allow you to achieve better disaster recovery and prevent data loss.

The importance of backup for PostgreSQL

In production, a single human error can result in the loss of valuable data. As a result, it is recommended to back up your system before making any changes to your production database, along with your regular planned backup. There are two primary ways of achieving backup and recovery with cloud-hosted databases:

Inherent backup and recovery via the database engine, executed on the cloud-hosted database instance
Backup and recovery on the volume/disk level using your cloud provider’s infrastructure (e.g. EBS snapshots for AWS)

We will discuss both backup and recovery options a bit later in the article. First, let’s introduce the different inherent backup strategies that can be implemented for PostgreSQL.

The 2 types of PostgreSQL backups: physical vs logical

There are two types of backups in PostgreSQL. Physical Backups, which are broken up into File System Level Backups and Continuous Archiving and Point-in-Time Recovery (PITR), and Logical Backups.

Physical Backup: PostgreSQL stores its data in files on disk, and a physical backup copies those files.
1. File System Level Backup: In this strategy, data files can be copied and stored in another location, then archived or compressed as necessary. The command to back up files is as follows: tar -cf backup-dd-mm-yy.tar l/data
  1. Advantages: A File System Level Backup is typically larger than an SQL dump (pg_dump will not dump the contents of indexes, just the commands to recreate them). However, performing a file system backup might be faster.
  2. Disadvantages: If data inconsistencies are not checked during backup, they could result in inconsistent backups. To better ensure that your backup is consistent, stop the database before backup.
  3. Where it’s used: While it’s not recommended for a single database server, it can be used where the PostgreSQL Master-Slave architecture is implemented because the database has to be shut down in order to perform a usable backup with the tar command.
2. Continuous Archiving and Point-in-Time-Recovery (PITR): PostgreSQL creates WAL (Write Ahead Log) files that record changes that are made to the database. With this approach, WALs can be backed up at regular intervals and, when combined with the File System Level Backup, used to recreate the database.
  1. Advantages: Among the many advantages of this approach is the ability to replay WAL files only up to a chosen point, so that the database can be restored to a previous point in time. With PITR, only the changes recorded in the WAL files are backed up between file system backups, reducing the amount of storage needed to back up data.
Logical Backup: PostgreSQL has two utilities (‘pg_dump’ and ‘pg_dumpall’) that take consistent database snapshots at a given moment. They do not block other users from using the database while they run. Both utilities are effective tools that create *.sql files. With these utilities, backups can be performed on local databases and recovered on remote databases.
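
As a rough illustration of the logical backup option, the sketch below wraps ‘pg_dump’ in a small function. The function name, the database name, the output directory and the PG_DUMP override variable are assumptions made for this example and do not come from the article. The flags used (-U, -Fc, -f) are standard ‘pg_dump’ options.

```shell
#!/bin/bash
# Hypothetical helper: runs pg_dump for a single database.
# PG_DUMP can be overridden (e.g. PG_DUMP=echo) to preview the command.
PG_DUMP=${PG_DUMP:-pg_dump}

logical_backup() {
  local db=$1 out_dir=$2 user=${3:-postgres}
  local stamp
  stamp=$(date +%Y%m%d-%H%M%S)
  # -Fc writes pg_dump's custom archive format, which pg_restore can read.
  "$PG_DUMP" -U "$user" -Fc -f "$out_dir/$db-$stamp.dump" "$db"
}
```

For example, `logical_backup mydb /var/backups` would dump a database called mydb as the postgres user; both values are placeholders.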

Crash-Consistent Snapshots

AWS EBS snapshots offer point-in-time backups that are considered to be crash-consistent snapshots. These snapshots back up all of a disk’s data at a particular point in time. However, if files are still open, say in the case of I/O transactions in progress in a database, the data may not be completely written to the disk. This may result in inconsistent data since the file system may not be aware that a snapshot is being taken. To overcome this, it is recommended to use application-consistent snapshots.
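
For reference, a crash-consistent EBS snapshot can be requested directly with the AWS CLI. The sketch below is a minimal wrapper around ‘aws ec2 create-snapshot’. The function name, the AWS_CLI override variable and the volume ID in the usage note are assumptions for illustration only.

```shell
#!/bin/bash
# AWS_CLI can be overridden (e.g. AWS_CLI=echo) to preview the command
# without calling AWS.
AWS_CLI=${AWS_CLI:-aws}

snapshot_volume() {
  local volume_id=$1 description=$2
  # The call returns once the snapshot is initiated; the copy itself
  # finishes in the background. Nothing here coordinates with PostgreSQL.
  "$AWS_CLI" ec2 create-snapshot --volume-id "$volume_id" --description "$description"
}
```

A call such as `snapshot_volume vol-0123456789abcdef0 "nightly pgdata"` (placeholder values) gives only crash consistency, which is the gap the next section addresses.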

Application-Consistent Snapshots

An application-consistent snapshot first informs the OS and the application that a snapshot is about to be taken, so that pending writes can be flushed to disk, and only then performs the backup.

N2WS Backup & Recovery for PostgreSQL

N2WS Backup & Recovery can help in achieving application-consistent snapshots. It is important to note that AWS EBS snapshots are very fast compared to the inherent PostgreSQL backup options (e.g. File System backups) because they take complete block level snapshots.

N2WS Backup & Recovery is an enterprise backup, recovery and disaster recovery solution for EC2. It allows you to automate and maintain backups of your entire EC2 environment as well as achieve application-consistent backups.

This is done by providing a mechanism to perform certain tasks before and after snapshots are taken, informing the OS of the backup. N2WS allows you to write scripts that can be performed before a PostgreSQL backup on EC2. To allow backup scripts to run, enable them in your backup policy configuration.

N2WS can execute three different scripts: “before”, “after” and “complete”.

The before script is launched before EBS snapshots are taken. You can execute the ‘pg_start_backup’ command here, which will provide you with the location of the transaction log where your backup will begin. N2WS will start the snapshot procedure after ‘pg_start_backup’ is completed.
The after script is executed right after the snapshots have started. You can execute the ‘pg_stop_backup’ command here, which will move the current transaction log insertion point to the next transaction log file.
The complete script is executed after all snapshots have completed. You can decide if you want to delete the WAL files at this point, and incorporate that into the code accordingly.
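
The ordering of the three scripts can be pictured with the toy driver below. It is purely illustrative: run_policy and the echoed stand-ins for the snapshot steps are hypothetical and say nothing about how N2WS is implemented internally.

```shell
#!/bin/bash
# Illustrative only: mimics the order in which the three hooks run.
run_policy() {
  local before=$1 after=$2 complete=$3
  "$before"                     # e.g. calls pg_start_backup
  echo "snapshots started"      # stand-in for initiating the EBS snapshots
  "$after"                      # e.g. calls pg_stop_backup
  echo "snapshots completed"    # stand-in for waiting until snapshots finish
  "$complete"                   # e.g. removes temporary files
}
```

The point of the sketch is that the after hook runs while the snapshots are still in progress, whereas the complete hook runs only once they have finished.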

It is important to note that unlike MySQL, PostgreSQL does not provide the option to freeze database I/O or temporarily lock databases for backup. For this reason, we use the ‘pg_start_backup’ and ‘pg_stop_backup’ commands.

According to PostgreSQL, “the [‘pg_start_backup’] function writes a backup file (backup_label) into [a] database cluster’s data directory, performs a checkpoint, and then returns the backup’s starting transaction log location as text.” The ‘pg_stop_backup’ function, on the other hand, quickly switches a WAL segment in order to archive the current one. Note that these function names apply to PostgreSQL 14 and earlier; PostgreSQL 15 renamed them to ‘pg_backup_start’ and ‘pg_backup_stop’ and removed the exclusive backup mode.

Because PostgreSQL backups are run as the postgres user, if you want a script running as another user to execute them on your behalf, you have to change the ‘pg_hba.conf’ configuration file (generally located in the data folder) to allow authentication from that user.
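
As a hedged sketch of what such a change might look like, the helper below prints a single ‘local’ record in the pg_hba.conf field order (TYPE, DATABASE, USER, METHOD). The function name is hypothetical, and the choice of database, role and authentication method is an assumption that you should adapt to your own security requirements.

```shell
#!/bin/bash
# Hypothetical helper: prints one pg_hba.conf record for Unix-socket connections.
hba_local_entry() {
  local database=$1 role=$2 method=$3
  # Field order for "local" records: TYPE DATABASE USER METHOD
  printf 'local\t%s\t%s\t%s\n' "$database" "$role" "$method"
}
```

For instance, `hba_local_entry template1 postgres trust` prints a record that lets any local OS user connect to template1 as postgres without a password, which matches the ‘-w’ flag used in the scripts below but is very permissive. PostgreSQL uses the first matching record, so the line has to be placed above any broader rule, and the server has to reload its configuration (for example with ‘pg_ctl reload’) before the change takes effect.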

Here, the scripts that call ‘pg_start_backup’ and ‘pg_stop_backup’ live on the PostgreSQL server, and N2WS initiates them from its before and after scripts by logging into the PostgreSQL server.

Executing start/stop backup scripts in PostgreSQL

PostgreSQL server backup scripts:

Start Backup Script (db_start_backup.sh)

#!/bin/bash
# Puts PostgreSQL into backup mode, archives the backup directory, then ends backup mode.
WAL_ARCHIVE=/var/lib/pgsql93/archives
PGDATA=/var/lib/pgsql93/data
PSQL=/usr/bin/psql
PGBACKUP=/var/lib/pgsql93/pgsqlbackup
# Set the timestamp first, because the backup label is built from it.
today=$(date +%Y%m%d-%H%M%S)
label=base_backup_${today}
echo "PG_START_BACKUP script will start now with $label..."
# pg_start_backup performs a checkpoint and returns the starting transaction log location.
CP=$($PSQL -q -Upostgres -w -d template1 -c "SELECT pg_start_backup('$label');" -P tuples_only -P format=unaligned)
RVAL=$?
echo "Checkpoint Begins is $CP"
if [ ${RVAL} -ne 0 ]
then
echo "PG_START_BACKUP FAILED!!!"
exit 1
fi
echo "PG_START_BACKUP SUCCESS!!!"
echo "Compression with Tar starts..."
# -j applies bzip2 compression so the archive matches its .tar.bz2 extension.
tar -cjvf "pgdata-$today.tar.bz2" --exclude='pg_xlog' "$PGBACKUP/"
echo "Compression with Tar completed..."
echo "PG_STOP_BACKUP script will start now..."
$PSQL -Upostgres template1 -c "SELECT pg_stop_backup();"
if [ $? -ne 0 ]
then
echo "PG_STOP_BACKUP FAILED!!!"
exit 1
fi
echo "PG_STOP_BACKUP SUCCESS!!!"

Stop Backup Script (db_stop_backup.sh)

Note: This script may return an error message if the backup was already completed. That is acceptable, because it is more important to have the data checkpoint during the snapshots.

#!/bin/bash
# Ends backup mode; fails harmlessly if the start script already ended it.
PSQL=/usr/bin/psql
echo "PG_STOP_BACKUP script will start now..."
$PSQL -q -Upostgres -w template1 -c "SELECT pg_stop_backup();"
if [ $? -ne 0 ]
then
echo "PG_STOP_BACKUP FAILED!!!"
exit 1
fi
echo "PG_STOP_BACKUP SUCCESS!!!"

N2WS executes the following scripts:

"before" script -> (before_<policyname>.sh)

#!/bin/bash
ssh -i <location of pem file to SSH> <user-name>@<IP Address of PostgreSQL machine> "db_start_backup.sh"

"after" script -> (after_<policyname>.sh)

#!/bin/bash
ssh -i <location of pem file to SSH> <user-name>@<IP Address of PostgreSQL machine> "db_stop_backup.sh"
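
The two wrapper scripts above differ only in the remote script they call, so they could share a small helper like the one sketched here. The function name, the SSH_BIN override variable and the timeout value are assumptions; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/bash
# SSH_BIN can be overridden (e.g. SSH_BIN=echo) to preview the command.
SSH_BIN=${SSH_BIN:-ssh}

run_remote() {
  local key=$1 target=$2 script=$3
  # BatchMode=yes makes ssh fail instead of prompting for a password;
  # ConnectTimeout limits how long an unreachable host can stall the run.
  "$SSH_BIN" -i "$key" -o BatchMode=yes -o ConnectTimeout=10 "$target" "$script"
}
```

Because ssh exits with the status of the remote command, a before script could end with something like `run_remote /path/to/key.pem user@host db_start_backup.sh || exit 1` (placeholder values) so that a failed ‘pg_start_backup’ is reported as a script failure.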

If the scripts are successful, N2WS will relay that in the output.

You can view the script’s log in the recovery panel, so long as N2WS is configured to collect script output.

The script above creates a tar file on the target server (the PostgreSQL machine). The tar file is not a dump of the entire database, just the WAL files that are stored in the data directory.

Once you are sure that snapshots are running successfully and your data is consistent with application-consistent snapshots, you can write scripts to remove the tar files. These files will not affect your application but may occupy some space. Therefore, if you don’t want tar files to remain in your database server for an extended period of time, you can also write a delete script that will be executed by the “complete” script that removes the tar file after a snapshot is completed.
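
One possible shape for such a cleanup step is sketched below. It keeps the newest archives and deletes the rest, relying on the sortable timestamp that the start script puts in each file name. The function name, the directory argument and the retention count are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical cleanup helper for a "complete" script.
prune_archives() {
  local dir=$1 keep=$2
  # Names look like pgdata-YYYYmmdd-HHMMSS.tar.bz2, so a reverse name
  # sort lists the newest archive first.
  for f in "$dir"/pgdata-*.tar.bz2; do
    [ -e "$f" ] && printf '%s\n' "$f"
  done | sort -r | tail -n +"$((keep + 1))" |
  while IFS= read -r old; do
    rm -f -- "$old"   # delete everything beyond the newest $keep archives
  done
}
```

Calling `prune_archives /home/backupuser 2` (a placeholder path) on the PostgreSQL machine would leave only the two most recent tar files in place.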

N2WS Backup & Recovery can help you achieve application-consistent backup with PostgreSQL (and much more). It’s an enterprise-class backup and recovery solution for Amazon based on EBS & RDS snapshots. It supports consistent application backups on Linux as well as Windows servers.
