This blog was originally written in September 2018 and was updated in June 2023.

In this post, we cover the methods used to achieve an enterprise-grade backup strategy for the PostgreSQL cluster. In setting up our demonstration system, we use pg_basebackup and continuous archiving. The size of the database and your database environment—if it is on colo or cloud—matters. It is very important to understand how we can ensure minimalistic or no data loss at the time of disasters using our preferred backup tools.

As discussed in the introductory blog post, we highlight important aspects an enterprise should consider for its PostgreSQL environments. If you haven’t already checked it out, in our last blog post, we looked at PostgreSQL security specifically for enterprise-grade environments. For now, let’s dive deeper into exploring PostgreSQL backups.

Understanding the PostgreSQL Backup Process

There are three different PostgreSQL backup methods, including SQL dump, which generates a file with SQL commands that recreates the database in the same state as it was at the time of the dump, file system level backup, which directly copies database files for faster recovery, and continuous archiving and point-in-time recovery (PITR), which combines a file system level backup with the backup of WAL files.

Having a backup strategy in place that takes regular backups and has secure storage is essential to protect the database in an enterprise-grade environment to ensure its availability in the event of failures or disasters.

Why are PostgreSQL Backups Necessary?

PostgreSQL backups are essential for safeguarding critical data and mitigating potential risks. Without proper backups, data loss could have catastrophic consequences for businesses, from financial losses to reputation damage to operational disruptions. Additionally, human errors like accidental deletions or incorrect updates can cause irreversible data loss, making reliable backups crucial for restoring data to a known good state and minimizing the impact of such errors.

But it’s not just about avoiding losses; it’s about staying compliant and protecting your business.  Because many industries must follow strict regulations that mandate data retention and protection, failure to comply with these requirements can result in severe penalties and legal repercussions. By consistently backing up PostgreSQL data, you demonstrate adherence to these regulations, and should the need arise, you can restore data as needed.

Backups are also important because they can help to ensure business continuity in the event of a disaster, making it easier to recover from downtime and resume operations.

A robust PostgreSQL backup strategy grants an organization peace of mind, knowing that data is secure, compliance needs are met, and business will continue to thrive without interruption.

Percona and PostgreSQL work better together.

Try Percona Distribution for PostgreSQL today.

 

The DBA’s Role in Database Management

The Database Administrator (DBA) is important in backup management, maintaining data integrity, and ensuring business continuity. Some key tasks of a DBA in backup management include: 

Backup Planning: The DBA is responsible for developing a comprehensive backup strategy that aligns with the organization’s data protection requirements and needs and working with other teams to develop disaster recovery plans. 

Monitoring Backup Operations: DBAs continuously monitor backup processes and performance by setting up alerts to notify them of any backup failures or issues, addressing errors, and performing maintenance as needed.

Backup Testing and Validation: To ensure backups are reliable and can be efficiently used for recovery, DBAs regularly perform backup testing and validation.

Security and Access Control: DBAs are responsible for securing backup files and implementing security measures to protect backups from unauthorized access. They also ensure that backup strategies adhere to industry and legal compliance requirements.

Database Performance Tuning: By optimizing backup schedules, monitoring for errors, or looking for ways to reduce backup storage needs, DBAs are always working to improve the efficiency of backups and restores.

Documentation and Reporting: DBAs maintain detailed documentation of backup processes, schedules, and recovery procedures to ensure this information is available not only to other DBAs in the event of an emergency but also to organizational stakeholders if needed.

Overall, the role of a DBA in backup management is to design, implement, monitor, and maintain a backup strategy to protect critical data, ensure business continuity, and safeguard against potential data loss or corruption.

How to Backup a PostgreSQL Database: Exploring PostgreSQL Backup Options

The ability to recover a database to a certain point in time is always the ultimate aim of a backup strategy. Without a recoverable backup, you have no backup! So there are always two parts to a backup strategy: backup AND recovery. The backup tool you use to achieve your aims may vary. Let’s discuss some of the different PostgreSQL backup options.

pg_basebackup

The community version of PostgreSQL comes with a trusted backup tool called pg_basebackup. This utility can help you take a consistent online binary backup that is very fast, can be used for point-in-time recovery, and also for setting up slaves/replicas.

pg_dump

pg_dump is a command-line utility for backing up a PostgreSQL database that makes consistent backups even if the database is being used. It generates output as a text file containing the SQL commands that can be used to reconstruct the database and restore data or to make database migrations easier.  

And while pg_dump only dumps a single database, DBAs can use pg_dumpall to back up an entire cluster or objects common to all databases in the cluster.

Archiving and Continuous Archiving

Postgres does not flush data pages to disk upon every transaction commit. To ensure that PostgreSQL achieves durability and endures crash recovery, changes are written to transaction logs (a.k.a. WALs, Write-Ahead Logs) stored on disk.

WALs in PostgreSQL are similar to transaction log files in the InnoDB storage engine for MySQL. However, WAL files are recycled according to the values set to the parameters wal_keep_segments and max_wal_size. Hence, if WAL files are not copied to a safe location—such as a backup server or another file system—it won’t be possible to achieve point-in-time recovery (PITR).

To archive WAL segments to a safe location, the parameter archive_mode must be set to ‘ON’, and we must pass an appropriate shell command to the archive_command parameter. The shell command can be a copy (cp command in Linux) or a bash script that tells Postgres where to archive its WAL files. Administrators commonly use an NFS share or SAN partition for this, mounted locally to facilitate the storage of these files. You may refer to the manual page PostgreSQL Archiving to learn more about archiving.

Automatic Backups

Setting up automated backup procedures in PostgreSQL is important for data safety and availability. Admins can set these up using pg_dump and external scheduling tools to set up scheduled, recurring backups. 

Automatic backups are part of a backup strategy that helps to eliminate the risk of human error and minimize manual efforts. They standardize the backup process, reducing the chance of data loss and ensuring up-to-date information is available for recovery in case of system failures or data corruption.

Decisions on the intervals for automatic backups depend on things like transaction volume, the rate of data changes, and how critical the data is. For example, databases with lower activity levels may not require constant backing up, while ever-changing databases may require more frequent backups to minimize the chance of data loss and ensure data integrity.

PostgreSQL Point-in-Time Recovery (PITR)

Point-in-Time Recovery (PITR) is a database backup and recovery technique used in PostgreSQL that allows DBAs to restore a database to a specific time or state within previous backups.

For PITR, PostgreSQL continuously archives transaction logs (Write-Ahead Logs, or WAL files) which record all the changes made to a database after a full backup is taken. By combining the full database backup with these archived transaction logs, PITR ensures that — even if the last full backup is outdated — admins can apply the subsequent WAL files to replay the changes to the database and restore it to any point in time since the last backup, making recovery efforts easier in the event of any data loss or corruption issues. Without PITR in place, you will lose all data added to your database since the last full backup.

However, PITR can take a lot of time if you have a lot of data and could cause significant downtime. In such situations, we suggest you have a delayed Standby configured via streaming replication.

Using PostgreSQL Point-In-Time Recovery is a simple procedure, and doing a PITR with a delayed replica can save time in cases with a large data store.

PostgreSQL Backup Demonstration Scenario

For our demonstration setup, and taking into account the size of the database and the number of transactions, we use the following backup strategy:

  1. Nightly pg_basebackup that runs every day at 01:00 am.
  2. Continuous archiving of WALs to a remote backup server that is redundant and avoids a single point of failure.

Using the above strategy, we can ensure that we store full backups daily. At the same time, the changes/transactions are backed up using the archiving solution available with PostgreSQL. In this way, we can achieve recovery to any point back in time, depending on our backup retention policies.

Be sure to check out our webinar, Enterprise-grade PostgreSQL Built on Open Source Tools, to see this in practice.

Other Potential PostgreSQL Backup Options

Depending on your database size, number of transactions, and the environment, you may also consider using pgBackRest, Barman, or WAL-g. These are stable open source backup solutions that have been contributed and are continuously maintained by community members. They help us to achieve one or more of the following features:

  1. Incremental backups
  2. Differential backups
  3. Features for building standby replicas
  4. Ability to stream backups to another server
  5. Streaming backups to AWS S3 or object store, removing the need to store backups locally or on locally mounted network shares before uploading to the cloud.

Here is a snippet of what is/isn’t possible with each of these backup tools today.

postgresl backup feature comparison

Looking for an Enterprise-Grade PostgreSQL Backup Solution? Percona can Help

The importance of having a robust PostgreSQL backup strategy in an enterprise-grade environment cannot be overstated. From choosing the right backup method based on the database size and workload to the importance of regular testing and data validation and the ability to effectively recover from corruption or disaster, DBAs and organizations must implement the right backup and restore solution for their PostgreSQL needs.

Percona Distribution for PostgreSQL includes pgBackRest, a reliable, easy-to-use backup and restore solution seamlessly scales up to the largest workloads and databases.

Ready to learn more about PostgreSQL backup solutions?

 

Get started with Percona Distribution for PostgreSQL today

FAQ

How do you backup PostgreSQL databases?

Backing up PostgreSQL databases can be done by using the built-in utility called pg_dump or using the pg_basebackup method.

What is the best way to backup PostgreSQL databases?

The best way to backup PostgreSQL databases depends on various factors, including the size of the database, how often the database is accessed/updated, and the specific organizational recovery requirements. Some best practices include taking regular backups, testing backups for restoration viability, implementing continuous archiving of transaction logs (WAL files) to enable point-in-time recovery, automating backup procedures to minimize human error, using a combination of logical and physical backups, and utilizing offsite backup storage (physical and/or cloud).

What are the different types of PostgreSQL backups?

Three primary types of backups can be performed in PostgreSQL: logical backups using pg_dump, physical backups using pg_basebackup, and continuous archiving (WAL backups) that capture changes made since the last backup. 

Sources:

pg_basebackup
pgBackRest
Barman
WAL-g

Subscribe
Notify of
guest

6 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Michael Vitale

pgbackrest does allow backup from a slave, but the table matrix above doesn’t indicate that.

Brad

pgBackRest also allows stream to cloud.

Andrey Borodin

Hi! WAL-G also does incremental and differential backups. Also there are different incremental backups – file level (MTime based) and page level (fine-grained diff). WAL-G does page-level increments.

Vickie Cooper

Thank you