pgBackRest postgresql backup solutionpgBackRest addresses many of the must-have features that you’ll want to look for in a PostgreSQL backup solution. I have been a great fan of the pgBackRest project for quite some time, and it gets better all the time. Historically, it was written in perl and now over the last year, the project is making steady progress converting into native C code. At the time of writing, the latest version is 2.13 and there remains dependency on a long list of perl libraries. In case you’ve never tried pgBackRest, now it is a great time to do it. This post should help you to set up a simple backup with a local backup repository.

Installation

The pgBackRest project packages are maintained in the PGDG repository. If you have already added the PGDG repository to package manager,  installation is a breeze.

On RHEL/CentOS/OEL:

On Ubuntu/Debian

This will fetch all the required perl libraries too:

The backrest is a native executable now (version 2):

Some of the other changes DBAs should keep in mind are:

  1. thread-max option is no longer valid – use process-max instead.
  2. archive-max-mb option is no longer valid and has been replaced with the archive-push-queue-max option which has different semantics
  3. The default for the backup-user (deprecated), which is a new repo-host-user, has changed from backrest to pgbackrest.
  4. The configuration file has changed from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf

Building from source

We may want to build pgBackRest depending on our environment and version. Building pgBackrest from source on Debian / Ubuntu is already covered in the official documentation. Below I’ve provided the steps to follow for the Red Hat family.

Get the tarball of the latest release:

In order to build from source, we need to build tools and libraries:

Now we can unpack the tarball and move it to the unpacked the directory:

Before proceeding any further, we should make sure that perl is built with the  USE_64_BIT_INT option.

pgBackRest depends on a few perl libaries. If they are not already present in the system, we have to install them:

Now we can build pgBackRest by executing “make” in the src directory

If the build is successful we should see the pgbackrest executable in the src directory. We should move it to a standard path location like /usr/bin

Now we can create the repository location and log directory for pgBackRest. We grant privileges on that directory to the OS user that will execute pgBackRest.

Configuration

The very first step is to create a configuration file  /etc/pgbackrest.conf for the stanza. A stanza defines the backup configuration for a specific PostgreSQL database cluster. Any global configuration sections can be overridden to define stanza-specific settings.

Here, we have specified the backup repository location to be /var/lib/pgbackrest.

Once we have defined the configuration file, we can proceed to create the stanza

This will produce output like this:

Now we need to set up the database parameters to use pgbackrest. In particular, we should use pgbackrest for archive_command. One of the major concerns for using cp like utilities in archive_command is that they are lazy writers: they do not ensure that everything is written and that fsync is called. This is a potential hole in many backup configurations and pgbackrest can solve this issue. Here is a sample set of parameter changes:

From version 2.12 onwards, the archive-push used in the above setting is entirely coded in C, making it more lightweight and faster to execute.

Basic backup configuration is complete now and we can check the backup setup:

This should produce output like this:

Taking a full backup

Now we are all set to take a full backup. We can push a backup without specifying its type like this:

In my case, I received this output:

There are a couple of points to note here.

  1. Since we are taking a backup for the first time, it detects that there is no prior backup and switches to full database backup.
  2. The output says pg_start_backup() is issued in non-exclusive mode. pgBackRest handles the creation of label file.

At the end of the backup,  I received these details, and you should look for something similar:

Check the backup related information

As we can see, pgBackRest has compressed the backup to 14.1MB in size. This represents a great reduction compared to the size of the original database.

If a good full backup exists, then by default pgBackRest will attempt to perform an incremental backup.

Restoring backup

Restoring the backup is as simple as a single liner. pgBackRest will find out the correct full backup/ differential backup and incremental backup and do the restore for you.

The restore will display details on screen. Towards the end of the restore you should see a few lines as follows:

So pgBackRest creates a recovery.conf file automatically as the log output says. pgBackRest automatically inserts the restore_command to restore archived WAL logs like:

As a DBA user, you can just start up PostgreSQL as usual. In my case I just needed this command:

PostgreSQL executes the above-mentioned restore_command as it starts and this will get all the archived WALs that are needed to complete the recovery before starting up.

Summary

pgBackRest is continuously evolving and it as it matures it’s emerging as one of the best backup solutions for PostgreSQL. The new features, especially in Version 2, are impressive. It takes away the mundane tasks of backup configuration and retention policies, which is extremely helpful.


Photo of elephant by Magda Ehlers from Pexels

7 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
강동호

standby backup support?
I wonder if it is possible.

Mihail Zheltuhin

at repo host:

pg2-host=
pg2-path=
backup-standby=y

강동호

Yes, thanks for the answer.

Alex M.

Hi Jobin, thank you for the post. I assume the backup should run on the same server and it wouldn’t work with RDS, correct?

Jobin Augustine

Yes Alex, Thats correct. pgBackRest need access to data directory for backup and restore.

Pruthviraj

Hi Jobin,

Thanks for excellent post, I have a doubt on archive_command.

My setup: 2 node cluster, 1 Primary 1 standby.

in my postgresql.conf I have set archive_command= rsync -a %p postgres@:/etc/postgres/wals/%f, this where master will rsync the WAL’s and standby will read WALS from this location

I’m confused over the archive_command = ‘pgbackrest –stanza=pg0app archive-push %p’ how will the standby server read WALS in this scenario?

Let me know If I’m missing anything.

Thanks in Advance,
Pruthvi

Jobin Augustine

A standby database is not directly reading WALs from a archive location. Instead it executes restore_command to get the WALs. restore_command can be any shell command to get the WALs. For example, can be a file copy, rsync or pgbackrest command to get the WAL from backup. Counterpart of archive-push option of pgbackrest is archive-get which will be used in restore_command.
Hope this clarifies.