PostgreSQL logoWhen you need to upgrade your PostgreSQL databases, there are a number of options available to you. In this post we’ll take a look at how you can upgrade PostgreSQL versions using pg_upgrade, a built-in tool that allows in-place upgrade of your software. Using pg_upgrade allows you, potentially, to minimize your downtime, an essential consideration for many organizations. It also allows you to perform a postgres upgrade with very minimal effort.

In our previous posts, we discussed various methods and tools that can help us perform a PostgreSQL upgrade – (1) pg_dumpall, (2) pg_dump and pg_restore with pg_dumpall, (3) logical replication and pglogical, and (4) slony. Methods 1 and 2 can involve additional downtime compared to the approaches taken in 3 and 4. Whilst performing an upgrade using logical replication or slony may be time consuming and require a lot of monitoring, it can be worth it if you can minimize downtime. If you have large databases that are busy with a lot of transactions, you may be better served using logical replication or slony.

This post is the fifth of our Upgrading or Migrating Your Legacy PostgreSQL to Newer PostgreSQL Versions series. These posts lead up to a live webinar, where we’ll be exploring different methods available to upgrade your PostgreSQL databases. If it’s beyond the live webinar date when you read this, you’ll find the recording at that same link.

pg_upgrade

pg_upgrade (formerly pg_migrator – until PostgreSQL 8.4) is a built-in tool that helps in upgrading a legacy PostgreSQL server to a newer version without the need of a dump and restore. The oldest version from when you can upgrade your PostgreSQL using pg_upgrade is 8.4.x. It is capable of performing faster upgrades by taking into consideration that system tables are the ones that undergo the most change between two major versions. The internal data storage format is less often affected.

In fact, in one of our tests we were able to perform an upgrade of a 2 TB database server from PostgreSQL 9.6.5 to 11.1 in less than 10 seconds. Now that is fast!

Overview of the process

To understand how it works, consider a PostgreSQL server running on 9.3.3 that needs to upgrade to PostgreSQL 11. You should install the latest binaries for your new PostgreSQL version on the server – let’s say PostgreSQL 11.2 – before you begin the upgrade process.

Preparation and consistency checks

Once you have installed the new PostgreSQL version, initialize a new data directory using the new binaries and start it on another port i.e. a different port to the one used by PostgreSQL 9.3.3, in our example. Use pg_upgrade to perform consistency checks between the two servers – PG 9.3.3 and PG 11.2 – running on two different ports. If you get any errors, such as a missing extension, you need to to fix these before you proceeding to the upgrade. Once the consistency check has been passed, you can proceed.

Here is how the log looks if you should get an error while performing consistency checks.

To proceed beyond the error, in this example you’d need to install this missing extension pg_repack for the new PostgreSQL version, and rerun the check to make sure that you receive no errors and all the checks are passed.

Carrying out the upgrade

Once passed, you can proceed in one of two ways. One option is to let pg_upgrade copy the datafiles of the old data directory to the new data directory initialized by the new PostgreSQL version. The second option is to let pg_upgrade use hard links instead of copying data files. Copying a database of several terabytes may be time consuming. Using the hard links method makes the process really quick as it does not involve copying files.

To use hard links with pg_upgrade, you pass an additional argument -k as you can see in the following command.

In the Unix file system, a file or a directory is a link to an inode (index node) that stores metadata (disk block location, attributes, etc) of the data stored in them. Each inode is identified by an integer or an inode number. When you use pg_upgrade with hard links, it internally creates another file/directory in the new data directory that links to the same inode as it was in the older data directory for that file/directory. So, it skips the physical copy of the objects, but creates each object and links them to the same inode.

This reduces the disk IO and avoids the need for additional space in the server. An important point to note is that this option works only when you are upgrading your PostgreSQL on the same file system. This means, for example, if you want to upgrade to a new or a faster disk during the database upgrade, the hard link option does not work. In that case, you would need to use the file copy method.

So far, we have seen a high level overview of how pg_upgrade with hard links help you to perform an upgrade with lowest possible downtime. Come see more in action during our Webinar.  And don’t forget at Percona Live in Austin, May 28-30 2019, we’ll have two days of PostgreSQL content in a postgres dedicated track.


Elephant image based on photo from Pexels

7 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Sanjay

Hi Team, Thanks for the article! Can we delete old data directory after we hard link it to new one?

Jobin Augustine

pg_upgrade will automatically provide a script at the end of execution for deleting the old cluster. please use the same.

J Thomas

Wondering if there will be any syntaxual changes to expect from say, 9.6 – 11?

Jobin Augustine

I believe you are asking from application portability perspective.There won’t be any major syntax changes. As long as you stick to standard ANSI SQL syntax, there is very less chance that something will break in application. As you know PostgreSQL is one of the most standard compliant database. But testing is essential if you are planning to upgrade.
However there could be lot changes from Administrative and monitoring perspective, which majorly affects DBAs. The internal catalog views / functions can change which may affect the DBAs and their scripts.

Christian ROUX

Hi. Is there a way with pg_upgrade to migrate from a PostgreSQL 10 or 11 without data checksum to a PostgreSQL 12 version with data checksum or should I use or pg_dump/pg_restore/pg_dumpall or logical replication ?
Thanks you for your exellent blog.

Christian

Jobin Augustine

@Christian, You may upgrade to PostgreSQL 12 using pg_upgrade (without checksum) and then use the pg_checksums utility avaliable form PG12 (https://www.postgresql.org/docs/12/app-pgchecksums.html) to quickly enable the checksum.

Christian ROUX

Hello. Thanks for your reply. You are right. I’m new on PG 12 and I had forgotten pg_checksum.