Increasing max parallel workers per gather in Postgres


Over the past 5 years, new Postgres versions have been getting better and better at running operations in parallel.

This is especially useful for analytical queries, which can sometimes speed up massively by fully utilising a large server (with lots of cores).

However, for fast, transactional queries, the overhead of starting and coordinating extra worker processes almost always outweighs any benefit. As such, the default settings in Postgres are on the conservative side.

If you think your workload could benefit from more parallelism, then this is for you.

The first limiting factor

The first setting you’re likely to be limited by is the max_parallel_workers_per_gather parameter, which is only two by default.

This means that each Gather (or Gather Merge) operation can use at most two worker processes. If you haven’t changed this, you will likely have seen “Workers Planned: 2” and “Workers Launched: 2” in your EXPLAIN plans. We look out for these fields in pgMustard, as a sign the query might benefit from increased parallelisation.

It’s worth noting that the leader process is not included in this limit or these numbers, so by default you get a maximum of three processes.
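To make that concrete, here is the rough shape of a parallel plan for a hypothetical large table (the query and table name are illustrative, and costs/timings are omitted):

EXPLAIN ANALYZE SELECT count(*) FROM big_table;

Finalize Aggregate
  ->  Gather
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate
              ->  Parallel Seq Scan on big_table

Here, two workers plus the leader process scan big_table between them.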

Testing and making changes

You can see what your current setting is with:

SHOW max_parallel_workers_per_gather;

You can also change it just for your current session by running a query, eg:

SET max_parallel_workers_per_gather = 4;

You can then use EXPLAIN to see whether Postgres plans to use the extra workers for a given query, and EXPLAIN ANALYZE to see whether (and by how much) it is actually faster.
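For example, assuming a hypothetical analytical query, a quick before-and-after comparison in a single session could look like:

EXPLAIN ANALYZE SELECT ...;   -- baseline, with the default of 2 workers
SET max_parallel_workers_per_gather = 4;
EXPLAIN ANALYZE SELECT ...;   -- were more workers planned/launched, and is it faster?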

Some settings that you should definitely consider at the same time are (a query to check them all follows the list):

  • work_mem – each parallel worker can use this much memory per sort or hash operation, so totals can multiply quickly

  • max_parallel_workers – the system-wide limit on parallel workers (default 8)

  • max_worker_processes – the broader limit on worker processes, including background workers, that parallel workers are taken from (default 8)
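You can check the current values of all of these in one go with a standard catalogue query:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_parallel_workers_per_gather',
               'max_parallel_workers',
               'max_worker_processes',
               'work_mem');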

Additionally, if parallelism is kicking in too soon or too late, here are a few more settings to be aware of (see the example after this list):

  • parallel_setup_cost

  • parallel_tuple_cost

  • min_parallel_table_scan_size

  • min_parallel_index_scan_size
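As an example of using these, a common testing trick is to temporarily zero the parallelism costs in your session, to see which plan Postgres would choose if starting workers were free:

SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
EXPLAIN SELECT ...;

If the plan still isn’t parallel after this, one of the limits above (or something parallel-restricted in the query itself) is the likely reason.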

For more details, I highly recommend the How Parallel Query Works section of the official docs, this Parallelism Dos and Don’ts talk (1 hour) by Sebastian from Swarm64, and the excellent postgresqlco.nf site, by the team at Ongres.

Once you are happy with your testing and want to change your settings globally, you can do so by adding/modifying them in your data/postgresql.conf file, eg:

max_parallel_workers_per_gather = 4
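If you’d prefer not to edit the file by hand, ALTER SYSTEM achieves the same thing, by writing the setting to postgresql.auto.conf (which overrides postgresql.conf):

ALTER SYSTEM SET max_parallel_workers_per_gather = 4;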

If you are changing max_worker_processes, you will need to restart the database, but if not, you can apply your changes by reloading the config:

SELECT pg_reload_conf();

Or from the command line:

pg_ctl reload -D data
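If you’re not sure whether a given setting needs a reload or a full restart, the context column in pg_settings will tell you – a value of 'postmaster' means a restart is required:

SELECT name, context FROM pg_settings WHERE name = 'max_worker_processes';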

Turning it up to 11 (or more)

Newer versions of Postgres have been gradually improving parallelism support, so it’s worth testing an upgrade if you’re not up-to-date.

However, if your machine and workload could benefit from very high parallelism (more than about 12 parallel workers or so), you might struggle to achieve that on vanilla PostgreSQL.

There are a host of specialised solutions out there, but for a PostgreSQL-native one, delivered via an extension, you might want to check out Swarm64.

In summary

If you’re only running lots of fast, transactional queries on Postgres, you likely don’t need to worry about parallelism at all.

If, however, you run some longer, analytical queries, there are several settings you can tune to speed them up, and make better use of your server.

The limiting factor is likely max_parallel_workers_per_gather (followed by max_parallel_workers and max_worker_processes), but there are other settings you can adjust too.