Parallel Queries?

Our favourite database, PostgreSQL, saw a new version released back in September 2016. It’s not widely deployed yet, but it has already become the new default on, amongst others, Heroku.

The biggest feature delivered in 9.6, listed first in the 9.6 changelog, was the parallel execution of sequential scans, joins and aggregates. Let’s dive into this feature and see where it can help our applications work better!

Preliminaries

Obviously, we need to have PostgreSQL 9.6 installed. If you use Ubuntu (I do), this thread on AskUbuntu has instructions for all versions. Don’t worry, it won’t upgrade your existing PostgreSQL installation – it’s installed alongside the previous versions and running on a different port, so if your current PostgreSQL runs on the default port 5432, 9.6 will listen on 5433.
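
A quick way to check that you’re connected to the right cluster (the port and user below are just my local setup; adjust them to yours):

$ pg_lsclusters              # Debian/Ubuntu helper listing installed clusters and their ports
$ psql -p 5433 -U postgres
postgres=# SHOW server_version;   -- should report 9.6.x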

Sequential Operations

Sequential scans are, as a rule of thumb, something you want to avoid when a given query’s execution time is important, as they mean the database walks through the whole table row by row, picking the rows that fit the query conditions. For a simple, one-table SELECT-WHERE query that is a time complexity of O(n). What you want for time-critical queries is an index scan, which has the far preferable O(log(n)) complexity (for the default B-tree index).
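
As an illustration (using the people table we’ll create in a moment; we won’t actually add this index, since the whole point of this article is living without one), a single CREATE INDEX would typically flip the plan from a sequential scan to an index or bitmap scan:

postgres=# CREATE INDEX people_age_idx ON people (age);  -- hypothetical index, not used later in the article
postgres=# EXPLAIN SELECT * FROM people WHERE age = 6;
-- with the index in place the planner can choose an Index Scan or Bitmap Heap Scan
-- instead of reading the whole table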

But an index comes with a cost: it takes up additional disk space and slows down INSERT and UPDATE queries. Therefore, for many scenarios, not having an index and bearing the cost of a sequential scan is actually the more viable option. It all depends on the use case.

At the moment only sequential operations can run in parallel, so we’ll focus on them, assuming we have a use case where a sequential scan is acceptable (rather than using an index).

Sequential Scan Parallelisation

Let’s start with the simplest case: we have a lot of data, we want to get all rows matching a given criteria, and we want to speed up the query. That’s why the ability to parallelise queries landed in PostgreSQL 9.6.

If you’re into reading C code (nicely written C code, at that!), take a look at the commit which introduced this feature.

First let’s create a people table with only id (primary key) and age columns:

postgres=# CREATE TABLE people (id int PRIMARY KEY NOT NULL, age int NOT NULL);
CREATE TABLE

postgres=# \d people
   Table "public.people"
Column |  Type   | Modifiers
-------+---------+-----------
id     | integer | not null
age    | integer | not null
Indexes:
   "people_pkey" PRIMARY KEY, btree (id)

And let’s insert some data into it. Ten million rows should be enough to see the gains of parallelisation. Let’s give everyone a random age from the 0-100 range.

postgres=# INSERT INTO people SELECT id, (random()*100)::integer AS age FROM generate_series(1,10000000) AS id;
INSERT 0 10000000

Now let’s try to fetch all the people with age 6, therefore expecting about one percent of rows.

postgres=# EXPLAIN ANALYZE SELECT * FROM people WHERE age = 6;
                                                    QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------
 Seq Scan on people  (cost=0.00..169247.71 rows=104000 width=8) (actual time=0.052..1572.701 rows=100310 loops=1)
   Filter: (age = 6)
   Rows Removed by Filter: 9899690
 Planning time: 0.061 ms
 Execution time: 1579.476 ms
(5 rows)

This took quite some time: over a second and a half. Query parallelisation is disabled by default, so let’s enable it by allowing PostgreSQL to spawn a maximum of two workers, and then run the query again.

postgres=# SET max_parallel_workers_per_gather = 2;
SET

postgres=# EXPLAIN ANALYZE SELECT * FROM people WHERE age = 6;
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..107731.21 rows=104000 width=8) (actual time=0.431..892.823 rows=100310 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on people  (cost=0.00..96331.21 rows=43333 width=8) (actual time=0.109..862.562 rows=33437 loops=3)
         Filter: (age = 6)
         Rows Removed by Filter: 3299897
 Planning time: 0.133 ms
 Execution time: 906.548 ms
(8 rows)

Now this is way better: launching two workers lowered the execution time to below one second.

It’s not down to half the original time, because launching workers and collecting their rows in the “gather” process introduces some overhead. That gathering-from-workers overhead grows with every additional worker; sometimes more workers do not improve the query time at all. To demonstrate that, though, you’d need a database server with more physical cores than the two we usually find in laptops.
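
The planner models this overhead with two settings, one of which we’ll meet again in a moment: parallel_setup_cost (the estimated cost of launching the workers) and parallel_tuple_cost (the estimated cost of passing one row from a worker to the gather process). You can check their defaults yourself:

postgres=# SHOW parallel_setup_cost;   -- 1000 by default
postgres=# SHOW parallel_tuple_cost;   -- 0.1 by default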

There are also queries which won’t be parallelised. For example, let’s try to fetch all people with age below 50 (which will return around half of our table):

postgres=# EXPLAIN ANALYZE SELECT * FROM people WHERE age < 50;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Seq Scan on people  (cost=0.00..169247.71 rows=4955739 width=8) (actual time=0.079..1957.076 rows=4949330 loops=1)
   Filter: (age < 50)
   Rows Removed by Filter: 5050670
 Planning time: 0.097 ms
 Execution time: 2233.848 ms
(5 rows)

Why does that happen? Every row matching our WHERE condition has to be handed from a worker back to the gather process, and that hand-off has a cost. Parallelising therefore pays off for a query that returns only a small fraction of the whole table; when most of the rows the workers encounter fulfill the condition, the gathering overhead outweighs the gains from scanning in parallel.

How the planner decides this is a separate story, involving the approximate statistics PostgreSQL keeps about our tables and columns. What’s interesting to us is that we can force the planner to perform a parallel sequential scan by lowering (here: setting to zero) the cost it assigns to a “parallel tuple”, i.e. the parallel_tuple_cost setting mentioned earlier.
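
If you’re curious, those statistics live in the pg_stats view (run ANALYZE people; first if autovacuum hasn’t gotten around to the table yet); for example, this is one way to peek at what the planner believes about the age column:

postgres=# ANALYZE people;
postgres=# SELECT n_distinct, most_common_vals, most_common_freqs FROM pg_stats WHERE tablename = 'people' AND attname = 'age';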

Back to our experiment: let’s zero out parallel_tuple_cost and give it a try:

postgres=# SET parallel_tuple_cost TO 0;
SET
postgres=# EXPLAIN ANALYZE SELECT * FROM people WHERE age < 50;
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..97331.21 rows=4955739 width=8) (actual time=0.424..3147.678 rows=4949330 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on people  (cost=0.00..96331.21 rows=2064891 width=8) (actual time=0.082..1325.310 rows=1649777 loops=3)
         Filter: (age < 50)
         Rows Removed by Filter: 1683557
 Planning time: 0.104 ms
 Execution time: 3454.690 ms
(8 rows)

Yikes! The overhead is real: despite employing two workers (which means two cores of our CPU), the execution time actually got worse, growing from ~2 s to ~3.5 s.

Let’s clean up after ourselves before moving on to the next section.

postgres=# SET parallel_tuple_cost TO DEFAULT;
SET
postgres=# SET max_parallel_workers_per_gather TO 0;
SET

Parallel Aggregate Functions

We often use aggregate functions, which compute a single result from multiple rows of a table. You have surely used count; there are also other useful ones like sum, min, max and avg (average).
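
As a quick refresher, here are a few of them run against our people table (just to show the functions themselves; the timings are what we’ll look at next):

postgres=# SELECT count(*), min(age), max(age), avg(age) FROM people;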

Turns out they can also run in parallel now! Take a look at the commit that introduced this feature.

Let’s ask our database for the average age (amongst all the 10 million randomly generated people) without parallelisation:

postgres=# EXPLAIN ANALYZE SELECT avg(age) FROM people;
                                                        QUERY PLAN                                                         
---------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=169247.72..169247.73 rows=1 width=32) (actual time=2751.862..2751.862 rows=1 loops=1)
   ->  Seq Scan on people  (cost=0.00..144247.77 rows=9999977 width=4) (actual time=0.054..1250.670 rows=10000000 loops=1)
 Planning time: 0.054 ms
 Execution time: 2751.905 ms
(4 rows)

And let’s see how fast it performs after enabling parallelisation:

postgres=# SET max_parallel_workers_per_gather TO 2;
SET

postgres=# EXPLAIN ANALYZE SELECT avg(age) FROM people;
                                                                 QUERY PLAN     
---------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=32) (actual time=1616.346..1616.346 rows=1 loops=1)
   ->  Gather  (cost=97331.21..97331.42 rows=2 width=32) (actual time=1616.143..1616.316 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=32) (actual time=1610.785..1610.785 rows=1 loops=3)
               ->  Parallel Seq Scan on people  (cost=0.00..85914.57 rows=4166657 width=4) (actual time=0.067..957.355 rows=3333333 loops=3)
 Planning time: 0.248 ms
 Execution time: 1619.181 ms
(8 rows)

Not bad! There’s a lot of information here, but the shape of the plan tells the story: each worker computes a Partial Aggregate over its share of the rows, the Gather node collects the three partial results, and the Finalize Aggregate combines them into the final average. The net effect is that we’ve managed to reduce the query execution time from 2.7 s down to 1.6 s.
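
The other aggregates mentioned earlier (count, sum, min, max) can parallelise the same way. If you want to check one yourself, a plain EXPLAIN (no timings, so no output pasted here) is enough to see the plan shape; this query is just a hypothetical example, not something measured above:

postgres=# EXPLAIN SELECT count(*) FROM people WHERE age > 90;
-- expect a similar Finalize Aggregate -> Gather -> Partial Aggregate -> Parallel Seq Scan plan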

Parallel Joins

The third and last sequential operation which can run in parallel in PostgreSQL 9.6 is joining rows from two different tables.

And once again, if you’re up for reading some nicely written C code, here is the commit that introduced join parallelisation.

Let’s begin by creating a second table, pets, and populating it with 10 million generated entries, with owner_id pointing at people.id (backed by an index, though not declared as a formal foreign key constraint here) and species containing a three-letter species name, be it "cat" or "dog".

postgres=# CREATE TABLE pets (owner_id int NOT NULL, species character(3) NOT NULL);
postgres=# CREATE INDEX pets_owner_id ON pets (owner_id);
postgres=# INSERT INTO pets SELECT (random()*10000000)::integer AS owner_id, ('{cat,dog}'::text[])[ceil(random()*2)] as species FROM generate_series(1,10000000);

Let’s get some data with conditions spanning both tables, at first without parallelisation:

postgres=# SET max_parallel_workers_per_gather TO 0;
SET
postgres=# EXPLAIN ANALYZE SELECT * FROM pets JOIN people ON pets.owner_id = people.id WHERE pets.species = 'cat' AND people.age = 18;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=171025.88..310311.99 rows=407 width=28) (actual time=1627.973..5963.378 rows=49943 loops=1)
   Hash Cond: (pets.owner_id = people.id)
   ->  Seq Scan on pets  (cost=0.00..138275.00 rows=37611 width=20) (actual time=0.050..2784.238 rows=4997112 loops=1)
         Filter: (species = 'cat'::bpchar)
         Rows Removed by Filter: 5002888
   ->  Hash  (cost=169247.71..169247.71 rows=108333 width=8) (actual time=1626.987..1626.987 rows=100094 loops=1)
         Buckets: 131072  Batches: 2  Memory Usage: 2974kB
         ->  Seq Scan on people  (cost=0.00..169247.71 rows=108333 width=8) (actual time=0.045..1596.765 rows=100094 loops=1)
               Filter: (age = 18)
               Rows Removed by Filter: 9899906
 Planning time: 0.466 ms
 Execution time: 5967.223 ms
(12 rows)

That’s almost 6 seconds! Let’s add some workers and see if it helps:

postgres=# SET max_parallel_workers_per_gather TO 2;
SET
postgres=# EXPLAIN ANALYZE SELECT * FROM pets JOIN people ON pets.owner_id = people.id WHERE pets.species = 'cat' AND people.age = 18;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.43..244061.39 rows=53871 width=16) (actual time=0.304..1295.285 rows=49943 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Nested Loop  (cost=0.43..237674.29 rows=22446 width=16) (actual time=0.347..1274.578 rows=16648 loops=3)
         ->  Parallel Seq Scan on people  (cost=0.00..96331.21 rows=45139 width=8) (actual time=0.147..882.415 rows=33365 loops=3)
               Filter: (age = 18)
               Rows Removed by Filter: 3299969
         ->  Index Scan using pets_owner_id on pets  (cost=0.43..3.12 rows=1 width=8) (actual time=0.010..0.011 rows=0 loops=100094)
               Index Cond: (owner_id = people.id)
               Filter: (species = 'cat'::bpchar)
               Rows Removed by Filter: 1
 Planning time: 0.274 ms
 Execution time: 1306.590 ms
(13 rows)

That’s much better: we went down from almost 6 seconds to 1.3 s. We do have an index on the join column, as we always should, but the columns used only for filtering (people.age, pets.species) have no indexes of their own and are filtered the hard way.

If you remove the pets.owner_id index, even the parallel version takes around 5 seconds. Seriously: even in a constrained environment, primary keys and foreign keys (as in: columns used for joining) should always have an index!
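
If you want to reproduce that yourself, the experiment is just dropping and re-creating the index (output omitted here; the exact numbers will differ on your hardware):

postgres=# DROP INDEX pets_owner_id;
postgres=# EXPLAIN ANALYZE SELECT * FROM pets JOIN people ON pets.owner_id = people.id WHERE pets.species = 'cat' AND people.age = 18;
-- without the index the planner most likely falls back to a hash join over full scans of pets,
-- and the runtime lands near the ~5 s mentioned above
postgres=# CREATE INDEX pets_owner_id ON pets (owner_id);  -- put the index back before experimenting further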

Conclusion

Parallel sequential operations are a new feature of PostgreSQL 9.6, but they are not only considered stable and production-ready; they are also going to get lots of improvements in PostgreSQL 10. Before diving into any solution, always check your use cases, measure, analyze and make an informed decision based on proper research. And that’s a general tip, not only regarding this single feature of a single database!

Disclaimer: the measurements in the EXPLAIN ANALYZE results pasted here are just that: pastes from single runs, not proper benchmarks. There are no means or variances. Make sure to run a proper benchmark yourself on the database server you’re actually using!