Forcing a join order in PostgreSQL

05.2023 / Category: Performance / Tags: administration | join | optimizer | query tuning | sql help

join order misunderstood: database consultant confronts a first order trorm stooper — © Laurenz Albe 2023

Table of Contents

Different from many other database systems, PostgreSQL does not support query hints. That makes it difficult to force the hand of the query planner when it comes to a certain join order that you know to be good. This article explains how you can influence execution plans in PostgreSQL.

Why no query hints?

The PostgreSQL TODO list lists optimizer hints under “Features We Do Not Want” and links to a discussion page that explains why. The discussion page references mailing list discussions that show that this decision has not been unanimous.

However, one of the strong points of PostgreSQL is its extensibility. If you really want query hints, you can get them: all you have to do is install the extension pg_hint_plan. This extension offers a comprehensive set of Oracle-style query hints, and it does not require a modified version of the PostgreSQL server.

But perhaps you don't want to install third-party software, or your database is running at a hosting provider and you have no access to the operating system. In that case, read on for alternative ways to force the join order.

An example query and its join order

We have three tables a, b and c and want to calculate the natural join between them. The optimizer chooses the following plan:

EXPLAIN (COSTS OFF)
SELECT b.b_id, a.value
FROM a
   JOIN b USING (a_id)
   JOIN c USING (a_id)
WHERE c.c_id < 300;

                      QUERY PLAN                       
═══════════════════════════════════════════════════════
 Hash Join
   Hash Cond: (c.a_id = a.a_id)
   ->  Seq Scan on c
         Filter: (c_id < 300)
   ->  Hash
         ->  Nested Loop
               ->  Seq Scan on b
               ->  Memoize
                     Cache Key: b.a_id
                     Cache Mode: logical
                     ->  Index Scan using a_pkey on a
                           Index Cond: (a_id = b.a_id)
(12 rows)

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

EXPLAIN (COSTS OFF)

SELECT b.b_id, a.value

FROM a

JOIN b USING (a_id)

JOIN c USING (a_id)

WHERE c.c_id < 300;

QUERY PLAN

═══════════════════════════════════════════════════════

Hash Join

Hash Cond: (c.a_id = a.a_id)

-> Seq Scan on c

Filter: (c_id < 300)

-> Hash

-> Nested Loop

-> Seq Scan on b

-> Memoize

Cache Key: b.a_id

Cache Mode: logical

-> Index Scan using a_pkey on a

Index Cond: (a_id = b.a_id)

(12 rows)

The PostgreSQL optimizer decided to first join b and a, then join the result with c. But we would like to join b and c first!

Forcing the join order with an optimizer barrier

We would like to write the query in a way that makes the PostgreSQL optimizer choose the plan we want. However, that is not as simple as it may seem. The PostgreSQL optimizer does not only plan the query as you wrote it, but it rearranges the query considerably. Among other things

it rearranges the join order as it thinks best
if “pulls up” subqueries to “flatten” the plan tree
it pushes WHERE conditions into joins and UNION ALL

Usually that is just what you want: the more ways the optimizer finds to execute the query, the better its chances are of finding the fastest execution plan. But if we want to force the optimizer's hand, we want to prevent exactly that. Therefore, we are looking for optimizer barriers, that is SQL constructs that prevent PostgreSQL from rearranging the plan.

The following two techniques are pretty similar: both rewrite the query to use a subquery and prevent the optimizer from pulling it up.

Using `OFFSET 0` to force the join order

Here, we write a subquery in the FROM clause that explicitly joins the desired tables. To keep PostgreSQL from flattening the query, we can use an OFFSET or LIMIT clause in the subquery. The traditional way to do that is to use OFFSET 0, which does not change the result of the subquery:

EXPLAIN (COSTS OFF)
SELECT subq.b_id, a.value
FROM a JOIN
   (SELECT a_id, b.b_id, c.c_id
    FROM b
       JOIN c USING (a_id)
    WHERE c.c_id < 300
    OFFSET 0
   ) AS subq
      USING (a_id);

                QUERY PLAN                 
═══════════════════════════════════════════
 Nested Loop
   ->  Hash Join
         Hash Cond: (c.a_id = b.a_id)
         ->  Seq Scan on c
               Filter: (c_id < 300)
         ->  Hash
               ->  Seq Scan on b
   ->  Memoize
         Cache Key: b.a_id
         Cache Mode: logical
         ->  Index Scan using a_pkey on a
               Index Cond: (a_id = b.a_id)
(12 rows)

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

EXPLAIN (COSTS OFF)

SELECT subq.b_id, a.value

FROM a JOIN

(SELECT a_id, b.b_id, c.c_id

FROM b

JOIN c USING (a_id)

WHERE c.c_id < 300

OFFSET 0

) AS subq

USING (a_id);

QUERY PLAN

═══════════════════════════════════════════

Nested Loop

-> Hash Join

Hash Cond: (c.a_id = b.a_id)

-> Seq Scan on c

Filter: (c_id < 300)

-> Hash

-> Seq Scan on b

-> Memoize

Cache Key: b.a_id

Cache Mode: logical

-> Index Scan using a_pkey on a

Index Cond: (a_id = b.a_id)

(12 rows)

It would not be hard to teach the optimizer to ignore that “useless” clause, but that would disable this useful trick, so it won't happen.

Using a common table expression to force the join order

Writing a subquery in the FROM clause can make the query hard to read. A common table expression (CTE) is a different approach: you write the subquery in the WITH clause at the beginning of the statement and give it a name. Then you can use that name in the main query, quite like a view, but one that only exists in the context of a single statement.

Before PostgreSQL v12, a CTE was automatically an optimizer barrier. Since v12, PostgreSQL can pull CTEs into the main query, and you have to use the MATERIALIZED keyword to prevent that:

EXPLAIN (COSTS OFF)
WITH subq AS MATERIALIZED (
   SELECT a_id, b.b_id, c.c_id
   FROM b
      JOIN c USING (a_id)
   WHERE c.c_id < 300
)
SELECT subq.b_id, a.value
FROM a
   JOIN subq USING (a_id);

               QUERY PLAN               
════════════════════════════════════════
 Hash Join
   Hash Cond: (subq.a_id = a.a_id)
   CTE subq
     ->  Hash Join
           Hash Cond: (c.a_id = b.a_id)
           ->  Seq Scan on c
                 Filter: (c_id < 300)
           ->  Hash
                 ->  Seq Scan on b
   ->  CTE Scan on subq
   ->  Hash
         ->  Seq Scan on a
(12 rows)

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

EXPLAIN (COSTS OFF)

WITH subq AS MATERIALIZED (

SELECT a_id, b.b_id, c.c_id

FROM b

JOIN c USING (a_id)

WHERE c.c_id < 300

)

SELECT subq.b_id, a.value

FROM a

JOIN subq USING (a_id);

QUERY PLAN

════════════════════════════════════════

Hash Join

Hash Cond: (subq.a_id = a.a_id)

CTE subq

-> Hash Join

Hash Cond: (c.a_id = b.a_id)

-> Seq Scan on c

Filter: (c_id < 300)

-> Hash

-> Seq Scan on b

-> CTE Scan on subq

-> Hash

-> Seq Scan on a

(12 rows)

The plan is different from the previous one, as PostgreSQL chose a hash join.

Join order and `join_collapse_limit`

I mentioned above that the optimizer rearranges the join order of a query. With an inner join of two tables, there are usually seven choices: PostgreSQL can opt for a nested loop, hash or merge join, and for the first two of these, the order of the tables makes a difference as well. With more tables, the number of options explodes, since the result of an inner join is independent of the order in which you join the tables. For three tables, there can be up to 147 combinations.

However, while the optimizer tries to find the best execution plan, it is also important that it does not take too long for planning. After all, PostgreSQL normally does not cache execution plans. To keep planning time moderate, the optimizer draws the line somewhere: if a query joins many tables, the optimizer will only consider all possible combinations for the first eight tables. It joins the remaining tables just like you wrote them in the statement. You can adjust that limit with the parameters join_collapse_limit and from_collapse_limit. The first one is for statements written with the explicit JOIN syntax, and the second applies to joins written in the form

FROM a, b WHERE a.col1 = b.col2

1	FROM a, b WHERE a.col1 = b.col2

If the number of tables reaches 12 (the default value of the parameter geqo_threshold), PostgreSQL uses an entirely different approach: it randomly generates a number of query plans and plays evolution by recombining the most promising plans over several generations. This genetic query optimizer can result in non-deterministic query plans, which is not always what you want.

Dumbing down the optimizer with `join_collapse_limit = 1`

With this approach, we deliberately lobotomize the optimizer by telling it not to rearrange the join order in the SQL statement. Then you have to write the tables in the exact order in which you want them joined:

SET join_collapse_limit = 1;

EXPLAIN (COSTS OFF)
SELECT b.b_id, a.value
FROM b
   JOIN c USING (a_id)
   JOIN a USING (a_id)
WHERE c.c_id < 300;

                QUERY PLAN                 
═══════════════════════════════════════════
 Nested Loop
   ->  Hash Join
         Hash Cond: (c.a_id = b.a_id)
         ->  Seq Scan on c
               Filter: (c_id < 300)
         ->  Hash
               ->  Seq Scan on b
   ->  Memoize
         Cache Key: b.a_id
         Cache Mode: logical
         ->  Index Scan using a_pkey on a
               Index Cond: (a_id = b.a_id)
(12 rows)

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

SET join_collapse_limit = 1;

EXPLAIN (COSTS OFF)

SELECT b.b_id, a.value

FROM b

JOIN c USING (a_id)

JOIN a USING (a_id)

WHERE c.c_id < 300;

QUERY PLAN

═══════════════════════════════════════════

Nested Loop

-> Hash Join

Hash Cond: (c.a_id = b.a_id)

-> Seq Scan on c

Filter: (c_id < 300)

-> Hash

-> Seq Scan on b

-> Memoize

Cache Key: b.a_id

Cache Mode: logical

-> Index Scan using a_pkey on a

Index Cond: (a_id = b.a_id)

(12 rows)

We end up with the same execution plan as with OFFSET 0. You don't want to leave join_collapse_limit at 1, because other queries may perform badly with that setting. Here are some ideas how to change a parameter for a single query:

run the query in an explicit READ ONLY transaction and use SET LOCAL to change the parameter, so that it reverts to its previous setting as soon as the transaction is done
run the query from a database function and use the SET option of CREATE FUNCTION to change the parameter for the execution of the function

Conclusion

Unless you want to use an extension like pg_hint_plans, it is not easy to force the PostgreSQL optimizer to do what you want. We have seen how you can force the join oder with optimizer barriers or parameter settings.

If you are interested in query optimization, perhaps you want to read about UNION ALL and performance or about the different join strategies. There is also an introduction to EXPLAIN (ANALYZE).

4 responses to “Forcing a join order in PostgreSQL”

Tony Zucchini says:

May 4, 2023 at 8:33 am

Hi Laurenz, I put this question unsolved on stackowerflow, do you have some suggestion ?

I have a table "weather" that stores weather parameters for 45 years
of entire Europe area with 400 million of record partitioned by range in
an Oracle database managed through 19c instance and the table has been
duplicated using oracle_fdw foreign wrapper in a PostgreSQL 15 instance,
so now that table is local to PostgreSQL instance. Both tables are
preloaded in memory in keep pool for Oracle and shared buffers in
PostgreSQL using pg_prewarm.

I wrote a query that generates a ranking map of temperature max in a
selected interval for every year. Now execution time for one month
interval is 48s in PostgreSQL and 130s in Oracle, but I can force the
Oracle query optimizer to parallel execution that sounds logic with a
partitioned table using hint /* PARALLEL */, in this way execution time
goes down from 130s to 7s, but there isn't any possibility to force
PostgreSQL query optimizer to do the same. We know there aren't hints in
PostgreSQL but I played with related configuration parameters to force a
parallel strategy for the query, useless. Do You know if there is
possible strategy to force PostgreSQL optimizer to parallelize the query
?

Reply
- laurenz says:
  
  May 4, 2023 at 3:03 pm
  
  This has nothing to do with my article...
  
  Reply
Franck Pachot says:

May 4, 2023 at 9:22 am

All those force the join order but not the join direction. If the optimizer is bad on cardinality to get the right join order, there's a good chance that it chooses a bad direction. P
pg_hint_plan can do both, and also set a planner GUC at statement level.

Reply
Jeremy Giaco says:

June 27, 2023 at 1:28 pm

dude, thanks. i didn't think there was a way to force join order in postgres. i used a lot of hints/tricks in sql server to force join order, and didn't think there was a way to do that in postgres. i had a table with 500m very "thin" rows. one indexed value had two rows of data. when joining that to another table with several million rows, it took minutes. i knew i wanted to force the join order, which turns it into a subsecond query, but i thought i'd have to implement a temp table to do so. not any more!

Reply

Leave a Reply Cancel reply

Stay tuned with our

NEWSLETTER

CYBERTEC PostgreSQL International GmbH
Römerstraße 19
2752 Wöllersdorf
Austria

+43 (0) 2622 93022-0

office@cybertec-postgresql.com

ISO_27001_Badge

Customer Support

Support Platform

Services

Support CYBERTEC Partner PostgreSQL Books

Company

STAY TUNED WITH OUR NEWSLETTER

Get the newest PostgreSQL Info & Tools

Data Protection Policy Terms and conditions Terms of Service Imprint

©

2025

CYBERTEC PostgreSQL International GmbH