join order misunderstood: database consultant confronts a first order trorm stooper
© Laurenz Albe 2023

 

Different from many other database systems, PostgreSQL does not support query hints. That makes it difficult to force the hand of the query planner when it comes to a certain join order that you know to be good. This article explains how you can influence execution plans in PostgreSQL.

Why no query hints?

The PostgreSQL TODO list lists optimizer hints under “Features We Do Not Want” and links to a discussion page that explains why. The discussion page references mailing list discussions that show that this decision has not been unanimous.

However, one of the strong points of PostgreSQL is its extensibility. If you really want query hints, you can get them: all you have to do is install the extension pg_hint_plan. This extension offers a comprehensive set of Oracle-style query hints, and it does not require a modified version of the PostgreSQL server.

But perhaps you don’t want to install third-party software, or your database is running at a hosting provider and you have no access to the operating system. In that case, read on for alternative ways to force the join order.

An example query and its join order

We have three tables a, b and c and want to calculate the natural join between them. The optimizer chooses the following plan:

EXPLAIN (COSTS OFF)
SELECT b.b_id, a.value
FROM a
   JOIN b USING (a_id)
   JOIN c USING (a_id)
WHERE c.c_id < 300;

                      QUERY PLAN                       
═══════════════════════════════════════════════════════
 Hash Join
   Hash Cond: (c.a_id = a.a_id)
   ->  Seq Scan on c
         Filter: (c_id < 300)
   ->  Hash
         ->  Nested Loop
               ->  Seq Scan on b
               ->  Memoize
                     Cache Key: b.a_id
                     Cache Mode: logical
                     ->  Index Scan using a_pkey on a
                           Index Cond: (a_id = b.a_id)
(12 rows)

The PostgreSQL optimizer decided to first join b and a, then join the result with c. But we would like to join b and c first!

Forcing the join order with an optimizer barrier

We would like to write the query in a way that makes the PostgreSQL optimizer choose the plan we want. However, that is not as simple as it may seem. The PostgreSQL optimizer does not only plan the query as you wrote it, but it rearranges the query considerably. Among other things

  • it rearranges the join order as it thinks best
  • if “pulls up” subqueries to “flatten” the plan tree
  • it pushes WHERE conditions into joins and UNION ALL

Usually that is just what you want: the more ways the optimizer finds to execute the query, the better its chances are of finding the fastest execution plan. But if we want to force the optimizer’s hand, we want to prevent exactly that. Therefore, we are looking for optimizer barriers, that is SQL constructs that prevent PostgreSQL from rearranging the plan.

The following two techniques are pretty similar: both rewrite the query to use a subquery and prevent the optimizer from pulling it up.

Using OFFSET 0 to force the join order

Here, we write a subquery in the FROM clause that explicitly joins the desired tables. To keep PostgreSQL from flattening the query, we can use an OFFSET or LIMIT clause in the subquery. The traditional way to do that is to use OFFSET 0, which does not change the result of the subquery:

EXPLAIN (COSTS OFF)
SELECT subq.b_id, a.value
FROM a JOIN
   (SELECT a_id, b.b_id, c.c_id
    FROM b
       JOIN c USING (a_id)
    WHERE c.c_id < 300
    OFFSET 0
   ) AS subq
      USING (a_id);

                QUERY PLAN                 
═══════════════════════════════════════════
 Nested Loop
   ->  Hash Join
         Hash Cond: (c.a_id = b.a_id)
         ->  Seq Scan on c
               Filter: (c_id < 300)
         ->  Hash
               ->  Seq Scan on b
   ->  Memoize
         Cache Key: b.a_id
         Cache Mode: logical
         ->  Index Scan using a_pkey on a
               Index Cond: (a_id = b.a_id)
(12 rows)

It would not be hard to teach the optimizer to ignore that “useless” clause, but that would disable this useful trick, so it won’t happen.

Using a common table expression to force the join order

Writing a subquery in the FROM clause can make the query hard to read. A common table expression (CTE) is a different approach: you write the subquery in the WITH clause at the beginning of the statement and give it a name. Then you can use that name in the main query, quite like a view, but one that only exists in the context of a single statement.

Before PostgreSQL v12, a CTE was automatically an optimizer barrier. Since v12, PostgreSQL can pull CTEs into the main query, and you have to use the MATERIALIZED keyword to prevent that:

EXPLAIN (COSTS OFF)
WITH subq AS MATERIALIZED (
   SELECT a_id, b.b_id, c.c_id
   FROM b
      JOIN c USING (a_id)
   WHERE c.c_id < 300
)
SELECT subq.b_id, a.value
FROM a
   JOIN subq USING (a_id);

               QUERY PLAN               
════════════════════════════════════════
 Hash Join
   Hash Cond: (subq.a_id = a.a_id)
   CTE subq
     ->  Hash Join
           Hash Cond: (c.a_id = b.a_id)
           ->  Seq Scan on c
                 Filter: (c_id < 300)
           ->  Hash
                 ->  Seq Scan on b
   ->  CTE Scan on subq
   ->  Hash
         ->  Seq Scan on a
(12 rows)

The plan is different from the previous one, as PostgreSQL chose a hash join.

Join order and join_collapse_limit

I mentioned above that the optimizer rearranges the join order of a query. With an inner join of two tables, there are usually seven choices: PostgreSQL can opt for a nested loop, hash or merge join, and for the first two of these, the order of the tables makes a difference as well. With more tables, the number of options explodes, since the result of an inner join is independent of the order in which you join the tables. For three tables, there can be up to 147 combinations.

However, while the optimizer tries to find the best execution plan, it is also important that it does not take too long for planning. After all, PostgreSQL normally does not cache execution plans. To keep planning time moderate, the optimizer draws the line somewhere: if a query joins many tables, the optimizer will only consider all possible combinations for the first eight tables. It joins the remaining tables just like you wrote them in the statement. You can adjust that limit with the parameters join_collapse_limit and from_collapse_limit. The first one is for statements written with the explicit JOIN syntax, and the second applies to joins written in the form

FROM a, b WHERE a.col1 = b.col2

If the number of tables reaches 12 (the default value of the parameter geqo_threshold), PostgreSQL uses an entirely different approach: it randomly generates a number of query plans and plays evolution by recombining the most promising plans over several generations. This genetic query optimizer can result in non-deterministic query plans, which is not always what you want.

Dumbing down the optimizer with join_collapse_limit = 1

With this approach, we deliberately lobotomize the optimizer by telling it not to rearrange the join order in the SQL statement. Then you have to write the tables in the exact order in which you want them joined:

SET join_collapse_limit = 1;

EXPLAIN (COSTS OFF)
SELECT b.b_id, a.value
FROM b
   JOIN c USING (a_id)
   JOIN a USING (a_id)
WHERE c.c_id < 300;

                QUERY PLAN                 
═══════════════════════════════════════════
 Nested Loop
   ->  Hash Join
         Hash Cond: (c.a_id = b.a_id)
         ->  Seq Scan on c
               Filter: (c_id < 300)
         ->  Hash
               ->  Seq Scan on b
   ->  Memoize
         Cache Key: b.a_id
         Cache Mode: logical
         ->  Index Scan using a_pkey on a
               Index Cond: (a_id = b.a_id)
(12 rows)

We end up with the same execution plan as with OFFSET 0. You don’t want to leave join_collapse_limit at 1, because other queries may perform badly with that setting. Here are some ideas how to change a parameter for a single query:

  • run the query in an explicit READ ONLY transaction and use SET LOCAL to change the parameter, so that it reverts to its previous setting as soon as the transaction is done
  • run the query from a database function and use the SET option of CREATE FUNCTION to change the parameter for the execution of the function

Conclusion

Unless you want to use an extension like pg_hint_plans, it is not easy to force the PostgreSQL optimizer to do what you want. We have seen how you can force the join oder with optimizer barriers or parameter settings.

If you are interested in query optimization, perhaps you want to read about UNION ALL and performance or about the different join strategies. There is also an introduction to EXPLAIN (ANALYZE).