PostgreSQL: Understanding deadlocks

04.2021 / Category: How To / Tags: administration | lock

Many might have seen PostgreSQL issue the following error message: "ERROR: deadlock detected". But what does it really mean? How can we prevent a deadlock and how can we reproduce the problem? Let’s dive into PostgreSQL locking and understand what deadlock and deadlock_timeout really mean.

Table of Contents

How does a deadlock happen?

Many people approach us because they want to understand what a deadlock is and how it can happen. They also want to understand how a deadlock can be avoided and what software developers can do about it.
If you want to understand how a deadlock occurs, all you need is a table containing two lines. That’s sufficient to explain the basic principle of deadlocks.

Here is some easy-to-use sample data:

test=# CREATE TABLE t_data (id int, data int);
CREATE TABLE
test=# INSERT INTO t_data VALUES (1, 100), (2, 200);
INSERT 0 2
test=# TABLE t_data;
id  | data
----+------
1   | 100
2   | 200
(2 rows)

1

2

3

4

5

6

7

8

9

10

test=# CREATE TABLE t_data (id int, data int);

CREATE TABLE

test=# INSERT INTO t_data VALUES (1, 100), (2, 200);

INSERT 0 2

test=# TABLE t_data;

id | data

----+------

1 | 100

2 | 200

(2 rows)

The crux is that if data is updated in a different order, transactions might have to wait for one another to be finished. It is perfectly fine if transaction 1 has to wait for transaction 2. But what happens if transaction 1 has to wait for transaction 2 and transaction 2 has to wait for transaction 1? In that case, the system has two choices:

Wait infinitely, or
Abort one transaction and commit the other transaction.

As waiting infinitely is not an option, PostgreSQL will abort one of these transactions after some time (deadlock_timeout). Here is what happens:

Transaction 1	Transaction 2	Comment
`BEGIN;`	`BEGIN;`
`UPDATE t_data SET data = data * 10 WHERE id = 1 RETURNING *; id \| data ----+------ 1 \| 1000 (1 row)`	`UPDATE t_data SET data = data * 10 WHERE id = 2 RETURNING *; id \| data ----+------ 2 \| 2000 (1 row)`	works perfectly
`UPDATE t_data SET data = data * 10 WHERE id = 2 RETURNING *;`		has to wait until transaction 2 releases the lock on the row containing id = 2
… waits ...	`UPDATE t_data SET data = data * 10 WHERE id = 1 RETURNING *;`	wants to lock the row locked by transaction id: now both are supposed to wait
… deadlock timeout ...	… deadlock timeout ...	PostgreSQL waits (deadlock_timeout) and triggers deadlock detection after this timeout (not immediately)
update proceeds: “UPDATE 1”	ERROR: deadlock detected	a transaction has to die
COMMIT;		the rest commits normally

The error message we will see is:

ERROR: deadlock detected DETAIL: Process 70725 waits for ShareLock on transaction 891717; blocked by process 70713. Process 70713 waits for ShareLock on transaction 891718; blocked by process 70725. HINT: See server log for query details. CONTEXT: while updating tuple (0,1) in relation "t_data"

The reason is that transactions have to wait for one another. If two transactions are in a conflict, PostgreSQL will not resolve the problem immediately, rather it will wait for deadlock_timeout and then trigger the deadlock detection algorithm to resolve the problem.

Why does PostgreSQL wait for some time before it steps in and fixes things? The reason is that deadlock detection is quite expensive, and therefore not immediately triggering it makes sense. The default value here is 1 second, which is high enough to avoid pointless deadlock detection attempts, but is still short enough to fix the problem in a useful and timely manner.

How to fix and avoid deadlocks

The most important thing to know is: There is NO MAGIC CONFIGURATION PARAMETER to fix this problem. The problem does NOT depend on configuration. It depends on the execution order of operations. In other words, you cannot magically fix it without understanding the application and its underlying operations.

The only thing that can fix the problem is to change the execution order, as shown in the next listing:

test=# SELECT * FROM t_data ;
id  | data
----+------
1   | 1000
2   | 2000
(2 rows)

1

2

3

4

5

6

test=# SELECT * FROM t_data ;

id | data

----+------

1 | 1000

2 | 2000

(2 rows)

This is the data you should see after committing the transaction that did not fail before. Thus we can see what happens if two transactions execute in a different order:

Transaction 1	Transaction 2	Comment
`BEGIN;`
`UPDATE t_data SET data = data * 10 WHERE id = 1 RETURNING *; id \| data ----+------ 1 \| 1000 (1 row)`	`BEGIN;`
	`UPDATE t_data SET data = data * 10 WHERE id = 1 RETURNING *;`
`UPDATE t_data SET data = data * 10 WHERE id = 2 RETURNING *; id \| data ----+------- 2 \| 20000 (1 row)`	… wait …
`COMMIT;`	… wait …
	`UPDATE t_data SET data = data * 10 WHERE id = 1 RETURNING *; id \| data ----+-------- 1 \| 100000 (1 row)`	re-read the value and use the newly committed entries
	`UPDATE t_data SET data = data * 10 WHERE id = 2 RETURNING *; id \| data ----+-------- 2 \| 200000 (1 row)`	re-read the value and use the newly committed entries
	`COMMIT;`

In this case, there is no deadlock. However, in a real work scenario it is hardly possible to simply swap the execution order. That’s why this is more of a theoretical solution to the problem than a practical one. However, there are no other options to fix the problem of deadlocks. In the case of deadlocks, being aware of how to prevent them is the best cure.

Finally …

Locking is really important. Deadlocks are not the only concern in this area. Performance might be equally important, therefore it makes sense to deal with performance-related locking effects as well. Stay tuned for more on this topic.

If you want to learn more about important features of PostgreSQL, you might want to check out a blog post about UPDATE which can be found here.

Please leave your comments below. In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Twitter, Facebook, or LinkedIn.

0 0 votes

Article Rating

Subscribe

4 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

ClausRasmussen

2 years ago

My advice is to lock tables in alphabetic order and to process records ordered by id. It's not always possible or convenient and it doesn't always work but it is easy to explain and typically removes a lot of trivial cases

0

Reply

laurenz

2 years ago

Reply to ClausRasmussen

I agree with processing rows in some deterministic order (although that can be tricky with DML statements that affect more than a single row), but I think that locking tables is bad advice.
Explicitly locking tables means that you effectively prevent concurrency.
In my experience, explicitly locking tables is almost always the wrong answer and a cure that is worse than the disease.
But perhaps I misunderstood what you were trying to say.

0

Reply

ClausRasmussen

2 years ago

Reply to laurenz

By "locking table" I don't mean full table locks but only the records involved in the transaction. In your first example deadlocks could have been prevented if transaction 1 established a lock on row #1 and #2 before doing any changes. For example by doing a 'select for update' that implicitly locks the involved records

0

Reply

laurenz

2 years ago

Reply to ClausRasmussen

Then I totally agree with you.

0

Reply