For application developers and DBAs, PostgreSQL lock visibility mostly concerns heavyweight locks. Diagnosing complex locking behavior requires full instrumentation through views from the system catalog, so that it is clear which object is locked by a specific database “backend” process. An alternative name for any lock is “bottleneck”: to make database operations run in parallel, we should split a single bottleneck into multiple operation-specific tasks.

This is the second of three blog posts on PostgreSQL locking, and it covers table-level locks. The previous post was about row-level locks, and a subsequent post reviews the latches protecting internal database structures.

Example environment

A simple table with several rows:
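A minimal sketch of such a table. The table name locktest and the column c are the ones used later in this post; the inserted values are assumptions chosen for illustration.

```sql
-- Demo table used by the examples in this post.
CREATE TABLE locktest (c INT);

-- Two rows are enough to tell row-level waits apart from table-level locks.
INSERT INTO locktest VALUES (1), (2);
```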

Helper view

In order to check different types of these locks, let’s create a helper view as suggested by Bruce Momjian in his presentation:
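The following is a simplified sketch of such a view, not necessarily the exact definition from the presentation. It joins pg_locks with pg_class to show relation names; the output column aliases (vxid, lock_type, lock_mode, xid_lock) are assumptions made here for readability.

```sql
CREATE VIEW lockview AS
SELECT l.pid,
       l.virtualtransaction AS vxid,
       l.locktype           AS lock_type,
       l.mode               AS lock_mode,
       l.granted,
       -- Show whichever transaction identifier the lock is taken on, if any.
       COALESCE(l.virtualxid, l.transactionid::text) AS xid_lock,
       c.relname,
       l.page, l.tuple, l.classid, l.objid, l.objsubid
FROM pg_locks l
LEFT JOIN pg_class c ON l.relation = c.oid
WHERE l.pid <> pg_backend_pid()                              -- hide locks taken by this query
  AND l.virtualtransaction IS DISTINCT FROM l.virtualxid     -- hide each backend's lock on its own vxid
ORDER BY l.pid, l.virtualtransaction, l.granted DESC;
```

Because pg_class only describes relations of the current database, relname stays empty for locks on objects in other databases.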

RowShareLock (ROW SHARE)

Many applications use the read-modify-write paradigm. For instance, the application fetches the fields of a single object from a table, modifies the data, and saves the changes back to the database. In a multi-user environment, different users could modify the same rows in the course of this transaction, so a plain SELECT can leave us working with inconsistent data. In response to user demands, almost all SQL databases offer SELECT … FOR SHARE locking. This feature prevents other transactions from modifying the locked rows until the locking transaction commits or rolls back.

For example:

  1. There is a user with multiple bank accounts stored in an accounts table, with total_amount stored in a bank_clients table.
  2. In order to update the total_amount field, we should prevent modification of all rows related to the specific bank client.
  3. It would be better to use a single update statement to calculate total_amount and select it from the accounts table. If the update requires external data, or some action from the user, then several statements are required.
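The multi-statement variant of these steps could look like the sketch below. The tables accounts and bank_clients and the field total_amount come from the example; the columns client_id, amount, and id, and the client number 55, are hypothetical.

```sql
BEGIN;

-- Lock every account row of the client: other transactions can still read
-- these rows, but cannot update or delete them until we commit or roll back.
SELECT amount FROM accounts WHERE client_id = 55 FOR SHARE;

-- ... the application consults external data or waits for the user here ...

UPDATE bank_clients
SET total_amount = (SELECT sum(amount) FROM accounts WHERE client_id = 55)
WHERE id = 55;

COMMIT;
```

Note that FOR SHARE locks only the rows that already exist; it does not stop another transaction from inserting a new account row for the same client.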

The SELECT FOR SHARE statement creates a “RowShareLock” lock on the relation locktest.

Here’s exactly the same lock created with an SQL statement:
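A short sketch of the explicit form, using the locktest table from this post:

```sql
BEGIN;
-- Takes the ROW SHARE table-level lock without reading or locking any row.
LOCK TABLE locktest IN ROW SHARE MODE;
-- The lock is held until the transaction ends.
ROLLBACK;
```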

A single heavyweight RowShareLock is required regardless of the number of rows locked by the query.

This is illustrated with an unfinished transaction in the following example. Start the unfinished transaction, and select from lockview in a second connection to the database:
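A sketch of the two sessions, assuming the locktest table and the lockview helper view described earlier:

```sql
-- First connection: lock all rows and leave the transaction open.
BEGIN;
SELECT * FROM locktest FOR SHARE;

-- Second connection: inspect the locks held by the first one.
-- Expect one RowShareLock on locktest, however many rows were locked.
SELECT * FROM lockview;
```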

RowExclusiveLock (ROW EXCLUSIVE)

Queries that actually modify rows also require heavyweight locks on tables, one per table.

The next example uses a DELETE query, but an UPDATE will have the same effect.
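A sketch of that example, assuming the locktest table and the lockview helper view:

```sql
-- First connection: delete the rows but do not commit yet.
BEGIN;
DELETE FROM locktest;

-- Second connection: the open transaction should show a RowExclusiveLock
-- on locktest.
SELECT * FROM lockview;
```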

All commands that modify data in a table obtain a ROW EXCLUSIVE lock.

At the table level, ROW EXCLUSIVE and ROW SHARE locks are compatible with each other, but the delete conflicts with the previous FOR SHARE example at the row level. SELECT * FROM locktest FOR SHARE waits for the delete transaction to finish or abort:
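A sketch of the waiting session. It assumes that another connection has run BEGIN; DELETE FROM locktest; and has not yet committed or rolled back.

```sql
-- Second connection: blocks until the deleting transaction ends.
SELECT * FROM locktest FOR SHARE;

-- Third connection: the waiting backend shows up with granted = false.
SELECT * FROM lockview;
```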

Queries modifying table content also lock all indexes, even if the index does not contain modified fields.

ShareLock (SHARE)

The non-concurrent version of CREATE INDEX takes a ShareLock on the table. This blocks statements that modify the table, e.g. INSERT, UPDATE, or DELETE, as well as DROP TABLE.
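A sketch that keeps the lock visible by building the index inside an open transaction. The index name c_idx is an assumption for illustration.

```sql
-- First connection: build the index but do not commit yet.
BEGIN;
CREATE INDEX c_idx ON locktest (c);

-- Second connection: expect a ShareLock on locktest.
SELECT * FROM lockview;
```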

You can execute multiple CREATE INDEX queries in parallel unless the index name is exactly the same. The wait happens on the row lock (ShareLock with “transactionid” type) in the pg_class table.
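The name collision can be sketched with three connections. The index name c_idx is again an assumption.

```sql
-- Connection 1: creates the index and keeps the transaction open.
BEGIN;
CREATE INDEX c_idx ON locktest (c);

-- Connection 2: same index name, so this statement waits for connection 1.
BEGIN;
CREATE INDEX c_idx ON locktest (c);

-- Connection 3: look for a ShareLock of type transactionid with granted = false.
SELECT * FROM lockview;
```

With two different index names, both statements can proceed, because the ShareLock on the table does not conflict with itself.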

Note that there is also an AccessExclusiveLock with type “relation”, but it is not a table-level one: it is held on the index being built rather than on the table.

ShareUpdateExclusiveLock (SHARE UPDATE EXCLUSIVE)

These database maintenance operations need to take a ShareUpdateExclusiveLock:

  • ANALYZE table
  • VACUUM (without FULL)
  • CREATE INDEX CONCURRENTLY

The ANALYZE tablename; statement updates table statistics. The query planner/optimizer is able to provide the best plans for query execution only if the statistics are up to date.

There is no conflict between RowExclusiveLock and ShareUpdateExclusiveLock. UPDATE/DELETE/INSERT could still modify rows during ANALYZE.
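A sketch of this compatibility, assuming the locktest table and the lockview helper view. The inserted value 3 is arbitrary.

```sql
-- First connection: ANALYZE can run inside a transaction block,
-- so its lock stays visible until COMMIT or ROLLBACK.
BEGIN;
ANALYZE locktest;

-- Second connection: the ShareUpdateExclusiveLock is visible ...
SELECT * FROM lockview;
-- ... and modifying rows is not blocked by it.
INSERT INTO locktest VALUES (3);
```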

VACUUM and CREATE INDEX CONCURRENTLY can be executed only outside a transaction block. To see the effects of these statements in lockview, start a conflicting transaction first (e.g. run ANALYZE in a transaction), or run VACUUM against a huge table.
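A sketch of the first approach. It relies on the fact that SHARE UPDATE EXCLUSIVE conflicts with itself, so the VACUUM has to wait for the open ANALYZE transaction.

```sql
-- Connection 1: hold a ShareUpdateExclusiveLock on locktest.
BEGIN;
ANALYZE locktest;

-- Connection 2: waits until connection 1 commits or rolls back.
VACUUM locktest;

-- Connection 3: one granted and one waiting ShareUpdateExclusiveLock.
SELECT * FROM lockview;
```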

CREATE INDEX CONCURRENTLY locking can be confusing. Its SHARE UPDATE EXCLUSIVE lock does not conflict with the ROW EXCLUSIVE lock used by DELETE, INSERT, and UPDATE. Unfortunately, CREATE INDEX CONCURRENTLY still has to wait twice for active transactions to finish, once before each of its full table scans:

“In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate.” (PostgreSQL Documentation)
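A sketch of such a wait, assuming the locktest table. The index name c_conc_idx is hypothetical.

```sql
-- Connection 1: modify the table and keep the transaction open.
BEGIN;
DELETE FROM locktest WHERE c = 1;

-- Connection 2: must run outside a transaction block; it does not finish
-- until connection 1 commits or rolls back.
CREATE INDEX CONCURRENTLY c_conc_idx ON locktest (c);

-- Connection 3: shows what the index build is waiting for.
SELECT * FROM lockview;
```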

AccessExclusiveLock (ACCESS EXCLUSIVE)

This lock conflicts with any other locks and is used by these statements:

  • CREATE RULE
  • DROP TABLE
  • DROP INDEX
  • TRUNCATE
  • VACUUM FULL
  • LOCK TABLE (default mode)
  • CLUSTER
  • REINDEX
  • REFRESH MATERIALIZED VIEW (without CONCURRENTLY)

More importantly, DROP INDEX requires ACCESS EXCLUSIVE locks on both the table and the index:
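A sketch, assuming that an index named c_idx (a hypothetical name) already exists on locktest:

```sql
-- First connection: drop the index but do not commit yet.
BEGIN;
DROP INDEX c_idx;

-- Second connection: expect AccessExclusiveLock entries for both
-- locktest and c_idx.
SELECT * FROM lockview;
```

Until the first connection finishes, even a plain SELECT on locktest has to wait.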

Note: This is the most dangerous type of lock. Avoid running queries requiring access exclusive lock in production, or at least put the application in maintenance mode.

ExclusiveLock (EXCLUSIVE)

Meanwhile, most SQL commands don’t use ExclusiveLock on tables. The exceptions are the general LOCK TABLE statement and REFRESH MATERIALIZED VIEW CONCURRENTLY. This lock blocks all requests except for a non-locking SELECT (i.e. one without FOR SHARE/UPDATE).
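A sketch with the general LOCK TABLE statement, assuming the locktest table:

```sql
-- First connection: take the lock and keep the transaction open.
BEGIN;
LOCK TABLE locktest IN EXCLUSIVE MODE;

-- Second connection: a plain read still works ...
SELECT * FROM locktest;
-- ... but a locking read waits, because ROW SHARE conflicts with EXCLUSIVE.
SELECT * FROM locktest FOR SHARE;
```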

Savepoints

A savepoint produces an additional ExclusiveLock of transactionid type with a new xid value, once the subtransaction is assigned its own transaction ID.
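A sketch of a transaction with one savepoint, assuming the locktest table and the lockview helper view. The savepoint name s1 is arbitrary.

```sql
-- First connection.
BEGIN;
SELECT * FROM locktest FOR SHARE;    -- row locking assigns an xid to the transaction
SAVEPOINT s1;
SELECT * FROM locktest FOR UPDATE;   -- row locking inside the savepoint assigns a second xid

-- Second connection: look for two ExclusiveLock rows of type transactionid.
SELECT * FROM lockview;
```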

pg_advisory_lock

Sometimes application developers require synchronization between processes. In such systems, the application creates and removes locks frequently. Systems that implement these locks as table rows tend to cause table bloat.

There are many functions related to advisory locks. They differ in the following ways:

  • per session or per transaction
  • wait if lock is not available or immediately return false
  • exclusive or shared
  • one 64-bit or two 32-bit integer resource identifiers
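A few of these variants side by side, as a sketch. The key values 42, 1, and 2 are arbitrary.

```sql
-- Session-level, exclusive, waits if the lock is taken; 64-bit key.
SELECT pg_advisory_lock(42);
SELECT pg_advisory_unlock(42);

-- Session-level, shared, returns false immediately if not available;
-- key built from two 32-bit integers.
SELECT pg_try_advisory_lock_shared(1, 2);
SELECT pg_advisory_unlock_shared(1, 2);

-- Transaction-level: released automatically at COMMIT or ROLLBACK.
BEGIN;
SELECT pg_advisory_xact_lock(42);
COMMIT;
```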

Imagine that we have several cron jobs and that the application should prevent simultaneous runs of the same script. Each script can check whether a lock is available in PostgreSQL for its specific integer job identifier:
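A sketch of such a check. The job identifier 10 is an arbitrary example value.

```sql
-- Returns true if this session obtained the lock for job 10,
-- false if another session already holds it. It never waits.
SELECT pg_try_advisory_lock(10);

-- ... run the job only if the call above returned true ...

-- Release the lock when the job is done. A session-level advisory lock
-- is also released when the session disconnects.
SELECT pg_advisory_unlock(10);
```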

The query produces an ExclusiveLock with type advisory.

Deadlocks

Any system with multiple locks can run into a deadlock, a situation in which the blocked queries would never finish on their own. The only way to resolve it is to abort one of the blocked statements. More importantly, deadlock detection is an expensive procedure in PostgreSQL. A check for deadlock happens only after a transaction has been blocked for deadlock_timeout, which is one second by default.

Here is an illustration of a deadlock situation for two different connections A and B:
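The first steps can be sketched as follows, assuming the locktest table contains rows with c = 1 and c = 2:

```sql
-- Connection A: lock the row with c = 1.
BEGIN;
SELECT c FROM locktest WHERE c = 1 FOR UPDATE;

-- Connection B: lock the row with c = 2, then ask for A's row.
BEGIN;
SELECT c FROM locktest WHERE c = 2 FOR UPDATE;
SELECT c FROM locktest WHERE c = 1 FOR UPDATE;   -- waits for connection A
```

At this point B is only waiting. There is no deadlock until A requests a row that B holds.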

Every deadlock starts with a lock wait.

You are not on your own when identifying deadlocks: the pg_stat_activity system view helps you find the statements and transactions causing lock waits:
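One possible query, as a sketch. The wait_event_type column and the pg_blocking_pids() function are available in PostgreSQL 9.6 and later; the alias blocked_by is an assumption.

```sql
-- Backends currently waiting on a heavyweight lock, together with the
-- process IDs of the sessions blocking them.
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```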

The SELECT FOR UPDATE on the c=2 row causes a deadlock:
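A sketch of the final step. The assumed starting state is that connection A holds a row lock on c = 1, and that connection B holds a row lock on c = 2 and is waiting for the c = 1 row.

```sql
-- Connection A: request the row held by B, which closes the cycle.
SELECT c FROM locktest WHERE c = 2 FOR UPDATE;

-- After deadlock_timeout, one of the two sessions receives
-- "ERROR:  deadlock detected" and its transaction is aborted;
-- the other session then gets its lock and continues.
```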

Afterwards, PostgreSQL reports the deadlock in the server log:

As you can see, the database server aborts one blocked transaction automatically.

Multi-way deadlocks

Normally there are just two transactions creating a deadlock. However, in complex cases, an application can cause deadlocks in which multiple transactions form a dependency cycle.

Step1

A: locks row1, B: locks row2, C: locks row3

Step2

A: trying to get row3, B: trying to get row1, C: trying to get row2
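The two steps can be sketched with three connections. This assumes that locktest contains rows with c = 1, c = 2, and c = 3, standing in for row1, row2, and row3.

```sql
-- Step1: each connection locks its own row.
-- Connection A:
BEGIN;
SELECT c FROM locktest WHERE c = 1 FOR UPDATE;
-- Connection B:
BEGIN;
SELECT c FROM locktest WHERE c = 2 FOR UPDATE;
-- Connection C:
BEGIN;
SELECT c FROM locktest WHERE c = 3 FOR UPDATE;

-- Step2: each connection requests a row held by another one.
-- Connection A:
SELECT c FROM locktest WHERE c = 3 FOR UPDATE;   -- waits for C
-- Connection B:
SELECT c FROM locktest WHERE c = 1 FOR UPDATE;   -- waits for A
-- Connection C:
SELECT c FROM locktest WHERE c = 2 FOR UPDATE;   -- waits for B and closes the cycle
```

Once the cycle is closed, PostgreSQL aborts one of the three transactions after deadlock_timeout, and the remaining two can proceed.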

Summary

  • Do not put DDL statements in long transactions.
  • Avoid DDL during high load on frequently updated tables.
  • The CLUSTER command requires exclusive access to the table and all its indexes.
  • Monitor the PostgreSQL log for deadlock-related messages.
