/ Shayon Mukherjee / blog

Why I enjoy PostgreSQL - Infrastructure Engineer's Perspective

January 17, 2022
~3 mins

While I enjoy working with MySQL (first database) both as a product and infrastructure engineer, lately, I have come to appreciate PostgreSQL as well. I recently read this post on why infrastructure engineers prefer MySQL, and I think it’s spot on.

FWIW: This is not a MySQL vs PostgreSQL post. This is just a small summary of what I have come to appreciate about PostgreSQL as an Infrastructure Engineer.

Schema Changes

Throw a stone in the pond, and you shall have Infrastructure Engineers telling tales of how schema changes brought down production. For most operations, I have found that PostgreSQL has much safer alternatives out of the box, thus allowing you to perform zero downtime or near zero-downtime schema changes by acquiring minimal locks.

Additionally, with every major release, there is a theme of introducing optimizations that focuses on acquiring fewer locks within the engine for common tasks.

Some examples:

Adding or dropping an index

PostgreSQL is capable of adding and dropping indexes without acquiring any locks. It does so using a parameter called CONCURRENTLY (source).

CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);

DROP INDEX CONCURRENTLY sales_quantity_index ON sales_table;

Foreign Key constraints

By passing NOT VALID to a statement when a foreign key is being added you can ask Postgres to skip validating the constraint and thus not to acquire a lock that would otherwise block writes (source). Of course this is also a trade-off, so your mileage may vary.

ALTER TABLE distributors ADD CONSTRAINT distfk FOREIGN KEY (address) REFERENCES addresses (address) NOT VALID;

ALTER TABLE distributors VALIDATE CONSTRAINT distfk;

Adding NOT NULL or other constraints

Doing something like the following makes PostgreSQL scan the entire table to ensure the constraint is valid.

ALTER TABLE table_name ALTER COLUMN column_name SET NOT NULL;

Depending on the nature of the workload, this can be pretty invasive. Instead, you can break that down into a few more statements, first by adding a constraint but not validating it immediately -

ALTER TABLE table_name ADD CONSTRAINT constraint_name CHECK (column IS NOT NULL) NOT VALID;

Notice we are leveraging NOT VALID here. Next, you tell PostgreSQL to validate constraint manually -

ALTER TABLE table_name VALIDATE CONSTRAINT constraint_name;

This makes PostgreSQL not acquire an ACCESS EXCLUSIVE LOCK, but a SHARE UPDATE EXCLUSIVE, meaning your read/writes will go through. Next, you add the original constraint -

ALTER TABLE table_name ALTER COLUMN column_name SET NOT NULL;

Since the constraint was validated already, PostgreSQL won’t acquire a ACCESS EXCLUSIVE LOCK this time. Now, you can drop the CHECK constraint after -

ALTER TABLE table_name DROP CONSTRAINT constraint_name;

While it increases the number of steps to be carried out, it also makes the entire process more reliable.

Extensibility

I think this is something that both product and infrastructure engineers appreciate. There is a bit here for everyone, from having custom data types to user-defined functions.

Being able to write custom functions in SQL or even C is quite powerful, often when automating operational tasks (I generally choose SQL). It’s an excellent way to abstract logic in modular functions, and because it’s SQL, you can “make the database do the hard work”.

Conclusion

So far, I have enjoyed taking advantage of these out of the box features from PostgreSQL for performing critical operations. From here, I am excited and looking forward to uncovering other parts of the database.

last modified January 17, 2022