Postgres Weekly Issue 390: January 27, 2021

#390 — January 27, 2021

👋 This week we've got a neat addition in the shape of an interview with Daniele Varrazzo who maintains one of the most popular Postgres adapters for Python and is working on the next major version as we speak. Check that out about half way through today's issue :-)

I'm keen to interview more people working with Postgres or supporting the Postgres ecosystem, so if you want to nominate yourself, a co-worker, or someone else, reply to this issue and let me know.
__
Peter Cooper, your editor

Postgres on ARM-Based AWS EC2 Instances: Is It Any Good? — As exciting as Apple’s developments with its own ARM-based chips are, AWS also has made big progress with its own Graviton2 CPUs – here’s how they fare when running Postgres. In short, it’s good news. A post from last year about how Postgres runs on the M1 chip may also be of interest.

Jobin Augustine and Sergey Kuzmichev

Postgres Scaling Advice for 2021 — A grab bag of thoughts and insights on just how far a single Postgres instance can scale and why it might be better than rolling out an entire cluster. Does your use case demand tens of thousands of transactions per second?

Kaarel Moppel

Real-Time Postgres Performance Monitoring — Collect out-of-the-box and custom PostgreSQL metrics and correlate them with data from your distributed infrastructure, applications, and logs. Gain real-time performance insights and set intelligent alerts with Datadog. Start a free trial.

Datadog sponsor

A Dating App's Journey to Postgres 12 — Coffee Meets Bagel is a dating system (of sorts) and they upgraded their cluster from Postgres 9.6 to 12.4 recently. Here they cover both the upgrade and how they use Postgres generally (complete with, yes, an architecture diagram! 😍)

Tommy Li

Heroku Announces Larger Postgres Plans (and Connection Pooling) — Even if you don’t use Heroku, they’ve had a pretty big influence on the Postgres space over the years and were one of the first services to offer it at such scale. Their plans now go up to 768GB of RAM with 4TB of storage and, no, they’re not cheap at all 😆 In other Heroku news, they've made connection pooling for Postgres generally available.

Greg Nokes (Heroku)

Nordic PGDay 2021 Cancelled — A year ago when in-person events began to get cancelled, I didn’t expect it’d still be happening a year later, but Nordic PGDay is now deferred to taking place in Helsinki in March 2022 fingers crossed.

PostgreSQL Europe

Golden Proportions in Postgres — Two quantities are in the ‘golden ratio’ if their ratio is the same as the ratio of their sum to the larger of the two quantities.. get it? Hans-Jürgen Schönig shows off how to use Postgres to help with such calculations.

Hans-Jürgen Schönig

Free eBook: Efficient Search in Rails with Postgres — Speed up a search query from seconds to milliseconds and learn about exact matches, trigrams, ILIKE, and full-text search.

pganalyze sponsor

How Postgres Replication Works — A high level tour of Postgres’s approach to replication, HA configurations, clustering, etc.

Adriano

A Preliminary Exploration of Dynamic Hashing in Postgres — More spelunking in the Postgres internals if you’re interested in how Postgres handles its hash tables.

Neil Chen

💬 A Q&A with…
Daniele Varrazzo
.. creator of psycopg3

psycopg2 is a popular Postgres database adapter for Python developers and its maintainer has recently begun work on a major upgrade in the shape of psycopg3 (for which he is very keen for your support.) I was intrigued by the need for a major new version of an already popular library, so asked him a few questions:

Why has psycopg2 become the most popular Postgres adapter for Python?

The project has solid foundations, as it is based on libpq, the official PostgreSQL C client library, maintained by the Postgres authors themselves. Other drivers have tried different avenues, ending up either being too slow (because of interpreting the backend-frontend protocol in pure Python) or incomplete/buggy.

On top of a solid communication layer, Federico "fog" Di Gregorio, the original author, designed a nice and flexible adaptation system to convert between the two rich but different data models. Supporting more Postgres features and making passing data around as transparent as possible eventually made the interface... disappear. Which is a great feature: the best thing a tool can do is to get out of the way of its user's intent.

What do you want to achieve with psycopg3?

The psycopg2 design is old and predates the Postgres communication protocol which allows to separate the queries from the parameters, so query and parameters are merged client-side. Using the newer protocol has the effect of changing the server behaviour in a few cases: for instance it's no longer possible to send more queries separated by semicolon ("insert into a; insert into b") if they also contain parameters; cast rules might be different, often stricter (at the moment there is quite a lot of tweaking going on to minimise the changes needed). Although psycopg3 doesn't introduce gratuitous changes for the fun of it, testing and changing some queries is definitely necessary.

There are advantages to use the new protocol, too: better memory use, better integration with server-side logging tools, query preparation, binary parameters, COPY based on Python objects... I'm trying to not get obsessed with benchmarks but psycopg3 is already solidly faster than psycopg2, albeit several code paths are still in Python and not yet ported to C.

Python has also gone forward quite a lot compared to when psycopg2 was designed, a time in which old-style classes were roaming the earth and the with statement was still hiding under a rock. That said, the real game changer is asyncio, finally a standard way to have collaborative I-O-bound concurrency. For a driver aspiring to make the most of a database as capable of concurrency such as Postgres, asyncio is a necessary interface.

What will working full time on psycopg3 allow you to do better?

The initial phase of psycopg3 development will be at least 6 months long. We are about halfway: I hope that around April we will have the basic features complete and a stable and performing adapter. This wouldn't have been possible without focusing solely on this project - as much as I like parallelism my brain is pretty single-threaded!

Having psycopg3 as my main "product" is allowing me to focus 100% on its quality. In the near future I would also like to create more integration and stress tests and to profile concurrent scenarios: I'm very happy if the sponsorships for the project go directly towards the quality of the library produced.

How can people support you if not financially?

There is a lot of code to write in a fresh codebase and I welcome anyone to come and play with these shiny toys (there is an open board of tasks). I would be happy if someone got to learn something by taking part in the new generation of this project.

Other ways to help could be to adopt psycopg3 early instead of psycopg2 for a fresh project; I also wouldn't mind having some of my code and ideas reviewed (Is my asyncio any good? Are these assumptions correct?) It would be good to have QA and testing done in a less biased way than mine could be. There are definitely several ways the project could use a few more heads.

Daniele is an Italian hacker now based in London and trying to work on psycopg3 full time. You can support his work here.

🔧 Tools and Code

pg_activity 2.0: A top-Like Server Activity Monitoring Tool for Postgres — Much as you might use top on a server to monitor processes and CPU usage, pg_activity provides a similar approach for checking up on PostgreSQL. 2.0.0 just dropped.

Dalibo

pg_hint_plan: Give Postgres The Ability to Manually Force Some Decisions in Execution Plans

NTT OSS Center DBMS Development and Support Team