Luca Ferrari has had a huge impact on the Postgres community in Italy, having been president of the Italian PostgreSQL Users Group in the past and having helped to organize the popular PGDay.it events. He also blogs frequently about Postgres and wrote PostgreSQL 11 Server Side Programming Quick Start Guide for Packt.
Note: A more complete version of this interview is on the Web.
We caught up with him to ask about server side Postgres use cases in particular:
For those who use Postgres as a simple database and haven't touched the deeper elements, where do you think they should start?
There's no single answer to this question, since Postgres is such a huge project with so many features and a rich community. I never found a project where it cannot fit in. Postgres is somehow like Unix: you cannot touch it as "just a database", you need to commit to its culture to benefit the most out of it.
In my classes, I can see that people usually get fascinated by the capabilities of doing server side programming, and that is why I decided to write my book about this topic. Often, people do not expect to be able to embed their Perl, Java, or Python libraries directly into PostgreSQL without having to rewrite their business logic in an SQL-like language.
Another great feature nowdays is the support to JSON within the database, thanks to which PostgreSQL can be used as both a relational database and a 'NoSQL' storage engine, providing a lot of flexibility in your infrastructure.
One suggestion I always gave is to join the mailing lists:
there are several that differ by topic and amount of traffic. Most are very active and
have high quality contributors that take care in providing accuate replies to users' questions, that spend time in reproducing errors and edge-cases, and who will help you. That's a mandatory
place where you have to start, in my opinion, to better learn about the project, its features and its culture.
Where should the line be drawn between doing things in an external programming language versus within Postgres?
Often the right choice is to place business logic near the data it refers to, that is within the database itself. However, there are several things to take into account including the developer's experience and the expressiveness of SQL-derived languages like pl/PgSQL.
There's a habit of letting ORMs (Object Relational Mappers) do most of the database interaction nowadays, reducing the database to a "simple storage". Of course, databases can do a lot more, and PostgreSQL in particular can help you migrate and embed your own business logic into the database itself.
I have helped a few companies embed their own Java libraries into Postgres resulting in a more robust and coherent way to access the data (the real value) without any regard of the application they were using. Because once you start having data, you will soon find that such data is required by multiple applications in different technologies and on different platforms, so that implementing the same business logic rules over and over becomes a huge effort; on the other hand, moving such logic within the database simplifies and keep uniform the way your data is manipulated.
What one thing do you think people should learn?
Stored procedures. They serve as a common base for triggers and are very similar to routines, therefore allowing you to build more complex pieces into your own cluster. Once you have learnt the common way of defining functions, you can go deeper and write your own native functions using other languages (e.g. C). This is more complex, but thanks to the extensibility of Postgres is not an impossible task and can help you migrating more and more code into the database. Once you have created a new feature, please do contribute it back so that other people can use it!
Last, allow me to announce that I'm working on another book right now: me and a friend of mine are writing a more general book on Postgres that will try to answer your question by leading the reader through the main features that make Postgres unique and great.