Postgres tips for optimizing Django & Python performance, from my PyCon workshop

Written by Louise Grandjonc
May 20, 2020

This year, I was so excited about doing a workshop about optimizing Python & Django apps with Postgres superpowers for the PyCon 2020 conference.

Working with other developers on performance is something I always find amazing. So props to the Python people at Microsoft who encouraged my team to create a workshop on Postgres for PyCon 2020. Thank you to Nina Zakharenko, Dan Taylor, & Crystal Kelch.

Alas, we had to change our plans and find other ways to share the PostgreSQL workshop content that we had prepared. So I created a video on the topic of database performance for Django developers, to help teach you the PostgreSQL tips and tricks that have served me well in optimizing my Django apps. These tips are what I call "Postgres superpowers."

Being myself a developer, and having worked with Django for years, I know that there are a few traps to avoid in order to have fast apps. In this video, I go through a few performance traps to avoid—as well as tools you can use to identify pain points.

Where to find this PyCon video on Django performance

You can find this video on Optimizing Python & Django performance in three different places:

On the Microsoft Board for PyCon2020, you will find a ton of useful videos created by my teammates on the Python team here at Microsoft. I found Nina’s talk on debugging especially interesting! And my Postgres teammate Sai Srirampur also created an (awesome) short demo on Building HTAP applications with Python and Postgres—in Azure.

But before you go watch my PyCon video, let me tell you a few things that I think you will find useful when watching.

Tools to help optimize Python

Django and Postgres have tools for you to use in order to see your SQL queries executed. In my experience, looking into executed SQL queries is the best way to make sure you don't let your ORM trick you into running redundant, unoptimised queries.

django-debug-toolbar

A tool that I've used a lot working with Django, is the django debug toolbar. When you set DEBUG=True in your settings, django-debug-toolbar will show up in your template. You can see all the queries executed by your view/template and thanks to the traceback, you can easily find in your code where the queries come from.

The django-debug-toolbar also gives you the execution time and you can have the result of EXPLAIN on that query, which is great to debug slow queries.

In the workshop, I use django-debug-toolbar for the traceback and timing functionalities.

Postgres logs in your day to day life

For the workshop, we created a very simple app, with not much javascript happening, so there was not much surprise when it comes to queries. But if you have API calls in your template, the django debug toolbar won't show queries coming from those calls. Which is why django debug toolbar is nice, but not always adapted to every app.

So, I use Postgres logs in my day to day life. Postgres logs give you a sure way to see all queries ran on your database.

And beyond Django, if you are working on different apps, with maybe different languages, the great thing about logs is that you can switch environments, whether it's Django, flask, or a Golang API. Your Postgres logs will still be available once configured.

To see how I configure my local Postgres to show all queries, you can watch my PyCon workshop.

pg_stat_statements, the most useful Postgres extension (right?)

And finally, the tool I introduce is pg_stat_statements. It's a Postgres extension for tracking execution statistics of all SQL statements executed by a server. You can see how many times a SQL query was executed—as well as the minimum, maximum, and mean execution times...

pg_stat_statements is a great tool to find slow queries on your production systems, and in your local Postgres environments too.

Avoiding loops in Django

As I mentioned, if you don't pay attention to your Postgres logs, you can quickly end up with unexpected queries. One very common issue is loops.

In the workshop, I go through the following example:

I have a view with a queryset of Campaigns filtered on a company

queryset.filter(company=self.request.user.employee.company)

And in the template, I have a loop, there is for loop displaying links to the detail view of related objects . This loops triggers hundreds of queries... Which is not something that you want! As a good friend once told me "There must be something wrong in your view if you have more than 20-25 queries."

And to avoid having O(n) queries because of a loop, Django has a few easy solutions: select_related and prefetch_related

Limiting columns to avoid slow SQL queries

One thing that I'm not fond of in ORM is the way it does SELECT. By default, the ORM not knowing what column you will need, it selects all columns. When your table has many columns, some being potentially long texts, selecting unnecessary columns means pulling more data—which leads to slower queries. Not what you want!

In the workshop, we talk about how to limit columns using Django's only function.

Pagination tips for Postgres

Finally, pagination can be very useful to avoid pulling thousands of rows (unnecessarily!) from your Postgres database. You have different ways to do pagination. In my PyCon 2020 video about Postgres superpowers, I give you two ways to paginate.

The first one is built-in in Django. It's based on adding OFFSET ... LIMIT to queries.

Unfortunately the offset approach to pagination has a couple limitations for Django apps.

  • First, Django uses a COUNT (*) to have the number of pages. The goal being to be able to jump between pages numbers. COUNT is never good, and the more you can avoid, the better.
  • Second, OFFSET ... LIMIT ... gets slower in high pages. OFFSET is used to skip first N rows of a query—to do that, database must fetch rows & order, to return last ones.

A better way to paginate is using keyset pagination. Instead of using OFFSET ... LIMIT ..., keyset pagination is based on filters on the ORDER BY fields. I explain more about how to implement keyset pagination in my PyCon video (and I also coverthe performance benefits you’ll see by using keyset pagination instead of OFFSET.)

Brand new Discord channel for PyCon

To try to connect to all the many Python developers we would have met in person at PyCon this year, our team has also created a microsoft-python discord channel. No surprise that the server follows our Open Source Code of Conduct. If you want to join (I’m on the discord), the new microsoft-python discord can be joined via aka.ms/python-discord.

Wishing you performant Django apps that use Postgres at its best

Just like chocolate and peanut butter go better together, so do Python and Postgres. I hope that the Postgres optimization techniques (the "superpowers") that I walk through in my PyCon 2020 video (below) will help you improve the performance of your Python and Django apps.

Somewhat irrelevant side-note: my PyCon video features hand-painted fish that I created many (many, many) years ago, long before I became a Django and Postgres developer. Enjoy.

The PyCon video about Postgres superpowers for Python and Django

YouTube video still: Optimize Python & Django apps with Postgres superpowers
Optimize Python & Django apps with Postgres superpowers – Louise Grandjonc – PyCon 2020 Online
Louise Grandjonc

Written by Louise Grandjonc

Former Postgres and Citus solutions engineer at Microsoft. Loves working on database performance and tuning SQL queries. Speaker at PGConf.EU, PyCon, DjangoCon, PyBay, PyCaribbean, & more. Bass viol player. Avid reader.