Procedural vs query approaches

This is the latest in a series of posts dealing with displaying what packages are available for a given FreeBSD port.

The Packages – how is this data stored? post may help you follow along. It outlines the data and the tables used in this post.

Last night I managed to get the right data out for ports which have multiple flavors. I used a few procedures each of one calling another, eventually getting the results.

That effort has now been replaced with a single query.

In this post, we will be looking at the www/py-django-storages port, which has three flavors.

The query

This query came to me from RhodiumToad, who has a wide breadth of knowledge and is liberal with his help. What you see below has been adjusted since then, but it closely represents his delivery.

WITH 
  pkg AS (SELECT * FROM packages WHERE port_id = 28303)
SELECT abi.name AS abi, 
       pn.package_name,
       pkg.package_set,
       pkg.package_version 
  FROM abi 
       CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
       LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
 ORDER BY package_name, abi, package_set;

This blog post is mainly to document the above and go through it step by step. I’ll need this later because I know I won’t remember it. I also want to learn more about this query and apply it to future development.

I’m going to step through the query, outlining what each part does.

WITH pkg

A WITH query “provides a way to write auxiliary statements for use in a larger query”.

WITH 
  pkg AS (SELECT * FROM packages WHERE port_id = 28303)
SELECT abi.name AS abi, 
       pn.package_name,
       pkg.package_set,
       pkg.package_version 
  FROM abi 
       CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
       LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
 ORDER BY package_name, abi, package_set;

I think of it as a temporary table for this query. It is defining the data which will be used by the rest of the query.

I will refer to this later as the pkg table.

This is what we get:

freshports.dev=# SELECT * FROM packages WHERE port_id = 28303 ORDER BY package_name;
   id    | abi_id | port_id | package_version |     package_name     | branch_id | branch_name | package_set 
---------+--------+---------+-----------------+----------------------+-----------+-------------+-------------
 4373305 |      1 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4369003 |      1 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4340907 |      7 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4392074 |     10 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4388153 |     11 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4346130 |      6 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4382725 |      2 |   28303 | 1.8             | py27-django-storages |           |             | latest
 4349958 |      6 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4353741 |      4 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4380869 |      8 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4376671 |      8 |   28303 | 1.9.1           | py27-django-storages |           |             | latest
 4357905 |      4 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4339390 |      7 |   28303 | 1.5.1           | py27-django-storages |           |             | latest
 4361606 |      9 |   28303 | 1.5.1           | py27-django-storages |           |             | latest
 4363708 |      9 |   28303 | 1.9.1           | py27-django-storages |           |             | quarterly
 4339245 |      7 |   28303 | 1.5.1           | py36-django-storages |           |             | latest
 4361337 |      9 |   28303 | 1.5.1           | py36-django-storages |           |             | latest
 4369021 |      1 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4373197 |      1 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
 4365567 |      9 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
 4357806 |      4 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
 4376807 |      8 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4380779 |      8 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
 4353810 |      4 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4349931 |      6 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
 4383395 |      2 |   28303 | 1.8             | py37-django-storages |           |             | latest
 4388136 |     11 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4346104 |      6 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4392033 |     10 |   28303 | 1.9.1           | py37-django-storages |           |             | latest
 4342889 |      7 |   28303 | 1.9.1           | py37-django-storages |           |             | quarterly
(30 rows)

freshports.dev=# 

I sorted the data just to make it easier to follow.

The above represents all the packages available for www/py-django-storages from the FreeBSD repos.

How many different builds do we have for each package_name?

  SELECT package_name, 
         count(package_name) 
    FROM packages 
   WHERE port_id = 28303 
GROUP BY package_name 
ORDER BY package_name;

That query produces this result:

     package_name     | count 
----------------------+-------
 py27-django-storages |    15
 py36-django-storages |     2
 py37-django-storages |    13
(3 rows)

Those numbers will become relevant later.

SELECT DISTINCT

SELECT DISTINCT eliminates duplicate rows from the result. Don’t do what I initially did and use brackets around your first item. Consider SELECT DISTINCT to be a single keyword. It is not a function.

WITH 
  pkg AS (SELECT * FROM packages WHERE port_id = 28303)
SELECT abi.name AS abi, 
       pn.package_name,
       pkg.package_set,
       pkg.package_version 
  FROM abi 
       CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
       LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
 ORDER BY package_name, abi, package_set;

We need results for all packages on all ABI. To get there, we need a list of the packages first.

Combining that SELECT DISTINCT into the WITH clause, we get:

WITH pkg AS
  (SELECT * FROM packages WHERE port_id = 28303)
SELECT DISTINCT package_name FROM pkg;

     package_name     
----------------------
 py27-django-storages
 py36-django-storages
 py37-django-storages
(3 rows)

This is the pkg table.

All ABI * package combinations

We want to show people that a package is not available on a given ABI. We also want all tables to have the same number of rows so that scanning across the page compares like-to-like. For that, we want all combinations please.

abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn

Here is a list of all packages beside all ABI:

WITH pkg AS
  (SELECT * FROM packages WHERE port_id = 28303)
SELECT * FROM abi CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn

 id |        name        | active |     package_name     
----+--------------------+--------+----------------------
  1 | FreeBSD:12:amd64   | t      | py27-django-storages
  2 | FreeBSD:13:aarch64 | t      | py27-django-storages
  4 | FreeBSD:11:i386    | t      | py27-django-storages
  6 | FreeBSD:11:amd64   | t      | py27-django-storages
  7 | FreeBSD:11:aarch64 | t      | py27-django-storages
  8 | FreeBSD:12:i386    | t      | py27-django-storages
  9 | FreeBSD:12:aarch64 | t      | py27-django-storages
 10 | FreeBSD:13:i386    | t      | py27-django-storages
 11 | FreeBSD:13:amd64   | t      | py27-django-storages
  1 | FreeBSD:12:amd64   | t      | py36-django-storages
  2 | FreeBSD:13:aarch64 | t      | py36-django-storages
  4 | FreeBSD:11:i386    | t      | py36-django-storages
  6 | FreeBSD:11:amd64   | t      | py36-django-storages
  7 | FreeBSD:11:aarch64 | t      | py36-django-storages
  8 | FreeBSD:12:i386    | t      | py36-django-storages
  9 | FreeBSD:12:aarch64 | t      | py36-django-storages
 10 | FreeBSD:13:i386    | t      | py36-django-storages
 11 | FreeBSD:13:amd64   | t      | py36-django-storages
  1 | FreeBSD:12:amd64   | t      | py37-django-storages
  2 | FreeBSD:13:aarch64 | t      | py37-django-storages
  4 | FreeBSD:11:i386    | t      | py37-django-storages
  6 | FreeBSD:11:amd64   | t      | py37-django-storages
  7 | FreeBSD:11:aarch64 | t      | py37-django-storages
  8 | FreeBSD:12:i386    | t      | py37-django-storages
  9 | FreeBSD:12:aarch64 | t      | py37-django-storages
 10 | FreeBSD:13:i386    | t      | py37-django-storages
 11 | FreeBSD:13:amd64   | t      | py37-django-storages
(27 rows)

With 9 rows in the ABI table and 3 distinct package names we get 27 rows.

I will refer to this as the abi_x_package_name data.

LEFT JOIN

We have a list of all the

Now we do a LEFT JOIN. That gets us all the abi_x_package_name combinations we need with, or without, the corresponding package information.

LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)

This takes the entries from pkg and joins it with everything from the abi table.

This takes us to the entire query now:

WITH 
  pkg AS (SELECT * FROM packages WHERE port_id = 28303)
SELECT abi.name AS abi, 
       pn.package_name,
       pkg.package_set,
       pkg.package_version 
  FROM abi 
       CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
       LEFT JOIN pkg on (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
 ORDER BY package_name, abi, package_set;

        abi         | abi_id |     package_name     | package_set | package_version 
--------------------+--------+----------------------+-------------+-----------------
 FreeBSD:11:aarch64 |      7 | py27-django-storages | latest      | 1.5.1
 FreeBSD:11:aarch64 |      7 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:11:amd64   |      6 | py27-django-storages | latest      | 1.9.1
 FreeBSD:11:amd64   |      6 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:11:i386    |      4 | py27-django-storages | latest      | 1.9.1
 FreeBSD:11:i386    |      4 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:12:aarch64 |      9 | py27-django-storages | latest      | 1.5.1
 FreeBSD:12:aarch64 |      9 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:12:amd64   |      1 | py27-django-storages | latest      | 1.9.1
 FreeBSD:12:amd64   |      1 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:12:i386    |      8 | py27-django-storages | latest      | 1.9.1
 FreeBSD:12:i386    |      8 | py27-django-storages | quarterly   | 1.9.1
 FreeBSD:13:aarch64 |      2 | py27-django-storages | latest      | 1.8
 FreeBSD:13:amd64   |     11 | py27-django-storages | latest      | 1.9.1
 FreeBSD:13:i386    |     10 | py27-django-storages | latest      | 1.9.1
 FreeBSD:11:aarch64 |      7 | py36-django-storages | latest      | 1.5.1
 FreeBSD:11:amd64   |      6 | py36-django-storages |             | 
 FreeBSD:11:i386    |      4 | py36-django-storages |             | 
 FreeBSD:12:aarch64 |      9 | py36-django-storages | latest      | 1.5.1
 FreeBSD:12:amd64   |      1 | py36-django-storages |             | 
 FreeBSD:12:i386    |      8 | py36-django-storages |             | 
 FreeBSD:13:aarch64 |      2 | py36-django-storages |             | 
 FreeBSD:13:amd64   |     11 | py36-django-storages |             | 
 FreeBSD:13:i386    |     10 | py36-django-storages |             | 
 FreeBSD:11:aarch64 |      7 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:11:amd64   |      6 | py37-django-storages | latest      | 1.9.1
 FreeBSD:11:amd64   |      6 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:11:i386    |      4 | py37-django-storages | latest      | 1.9.1
 FreeBSD:11:i386    |      4 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:12:aarch64 |      9 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:12:amd64   |      1 | py37-django-storages | latest      | 1.9.1
 FreeBSD:12:amd64   |      1 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:12:i386    |      8 | py37-django-storages | latest      | 1.9.1
 FreeBSD:12:i386    |      8 | py37-django-storages | quarterly   | 1.9.1
 FreeBSD:13:aarch64 |      2 | py37-django-storages | latest      | 1.8
 FreeBSD:13:amd64   |     11 | py37-django-storages | latest      | 1.9.1
 FreeBSD:13:i386    |     10 | py37-django-storages | latest      | 1.9.1
(37 rows)

freshports.dev=#

From here, we start pivoting the data so we have latest and quarterly on the same line.

Pivot

By now, we have gone through the entire query. Next is the pivot.

WITH
  pkg AS (SELECT * FROM packages WHERE port_id = 28303)
SELECT abi.name AS abi,
       pn.package_name,
       max(pkg.package_version) FILTER (WHERE pkg.package_set = 'latest')
         AS package_version_latest,
       max(pkg.package_version) FILTER (WHERE pkg.package_set = 'quarterly')
         AS package_version_quarterly
  FROM abi
       CROSS JOIN (SELECT DISTINCT package_name FROM pkg) pn
       LEFT JOIN pkg ON (pkg.abi_id = abi.id AND pkg.package_name = pn.package_name)
 GROUP BY pn.package_name, abi.name
 ORDER BY pn.package_name, abi.name;

The filter controls which input rows are fed to the aggregate function max. I could have use min(), or any other similar function here because there is only one value which will match it. The GROUP BY clause is what controls this. For each package_name and abi.name combination, there will be 1 or two lines.

The output is the expected 27 lines of information:

        abi         |     package_name     | package_version_latest | package_version_quarterly 
--------------------+----------------------+------------------------+---------------------------
 FreeBSD:11:aarch64 | py27-django-storages | 1.5.1                  | 1.9.1
 FreeBSD:11:amd64   | py27-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:11:i386    | py27-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:12:aarch64 | py27-django-storages | 1.5.1                  | 1.9.1
 FreeBSD:12:amd64   | py27-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:12:i386    | py27-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:13:aarch64 | py27-django-storages | 1.8                    | 
 FreeBSD:13:amd64   | py27-django-storages | 1.9.1                  | 
 FreeBSD:13:i386    | py27-django-storages | 1.9.1                  | 
 FreeBSD:11:aarch64 | py36-django-storages | 1.5.1                  | 
 FreeBSD:11:amd64   | py36-django-storages |                        | 
 FreeBSD:11:i386    | py36-django-storages |                        | 
 FreeBSD:12:aarch64 | py36-django-storages | 1.5.1                  | 
 FreeBSD:12:amd64   | py36-django-storages |                        | 
 FreeBSD:12:i386    | py36-django-storages |                        | 
 FreeBSD:13:aarch64 | py36-django-storages |                        | 
 FreeBSD:13:amd64   | py36-django-storages |                        | 
 FreeBSD:13:i386    | py36-django-storages |                        | 
 FreeBSD:11:aarch64 | py37-django-storages |                        | 1.9.1
 FreeBSD:11:amd64   | py37-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:11:i386    | py37-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:12:aarch64 | py37-django-storages |                        | 1.9.1
 FreeBSD:12:amd64   | py37-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:12:i386    | py37-django-storages | 1.9.1                  | 1.9.1
 FreeBSD:13:aarch64 | py37-django-storages | 1.8                    | 
 FreeBSD:13:amd64   | py37-django-storages | 1.9.1                  | 
 FreeBSD:13:i386    | py37-django-storages | 1.9.1                  | 
(27 rows)

Why 27? 3 package names, 9 ABI.

Much better

This is a much better approach than the procedures I used before. I like it.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

Leave a Comment

Scroll to Top