Fun with SQL: Self joins

Written by Craig Kerstiens
January 2, 2019

Various families have various traditions in the US around Christmas time. Some will play games like white elephant where you get a mix of decent gifts as well as gag gifts... you then draw numbers and get to pick from existing presents that have been opened ("stealing" from someone else) or opening an up-opened one. The game is both entertaining to try to get something you want, but also stick Aunt Jennifer with the stuffed poop emoji with a Santa hat on it.

Other traditions are a bit simpler, one that my partner's family follows is drawing names for one person you buy a gift for. This is nice because you can put a bit of effort into that one person without having to be too overwhelmed in tracking down things for multiple people. Each year we draw names for the next year. And by now you're probably thinking what does any of this have to do with SQL? Well normally when we draw names we write them on a piece of paper, someone takes a picture, then that gets texted around to other family members. At least for me every October I'm scrolling back through text messages to try to recall who it was I'm supposed to buy for. This year I took a little time to put everyone's name in a SQL database and write a simple query for easier recall.

A simple but workable schema

I could very much over-engineer this and record things like who has who on what year to see how many repeats happen, etc. I really just want an easy place to save the data and not forget. For that reason I'm simplifying the schema as much as possible:

CREATE TABLE giftnames AS (
  id serial,
  name text,
  assignment int
);

From here I populated family members names first:

INSERT INTO giftnames (name) VALUES ('Craig Kerstiens');
INSERT INTO giftnames (name) VALUES ('Aunt Jennifer');
INSERT INTO giftnames (name) VALUES ('Uncle Joe');

Then as we draw names of who each person gets I'm going to go through and update those values within the table as well:

UPDATE giftnames set assignment = 2 where id = 1;
UPDATE giftnames set assignment = 3 where id = 2;
UPDATE giftnames set assignment = 1 where id = 3;

And now querying this is as simple as: sql SELECT * FROM giftnames

The only problem with this is I get an id back of who I'm supposed to buy a gift for, now a human readable name. Enter the simple, but handy feature of a self join. With a self join I can join a table against itself. With a self join you can name the tables and join on any condition you see fit. In this case I'm going to do a self join, but then format everything together as a string to give a very human readable list much like we used to do on pen and paper:

SELECT giver.name || ' is buying for ' || receiver.name
FROM giftnames as giver,
     giftnames as receiver
WHERE giver.assignment = receiver.id;

Even if you don't follow the exact same traditions as my family self joins may still be really useful. Self joins are handy when you have rows that need to reference other rows in that same table. This can be common when you think about reporting structures such as who is a manager of who, or categories where you have parent categories and an infinite number of sub-categories.

Craig Kerstiens

Written by Craig Kerstiens

Former Head of Cloud at Citus Data. Ran product at Heroku Postgres. Countless conference talks on Postgres & Citus. Loves bbq and football.