A Poor Man's Column Oriented Database in PostgreSQL

Let's get this out of the way. If you need a real column oriented database, use one. Don't even think about using the insanity below in production.

Having said that, I've been wondering for a while whether it would be possible to demonstrate the properties of a column oriented db by simulating one using only native functionality found in good old, row oriented PostgreSQL.

Two properties of a column oriented database that make it beneficial in OLAP environments are:

  1. For queries that involve only a subset of a table's columns, it only needs to read the data for those columns off disk, and no others, saving on IO
  2. Storing each column separately means it can compress that data better since it's all of the same type, further reducing IO

For queries that aggregate data over a large fraction of the table, this reduction in IO translates directly into better performance.

PostgreSQL, of course, is row oriented. It stores and reads data a whole row at a time, which is great when you want to access most of a table's columns for a small percentage of its rows, i.e. OLTP-type queries.

The challenge will be: can we structure things so PostgreSQL behaves more like a column oriented db?

Strategy

The strategy will be to use multiple tables behind the scenes to store the data for each column individually, and throw a view on top to tie them all together and make it look like a single table. If PostgreSQL's query planner is smart enough, it should be able to see that queries involving only a subset of columns only need to access those particular underlying tables, save on IO, and beat out a traditional PostgreSQL table of the same shape. We can even allow for modifying this pseudo-table by creating INSTEAD OF triggers on the view to take INSERT, UPDATE, DELETE statements, chop them up, and perform the necessary operations on the underlying tables.

Here's how the underlying tables will look. We'll have a single 'primary table' that contains only the primary key...

                            Table "public.primary_table"
 Column |  Type   | Collation | Nullable |                  Default
--------+---------+-----------+----------+-------------------------------------------
 id     | integer |           | not null | nextval('primary_table_id_seq'::regclass)
Indexes:
    "primary_table_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "reference_table_1" CONSTRAINT "reference_table_1_pt_id_fkey" FOREIGN KEY (pt_id) REFERENCES primary_table(id) ON UPDATE CASCADE ON DELETE CASCADE
    TABLE "reference_table_2" CONSTRAINT "reference_table_2_pt_id_fkey" FOREIGN KEY (pt_id) REFERENCES primary_table(id) ON UPDATE CASCADE ON DELETE CASCADE
    ... however many other tables we need to use as columns

And several other tables that represent each column. Each reference table will have two columns, one that stores the actual value of the column, and another to point back to the primary table. The foreign key back to the primary table is how we'll be able to reconstruct a logical row of the pseudo table. A reference table looks like this...

join_test=> \d reference_table_1
              Table "public.reference_table_1"
 Column |       Type       | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
 pt_id  | integer          |           |          |
 value  | double precision |           |          |
Indexes:
    "reference_table_1_pt_id_key" UNIQUE CONSTRAINT, btree (pt_id)
    "reference_table_1_value_idx" btree (value)
Foreign-key constraints:
    "reference_table_1_pt_id_fkey" FOREIGN KEY (pt_id) REFERENCES primary_table(id) ON UPDATE CASCADE ON DELETE CASCADE

join_test=>

Our view that ties them together will look like the one below. Note the use of LEFT JOINs. I used INNER JOINs at first, but of course that doesn't let PostgreSQL avoid querying each and every table, since by definition an INNER JOIN has to confirm a matching row exists in every table, even if that table's value never appears in the select list.

CREATE VIEW combined AS
SELECT
    pt.id,
    rt_1.value AS value_1,
    rt_2.value AS value_2,
    ... however many columns
FROM
    primary_table pt LEFT JOIN
    reference_table_1 rt_1 ON
        pt.id = rt_1.pt_id LEFT JOIN
    reference_table_2 rt_2 ON
        pt.id = rt_2.pt_id
    ... continue LEFT JOIN'ing however many columns;
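
That's also where the UNIQUE constraint on pt_id earns its keep. PostgreSQL's join removal optimization will only drop a LEFT JOIN when it can prove the join can't duplicate or filter rows, which requires the inner side to be unique on the join column. Here's a minimal standalone sketch (throwaway table names, nothing to do with the schema above):

CREATE TABLE a (id integer PRIMARY KEY);
CREATE TABLE b (a_id integer UNIQUE REFERENCES a (id), value double precision);

-- No column of b appears in the query and b.a_id is unique, so the
-- planner can remove the join entirely; the plan should show only a
-- scan of "a", with "b" nowhere in sight.
EXPLAIN SELECT a.id FROM a LEFT JOIN b ON a.id = b.a_id;

Drop the unique constraint on b.a_id and the scan of b reappears, since a duplicated a_id could multiply the result rows.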

To actually see the poor man's column oriented version win out, one of the things I had to do was use a lot of columns. Like 100. PostgreSQL is, after all, not really meant for this and it's a case of "It's not that the bear dances well, but that it dances at all." 100 columns is a lot of typing, so what follows are the plpgsql functions I made to create the necessary primary table, reference tables, view, functions for the INSTEAD OF triggers, and finally, something to create an ordinary PostgreSQL table to compare against.

Primary table creator:

DROP FUNCTION IF EXISTS create_primary_table(integer);
CREATE FUNCTION create_primary_table(rows integer) RETURNS void AS $function_text$
BEGIN
    DROP TABLE IF EXISTS primary_table CASCADE;
    CREATE TABLE primary_table (
        id serial primary key
    );

    INSERT INTO primary_table (id)
    SELECT
        nextval('primary_table_id_seq')
    FROM
        generate_series(1, rows);
END;
$function_text$ LANGUAGE plpgsql;

Reference table creator:

DROP FUNCTION IF EXISTS create_reference_tables(integer, integer, boolean);
CREATE FUNCTION create_reference_tables(tables integer, rows integer, create_indexes boolean) RETURNS void AS $function_text$
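-- Note: the rows parameter is unused; the row count comes from whatever is already in primary_table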
BEGIN
    FOR i IN 1..tables LOOP
        EXECUTE 'DROP TABLE IF EXISTS reference_table_' || i || ' CASCADE;';

        RAISE NOTICE 'Creating and inserting into table...';

        EXECUTE format($$
            CREATE TABLE reference_table_%1$s (
                pt_id integer unique references primary_table (id) ON UPDATE CASCADE ON DELETE CASCADE,
                value double precision
            );

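            -- ORDER BY random() below shuffles the physical row order, so
            -- values aren't stored on disk in pt_id order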
            INSERT INTO reference_table_%1$s (pt_id, value)
            SELECT
                id,
                random()
            FROM
                primary_table
            ORDER BY
                random();
        $$, i);

        IF create_indexes THEN
            RAISE NOTICE 'Creating index...';
            EXECUTE 'CREATE INDEX ON reference_table_' || i || ' (value);';
        END IF;
        RAISE NOTICE 'Done creating table and index if necessary';
    END LOOP;
END;
$function_text$ LANGUAGE plpgsql;

View creator:

DROP FUNCTION IF EXISTS create_combined_view(integer);
CREATE FUNCTION create_combined_view(reference_tables integer) RETURNS void AS $function_text$
DECLARE
    join_list text;
    column_select_list text;
BEGIN
    SELECT
        string_agg($$,
                rt_$$ || gs || '.value AS value_' || gs, '' ORDER BY gs)
    INTO column_select_list
    FROM
        generate_series(1, reference_tables) AS gs;

    SELECT
        string_agg($$ LEFT JOIN
                reference_table_$$ || gs || ' AS rt_' || gs || $$ ON
                     pt.id = rt_$$ || gs || '.pt_id', '' ORDER BY gs)
    INTO join_list
    FROM
        generate_series(1, reference_tables) AS gs;

    DROP VIEW IF EXISTS combined;
    EXECUTE format($$
        CREATE VIEW combined AS
        SELECT
            pt.id%1$s
        FROM
            primary_table AS pt%2$s;
    $$, column_select_list, join_list);
END;
$function_text$ LANGUAGE plpgsql;

INSTEAD OF trigger functions:

CREATE OR REPLACE FUNCTION chop_up_insert() RETURNS trigger AS $$
DECLARE
    insert_text text;
    new_id INT;
BEGIN
    INSERT INTO primary_table (id) VALUES (DEFAULT) RETURNING id INTO new_id;

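    -- Build one INSERT per reference table; in the EXECUTE below, $1 is the
    -- newly generated primary key and $2 is the NEW row, so $2.value_N picks
    -- out the Nth column value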
    SELECT
        string_agg('INSERT INTO reference_table_' || gs || ' (pt_id, value) VALUES ($1, $2.value_' || gs || ');', ' ')
    INTO insert_text
    FROM
        generate_series(1, 100) AS gs; -- 100 must match number_of_columns

    EXECUTE insert_text USING new_id, NEW;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION chop_up_delete() RETURNS trigger AS $$
BEGIN
    DELETE FROM primary_table WHERE id = OLD.id;

    RETURN OLD;
END;
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION chop_up_update() RETURNS trigger AS $$
DECLARE
    delete_text text;
    insert_text text;
BEGIN
    SELECT
        string_agg('DELETE FROM reference_table_' || gs || ' WHERE pt_id = $1;', ' ')
    INTO delete_text
    FROM
        generate_series(1, 100) AS gs; -- 100 must match number_of_columns

    EXECUTE delete_text USING OLD.id;

    SELECT
        string_agg('INSERT INTO reference_table_' || gs || ' (pt_id, value) VALUES ($1, $2.value_' || gs || ');', ' ')
    INTO insert_text
    FROM
        generate_series(1, 100) AS gs;

    EXECUTE insert_text USING OLD.id, NEW;

    IF NEW.id IS DISTINCT FROM OLD.id THEN
        UPDATE primary_table SET id = NEW.id WHERE id = OLD.id;
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Traditional table creator:

DROP FUNCTION IF EXISTS create_traditional_table(integer, integer);
CREATE FUNCTION create_traditional_table(columns integer, rows integer) RETURNS void AS $function_text$
DECLARE
    column_text text;
BEGIN
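    -- Each generated column gets "default random()", so the bare INSERT
    -- below fills every value column without having to name them all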
    SELECT
        string_agg(', value_' || gs || $$ double precision default random() $$, ' ')
    INTO column_text
    FROM
        generate_series(1, columns) AS gs;

    -- Create traditional table
    RAISE NOTICE 'Creating traditional table...';
    DROP TABLE IF EXISTS traditional_table CASCADE;
    EXECUTE format($$
        CREATE TABLE traditional_table (
            id serial primary key
            %1$s
        );
    $$, column_text);

    -- Insert traditional table rows
    EXECUTE format($$
        INSERT INTO traditional_table (id)
        SELECT
            nextval('traditional_table_id_seq')
        FROM
            generate_series(1, %1$s);
    $$, rows);
END;
$function_text$ LANGUAGE plpgsql;

Call our functions to actually create the data, run VACUUM ANALYZE to gather stats, and install the triggers:

\set number_of_columns 100
\set number_of_rows 4000000

SELECT create_primary_table(:number_of_rows);
SELECT create_reference_tables(:number_of_columns, :number_of_rows, True);
SELECT create_combined_view(:number_of_columns);
SELECT create_traditional_table(:number_of_columns, :number_of_rows);
VACUUM ANALYZE;

CREATE TRIGGER chop_up_insert INSTEAD OF INSERT ON combined FOR EACH ROW EXECUTE PROCEDURE chop_up_insert();
CREATE TRIGGER chop_up_delete INSTEAD OF DELETE ON combined FOR EACH ROW EXECUTE PROCEDURE chop_up_delete();
CREATE TRIGGER chop_up_update INSTEAD OF UPDATE ON combined FOR EACH ROW EXECUTE PROCEDURE chop_up_update();

The other thing I had to do was use storage that wasn't the blindingly fast NVMe in my desktop computer. With the NVMe storage, it was simply always too fast, and didn't create enough of an IO penalty for the traditional table to ever be slower, regardless of how many columns I used in the test.

Test setup
  • RDS t2.micro
  • PostgreSQL 10.4
  • Memory: 1 GB
  • Disk: General Purpose SSD, 400 GB
  • 100 columns
  • 4,000,000 rows

The important thing is that the 1 GB of memory was never going to be enough to cache everything, ensuring we'd be going to disk when testing.
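
If you want to sanity check that, relation sizes are a quick way to see how much data each side has to read:

-- Total on-disk size, including indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('traditional_table'));
SELECT pg_size_pretty(pg_total_relation_size('reference_table_5'));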

Results

Traditional table:

join_test=>
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT
        sum(value_5)
    FROM
        traditional_table;
                                                                        QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=466278.58..466278.59 rows=1 width=8) (actual time=55886.976..55886.976 rows=1 loops=1)
   Buffers: shared hit=29 read=148235
   ->  Gather  (cost=466278.36..466278.57 rows=2 width=8) (actual time=55884.788..55886.970 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=29 read=148235
         ->  Partial Aggregate  (cost=465278.36..465278.37 rows=1 width=8) (actual time=55880.498..55880.498 rows=1 loops=3)
               Buffers: shared hit=192 read=444253
               ->  Parallel Seq Scan on traditional_table  (cost=0.00..461111.69 rows=1666669 width=8) (actual time=1.262..55498.732 rows=1333333 loops=3)
                     Buffers: shared hit=192 read=444253
 Planning time: 0.083 ms
 Execution time: 55888.540 ms
(12 rows)

Time: 55972.345 ms (00:55.972)

Poor man's column store view results:

join_test=>
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT
        sum(value_5)
    FROM
        combined;
                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=224283.22..224283.23 rows=1 width=8) (actual time=17190.290..17190.290 rows=1 loops=1)
   Buffers: shared hit=13769 read=13820, temp read=19458 written=19332
   ->  Gather  (cost=224283.01..224283.22 rows=2 width=8) (actual time=17180.375..17190.283 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=13769 read=13820, temp read=19458 written=19332
         ->  Partial Aggregate  (cost=223283.01..223283.02 rows=1 width=8) (actual time=17172.057..17172.058 rows=1 loops=3)
               Buffers: shared hit=43429 read=39287, temp read=58242 written=57864
               ->  Hash Left Join  (cost=131154.00..219116.34 rows=1666667 width=8) (actual time=8965.846..16261.048 rows=1333333 loops=3)
                     Hash Cond: (pt.id = rt_5.pt_id)
                     Buffers: shared hit=43429 read=39287, temp read=58242 written=57864
                     ->  Parallel Seq Scan on primary_table pt  (cost=0.00..34366.67 rows=1666667 width=4) (actual time=2.355..1211.905 rows=1333333 loops=3)
                           Buffers: shared hit=98 read=17602
                     ->  Hash  (cost=61622.00..61622.00 rows=4000000 width=12) (actual time=8960.659..8960.659 rows=4000000 loops=3)
                           Buckets: 131072  Batches: 64  Memory Usage: 3718kB
                           Buffers: shared hit=43181 read=21685, temp written=46044
                           ->  Seq Scan on reference_table_5 rt_5  (cost=0.00..61622.00 rows=4000000 width=12) (actual time=0.009..4658.662 rows=4000000 loops=3)
                                 Buffers: shared hit=43181 read=21685
 Planning time: 4.547 ms
 Execution time: 17190.455 ms
(20 rows)

Time: 17370.370 ms (00:17.370)
join_test=>

And there we go. PostgreSQL is smart enough to query only the underlying tables it needs to, and it completes in less than a third of the time of the traditional version. Certainly a contrived example, but that it works at all is pretty neat IMO.

Another thing we can do is look at selecting, inserting, updating, and deleting just one row from the traditional table, versus from the view. If you've used Redshift before, you may have noticed that selecting just one row is slower than you'd expect coming from a row oriented db, and the same goes for other single row operations. It makes sense why, though, when you consider that Redshift has to find each piece of the row in a different place and assemble them back together. Plus, it may have to decompress a block of data for each column before it can pick out the needed value, slowing it down further. We're not compressing the data, but we do need to assemble rows back together. Let's see how the traditional table compares to our hacked up monstrosity.

Selecting one row from the traditional table:

join_test=> explain analyze select * from traditional_table where id = 2000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using traditional_table_pkey on traditional_table  (cost=0.43..8.45 rows=1 width=804) (actual time=0.018..0.019 rows=1 loops=1)
   Index Cond: (id = 2000000)
 Planning time: 0.168 ms
 Execution time: 0.054 ms
(4 rows)

join_test=>

Selecting one row from the poor man's column store view:

join_test=> explain analyze select * from combined where id = 2000000;
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=43.43..850.45 rows=1 width=804) (actual time=1.102..1.146 rows=1 loops=1)
   Join Filter: (pt.id = rt_100.pt_id)
   ->  Nested Loop Left Join  (cost=43.00..841.99 rows=1 width=796) (actual time=1.079..1.122 rows=1 loops=1)
         Join Filter: (pt.id = rt_99.pt_id)
         ->  Nested Loop Left Join  (cost=42.57..833.53 rows=1 width=788) (actual time=1.069..1.112 rows=1 loops=1)
               Join Filter: (pt.id = rt_98.pt_id)
               ->  Nested Loop Left Join  (cost=42.14..825.07 rows=1 width=780) (actual time=1.059..1.101 rows=1 loops=1)
                     Join Filter: (pt.id = rt_97.pt_id)
                     ->  Nested Loop Left Join  (cost=41.71..816.61 rows=1 width=772) (actual time=1.048..1.090 rows=1 loops=1)
                           Join Filter: (pt.id = rt_96.pt_id)

                          ... 95 other joins omitted...

                          ->  Index Scan using reference_table_96_pt_id_key on reference_table_96 rt_96  (cost=0.43..8.45 rows=1 width=12) (actual time=0.014..0.014 rows=1 loops=1)
                                 Index Cond: (pt_id = 2000000)
                     ->  Index Scan using reference_table_97_pt_id_key on reference_table_97 rt_97  (cost=0.43..8.45 rows=1 width=12) (actual time=0.009..0.009 rows=1 loops=1)
                           Index Cond: (pt_id = 2000000)
               ->  Index Scan using reference_table_98_pt_id_key on reference_table_98 rt_98  (cost=0.43..8.45 rows=1 width=12) (actual time=0.009..0.009 rows=1 loops=1)
                     Index Cond: (pt_id = 2000000)
         ->  Index Scan using reference_table_99_pt_id_key on reference_table_99 rt_99  (cost=0.43..8.45 rows=1 width=12) (actual time=0.008..0.008 rows=1 loops=1)
               Index Cond: (pt_id = 2000000)
   ->  Index Scan using reference_table_100_pt_id_key on reference_table_100 rt_100  (cost=0.43..8.45 rows=1 width=12) (actual time=0.022..0.022 rows=1 loops=1)
         Index Cond: (pt_id = 2000000)
 Planning time: 69.179 ms
 Execution time: 3.576 ms
(405 rows)

join_test=>

Inserting into the traditional table...

join_test=> explain analyze insert into traditional_table (value_1) values (10);
                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Insert on traditional_table  (cost=0.00..0.26 rows=1 width=804) (actual time=7.081..7.081 rows=0 loops=1)
   ->  Result  (cost=0.00..0.26 rows=1 width=804) (actual time=5.486..5.486 rows=1 loops=1)
 Planning time: 0.091 ms
 Execution time: 7.184 ms
(4 rows)

join_test=>

Inserting into the view...

join_test=> explain analyze insert into combined (value_1) values (10);
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Insert on combined  (cost=0.00..0.01 rows=1 width=804) (actual time=64.185..64.185 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=804) (actual time=0.002..0.002 rows=1 loops=1)
 Planning time: 0.077 ms
 Trigger chop_up_insert: time=64.142 calls=1
 Execution time: 64.240 ms
(5 rows)

join_test=> select id, value_1, value_2, value_3, value_4, value_5 from combined where id = 4000001;
   id    | value_1 | value_2 | value_3 | value_4 | value_5
---------+---------+---------+---------+---------+---------
 4000001 |      10 |         |         |         |
(1 row)

join_test=>

Updating the traditional table...

join_test=> EXPLAIN ANALYZE update traditional_table set value_3 = 30 where id = 3000000;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Update on traditional_table  (cost=0.43..8.45 rows=1 width=810) (actual time=0.044..0.044 rows=0 loops=1)
   ->  Index Scan using traditional_table_pkey on traditional_table  (cost=0.43..8.45 rows=1 width=810) (actual time=0.024..0.025 rows=1 loops=1)
         Index Cond: (id = 3000000)
 Planning time: 0.174 ms
 Execution time: 0.108 ms
(5 rows)

join_test=>

Updating the view...

join_test=> EXPLAIN ANALYZE update combined set value_3 = 30 where id = 3000000;
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on combined  (cost=43.43..854.45 rows=1 width=1442) (actual time=2.984..2.984 rows=0 loops=1)
   ->  Nested Loop Left Join  (cost=43.43..854.45 rows=1 width=1442) (actual time=0.952..1.009 rows=1 loops=1)
         Join Filter: (pt.id = rt_100.pt_id)
         ->  Nested Loop Left Join  (cost=43.00..845.99 rows=1 width=1396) (actual time=0.939..0.996 rows=1 loops=1)
               Join Filter: (pt.id = rt_99.pt_id)
               ->  Nested Loop Left Join  (cost=42.57..837.53 rows=1 width=1382) (actual time=0.931..0.986 rows=1 loops=1)
                     Join Filter: (pt.id = rt_98.pt_id)

                     ... 97 other joins omitted ...

                     ->  Index Scan using reference_table_98_pt_id_key on reference_table_98 rt_98  (cost=0.43..8.45 rows=1 width=18) (actual time=0.006..0.006 rows=1 loops=1)
                           Index Cond: (pt_id = 3000000)
               ->  Index Scan using reference_table_99_pt_id_key on reference_table_99 rt_99  (cost=0.43..8.45 rows=1 width=18) (actual time=0.007..0.007 rows=1 loops=1)
                     Index Cond: (pt_id = 3000000)
         ->  Index Scan using reference_table_100_pt_id_key on reference_table_100 rt_100  (cost=0.43..8.45 rows=1 width=18) (actual time=0.006..0.006 rows=1 loops=1)
               Index Cond: (pt_id = 3000000)
 Planning time: 70.304 ms
 Trigger chop_up_update: time=1.964 calls=1
 Execution time: 6.022 ms
(406 rows)

join_test=> select id, value_1, value_2, value_3, value_4, value_5 from combined where id = 3000000;
   id    |      value_1      |      value_2      | value_3 |      value_4      |      value_5
---------+-------------------+-------------------+---------+-------------------+-------------------
 3000000 | 0.881429163739085 | 0.836008816491812 |      30 | 0.464126270730048 | 0.137938762549311
(1 row)

join_test=>

Deleting from the traditional table...

join_test=> EXPLAIN ANALYZE delete from traditional_table where id = 1000000;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on traditional_table  (cost=0.43..8.45 rows=1 width=6) (actual time=1.126..1.126 rows=0 loops=1)
   ->  Index Scan using traditional_table_pkey on traditional_table  (cost=0.43..8.45 rows=1 width=6) (actual time=1.106..1.108 rows=1 loops=1)
         Index Cond: (id = 1000000)
 Planning time: 0.082 ms
 Execution time: 1.153 ms
(5 rows)

join_test=>

Deleting from the view...

join_test=> EXPLAIN ANALYZE delete from combined where id = 1000000;
                                                                                 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on combined  (cost=43.43..854.45 rows=1 width=638) (actual time=62.541..62.541 rows=0 loops=1)
   ->  Nested Loop Left Join  (cost=43.43..854.45 rows=1 width=638) (actual time=0.936..1.021 rows=1 loops=1)
         Join Filter: (pt.id = rt_100.pt_id)
         ->  Nested Loop Left Join  (cost=43.00..845.99 rows=1 width=1396) (actual time=0.918..1.001 rows=1 loops=1)
               Join Filter: (pt.id = rt_99.pt_id)
               ->  Nested Loop Left Join  (cost=42.57..837.53 rows=1 width=1382) (actual time=0.910..0.992 rows=1 loops=1)
                     Join Filter: (pt.id = rt_98.pt_id)

                     ... 97 other joins omitted ...

                     ->  Index Scan using reference_table_98_pt_id_key on reference_table_98 rt_98  (cost=0.43..8.45 rows=1 width=18) (actual time=0.008..0.008 rows=1 loops=1)
                           Index Cond: (pt_id = 1000000)
               ->  Index Scan using reference_table_99_pt_id_key on reference_table_99 rt_99  (cost=0.43..8.45 rows=1 width=18) (actual time=0.006..0.006 rows=1 loops=1)
                     Index Cond: (pt_id = 1000000)
         ->  Index Scan using reference_table_100_pt_id_key on reference_table_100 rt_100  (cost=0.43..8.45 rows=1 width=18) (actual time=0.012..0.012 rows=1 loops=1)
               Index Cond: (pt_id = 1000000)
 Planning time: 72.267 ms
 Trigger chop_up_delete: time=61.510 calls=1
 Execution time: 65.565 ms
(406 rows)

join_test=>

A couple of things jump out at me: 1) the traditional table beats the pants off the view on single row operations (unsurprisingly), and 2) operations on the view often spend most of their time in the planning stage rather than execution, which makes sense when you remember the planner has 100 joins to think about. Kind of interesting.
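
If that planning overhead ever mattered for repeated single row lookups, the usual mitigation would be a prepared statement, since PostgreSQL can switch to a cached generic plan after a handful of executions instead of re-planning the 100-way join every time. A sketch:

PREPARE combined_by_id (integer) AS
    SELECT * FROM combined WHERE id = $1;

-- Subsequent lookups can reuse the cached plan
EXECUTE combined_by_id(2000000);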

Conclusion

What kind of conclusions can we draw from this experiment? Certainly not that column oriented databases are better than row oriented ones, or vice versa, since they each have their strengths. To me it means the PostgreSQL developers have built a system that is robust and flexible enough that (when the stars align and we control all the variables just right) something like this can work at all, and that's pretty neat.