PG Phriday: Everything in Common

Not a lot of people remember what Postgres was like before version 8.4. In many ways, this was the first “modern” release of the database engine. CTEs, Window Functions, column level permissions, in-place upgrade compatible with subsequent versions, collation support, continuous query statistic collection; it was just a smorgasbord of functionality.

Of these, CTEs or Common Table Expressions, probably enjoy the most user-level exposure; for good reason. Before this, there was no way to perform a recursive query in Postgres, which really hurts in certain situations. Want to display all related child threads in an online discussion? How about fetching the components of an organization chart by following management assignments? Better get ready for a lot of queries in a loop.

In addition to that, complicated queries were difficult to logically simplify. Reporting queries are especially prone to frequent sequences of aggregates and subqueries. It’s not uncommon to build a query that’s several pages long in this kind of context. Optimizing such an unwieldy beast is often difficult or even impossible simply due to all of the components and confusing nesting.

CTEs changed these things for the better and in the eyes of many, finally brought Postgres to parity with Oracle and its long-established recursive query support. So let’s explore what CTEs really deliver, and how they can improve our Postgres experience—caveats and all.

Trumping the PostgreSQL Query Planner

With the release of PostgreSQL 8.4, the community gained the ability to use CTE syntax. As such, this is a fairly old feature, yet it’s still misunderstood in a lot of ways. At the same time, the query planner has been advancing incrementally since that time. Most recently, PostgreSQL has gained the ability to perform index-only scans, making it possible to fetch results straight from the index, without confirming rows with the table data.

Unfortunately, this still isn’t enough. There are still quite a few areas where the PostgreSQL query planner is extremely naive, despite the advances we’ve seen recently. For instance, PostgreSQL still can’t do a basic loose index scan natively. It has to be tricked by using CTE syntax.