Recently on the pgsql-performance mailing list, a question popped up regarding Postgres RAM usage. In this instance Pietro wondered why Postgres wasn’t using more RAM, and why his process was taking so long. There were a few insightful replies, and they’re each interesting for reasons that aren’t immediately obvious. Let’s see what is really going on here, and perhaps answer a question while we’re at it.
Not a lot of people remember what Postgres was like before version 8.4. In many ways, this was the first “modern” release of the database engine. CTEs, window functions, column-level permissions, in-place upgrade compatible with subsequent versions, collation support, continuous query statistic collection: it was just a smorgasbord of functionality.
Of these, CTEs, or Common Table Expressions, probably enjoy the most user-level exposure, and for good reason. Before 8.4, there was no way to perform a recursive query in Postgres, which really hurts in certain situations. Want to display all related child threads in an online discussion? How about fetching the components of an organization chart by following management assignments? Better get ready for a lot of queries in a loop.
In addition to that, complicated queries were difficult to logically simplify. Reporting queries are especially prone to frequent sequences of aggregates and subqueries. It’s not uncommon to build a query that’s several pages long in this kind of context. Optimizing such an unwieldy beast is often difficult or even impossible simply due to all of the components and confusing nesting.
CTEs changed these things for the better and, in the eyes of many, finally brought Postgres to parity with Oracle and its long-established recursive query support. So let’s explore what CTEs really deliver, and how they can improve our Postgres experience, caveats and all.
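To make the recursive case concrete, here is a minimal sketch of the organization chart example as a single recursive CTE instead of a loop of queries. It is only an illustration under a few assumptions: the `employee` table, its columns, the sample names, and the `org_chart` helper are all hypothetical, and it uses Python's built-in `sqlite3` module as a stand-in so it runs without a database server. The `WITH RECURSIVE` statement itself has the same shape in Postgres, though a Postgres driver would use a different parameter placeholder.

```python
import sqlite3


def org_chart(conn, root_id):
    """Return (depth, name) rows for root_id and everyone reporting beneath them."""
    query = """
        WITH RECURSIVE chart (id, name, depth, path) AS (
            -- Anchor row: the person at the top of the subtree.
            SELECT id, name, 0, name
              FROM employee
             WHERE id = ?
            UNION ALL
            -- Recursive step: anyone whose manager is already in the chart.
            SELECT e.id, e.name, c.depth + 1, c.path || '/' || e.name
              FROM employee e
              JOIN chart c ON e.manager_id = c.id
        )
        SELECT depth, name FROM chart ORDER BY path
    """
    return conn.execute(query, (root_id,)).fetchall()


# Hypothetical sample data: Alice manages Bob and Carol; Bob manages Dave.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Alice", None), (2, "Bob", 1), (3, "Carol", 1), (4, "Dave", 2)],
)

# Print the chart with two spaces of indentation per management level.
for depth, name in org_chart(conn, 1):
    print("  " * depth + name)
```

The `path` column exists only so each subtree sorts directly under its manager. Without it, the rows would still all come back from one statement, just not in chart order.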
Programming is fun. I love programming! Even though I changed my career from programming to database work, I still occasionally dabble in my former craft. As such, I believe I can say this with a fair amount of accuracy: programmers don’t understand databases. This isn’t something small, either; there’s a fundamental misunderstanding at play. Unless the coder happens to work primarily with graphics, bulk set-based transformations are not something they’ll generally work with.
A few months ago, Greg Smith of PostgreSQL fame suggested I submit a proposal to the new Postgres Open conference here in Chicago. Some of us residents of the Midwest have long waited for a PostgreSQL-related conference of our very own, and now the glorious day has finally arrived. I was asked to submit proposals to other conferences, but the travel involved quickly put me off; now I can be lazy and still help spread The Word.
Ah, query hints. For all those times when the database isn’t doing what you want, they’re a useful tool for forcing the query optimizer to perform your bidding. But in this case, not only is the road to Hell paved with good intentions, it’s paved with a frictionless slide directly into a wood chipper that empties handily into an active volcano. With query hints, be careful what you wish for, because—to the detriment of all you hold dear—you just might get it.