Performance

Recently a coworker asked me this question: Should I expect variance between minutes and hours for the same query? And I was forced to give him this answer: Potentially, but not commonly. Query planning is an inexact science, and regardless of the query being the “same query,” the data is not the “same data.” This isn’t generally the case, but on occasion, changes in data can affect the query execution path.

The Postgres query planner is house of cards built upon the ever-shifting sand of our data. It has the utterly impossible mission of converting our ridiculous and inane requests into a logical series of fetch, filter, sort, join, and other instructions. Then the resulting steps must be ruthlessly efficient or the execution phase could very well saturate every hardware resource available; Set Theory isn’t very forgiving. Forewarned is forearmed is very apt when applied to database query planners.

Partitioning tables in Postgres can be an extremely risky endeavor. Unfortunately on many larger systems, it’s also essentially a requirement; the maximum size of a Postgres table is 32TB. This isn’t just because converting an existing table to a series of partitions is expensive or time consuming. We must consider how the query planner will react to the partitioned version of a table. There’s also the very real risk we will (or already have) implement flaws in the trigger or constraint logic.

For all of those warehouse queries that never seem to complete before the heat death of the universe, there’s often a faster version. Sometimes this is due to a fundamental misunderstanding of how queries work, or how Postgres specifically functions. The trick is knowing when to back away slowly from an ugly but efficient query, and when to inject a flurry of predicates to fully illustrate the original intent of the query so the planner makes better decisions.

Like any programming language, the PL/pgSQL Postgres procedural language has plenty of handy control structures. Among those thankfully modern accoutrements is the humble EXCEPTION block. While not the more prevalent try/catch methodology, we can use BEGIN anywhere to start an embedded section for the same effect. Knowing this is the case, what are the costs of actually using them? Postgres is fairly streamlined, and as such, can take several shortcuts when there are no exception blocks within a chunk of PL/pgSQL.

Performance

PG Phriday: Planner Pitfalls

PG Phriday: Primal Planner Prep

PG Phriday: Broken Parts

PG Phriday: Pesky Partition Plans

PG Phriday: EXCEPTIONal Performance