PG Phriday: Broken Parts

Partitioning tables in Postgres can be an extremely risky endeavor. Unfortunately on many larger systems, it’s also essentially a requirement; the maximum size of a Postgres table is 32TB. This isn’t just because converting an existing table to a series of partitions is expensive or time consuming. We must consider how the query planner will react to the partitioned version of a table. There’s also the very real risk we will (or already have) implement flaws in the trigger or constraint logic.

Tools like pg_partman or pg_pathman are great for new systems, but existing deployments require retrofitting. So what happens if we have a large warehouse that’s already using a slightly flawed partition engine? And what other concerns remain even after achieving a pristine setup?

PG Phriday: Perfectly Logical

One of the things Postgres has been “missing” for a while is logical replication based on activity replay. Until fairly recently, in order to replicate single tables from one database to another, we had to encumber the table with performance-robbing triggers coupled to a third party daemon to manage transport. Those days might finally be behind us thanks to pglogical.

PG Phriday: Pesky Partition Plans

For all of those warehouse queries that never seem to complete before the heat death of the universe, there’s often a faster version. Sometimes this is due to a fundamental misunderstanding of how queries work, or how Postgres specifically functions. The trick is knowing when to back away slowly from an ugly but efficient query, and when to inject a flurry of predicates to fully illustrate the original intent of the query so the planner makes better decisions. When partitions are involved, this can be quite an adventure.

PG Phriday: Trusty Table Tiers

I always advocate breaking up large Postgres tables for a few reasons. Beyond query performance concerns, maintaining one monolithic structure is always more time consuming and consequentially more dangerous. The time required to create a dozen small indexes may be slightly longer than a single larger one, but we can treat the smaller indexes as incremental. If we want to rebuild, add more indexes, or fix any corruption, why advocate an all-or-nothing proposition? Deleting from one large table will be positively glacial compared to simply dropping an entire expired partition. The list just goes on and on.

PG Phriday: Parallel-O-Postgres

I wasn’t able to write an article last week due to an unexpected complication regarding tests I was running to verify its contents. So this week, it’s going to be extra special! Also long.

What’s the fastest way to load a Postgres table? If you believe the documentation, the COPY command is the best way to unceremoniously heave data into a table. Fortunately after all of our talk about partitions, our minds are primed and ready to think in chunks. Why restrict ourselves to one COPY, when we can summon an army?