Partitioning tables in Postgres can be an extremely risky endeavor. Unfortunately on many larger systems, it’s also essentially a requirement; the maximum size of a Postgres table is 32TB. This isn’t just because converting an existing table to a series of partitions is expensive or time consuming. We must consider how the query planner will react to the partitioned version of a table. There’s also the very real risk we will (or already have) implement flaws in the trigger or constraint logic.
One of the things Postgres has been “missing” for a while is logical replication based on activity replay. Until fairly recently, in order to replicate single tables from one database to another, we had to encumber the table with performance-robbing triggers coupled to a third party daemon to manage transport. Those days might finally be behind us thanks to pglogical. But is it easy to use? Let’s give it a try on our trusty sensor_log table.
For all of those warehouse queries that never seem to complete before the heat death of the universe, there’s often a faster version. Sometimes this is due to a fundamental misunderstanding of how queries work, or how Postgres specifically functions. The trick is knowing when to back away slowly from an ugly but efficient query, and when to inject a flurry of predicates to fully illustrate the original intent of the query so the planner makes better decisions.
I always advocate breaking up large Postgres tables for a few reasons. Beyond query performance concerns, maintaining one monolithic structure is always more time consuming and consequentially more dangerous. The time required to create a dozen small indexes may be slightly longer than a single larger one, but we can treat the smaller indexes as incremental. If we want to rebuild, add more indexes, or fix any corruption, why advocate an all-or-nothing proposition?
I wasn’t able to write an article last week due to an unexpected complication regarding tests I was running to verify its contents. So this week, it’s going to be extra special! Also long. What’s the fastest way to load a Postgres table? If you believe the documentation, the COPY command is the best way to unceremoniously heave data into a table. Fortunately after all of our talk about partitions, our minds are primed and ready to think in chunks.