Maintaining a Postgres database can involve a lot of busywork. This is especially true for more robust architectures that allocate at least one replica for failover purposes. It’s still fairly common for a DBA to create a replica to accommodate emergency or upgrade scenarios, only to have to repeat the process when it came time to revert to the original master system. It’s not safe to simply subscribe the original primary to the newly promoted secondary, so this leaves either creating a new clone, or using
rsync to synchronize all of the files first.
Postgres has been lacking something for quite a while, and more than a few people have attempted to alleviate the missing functionality multiple times. I’m speaking of course, about parallel queries. There are several reasons for this, and among them include various distribution and sharding needs for large data sets. When tables start to reach hundreds of millions, or even billions of rows, even high cardinality indexes produce results very slowly.
I wasn’t able to write an article last week due to an unexpected complication regarding tests I was running to verify its contents. So this week, it’s going to be extra special! Also long.
What’s the fastest way to load a Postgres table? If you believe the documentation, the
COPY command is the best way to unceremoniously heave data into a table. Fortunately after all of our talk about partitions, our minds are primed and ready to think in chunks. Why restrict ourselves to one
COPY, when we can summon an army?
I’ve been talking about partitions a lot recently, and I’ve painted them in a very positive light. Postgres partitions are a great way to distribute data along a logical grouping and work best when data is addressed in a fairly isloated manner. But what happens if we direct a basic query at a partitioned table in such a way that we ignore the allocation scheme? Well, what happens isn’t pretty. Let’s explore in more detail.
This PG Phriday is going to be a bit different. During my trip to Postgres Open this year, I attended a talk I had originally written off as “some Red Hat stuff.” But I saw the word “containers” in the PostgreSQL in Containers at Scale talk and became intrigued. A few days later, I had something of an epiphany: I’ve been wrong about servers for years; we all have.