The Postgres developers recently announced the availability of the first public beta for Postgres 9.6. I would be highly remiss to ignore such an opportunity to dig into any interesting functionality listed in the 9.6 release notes. All in all, it’s a pretty exciting series of advancements, and assuming this is a glimpse of what we’ll see when 9.6 drops, I’d say we’re on the right track.
Plentiful Parallelism
Probably the most high-profile addition for 9.
Having run into a bit of a snag with Postgres-XL, and not wanting to be dead in the water with our project, I went on a bit of a knowledge quest. Database scaling is hard, so I expected a bunch of either abandoned or proprietary approaches. In addition, as huge fans of Postgres, we made compatibility with, or outright use of, the Postgres core a strict prerequisite.
So, what options are out there?
A couple days ago, Robert Haas announced that he checked in the first iteration of parallel sequential scans in the Postgres 9.6 branch. And no, that’s not a typo. One of the great things about the Postgres devs is that they have a very regimented system of feature freezes to help ensure timely releases. Thus, even though 9.5 just released its second beta, they’re already working on 9.6.
So what is a sequential scan, and why does this matter?
Postgres has been lacking something for quite a while, and more than a few people have attempted to fill the gap. I’m speaking, of course, about parallel queries. There are several reasons for wanting them, among them the various distribution and sharding needs of large data sets. When tables start to reach hundreds of millions, or even billions of rows, even high cardinality indexes produce results very slowly.
PostgreSQL hates itself. I’m convinced of this, and have considered the idea frequently over the years. I roll it around in my mouth just to savor the taste, only to hope the flavor changes eventually. A couple things have advanced, though not quite what one might hope.
Parallel Restore
With the introduction of PostgreSQL 8.4, the core developers have finally succumbed to the availability of multiple-CPU systems, and added parallel restores.
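For a sense of what a parallel restore looks like in practice, here is a minimal sketch that builds a `pg_restore` invocation using its `--jobs` option. The helper name, the dump file `mydb.dump`, and the database `mydb` are all hypothetical, and the sketch assumes the dump was made in an archive format that supports parallel restore (such as the custom format from `pg_dump -Fc`).

```python
import shlex


def build_parallel_restore_command(dump_path, dbname, jobs=4):
    """Return an argv list for a pg_restore run using several jobs.

    Hypothetical helper for illustration only; it builds the command
    and does not execute anything.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return [
        "pg_restore",
        "--jobs", str(jobs),   # number of concurrent restore jobs
        "--dbname", dbname,    # target database to restore into
        dump_path,             # archive file (assumed custom format)
    ]


if __name__ == "__main__":
    # Placeholder file and database names.
    cmd = build_parallel_restore_command("mydb.dump", "mydb", jobs=4)
    print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` as-is. How many jobs make sense depends on the CPU count and disk throughput of the machine doing the restore, so treat the default of 4 as a placeholder.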