Performance

A couple of days ago, Robert Haas announced that he had checked in the first iteration of parallel sequence scans in the Postgres 9.6 branch. And no, that’s not a typo. One of the great things about the Postgres devs is that they have a very regimented system of feature freezes to help ensure timely releases. Thus, even though 9.5 just released its second beta, they’re already working on 9.6. So what is a sequence scan, and why does this matter?

Postgres has been lacking something for quite a while, and more than a few people have attempted to supply the missing functionality, some of them multiple times. I’m speaking, of course, about parallel queries. There are several reasons for this, among them various distribution and sharding needs for large data sets. When tables start to reach hundreds of millions, or even billions, of rows, even high-cardinality indexes produce results very slowly.

I wasn’t able to write an article last week due to an unexpected complication with the tests I was running to verify its contents. So this week, it’s going to be extra special! Also long. What’s the fastest way to load a Postgres table? If you believe the documentation, the COPY command is the best way to unceremoniously heave data into a table. Fortunately, after all of our talk about partitions, our minds are primed and ready to think in chunks.

I’ve been talking about partitions a lot recently, and I’ve painted them in a very positive light. Postgres partitions are a great way to distribute data along a logical grouping, and they work best when data is addressed in a fairly isolated manner. But what happens if we direct a basic query at a partitioned table in such a way that we ignore the allocation scheme? Well, what happens isn’t pretty. Let’s explore in more detail.

We’re finally at the end of the 10-part Postgres (PostgreSQL) performance series I use to initiate new developers into the database world. To that end, we’re going to discuss something that affects everyone at one point or another: index criteria. Or to put it another way: Why isn’t the database using an index? It’s a fairly innocuous question, but one that may have a surprising answer: the index was created using erroneous assumptions.

PG Phriday: Parallel Sequence Scans

PG Phriday: Massively Distributed Operation

PG Phriday: Parallel-O-Postgres

PG Phriday: When Partitioning Goes Wrong

PG Phriday: 10 Ways to Ruin Performance: Sex Offenders