PG Phriday: The Case for Partitioning
In the next few weeks, I’m going to be pushing a long discussion regarding Postgres (PostgreSQL) table partitioning. I’ve covered it in previous articles, but only regarding basic performance considerations. That’s a very limited view of what partitioning can offer; there’s a lot more variance and background that deserves elucidation.
So for the next few articles, the topic of discussion will be partitioning. There's not really enough written about it, and a lot of the techniques used in the field are pulled straight from the documentation. I think we can go much further.
To that end, this is the table structure all subsequent articles will reference:
CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);
CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);
Very large, continuously growing tables tend to be great candidates for partitioning. Further, we can explore several different techniques and cover various considerations related to expected volume. To help that along, I threw together a basic Python loader script that both bootstraps the partition data and can profile timings for analysis purposes.
To keep things interesting, the values are set to fill the table with 100M rows. The fact this is scripted helps simulate a process that might actually load data in a production environment. This allows testing basic insert speed with and without partitioning, as the size of the table (and indexes) increases over time.
# Set these to modify how large the COMMIT blocks will be,
# and how many days the data will represent.
# Total row count = chunk_size * date_range.

chunk_size = 100000
date_range = 1000

# Main program body

import psycopg2
from StringIO import StringIO
from datetime import datetime, timedelta

raw_data = StringIO()
db_conn = psycopg2.connect(database = 'postgres', user = 'postgres')
cur = db_conn.cursor()

print 'rows,seconds,microseconds'

for j in xrange(0, date_range):
    raw_data.seek(0)
    raw_data.truncate(0)

    # Build one day's chunk of tab-delimited rows entirely in memory.
    stamp = datetime.now() - timedelta(days = j)

    for i in xrange(1, chunk_size + 1):
        stamp -= timedelta(seconds = 5)
        raw_data.write(
            '%s\t%s\t%s\n' % (i % 1000, i % 100, stamp)
        )

    # Bulk-load the chunk with COPY, timing only that step.
    raw_data.seek(0)
    t1 = datetime.now()

    cur.copy_from(
        raw_data, 'sensor_log', '\t',
        columns = ('location', 'reading', 'reading_date')
    )

    t2 = datetime.now()
    diff_s = str(t2 - t1).split(':')[2]

    # One output line per chunk: cumulative rows, then COPY duration.
    print '%s,%s' % (
        (j + 1) * chunk_size, diff_s.replace('.', ',')
    )

    db_conn.commit()

raw_data.close()
db_conn.close()
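Once the script finishes, a quick sanity check (not part of the loader itself) is worth running. The standard pg_size_pretty and pg_total_relation_size functions confirm the row count and total on-disk footprint, indexes included:

-- Verify the load: row count, then table size including all indexes.
SELECT count(*) FROM sensor_log;

SELECT pg_size_pretty(pg_total_relation_size('sensor_log'));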
Before we get to any partitioning discussion, let's establish a baseline: using this script, loading the base sensor_log table takes about 45 minutes on my testing VM. Note that we didn't cheat by loading the table without indexes, as we've done previously. Existing tables won't have that luxury, so for the partitioning discussion, we can ignore that shortcut for now.
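For reference, the shortcut in question is simply a matter of dropping the secondary indexes before the bulk load and rebuilding them once it finishes, along these lines:

-- Drop the secondary indexes before a large initial load...
DROP INDEX idx_sensor_log_location;
DROP INDEX idx_sensor_log_date;

-- ...then rebuild each one in a single pass after the COPY completes.
CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);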
So, what are the approaches we can cover?
- No partitioning at all.
- Use the technique discussed in the PostgreSQL documentation (triggers); see the sketch after this list.
- Use the base table as current data, with partitions as older, archived information.
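As a preview of the documentation approach: it boils down to child tables that INHERIT from sensor_log, each carrying a CHECK constraint on reading_date, plus a trigger that redirects inserts away from the parent. The child table name, date range, and function name below are placeholders purely for illustration; a full routing function would branch on NEW.reading_date across every child:

-- One child table per date range (names and dates are examples only).
CREATE TABLE sensor_log_part_2015_09 (
    CHECK (reading_date >= '2015-09-01' AND
           reading_date < '2015-10-01')
) INHERITS (sensor_log);

-- Redirect inserts on the parent to the appropriate child.
CREATE OR REPLACE FUNCTION sensor_log_redirect()
RETURNS TRIGGER AS
$$
BEGIN
    -- A complete version would choose the child from NEW.reading_date.
    INSERT INTO sensor_log_part_2015_09 VALUES (NEW.*);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_sensor_log_redirect
BEFORE INSERT ON sensor_log
   FOR EACH ROW EXECUTE PROCEDURE sensor_log_redirect();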
And knowing the techniques, what do we examine?
- Statement speed (see the example after this list).
- Potential overhead.
- Maintenance concerns.
- Implementation caveats.
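On the statement-speed front, nothing more exotic than EXPLAIN ANALYZE is necessary. A date-bounded aggregate like the one below is a representative example, and once partitions exist, the resulting plan also shows whether the planner could exclude irrelevant children:

-- Time a typical date-bounded query; with partitions in place, the
-- plan output also reveals which child tables were actually scanned.
EXPLAIN ANALYZE
SELECT count(*)
  FROM sensor_log
 WHERE reading_date >= CURRENT_DATE - INTERVAL '1 day';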
Partitioning is a hairy procedure. No matter what method is chosen, the result will have definite performance and management issues. The idea is to choose a good balance that addresses current needs, based on volume. I plan on being as comprehensive as possible to advocate partitioning without simply being a cheerleader.
Due to the 100-million row sample size and the associated 13GB of space needed, the time scale for even the non-partitioned case should make everything obvious. Unfortunately it also greatly inflates the time required for any statement tests, as well as for bootstrapping the tables themselves. I think that effort is warranted though, especially since projects likely to benefit from partitioning will probably exceed even that scale.
So think about partitioning a bit, because I’m about to go into some crazy depth about it, and hopefully make a case for using it early and often.