PG Phriday: Bodacious Benchmarks
When it comes to putting Postgres through its paces, we often turn to benchmarks to absolutely bury it under a torrent of oppressive activity. It’s a great way to obtain maximum performance metrics and to observe how Postgres reacts and breaks down under that kind of pressure. But these kinds of tests aren’t really practical, are they? After all, many such simulated workloads are little more than bragging rights measured against previous Postgres releases, or fodder for hardware comparisons. Yet while anything beyond the default functionality is often overlooked, tools like pgbench are actually critical to the development process.
We’ve used pgbench frequently in the past for examples in these articles. It’s much like Apache’s own ab for bombarding a server with various workloads, except pgbench must first bootstrap by creating a set of test tables. Or does it? What if we have our own existing tables from a project that we want to measure, either for a hardware migration or for scalability concerns?
Let’s see how we can utilize pgbench with our trusty sensor_log table. It’s not really designed to experience any updates, so we need something else as well. Let’s say the sensor log is an endpoint capture for a very busy sensor table that is only updated when sensors detect a change in the environment. That should provide us with plenty of opportunities for contention!
CREATE TABLE sensor (
  sensor_id     SERIAL PRIMARY KEY NOT NULL,
  location      VARCHAR NOT NULL,
  reading       BIGINT NOT NULL,
  modified_date TIMESTAMP NOT NULL
);

INSERT INTO sensor (location, reading, modified_date)
SELECT s.id, s.id % 100,
       CURRENT_DATE + INTERVAL '8h' - (s.id % 60 || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

CREATE INDEX idx_sensor_location ON sensor (location);
ANALYZE sensor;

CREATE TABLE sensor_log (
  sensor_id    BIGINT NOT NULL,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL
);

CREATE UNIQUE INDEX udx_sensor_log_sensor_id_reading_date
    ON sensor_log (reading_date, sensor_id);

CREATE OR REPLACE FUNCTION f_enforce_modified()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$$
BEGIN
  NEW.modified_date = now();
  RETURN NEW;
END;
$$;

CREATE TRIGGER t_sensor_timestamp_b_iu
BEFORE INSERT OR UPDATE ON sensor
   FOR EACH ROW EXECUTE PROCEDURE f_enforce_modified();

CREATE OR REPLACE FUNCTION f_capture_sensor_changes()
RETURNS TRIGGER
LANGUAGE plpgsql AS
$$
BEGIN
  INSERT INTO sensor_log (sensor_id, location, reading, reading_date)
  VALUES (NEW.*);
  RETURN NULL;
END;
$$;

CREATE TRIGGER t_sensor_changes_a_iu
 AFTER INSERT OR UPDATE ON sensor
   FOR EACH ROW EXECUTE PROCEDURE f_capture_sensor_changes();
We’ve bootstrapped the sensor table with a relatively substantial amount of data. Five million rows isn’t really much these days, but it’s more than enough for demonstration purposes on a laptop or VM. With that said, there are a few things we can test with this basic stable structure. But what to analyze?
Let’s begin with a basic custom script that just emulates a simple update schedule with sensor readings coming in constantly from any random location. It’s a good place to start and allows us to tweak further. Note that this script only works with 9.6 and above, but previous iterations have an equivalent.
\set sid random(1, 100000 * :scale)
\set val random(1, 100)

SELECT * FROM sensor
 WHERE sensor_id = :sid;

UPDATE sensor
   SET reading = :val
 WHERE sensor_id = :sid;

SELECT max(reading)
  FROM sensor_log
 WHERE sensor_id = :sid
   AND reading_date > CURRENT_DATE - INTERVAL '5 minutes';
This script does a few things for us. Taken as a whole, we just grab a random entry from the sensor table, update its reading, and check whether that changed the aggregate high over the last five minutes. This is something an application might do as part of an analysis cycle. The real magic starts when we actually execute the script with pgbench itself.
The resulting output is actually pretty cool, and full of valuable information. We’re also going to utilize a trick so we don’t taint our existing data. It’s not something recommended for production systems, but it’s perfect for QA or development environments.
createdb sensor_test -T sensors

pgbench --scale 50 --report-latencies --no-vacuum \
        --file sensor-bench-1.sql --time 30 sensor_test

transaction type: sensor-bench-1.sql
scaling factor: 50
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 9200
latency average: 3.261 ms
tps = 306.636514 (including connections establishing)
tps = 306.672074 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.003  \set sid random(1, 100000 * :scale)
         0.001  \set val random(1, 100)
         0.291  SELECT * FROM sensor
         1.245  UPDATE sensor SET reading = :val
         1.713  SELECT max(reading) FROM sensor_log
The first command creates a new database named sensor_test, using the sensors database that contains both of our sample tables as a template, complete with all existing tables and data. This means we can run as many tests as we want, throw everything away, and start fresh as often as necessary. Copying the data takes a bit of time, so there’s probably a size threshold where this becomes difficult to justify, but small and medium data sets fit this model just fine.
The pgbench command itself deserves some explanation, though we did opt for the long-format parameters to help convey what we’re actually doing.
Let’s start with the scale. By default, pgbench uses 100,000 rows as the base, and our script retains that convention because why not? With a scale of 50, that’s our five million rows. What’s interesting about this is that we can actually increase potential contention by reducing the scale, since that focuses modifications on a smaller portion of the data. We’ll probably find this useful later.
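To see why a smaller scale concentrates contention, a quick back-of-the-envelope sketch helps. This little Python function (purely illustrative, not part of pgbench) computes the exact probability that at least two of several concurrent clients pick the same row, given the 100,000 * :scale range used in our script:

```python
def collision_probability(scale, clients=4):
    """Exact probability that at least two of `clients` independent,
    uniform draws from 1..100000*scale land on the same row."""
    rows = 100000 * scale
    p_distinct = 1.0
    for k in range(clients):
        # Probability the (k+1)-th draw misses all previous picks
        p_distinct *= (rows - k) / rows
    return 1.0 - p_distinct

# A smaller scale packs the same traffic onto fewer rows, so
# simultaneous hits on one row become roughly 50x more likely
# at scale 1 (100k rows) than at scale 50 (5M rows).
print(collision_probability(50))
print(collision_probability(1))
```

The absolute numbers are tiny either way, but over tens of thousands of transactions per test run, that 50x multiplier on row-level lock collisions adds up.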
We disable vacuuming because pgbench only knows how to vacuum its own tables; with custom scripts against our own schema, that step serves no purpose. The script statistics at the end of the report are why we included the latency report. With this information, we can focus on whichever part of the script is causing a slowdown. Assuming we base our scripts on real application samples, this is an invaluable source of performance metrics.
Aside from the custom file we’re using for the test itself, we also set the test duration to thirty seconds. For any real kind of analysis, we’d want to use a much larger value, on the order of tens of minutes or even hours. That would give us a much better idea of average latencies after accounting for other system activity, including write throughput to the base data files during checkpoints.
Let’s see what happens if we crank down the scale to create a “hot spot” in our table, and increase the client concurrency to force more contention:
dropdb sensor_test
createdb sensor_test -T sensors

pgbench --client 4 --report-latencies --no-vacuum \
        --log --aggregate-interval 10 \
        --file sensor-bench-1.sql --time 30 sensor_test

transaction type: sensor-bench-1.sql
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
duration: 30 s
number of transactions actually processed: 32425
latency average: 3.701 ms
tps = 1080.246903 (including connections establishing)
tps = 1080.360307 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.002  \set sid random(1, 100000 * :scale)
         0.001  \set val random(1, 100)
         0.654  SELECT * FROM sensor
         1.702  UPDATE sensor SET reading = :val
         1.332  SELECT max(reading) FROM sensor_log

cat $(ls -t pgbench_log.* | head -n 1)

1474658287 13068 36026916 122319506636 1060 17761
1474658297 10252 39970824 203678243092 1568 32993
1474658307 8371 39968669 270707139663 1910 82191
There isn’t a whole lot that has changed here, but we can see a couple of important bits of information. First, the update statement is notably slower than before. We’re running four clients on a 2-CPU VM, so that’s expected; concurrent activity eclipsing the CPU count is fairly common on real systems. Despite this, the aggregate maximum did not get slower, and is in fact slightly faster. Clearly the update is the issue, which makes sense since we saddled it with two triggers.
That last bit of output is the result of enabling transaction logging and setting an aggregate interval, so results are summarized instead of reported per transaction. The data gets logged to a file named after the PID of the pgbench command, so we cheated a bit and used cat on the file from the most recent benchmark. Each column has a meaning, but the last two represent minimum and maximum latency in microseconds. Based on this, we can see that latency has some pretty wild swings, from a mere 1ms to over 80ms.
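Those aggregate columns also carry enough information to derive per-interval averages and standard deviations. Here’s a small, hypothetical helper (not part of pgbench) that parses one line of the 9.6 aggregate format shown above; the columns are interval start (epoch), transaction count, latency sum, latency sum of squares, then minimum and maximum latency, all in microseconds:

```python
import math

def summarize_interval(line):
    """Parse one pgbench --aggregate-interval log line into stats.
    Columns: epoch_start n_xacts sum_lat sum_lat_sq min_lat max_lat (usec)."""
    start, n, lat_sum, lat_sum2, lat_min, lat_max = map(int, line.split())
    avg = lat_sum / n
    # Variance from the running sums: E[x^2] - E[x]^2
    stddev = math.sqrt(lat_sum2 / n - avg ** 2)
    return {'start': start, 'xacts': n, 'avg_usec': avg,
            'stddev_usec': stddev, 'min_usec': lat_min, 'max_usec': lat_max}

# First interval from the log above: average works out to roughly 2.76ms
row = summarize_interval('1474658287 13068 36026916 122319506636 1060 17761')
print(row)
```

The sum-of-squares column exists precisely so deviation can be computed after the fact without retaining every transaction, which is handy when graphing latency stability over a long run.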
This tells us a couple of things, actually. Our previous models never included the sensor table, just sensor_log. We can always pull the most recent reading from a sensor log, so having a lot of updates and two triggers on the base table is a lot of unnecessary overhead. What if we flipped the model slightly and emulated the current state with a view instead?
CREATE TABLE sensor_log (
  sensor_id    SERIAL NOT NULL,
  location     VARCHAR NOT NULL,
  reading      BIGINT NOT NULL,
  reading_date TIMESTAMP NOT NULL DEFAULT now()
);

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id, s.id % 100,
       CURRENT_DATE + INTERVAL '8h' - (s.id % 60 || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

CREATE UNIQUE INDEX udx_sensor_log_sensor_id_reading_date
    ON sensor_log (sensor_id, reading_date DESC);

ANALYZE sensor_log;

CREATE VIEW v_sensor AS
SELECT DISTINCT ON (sensor_id) *
  FROM sensor_log
 ORDER BY sensor_id, reading_date DESC;
This comes with a corresponding update to our script:
\set sid random(1, 100000 * :scale)
\set val random(1, 100)

SELECT * FROM v_sensor
 WHERE sensor_id = :sid;

INSERT INTO sensor_log (sensor_id, location, reading)
VALUES (:sid, :sid, :val);

SELECT max(reading)
  FROM sensor_log
 WHERE sensor_id = :sid
   AND reading_date > CURRENT_DATE - INTERVAL '5 minutes';
And finally, we run the altered benchmark:
dropdb sensor_test
createdb sensor_test -T sensors

pgbench --client 4 --report-latencies --no-vacuum \
        --file sensor-bench-2.sql --time 30 sensor_test

transaction type: sensor-bench-2.sql
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 1
duration: 30 s
number of transactions actually processed: 52091
latency average: 2.304 ms
tps = 1736.124362 (including connections establishing)
tps = 1736.281647 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.002  \set sid random(1, 100000 * :scale)
         0.001  \set val random(1, 100)
         0.463  SELECT * FROM v_sensor
         1.347  INSERT INTO sensor_log (sensor_id, location, reading)
         0.480  SELECT max(reading) FROM sensor_log
Maybe it comes as no surprise, but these modifications improved throughput by over 60%! A good DBA probably would have flagged the sensor table for the same reasons we listed, but it’s better to have numbers to prove it. It’s certainly logical to keep a table of current values alongside an audit log of changes, but we’ve found a better model if we need to increase throughput.
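For the record, that improvement figure is plain arithmetic on the tps and latency numbers from the two four-client runs above:

```python
# tps and average latency reported by the two 4-client benchmark runs
trigger_tps, trigger_lat = 1080.246903, 3.701  # trigger-based model
view_tps, view_lat = 1736.124362, 2.304        # view-based model

tps_gain = (view_tps / trigger_tps - 1) * 100  # throughput improvement, %
lat_drop = (1 - view_lat / trigger_lat) * 100  # latency reduction, %

# Roughly +61% throughput and -38% average latency
print(f'throughput: +{tps_gain:.0f}%, latency: -{lat_drop:.0f}%')
```

Quoting both numbers matters: throughput and latency improve by different percentages, and which one the application cares about depends on whether it is batch-oriented or interactive.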
Keep in mind that this is a very simple example. In a real cluster, our benchmark script would likely be much more involved, and we’d probably have several more of them for various read/write percentage scenarios. This is the kind of tooling necessary to really squeeze the database for every last drop of performance, or otherwise reveal points of contention or excessive latency swings.
If you’re not already using it, pgbench is a powerful addition to any database developer’s or DBA’s arsenal. It’s much more than its defaults might suggest!