PG Phriday: Displaced Durability

A lot of DBAs are quite adamant regarding ACID compliance. I count myself among them. But unlike the other parts of the acronym, there are times when data durability isn’t actually a high priority. Data staging areas, temporary tables that need visibility across sessions, and other transient information don’t require zealous protection. As a DBA it feels weird to say it, but there’s just some data we simply don’t care about losing.

Which is why Postgres 9.1 and later include the ability to create unlogged tables. Setting aside the fact that some data doesn’t need durability, why bother with such a feature at all? Believe it or not, enforcing durability comes with a lot of overhead. There’s a whole array of mechanisms that strive to guarantee committed transactions are permanent.

One of these, the write-ahead log (WAL), also drives the replication system. Because they’re so important, WAL files are frequently archived as part of a standard backup and recovery strategy. That means potentially three layers of write redundancy for every table in our database: the local WAL, any streaming replicas, and the archive! For ultimately ephemeral data, that’s a vast amount of unnecessary disk writing and network activity. Durability is great, except when it isn’t.
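If we want a quick inventory of how many of those layers a particular instance actually has, a couple of standard settings and the pg_stat_replication view tell the story. The results simply reflect however the server happens to be configured:

SHOW wal_level;
SHOW archive_mode;

SELECT count(1) FROM pg_stat_replication;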

Here’s a quick demonstration of how these excess resources translate to time consumption:

\timing

CREATE TABLE regular_table AS
SELECT * FROM generate_series(1, 5000000);

Time: 7423.529 ms

CREATE UNLOGGED TABLE unlogged_table AS
SELECT * FROM generate_series(1, 5000000);

Time: 2843.311 ms

These timings varied somewhat, but the trend persisted through several iterations. This was on a test Postgres instance with one replica and no WAL archival. This means a production system utilizing WAL archival would potentially produce an even larger divergence. Taking the ‘D’ out of ACID has some very real performance benefits.
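Incidentally, Postgres records which tables opted out of the WAL in the system catalog; the relpersistence column in pg_class is ‘u’ for unlogged relations and ‘p’ for ordinary permanent ones:

SELECT relname, relpersistence
  FROM pg_class
 WHERE relname IN ('regular_table', 'unlogged_table');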

Of course, these advantages come with a gargantuan fire-breathing caveat. Let’s simulate a Postgres crash:

pkill -9 postmaster
pg_ctlcluster 9.4 main start

And then examine the aftermath:

SELECT count(1) FROM regular_table;

  count  
---------
 5000000

SELECT count(1) FROM unlogged_table;

 count 
-------
     0

Oh. Well, then. Because unlogged tables are not durable, they can’t be recovered. Following an unexpected shutdown, Postgres truncates every unlogged table during the next startup. In addition, since unlogged tables never enter the WAL stream, their contents are not transmitted to any existing replicas. Each replica retains the table structure itself, but the table is entirely unusable there.

This is what our clone has to say about the situation:

SELECT * FROM unlogged_table;

ERROR:  cannot access temporary or unlogged relations during recovery

Given these constraints, unlogged tables are only useful on the primary server, and only for transient data. This makes them great for the use cases we mentioned earlier: data staging areas, persistent temp tables, raw COPY targets, and so on. Used judiciously, they can greatly accelerate an ETL stack or a multi-step data import process.
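As a rough sketch of that staging pattern, we might bulk-load a CSV into an unlogged holding table and then move the cleansed rows into a durable destination. The table names and file path here are purely hypothetical:

-- Hypothetical staging table; the layout is illustrative only.
CREATE UNLOGGED TABLE staging_readings (
    reading_time  TIMESTAMPTZ,
    sensor_id     INT,
    reading       NUMERIC
);

-- Raw import; if the server crashes here, we just re-run the COPY.
COPY staging_readings FROM '/tmp/readings.csv' WITH (FORMAT csv);

-- Move only the usable rows into the durable (logged) destination table.
INSERT INTO sensor_readings (reading_time, sensor_id, reading)
SELECT reading_time, sensor_id, reading
  FROM staging_readings
 WHERE reading IS NOT NULL;

TRUNCATE staging_readings;

Since writes to the staging table generate no WAL, the COPY itself also runs faster, on top of the data being safely disposable.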

They’re also good for application stacks that require data flushing. Certain types of queues leverage notifications or live messaging, neither of which wants stale data. Some applications bootstrap one or more tables from a tertiary data source upon starting. Web apps could even store local session data in an unlogged Postgres table on each host, pending aggregation on a reporting server. There are many forms of application where phantom data is exactly what we want.
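A per-host session store along those lines could be as simple as the following; the schema is hypothetical, and a crash merely means users have to log in again:

-- Hypothetical per-host session table; the schema is illustrative only.
CREATE UNLOGGED TABLE local_sessions (
    session_id  TEXT PRIMARY KEY,
    user_id     INT NOT NULL,
    last_seen   TIMESTAMPTZ NOT NULL DEFAULT now(),
    payload     JSONB
);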

In the end, the primary difference between Postgres and a strictly ACID database is that Postgres actively facilitates circumventing unnecessary durability overhead. But it does so safely, limiting the effect to tables specifically created for that purpose. Why pay for solid reality, when all you want is smoke and mirrors?