A lot of DBAs are quite adamant regarding ACID compliance. I count myself among them. But unlike the other parts of the acronym, there are times when data durability isn’t actually a high priority. Data staging holding areas, temporary tables that need visibility across sessions, and other transient information do not require zealous protection. As a DBA it feels weird saying it, but there’s just some data we simply don’t care about losing.

Which is why all versions of Postgres after 9.1 include the ability to create unlogged tables. Aside from not needing data durability, why even bother with such a feature? Believe it or not, enforcing durability comes with a lot of overhead. There’s a whole array of mechanisms that strive to guarantee committed transactions are permanent.

One of these, the write ahead log (WAL), is also part of the replication system. Due to their importance, WAL files are often archived as part of standard recovery procedure. That means there are potentially three redundancy levels for every table in our database! That is a vast amount of unnecessary disk writing and network activity for ultimately ephemeral data. Durability is great, except when it isn’t.

Here’s a quick demonstration at how these excess resources translate to time consumption:

CREATE TABLE regular_table AS
SELECT * FROM generate_series(1, 5000000);
TIME: 7423.529 ms
SELECT * FROM generate_series(1, 5000000);
TIME: 2843.311 ms

These timings varied somewhat, but the trend persisted through several iterations. This was on a test Postgres instance with one replica and no WAL archival. This means a production system utilizing WAL archival would potentially produce an even larger divergence. Taking the ‘D’ out of ACID has some very real performance benefits.

Of course, these advantages come with a gargantuan fire-breathing caveat. Let’s simulate a Postgres crash:

pkill -9 postmaster
pg_ctlcluster 9.4 main start

And then examine the aftermath:

SELECT COUNT(1) FROM regular_table;
SELECT COUNT(1) FROM unlogged_table;

Oh. Well, then. Because unlogged tables are not durable, they can’t be recovered. Following any unexpected shutdown, Postgres will truncate any unlogged tables upon the next startup. In addition, since unlogged tables are not part of the WAL system at all, they’re not streamed to any existing replicas. Replicas retain the table structure itself, but they remain entirely unusable.

This is what our clone has to say about the situation:

SELECT * FROM unlogged_table;
ERROR:  cannot access TEMPORARY OR unlogged relations during recovery

Given these constraints, unlogged tables are only useful on the primary server for transient data. This makes them great for use cases we mentioned earlier: data staging areas, persistent temp tables, raw COPY targets, etc. Used judiciously, they can greatly accelerate an ETL stack or multiple-step data import process.

They’re also good for application stacks that require data flushing. Certain types of queues leverage notifications or live messaging, neither of which want stale data. Some applications bootstrap one or more tables from a tertiary data source upon starting. Web apps could even store local session data in an unlogged Postgres table on each host, pending aggregation on a reporting server. There are many forms of application where phantom data is exactly what we want.

In the end, the primary difference between Postgres and a strictly ACID database, is that Postgres actively facilitates circumventing unnecessary durability overhead. But it does so safely, limiting the effect to tables specifically created for that purpose. Why pay for solid reality, when all you want is smoke and mirrors?

PG Phriday: Displaced Durability
Tagged on:         

3 thoughts on “PG Phriday: Displaced Durability

  • The full table description (\d tablename) in psql does differentiate between logged and unlogged tables, on the first line:

              Table "public.x"
         Unlogged table "public.xx"

    Just checking it out, I also tried a temp table. They are always unlogged, but \d doesn’t indicate that:

           Table "pg_temp_2.xxx"

    Nor does your query against pg_class:

    ptjm=# select relpersistence from pg_class where relname = ‘xxx’;


    t (1 row)

    (however the data doesn’t go into the logs.

    1. You’re absolutely right. I swear I stared at it a trillion times and my brain completely skipped over those words. Holy cow. I’ve altered the article to remove my idiocy. 🙂

      And yeah, temp tables are also unlogged. I figured that was implied, but it probably should be more explicit just to dot all of the i’s and such.

Comments are closed.