PG Phriday: Why Postgres is the Best Database Engine

Friday March 22, 2024

Page content

Last Phriday we explored just where Postgres could end up in the future. One possible question which may have occurred to a reader was probably something along the lines of “That doesn’t even really sound like Postgres anymore. Why not just write another database?”

Let’s just be outright about it: Postgres is the best RDBMS engine currently available. It’s certainly bold to claim that any database engine is “the best”, and as the saying goes, “Extraordinary claims require extraordinary evidence.” The reason we don’t want to write a new database engine is because there’s no need. Despite its warts, we think Postgres is literally the best available, and we plan to make a pretty good case this week.

Reason 1: It’s Popular

There are a lot of database engines out there, of which Postgres is merely one of literally hundreds. Despite that, it has won the DB-Engines DBMS of the Year four times since 2017, and recently eclipsed MySQL in 2023 as the most popular according to Stack Overflow.

Beauty is unfortunately only skin deep. One reason for MySQL’s prominence is the continuing momentum acquired for its role in the ubiquitous LAMP stack. Just because something is popular doesn’t mean it’s actually the best available of its kind. But it does mean that MySQL started with a significant and dramatic lead for over 20 years, and Postgres has come to surpass its iron grip on the industry anyway.

Think about that for a second.

Reason 2: Community and Resources

Once someone decides to use Postgres—maybe because it’s used by some other tool they’re working with—they need to actually dive in. Postgres is certainly no stranger here. Let’s see what we can find:

And that’s really only scratching the surface. We actually see quite a few MySQL users in the Postgres Discord because MySQL still doesn’t have one! We help them anyway if we can, because that’s who we are.

It would be a fool’s errand to iterate all of the other resources associated with Postgres, but a good start might be these:

The documentation is very well organized and thorough.
Downloads are built regularly for multiple platforms.
There’s an official Docker image.
Learn about Postgres using a tutorial.
Keep learning on the Wiki.
Dig into configuration settings.
A github project
Who can forget the infamous depesz explain tool for parsing query plans?
There are a handful of Kubernetes operators, but we prefer CloudNativePG.

And on, and on, and on… That’s an enthusiasm bred from genuine interest. The Postgres community isn’t just active, it’s thriving. This isn’t even counting the multitude of blogs, books, videos, podcasts, and so on. It’s basically impossible to work in tech right now and avoid tripping over something built with Postgres.

That doesn’t just happen in a vacuum (pun not intended).

Reason 3: Features

Every database likes to boast about its features, right? What can Postgres do?

Extensions - Some could argue this is Postgres’ “killer app”
Foreign Data Wrappers to interact with non-Postgres sources
Physical and logical replication, and associated CDC
Functions and stored procedures in over a dozen languages (yes, that includes Rust)
Built-in JSON support
Support for DDL during transactions
Security at the column and even row level
Declarative table partitions
Swappable storage backends
Background worker processes - even extensions can use these!
We could go on…

It’s just an endless litany of capabilities that grows longer with every single yearly release. And if there’s something it can’t do, there’s a good chance it’ll either gain the feature in the future, or someone can add it themselves as an extension. Speaking of…

Reason 4: Extensions

Hmm, this heading looks suspiciously familiar. Extensions are such a titanic reason behind the popularity of Postgres that it’s actually difficult to adequately explain. Postgres added the functionality way back in version 9.1 and since then, it’s been a race to make Postgres do nearly everything. Thanks to extensions, Postgres can:

Be a time series database
Handle geospatial data
Incorporate vectors and LLM searches
Act as a proprietary write-anywhere database or open-source
Scale horizontally
Use columnar storage
Work with Elasticsearch
Act like a queue
Schedule events
And way, way more

This list has grown so ponderous that there are not one, or even two, but three extension managers available. At this point, Postgres is practically middleware.

We already alluded to the infamous “rewrite it in rust” joke, but it’s the truth! Even Zig was recently announced as a supported extension language recently. There are multiple companies that literally owe their existence to this single feature. Not Postgres the engine, just the ability to extend it without needing to fork the codebase.

This allows products and features to stay current by following APIs rather than having to port a large series of patches every time the Postgres codebase changes. This reason alone has led to the ultimate demise of projects as ambitious as Postgres-XL among others. There are proprietary Postgres forks like EPAS from EDB, but these are the exception rather than the rule. It’s possible the only reason EDB can even maintain their fork is due to the fact they employ more Postgres core committers than any other company; it’s not an easy thing to do!

But an extension? Any dweeb off the street can write one or two. As the API burrows deeper into the Postgres internals, more extensions become possible, adding functionality we could only dream about today. It’s the proverbial Face That Launched a Thousand Ships, and it’s what made the Postgres ecosystem what it is today.

And it actually gets even better.

Reason 5: Foreign Data Wrappers

It seems that Postgres version 9.1 was a smorgasbord of capabilities, because it also brought us FDWs. It’s the final piece to the puzzle that establishes Postgres as the best existing database engine. Extensions gave Postgres the ability to do anything and FDWs let it go anywhere.

The ability to perform queries on nearly anything, pull the data into an extremely competent set-theory execution engine, and send the results to nearly any upstream target is game changing. Not every type of data is SQL enabled, or even every database engine. Some rely on rather obscure or even proprietary languages.

While it’s possible to interact with these without an FDW, wrappers enable a universe of possibilities. Some wrappers can write directly to systems through the foreign table, while more advanced wrappers can push predicates or join conditionals to the target. Suddenly Postgres can read from a Kafka queue, or write to Oracle. Want an easier cohesive interface for your favorite data lake? It’s not a trivial exercise, but Postgres can do that too.

This essentially makes Postgres a thin client, where the database itself could be entirely empty and simply act as a bridge to dozens of other disparate data sources. It’s not quite the same as decoupling the storage and compute layers, but it could be used that way to an extent. It could even be an entrypoint into a Web3 swarm, a notoriously difficult problem-set due to the underlying decentralized nature.

Further Thoughts

What other database engine can do all of that? It’s entirely possible to enumerate various problems which continue to plague Postgres, but the ecosystem makes up for it in spades. Those pain points will eventually be addressed, and we can leverage all of that functionality right now in the meantime.

So when I discussed the potential future of Postgres last week, where the backend storage engine could be distributed and nodes could self-organize with Raft, it wasn’t purely idle speculation. Those capabilities are “only” a couple of extensions away. Why create an entirely new engine just to recreate what Postgres has now?

Maybe in some far-flung future, the extension API, FDWs, the execution engine, and other parts can be distilled down to their core fundamentals. At that point, “Postgres” could just be some kind of library or compatibility / interface layer. Backend systems could then shed all of the unused features for highly efficient embedded purposes. Until that time—if such a thing ever occurs at all—there’s no reason to reinvent the wheel.

Postgres is definitely the best thing since Betty White.