Review: Learning Heroku Postgres

I recently got the opportunity to take a look at Learning Heroku Postgres, a new book by Patrick Espake that seems intended to help new PostgreSQL database administrators get their data into the cloud. The chapters are short and concise, and the questionnaires at the end are a nice touch. But does it hit the mark? Almost.

Before I get too far into this review, I should point out that Heroku is a proprietary service that presents a modular deployment system for various programming languages, applications, administration, monitoring, and other related services. Though there are free hobby-level instances for most modules, it is a commercial platform which provides PaaS (Platform as a Service) across multiple geographic locations. I recommend these hobby-level instances only for experimentation; leveraging the platform properly means moving beyond them.

With that said, this book is very much a benefit to the PostgreSQL community. Small and large businesses often have trouble distributing data and applications in a high availability environment, and as such, Heroku is a potential solution for quick-and-dirty scalability at a reasonable cost.

The book itself is essentially broken down into three major parts: Tooling, Basics, and Extras. Though this is not explicitly defined by the chapter overview, this is the way it reads. This is somewhat important, because it allows a bit of skipping around for users who are already familiar with PostgreSQL, Heroku, or both.

The adventure begins with a couple of short chapters on how Heroku itself is organized and how to acquire the Heroku command-line tools for managing account features. Here, Espake presents a good bird’s-eye view of Heroku’s deployment infrastructure and configuration, and spends time discussing just how everything is decoupled and bound together by queues so the elastic infrastructure accurately represents the intention of the user. This is critical, as understanding the underlying landscape can (and should) directly influence development cycles, since readers must account for Heroku’s quirks while organizing their application and associated data.

From here, the discussion naturally moves to PostgreSQL itself in chapter three. Espake makes it clear that PostgreSQL is managed as a fully automated solution, behind a thick wall of tools, interfaces, and somewhat limited management commands. This is one of the most important chapters, as it effectively lays out all of the necessary commands for synchronizing data and accessing the database itself for more direct manipulation with SQL clients, languages, and drivers. Afterwards, in chapter four, he addresses the topic of backups and how to secure and obtain them. Together, these two chapters give the reader control over how Heroku represents their data and how to secure it from loss.

Chapter five is something of an oddity. Espake introduces Heroku dataclips as a method for sharing data without talking about the reality of what they are: versioned views with an exposure API. This is the first time I got the impression that this book is more of a usage manual than a true learning resource. Yes, it is important to show how data can be shared or downloaded with this feature, but after the introduction in chapter one regarding Heroku’s operation, I found this omission particularly odd. Given how dataclips work, they could be combined with views for easier overall data management, and yet this option is never presented.
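To make the view suggestion concrete, here is a minimal sketch of the idea rather than anything taken from the book. It assumes the query logic is moved into a database view so that the text saved in a dataclip shrinks to a one-line SELECT. An in-memory SQLite database stands in for Heroku Postgres so the example runs anywhere, and the orders table, the sales_by_region view, and the DATACLIP_QUERY name are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for a Heroku Postgres database here; the table,
# view, and column names are hypothetical and exist only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, total REAL);
    INSERT INTO orders (region, total) VALUES
        ('east', 10.0), ('east', 15.0), ('west', 7.5);

    -- The reporting logic lives in the database as a view.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(total) AS revenue
        FROM orders
        GROUP BY region;
""")

# The query a dataclip would hold is reduced to a single line against the view.
DATACLIP_QUERY = "SELECT region, revenue FROM sales_by_region ORDER BY region"
rows = conn.execute(DATACLIP_QUERY).fetchall()
print(rows)
```

With this arrangement, changing the report means changing the view definition, and anything that selects from the view picks up the change without being edited itself.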

Chapter six moves on to instance management. By this, I mean various uses for database replicas, such as forking, failover, and replacing the current database with a previous version. All the necessary commands and GUI options are here to make juggling multiple copies of the database easier. But again I see a wasted opportunity. Heroku considers ‘rollback’ the act of replacing the primary instance with a previous backup instance. The fact that this directly conflicts with the concept of a transaction rollback is never discussed. Nor are database followers equated with PostgreSQL streaming replication, the mechanism that’s probably behind the feature. I wish Espake had spent more time explaining how things work, instead of just providing instructions. After all, that kind of information is probably available in Heroku’s documentation; this book should provide a deeper understanding the user can leverage toward a better PostgreSQL cluster.
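For readers who have not met the database sense of the word, the sketch below shows what a transaction rollback is: it discards only the uncommitted changes of the current transaction and leaves everything committed earlier untouched. This is an illustration under stated assumptions, not Heroku's mechanism. Python's built-in sqlite3 module stands in for PostgreSQL because the concept is the same, and the accounts table is hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the accounts table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()  # the starting balance is now durable

# The UPDATE implicitly opens a new transaction on this connection.
conn.execute("UPDATE accounts SET balance = balance - 500 WHERE name = 'alice'")
in_flight = conn.execute("SELECT balance FROM accounts").fetchone()[0]

# A transaction rollback undoes only the uncommitted UPDATE above.
conn.rollback()
after = conn.execute("SELECT balance FROM accounts").fetchone()[0]

print(in_flight, after)
```

The scope is a single transaction on a single connection. What the book calls rollback replaces the entire primary instance with an earlier copy, which is a very different operation to share a name with.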

The last two chapters tie up most of the remaining loose ends by covering logs and various PostgreSQL-specific extensions available on the Heroku platform. Chapter eight in particular is a laundry list of PostgreSQL extensions generally available within the contribution libraries commonly distributed with the PostgreSQL code or binaries. It’s a good resource for users unfamiliar with this functionality, and further links are provided where necessary so the reader can explore, should that feature be relevant. While not really a feature of Heroku, or even especially relevant since most PostgreSQL distributions include them anyway, extensions are part of what makes PostgreSQL so powerful, so I’ll allow it.

In the end, the book adequately covers numerous Heroku commands and interface elements. I wish the author had spent more time talking about how some of Heroku’s terminology conflicts with common database concepts. For example, Heroku’s idea of ‘promote’ isn’t quite what a seasoned database administrator would recognize. Allowing a new user to absorb this interpretation without caveat could lead to conceptual issues in the future. Unfortunately, this happens often, as I already mentioned regarding rollback. From chapter four onward, the book is organized like a manual, as if it were written by an employee of Heroku, treating PostgreSQL as a mere Heroku module that needed a checklist of feature documentation. There’s a reason this book is so short!

Still, it’s a good way to bootstrap a Heroku deployment of PostgreSQL. If there aren’t more comprehensive books on integrating the two, there probably will be in the near future. Wait for these if you really want to delve into a Heroku deployment; if you’re a newbie, you can’t go wrong here.