I’ll be the first to admit that I found Postgres window functions fantastically confusing when I first encountered them. They’re a powerful and versatile tool for building reports and summaries, but that functionality hides behind a fairly steep learning curve. One of the ways to combat their inherent complexity is to fully explore how they work, instead of just trying to wrap our heads around the expected results.
There are a lot of languages available for authoring Postgres functions, but there’s nothing quite like the the classic PL/pgSQL. It’s SQL! It’s not SQL! It’s a kind of horrifying mutant crossbreed suitable only for terrifying small children and generating…
These days with multiple Postgres database environments a commonality, it’s not unheard of to copy data from one to another. Perhaps a production extract is necessary to properly vet data in a staging environment. Maybe the development environment needs to update its data to reflect an especially pernicious and intractable edge case someone observed. In any of these scenarios, we are likely to extract data from multiple tables to import it elsewhere. The question is: how?
In light of recent events where GitLab suffered a massive database loss, this is a great opportunity to examine what happened from a Postgres perspective. Since Simon Riggs over at 2ndQuadrant has already chimed in on improvements Gitlib might consider in their procedures, maybe we should walk the conversation back slightly.