Postgres is great, but it can’t run itself in all cases. Things come up. Queries go awry. Hardware fails, and users leave transactions open for interminable lengths of time. What happens if one of these things occur while the DBA themselves has a hardware fault? While they’re down for maintenance, someone still has to keep an eye on things. For the last PG Phriday of the year completely unrelated to my upcoming surgery, let’s talk about what happens when your DBA becomes inoperative due to medical complications.
This is Fake Postgres DBA 101!
Recently a coworker asked me this question:
Should I expect variance between minutes and hours for the same query?
And I was forced to give him this answer:
Potentially, but not commonly. Query planning is an inexact science, and regardless of the query being the “same query,” the data is not the “same data.” This isn’t generally the case, but on occasion, changes in data can affect the query execution path. Usually this is a good thing, as the database accounts for new value distributions.
For example, if there are a million distinct values in one column, but 90% of them are the same, certain values should trigger an index scan instead of a sequential scan. Those values will change over time, and if the stats don’t account for that, queries will have non-dependable performance. Of course, this introduces potential correlation assumptions that aren’t correct in some cases, and that also causes unreliable query performance. I guess the question is: which would you rather have?
It’s about the time for year-end performance reviews. While I’m always afraid I’ll narrowly avoid being fired for gross incompetence, that’s not usually how it goes. But that meeting did remind me about a bit of restructuring I plan to impose for 2017 that should vastly improve database availability across our organization. Many of the techniques to accomplish that—while Postgres tools in our case—are not Postgres-specific concepts.