October 15th, 2014 |
News | No Comments
Before I really get going with this post, I want to say I’m not panicked, and I suggest you stay the same. Meanwhile, it’s pretty clear the current cavalier attitude toward Ebola needs to change. And of course, it all boils down to humans being the fallible creatures they are.
How Ebola Works
There’s good information on How Ebola Works, and how it kills you, but I’ll summarize. Ebola is a Biosafety Level 4 contagion, meaning proper attire when interacting with infected is a fully sealed safety suit with respirator, which should be decontaminated before and after exposure.
These kinds of precautions are necessary because Ebola is a hemorrhagic fever that causes multiple organ failure within days of exposure. How does that happen? Ebola is capable of replicating without the immune system taking immediate notice, because it attacks the dendritic cells of the immune system itself. Since these cells are how the immune system recognizes new invaders, there’s no defense while Ebola replicates. During the infection, it enters cells and makes them make more Ebola, which in turn causes them to explode. Eventually enough of this happens that the immune system actually does something once it notices the damage.
Unfortunately, that something is a cytokine storm. In effect, the immune system freaks out and disgorges all of its killing might, severely damaging blood vessels in the process. This internal bleeding, in turn, causes blood pressure to drop. Combined with tissue damaged by Ebola, this leads to organ failure and eventually death. The mortality rate is quoted as anywhere from 25% to 95%, but it’s quite a bit more potent than the flu.
During all of this, you can’t have painkillers to ease the agony of your dying organs due to the likelihood your liver is among the casualties. It is a horrible, painful way to die that I wouldn’t wish on anyone.
How Does it Spread?
Some comparisons have been made that suggest Ebola is about as communicable as Hepatitis C. The CDC suggests that bodily fluids become a vector when the patient starts showing symptoms, which may take up to three weeks. With an incubation period that long, this allows travelers to reach quite diverse destinations before symptoms appear and spreading becomes likely.
Usually it’s easy to avoid bodily fluids like blood, mucus, semen, or diarrhea. But the problem with Ebola is that sweat is also a vector. Even during a cool day, the body produces sweat, and that sweat can get smeared on things like doorknobs.
Why it Bothers Me
The patient who recently died in Dallas has already infected his third health care worker. I’m pretty sure Hepatitis C doesn’t spread so easily, or every nurse in the country would have it. What’s worse, the nurse traveled on a plane a day before being diagnosed, potentially infecting anyone she encountered at both the departure and destination airports, as well as anyone in the plane. Yes, that includes the flight attendants.
And that is the real problem: people.
People generally don’t wash their hands before eating or rubbing their eyes. People don’t cough or sneeze into their elbows. People work sick for fear of losing their jobs, or falling behind, thereby spreading diseases to the entire workplace. People reuse gloves. People travel when they’re not supposed to. People get scared and make mistakes. People cut budgets so there aren’t sufficient resources to handle outbreaks. People send patients home with incorrect diagnoses.
We can claim “it will never happen here,” or “you’d have to roll around in Ebola diarrhea to catch it,” or “our infrastructure will prevent spreading,” but that’s all wrong. All it takes is one weak link: one lazy person who didn’t fill in a checkbox on a medical form; one person who neglected to change gloves between patients; one person who thinks the rules don’t apply to them; one person in denial about how sick they are; one person who works in fast food and can’t afford a sick day. Or in the case of Dallas, a hilariously incompetent string of mishaps that led to at least three nurses being infected, people who presumably have better access to sterilization and proper handling of contaminated material than the rest of us.
And all of this is happening right as we start to enter flu season. The symptoms of Ebola are very similar to the flu, which means misdiagnoses will become problematic. Being sloppy with the flu is a nuisance, but with Ebola, it’s deadly. We need to stop fucking around and get serious, or we’ll end up like Liberia and Sierra Leone.
We can mock those countries all we want for shoddy infrastructure and lack of education, but are we really any better? People are fallible, and whether they’re in the US or Africa, we need to account for Murphy’s Law and Finagle’s Corollary. We can call Dallas a fluke, but it’s not. Shit happens, and the sooner we accept that, the sooner we can actually address Ebola before it becomes a real problem.
Hopefully, we still can.
August 5th, 2014 |
Database, Tech Talk | 5 Comments
Recently I stumbled across a question on Reddit regarding the performance impact of using pgBadger on an active database server. The only real answer to this question is: do not use pgBadger. Before anyone asks: no, you shouldn’t use pgFouine either. This is not an indictment of the quality of either project, but a statement of their obsolescence in the face of recent PostgreSQL features.
One of the recommended postgresql.conf changes for both of these tools is to set log_min_duration_statement to 0. There are a few other changes they require to put the log entries in the correct format, but we can worry about those later. For now, let’s focus on a little innocent math.
One of the PostgreSQL servers I work with processes almost two billion queries per day. Let’s imagine every such query is very simple, even though this is definitely not the case. Consider this an example query:
SELECT col1, col2 FROM my_table WHERE id=?
Assuming the query is parameterized, and the number is from one to a million, our average query length is 47 characters. Let’s just say it’s 50 to keep things easy. If we multiply that by two billion, that’s 100 billion bytes of logged SQL. Seen another way, that’s 93GB of logs per day, or about 1MB of log data per second.
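For anyone who wants to check the arithmetic, here is a small sketch of the same estimate in Python. The constant and function names are my own, and it assumes the id is appended as plain decimal digits with no log prefix counted, exactly as in the simplified example.

```python
# Back-of-the-envelope log volume estimate; figures mirror the text above.
QUERY_TEMPLATE = "SELECT col1, col2 FROM my_table WHERE id="
QUERIES_PER_DAY = 2_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_query_length(template: str, max_id: int = 1_000_000) -> float:
    """Average length of the template plus an id drawn evenly from 1..max_id."""
    total_digits = sum(len(str(i)) for i in range(1, max_id + 1))
    return len(template) + total_digits / max_id


avg_len = average_query_length(QUERY_TEMPLATE)

# Round up to 50 characters per query, as the post does.
bytes_per_day = QUERIES_PER_DAY * 50
gib_per_day = bytes_per_day / 2**30
mib_per_second = bytes_per_day / SECONDS_PER_DAY / 2**20

print(f"average query length: {avg_len:.1f} characters")
print(f"log volume: {gib_per_day:.0f} GiB/day, {mib_per_second:.1f} MiB/s")
```

Swapping in a longer template or a different daily query count shows how quickly the total grows.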
In practice, such a short query will not constitute the bulk of a PostgreSQL server’s workload. In fact, if even a simple ORM is involved, all queries are likely to be far more verbose. Java’s Hibernate in particular is especially prone to gratuitous aliases prepended to all result columns. This is what our query would look like after Hibernate was done with it:
SELECT opsmytable1_.col1, opsmytable1_.col2
  FROM my_table opsmytable1_
 WHERE opsmytable1_.id=?
If we ignore the whitespace I added for readability, and use values from one to a million, the average query length becomes 99. Remember, this is ignoring all useful data PostgreSQL would also be logging! There are also a number of other problems with many of my operating assumptions. It’s very unlikely that query traffic will be consistent, nor will the queries themselves be so short. In addition, I didn’t account for the length of the log prefix that should contain relevant metadata about the query and its duration.
Once on a boring day, I enabled all query logging just to see how verbose our logs became. On that fateful day, I set log_min_duration_statement to 0 for approximately ten seconds, and the result was 140MB worth of log files. Thus was my curiosity sated, and my soul filled with abject horror. Faced with such untenable output, how can we do log analysis? There’s no way pgBadger can process 100GB of logs in a timely manner. I tried using it a while ago, and even that ten seconds of log output required over a minute of processing.
It turns out PostgreSQL has had an answer to this for a while, but it wasn’t until the release of 9.2 that the feature became mature enough to use regularly. The pg_stat_statements extension maintains a view that tracks query performance data in real time. Constants and variables are replaced to generalize the results, and it exposes information such as the number of executions, the total run time of all executions, the number of rows matched, and so on. This is more than any log processing utility can do given the most verbose settings available.
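To make the idea of generalized statistics concrete, here is a toy sketch in Python. It is not how the extension works internally; the regular expressions and the StatementStats class are hypothetical stand-ins that only illustrate folding queries with different constants into one entry.

```python
import re
from collections import defaultdict

# Hypothetical, simplified normalizer: replace string and numeric literals
# with "?" so queries differing only in constants share one entry.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def normalize(query: str) -> str:
    query = _STRING_LITERAL.sub("?", query)
    query = _NUMBER_LITERAL.sub("?", query)
    return " ".join(query.split())  # collapse runs of whitespace


class StatementStats:
    """Toy in-memory accumulator of per-statement call counts and run time."""

    def __init__(self):
        self.stats = defaultdict(lambda: {"calls": 0, "total_time": 0.0})

    def record(self, query: str, elapsed_ms: float) -> None:
        entry = self.stats[normalize(query)]
        entry["calls"] += 1
        entry["total_time"] += elapsed_ms

    def slowest(self, limit: int = 10):
        """Return (calls, avg_time, query) tuples, slowest average first."""
        rows = [
            (s["calls"], s["total_time"] / s["calls"], q)
            for q, s in self.stats.items()
        ]
        return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]
```

Recording the same lookup with two different ids would produce a single entry with two calls, which is the property that makes this kind of tracking so much cheaper than logging every statement.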
I could spend hours churning through log files, or I can execute a query like this:
SELECT calls, total_time/calls AS avg_time, query
  FROM pg_stat_statements
 ORDER BY 2 DESC
 LIMIT 10;
That query just returned the ten slowest queries in the database. I could easily modify this query to find the most frequently executed queries, and thus improve our caching layer to include that data. This module can be directly responsible for platform improvements if used properly, and the amount of overhead is minimal.
In addition, the log settings are still available in conjunction with pg_stat_statements. I normally recommend setting log_min_duration_statement to a value that’s high enough to remove log noise, but low enough that it exposes problems early. I have ours set to 1000 so any query that runs longer than one second is exposed. Even on a system as active as ours, this produces about 5MB of log entries per day. This is a much more reasonable amount of data for log analysis, spot-checking, or finding abnormal system behavior.
All of this said, we could just as easily watch the database cluster and set log_min_duration_statement to a nonzero number of milliseconds. For most systems, even 20 milliseconds would be enough to prevent log output from saturating our disk write queue. However, the pg_stat_statements extension automatically takes care of performance statistics without any post-processing or corresponding increase in log verbosity, so why add pgBadger to the stack at all?
There may be a compelling argument I’m missing, but for now I suggest using pg_stat_statements without PostgreSQL-focused log post-processing. Ops tools like Graylog or logstash are specifically designed to parse logs for monitoring significant events, and keeping the signal-to-noise ratio high is better for these tools.
Save logs for errors, warnings, and notices; PostgreSQL is great at keeping track of its own performance.
July 29th, 2014 |
Database, News, Tech Talk, Writing | 6 Comments
Well, my publisher recently informed me that the book I’ve been slaving over for almost a year is finally finished. I must admit that PostgreSQL 9 High Availability Cookbook is somewhat awkward as a title, but that doesn’t detract from the contents. I’d like to discuss primarily why I wrote it.
When Packt first approached me in October of 2013, I was skeptical. I have to admit that I’m not a huge fan of the “cookbook” style they’ve been pushing lately. Yet, the more I thought about it, the more I realized it was something the community needed. I’ve worked almost exclusively with PostgreSQL since late 2005 with databases big and small. It was always the big ones that presented difficulties.
Back then, disaster recovery nodes were warm standby through continuous recovery at best, and pg_basebackup didn’t exist. Nor did pg_upgrade, actually. Everyone had their own favorite backup script, and major upgrades required dumping the entire database and importing it in the new version. To work with PostgreSQL then required a much deeper understanding than is necessary now. Those days forced me to really understand how PostgreSQL functions, which caveats to acknowledge, and which needed redress.
One of those caveats that still called out to me was one of adoption. With a lot of the rough edges removed in recent releases of PostgreSQL came increased usage in small and large businesses alike. I fully expected PostgreSQL to be used in a relatively small customer acquisition firm, for instance, but then I started seeing it in heavy-duty financial platforms. Corporate deployments of PostgreSQL require various levels of high availability, from redundant hardware all the way to WAL stream management and automated failover systems.
When I started working with OptionsHouse in 2010, their platform handled 8,000 database transactions per second. Over the years, that has increased to around 17k, and I’ve seen spikes over 20k. At these levels, standard storage solutions break down, and even failover systems are disruptive. Any outage must be as short as possible, and the standby must be instantly available with little to no dependency on cache warming. Our backup system had to run on the warm standby or risk slowing down our primary database. Little by little, I broke the database cluster into assigned roles to stave off the total destruction I felt was imminent.
I was mostly scared of the size of the installation and its amount of activity. Basic calculations told me the database handled over a billion queries per day, at such a rate that even one minute of downtime could potentially cost us tens of thousands in commissions. But I had no playbook. There was nothing I could use as a guide so that I knew what to look for when things went wrong, or how I could build a stable stack that generally took care of itself. It was overwhelming.
This book, as overly verbose as the title might be, is my contribution to all of the DBAs out there who might have to administer a database that demands high availability. It’s as in-depth as I could get without diverging too much from the cookbook style, and there are plenty of links for those who want to learn beyond the scope of its content. The core, however, is there. Anyone with a good understanding of Linux could pick it up and weave a highly available cluster of PostgreSQL systems without worrying, or having to build too many of their own tools.
If I’ve helped even one DBA with this high availability book, I’ll consider my mission accomplished. It’s the culmination of years of experimentation, research, and performance testing. I owe it to the PostgreSQL community, which has helped me out of many jams, to share my experience however I can.
July 25th, 2014 |
Database, Programming, Tech Talk | 14 Comments
Programming is fun. I love programming! Ever since I changed my career from programming to database work, I’ve still occasionally dabbled in my former craft. As such, I believe I can say this with a fair amount of accuracy: programmers don’t understand databases. This isn’t something small, either; there’s a fundamental misunderstanding at play. Unless the coder happens to work primarily with graphics, bulk set-based transformations are not something they’ll generally work with.
For instance, if tasked with inserting ten thousand records into a database table, a programmer might simply open the data source and insert them one by one. Consider this basic (non-normalized) table with a couple of basic indexes:
CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);
CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);
Now suppose we have a file with ten thousand lines of something like this:
To load this data, our coder chooses Python and whips up an insert script. Let’s even give the programmer the benefit of the doubt, and say they know that prepared queries are faster due to less overhead. I see scripts like this all the time, written in languages from Java to Erlang. This one is no different:
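import psycopg2
db_conn = psycopg2.connect(database = 'postgres', user = 'postgres')
cur = db_conn.cursor()
# Prepare the statement once; each row then only needs an EXECUTE.
cur.execute(
  """PREPARE log_add AS
     INSERT INTO sensor_log (location, reading, reading_date)
     VALUES ($1, $2, $3);"""
)
file_input = open('/tmp/input.csv', 'r')
for line in file_input:
    cur.execute("EXECUTE log_add(%s, %s, %s)", line.strip().split(','))
file_input.close()
# psycopg2 opens a transaction implicitly, so one commit covers every row.
db_conn.commit()
db_conn.close()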
It’s unlikely we have the /tmp/input.csv file itself, but we can generate one. Suppose we have 100 locations each with 100 sensors. We could produce a fake input file with this SQL:
COPY (
  SELECT substring(md5((a.id % 100)::TEXT), 1, 3) || '-' ||
         to_char(a.id % 100, 'FM0999') AS location,
         (a.id * random() * 1000)::INT AS reading,
         now() - (a.id % 60 || 's')::INTERVAL AS reading_date
    FROM generate_series(1, 10000) a (id)
) TO '/tmp/input.csv' WITH CSV;
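If no database is handy for generating the file, a rough Python equivalent might look like the sketch below. The function names are mine, and the timestamps come out without a time zone offset, so the file will not match the SQL output byte for byte; it only follows the same shape of three comma-separated fields per line.

```python
import csv
import hashlib
import random
from datetime import datetime, timedelta


def fake_rows(count=10000):
    """Yield (location, reading, reading_date) tuples shaped like the SQL output."""
    now = datetime.now()
    for i in range(1, count + 1):
        bucket = i % 100
        # First three hex characters of the md5 of the bucket number,
        # then the bucket zero-padded to four digits.
        prefix = hashlib.md5(str(bucket).encode()).hexdigest()[:3]
        location = f"{prefix}-{bucket:04d}"
        reading = round(i * random.random() * 1000)
        reading_date = now - timedelta(seconds=i % 60)
        yield location, reading, reading_date.isoformat(sep=" ")


def write_fake_csv(path, count=10000):
    """Write the generated rows to path as plain CSV."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(fake_rows(count))
```

Calling write_fake_csv('/tmp/input.csv') would then produce a file the import scripts can read.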
Whew! That was a lot of work. Now, let’s see what happens when we time the inserts on an empty import table:
Well, a little over one second isn’t that bad. But suppose we rewrote the Python script a bit. Bear with me; I’m going to be silly and use the Python script as a simple pass-through. This should simulate a process that applies transformations and outputs another file for direct database import. Here we go:
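import psycopg2
file_input = open('/tmp/input.csv', 'r')
processed = open('/tmp/processed.csv', 'w+')
for line in file_input:
    parts = line.strip().split(',')
    processed.write(','.join(parts) + '\n')
file_input.close()
# Rewind so the copy reads the processed file from the beginning.
processed.seek(0)
db_conn = psycopg2.connect(database = 'postgres', user = 'postgres')
cur = db_conn.cursor()
cur.copy_from(
    processed, 'sensor_log', ',',
    columns = ('location', 'reading', 'reading_date')
)
processed.close()
db_conn.commit()
db_conn.close()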
Now let’s look at the timings involved again:
That’s about three times faster! Considering how simple this example is, that’s actually pretty drastic. We don’t have many indexes, the table has few columns, and the number of rows is relatively small. The situation gets far worse as all of those things increase.
It’s also not the end of our story. What happens if we enable autocommit, so that each insert gets its own transaction? Some ORMs might do this, or a naive developer might try generating a single script full of insert statements, and not know much about transactions. Let’s see:
Oh. Ouch. What normally takes around a third of a second can balloon all the way out to a minute and a half. This is one of the reasons I strongly advocate educating developers on proper data import techniques. It’s one thing for a job to be three to five times slower due to inefficient design. It’s quite another to be nearly 250 times slower simply because a programmer believes producing the output file was fast, so logically, inserting it should be similarly speedy. Both scenarios can be avoided by educating anyone involved with data manipulation.
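The same commit-per-row pattern is easy to reproduce without a PostgreSQL server. The sketch below swaps in Python's bundled sqlite3 module and a throwaway on-disk file purely for illustration; timed_load is a name I made up, and the timings it prints depend on the disk and have nothing to do with the PostgreSQL numbers above.

```python
import os
import sqlite3
import tempfile
import time


def timed_load(rows, commit_every_row):
    """Insert rows into a temporary SQLite file; return (row_count, seconds)."""
    with tempfile.TemporaryDirectory() as workdir:
        conn = sqlite3.connect(os.path.join(workdir, "demo.db"))
        conn.execute(
            "CREATE TABLE sensor_log "
            "(location TEXT, reading INTEGER, reading_date TEXT)"
        )
        conn.commit()
        started = time.perf_counter()
        for row in rows:
            conn.execute("INSERT INTO sensor_log VALUES (?, ?, ?)", row)
            if commit_every_row:
                conn.commit()  # one transaction per row, like autocommit
        conn.commit()  # a single commit covers the batch otherwise
        elapsed = time.perf_counter() - started
        count = conn.execute("SELECT count(*) FROM sensor_log").fetchone()[0]
        conn.close()
    return count, elapsed


sample = [("abc-0001", i, "2014-07-25 12:00:00") for i in range(200)]
for label, per_row in (("one transaction", False), ("commit per row", True)):
    count, seconds = timed_load(sample, per_row)
    print(f"{label}: {count} rows in {seconds:.3f}s")
```

Both runs load identical data; the only difference is how many times the database is asked to make the work durable.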
This doesn’t just apply to new hires. Keeping everyone up to date on new techniques is equally important, as are refresher courses. I care about my database, so when possible, I try to over-share as much information as I can manage. I even wrote and presented several talks which I periodically give to our application developers to encourage better database use. Our company Wiki is similarly full of information, which I also present on occasion, if only because reading technical manuals can be quite boring.
If my database is being abused, it’s my job as a DBA to try and alleviate the situation any way I can. Sometimes, that means telling people what they’re doing wrong, and how they can fix it. I certainly didn’t know all of this ten years ago when I was primarily a coder. But I would have appreciated being pointed in the right direction by someone with more experience in the field.
Your database and users deserve it.
June 11th, 2014 |
Database, Tech Talk | 5 Comments
When I heard about foreign tables using the new postgres_fdw foreign data wrapper in PostgreSQL 9.3, I was pretty excited. We hadn’t upgraded to 9.3 so I waited until we did before I did any serious testing. Having done more experimentation with it, I have to say I’m somewhat disappointed. Why? Because of how authentication was implemented.
I’m going to get this out of the way now: The postgres_fdw foreign data wrapper only works with hard-coded plain-text passwords, forever the bane of security-conscious IT teams everywhere. These passwords aren’t even obfuscated or encrypted locally. The only implemented security is that the pg_user_mapping table is limited to superuser access to actually see the raw passwords. Everyone else sees this:
postgres=> SELECT * FROM pg_user_mapping;
ERROR: permission denied for relation pg_user_mapping
The presumption is that a database superuser can change everyone’s password anyway, so it probably doesn’t matter that it’s hardcoded and visible in this view. And the developers have a point; without the raw password, how can a server-launched client log into the remote database? Perhaps the real problem is that there’s no mechanism for forwarding authentication from database to database.
This is especially problematic when attempting to federate a large database cluster. If I have a dozen nodes that all have the same user credentials, I have to create mappings for every single user, for every single foreign server, on every single independent node, or revert to trust-based authentication.
This can be scripted to a certain extent, but to what end? If a user were to change their own password, this breaks every foreign data wrapper they could previously access. This user now has to give their password to the DBA to broadcast across all the nodes with modifications to the user mappings. In cases where LDAP, Kerberos, GSSAPI, peer, or other token forwarding authentication is in place, this might not even be possible or advised.
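As a rough idea of what that scripting could look like, here is a sketch that only generates the CREATE USER MAPPING statements; running them against each node is left out. The server names, user names, and helper functions are hypothetical, and the quoting assumes standard_conforming_strings is on. Note that the plain-text passwords end up in the generated SQL, which is exactly the problem.

```python
def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote a string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def user_mapping_statements(servers, credentials):
    """Build one CREATE USER MAPPING statement per (server, user) pair.

    servers:     iterable of foreign server names (hypothetical examples)
    credentials: dict of user name -> plain-text password
    """
    statements = []
    for server in servers:
        for user, password in credentials.items():
            statements.append(
                f"CREATE USER MAPPING FOR {quote_ident(user)} "
                f"SERVER {quote_ident(server)} "
                f"OPTIONS (user {quote_literal(user)}, "
                f"password {quote_literal(password)});"
            )
    return statements


for statement in user_mapping_statements(["node2", "node3"], {"alice": "s3cret"}):
    print(statement)
```

Every password change means regenerating and replaying these statements on every node, so the script reduces the typing without removing the underlying dependency on raw passwords.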
Oracle solved this problem by tying DBLINK tables to a specific user during creation time. An access to a certain table authenticates as that user in all cases. This means a DBA can set aside a specific user for foreign table access purposes, and use a password that’s easy to change across the cluster if necessary. Grants take care of who has access to these objects. Of course, since postgres_fdw is read/write, this would cause numerous permissions concerns.
So what are we left with? How can we actually use PostgreSQL foreign tables securely? At this point, I don’t believe it’s possible unless I’m missing something. And I’m extremely confused at how this feature got so far along without any real way to lock it down in the face of malleable passwords. Our systems have dozens of users who are forced by company policy to change their passwords every 90 days, thus none of these users can effectively access any foreign table I’d like to create.
And no, you can’t create a mapping and then grant access to it. In the face of multiple mapping grants, which one would PostgreSQL use? No, if there’s a way to solve this particular little snag, it won’t be that convenient. If anyone has ideas, or would like to go on at length about how wrong I am, please do! Otherwise, I’m going to have to use internal users of my own design and materialized views to wrap the foreign tables; extremely large tables will need some other solution.