When Query Hints Attack

Tuesday March 01, 2011

Page content

Ah, query hints. For all those times when the database isn’t doing what you want, they’re a useful tool for forcing the query optimizer to perform your bidding. But in this case, not only is the road to Hell paved with good intentions, it’s paved with a frictionless slide directly into a wood chipper that empties handily into an active volcano. With query hints, be careful what you wish for, because—to the detriment of all you hold dear—you just might get it.

In the Beginning

Before we realize why query hints are necessarily bad, let’s go over a few basics. Think about computer code. Clearly a computer can’t execute typed text as written! Either it has to be transformed into processor instructions, as seen in languages such as C or Pascal, or some intermediate byte-code suitable to an interpreter, such as Python or Java. This is normally the job of some kind of lexical scanner: to interpret the text and, through some convoluted grammatical magic, twist it into computer lingo.

But that’s only half of the story. More advanced compilers add another step, because just executing the raw translations would produce terrible performance. No, most compilers and interpreters have optimizers. They find unused instructions, loops that can be unrolled, processes that can be replaced with local assembler calls instead—any number of tweaks previously available only to experienced assembly programmers.

The query optimizer is doing this for your SQL. But unlike a compiler or interpreter which can target specific and highly derivative chipsets, it is forced to contend with an infinitely amorphous array of tables, views, stored procedures, triggers, constraints, indexes, and all the inherent combinations therein. Ideally, it has to know how much system memory is available, how fast the disks are, how much caching is available, what kind of IO scheduler the OS is using, the size and cardinality of every database object, heuristics on how that data is distributed, and what locks are necessary to preserve data being accessed in massive parallel.

Yet despite all this, query planners have long since exceeded all but the best hand-optimized plans. Few DBAs can possibly contend with all those moving parts, and that’s without accounting for the fact data never stops accumulating, and that too, will drastically alter expected execution paths.

WTF are Hints for, then?

So now we know what query planners are, and what they do. So what then, are hints? Take a look at this BS pseudocode:

SELECT *
  FROM my_table USE INDEX (idx_table_last_updated)
 WHERE last_updated > CURRENT_DATE - INTERVAL '1 week';

That ‘USE INDEX’ part is the hint. In this case, we’re telling the planner we want it to use the idx_table_last_updated index because we think having the last_updated column in our search makes it the best candidate. This is deceptively easy, especially since it seems so obvious.

Ignore the syntax for a moment, which is a horrible bastardization of PostgreSQL and MySQL. The point is that hints can override the planner’s normal path, because you either know something the planner doesn’t, or the planner is doing something horribly wrong.

For ad-hoc queries, this is actually great. Planners can be very stubborn, and are notorious for insisting their decisions are correct, you filthy human. And hints are just what their name implies: hints. Optimizers are free to ignore them if a hint is stupid or is literally impossible to apply. Besides that, the query optimizer actively hates you.

But in most cases, when a hint is necessary, there’s a very good reason. Maybe the collected statistics on a column just aren’t measuring up. Perhaps it’s reverse-scanning a date index and improperly ignoring a faster seek-based approach on an index you created to speed up a foreign key. Whatever the reason, it’s tempting to just scream “Fuck you, optimizer!” and lobotomize it with a hint or two.

This is where we start using our feet for target practice with a loaded AK47.

Target Practice

Now that your foot has been replaced with a bloody stump, let’s examine precisely what unhappiness transpired. For the sake of argument, try to remember that for most disk-based systems, even in a RAID-10 comprised of a dozen spindles, a random seek is about a hundred times slower than a sequential scan. Then note that, while an index seek is O(log(n)), rows from the table must be fetched independently, and that’s a seek and fetch for each match.

The Data Changed

Oh, unlucky you! Your application started getting really popular lately and it turns out one week of data is actually a tenth of the table. Your million-row table would have been a slow sequence scan, but now it’s an even slower 100,000 disk seeks. Since the table is now bigger than when you wrote the query, the results are coming back slower each time. Worse, the more popular your site gets, the more data is available in the last week, and the slower it’ll be. All because it’s fetching rows individually hundreds of thousands of times. Ouch!

That’s just one possible result. There are dozens more. Predicting how data will change and invalidate a well-intentioned hint is impossible. But consider the hint originally provided a best-case scenario; there’s nowhere to go but down.

A Better Index was Available

Hints have a nasty habit of making it into production code where the query is liable to be modified by someone who may overlook the hint. Our original query was changed to this:

SELECT *
  FROM my_table USE INDEX (idx_table_last_updated)
 WHERE last_updated > CURRENT_DATE - INTERVAL '1 week'
   AND table_format = 'ugly';

The conscientious coder thoughtfully added an index on table_format because she knew it was highly selective. Table formats of ‘ugly’ only occur in 0.001% of that table, making it an index second in performance only to the primary key. The planner wants to use the index on table_format but it can’t because the hint explicitly told it not to.

The Planner got Better

Probably the most insidious of bullets to your foot are improvements in the optimizer or database engine. Suppose the database can now maintain in-memory bitmap indexes for highly accessed objects. Combined with a merge of the table contents, queries can get orders of magnitude faster with no work on your part at all. Unless you used a query hint, in which case, you get only a swift kick in the family jewels.

God Damn it!

Exactly. The development community of PostgreSQL is well aware of these shortcomings, and as a consequence, have not created a formal hint syntax. Sure, you can temporarily disable sequence scans, index scans, nested loops, and so on, but no query can have hardcoded rules to override the optimizer. It’s not that they don’t want hints, but it’s better to spend time making the optimizer better, fixing bugs in its assumptions, improving table and column statistics, and rewriting problematic queries when necessary.

And that really is the right answer. To a certain degree, they may come to a compromise and build hints that don’t override the planner, but give it more information. Yet one could still argue that collecting better statistics would have the same effect, and simultaneously make all queries faster. It’s mostly for this reason that nobody has actually proposed a hint syntax of any kind that wasn’t eventually abandoned in favor of more pressing issues.

But where does that leave a developer or DBA who needs a quick fix for some crazy edge case the planner simply won’t handle properly? Do they just wait for months until a planner fix or improvement comes down the line? No way. DBAs are better than that.

No matter how good the planner is, it can be tricked. Maybe an anti-join:

SELECT *
  FROM my_table a
 WHERE last_updated > CURRENT_DATE - INTERVAL '1 week'
   AND EXISTS (
       SELECT 1
         FROM my_table b
        WHERE a.table_id = b.table_id
          AND table_format = 'ugly'
       );

Or how about a sub-select?

SELECT *
  FROM (
       SELECT *
         FROM my_table b
        WHERE table_format = 'ugly'
       )
 WHERE last_updated > CURRENT_DATE - INTERVAL '1 week';

Maybe creating a new partial index would work:

CREATE INDEX idx_table_format_updated_format
    ON my_table (current_date DESC)
 WHERE table_format = 'ugly';

This last one is especially nice and the DESC will work in any PostgreSQL of 8.4 or later. It’s like having a two-column index, but in this case, only indexes the lower incidence ‘ugly’ match, and orders the dates in descending order. So when searching for the most recently updated ‘ugly’ tables, we get a huge speed boost. Why? Because the planner won’t use a reverse index scan, which runs at the same speed as a series of random seeks. Instead, it will use a sequential index scan over a much smaller index.

In the End

The real solution to a bad query is to try and tell the planner what you want and let it decide how to calculate the results. If that doesn’t work, statistics for one or more columns can be increased and that might give the planner a better idea. If that doesn’t work, indexes may be the answer. If that still fails, there is still a veritable plethora of alternatives that won’t make the query perform worse some time in the future because old hints no longer apply.

It’s tempting. It’s right there in the damn manual! Can you resist pushing the shiny, candylike button? All I can suggest is that you try. All that awaits hint users is wailing and gnashing of teeth.

Please don’t use hints. Your sanity and your query optimizer will thank you.