May 27th, 2014 |
Contemplation, Database, News, Tech Talk | 4 Comments
This May, I attended my first international conference: PGCon 2014. Though the schedule spanned from May 20th to May 23rd, I came primarily for the talks. Then there was the Unconference on the 24th. I’d never heard of such a thing, but it was billed as a good way to network and find out what community members want from PostgreSQL. After attending the Unconference, I must admit I’m exceptionally glad it exists; it’s something I believe every strong Open Source project needs.
Why do I say that, having only been to one of them? It’s actually fairly simple. Around 10AM Saturday, everyone piled into the large lecture hall and had a seat. There were significantly fewer attendees, but most of the core committers remained for the festivities. We were promised pizza and PostgreSQL, and that’s all anyone needed. Josh Berkus started the festivities by announcing the rules and polling for ideas. The final schedule was pretty interesting in itself, but I was more enamored by the process and the response it elicited.
I’m no stranger to the community, and the mailing lists are almost overwhelmingly active. But these conversations, nearly all of them, are focused on assistance and hacker background noise. The thing that stood out to me during the Unconference planning was its organic nature. It wasn’t just that we chose the event schedule democratically. It wasn’t the wide range of topics. It wasn’t even the fact core members were there to listen. It was the engagement.
These people were excited and enjoying talking about PostgreSQL in a way I’ve never witnessed, and I’ve spoken at Postgres Open twice so far. I’ve seen several talks, been on both sides of the podium, and no previous experience even comes close. We were all having fun brainstorming about PostgreSQL and its future. For one day, it wasn’t about pre-cooked presentations chosen via committee, but about what “the community” wanted to discuss.
When it came time for the talks themselves, this atmosphere persisted. We agreed and disagreed, we had long and concise arguments for and against ideas, clarified our positions, and generally converged toward a loose consensus. And it was glorious. I know we were recording the sessions, so if you have the time when the videos are available, I urge you to watch just one so you can see the beauty and flow of our conversations.
I feel so strongly about this that I believe PGCon needs to start a day earlier. One unfortunate element about the Unconference is that it happens on a Saturday, when everyone wants to leave and return to their families. Worse, there is a marathon on Sunday, meaning it is difficult or even impossible to secure a hotel room for the Saturday event. People tend to follow the path of least resistance, so if there is a problem getting lodging, they won’t go.
And that’s a shame. Having a core of interested and engaged community members not only improves the reputation of PostgreSQL, but its advocacy as well. If people feel they can contribute without having to code, they’ll be more likely to do so. If those contributions, no matter how small, are acknowledged, their progenitors will stick around. I believe this is the grass-roots effort that makes PostgreSQL the future of the database world, and whoever came up with the Unconference deserves every accolade I can exclaim.
We need more of this. PostgreSQL has one of the most open communities I’ve had the pleasure of participating in, and that kind of momentum can’t really be forced. I hope every future PostgreSQL conference in every country has one of these, so everyone with the motivation can take part in the process.
Finally, find your nearest PostgreSQL User Group, and join the community. You’ll be glad you did.
May 14th, 2014 |
Database, Tech Talk | 6 Comments
PostgreSQL is a pretty good database, and I enjoy working with it. However, there is an implementation detail that not everyone knows about, which can drastically affect table performance. What is this mysterious feature? I am, of course, referring to foreign keys.
Foreign keys are normally a part of good database design, and for good reason. They inform about entity relationships, and they verify, enforce, and maintain those relationships. Yet all of this comes at a cost that might surprise you. In PostgreSQL, every foreign key is maintained with an invisible system-level trigger added to the source table in the reference. At least one trigger must go here, as operations that modify the source data must be checked that they do not violate the constraint.
This query is an easy way to see how many foreign keys are associated with every table in an entire PostgreSQL database:
SELECT t.oid::regclass::text AS table_name, count(1) AS total
FROM pg_constraint c
JOIN pg_class t ON (t.oid = c.confrelid)
GROUP BY table_name
ORDER BY total DESC;
With this in mind, consider how much overhead each trigger incurs on the referenced table. We can actually calculate this overhead. Consider this function:
CREATE OR REPLACE FUNCTION fnc_check_fk_overhead(key_count INT)
RETURNS VOID AS
CREATE TABLE test_fk
id BIGINT PRIMARY KEY,
INSERT INTO test_fk
SELECT generate_series(1, 100000), repeat(' ', 20);
CLUSTER test_fk_pkey ON test_fk;
FOR i IN 1..key_count LOOP
EXECUTE 'CREATE TABLE test_fk_ref_' || i ||
' (test_fk_id BIGINT REFERENCES test_fk (id))';
FOR i IN 1..100000 LOOP
UPDATE test_fk SET junk = ' '
WHERE id = i;
DROP TABLE test_fk CASCADE;
FOR i IN 1..key_count LOOP
EXECUTE 'DROP TABLE test_fk_ref_' || i;
$$ LANGUAGE plpgsql VOLATILE;
The function is designed to create a simple two-column table, fill it with 100,000 records, and test how long it takes to update every record. This is purely meant to simulate a high-transaction load caused by multiple clients. I know no sane developer would actually update so many records this way.
The only parameter this function accepts, is the amount of tables it should create that reference this source table. Every referring table is empty, and has only one column for the reference to be valid. After the foreign key tables are created, it performs those 100,000 updates, and we can measure the output with our favorite SQL tool. Here is a quick test with
On our system, these timings were collected several times, and averaged 2961ms, 3805ms, 4606ms, 5089ms, and 5785ms after three runs each. As we can see, after merely five foreign keys, performance of our updates drops by 28.5%. By the time we have 20 foreign keys, the updates are 95% slower!
I don’t mention this to make you abandon foreign keys. However, if you are in charge of an extremely active OLTP system, you might consider removing any non-critical FK constraints. If the values are merely informative, or will not cause any integrity concerns, a foreign key is not required. Indeed, excessive foreign keys are actually detrimental to the database in a very tangible way.
I merely ask you keep this in mind when designing or revising schemas for yourself or developers you support.
May 8th, 2014 |
Database, Tech Talk | No Comments
With the release of PostgreSQL 8.4, the community gained the ability to use CTE syntax. As such, this is a fairly old feature, yet it’s still misunderstood in a lot of ways. At the same time, the query planner has been advancing incrementally since that time. Most recently, PostgreSQL has gained the ability to perform index-only scans, making it possible to fetch results straight from the index, without confirming rows with the table data.
Unfortunately, this still isn’t enough. There are still quite a few areas where the PostgreSQL query planner is extremely naive, despite the advances we’ve seen recently. For instance, PostgreSQL still can’t do a basic loose index scan natively. It has to be tricked by using CTE syntax.
To demonstrate this further, imagine this relatively common scenario: an order processing system where clients can order products. What happens when we want to find the most recent order for all current customers? Boiled down to its minimum elements, this extremely simplified table will act as our order system.
CREATE TABLE test_order
client_id INT NOT NULL,
order_date TIMESTAMP NOT NULL,
filler TEXT NOT NULL
Now we need data to test with. We can simulate a relatively old order processing system by taking the current date and subtracting 1,000 days. We can also bootstrap with 10,000 clients, and make the assumption that newer clients will be more active. This allows us to represent clients that have left our services as time goes on. So we start with this test data:
INSERT INTO test_order
(CURRENT_DATE - INTERVAL '1000 days')::DATE
+ generate_series(1, s1.id%1000),
repeat(' ', 20)
FROM generate_series(1, 10000) s1 (id);
generate_series function is very handy for building fake data. We’re still not ready to use that data, however. Since we want to find the most recent order for all customers, we need an index that will combine the
order_date columns in such a way that a single lookup will provide the value we want for any particular client. This index should do nicely:
CREATE INDEX idx_test_order_client_id_order_date
ON test_order (client_id, order_date DESC);
Finally, we analyze to make sure the PostgreSQL engine has the most recent stats for our table. Just to make everything easily repeatable, we also set the
default_statistics_target to a higher value than default as well.
SET default_statistics_target TO 500;
Now we’ll start with the most obvious query. Here, we just use the
client_id column and look for the max
order_date for each:
SELECT client_id, max(order_date)
GROUP BY client_id;
The query plan is fairly straight-forward, and will probably include a sequence scan. On the virtual server we’re testing with, the total runtime for us ended up looking like this:
Total runtime: 1117.408 ms
There is some variance, but the end result is just over one second per execution. We ran this query several times to ensure it was properly cached by PostgreSQL. Why didn’t the planner use the index we created? Let’s assume the planner doesn’t know what
max does, and treats it like any other function. With that in mind, we can exploit a different type of syntax that should make the index much more usable. So let’s try
DISTINCT ON with an explicit
ORDER clause that matches the definition of our index:
SELECT DISTINCT ON (client_id)
ORDER BY client_id, order_date DESC;
Well, this time our test system used an index-only scan, and produced the results somewhat faster. Our new runtime looks like this:
Total runtime: 923.300 ms
That’s almost 20% faster than the sequence scan. Depending on how much bigger the table is than the index, reading the index and producing these results can vary significantly. And while the query time improved, it’s still pretty bad. For systems with tens or hundreds of millions of orders, the performance of this query will continue to degrade along with the row count. We’re also not really using the index effectively.
Reading the index from top to bottom and pulling out the desired results is faster than reading the whole table. But why should we do that? Due to the way we built this index, the root node for each client should always represent the value we’re looking for. So why doesn’t the planner simply perform a shallow index scan along the root nodes? It doesn’t matter what the reason is, because we can force it to do so. This is going to be ugly, but this query will act just as we described:
WITH RECURSIVE skip AS
(SELECT client_id, order_date
ORDER BY client_id, order_date DESC
(SELECT (SELECT min(client_id)
WHERE client_id > skip.client_id
) AS client_id,
WHERE client_id = (
WHERE client_id > skip.client_id
) AS order_date
WHERE skip.client_id IS NOT NULL)
The query plan for this is extremely convoluted, and we’re not even going to try to explain what it’s doing. But the final query execution time is hard to discount:
Total runtime: 181.501 ms
So what happened here? How can the abusive and ugly CTE above outwit the PostgreSQL query planner? We use the same principle as described in the PostgreSQL wiki for loose index scans. We start with the desired maximum order date for a single
client_id, then recursively begin adding clients one by one until the index is exhausted. Due to limitations preventing us from using the recursive element in a sub-query, we have to use the
SELECT clause to get the next client ID and the associated order date for that client.
This technique works universally for performing sparse index scans, and actually improves as cardinality (the number of unique values) decreases. As unlikely as that sounds, since we are only using the root nodes within the index tree, performance increases when there are less root nodes to check. This is the exact opposite to how indexes are normally used, so we can see why PostgreSQL doesn’t natively integrate this technique. Yet we would like to see it added eventually so query authors can use the first query example we wrote, instead of the excessively unintuitive version that actually produced good performance.
In any case, all PostgreSQL DBAs owe it to themselves and their clusters to learn CTEs. They provide a powerful override for the query planner, and helps solve the edge cases it doesn’t yet handle.
March 27th, 2014 |
Database, Tech Talk | 2 Comments
I’ve managed to convince my employer to open source one of the tools I recently wrote. That tool goes by the name of walctl, and I believe the importance of this kind of tool can not be overstated.
The PostgreSQL Write Ahead Log (WAL) files are key to crash recovery, point in time recovery, and all standby use not derived from streaming replication. WAL files are extremely critical to proper database operation. Yet their archival is treated as an afterthought. Some people use regular
cp while others go as far as to use
rsync to send data to a replica server.
This isn’t enough. A much safer architecture is to involve three servers. One server produces WAL files. One server acts as a storage and distribution location. One (or more) final server consumes the WAL files as necessary. In instances where streaming replication gets disconnected and the replica is too far behind for the WAL files the master server has available, the archive is a good mechanism for catching up.
Well, wallctl does all of that. It forces you to do all of that. Just set up a couple SSH keys prior to installation, and it needs nothing else. I added a database clone tool as a convenience which needs superuser access to the master database, but otherwise, walctl is very unobtrusive. This toolkit is simple, and will have a few limited advancements in updated versions. It’s meant to be small, fast, and light, following the UNIX philosophy of doing one thing well. In fact, I wrote this tool after examining several other PostgreSQL WAL and replication management systems. Almost all of them require transmitting WAL files directly from the master server to one or more slave systems. But this presumes you only have one replica, or forces the master server to do much more work by contacting several systems in a row. Why not let the slaves do all the hard work?
I highly recommend all DBAs use a WAL management tool of some kind, no matter which. Barman and repmgr are great alternatives that do much more than walctl. But if all you want to do is stash your WAL files in a safe location that multiple replica servers can utilize, this is the easier path.
December 12th, 2013 |
Book, Review | 1 Comment
There comes a time in every DBA’s life, that he needs to add functionality to his database software. To most DBAs, and indeed for most databases, this amounts to writing a few stored procedures or triggers. In extremely advanced cases, the database may provide an API for direct C-language calls. PostgreSQL however, has gone above and beyond this for several years, and have continuously made the process easier with each iteration.
So once again, I’m glad to review a book by three authors in the industry who either work directly on PostgreSQL internals, or use it extensively enough to contribute vastly important functionality. Hannu Krosing, Jim Mlodgenski, and Kirk Roybal collaborated to produce PostgreSQL Server Programming, a necessary and refreshing addition to the PostgreSQL compendium. I don’t know who contributed each individual chapter, but I can make a fairly educated guess that anything PL/Proxy related came from Mr. Krosing, its original designer.
As usual for a book of this type, things start off with relative simplicity. The authors make a very important note I try to convey to staff developers regularly: let the database do its job. The database is there to juggle data, handle set theory, and otherwise reduce traffic to and from the application to a minimum. This saves both network bandwidth and processing time on the front end, which can be combined with caching to service vastly larger infrastructures than otherwise possible.
Beyond this, are the basics. Using stored procedures, taking advantage of triggers, writing functions that can return sets. The gamut of examples runs from data auditing and logging, to integrity control and a certain extent of business logic. One or more of the authors suggests that functions are the proper interface to the database, to reduce overhead, and provide an abstract API that can change without directly altering the application code. It is, after all, the common denominator in any library or tool dealing with the data. While I personally don’t agree with this type of approach, the logical reasoning is sound, and can help simplify and prevent many future headaches.
And that’s where this book really shines: examples. There’s Python, C, PLPGSQL, triggers, procedures, compiled code, variants of several difficult techniques, and more. In the C case, things start with a simple “Hello World” type you might see in a beginning programming class, and the author steps through increasingly complex examples. Eventually, the C code is returning sets of sets of data per call, as if simulating several table rows.
In the more concrete, the authors provide copious links to external documentation and Wiki pages for those who want to explore this territory in more depth. Beyond that, they want readers to know about major sources of contributed code and extensions, all to make the database more useful, and perhaps entice the reader join in the fun. Everything from installing, to details necessary for writing extensions is covered, so that is well within the realm of possibility!
I already mentioned that at least one of the authors encourages functional database access instead of direct SQL. Well, there’s more than the obvious reasons for this: PL/Proxy is a procedural language that uses functions to facilitate database sharding for horizontal scalability. Originally designed for Skype, PL/Proxy has been used by many other projects. While it might not apply to everyone, sharding is a very real technique with non-trivial implementation details that have stymied the efforts of many development teams.
I actually would have liked a more extensive chapter or two regarding PL/Proxy. While several examples of functional access are outlined for a chat server, none of these functions are later modified in a way that would obviously leverage PL/Proxy. Further, Krosing doesn’t address how sequences should be distributed so data on all the various servers get non-conflicting surrogate keys. It would have been nice to see an end-to-end implementation.
All in all, anyone serious about their PostgreSQL database should take a look. Getting a server up and running is only half the story; making the database an integral component of an application instead of a mere junk drawer provides more functionality with fewer resources. It’s good to see a book that not only emphasizes this, but conveys the knowledge in order to accomplish such a feat. Hannu, Jim, and Kirk have all done the community a great service. I hope to see revisions of this in the future as PostgreSQL matures.
« Older Posts
Newer Posts »