PG Phriday: Replacing Business Logic
Back in 2005, I started a new job with a company that did work for other businesses. Their data model was designed by developers and they had no DBA, as is pretty common in smaller organizations. A critical part of our main application relied on an event log that captured customer activity and relayed their final decision to the client for reconciliation. One day someone noticed that there was a bug in the system that resolved the final decision from the event stack, and panic ensued. I put a lot of thought into it that night and had a fix for it the next morning. As usual, my approach relied on Postgres, and carried a fun lesson.
As it turns out, some business logic belongs in the database. Imagine an existing data model with business logic that must adhere to these criteria:
- All front-end events are captured to an event log table.
- Depending on customer activity, an event chain can be initiated.
- Within this event chain, the final decision must be captured.
- Decisions are ranked based on client pay scale for customer progress through the interface.
Nothing proprietary or complicated—just a basic ranking algorithm. Unfortunately in this case, whatever system maintained that ranking had some subtle flaw we needed to fix, and that included past data for the current billing cycle. We needed something that would fix the decision tree, and then prevent it from getting broken again in the future.
We can represent the data with a pretty simple model:
CREATE TABLE event_type ( event_type_id SERIAL PRIMARY KEY, event_name VARCHAR NOT NULL, score INT NOT NULL, can_init BOOLEAN NOT NULL DEFAULT FALSE ); CREATE TABLE event_log ( event_id SERIAL PRIMARY KEY, session_id UUID NOT NULL, event_type_id INT NOT NULL REFERENCES event_type (event_type_id), event_data VARCHAR NOT NULL, created_dt TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now() ); CREATE TABLE decision_log ( session_id UUID PRIMARY KEY, init_event_id INT NOT NULL REFERENCES event_log (event_id), decision_id INT NULL REFERENCES event_log (event_id) ); INSERT INTO event_type (event_name, score, can_init) VALUES ('start', 0, TRUE); INSERT INTO event_type (event_name, score) VALUES ('pending', 10), ('order', 20), ('cancel', 30);
The business logic requires something more, however. To a DBA, it sounds like a job for at least one trigger. But how do we organize them, and how should they fire? Going back to the criteria, we know that all events are captured to the
event_log table, so it needs a trigger. Since it’s a log of events, there are no updates or deletes. If we make the assumption that
decision_log can receive inserts from another vector, it should have a separate trigger to control ranking.
So we need one trigger after inserts on
event_log and one before updates on
decision_log. Here’s what that might resemble:
CREATE OR REPLACE FUNCTION sp_capture_event() RETURNS TRIGGER AS $$ BEGIN PERFORM event_type_id FROM event_type WHERE event_type_id = NEW.event_type_id AND can_init; IF FOUND THEN INSERT INTO decision_log (session_id, init_event_id) VALUES (NEW.session_id, NEW.event_id); ELSE UPDATE decision_log SET decision_id = NEW.event_id WHERE session_id = NEW.session_id; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; CREATE TRIGGER t_capture_events_a_i AFTER INSERT ON event_log FOR EACH ROW EXECUTE PROCEDURE sp_capture_event(); CREATE OR REPLACE FUNCTION sp_update_decision() RETURNS TRIGGER AS $$ BEGIN IF OLD.decision_id IS NULL THEN RETURN NEW; END IF; PERFORM n.event_id FROM event_log o JOIN event_type ot USING (event_type_id), event_log n JOIN event_type nt USING (event_type_id) WHERE o.event_id = OLD.decision_id AND n.event_id = NEW.decision_id AND nt.score > ot.score; IF FOUND THEN RETURN NEW; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; CREATE TRIGGER t_control_decisions_b_u BEFORE UPDATE ON decision_log FOR EACH ROW EXECUTE PROCEDURE sp_update_decision();
The trigger on
event_log can issue an
UPDATE based on event type since only some event types can initiate a new chain, and customers can only have one chain per session. It’s job is a very simple one. The trigger on
decision_log isn’t actually very complex either, and there were several ways we could have implemented it. The query shown basically just checks if the new event outranks the old one, and only allows the update when it does.
We could have independently selected the ranks into variables and compared with an
IF statement, or used
DISTINCT ON to return the top-ranked event, and only allow the update if it was the same event, and so on. It’s been over ten years, so I have no idea how I did it back then, but the solution above relies on the assumption we only need to fetch information for two events. Using an
IF statement would mean variables and the associated overhead, and the
DISTINCT ON approach could theoretically match tens or hundreds of rows before producing the final result.
In any case, we need to test this arrangement. Let’s use the
pgcrypto extension, the Postgres 9.2+ ability to set arbitrary variables, and a prepared statement to create some event traffic:
CREATE EXTENSION pgcrypto; SELECT set_config('local.sess', gen_random_uuid()::text, FALSE); PREPARE event_ins(INT, VARCHAR) AS INSERT INTO event_log (session_id, event_type_id, event_data) SELECT current_setting('local.sess')::UUID, $1, $2; EXECUTE event_ins(1, 'starting up...'); EXECUTE event_ins(2, 'ordering stuff...'); EXECUTE event_ins(3, 'Ordered!'); SELECT init_event_id, decision_id FROM decision_log; init_event_id | decision_id ---------------+------------- 1 | 3
Not bad. By this point, we’ve created a few basic events in the event log, and the decision log appears to be working as intended. But we have one more test to verify: will it reject outranked events? That’s important, because that’s what got is into trouble in the first place. If we can prove the trigger will directly reject bad changes, the integrity of the decision tree is preserved in a way the application can’t circumvent.
Let’s see what happens if the customer tries to screw up the process:
EXECUTE event_ins(2, 'No... wait...'); SELECT init_event_id, decision_id FROM decision_log; init_event_id | decision_id ---------------+------------- 1 | 3 EXECUTE event_ins(4, 'I changed my mind.'); SELECT init_event_id, decision_id FROM decision_log; init_event_id | decision_id ---------------+------------- 1 | 5
Pending events have less priority than order events, so the decision tree did not change when the customer went back in the chain. The events are still logged and stamped so we can see the entire session, but the final state which the client pays for is still preserved. In this particular instance, the customer decided they didn’t want whatever service was offered, and the client got a free pass.
That solved the conundrum of preventing further corruption of our data, but we still had to fix the existing billing cycle. Luckily, the
DISTINCT ON approach that isn’t quite correct for the trigger, works great on rebuilding the ideal rows. A fix would go something like this:
WITH ranked AS ( SELECT DISTINCT ON (session_id) e.session_id, e.event_id AS decision_id FROM event_log e JOIN event_type t USING (event_type_id) WHERE e.created_dt >= date_trunc('month', now()) AND NOT t.can_init ORDER BY e.session_id, t.score DESC ) UPDATE decision_log d SET decision_id = r.decision_id FROM ranked r WHERE d.session_id = r.session_id AND d.decision_id != r.decision_id;
That single update statement finds all sessions for the current billing cycle and gets the proper ranked decision event. Then it only changes the decisions that were wrong due to the bug. Even if that fix contains millions of rows, it’s a one-time operation. Our trigger ensures only valid entries will make it into the decision table from now on, thus ending the chaos. The only remaining task was to retire the code that previously maintained the decision stack.
I came up with that fix soon after starting at the company, and it remained in place until at least the day I left almost five years later. It wasn’t a particularly difficult solution, and it would be trivial to argue that more complicated processes should be implemented at the application level instead. But for simple integrity-enforcing rules, some business logic works best when it can’t be circumvented, even accidentally.
In the end, as basic as this approach was, it uses several Postgres features and SQL syntax. Triggers, functions,
DISTINCT ON, and
UPDATE ... FROM are only the beginning. Being able to use all available tools greatly enhances developer agility and understanding, as well as the quality of the final result. So don’t shy away from putting some code in the database; sometimes, that’s where it belongs.