PG Phriday: Being A Tattletale

In a heterogeneous database environment, it’s not uncommon for object creation and modification to occur haphazardly. Unless permissions are locked down to prevent it, users and applications will create tables, modify views, or otherwise invoke DDL without the DBA’s knowledge. Or perhaps permissions are exceptionally draconian, yet they’ve been circumvented or a superuser account has gone rogue. Maybe we just need to audit database modifications to fulfill oversight obligations. Whatever the reason, Postgres has it covered with event triggers.

Now, event triggers have only been around a relatively short while, having appeared in version 9.3. Even though I was personally excited to see the feature, priorities changed and they fell off my RADAR for quite a while. Yet the functionality they offer is exceedingly useful and worthy of exploration. So here’s a simple scenario: email DDL (Database Definition Language) to a DBA team whenever it occurs. This would mean anything from CREATE TABLE to DROP RULE, or anything else from this chart.

This would normally be fairly easy, but our first choice of language, the native PL/pgSQL doesn’t have email functionality. Can we use Python? Let’s try:

CREATE OR REPLACE FUNCTION sp_email_command()
RETURNS event_trigger AS
$$
    print 'Event Trigger!'
$$ LANGUAGE plpythonu;

ERROR:  PL/Python functions cannot return type event_trigger

Nope. While this is a somewhat unfortunate oversight, it’s not a roadblock. We can use a standard PL/pgSQL function as a wrapper to call our Python email routine. So let’s just do that.

The next thing to consider is what we should include in the email. Thankfully there’s a wealth of information available regarding database sessions. It’s always a good idea to be specific while snitching, so our email should minimally report who the user is, where they came from, where they went, and what they did. Here’s a very quick and dirty event trigger that does all of that:

CREATE OR REPLACE FUNCTION sp_tattle_ddl()
RETURNS event_trigger AS
$$
BEGIN
    RAISE NOTICE 'Type: %', TG_TAG;
    RAISE NOTICE 'Command: %', current_query();
    RAISE NOTICE 'DB Name: %', current_database();
    RAISE NOTICE 'DB User: %', session_user;
    RAISE NOTICE 'DB Port: %', inet_server_port();
    RAISE NOTICE 'Server Host: %', inet_server_addr();
    RAISE NOTICE 'Client Host: %', inet_client_addr();
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER t_event_audit_all
    ON ddl_command_start
EXECUTE PROCEDURE sp_tattle_ddl();

-- Let's see this in action!

CREATE TABLE foo (bar INT);

NOTICE:  Type: CREATE TABLE
NOTICE:  Command: CREATE TABLE foo (bar INT);
NOTICE:  DB Name: postgres
NOTICE:  DB User: postgres
NOTICE:  DB Port: 5432
NOTICE:  Server Host: pg-dev-1
NOTICE:  Client Host: win-desktop-15

Not a bad start! The reason we used session_user in lieu of current_user, is because the latter is transient. We want to know exactly who connected, not who they’re masquerading as. This makes it much easier to perform forensics if such becomes necessary.

Now that we know we can capture the DDL and the surrounding environment, let’s create a Python function that performs the email action for us. We don’t need to worry especially about overhead, since DDL is not nearly as time-sensitive as regular queries. Modifying a database shouldn’t happen often, so we should feel free to perform reverse DNS or other potentially time-consuming operations, as long as we don’t get excessive.

CREATE OR REPLACE FUNCTION sp_email_command(
    target TEXT, tg_type TEXT, command TEXT, db_name TEXT,
    db_user TEXT, db_port INT, server_addr INET, client_addr INET
)
RETURNS VOID AS
$$
    import smtplib
    import socket
    import email

    # For the sake of readability, we should translate server
    # and client host names from address information when
    # possible. If these weren't sent, that means we're dealing
    # with a local UNIX socket connection, and should label it
    # with the server our trigger fired on.

    local_name = socket.gethostname()
    server_name = local_name
    client_name = local_name

    if server_addr:
        try:
            server_name = socket.gethostbyaddr(server_addr)[0]
        except:
            server_name = server_addr

    if client_addr:
        try:
            client_name = socket.gethostbyaddr(client_addr)[0]
        except:
            client_name = client_addr

    # Put together the email message. If we use a standard label
    # format, automated systems or rules can easily route these.

    msg = email.message_from_string(
        'Container : %s\n' % server_name +
        'DB Port : %s\n' % (db_port or 'N/A') +
        'DB User : %s\n' % db_user +
        'DB Name : %s\n' % db_name +
        'Client Host : %s\n' % client_name +
        'Full Command : \n\n' + command
    )

    msg['Subject'] = '%s DDL Change on %s Detected!' % (tg_type, server_name)
    msg['From'] = 'postgres@%s' % server_name
    msg['To'] = target

    # Transmit the actual message to the local server. It
    # should then forward to the appropriate upstream system.

    s = smtplib.SMTP('localhost')
    s.sendmail('postgres@%s' % server_name, target, msg.as_string())
    s.quit()

$$ LANGUAGE plpythonu;

This is a pretty simple email. There are no attachments, the body is just a glorified series of captured session attributes and the modification. If we were more ambitious, we could implement templating or produce an HTML-encoded version. For now though, this function works well and even covers a couple of edge cases for local connections, and IPs that don’t reverse.

Our final task is to replace the trigger wrapper function with something that will call the Python version. Here we go:

CREATE OR REPLACE FUNCTION sp_tattle_ddl()
RETURNS event_trigger AS
$$
BEGIN
    PERFORM sp_email_command(
        'dbas@my-company.com', TG_TAG, current_query(),
        current_database(), session_user,
        inet_server_port(), inet_server_addr(),
        inet_client_addr()
    );
END;
$$ LANGUAGE plpgsql;

Now instead of issuing a NOTICE for each desirable attribute, we produce an email that details everything. Otherwise, not much has changed. A more complicated version might only call the email for some specific DDL, or perhaps CC the end user in certain cases.

What about the Postgres logs? After all, the log_statement option has a setting specifically for logging database modifications. It’s a good start, and could indeed be consumed by applications like Logstash or Elasticsearch. Event triggers however, give us much more control over presentation and circumstances. Instead of an email, we could insert these details into a remote auditing database, or to a federal agency. We could utilize foreign data wrappers to directly invoke APIs from any number of systems.

But primarily, the trigger approach is immediate and far more difficult to forge as a source. The integrity of our chain of custody is just as important as the audited DDL itself. The modification wasn’t scraped from a log, or parsed from related activity, but reported from the database instance as it happened. Written properly, event triggers can even reject any modifications that aren’t properly reported or acknowledged by the expected recipient.

Such ruthless control over database architecture probably isn’t warranted in the vast majority of implementations. But Postgres makes it possible, even easy, to crank the paranoia level up to 11 when the situation calls for it. Once again, Postgres excels in its versatility.

The principal caveat here is that event triggers don’t fire on themselves, making it possible to circumvent them. Hopefully this is corrected in a future version so extremely security-conscious admins can use it as another reliable layer in the onion.