PG Phriday: Corralling the Hordes

Friday February 26, 2016

Ah, users. They log in, query things, accidentally delete critical data, and drop tables for giggles. Bane or boon, user accounts are a necessary evil in all databases for obvious reasons. And what would any stash of data be if nobody had access? Someone needs to own the objects, at the very least. So how can we be responsible with access vectors while hosting a Postgres database? We already covered automating grants, so let’s progress to the next step: building a “best practice” access stack.

More experienced (and smarter) people than me have given this process a lot of thought, so why not learn from one of their implementations? In UNIX systems, file access is handled through up to nine grants based on the owner, an arbitrary group, or the unwashed masses, each with read, write, or execute permissions. In the context of Postgres, the shambling hordes can be associated with PUBLIC, USER with ownership, and GROUP for the usual buckets. Similarly, we have SELECT for read, INSERT, UPDATE, or DELETE for write, and EXECUTE as well.

This means we actually have a bit more control in certain areas than UNIX filesystems. In fact, since multiple users or groups have distinct privileges on each object, we have the opportunity to be really creative. If we start with the analogous case, but expand it by leveraging multiple groups, we have a set of standardized roles that resemble a filesystem’s privileges, but with greater granularity. We can even create something similar to a bitmask, so all new objects adhere to our desired grant structure.

So how do we start? The easy case would have the owner, a group for writing, a group for reading, and a group for function execution. Let’s set it up and include the default ACLs from the previous article:

CREATE USER myapp_owner WITH PASSWORD 'test';
CREATE SCHEMA myapp AUTHORIZATION myapp_owner;
ALTER USER myapp_owner SET search_path = myapp;

CREATE GROUP myapp_reader;
CREATE GROUP myapp_writer;
CREATE GROUP myapp_exec;

GRANT USAGE ON SCHEMA myapp TO myapp_reader;
GRANT USAGE ON SCHEMA myapp TO myapp_writer;
GRANT USAGE ON SCHEMA myapp TO myapp_exec;

ALTER DEFAULT PRIVILEGES
  FOR USER myapp_owner
   IN SCHEMA myapp
GRANT SELECT ON TABLES
   TO myapp_reader;

ALTER DEFAULT PRIVILEGES
  FOR USER myapp_owner
   IN SCHEMA myapp
GRANT INSERT, UPDATE, DELETE ON TABLES
   TO myapp_writer;

ALTER DEFAULT PRIVILEGES
  FOR USER myapp_owner
   IN SCHEMA myapp
GRANT USAGE ON SEQUENCES
   TO myapp_writer;

ALTER DEFAULT PRIVILEGES
  FOR USER myapp_owner
   IN SCHEMA myapp
GRANT EXECUTE ON FUNCTIONS
   TO myapp_exec;

ALTER DEFAULT PRIVILEGES
  FOR USER myapp_owner
   IN SCHEMA myapp
REVOKE EXECUTE ON FUNCTIONS
 FROM PUBLIC;

What we’ve done beyond creating the owning user and the groups themselves, is set defaults for new objects so everything is automatically enforced. The myapp_writer role has two of these because tables may rely on sequences for automatically generated IDs, so we need grants for both. We also neglected granting read access to the writer since writing can come from any number of vectors, some of which might not require reading the data they insert. The last thing we do is revoke execution of all stored procedures from PUBLIC, because that’s a silly default and smells of elderberries.

We should also note that, while these commands should be executed by a superuser initially, ownership itself does not require such elevated privileges. It’s unfortunately common for DBAs and operators to create objects as the postgres user itself. Since this user is often the primary superuser for the Postgres instance, deployment systems must be given access to it as well. That is an implicit risk to the entire database constellation that has only one alternative: the DBA deploys all database objects.

While certainly possible, this acts as an impediment to continuous integration, automated testing, and software deployment. Proprietary applications might also depend on database objects that must be created and maintained by the application itself. The DBA literally can’t inject themselves into every potential source.

But the myapp_owner user can. Let’s use it to build some simple object in the myapp schema:

CREATE TABLE test_tab (
  id          SERIAL PRIMARY KEY,
  somedata    VARCHAR NOT NULL,
  created_dt  TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now()
);

CREATE INDEX idx_test_tab_created_dt
    ON test_tab (created_dt);

CREATE VIEW v_test_data AS
SELECT * FROM test_tab
 WHERE somedata LIKE 'test%';

CREATE FUNCTION f_hello()
RETURNS VARCHAR AS
$$

  SELECT 'Hello ' || somedata
    FROM test_tab
   ORDER BY created_dt DESC
   LIMIT 1;

$$ LANGUAGE SQL;

Note that we still have no users beyond the initial owner. That comes later, once we know what username an application or user might desire or require for limited operation. We also strongly recommend that any database users for day-to-day operation should be distinct from the object owner. This prevents tool exploits from escalating to full data control. If a web user only has read access, a security bug would only expose the data, not allow the cracker the capability of changing data or dropping database structures.

Given these constraints, here’s a sample of how we might use our ACLs to constrict access. This is where a DBA could step in:

CREATE USER myapp_web WITH PASSWORD 'webthing';
ALTER USER myapp_web SET search_path = myapp;
GRANT myapp_reader TO myapp_web;

CREATE USER myapp_import WITH PASSWORD 'util';
ALTER USER myapp_import SET search_path = myapp;
GRANT myapp_reader TO myapp_import;
GRANT myapp_writer TO myapp_import;

CREATE USER myapp_report WITH PASSWORD 'calculate';
ALTER USER myapp_report SET search_path = myapp;
GRANT myapp_reader TO myapp_report;
GRANT myapp_exec TO myapp_report;

Now the web user can read all of the data it wants. The import jobs can shove data into the tables and read when reconciliation is necessary. Data reports can call functions to update fact tables or data cubes, and read the results.

This is a simple access model that might be ideal for a standard application deployment that doesn’t need to lock down individual tables. More complicated variations would have no default privilege definitions, and several more groups. For example, we might have created myapp_web as a group instead, and granted access to a limited subset of objects within the schema. Of course, provided we revoke the grants above, we still have that option.

And that’s where this kind of approach really wins out. While we defined a standardized series of default privileges that make it easy to distribute basic access, we can go much further while still maintaining those standards.

What about PUBLIC you ask? No. Stop asking that. They don’t get access. Barring functions, which we fixed, PUBLIC gets no default access to anything, ever. PUBLIC and all other users which don’t match our carefully organized roles, exist merely as objects of ridicule as they swarm fruitlessly at our impenetrable fortifications. The hordes are to be thwarted by any means available, which Postgres satisfies in spades.

After all, we all have our standards.

See Also