PG Phriday: JSON and JSONB Revisited

With Postgres 9.5 on the horizon, I figured it’s a good time to see how things have advanced since my last dive into that particular ocean. This is probably particularly relevant since even MongoDB, a JSON-driven NoSQL database, is now partially powered by Postgres. A lot of people found that particular revelation quite shocking, but maybe they shouldn’t, given the advancements embedded within the last couple of Postgres releases.

As it turns out, there are quite a few advancements that really make JSONB a powerful addition to Postgres. Note that this does not apply to JSON. While JSONB accumulates several new elements of functionality, JSON itself remains almost completely unchanged as of Postgres 9.5rc1.

As an example of this, let’s take a look at the new concatenation operator with regards to JSON and JSONB:

SELECT '{"Hairy": true}'::JSONB || '{"Wobbly": false}'::JSONB;

 {"Hairy": true, "Wobbly": false}

SELECT '{"Hairy": true}'::JSON || '{"Wobbly": false}'::JSON;

ERROR:  operator does not exist: json || json

It would probably be best to forget JSON even exists at this point. When interacting with JSON in a Postgres database, JSONB is clearly the way forward. Use it for table columns, use it for casting, use it for making delicious casseroles. But wait… didn’t we just do something that once required a PL/Python or PL/V8 function? Yes we did, and that’s only the beginning; the concatenate operator will also overwrite existing fields:

SELECT '{"Hairy": true, "Excited": false}'::JSONB || 
       '{"Excited": true, "Wobbly": true}'::JSONB;

 {"Hairy": true, "Wobbly": true, "Excited": true}

It’s nice to see that we don’t have to separate the operations, either. Replacing a value and adding a new one can both be done simultaneously. Alternatively, if we want to remove keys, the ‘-’ operator will do that directly. For example:

SELECT '{"Hairy": true, "Excited": false}'::JSONB - 'Excited';

 {"Hairy": true}

Note that this kind of interaction works with keys, not actual JSON. This might seem counterintuitive at first since ‘||’ and ‘-’ are not complementary operations. But neither are the actions taking place. The first is more of a merge, in which case the ‘||’ operator is probably being misused. The second is a subtraction based on a path to JSON—an important distinction. This isn’t a ‘+’ vs. ‘-’ situation, and can’t really be construed as such.

We also gain the ability to subtract a full path to a nested JSON element. Since keys can nest multiple levels deep, we would need an array to represent a path with a depth greater than one. We’d also probably want a different operator so we don’t mix keys with key paths. Thus 9.5 also introduces the ‘#-’ operator:

SELECT '{"Hairy": true, "Status": {
         "Excited": true, "Burning": true}
        }'::JSONB #- '{Status,Burning}'::TEXT[];

 {"Hairy": true, "Status": {"Excited": true}}

-- This also works

SELECT '{"Hairy": true, "Status": {
         "Excited": true, "Burning": true}
        }'::JSONB #- '{Status,Burning}';

 {"Hairy": true, "Status": {"Excited": true}}

It’s nice that the TEXT array casting is optional, since that usage is somewhat esoteric in comparison to others. Interestingly, the opposite operation for ‘#-’ isn’t ‘#+’, but a function call. Again, this may seem odd at first glance, but makes more sense upon further examination. To add JSON, we need a path to know where to put the data, and then the data itself. Operators can’t handle multiple parameters, so we need something that will.

And that’s exactly what the new jsonb_set function does. Here it is in action:

SELECT jsonb_set('{"Hairy": true, "Status": {
         "Excited": true, "Burning": true}

 {"Hairy": true, "Status": {"Burning": false, "Excited": true}}

The full documentation for JSON functions provides a lot more information about what jsonb_set does and how it works. I highly encourage reading up on it before doing a lot of extensive JSON work.

And finally, Postgres now has the ability to make JSON human readable. This is especially important since JSONB discards formatting since it is stored in Postgres as an arbitrary binary encoding. Deeply nested JSON can be quite ugly and resist mental parsing, so it’s great to see this addition. How does our hairy, burning, excited, yet wobbly data look?

SELECT jsonb_pretty('{"Hairy": true, "Status": {
         "Excited": true, "Burning": true},
         "Wobbly": false

     "Hairy": true,
     "Status": {
         "Burning": true,
         "Excited": true 
     "Wobbly": false

Given everything we’ve seen here, I almost want to suggest the developers rip out the existing JSON engine and replace it entirely with JSONB such that it’s merely an alias. I can’t think of a single reason to retain a less functional datatype that only serves to confuse new users. JSON is slower, uses more disk space, is harder to interact with, and has a mere fragment of JSONB’s capabilities.

Well, I can think of one reason: compatibility. Due to the different storage requirements, any existing tables using JSON columns would be incompatible with the JSONB engine. This is an unfortunate truth of JSON being a precursor to JSONB in the Postgres timeline. This reminds me of what happened to Oracle regarding VARCHAR vs VARCHAR2. Oracle’s recommendation is to always use VARCHAR2 for various reasons, and at this point, the same holds for Postgres and JSONB. I think a note to this effect in the Postgres documentation would be in our best interests.

In any case, this is an exciting time for Postgres in general. JSON seems to be replacing XML as an application communication format, so having such powerful internal compatibility with it is an important step forward. 9.5 can’t come soon enough!