PG Phriday: Interacting with JSON and JSONB
With the release of PostgreSQL 9.4, comes the ability to use binary JSON objects. This internal representation is faster and more capable than the original JSON included in 9.3. But how do we actually interact with JSON and JSONB in a database connection context? The answer is actually a little complicated and somewhat surprising.
Casting. Casting Everywhere.
Despite its inclusion as an internal type, PostgreSQL maintains its position as encouraging explicit casting to avoid bugs inherent in magic type conversions. Unfortunately, JSON blurs several lines in this regard, and this could lead to confusion on several fronts.
Let’s take a look at JSON first. Here are three very basic JSON documents for illustration:
{ "name": "cow-man" }
{ "weight": 389.4 }
{ "alive": true }
Nothing crazy. We have a string, a number, and a boolean. The PostgreSQL JSON type documentation suggests it handles these internally, which we can see for ourselves.
SELECT '{ "name": "cow-man" }'::JSON;
json
-----------------------
{ "name": "cow-man" }
SELECT '{ "weight": 389.4 }'::JSON;
json
---------------------
{ "weight": 389.4 }
SELECT '{ "alive": true }'::JSON;
json
-------------------
{ "alive": true }
Great! We can see the string, the number, and the boolean preserved in PostgreSQL’s encoding. Things start to go a bit sideways when we pull fields, though:
SELECT '{ "name": "cow-man" }'::JSON->'name';
?column?
-----------
"cow-man"
So far, so good. The PostgreSQL JSON documentation for functions and operators says that the ->
operator returns a JSON object. And indeed, we can re-cast this string to JSON:
SELECT '"cow-man"'::JSON;
json
-----------
"cow-man"
What happens when we try to compare two JSON objects, though?
SELECT '{ "name": "cow-man" }'::JSON->'name' = '"cow-man"'::JSON;
ERROR: operator does not exist: json = json
Wait… what? Hmm. Let’s try the same thing with JSONB:
SELECT '{ "name": "cow-man" }'::JSONB->'name' = '"cow-man"'::JSONB;
?column?
----------
t
That’s something of a surprise, isn’t it? It’s pretty clear from this that JSON and JSONB are much more than simply how the data gets encoded and stored. It also drastically affects how it’s possible to interact with the data itself.
Don’t relax yet, though! JSON and JSONB casting only succeed on TEXT
or VARCHAR
similar types. For example, these don’t work:
SELECT 365::JSON;
SELECT 365::JSONB;
But these do:
SELECT 365::TEXT::JSON;
SELECT 365::TEXT::JSONB;
So even though PostgreSQL acknowledges JSON datatypes, it can’t convert between those and its own internal types. A PostgreSQL NUMERIC
is similar to a JSON NUMBER
, but they’re not interchangeable, and can’t even be casted without first going through some kind of TEXT type. This is the same for boolean values. The only type that is treated natively is a string-based value.
While it may seem inconvenient to always use another type as an intermediary when interacting with JSON, that’s the current reality.
Just use TEXT and JSONB
If we reexamine the JSON type documentation, we also see the ->>
operator. This not only pulls the indicated field, but automatically casts it to text. This means that we can turn this ugly monstrosity:
SELECT ('{ "field": "value" }'::JSON->'field')::TEXT;
Into this:
SELECT '{ "field": "value" }'::JSON->>'field';
From here, we can perform any action normally possible with a text-based value. This is the only way to pull a JSON or JSONB field directly into a PostgreSQL native type.
All of this would suggest that the safest way to work with JSON or JSONB is through text. Ironically, text is also the only way to exchange comparisons between JSON and JSONB. Observe:
SELECT '"moo"'::JSON = '"moo"'::JSONB;
ERROR: operator does not exist: json = jsonb
And yet:
SELECT '"moo"'::JSON::TEXT = '"moo"'::JSONB::TEXT;
?column?
----------
t
Well, then. What this means is pretty clear: convert at the last minute, and always use some kind of text value when dealing with JSON and JSONB.
While I’m here, I’d also like to point out a somewhat amusing side-effect of how JSONB works as opposed to JSON. Textual data gets converted to JSONB automatically when JSONB is one of the equalities. What does that mean? All of these are valid, and note that I’m quoting everything so it’s treated as text:
SELECT '"moo"'::JSONB = '"moo"';
SELECT '365'::JSONB = '365';
SELECT 'true'::JSONB = 'true';
But all of these produce an error:
SELECT '"moo"'::JSON = '"moo"';
SELECT '365'::JSON = '365';
SELECT 'true'::JSON = 'true';
This alone suggests that the lack of interoperability between JSON and JSONB is more of an oversight, and that JSON is missing some casting rules. Hopefully, that means 9.5 will carry some corrections in this regard. It’s hard to imagine PostgreSQL will leave JSON as a lame-duck datatype that was only really useful for 9.3 while JSONB was being developed.
If not, I guess that means I don’t have to revisit this topic in the future. Everyone knows I love being lazy.