There’s an example in The Art of PostgreSQL1 that warns against writing some SQL like this:2
In case you’re not familiar with SQL, this is roughly saying: “take all the numbers x
from the list (1, 2, 3, 4, 5)
, which we’ll call temp
, where that number x
is not in the list (1, 2, null)
”.
I say roughly because that’s not really what this does at all. This query returns no rows! Let’s explore why. Here’s an example of an IN
statement working as expected:
Indeed, 7
is in the list (7)
, so this should be true. Likewise,
7
is not in the list (8)
, and expecting 7
to not be in the list (7)
is also going to lead to disappointment. So far so good.
What would you expect the following query to produce?
Naively, since null
is not in the list 7
, we might expect False
. However, null
is not really a value in this sense, and has its own special semantics. Instead this query itself returns null
.
Likewise, looking for something in a list that contains only null
will return null
.
Looking for something in a list not containing that thing, but containing null
, will return null
Finally, looking for something in a list that contains it, but also null
, will return True
! This is true independent of the item’s position in the list.
So what’s really going on in our initial query, is that 1
and 2
are found in the list, but the not in
inverts them to false, so they don’t show up in the result. And since 3
, 4
, and 5
are not in the list, but the list contains null
, they also won’t show up, since they’ll be null.
So we have a big list of False
s and null
s and in the end the query has an empty result! Do not treat null as a value, do not pass go, do not collect $200.
The Art of PostgreSQL. 2e. theartofpostgresql.com↩︎
I tested these queries with this nifty online postgresql compiler running version 9.6.2↩︎