One important part of writing programs is deciding what to do when things inevitably go wrong. One common way things go wrong is simple nonexistence: the value we want to have doesn’t exist. Let’s talk about three ways to handle this problem: null, exceptions, and optionals.
We’ll start with the billion-dollar mistake. NULL, null, nil, or None. It is admittedly convenient to have a way to “opt out” of having a real value with the correct type (or in more dynamic languages, the correct duck type). In C, there isn’t really a better representation for “we don’t have this thing” than a null pointer, unless you want to do a lot of extra work and explicitly represent the possibility of not having the thing as part of the thing.
Thing is…this leads to a “small” host of problems. If you’ve ever been bewildered by a NullPointerException in Java, or had trouble with NilClass in Ruby, you should feel this viscerally. Luckily, there are better ways to deal with stuff going missing than just hoping or (this is almost worse) explicitly checking at the beginning of every single function call for things that should not be null, but are.
The problems with unexpected null-related errors are pretty self-explanatory. It’s frustrating to have to handle values that were never really there. The other choice, though, isn’t much better. The convenience of creating a null is almost always outweighed by the inconvenience of having to check for it elsewhere. We usually want our programs saturated with values, not with nulls.
As a simple example, let’s say we have a function f which takes some value x. If that value can be null, then f has to handle two paths: x == someValue and x == null. What happens when we add another argument y? Now our logic is something like this:
if x is not None and y is not None:
    ...  # happy path
elif x is not None:
    ...  # we have x, but not y
elif y is not None:
    ...  # we have y, but not x
else:
    ...  # we have neither
There’s a combinatorial explosion of possibilities here, only somewhat mitigated by the fact that we can sometimes treat “any value is null” in one way, and “all values are not null” in another. Also consider what happens if we add functions g and h to the mix, which both take those same arguments x and y. Adding explicit checks in all these places gets annoying fast, and adds unnecessary cognitive overhead.
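To make the growth concrete, here is a small Python sketch that lists every present/missing pattern a function might have to think about. The helper name null_patterns is made up for this illustration, and True/False simply stand for "has a value" and "is None".

```python
from itertools import product


def null_patterns(arg_names):
    """Enumerate every present/missing combination for the given arguments."""
    # Each argument is either present (True) or missing (False),
    # so n arguments give 2 ** n combinations.
    return [
        dict(zip(arg_names, flags))
        for flags in product([True, False], repeat=len(arg_names))
    ]


for pattern in null_patterns(["x", "y"]):
    print(pattern)
```

Two nullable arguments give the four branches from the snippet above, and every extra nullable argument doubles the count.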
We don’t need to clutter our code with null checks or ignore the problem altogether. Exceptions provide a convenient way out.
Let’s take a classic example: reading a config file which may or may not exist. We strongly expect the file to exist, but if it doesn’t there isn’t much we can do within our usual program flow. Sometimes it doesn’t make sense to prompt users for a file, if they’re not even aware that such an internal file exists.
This situation is, by all rights, exceptional, and so it merits being handled with exceptions. You can break out of the part of the program that needs the file, and do whatever handling needs to be done to avoid a Really Bad Situation (whether that’s a total program crash, writing some invalid state, etc.).
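Here is a rough Python sketch of that idea. Everything specific in it is an assumption for the sake of the example: the ConfigError class, the load_config and main functions, the app_config.json file name, and the choice of JSON as the config format.

```python
import json
from pathlib import Path


class ConfigError(Exception):
    """Raised when the internal config file cannot be used (hypothetical)."""


def load_config(path):
    """Read a JSON config file, raising ConfigError if it is missing or invalid."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError as err:
        # We strongly expect the file to exist, so its absence is exceptional.
        raise ConfigError(f"config file not found: {path}") from err
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ConfigError(f"config file is not valid JSON: {path}") from err


def main(path="app_config.json"):
    """Return 0 on success, 1 if the config could not be loaded."""
    try:
        config = load_config(path)
    except ConfigError as err:
        # One handler, outside the normal flow, decides what a failure means.
        print(f"cannot start: {err}")
        return 1
    print(f"loaded {len(config)} settings")
    return 0
```

The code that needs the config never sees a null: it either gets a real value or control jumps to the single handler in main.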
Exceptions are better than null insofar as being explicit is better than being implicit. We don’t just rely on a total program failure in unexpected conditions, but instead have at least some way to handle weird situations. Some language constructs (e.g. throws in Java) give programmers at least a small heads-up that they need to be on the lookout for handling these cases.
There are certain aesthetic problems with using exceptions extensively. The adage “exceptions are for exceptional cases” is fairly wise, in my opinion. Its wisdom comes from a recognition of what exceptions are: a way to skirt around the natural control flow of your program when invalid conditions arise. Just as you shouldn’t be using exceptions for standard control flow, you shouldn’t be using them for standard value-gone-missing situations.
There is a better way when we know up front that we should sometimes expect to have a value, and sometimes expect not to. These kinds of values are “optional”. Optionals are a better representation of situations which aren’t actually exceptional, but where you still don’t just want to toss around a few nulls and call it a day.
Certain languages don’t even have null at all. One example is Haskell. Yes, there are ways in Haskell to produce bad “bottom” values (e.g. infinite loops like let x = x in x). But it doesn’t have a pervasive null value that infects everything and becomes an easy way out.
What can these languages do instead? Haskell provides a fantastic optional type called Maybe. Here’s a definition for it:
data Maybe a = Just a | Nothing
What this means is that we have an optional polymorphic over any type a, where our values are either wrapped in a Just or have a special representation Nothing that indicates the value is missing. For example, we can have optional integers that look like Just 7 :: Maybe Int or they look like Nothing :: Maybe Int. Note that Nothing does not play the same role as null does in most other languages, because a Nothing :: Maybe Int is not the same as a Nothing :: Maybe String, whereas in C, for example, NULL is always implicitly the same and can inhabit any pointer type, regardless of the type of thing being pointed to.
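For readers who don’t speak Haskell, here is a loose Python analogue of that definition. It is only a sketch: the Just, Nothing, Maybe, and safe_div names are mine, Python does not enforce the type hints at runtime, and this Nothing is not tied to an element type the way Haskell’s is.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Just(Generic[T]):
    """Wraps a value that is present."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """Marks a value that is missing."""


# Rough analogue of `data Maybe a = Just a | Nothing`.
Maybe = Union[Just[T], Nothing]


def safe_div(numerator: int, denominator: int) -> Maybe[int]:
    """Integer division that reports division by zero as Nothing."""
    if denominator == 0:
        return Nothing()
    return Just(numerator // denominator)


result = safe_div(7, 2)
if isinstance(result, Just):
    print(result.value)  # 3
else:
    print("no value")
```

The point to notice is that the caller has to look inside the wrapper before touching the value, so the "missing" case can’t be forgotten as easily as a stray null.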
There is some pain associated with optionals that people are quick to bring up. Isn’t it a big hassle to have to wrap and unwrap your types in Maybe all the time? The ease of use of null sure seems appealing. And surely our program handles all the potential exceptions we missed appropriately…right?
There really is a better way. Often it’s accomplished by smart chaining of operations that might fail. In Haskell, Maybe is a monad, which among other things means we can hide away a lot of complexity behind a nice style called “do notation”. I’ve used this example before:
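-- maybeDiv and maybeHead are safe helpers assumed to be defined elsewhere.
testMaybe :: Int -> Int -> [Int] -> Maybe Int
testMaybe a b c = do
  -- lookup needs key/value pairs; zipping [1..b] with itself is an assumed fix.
  found <- lookup a (zip [1..b] [1..b])
  divided <- maybeDiv found 2
  head <- maybeHead c
  return (divided + head)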
There isn’t any explicit optional handling here! Everything is handled behind the scenes, and if some part of our computation fails, we get back a Nothing, with no extra overhead when thinking about the happy path of this code. This is way better than having to manage which pieces of some computation might be null or throw an exception any old time.
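Do notation is sugar for nested “bind” calls, and the same shape can be sketched in Python. This is an approximation under a few assumptions: None stands in for Nothing, and bind, maybe_find, maybe_div, maybe_head, and half_plus_head are all hypothetical names chosen to mirror the Haskell example.

```python
from typing import Callable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def bind(value: Optional[A], step: Callable[[A], Optional[B]]) -> Optional[B]:
    """Run `step` only if `value` is present; otherwise pass the None along."""
    return None if value is None else step(value)


def maybe_find(target: int, upper: int) -> Optional[int]:
    """Return target if it lies in 1..upper, else None."""
    return target if 1 <= target <= upper else None


def maybe_div(n: int, d: int) -> Optional[int]:
    """Integer division that yields None instead of dividing by zero."""
    return None if d == 0 else n // d


def maybe_head(items: List[int]) -> Optional[int]:
    """First element of the list, or None if the list is empty."""
    return items[0] if items else None


def half_plus_head(a: int, b: int, c: List[int]) -> Optional[int]:
    """Each lambda plays the role of one `<-` line in the do block."""
    return bind(maybe_find(a, b), lambda found:
           bind(maybe_div(found, 2), lambda divided:
           bind(maybe_head(c), lambda head:
           divided + head)))


print(half_plus_head(4, 10, [3]))   # 5
print(half_plus_head(20, 10, [3]))  # None
```

All the checking lives in bind, so the body of half_plus_head only describes the happy path, which is what the do block buys us in Haskell.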
Some other languages, like Swift, still have nil but manage to solve a lot of the pain associated with optionals by using special operators. Swift lets you chain things together with the ? operator: for example, a?.my_func(b). You can also dangerously unpack things as an escape hatch (a!.my_func(b)) but of course I don’t really recommend that unless you’re really certain a exists and isn’t nil.
Even in the world of nulls, we can do better by smarter chaining rather than throwing exceptions every time we touch something that’s unexpectedly null. Take the safe navigation operator (&.) in Ruby. Let’s say we have some nested object where any part might be nil, but we’re not sure if any part is. It’s a huge pain to write nil checks for every part of a complicated access, but using &. is relatively painless. We can replace a.b.c.d (which might blow up with something like undefined method for NilClass) with an access like a&.b&.c&.d, which has the same correct behavior on the happy path, but simply gives back nil if any part of the access chain was nil.
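Python has no safe navigation operator, but the behavior is easy to imitate, which may help show what &. is doing. The dig helper below is a hypothetical name of mine, and the nested objects are throwaway examples.

```python
from types import SimpleNamespace


def dig(obj, *attrs):
    """Follow a chain of attribute names, returning None as soon as a link is None."""
    for name in attrs:
        if obj is None:
            return None
        # Like Ruby's &., a present object that lacks the attribute still raises.
        obj = getattr(obj, name)
    return obj


a = SimpleNamespace(b=SimpleNamespace(c=SimpleNamespace(d=42)))
broken = SimpleNamespace(b=None)

print(dig(a, "b", "c", "d"))       # 42
print(dig(broken, "b", "c", "d"))  # None
```

The happy path reads the same as a plain a.b.c.d access, and a missing link anywhere in the chain quietly becomes None instead of an error.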
It’s hopefully pretty clear that I’ve organized this post in order of increasing preference. Optionals are a great way to handle values which may or may not be there. Exceptions are fine, if used sparingly for truly exceptional conditions which would otherwise leave your program in an invalid state. Null tends to lead to more trouble than it’s worth.
Unfortunately, the usual ease-of-use tends to point the other direction. Null is incredibly easy to use: you can throw it anywhere, in place of any value. Exceptions are a little trickier, since they have to be handled. Optionals are a bit painful if you have to do all the unpacking and repacking yourself.
Happily, certain languages have constructs that make optionals less painful to use (or even make nulls nonexistent). I’m looking forward to seeing how language designers can solve even more of these problems in the future, leading to a world where we can handle value nonexistence with as much of the desired explicitness and ease as we currently use to handle values that actually exist.