There’s a neat zero-cost trick in Haskell for types with one
constructor and a single field inside it. Instead of using
data
, use newtype
and the typechecker will
treat your type differently, but the representation of your data at
runtime will be the same. This lets us get better typechecking in some
cases, without changing the runtime performance of programs.
A sort of classic (if boring) example of this is to create newtypes for different fields describing a person. Instead of
data Person = Person
firstName :: Text
{ lastName :: Text
, address :: Text
, }
We can instead use something like
newtype FirstName = FirstName Text
newtype LastName = LastName Text
newtype Address = Address Text
data Person = Person
firstName :: FirstName
{ lastName :: LastName
, address :: Address
, }
Then when we’re trying to call a function
getLatLng :: Address -> (Double, Double)
the typechecker
stops us from accidentally passing in a FirstName
. (Even
better, we keep FirstName
and LastName
as
distinct types, which helps keep us from ever getting confused about
which is which—the compiler will check for us automatically.)
There’s a function called
coerce :: Coercible a b => a -> b
which lets us
convert between various types that have representational equality (that
is, they have the exact same structure at runtime). This could be useful
if your code needs to call some external library that takes first names
as Text
, but the library doesn’t know about your
FirstName
type. You just coerce the FirstName
to Text
and everything works, with type safety guaranteed
by the compiler.1
What kinds of things are coercible? You shouldn’t go around writing
your own arbitrary Coercible
instances—that’d likely break
type safety. How does the compiler know whether two things are coercible
then?
There are some obvious cases. First of all, any type is coercible to
itself (it has the same runtime representation as itself). Also, like
above, if we have a newtype we can coerce to the underlying type of that
newtype. In this case we’re coercing FirstName
to
Text
. We can also go in the other direction, from
Text
to FirstName
.
It’s also fairly obvious that we can inductively build up complicated
coercibles if all the parts are coercible (think about some nested
structure containing only Coercible
s—this tree has the same
runtime representation as any other tree of the same representational
structure).
I won’t go much into it here,2 but if you know about
roles then it’s pretty easy to determine what the typechecker is doing
when determining whether two types are coercible. Types with the
phantom
role essentially don’t matter. This makes sense
because phantom types don’t affect the underlying runtime representation
at all. Types with the nominal
role can’t be coerced.
Again, makes sense because nominal
enforces strict
sameness. And types with a role of representational
can be
used in a Coercible
if they are themselves coercible. Makes
sense: the representational
role means that the underlying
representation is the same.
The Identity
functor is intimately connected with
coercibility, because it’s just a simple newtype around any other type.
Identity
is useful for many of the same reasons that
id
is—it slots into a functor-shaped hole, but doesn’t do
too much.
newtype Identity a = Identity { runIdentity :: a }
The instinctive way to write a functor instance for
Identity
is to define fmap
as unwrapping with
runIdentity
, then applying f
, then re-wrapping
with the Identity
constructor.
instance Functor Identity where
fmap f x = Identity (f (runIdentity x))
However, now that we know about coerce we can simplify the above a bit.
instance Functor Identity where
fmap = coerce
The (#.)
operator could be called the
“Coercible
composition operator”. It takes two coercions
(from a to b, and from b to c) and produces a single composed coercion
(from a to c). We can define #.
like this:
(#.) :: Coercible c b => (b -> c) -> (a -> b) -> (a -> c)
#.) _ = coerce (\x -> x :: b) :: forall a b. Coercible b a => a -> b (
If you strip away all the type machinery, we’re left with just
coerce (\x -> x)
, which reads an awful lot like coercing
the identity function to behave like a coercing identity function.
With the type machinery, though, we see that we’re building a
function that requires a to be coercible to b (for all a and b), and
also asserts that x has type b. It sort of “skips” the initial coercion
step from a to b, because as long as the type Coercible b a
is in there, we can do just one coercion. This is all just a lot of
compile-time checks for one single “composed” coercion, but unlike a
usual (.)
composition, we don’t have things to do at
runtime so it looks like the function basically does nothing! It even
ignores its first argument, which would be silly for usual composition
to do. Just try writing an implementation for _ . g
…you
really need f . g
if you want to get anything done.
One place that #.
is used is in the implementation of
lenses. Consider this implementation of set
, adapted from
Building Lenses.
set :: Setter s t a b -> b -> s -> t
= runIdentity #. setter (\_ -> Identity b) set setter b
The implementation is a little weird so let’s walk through it. First
of all, we have two things to keep track of, a Setter
called setter
and some value of the final transformed-to
type b
. What a setter does is replace some internal value
in a structure with a new value. The type of a Setter
looks
like this (don’t worry about what Settable
means for
now):
type Setter s t a b =
forall f. Settable f =>
-> f b) -> s -> f t (a
Let’s try to read the set
function from the inside out.
First of all, we don’t really care about the value of type
a
since it’s getting replaced by a b
anyways.
That explains the underscore in (\_ -> Identity b)
. We
wrap b
in Identity
because we need some
functor f
and what could be simpler than using the
Identity
functor? We also pass s
right in (the
pointfree third argument to set
), so the right half
(setter (\_ -> Identity b) s
) should make some sense
now.
We have a Setter
built, but we still need to get a
t
from it to satisfy the return type of set
.
Remember that f
is Identity
, so going from
f t
to t
should be as simple as calling
runIdentity.
Finally, we use #.
instead of
plain .
composition because we shouldn’t care about
typechecking newtypes differently here, only that the final
representations end up the same.
For more on the type safety of coercions, I recommend reading the paper Safe Zero-cost Coercions for Haskell↩︎
You can learn more about roles from this section of the GHC User’s Guide↩︎