Composing Coercions

newtype

There’s a neat zero-cost trick in Haskell for types with one constructor and a single field inside it. Instead of using data, use newtype and the typechecker will treat your type differently, but the representation of your data at runtime will be the same. This lets us get better typechecking in some cases, without changing the runtime performance of programs.

A sort of classic (if boring) example of this is to create newtypes for different fields describing a person. Instead of

data Person = Person
  { firstName :: Text
  , lastName :: Text
  , address :: Text
  }

We can instead use something like

newtype FirstName = FirstName Text
newtype LastName = LastName Text
newtype Address = Address Text

data Person = Person
  { firstName :: FirstName
  , lastName :: LastName
  , address :: Address
  }

Then when we’re trying to call a function getLatLng :: Address -> (Double, Double) the typechecker stops us from accidentally passing in a FirstName. (Even better, we keep FirstName and LastName as distinct types, which helps keep us from ever getting confused about which is which—the compiler will check for us automatically.)

coerce

There’s a function called coerce :: Coercible a b => a -> b which lets us convert between various types that have representational equality (that is, they have the exact same structure at runtime). This could be useful if your code needs to call some external library that takes first names as Text, but the library doesn’t know about your FirstName type. You just coerce the FirstName to Text and everything works, with type safety guaranteed by the compiler.1

What kinds of things are coercible? You shouldn’t go around writing your own arbitrary Coercible instances—that’d likely break type safety. How does the compiler know whether two things are coercible then?

There are some obvious cases. First of all, any type is coercible to itself (it has the same runtime representation as itself). Also, like above, if we have a newtype we can coerce to the underlying type of that newtype. In this case we’re coercing FirstName to Text. We can also go in the other direction, from Text to FirstName.

It’s also fairly obvious that we can inductively build up complicated coercibles if all the parts are coercible (think about some nested structure containing only Coercibles—this tree has the same runtime representation as any other tree of the same representational structure).

I won’t go much into it here,2 but if you know about roles then it’s pretty easy to determine what the typechecker is doing when determining whether two types are coercible. Types with the phantom role essentially don’t matter. This makes sense because phantom types don’t affect the underlying runtime representation at all. Types with the nominal role can’t be coerced. Again, makes sense because nominal enforces strict sameness. And types with a role of representational can be used in a Coercible if they are themselves coercible. Makes sense: the representational role means that the underlying representation is the same.

Identity

The Identity functor is intimately connected with coercibility, because it’s just a simple newtype around any other type. Identity is useful for many of the same reasons that id is—it slots into a functor-shaped hole, but doesn’t do too much.

newtype Identity a = Identity { runIdentity :: a }

The instinctive way to write a functor instance for Identity is to define fmap as unwrapping with runIdentity, then applying f, then re-wrapping with the Identity constructor.

instance Functor Identity where
  fmap f x = Identity (f (runIdentity x))

However, now that we know about coerce we can simplify the above a bit.

instance Functor Identity where
  fmap = coerce

#.

The (#.) operator could be called the “Coercible composition operator”. It takes two coercions (from a to b, and from b to c) and produces a single composed coercion (from a to c). We can define #. like this:

(#.) :: Coercible c b => (b -> c) -> (a -> b) -> (a -> c)
(#.) _ = coerce (\x -> x :: b) :: forall a b. Coercible b a => a -> b

If you strip away all the type machinery, we’re left with just coerce (\x -> x), which reads an awful lot like coercing the identity function to behave like a coercing identity function.

With the type machinery, though, we see that we’re building a function that requires a to be coercible to b (for all a and b), and also asserts that x has type b. It sort of “skips” the initial coercion step from a to b, because as long as the type Coercible b a is in there, we can do just one coercion. This is all just a lot of compile-time checks for one single “composed” coercion, but unlike a usual (.) composition, we don’t have things to do at runtime so it looks like the function basically does nothing! It even ignores its first argument, which would be silly for usual composition to do. Just try writing an implementation for _ . g…you really need f . g if you want to get anything done.

One place that #. is used is in the implementation of lenses. Consider this implementation of set, adapted from Building Lenses.

set :: Setter s t a b -> b -> s -> t
set setter b = runIdentity #. setter (\_ -> Identity b)

The implementation is a little weird so let’s walk through it. First of all, we have two things to keep track of, a Setter called setter and some value of the final transformed-to type b. What a setter does is replace some internal value in a structure with a new value. The type of a Setter looks like this (don’t worry about what Settable means for now):

type Setter s t a b =
  forall f. Settable f =>
  (a -> f b) -> s -> f t

Let’s try to read the set function from the inside out. First of all, we don’t really care about the value of type a since it’s getting replaced by a b anyways. That explains the underscore in (\_ -> Identity b). We wrap b in Identity because we need some functor f and what could be simpler than using the Identity functor? We also pass s right in (the pointfree third argument to set), so the right half (setter (\_ -> Identity b) s) should make some sense now.

We have a Setter built, but we still need to get a t from it to satisfy the return type of set. Remember that f is Identity, so going from f t to t should be as simple as calling runIdentity. Finally, we use #. instead of plain . composition because we shouldn’t care about typechecking newtypes differently here, only that the final representations end up the same.


  1. For more on the type safety of coercions, I recommend reading the paper Safe Zero-cost Coercions for Haskell↩︎

  2. You can learn more about roles from this section of the GHC User’s Guide↩︎


❮ 2019 Road Trip
A slew of national parks
Concatenative Programming ❯
Implementing a super-simple stack-based language