mitchell vitez blog music art media dark mode

A Wild GADT Appeared

I recently encountered what I think was my first “naturally arising” GADT1—that is, one that arose in the course of trying to solve a different problem. I’ve used GADTs before in other projects, but mostly as intended from the start, or simply to get practice using them.

At the Outset

The initial state of things was that we had several different record types defined, each sharing the same field names but with slightly different types on a few of the fields. Here’s a simplified example (the real records had many more fields):

{-# LANGUAGE DuplicateRecordFields #-}

data One   = One   { value :: Bool, number :: Int}
data Two   = Two   { value :: Char, number :: Int}
data Three = Three { value :: Char, number :: Int}

We can then factor out the common record (which we’ll call Shared) and massively simplify our data type definitions so we’re not repeatedly defining a large record with all the same fields.

data Shared a = Shared { value :: a, number :: Int}

data Sum
  = One   (Shared Bool)
  | Two   (Shared Char)
  | Three (Shared Char)

Getting What’s Shared

So far, this is a fairly straightforward refactor. However, the twist is that we have one more requirement: the need for a function unpack :: Sum -> Shared that lets us extract the shared record from each constructor. Here’s a fairly straightforward (albeit naive) definition:

{-# LANGUAGE LambdaCase #-}

unpack :: Sum -> Shared a
unpack = \case
  One   shared -> shared
  Two   shared -> shared
  Three shared -> shared

However, this leads to a typechecking failure:

gadts.hs:16:19-24: error:
    • Couldn't match type ‘a’ with ‘Bool’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          unpack :: forall a. Sum -> Shared a
        at gadts.hs:14:1-25
      Expected type: Shared a
        Actual type: Shared Bool
    • In the expression: shared

If we try to unpack and repack the fields of the records using wildcard syntax:

{-# LANGUAGE RecordWildCards #-}

unpack :: Sum -> Shared a
unpack = \case
  One   Shared{..} -> Shared{..}
  Two   Shared{..} -> Shared{..}
  Three Shared{..} -> Shared{..}

We get a similar failure, just pushed down one layer into the value field rather than the shared record as a whole.

gadts.hs:17:30-31: error:
    • Couldn't match expected type ‘a’ with actual type ‘Bool’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          unpack :: forall a. Sum -> Shared a
        at gadts.hs:15:1-25
    • In the ‘value’ field of a record

An alternate approach

Let’s back up a bit, to the initial setup with separate records, to see what we can learn about the problem by taking a different tack. Starting with:

data One   = One   { value :: Bool, number :: Int}
data Two   = Two   { value :: Char, number :: Int}
data Three = Three { value :: Char, number :: Int}

We can, of course, push the differing types back up one step as arguments, to make all of the record definitions look the exact same.

data One   a = One   { value :: a, number :: Int}
data Two   a = Two   { value :: a, number :: Int}
data Three a = Three { value :: a, number :: Int}

In the real problem, there were many more monomorphic fields (represented by number here), and a few more fields like value that were somehow dependent on a (e.g. Map a Text). We can mostly ignore the extra fields here, but they do provide more motivation to solve this kind of problem.

We can combine these disparate records into a single sum type with multiple constructors, each with the same record definition inside them.

data Sum a
  = One   { value :: a, number :: Int}
  | Two   { value :: a, number :: Int}
  | Three { value :: a, number :: Int}

These records are then ripe for factoring into Shared, as we did before:

data Shared a = Shared { value :: a, number :: Int}

data Sum a
  = One   (Shared a)
  | Two   (Shared a)
  | Three (Shared a)

The only difference from what we had before is now Sum takes a as an argument. This turns out to be enough to get unpack to typecheck though.

unpack :: Sum a -> Shared a
unpack = \case
  One   shared -> shared
  Two   shared -> shared
  Three shared -> shared

We can think of Sum as now having more information about the Shared inside it, a sort of type-level tag for what is extractable.

However, this doesn’t quite get us what we want. Remember that one of the reasons we had multiple records in the first place was to let them have slightly different types for a. Now, we have constructor types like:

One   :: Shared a -> Sum a
Two   :: Shared a -> Sum a
Three :: Shared a -> Sum a

We really want separate concrete types for these constructors:

One   :: Shared Bool -> Sum Bool
Two   :: Shared Char -> Sum Char
Three :: Shared Char -> Sum Char

This is where GADTs come in.

Where GADTs Come In

If we replace our definition of Sum with a GADT like so:

{-# LANGUAGE GADTs #-}

data Sum a where
  One   :: Shared Bool -> Sum Bool
  Two   :: Shared Char -> Sum Char
  Three :: Shared Char -> Sum Char

Along with using the existing definitions of Shared and unpack (and deriving Show for Shared):

data Shared a = Shared { value :: a, number :: Int }
  deriving Show

unpack :: Sum a -> Shared a
unpack = \case
  One   shared -> shared
  Two   shared -> shared
  Three shared -> shared

We can run code that uses unpack, while having the nicely-typed constructors from before:

main = do
  print . unpack . One $ Shared True 7
  print . unpack . Two $ Shared 'x'  7

  1. Generalized Algebraic Data Type↩︎