I recently encountered what I think was my first “naturally arising” GADT1—that is, one that arose in the course of trying to solve a different problem. I’ve used GADTs before in other projects, but mostly as intended from the start, or simply to get practice using them.
The initial state of things was that we had several different record types defined, each sharing the same field names but with slightly different types on a few of the fields. Here’s a simplified example (the real records had many more fields):
{-# LANGUAGE DuplicateRecordFields #-}
data One = One { value :: Bool, number :: Int}
data Two = Two { value :: Char, number :: Int}
data Three = Three { value :: Char, number :: Int}
We can then factor out the common record (which we’ll call
Shared
) and massively simplify our data type definitions so
we’re not repeatedly defining a large record with all the same
fields.
data Shared a = Shared { value :: a, number :: Int}
data Sum
= One (Shared Bool)
| Two (Shared Char)
| Three (Shared Char)
So far, this is a fairly straightforward refactor. However, the twist
is that we have one more requirement: the need for a function
unpack :: Sum -> Shared
that lets us extract the shared
record from each constructor. Here’s a fairly straightforward (albeit
naive) definition:
{-# LANGUAGE LambdaCase #-}
unpack :: Sum -> Shared a
= \case
unpack One shared -> shared
Two shared -> shared
Three shared -> shared
However, this leads to a typechecking failure:
gadts.hs:16:19-24: error:
• Couldn't match type ‘a’ with ‘Bool’
‘a’ is a rigid type variable bound by
the type signature for:
unpack :: forall a. Sum -> Shared a
at gadts.hs:14:1-25
Expected type: Shared a
Actual type: Shared Bool
• In the expression: shared
If we try to unpack and repack the fields of the records using wildcard syntax:
{-# LANGUAGE RecordWildCards #-}
unpack :: Sum -> Shared a
= \case
unpack One Shared{..} -> Shared{..}
Two Shared{..} -> Shared{..}
Three Shared{..} -> Shared{..}
We get a similar failure, just pushed down one layer into the
value
field rather than the shared
record as a
whole.
gadts.hs:17:30-31: error:
• Couldn't match expected type ‘a’ with actual type ‘Bool’
‘a’ is a rigid type variable bound by
the type signature for:
unpack :: forall a. Sum -> Shared a
at gadts.hs:15:1-25
• In the ‘value’ field of a record
Let’s back up a bit, to the initial setup with separate records, to see what we can learn about the problem by taking a different tack. Starting with:
data One = One { value :: Bool, number :: Int}
data Two = Two { value :: Char, number :: Int}
data Three = Three { value :: Char, number :: Int}
We can, of course, push the differing types back up one step as arguments, to make all of the record definitions look the exact same.
data One a = One { value :: a, number :: Int}
data Two a = Two { value :: a, number :: Int}
data Three a = Three { value :: a, number :: Int}
In the real problem, there were many more monomorphic fields
(represented by number
here), and a few more fields like
value
that were somehow dependent on a
(e.g. Map a Text
). We can mostly ignore the extra fields
here, but they do provide more motivation to solve this kind of
problem.
We can combine these disparate records into a single sum type with multiple constructors, each with the same record definition inside them.
data Sum a
= One { value :: a, number :: Int}
| Two { value :: a, number :: Int}
| Three { value :: a, number :: Int}
These records are then ripe for factoring into Shared
,
as we did before:
data Shared a = Shared { value :: a, number :: Int}
data Sum a
= One (Shared a)
| Two (Shared a)
| Three (Shared a)
The only difference from what we had before is now Sum
takes a
as an argument. This turns out to be enough to get
unpack
to typecheck though.
unpack :: Sum a -> Shared a
= \case
unpack One shared -> shared
Two shared -> shared
Three shared -> shared
We can think of Sum
as now having more information about
the Shared
inside it, a sort of type-level tag for what is
extractable.
However, this doesn’t quite get us what we want. Remember that one of
the reasons we had multiple records in the first place was to let them
have slightly different types for a
. Now, we have
constructor types like:
One :: Shared a -> Sum a
Two :: Shared a -> Sum a
Three :: Shared a -> Sum a
We really want separate concrete types for these constructors:
One :: Shared Bool -> Sum Bool
Two :: Shared Char -> Sum Char
Three :: Shared Char -> Sum Char
This is where GADTs come in.
If we replace our definition of Sum
with a GADT like
so:
{-# LANGUAGE GADTs #-}
data Sum a where
One :: Shared Bool -> Sum Bool
Two :: Shared Char -> Sum Char
Three :: Shared Char -> Sum Char
Along with using the existing definitions of Shared
and
unpack
(and deriving Show
for
Shared
):
data Shared a = Shared { value :: a, number :: Int }
deriving Show
unpack :: Sum a -> Shared a
= \case
unpack One shared -> shared
Two shared -> shared
Three shared -> shared
We can run code that uses unpack
, while having the
nicely-typed constructors from before:
= do
main print . unpack . One $ Shared True 7
print . unpack . Two $ Shared 'x' 7