One important piece of information about a type is how many possible values it contains. Let’s examine some ways to compute that info.
If you’re already comfortable manually figuring out the cardinality of a type, feel free to skip the first couple sections. The later bits are about how to do it more automatically.
Code from this post is here
Every type has a cardinality—some number of possible terms of that type. For example,
data Bool = True | False
Bool
has two1 possible values:
True
and False
. The type Unit
has
only one possible value inhabiting it:
data Unit = Unit
and this type has three:
data TrafficLight = Red | Yellow | Green
There’s even a type with no values:
data Void
We can use that one by importing Data.Void
.
There are also types with no bound on the possible number of values,
like Integer
. An Integer
can be
…-2
, -1
, 0
, 1
,
2
,… all the way on to forever in both directions.
What’s the cardinality of
Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))
?
We can find this by applying a few simple rules.
The cardinality of a Maybe a
is the cardinality of
a
plus one. This is because Maybe
only has two
cases. Either we have a Just a
, which has as many values as
there are in a
. Or, we have Nothing
, which is
where the “plus one” comes from.
The cardinality of an Either a b
is the cardinality of
a
plus the cardinality of b
. By similar
reasoning to above, we either have a Left
with just as many
values as there are in a
, or a Right
with as
many values as b
.
How about tuples? A tuple (a, b)
can have as many
possible values as there are possible pairings of an a
and
a b
. This means we have to multiply: the cardinality of
(a, b)
is the cardinality of a
times the
cardinality of b
.
Finally, a function a -> b
has \(b^a\) possible values. Think about the
function f :: Bool -> TrafficLight
. We’re assigning one
value from a set of 3 to each value from a set of two. I’ve enumerated
all these cases for f
below, for a total of \(3^2 = 9\) functions. Unless this is obvious
to you, it’s a good exercise here to also find the possible functions
g :: TrafficLight -> Bool
.
True = Red
f False = Red
f
True = Red
f False = Yellow
f
True = Red
f False = Green
f
True = Yellow
f False = Red
f
True = Yellow
f False = Yellow
f
True = Yellow
f False = Green
f
True = Green
f False = Red
f
True = Green
f False = Yellow
f
True = Green
f False = Green f
We can transform our type into a mathematical expression for the
number of values it has, going step-by-step. First, turn
Void
into 0
, ()
into
1
and Bool
into 2
:
Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))
Maybe (Either 0 (Either 2 1 -> (Maybe 2, Maybe 1))
Now we can transform Maybe
s into 1+
:
Maybe (Either 0 (Either 2 1 -> (Maybe 2, Maybe 1))
1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))
Next, our tuple becomes a multiplication:
1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))
1 + (Either 0 (Either 2 1 -> ((1 + 2) * (1 + 1))))
Either
s are sums of both sides:
1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))
1 + (0 + ((2 + 1) -> ((1 + 2) * (1 + 1))))
And finally, function types involve exponentiation:
1 + (0 + ((2 + 1) -> ((1 + 2) * (1 + 1))))
1 + (0 + (((1 + 2) * (1 + 1)) ^ (2 + 1)))
Doing the math, this boils down to 217.
It’s possible to compute cardinalities on relatively small examples like the one here by hand, but it’s super easy to make a mistake along the way. Can we instead write a function to do the arithmetic for us? Turns out we can, but instead of a function acting on values, it’ll be a function acting on types.
This is where things ramp up. A lot of the same mechanisms for
defining value-level functions in Haskell also exist at the type level,
but using somewhat different names and syntax. We’ll need
{-# LANGUAGE TypeFamilies #-}
to start, as well as
import GHC.TypeLits
.
Instead of a function’s type signature
cardinality :: a -> Nat
, we define a type family
Cardinality a :: Nat
. Just as cardinality
is a
value-level function that takes some value (with type a
)
and produces a value with type Nat
, we can think of
Cardinality
as a type-level function that takes some type
(a
) and produces a type with kind Nat
.
type family Cardinality a :: Nat where
We define these type-level functions in a very similar way to how we would define a function on values:
type family Cardinality a :: Nat where
Cardinality Void = 0
Cardinality () = 1
Cardinality Bool = 2
How can we check the result of calling these type-level functions?
GHCi provides the command :k!
, which lets us see what type
comes back as a result. (The λ
is my GHCi command prompt,
nothing to worry about there.)
:k! Cardinality Void
λ Cardinality Void :: Nat
= 0
:k! Cardinality ()
λ Cardinality () :: Nat
= 1
:k! Cardinality Bool
λ Cardinality Bool :: Nat
= 2
So far so good. Let’s begin adding our combining functions. We know
that a Maybe
adds one to the cardinality of whatever type’s
inside it, and that an Either
sums up the cardinality of
its Left
and Right
sides. There is such a
thing as a type-level +
operator, acting on types of kind
Nat
, but we’ll need
{-# LANGUAGE TypeOperators #-}
to use it. Because we’re
“calling” a type family recursively, we also need
{-# LANGUAGE UndecidableInstances #-}
.
Cardinality (Maybe a) =
1 + Cardinality a
Cardinality (Either a b) =
Cardinality a + Cardinality b
There are also *
and ^
at the type level.
However, you may run into confusing errors depending on how you use
*
. This used to mean Type
before
Type
meant Type
. If we write
e.g. (*) a b
then GHC
spits out an error that
* has kind *
. No worries though, it’s easy enough to use
*
qualified. If you prefer, you can instead enable
{-# LANGUAGE NoStarIsType #-}
to avoid this notation
conflict.
Cardinality (a, b) =
Cardinality a GHC.TypeLits.* Cardinality b
Cardinality (a -> b) =
Cardinality b ^ Cardinality a
Now we have enough of this type family defined to calculate the cardinality of that annoying complicated type from before.
:k! Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))
λ Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))) :: Nat
= 217
Saying that the cardinality of a type is always a Nat
is
a lie. Some types have unboundedly many values. Integer
is
a good example: there is no upper or lower limit (try out
10^10000
in GHCi). Let’s introduce something that can
capture the idea that a cardinality might be infinite.
data NatOrInf = N Nat | Inf
NatOrInf
is a simple type with value-level data
constructors. However, we can lift its constructors into the type level
via {-# LANGUAGE DataKinds #-}
. When we write the data
constructors of NatOrInf
with a tick in front, we get
type-level constructors. For example, 'Inf
is a type-level
Inf
, and 'N 42
is a type-level
N 42
. We can go back through our definition of
Cardinality
to make it work with NatOrInf
instead of just Nat
.
The basics are easy, and we can now include infinite-cardinality types:
type family Cardinality a :: NatOrInf where
Cardinality Void = 'N 0
Cardinality () = 'N 1
Cardinality Bool = 'N 2
Cardinality Integer = 'Inf
However, because the type-level +
only works on
Nat
(and not on NatOrInf
), we’ll need to
define our own operators. The line below won’t work, just like
attempting something like the badly-typed Just 1 + abs 2
at
the value level:
Cardinality (Maybe a) = 'N 1 + Cardinality a
Let’s call this NatOrInf
addition operator
+%
, because %
kinda looks like \(\infty\) if you don’t look too closely. We
can continue to borrow much of our knowledge about how to define
value-level operators here.
type family (a :: NatOrInf) +% (b :: NatOrInf) :: NatOrInf where
'N a +% 'N b = 'N (a + b)
'Inf +% _ = 'Inf
+% 'Inf = 'Inf _
In fact, for comparison, let’s show what this type-level function
looks like at the value level. This won’t quite work as is, but it would
work with data NatOrInf = N Integer | Inf
, using
Integer
instead of Nat
. We can see how lifting
functions up to the type level can be thought of as mostly a syntactic
reshuffling game.
(+%) :: NatOrInf -> NatOrInf -> NatOrInf where
N a +% N b = N (a + b)
Inf +% _ = Inf
+% Inf = Inf _
Let’s also define *%
and ^%
. They’re mostly
simple substitutions into what we just wrote.
type family (a :: NatOrInf) *% (b :: NatOrInf) :: NatOrInf where
'N a *% 'N b = 'N (a GHC.TypeLits.* b)
'N 0 *% _ = 'N 0
*% 'N 0 = 'N 0
_ 'Inf *% _ = 'Inf
*% 'Inf = 'Inf
_
type family (a :: NatOrInf) ^% (b :: NatOrInf) :: NatOrInf where
'N a ^% 'N b = 'N (a ^ b)
'N 0 ^% _ = 'N 0
^% 'N 0 = 'N 1
_ 'Inf ^% _ = 'Inf
^% 'Inf = 'Inf _
With our new NatOrInf
operators, we can now define the
full Cardinality
type-level function
type family Cardinality a :: NatOrInf where
Cardinality Void = 'N 0
Cardinality () = 'N 1
Cardinality Bool = 'N 2
Cardinality (Maybe a) = 'N 1 +% Cardinality a
Cardinality (Either a b) = Cardinality a +% Cardinality b
Cardinality (a, b) = Cardinality a *% Cardinality b
Cardinality (a -> b) = Cardinality b ^% Cardinality a
Cardinality Integer = 'Inf
And give it one final test:
:k! Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))
λ Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))) :: NatOrInf
= 'N 217
Cardinality
is pretty handy for computing how many
values a type has. However, right now we’re tied to a very limited list
of types. If we want to discover these answers for other types (say,
TrafficLight
), we need to add those types to the list
fairly manually. If only there were a way to examine the “shape” of a
type, we could count these up in a more generic way….
Of course, there is such a way to do this. We start by adding some new prereqs to use Generics:
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
Let’s see how we might use this to help us count cardinalities. We
can derive Generic
for our TrafficLight
type.
data TrafficLight = Red | Yellow | Green
deriving Generic
Then to examine a Generic
-eye view of
TrafficLight
, we use :k! (Rep TrafficLight)
.
Rep
is a type family that gives us a, well, generic view of
how a type looks. Here’s what it shows us in this instance:
:k! Rep TrafficLight
λ Rep TrafficLight :: * -> *
= D1
'MetaData "TrafficLight" "Cardinality" "main" 'False)
(C1 ('MetaCons "Red" 'PrefixI 'False) U1
(:+: (C1 ('MetaCons "Yellow" 'PrefixI 'False) U1
:+: C1 ('MetaCons "Green" 'PrefixI 'False) U1))
That’s quite a bit of information, but luckily we can avoid caring
about anything starting with 'Meta
. So the true underlying
shape of TrafficLight
is something like
D1 (C1 U1 :+: (C1 U1 :+: C U1))
D1
means a datatype, which TrafficLight
certainly is. The C1
s help store metadata about
constructors. For our purposes, the important parts are the
U1
s and the :+:
s. A U1
represents
a constructor without any arguments (e.g. Red
), while the
:+:
is yet another type-level operator representing a sum.
Here, we have the sum of three U1
s, so we have a generic
view of a three-valued datatype.
Now, let’s get to writing a function that can take these generic
Rep
s and give us cardinalities. Just like before, we’ll be
writing a type family to compute NatOrInf
cardinalities for
us.
type family CardGeneric a :: NatOrInf where
CardGeneric U1 = 'N 1
CardGeneric (D1 _ x) = CardGeneric x
CardGeneric (C1 _ x) = CardGeneric x
CardGeneric (a :+: b) = CardGeneric a +% CardGeneric b
All of the above were seen in TrafficLight
.
D1
and C1
are just wrappers, so we ignore the
metadata on them and find the cardinality of the underlying type.
There’s also V1
, representing a type with no values,
which we can add pretty easily. For product types, the :*:
operator exists.
CardGeneric (a :*: b) = CardGeneric a *% CardGeneric b
CardGeneric U1 = 'N 1
There are other bits of Generics we won’t be covering, but let’s get
Maybe Bool
working. We can take a look at its
Rep
:
:k! Rep (Maybe Bool)
λ Rep (Maybe Bool) :: * -> *
= D1
'MetaData "Maybe" "GHC.Maybe" "base" 'False)
(C1 ('MetaCons "Nothing" 'PrefixI 'False) U1
(:+: C1
'MetaCons "Just" 'PrefixI 'False)
(S1
('MetaSel
('Nothing 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy)
Rec0 Bool))) (
The new things here appear to be S1
(another metadata
wrapper) and Rec0
, which is a synonym for
K1 R
. We can see that this is where the Bool
in Maybe Bool
is stored. We already have a type-level
function that can tell us the cardinality of a Bool
, so
let’s reuse that.
CardGeneric (S1 _ x) = CardGeneric x
CardGeneric (K1 _ x) = Cardinality x
Putting all of those together, we get a type family that can go from
a Rep
to a NatOrInf
in a bunch of common
cases.
type family CardGeneric a :: NatOrInf where
CardGeneric V1 = 'N 0
CardGeneric U1 = 'N 1
CardGeneric (K1 _ x) = Cardinality x
CardGeneric (D1 _ x) = CardGeneric x
CardGeneric (C1 _ x) = CardGeneric x
CardGeneric (S1 _ x) = CardGeneric x
CardGeneric (a :+: b) = CardGeneric a +% CardGeneric b
CardGeneric (a :*: b) = CardGeneric a *% CardGeneric b
To make sure it’s working, let’s test with another type. All we need
to do is derive Generic
:
data TwoBit = TwoBit Bool Bool
deriving Generic
Looks like the possible values worked out alright:
:k! CardGeneric (Rep TwoBit)
λ CardGeneric (Rep TwoBit) :: NatOrInf
= 'N 4
Finally, one more test on our colossally complicated cardinality counting conundrum:
:k! CardGeneric (Rep (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))))
λ CardGeneric (Rep (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))) :: NatOrInf
= 'N 217
Code from this post can be found in this gist
Technically three, as a commenter points out, since \(\bot\) is also an inhabitant. We’ll be ignoring that here, though. If you’re curious you might want to check out wiki.haskell.org/Bottom↩︎