Counting Cardinalities

One important piece of information about a type is how many possible values it contains. Let’s examine some ways to compute that info.

If you’re already comfortable manually figuring out the cardinality of a type, feel free to skip the first couple sections. The later bits are about how to do it more automatically.

Code from this post is here

Basics

Every type has a cardinality—some number of possible terms of that type. For example,

data Bool = True | False

Bool has two¹ possible values: True and False. The type Unit has only one possible value inhabiting it:

data Unit = Unit

and this type has three:

data TrafficLight = Red | Yellow | Green

There’s even a type with no values:

data Void

We can use that one by importing Data.Void.

There are also types with no bound on the possible number of values, like Integer. An Integer can be …-2, -1, 0, 1, 2,… all the way on to forever in both directions.

Cardinality Combinators

What’s the cardinality of Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))? We can find this by applying a few simple rules.

The cardinality of a Maybe a is the cardinality of a plus one. This is because Maybe only has two cases. Either we have a Just a, which has as many values as there are in a. Or, we have Nothing, which is where the “plus one” comes from.

The cardinality of an Either a b is the cardinality of a plus the cardinality of b. By similar reasoning to above, we either have a Left with just as many values as there are in a, or a Right with as many values as b.

How about tuples? A tuple (a, b) can have as many possible values as there are possible pairings of an a and a b. This means we have to multiply: the cardinality of (a, b) is the cardinality of a times the cardinality of b.

Finally, a function a -> b has \(b^a\) possible values. Think about the function f :: Bool -> TrafficLight. We’re assigning one value from a set of 3 to each value from a set of two. I’ve enumerated all these cases for f below, for a total of \(3^2 = 9\) functions. Unless this is obvious to you, it’s a good exercise here to also find the possible functions g :: TrafficLight -> Bool.

f True = Red
f False = Red

f True = Red
f False = Yellow

f True = Red
f False = Green

f True = Yellow
f False = Red

f True = Yellow
f False = Yellow

f True = Yellow
f False = Green

f True = Green
f False = Red

f True = Green
f False = Yellow

f True = Green
f False = Green

We can transform our type into a mathematical expression for the number of values it has, going step-by-step. First, turn Void into 0, () into 1 and Bool into 2:

Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))
Maybe (Either 0 (Either 2 1 -> (Maybe 2, Maybe 1))

Now we can transform Maybes into 1+:

Maybe (Either 0 (Either 2 1 -> (Maybe 2, Maybe 1))
1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))

Next, our tuple becomes a multiplication:

1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))
1 + (Either 0 (Either 2 1 -> ((1 + 2) * (1 + 1))))

Eithers are sums of both sides:

1 + (Either 0 (Either 2 1 -> (1 + 2, 1 + 1))
1 + (0 + ((2 + 1) -> ((1 + 2) * (1 + 1))))

And finally, function types involve exponentiation:

1 + (0 + ((2 + 1) -> ((1 + 2) * (1 + 1))))
1 + (0 + (((1 + 2) * (1 + 1)) ^ (2 + 1)))

Doing the math, this boils down to 217.

It’s possible to compute cardinalities on relatively small examples like the one here by hand, but it’s super easy to make a mistake along the way. Can we instead write a function to do the arithmetic for us? Turns out we can, but instead of a function acting on values, it’ll be a function acting on types.

Type-level functions

This is where things ramp up. A lot of the same mechanisms for defining value-level functions in Haskell also exist at the type level, but using somewhat different names and syntax. We’ll need {-# LANGUAGE TypeFamilies #-} to start, as well as import GHC.TypeLits.

Instead of a function’s type signature cardinality :: a -> Nat, we define a type family Cardinality a :: Nat. Just as cardinality is a value-level function that takes some value (with type a) and produces a value with type Nat, we can think of Cardinality as a type-level function that takes some type (a) and produces a type with kind Nat.

type family Cardinality a :: Nat where

We define these type-level functions in a very similar way to how we would define a function on values:

type family Cardinality a :: Nat where
  Cardinality Void = 0
  Cardinality ()   = 1
  Cardinality Bool = 2

How can we check the result of calling these type-level functions? GHCi provides the command :k!, which lets us see what type comes back as a result. (The λ is my GHCi command prompt, nothing to worry about there.)

λ :k! Cardinality Void
Cardinality Void :: Nat
= 0
λ :k! Cardinality ()
Cardinality () :: Nat
= 1
λ :k! Cardinality Bool
Cardinality Bool :: Nat
= 2

So far so good. Let’s begin adding our combining functions. We know that a Maybe adds one to the cardinality of whatever type’s inside it, and that an Either sums up the cardinality of its Left and Right sides. There is such a thing as a type-level + operator, acting on types of kind Nat, but we’ll need {-# LANGUAGE TypeOperators #-} to use it. Because we’re “calling” a type family recursively, we also need {-# LANGUAGE UndecidableInstances #-}.

  Cardinality (Maybe a) =
    1 + Cardinality a
  Cardinality (Either a b) =
    Cardinality a + Cardinality b

There are also * and ^ at the type level. However, you may run into confusing errors depending on how you use *. This used to mean Type before Type meant Type. If we write e.g. (*) a b then GHC spits out an error that * has kind *. No worries though, it’s easy enough to use * qualified. If you prefer, you can instead enable {-# LANGUAGE NoStarIsType #-} to avoid this notation conflict.

  Cardinality (a, b) =
    Cardinality a GHC.TypeLits.* Cardinality b
  Cardinality (a -> b) =
    Cardinality b ^ Cardinality a

Now we have enough of this type family defined to calculate the cardinality of that annoying complicated type from before.

λ :k! Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))
Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))) :: Nat
= 217

Introducing Infinities

Saying that the cardinality of a type is always a Nat is a lie. Some types have unboundedly many values. Integer is a good example: there is no upper or lower limit (try out 10^10000 in GHCi). Let’s introduce something that can capture the idea that a cardinality might be infinite.

data NatOrInf = N Nat | Inf

NatOrInf is a simple type with value-level data constructors. However, we can lift its constructors into the type level via {-# LANGUAGE DataKinds #-}. When we write the data constructors of NatOrInf with a tick in front, we get type-level constructors. For example, 'Inf is a type-level Inf, and 'N 42 is a type-level N 42. We can go back through our definition of Cardinality to make it work with NatOrInf instead of just Nat.

The basics are easy, and we can now include infinite-cardinality types:

type family Cardinality a :: NatOrInf where
  Cardinality Void    = 'N 0
  Cardinality ()      = 'N 1
  Cardinality Bool    = 'N 2
  Cardinality Integer = 'Inf

However, because the type-level + only works on Nat (and not on NatOrInf), we’ll need to define our own operators. The line below won’t work, just like attempting something like the badly-typed Just 1 + abs 2 at the value level:

  Cardinality (Maybe a) = 'N 1 + Cardinality a

Let’s call this NatOrInf addition operator +%, because % kinda looks like \(\infty\) if you don’t look too closely. We can continue to borrow much of our knowledge about how to define value-level operators here.

type family (a :: NatOrInf) +% (b :: NatOrInf) :: NatOrInf where
  'N a +% 'N b = 'N (a + b)
  'Inf +% _    = 'Inf
  _    +% 'Inf = 'Inf

In fact, for comparison, let’s show what this type-level function looks like at the value level. This won’t quite work as is, but it would work with data NatOrInf = N Integer | Inf, using Integer instead of Nat. We can see how lifting functions up to the type level can be thought of as mostly a syntactic reshuffling game.

(+%) :: NatOrInf -> NatOrInf -> NatOrInf where
N a +% N b = N (a + b)
Inf +% _   = Inf
_ +% Inf = Inf

Let’s also define *% and ^%. They’re mostly simple substitutions into what we just wrote.

type family (a :: NatOrInf) *% (b :: NatOrInf) :: NatOrInf where
  'N a *% 'N b = 'N (a GHC.TypeLits.* b)
  'N 0 *% _    = 'N 0
  _    *% 'N 0 = 'N 0
  'Inf *% _    = 'Inf
  _    *% 'Inf = 'Inf

type family (a :: NatOrInf) ^% (b :: NatOrInf) :: NatOrInf where
  'N a ^% 'N b = 'N (a ^ b)
  'N 0 ^% _    = 'N 0
  _    ^% 'N 0 = 'N 1
  'Inf ^% _    = 'Inf
  _    ^% 'Inf = 'Inf

With our new NatOrInf operators, we can now define the full Cardinality type-level function

type family Cardinality a :: NatOrInf where
  Cardinality Void         = 'N 0
  Cardinality ()           = 'N 1
  Cardinality Bool         = 'N 2
  Cardinality (Maybe a)    = 'N 1 +% Cardinality a
  Cardinality (Either a b) = Cardinality a +% Cardinality b
  Cardinality (a, b)       = Cardinality a *% Cardinality b
  Cardinality (a -> b)     = Cardinality b ^% Cardinality a
  Cardinality Integer      = 'Inf

And give it one final test:

λ :k! Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))
Cardinality (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))) :: NatOrInf
= 'N 217

Again, but more Generically

Cardinality is pretty handy for computing how many values a type has. However, right now we’re tied to a very limited list of types. If we want to discover these answers for other types (say, TrafficLight), we need to add those types to the list fairly manually. If only there were a way to examine the “shape” of a type, we could count these up in a more generic way….

Of course, there is such a way to do this. We start by adding some new prereqs to use Generics:

{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics

Let’s see how we might use this to help us count cardinalities. We can derive Generic for our TrafficLight type.

data TrafficLight = Red | Yellow | Green
  deriving Generic

Then to examine a Generic-eye view of TrafficLight, we use :k! (Rep TrafficLight). Rep is a type family that gives us a, well, generic view of how a type looks. Here’s what it shows us in this instance:

λ :k! Rep TrafficLight
Rep TrafficLight :: * -> *
= D1
    ('MetaData "TrafficLight" "Cardinality" "main" 'False)
    (C1 ('MetaCons "Red" 'PrefixI 'False) U1
     :+: (C1 ('MetaCons "Yellow" 'PrefixI 'False) U1
          :+: C1 ('MetaCons "Green" 'PrefixI 'False) U1))

That’s quite a bit of information, but luckily we can avoid caring about anything starting with 'Meta. So the true underlying shape of TrafficLight is something like

D1 (C1 U1 :+: (C1 U1 :+: C U1))

D1 means a datatype, which TrafficLight certainly is. The C1s help store metadata about constructors. For our purposes, the important parts are the U1s and the :+:s. A U1 represents a constructor without any arguments (e.g. Red), while the :+: is yet another type-level operator representing a sum. Here, we have the sum of three U1s, so we have a generic view of a three-valued datatype.

Now, let’s get to writing a function that can take these generic Reps and give us cardinalities. Just like before, we’ll be writing a type family to compute NatOrInf cardinalities for us.

type family CardGeneric a :: NatOrInf where
  CardGeneric U1        = 'N 1
  CardGeneric (D1 _ x)  = CardGeneric x
  CardGeneric (C1 _ x)  = CardGeneric x
  CardGeneric (a :+: b) = CardGeneric a +% CardGeneric b

All of the above were seen in TrafficLight. D1 and C1 are just wrappers, so we ignore the metadata on them and find the cardinality of the underlying type.

There’s also V1, representing a type with no values, which we can add pretty easily. For product types, the :*: operator exists.

  CardGeneric (a :*: b) = CardGeneric a *% CardGeneric b
  CardGeneric U1        = 'N 1

There are other bits of Generics we won’t be covering, but let’s get Maybe Bool working. We can take a look at its Rep:

λ :k! Rep (Maybe Bool)
Rep (Maybe Bool) :: * -> *
= D1
    ('MetaData "Maybe" "GHC.Maybe" "base" 'False)
    (C1 ('MetaCons "Nothing" 'PrefixI 'False) U1
     :+: C1
           ('MetaCons "Just" 'PrefixI 'False)
           (S1
              ('MetaSel
                 'Nothing 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy)
              (Rec0 Bool)))

The new things here appear to be S1 (another metadata wrapper) and Rec0, which is a synonym for K1 R. We can see that this is where the Bool in Maybe Bool is stored. We already have a type-level function that can tell us the cardinality of a Bool, so let’s reuse that.

  CardGeneric (S1 _ x)  = CardGeneric x
  CardGeneric (K1 _ x)  = Cardinality x

Putting all of those together, we get a type family that can go from a Rep to a NatOrInf in a bunch of common cases.

type family CardGeneric a :: NatOrInf where
  CardGeneric V1        = 'N 0
  CardGeneric U1        = 'N 1
  CardGeneric (K1 _ x)  = Cardinality x
  CardGeneric (D1 _ x)  = CardGeneric x
  CardGeneric (C1 _ x)  = CardGeneric x
  CardGeneric (S1 _ x)  = CardGeneric x
  CardGeneric (a :+: b) = CardGeneric a +% CardGeneric b
  CardGeneric (a :*: b) = CardGeneric a *% CardGeneric b

To make sure it’s working, let’s test with another type. All we need to do is derive Generic:

data TwoBit = TwoBit Bool Bool
  deriving Generic

Looks like the possible values worked out alright:

λ :k! CardGeneric (Rep TwoBit)
CardGeneric (Rep TwoBit) :: NatOrInf
= 'N 4

Finally, one more test on our colossally complicated cardinality counting conundrum:

λ :k! CardGeneric (Rep (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ())))))
CardGeneric (Rep (Maybe (Either Void (Either Bool () -> (Maybe Bool, Maybe ()))))) :: NatOrInf
= 'N 217

Code from this post can be found in this gist