Here’s an example of my usual Haskell syntax highlighting on this site:
data Bool = True | False
Throughout this post, we’ll be using a special syntax highlighter that colors types and values differently, for clarity. Here are some examples:
data Bool = True | False
True :: Bool
newtype Wrapped a = Wrapped a
As a mnemonic, these colors start as teal for types and vermilion for values.
Haskell’s data declaration syntax gives us the ability to conjure new Algebraic Data Types (ADTs). The data keyword introduces these kinds of declarations. To the left of the equals sign, we name the type, and to the right we name each of its constructors: the possible ways to construct a value of the type we’re creating. The constructors are separated by |.
data Bool = True | False
There’s nothing stopping us from giving both a type and a value of that type the same name. However, note that the compiler (and our special syntax highlighter here) make a strong distinction between the type Unit and the value Unit. As humans, part of our job is to understand which things are types and which are values. Right now, a good rule is that types are found to the left of the equals sign.
data Unit = Unit
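To make the two roles of the name concrete, here is a small sketch. The names theUnit and describe are my own, made up for illustration; the point is only where Unit means the type and where it means the value.

```haskell
-- The type Unit and its single constructor Unit share a name,
-- but they live in separate namespaces.
data Unit = Unit
  deriving (Show, Eq)

-- After the :: we are talking about types, so Unit is the type here.
theUnit :: Unit
-- In the body of the definition, Unit is the value (the constructor).
theUnit = Unit

-- In a pattern, Unit is again the constructor, a value-level name.
-- (describe is a hypothetical helper, not part of any library.)
describe :: Unit -> String
describe Unit = "the only value of type Unit"
```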
However, that rule is pretty quickly broken, since we can use types inside constructors. If a robot is identifiable either by name or serial number, we might have this:
data RobotIdentifier = Name String | Serial Int
To see why this is, it’s useful to see what a constructor actually is. When we examine the type signatures of the constructors of a RobotIdentifier, we see functions that result in a RobotIdentifier:
Name :: String -> RobotIdentifier
Serial :: Int -> RobotIdentifier
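Since constructors are functions, we can pass them around like any other function. The sketch below assumes the RobotIdentifier declaration from above; serials and label are names I made up for the example.

```haskell
data RobotIdentifier = Name String | Serial Int
  deriving (Show, Eq)

-- Serial has type Int -> RobotIdentifier, so it can be handed to map.
serials :: [RobotIdentifier]
serials = map Serial [1, 2, 3]

-- Constructors also appear in patterns, where they take a value apart
-- instead of building one. (label is a hypothetical helper.)
label :: RobotIdentifier -> String
label (Name n)   = "robot named " ++ n
label (Serial s) = "robot #" ++ show s
```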
If we have a constructor that takes multiple values, we get back a function type with multiple arguments.
data RobotGroup = Group String Int String
Group :: String -> Int -> String -> RobotGroup
I suggest trying :type Group in GHCi to confirm this.
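One consequence of Group being a curried function is that it can be partially applied. Here is a minimal sketch; welders and weldingCrew are hypothetical names, and the field values are invented.

```haskell
data RobotGroup = Group String Int String
  deriving (Show, Eq)

-- Supplying only the first argument leaves a function
-- that is still waiting for the Int and the second String.
welders :: Int -> String -> RobotGroup
welders = Group "Welders"

-- Supplying the remaining arguments finishes building the value.
weldingCrew :: RobotGroup
weldingCrew = welders 12 "WLD-01"
```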
As an aside (skip if you’re not familiar with record syntax), if we use record syntax to add nice names for each of the pieces of information that form a Group, we still get the same type signature for the constructor. However, we also get a few functions from RobotGroup back to its constituent parts.
data RobotGroup = Group
  { name :: String
  , numMembers :: Int
  , identifierCode :: String
  }
Group :: String -> Int -> String -> RobotGroup
name :: RobotGroup -> String
numMembers :: RobotGroup -> Int
identifierCode :: RobotGroup -> String
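To see both halves of that claim at once, here is a short sketch using the record declaration above. The values and the helper names positional, named, and summary are invented for the example.

```haskell
data RobotGroup = Group
  { name           :: String
  , numMembers     :: Int
  , identifierCode :: String
  } deriving (Show, Eq)

-- The constructor still works positionally, exactly as before.
positional :: RobotGroup
positional = Group "Welders" 12 "WLD-01"

-- Record syntax builds the same value with the fields spelled out.
named :: RobotGroup
named = Group { name = "Welders", numMembers = 12, identifierCode = "WLD-01" }

-- The field names double as ordinary functions out of RobotGroup.
summary :: RobotGroup -> String
summary g = name g ++ " (" ++ show (numMembers g) ++ " members)"
```

Here positional and named compare equal, and summary named gives "Welders (12 members)".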
Another thing we can do with data declarations is have our type take a type variable (on the left of the equals sign) and use it (on the right):
newtype Wrapped a = Wrapped a
For example, we can now use a type Wrapped Int with possible values Wrapped 7 or Wrapped (-3432498), etc. (The parentheses are needed because Wrapped -3432498 would parse as a subtraction.)
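Here is a small sketch of the same type variable being filled in with different types. The value names and unwrap are hypothetical helpers of my own.

```haskell
newtype Wrapped a = Wrapped a
  deriving (Show, Eq)

-- Here the type variable a is Int.
wrappedInt :: Wrapped Int
wrappedInt = Wrapped 7

-- Negative literals need parentheses when passed to a constructor.
negativeInt :: Wrapped Int
negativeInt = Wrapped (-3432498)

-- Here the same declaration is used with a = String.
wrappedString :: Wrapped String
wrappedString = Wrapped "hello"

-- Pattern matching on the constructor gets the inner value back,
-- whatever type a happens to be.
unwrap :: Wrapped a -> a
unwrap (Wrapped x) = x
```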
I’ve noticed that a fairly common mistake beginners make is using Maybe (the name of the type) in a pattern match where they mean to use Just (the name of the constructor). However, people almost never confuse Maybe and Nothing.
data Maybe a = Just a | Nothing
I’m not sure whether this is because the names Just and Maybe are confusing, or because the distinction between types and values is itself confusing.
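For reference, this is roughly what the correct version looks like. The sketch uses the Maybe from the Prelude rather than redefining it, and orZero is a name I made up.

```haskell
-- Maybe appears only in the type signature.
orZero :: Maybe Int -> Int
-- The patterns use the constructors, Just and Nothing.
orZero (Just n) = n
orZero Nothing  = 0

-- The mistaken version would be:
--   orZero (Maybe n) = n
-- which GHC rejects, because Maybe is the name of a type,
-- not of a data constructor, so it cannot appear in a pattern.
```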
We can use multiple type variables, as in Either:
data Either a b = Left a | Right b
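As a quick sketch of the two type variables in use, here is Either standing in for the name-or-serial idea from earlier. It relies on the Prelude’s Either; describeId and ids are hypothetical names.

```haskell
-- Left carries the first type variable (String here),
-- Right carries the second (Int here).
ids :: [Either String Int]
ids = [Left "Bender", Right 42]

-- Each constructor gets its own pattern, and each pattern
-- binds a value of the matching type.
describeId :: Either String Int -> String
describeId (Left n)  = "name: " ++ n
describeId (Right s) = "serial: " ++ show s
```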
Finally, here’s a classic example of how we can use the same type variable multiple times within a single constructor, and also how we can reference the type we’re defining recursively from one of its constructors.
data List a = Empty | Cons a (List a)
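To show how the recursion plays out, here is a short sketch that builds a List and walks it. The names oneTwoThree, size, and toBuiltin are my own inventions for the example.

```haskell
data List a = Empty | Cons a (List a)
  deriving (Show, Eq)

-- Each Cons holds one element and the rest of the list.
oneTwoThree :: List Int
oneTwoThree = Cons 1 (Cons 2 (Cons 3 Empty))

-- Functions over List usually mirror its shape:
-- one case per constructor, recursing where the type recurses.
size :: List a -> Int
size Empty       = 0
size (Cons _ xs) = 1 + size xs

-- Converting to the built-in list type follows the same pattern.
toBuiltin :: List a -> [a]
toBuiltin Empty       = []
toBuiltin (Cons x xs) = x : toBuiltin xs
```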