Data Declaration Syntax Separation

Syntax highlighting

Here’s an example of my usual Haskell syntax highlighting on this site:

data Bool = True | False

Throughout this post, we’ll be using a special syntax highlighter that colors types and values differently, for clarity. Here are some examples:

data Bool = True | False
True :: Bool
newtype Wrapped a = Wrapped a

As a mnemonic, the default colors are teal for types and vermilion for values.

Data declarations

Haskell’s data declaration syntax gives us the ability to conjure new Algebraic Data Types (ADTs). The data keyword introduces these kinds of declarations. To the left of the equals sign, we name the type, and to the right we name each of its constructors—possible ways to construct a value of the type we’re creating. The constructors are separated by |.

data Bool = True | False
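To make the two sides of the equals sign concrete, here is a minimal sketch that redefines Bool locally (hiding the Prelude's version) and pattern matches on its constructors. The function name flipBool is made up for illustration.

```haskell
-- Hide the Prelude's Bool so we can declare our own copy.
import Prelude hiding (Bool (..))

-- Type name on the left, constructors on the right.
data Bool = True | False
  deriving (Show, Eq)

-- flipBool is a hypothetical helper: the signature mentions the type,
-- the equations mention the constructors (values).
flipBool :: Bool -> Bool
flipBool True = False
flipBool False = True
```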

There’s nothing stopping us from giving both a type and a value of that type the same name. However, note that the compiler (and our special syntax highlighter here) makes strong distinctions between the type Unit and the value Unit. As humans, part of our job is to understand which things are types and which are values. Right now, a good rule is that types are found to the left of the equals sign.

data Unit = Unit
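One way to see the distinction is to put both uses of the name side by side. In this sketch (theUnit and sameUnit are illustrative names only), every Unit after a :: is the type, and every Unit in an equation is the value.

```haskell
data Unit = Unit
  deriving (Show, Eq)

-- After the ::, Unit is the type. After the =, Unit is the value.
theUnit :: Unit
theUnit = Unit

-- Signature: two types. Equation: a value pattern and a value result.
sameUnit :: Unit -> Unit
sameUnit Unit = Unit
```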

What’s the type of a constructor?

However, that rule is pretty quickly broken, since we can use types inside constructors. If a robot is identifiable either by name or serial number, we might have this:

data RobotIdentifier = Name String | Serial Int

To see why this is, it’s useful to see what a constructor actually is. When we examine the type signatures of the constructors of a RobotIdentifier, we see functions that result in a RobotIdentifier:

Name :: String -> RobotIdentifier

Serial :: Int -> RobotIdentifier
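Because constructors are ordinary functions, they can be passed around like any other function, and they can also be used in patterns to take a value apart. A small sketch, with serials and describe as made-up names:

```haskell
data RobotIdentifier = Name String | Serial Int
  deriving (Show, Eq)

-- Serial :: Int -> RobotIdentifier, so it can be mapped over a list of Ints.
serials :: [RobotIdentifier]
serials = map Serial [101, 102, 103]

-- Pattern matching uses the same constructors in reverse.
describe :: RobotIdentifier -> String
describe (Name n) = "robot named " ++ n
describe (Serial s) = "robot #" ++ show s
```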

If we have a constructor that takes multiple values, we get back a function type with multiple arguments.

data RobotGroup = Group String Int String

Group :: String -> Int -> String -> RobotGroup

I suggest trying :type Group in GHCi to confirm this.
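Since a multi-argument constructor is a curried function, it can also be partially applied. The sketch below assumes a hypothetical group called "Welders"; the name welders and the sample values are not from the original.

```haskell
data RobotGroup = Group String Int String
  deriving (Show, Eq)

-- Supplying only the first argument leaves a function
-- that still expects the Int and the second String.
welders :: Int -> String -> RobotGroup
welders = Group "Welders"

-- Supplying the remaining arguments produces a RobotGroup.
smallCrew :: RobotGroup
smallCrew = welders 4 "W-01"
```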

As an aside (skip if you’re not familiar with record syntax), if we use record syntax to add nice names for each of the pieces of information that form a Group, we still get the same type signature for the constructor. However, we also get a few functions from RobotGroup back to its constituent parts.

data RobotGroup = Group 
  { name :: String
  , numMembers :: Int
  , identifierCode :: String
  }

Group :: String -> Int -> String -> RobotGroup

name :: RobotGroup -> String
numMembers :: RobotGroup -> Int
identifierCode :: RobotGroup -> String
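Here is a rough sketch of how the record version might be used: the constructor still works positionally, it also accepts named fields, and the field names act as accessor functions. The sample values are invented.

```haskell
data RobotGroup = Group
  { name :: String
  , numMembers :: Int
  , identifierCode :: String
  }
  deriving (Show, Eq)

-- Positional construction, exactly as without record syntax.
positional :: RobotGroup
positional = Group "Welders" 4 "W-01"

-- Construction by field name builds the same value.
byField :: RobotGroup
byField = Group {name = "Welders", numMembers = 4, identifierCode = "W-01"}

-- Field names double as functions from RobotGroup to the field's type.
memberCount :: Int
memberCount = numMembers positional
```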

Type variables

Another thing we can do with data declarations is have our type take a type variable (on the left of the equals sign) and use it (on the right):

newtype Wrapped a = Wrapped a

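For example, we can now use the type Wrapped Int, whose values include Wrapped 7 and Wrapped (-3432498). The parentheses around the negative literal are needed; without them, the minus sign is parsed as binary subtraction.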
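As a small sketch, these are two values of type Wrapped Int and a function that works for any choice of the type variable. The names seven, negative, and unwrap are illustrative.

```haskell
newtype Wrapped a = Wrapped a
  deriving (Show, Eq)

-- Here the type variable a is instantiated to Int.
seven :: Wrapped Int
seven = Wrapped 7

-- A negative literal has to be parenthesized when used as an argument.
negative :: Wrapped Int
negative = Wrapped (-3432498)

-- unwrap works for every a: the same variable appears in both positions.
unwrap :: Wrapped a -> a
unwrap (Wrapped x) = x
```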

I’ve noticed that a fairly common mistake beginners make is using Maybe (the name of the type) in a pattern match where they mean to use Just (the name of the constructor). However, people almost never confuse Maybe and Nothing.

data Maybe a = Just a | Nothing

I’m not sure whether this is because the names Just and Maybe are confusing, or because the difference between which is a type and which is a value is itself confusing.
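To illustrate where each name belongs, this sketch uses the Prelude's Maybe. The type name appears only in the signature, while the constructors appear only in the patterns. The function describeAge is a made-up example.

```haskell
-- Maybe is a type, so it lives in the signature.
describeAge :: Maybe Int -> String
-- Just and Nothing are constructors, so they live in the patterns.
describeAge (Just n) = "age " ++ show n
describeAge Nothing = "age unknown"

-- Writing `describeAge (Maybe n) = ...` would be rejected by the compiler,
-- because there is no data constructor called Maybe.
```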

We can use multiple type variables, as in Either:

data Either a b = Left a | Right b
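A brief sketch of the two type variables in use, relying on the Prelude's Either. Here a is String and b is Int, and halve is a hypothetical function chosen only to produce both constructors.

```haskell
-- Left carries a String (the a), Right carries an Int (the b).
halve :: Int -> Either String Int
halve n
  | even n = Right (n `div` 2)
  | otherwise = Left ("odd input: " ++ show n)
```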

Finally, here’s a classic example of how we can use the same type variable multiple times within a single constructor, and also how we can reference the type we’re defining recursively from one of its constructors.

data List a = Empty | Cons a (List a)
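Functions over a recursive type tend to be recursive in the same shape. The following is a minimal sketch; size and toBuiltin are illustrative names, not part of any library.

```haskell
data List a = Empty | Cons a (List a)
  deriving (Show, Eq)

-- One equation per constructor; the Cons case recurses on the tail.
size :: List a -> Int
size Empty = 0
size (Cons _ rest) = 1 + size rest

-- Convert to the built-in list type, which has the same structure.
toBuiltin :: List a -> [a]
toBuiltin Empty = []
toBuiltin (Cons x rest) = x : toBuiltin rest

-- A three-element example list.
example :: List Int
example = Cons 1 (Cons 2 (Cons 3 Empty))
```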