mitchell vitez

Forgetful Functors and Lossy Language

A monoid is a mathematical object consisting of an underlying set closed under an associative binary operation, together with an identity element for that operation. For example, addition over the reals is a monoid \((\mathbb{R}, +, 0)\) with operation \(+\) and identity \(0\).
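As a rough sketch of the definition above, Haskell’s standard `Sum` wrapper from `Data.Monoid` packages up addition as a monoid. The names `total` and `identityElement` are made up for this example, and `Double` is only standing in for the reals (floating-point addition is associative only up to rounding).

```haskell
import Data.Monoid (Sum (..))

-- Combine a list of numbers using the additive monoid.
-- foldMap wraps each element in Sum and combines them with (<>).
total :: [Double] -> Double
total = getSum . foldMap Sum

-- The identity element of the additive monoid, which is 0.
identityElement :: Double
identityElement = getSum mempty
```

Combining any value with `mempty` leaves it unchanged, which is exactly the identity law.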

A group goes a bit further. It also has a set and a binary operation, but in addition to identity and associativity it requires that every element have an inverse. For example, addition over the reals \((\mathbb{R}, +, 0)\) is a group as well as a monoid. The reals are closed under addition, since the sum of any two real numbers is another real number, and every real \(x\) has an inverse \(-x\) with \(x + (-x) = 0\) (this is what lets us define subtraction).
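One way to picture the extra requirement is as a type class that extends `Monoid` with one more operation. This is only a sketch: the `Group` class, `invert`, and `minus` are names invented here and are not part of Haskell’s base library.

```haskell
import Data.Monoid (Sum (..))

-- Hypothetical class for this sketch: a monoid whose elements all have inverses.
class Monoid g => Group g where
  invert :: g -> g

-- Addition is a group: the inverse of x is its negation.
instance Num a => Group (Sum a) where
  invert (Sum x) = Sum (negate x)

-- Subtraction falls out as "combine with the inverse".
minus :: Group g => g -> g -> g
minus x y = x <> invert y
```

Combining any element with its inverse should give back `mempty`, the identity.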

We can construct monoids which aren’t groups. For example, take the monoid \((\mathbb{R}, \cdot, 1)\). If we try to treat it as a group, we run into a problem. Multiplying any element \(x\) by \(0\) gives us \(0\), so there is no real number \(y\) with \(0 \cdot y = 1\). Zero has no inverse, since we can’t divide out the zero.

Consider the functor from the category of groups to the category of monoids, \(U : Grp \to Mon\), that “forgets” the invertibility property of each group’s operation. This is one example of, appropriately enough, a “forgetful functor”. Even though we know the group’s operation is invertible (by definition of being a group), we are choosing to lose that information for some other gain (e.g. allowing our new monoid to commingle with monoids that aren’t necessarily groups). In general, when we forget something in this way, whether in math or in code, the loss of information lets us treat the object more generally.
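Here is a minimal sketch of that forgetting in code, with the structure written out as explicit records. All of the names (`MonoidOn`, `GroupOn`, `forget`, `combineAll`, and so on) are invented for illustration, `Double` again approximates the reals, and only the action of \(U\) on objects is modelled, not its action on homomorphisms.

```haskell
-- Hypothetical record types: an explicit "dictionary" of structure on a type.
data MonoidOn a = MonoidOn
  { unit    :: a
  , combine :: a -> a -> a
  }

data GroupOn a = GroupOn
  { gUnit    :: a
  , gCombine :: a -> a -> a
  , gInverse :: a -> a
  }

-- Keep the unit and the operation, drop the inverse.
forget :: GroupOn a -> MonoidOn a
forget g = MonoidOn { unit = gUnit g, combine = gCombine g }

-- (R, +, 0) is a group, with negation as the inverse.
additiveReals :: GroupOn Double
additiveReals = GroupOn { gUnit = 0, gCombine = (+), gInverse = negate }

-- (R, *, 1) is only a monoid, because 0 has no multiplicative inverse.
multiplicativeReals :: MonoidOn Double
multiplicativeReals = MonoidOn { unit = 1, combine = (*) }

-- Written against MonoidOn only, so it accepts either structure.
combineAll :: MonoidOn a -> [a] -> a
combineAll m = foldr (combine m) (unit m)
```

After forgetting, the additive group can sit in the same list as the multiplicative monoid, e.g. `[forget additiveReals, multiplicativeReals]`, which would not typecheck with the `GroupOn` value itself. The price is that `gInverse` is no longer reachable from the result.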

I want to explore how similar concepts crop up in human language, and why this might present some interesting problems.

Imagine the set of all possible thoughts. Some of these thoughts will be mere statements of fact, and others will be total fabrications. Some will be directed at ourselves, and some will be directed at others. A couple of interesting properties of thoughts are transmittability and representability. If a thought is transmittable, that roughly means that we can tell others about it through language in finite space and finite time. If it’s representable, that means our brains can store it in some way, even if we can’t find a way to teach others about it.

Notice that “all possible thoughts” is necessarily a superset of “all representable thoughts”. If we have a thought that’s representable, then that thought must exist. However, just because a thought is logically possible doesn’t mean it’s possible for a brain to store it. (If this conflicts with your internal definition of “thought”, maybe try “all possible ideas” or even “all possible ideas including ones humans can’t have, but aliens might be able to”.) Likewise, at least in a human context, for a thought to be transmittable it must be representable. This implies the possibility of thoughts we can think but cannot transmit, and also the possibility of thoughts we cannot think, but which exist in some other nebulous sense.

If human-unthinkable thoughts (HUTs) exist, we can draw a rough analogy back to our forgetful functors. The property we’re forgetting here is representability. However, because of the nature of unrepresentable thoughts, this means that if HUTs exist, then there are ideas we cannot have in full generality. The most wide-open space of ideas (all possible) will remain inaccessible to us.

Sometimes people give statements like “this statement is false” as examples of thoughts that cannot be thought about (or worse, as things to yell to shut down an AI takeover). This doesn’t make sense. You can think about this. Not only is it representable, it’s transmittable. I just wrote it down and implanted it in your brain!

Perhaps there is something about “this statement is false” that is indeed inaccessible to human minds. But then there would necessarily be no way for me to formulate what that thing was (beyond the cop-out I just used, i.e. “the part of the statement that is unthinkable”), and definitely no way for me to transmit it to you accurately. If HUTs exist, there is no way for a human to give an example.

We are all quite used to our language being lossy. When I say “the black dog jumped over the fence” that may conjure up a specific image in your head, but probably not the same one as “the half-meter tall black dog with a short tail rocked back onto its hind legs then jumped over the tiny foot-tall chain link fence, landing in the sand on the other side”. It would take an extremely large amount of description to ensure that you saw in your mind exactly what I see in my memory, and usually the approximation is good enough.

One of the reasons mathematics is interesting to me is that it makes a valiant effort to be precise and rigorous, with such a level of consistency that two mathematicians using the same language can be assured they are talking about the same ideas. The underlying reason math can be so precise in the first place, though, is that the ideas it talks about are abstract and general. It is much harder to be precise and rigorous about a dog jumping over a fence (which requires a massive description) than about what you mean by “group” or “monoid”. This trade-off is itself witnessed in forgetful functors: the more precise information we require about an object, the less general that object is, and vice versa.

This trade-off is also a great source of information about what’s going on inside a function given just its type signature. For example, consider the signature f :: a -> a. What function must f be? There is no way to construct an a from an a in full generality unless f is id, the identity function (setting aside definitions that never return a value, such as undefined). Generality has the power of certain kinds of knowledge that specificity lacks. (If we replace a with Int, there are many more functions g :: Int -> Int than just identity, e.g. succ, negate, double, square, const 0, etc.)
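To make the contrast concrete, here is a small sketch. The general signature leaves room for one total definition, while the `Int` version admits many. The names `intFunctions` and `applyAll` are invented for this example, and since `double` and `square` are not standard Prelude functions they are written as lambdas or sections.

```haskell
-- With no knowledge of a, all we can do with the argument is return it.
f :: a -> a
f x = x

-- With a fixed to Int, many different functions share one signature.
intFunctions :: [(String, Int -> Int)]
intFunctions =
  [ ("id", id)
  , ("succ", succ)
  , ("negate", negate)
  , ("double", (* 2))
  , ("square", \n -> n * n)
  , ("const 0", const 0)
  ]

-- Apply every candidate to the same input to see how they differ.
applyAll :: Int -> [(String, Int)]
applyAll n = [(name, g n) | (name, g) <- intFunctions]
```

Evaluating `applyAll 3` gives a different result for each entry, so the type `Int -> Int` alone tells us almost nothing about which function we are holding.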

Anything nontrivial that shows up in math, language, and programming (which is kind of a mix of the first two) is probably a fairly general or powerful idea. Here, I think that idea is approximately summed up by the seemingly-paradoxical “generality lets you be precise, while specificity requires you to be imprecise”. This is such an interesting idea to me and I’ve seen it show up in so many different domains. I’m glad that it’s thinkable.