mitchell vitez

dark mode

blog about music art media

resume email github

Counting Crossword Grids

Consider an \(n \times n\) grid with each cell either black or white, and \(n\) odd. Here’s an example of a \(7 \times 7\):

There are \(2^{n^2}\) possible grids like this, since each square can be one of two options—black or white. This means that for a typical \(15 \times 15\) crossword grid, there are at most \(2^{225} \approx 5 \times 10^{67}\) possibilities—a truly gargantuan number, but an upper bound nonetheless. However, there are extra rules which limit the number of legal grids quite a bit more.

Crossword grids are rotationally symmetric through a turn of \(180\degree\). That is, if you rotate one so it’s upside down, the pattern of black and white squares will still look the same. Our first grid doesn’t obey that pattern, but here’s one that does.

Because of the symmetry, we know we don’t have to count every square towards the total anymore. We can rightly expect about half the squares not to count anymore. Because the central square is its own image under rotation, we’ll treat it separately. Below, I’ve colored in a set of squares that represents all the ones we need to count. Hopefully it’s clear that the colored squares each have exactly one counterpart when rotated.

Then, our new formula only needs to account for a choice of black or white for each blue square, plus one extra choice for the color of the central square:

\[ \underbrace{2}_{\text{central}} \times \underbrace{2^{\frac{n^2 - 1}{2}}}_{\text{rotated}} = 2^{\frac{n^2 - 1}{2} + 1} \]

For a typical grid size of \(15 \times 15\), this means our possibilities still number over ten decillion. Ridiculously large, but a truly tiny fraction of what they were before. However, there’s one more rule I’d like to apply before claiming that we’re looking at a reasonably small upper bound on the number of crossword grids: the word length rule.

In a normal crossword, we don’t see words any shorter than three letters long. This means that in any given row or column in the grid, we should never see a lone white square, or pair of white squares. This means our most recent example grid is invalid—it has lone squares vertically down the middle, and some lone horizontal pairs as well. Here’s another grid, which obeys all our rules so far.

This rule is especially limiting on a \(7 \times 7\) grid, since it forces all black squares to exist only along the centermost column and row, as seen above, or only on the outside edge, as seen below. On larger grids it’s not quite so bad, although it does rule out many grids.

Because this new rule filters out “bad” columns and rows independently of others (except when accounting for the doubling/symmetry rule we saw before), it makes sense to first take a look at a single-row example. We’ll need a longer row than 7—why not 11?

Let’s try to find invalid rows. There are no invalid rows with zero black squares. How about rows with one?

So far, out of 12 ways to place either zero or one black squares, four are invalid. Let’s try rows with two black squares. There are \({11 \choose 2} = 55\) possibilities to try, so it’s still in the realm of enumeration. Let’s now let blue squares mean something different: we could place a black square in any of their places, and still get an invalid row.

This is a repeat of our first one-black-square case above, but it adds nine new cases because we have nine places where we could put the second square. When we have a black third square, we actually don’t have anywhere that we can put another black square and get a valid row.

This pattern repeats for squares near the end, of course, but we have to be careful not to double-count. Two squares give us additional ways to cause mayhem. For example, if we have a black fourth, fifth, or sixth square, we get these patterns respectively:

The remainder are symmetrical to patterns we’ve seen already.

Whew, that’s a lot of work to count these things! And we’re only on two black squares. Luckily for us,

\[\sum_{k=0}^{11} {11 \choose k} = 2^{11} = 2048\]

While it would be a total pain to count 2048 cases by hand, it’s well within the enumeration range of a program (unlike, say, 10 decillion). This is also a fairly fast problem to compute in general: it’s equivalent to counting the number of n-bit numbers with some property in how their bits are arranged. The number of cases is small enough, though, that just about any implementation will do. Here’s one that finds the valid row numbers for rows with length up to 21 (the size of the New York Times Sunday puzzle).

import Control.Monad (forM_)
import Data.List.Split (splitOn)

data Color = B | W
  deriving (Show, Eq, Ord)

main =
  forM_ [1..21] $ \n ->
  print . length . filter isValid $ allCombinations n

allCombinations n =
  filter ((==n) . length) $ allCombs [B, W] n
    allCombs xs n =
      [1..n] >>= \n -> mapM (const xs) [1..n]

isValid =
  (>=3) . minimumIf . map length . filter (/= []) . splitOn [B]
  where minimumIf x = if null x then 0 else minimum x

The output is this:


The program finds 116 valid rows for an \(11 \times 11\), out of 2048. Here’s what they look like. 797 out of 32768 rows (about 2%) are actually valid for n=15. Looking this up in OEIS, we find that there’s a recurrence relation for this sequence, which could be handy if we wanted to compute this number for much larger grids:

\[ a_n = 2 a_{n-1} - a_{n-2} + a_{n-4} + 1 \text{ where } a_1 = 0, a_2 = 0, a_3 = 1, a_4 = 3 \]

These are the sixteen possible rows the program generates for a \(7 \times 7\). However, we already know many of these are invalid when not on an outside edge, because of the symmetry constraint. These kinds of interferences occur on larger boards too. So, we know that the number of grids for \(n=15\) is quite a bit less than 2% of 10 decillion.

Unfortunately, this was about where I ran out of steam in finding an exact solution, and started searching around for an answer. I did find this FiveThirtyEight Riddler, which asks essentially the same question, but with a few different constraints. The problem does seem to be solved in this twitter thread, but I’m surprised that this wasn’t more widely known! Maybe my searching wasn’t thorough enough.

Thinking about this puzzle for a while made me more interested in learning about combinatorial methods in general, to be able to solve this kind of problem more easily. I’m still glad I went through the process—it was good problem-solving practice with a relatively unknown domain. As a bonus, it even came packaged with a conveniently pretty visualization. If nothing else, I managed to encounter and solve some fun subproblems along the way.