Mitchell Vitez

Yep.

Introduction

It’s widely believed that the digits of pi are “random”. I personally think this seems likely to be true. I suspect if you go far enough into the digits, you’ll find your social security number, encodings of entire works of Shakespeare, etc., by pure chance.

However, it hasn’t actually been proven that pi is “normal”—having an even distribution of all its digits. It might be true that after a quadrillion digits, pi stops having any 0 digits for some reason.

It’d be infeasible to find an encoding of Hamlet since it’s so long (and therefore so unlikely to show up by random chance), but we can verify that every SSN exists in the digits of pi by actually locating them.

Check out the source code

Legalistic preamble

For security, never post your SSN into an unknown website. Even for purposes as cool as locating it in the digits of pi.

Reminder that using someone else’s SSN can constitute aggravated identity theft, a federal crime with a minimum prison sentence of two years. Just because you’ve found a number doesn’t mean you can use it.

Bad robot metaphysics

Before embarking on my quest, I decided to have a chat with an LLM for some background information. I was especially interested to find whether anyone had done this before (it didn’t know).

If we actually locate every SSN in the digits of pi, we’ll have shown that all SSNs are in the digits of pi. In math, this is called a “constructive proof”. If you can produce something, then it exists. Sadly, my stochastic friend didn’t agree:

This is one of the worst arguments I’ve seen in a long time, from any source.

I suppose this was foreshadowing my upcoming search for meaning in the randomness.

1 billion digits

This is the kind of problem where dealing with a billion items is table stakes.

My initial idea was to put all the SSNs in a set, search through the digits of pi in 9-digit windows, and remove found SSNs from the set. Once the set is empty, the search is over.

SSN Structure

We could search pi for every 9-digit number and call it a day. However, SSNs have a little bit of structure beyond “9 digit number” that we can take advantage of.¹

An SSN is broken into three parts: a 3-digit area, 2-digit group, and 4-digit serial number.

123-45-6789
─┬─ ┬─ ─┬──
 │  │   └─ serial
 │  └───── group
 └──────── area

None of the area, group, or serial numbers is allowed to be all 0s. However, they can have leading 0s. 001 is okay, but 000 is not.

Because Americans fear Satan, the area cannot be 666.

Finally, no SSN can be in the range 900-01-0001 to 999-99-9999. That is, no SSN can start with a 9. This by itself eliminates about 10% of the work we have to do.

Building the set of all SSNs

HashSet<String> is an almost ridiculously straightforward representation for the set of all SSNs.

let mut ssns: HashSet<String> = HashSet::new();

Following the SSN structure rules above, we can insert every possible SSN into this set.

for area in 1..=899 {
    if area == 666 {
        continue
    }
    for group in 1..=99 {
        for serial in 1..=9999 {
            ssns.insert(
                format!("{:03}{:02}{:04}", area, group, serial)
            );
        }
    }
}

This generates 888_931_098 SSNs, so we’ve saved about 11% of the work compared to checking every 9-digit string.

On my machine, this giant HashSet took 64 seconds to construct. Dismal!

String inefficiencies

A String uses 24 bytes of overhead—8 bytes for each of a pointer to the character data, a length, and a capacity.

size_of::<String>() // 24 bytes

By default, a String that can store 9 digits allocates 16 bytes of capacity (even though 9 bytes would be enough).

let ssn: String = format!("{:03}{:02}{:04}", 123, 45, 6789);
ssn.capacity() // 16 bytes

Representing each SSN as a String uses 40 total bytes and a pointer indirection to the heap. We’ll massively improve upon this representation below.

Loading the digits

Rather than calculate them myself, I used the website calculat.io as a handy way to download the first billion digits of pi. The unzipped .txt file is a manageable 1GB—a billion digits at one byte per digit.

Again, we use the most straightforward method possible, reading the entire file into a big String in RAM. These days, 1GB easily fits in most machines’ RAM. However, this still needs to do allocation and copying, which is slow.

fs::read_to_string(&file_path)

Statistically, what do we expect?

As a last step before running the code, let’s estimate the result we expect to see.

Let’s assume the digits of pi are uniformly randomly distributed.

Let \(F\) be the event “found a specific SSN in the digits we searched”. Because finding a specific SSN is unlikely and independent² from finding any others, we’d expect \(F\) to follow a Poisson distribution. This would mean the probability that \(F\) happens \(k\) times is given by:

\[P(F = k) = \frac{\lambda^k e^{-\lambda}}{k!}\]

The probability of finding a specific SSN at least once is the complement of the probability of never finding it at all:

\[P(F \geq 1) = 1 - P(F = 0)\]

Plugging in \(k=0\) to the Poisson distribution formula, we get this reduction, where many terms cancel out:

\[P(F = 0) = \frac{\lambda^0 e^{-\lambda}}{0!} = e^{-\lambda}\]

Take the complement:

\[P(F \geq 1) = 1 - e^{-\lambda}\]

\(\lambda\) is the number of events we expect in a given interval. There is a \(\frac{1}{10^9}\) chance that a random 9-digit string matches a specific SSN. If we search through \(N\) digits of pi, there are \(N - 8\) windows of 9 digits available. This is because we can start a window at each digit except the last 8, where there isn’t enough room for a full window.

\[\lambda = \frac{N - 8}{10^9}\]

Substituting in \(\lambda\), we get this reusable formula.

\[\boxed{P(F \geq 1)_N = 1 - e^{- \frac{N - 8}{10^9}}}\]

Plugging in any number of digits \(N\) will give us the probability of finding a given SSN in the first \(N\) digits of pi.

\[P(F \geq 1) _{N=10^9} = 1 - e^{- \frac{10^9 - 8}{10^9}} \approx 63.2\%\]

Output

The HashSet<String> version of the program takes several minutes to run. It’s single-threaded and uses an inefficient data structure. We’ll keep building our way to a much faster solution, but for now, here’s the output of the final version of the SSN-finding program, on a billion digits of pi.

[  0.00s] Generating the set of all valid SSNs
[  0.13s] Done generating the set of all 888_931_098 valid SSNs
[  0.13s] Memory-mapping pi digits file
[  0.13s] Finding SSN matches in file with 1_000_000_000 digits
[  0.87s] Done searching for matches
[  0.87s] Total SSNs = 888_931_098, found = 561_910_510, missing = 327_020_588
[  1.15s] More than 1_000_000 SSNs missing, skipping file write
[  1.15s] Done

10 billion digits

Avoiding String

String is not an efficient datatype for representing SSNs. A simple numeric type would be much more performant.

Because \(2^{16} < 10^{9} < 2^{32}\), we should use a u32 to represent each SSN. Each u32 is 4 bytes, a 10x improvement from the 40-byte Strings we had earlier. We can convert from a slice of bytes to a u32 with this function:

fn digits_to_u32(s: &[u8]) -> u32 {
    let mut n = 0u32;
    for &b in s {
        n = n * 10 + (b - b'0') as u32;
    }
    n
}

Of course, we can always convert back to String if we need to.

fn u32_to_digits(n: u32) -> String {
    let area = n / 1_000_000;
    let group = (n / 10_000) % 100;
    let serial = n % 10_000;
    format!("{:03}{:02}{:04}", area, group, serial)
}

Memory mapping

For efficient loading of a 10GB digits file, we can memory map it. This takes barely any extra effort compared to loading a file into a String.

let mmap = unsafe {
    Mmap::map(&file).expect("Failed to mmap file")
};

Multithreading

For 10 billion digits, we should move off a single thread. par_windows lets us do the exact thing we want—examine each 9-byte slice of a file as parallel overlapping windows. Again, surprisingly small code changes can have a big impact on the low-hanging fruit of performance.

The single-byte check window[0] != b'9' eliminates about 10% of the work, at the cost of affecting branch prediction. I found this to be worthwhile, but your profiler may vary.

mmap.par_windows(9).for_each(|window| {
    if window[0] != b'9' {
        let potential_ssn = digits_to_u32(window);
        ssns.remove(&potential_ssn);
    }
});

Initially, there wasn’t much speedup from making this program multithreaded. This was because the HashSet<u32> required a lock on every change, meaning all threads were contending for one lock. DashSet is a sharded HashSet, which only requires locking a portion of the overall set when making changes to it. This easy data structure change unlocked a lot of parallel performance.

A bigger data structure change

The program was still fairly slow, with a 3.5 minute runtime. Often when things are slow, it means we should think about switching to a new data structure.

How much information do we really need to store about each SSN? Just whether it’s been found or not. That’s a single bit.

We can represent a BitSet by using a Vec<AtomicU64> with 900 million total bits (SSNs upwards of 899-99-9999 are invalid). Each bit represents whether the SSN at that index has been found in the digits of pi (1) or not (0). Setting a bit means locating its AtomicU64 container, then updating a single bit in that u64 using bit arithmetic.

900 million bits is about 113 MB—extremely reasonable for a modern computer to handle. Building the set of valid SSNs was much faster now, taking under 2 seconds instead of over 60. The main loop of the program remained largely unchanged, except that instead of removing SSNs from the set as we went along, we would now collect all the 0 bits at the end of processing, to determine which SSNs were still missing.

A BitSet also buys us much finer granularity for parallelism than the DashMap did. There are about 14 million separate AtomicU64s, meaning contention for each one from a pool of 12 threads is pretty unlikely.

Using BitSet brought the runtime for a 10 billion digit search down to 12 seconds.

Probabilistic estimate

Using our formula from above…

\[\boxed{P(F \geq 1)_N = 1 - e^{- \frac{N - 8}{10^9}}}\]

…we can estimate the probability of finding a given SSN in the first 10 billion digits of pi to be:

\[P(F \geq 1)_{N=10^{10}} = 1 - e^{- \frac{10^{10} - 8}{10^9}} \approx99.995\%\]

Output

Searching 10 billion digits leads to only 40_258 missing SSNs. Out of 889 million, that’s a tiny fraction. However, it does mean we’re not done yet.

[  0.00s] Generating the set of all valid SSNs
[  0.13s] Done generating the set of all 888_931_098 valid SSNs
[  0.13s] Memory-mapping pi digits file
[  0.13s] Finding SSN matches in file with 10_000_000_000 digits
[  4.28s] Done searching for matches
[  4.28s] Total SSNs = 888_931_098, found = 888_890_840, missing = 40_258
[  4.28s] Writing missing SSNs to missing.txt
[  4.29s] Wrote missing SSNs to missing.txt
[  4.29s] Done

100 billion digits

Strife of Pi

For 1 billion and even 10 billion digits, I was able to find and directly download .txt files full of decimal digits from the internet. However, in the 100 billion digit range this got tricky.

There is a Google website with 100 trillion digits of pi (in 100 billion digit chunks), in both decimal and hex formats. However, it’s using the .ycd file format, which I wasn’t familiar with. We’d expect a 100GB file, but the .ycd was 39GB, indicating some form of compression.

The .ycd format comes from y-cruncher, a program I’d never heard of before. It only runs on Windows and Linux, and I had been writing my code on a Mac. Luckily, I have a Windows PC, so I was able to download the file there and convert a 39GB .ycd to a 100GB .txt file using y-cruncher.

I then transferred the 100GB file to my laptop using LocalSend, which took over an hour. I used the time to watch a movie.

Rolling windows

Parallel windows are nice, but they’re not taking advantage of useful information: most of each window overlaps with the next window. Each 9-digit window has 8 digits of reused information (marked R):

789012345678901234
----123456789-----
-----234567890----
     RRRRRRRR

With a little bit of math, we can peel off the old leftmost digit by subtracting away that digit times \(10^8\), multiply the whole number by 10, and add the new rightmost digit to get a brand new number. There’s no need to re-parse or recalculate any of the 8 digits in the middle. We fully parse the first window of every chunk of digits, then only do partial parsing until reaching the end of that chunk.

Parallelizing BitSet generation

Another place ripe for some optimization at this point was the initial BitSet population. It was taking around 1.6 seconds, and an extremely straightforward parallelization over each area got it down to 0.13s.

(1..=899u32)
    .into_par_iter()
    .map(|area| {
        // set every valid SSN in this `area` as missing
    });

Inner loops

Some of my favorite optimizations were the last ones I did, going deep into the inner loop of the main part of the program and trying to make it as fast as I could. I’m not an optimization expert, so surely there are speedups I missed, but this was still incredibly fun.

As an example optimization, ssn / 64 can be rewritten as ssn >> 6 and bit shifting is much faster than division. The Rust compiler will often spot these things for you, so it pays to always profile with and without a change in place, or to read the generated assembly.

The innermost part of the main loop was a call to the function below. Marking it #[inline(always)] wasn’t necessary—the compiler did it anyways—but I think it’s still a nice indication for humans reading the code. Instead of bounds-checked slice accesses, it uses pointer math. Finally, after profiling both the branching (current) version and a branchless version, the speedup from a load and checking (current & bit) == 0 outweighed the branch penalty and increase in fetch_or calls.

// this should be heavily optimized, since it's in the tightest loop
#[inline(always)]
fn mark_ssn_as_found(bitset_ptr: *const AtomicU64, ssn: usize) {
    let word = ssn >> 6;
    let bit = 1u64 << (ssn & 63);
    // SAFETY: using pointers for speedy unchecked access,
    // but still loading from the same range as before
    unsafe {
        let current = (*bitset_ptr.add(word)).load(Ordering::Relaxed);
        if (current & bit) == 0 {
            (*bitset_ptr.add(word)).fetch_or(bit, Ordering::Relaxed);
        }
    }
}

I found it interesting to poke around the assembly as a way to try to spot missing optimizations. cargo asm makes this really easy. The final output of cargo asm ssns_in_pi::mark_ssn_as_found was:

ssns_in_pi::mark_ssn_as_found:
 lsr     x8, x1, #6
 mov     w9, #1
 lsl     x9, x9, x1
 ldr     x10, [x0, x8, lsl, #3]
 tst     x10, x9
 b.eq    LBB9_2
 ret
LBB9_2:
 add     x8, x0, x8, lsl, #3
 ldset   x9, x8, [x8]
 ret

I’m nowhere near an expert on optimization (or even reading assembly), so if you spot something obvious that would speed this program up, I’d welcome feedback!

Probabilistic estimate

The probability of finding a given SSN in the first 100 billion digits of pi is:

\[P(F \geq 1)_{N=10^{11}} = 1 - e^{- \frac{10^{11} - 8}{10^9}} \approx 100\%\]

The decimal expansion isn’t quite 1, but it’s close.

0.9999999999999999999999999999999999999999999627992...

Output

Finally, the quest to ensure every SSN exists in the digits of pi was complete.

[  0.00s] Generating the set of all valid SSNs
[  0.13s] Done generating the set of all 888_931_098 valid SSNs
[  0.13s] Memory-mapping pi digits file
[  0.13s] Finding SSN matches in file with 100_000_000_000 digits
[ 36.89s] Done searching for matches
[ 36.89s] Total SSNs = 888_931_098, found = 888_931_098, missing = 0
[ 36.90s] Done

20 billion digits

Of course, searching 100 billion digits could be overkill. Our program doesn’t stop its search just because it’s found all the possible SSNs. If we search the first 20 billion digits, there are only three missing SSNs.

[  0.00s] Generating the set of all valid SSNs
[  0.13s] Done generating the set of all 888_931_098 valid SSNs
[  0.13s] Memory-mapping pi digits file
[  0.13s] Finding SSN matches in file with 20_000_000_000 digits
[  7.86s] Done searching for matches
[  7.86s] Total SSNs = 888_931_098, found = 888_931_095, missing = 3
[  7.86s] Writing missing SSNs to missing.txt
[  7.86s] Wrote missing SSNs to missing.txt
[  7.86s] Done

I’m not going to post specific SSNs, but we can use a verification site to check whether each is real. According to that check, the last three SSNs in pi were:

issued in Illinois between 1936 and 1950
never issued (prior to 2011)
issued in Pennsylvania between 1971 and 1972

The last SSN in pi

US life expectancy is 79 years,³ and 1971 was only 55 years ago. There’s probably someone still alive, perhaps somewhere in Pennsylvania, with a wildly remote SSN, not located until searching 22.4 billion digits into pi.

I wonder if they know.

The Social Security Administration actually changed the rules in 2011 to be much more “random”, but since my SSN is from before then, I’m sticking with the structure I know and love.↩︎
These aren’t truly independent events, since each window of 9 digits partially overlaps 8 other windows. We’ll ignore this for simplicity. We actually expect the Poisson distribution’s prediction to be an overestimate.↩︎
The National Center for Health Statistics puts it at 79.0 years.↩︎

Is every Social Security Number in the digits of pi?

Introduction

Legalistic preamble

Bad robot metaphysics

1 billion digits

SSN Structure

Building the set of all SSNs

String inefficiencies

Loading the digits

Statistically, what do we expect?

Output

10 billion digits

Avoiding String

Memory mapping

Multithreading

A bigger data structure change

Probabilistic estimate

Output

100 billion digits

Strife of Pi

Rolling windows

Parallelizing BitSet generation

Inner loops

Probabilistic estimate

Output

20 billion digits

The last SSN in pi