“Genetic algorithms” are one of those things that sound a lot harder and scarier than they actually are once you start playing with them. Of course, there are ways to make them extremely complicated, but the basics of an algorithm that kinda-sorta acts like genetics are really quite easy to understand.
For those following along at home, all the code is in this gist. I also tried something new and livestreamed the creation of this mini-project on twitch.
Our algorithm is pretty simple. It’s not a fantastic representation of any real-world population, or even of real-world biological principles, but it’s pretty good as an outline of how genetic algorithms can be constructed. The steps are:
Phenotypes are what the expressed genes cause the organism to look like/behave like/etc. In this case, our phenotype is a 5 by 20 grid of ASCII characters, limited to a range of human-readable characters. Here, our phenotype is also essentially the same thing as our genotype—there’s a one-to-one correspondence between the two.
The initial population is just a grid of entirely random characters. Here’s an example organism:
V^Jx>\_N(4XtxDT%Y,/r
Ao/9|A+4Nx3nAd#kFYWy
=aZFzA4Z]]A`)5uzZOb4
FWaANR\EXy\e"6AZ[@X7
e[LA@QvVU0ABAA2P|Aw7
Our fitness score is extremely simple—just count of the number of
A
s the organism has. We can then sort all the organisms in
our population by their fitness. Our selection pressure is a little
weird and nonbiological—we take the two top-ranked individuals and have
them form the basis for the entire next generation.
Crossover occurs when parents combine their genes. Because we only have two parents for the entire generation, the crossover algorithm isn’t that complicated. We simply randomly select a parent’s gene for each slot in the chromosome.
However, there’s a slight twist! Every time we select a gene, there’s a small chance that a randomly-chosen gene gets inserted instead, simulating the process of random mutation. In the long run, this allows populations to slowly acquire new genes, and if they happen to increase fitness, distribute them to new generations through crossover.
Here’s an example run, except that I’ve only sampled every 10
generations so changes appear faster. The pressure applied is to
maximize the number of A
s in the phenotype. The fitness
score is the percentage of the genes which are A
as
well.
*** Generation 1 ***
"L9x>V_N%4otjDT%YD3r
Io\9cu+D#b3nmd;kFYW%
=aZGzA4Z]60/w#uzGf"n
FWaANzsE1y\e"`?R[]X7
l[LAKQ0,JWmh@s2P|A@a
Fitness: 6%
*** Generation 10 ***
V^JA^\_N(4XQxDT%YA/r
Ao/]AA+4#A3nAd#AFYAy
=aZFzA4AANA`)5uzZSb4
FWaANR\EXyAe"6AA[@X7
l[LAKQvV70ABAA2P|Aw7
Fitness: 22%
*** Generation 20 ***
V^JA^A_NA4XQAATFAA/r
AA/]AA+4#A3nAd#AAaAy
=aZFUA[AANAA)muzZA94
FWaANR\EAAAeA6AAA@X7
lALAKQAVA0ABAA2P|Aw7
Fitness: 38%
*** Generation 30 ***
V^JA^A_NAAAAAA7FAA/A
AA/]A3+4#A3*Ad#AAaAy
=aZF6AAAANAA)muAZAAA
F#aANRAEAAAAA6AAA@X7
lALAKQAVA0AJAA2PAAw'
Fitness: 50%
*** Generation 40 ***
V^JAAAANAAAAAA7FAA/A
AAAKAA+WAA32Av#AAaAy
AaZF6AAAAAAA)muAAAAA
FAAAARAfAAAAAAAAA@s7
lALAKQAAAAAJAA2PAAh'
Fitness: 62%
*** Generation 50 ***
V^JAAAANAAAAAAAFAA/A
AAA^AA+4AA32AAAAAAAA
AaZA6AA1AAAA)mAAAAAA
FAAAARAfAAAAAAAAA[s7
lALAKQAAAAAJAAAPAAh'
Fitness: 71%