My Superintelligence Can Beat Up Your Superintelligence

The year is ????. Humanity figured out how to align a superintelligent AI with its own highest values, and switched it on. Everything went pretty well.

The AI spent its time mostly improving life here on Earth (since that has a very high chance of aligning with human values), but also (in typical AI fashion) sent out a few probes just to gather information about the rest of the universe, in case there was some way to improve humanity’s lot. Maybe spreading to other galaxies, or discovering an advanced civilization to exchange some ideas with. Maybe it could discover a less-advanced species, so the humans could finally have a sample size of more than one when it came to understanding how life works in this universe. Something like that.

What if our AI discovered another superintelligent being out there, working on its goal of improving its own creator species’ lot in the universe? Some people might say this suggests a Great Filter that comes after superintelligence, which would be a scary thing indeed. If there are so many species who make it all the way to superintelligence-generating level (at least two!), why haven’t we been contacted yet? Even if they didn’t want to contact a lesser species, they might have reason to provide us a few hints along the way, to stop us from ever destroying the universe they share with us. (Maybe you believe the evidence matches this pattern, and we’ve been getting extraterrestrial hints all along….)

Regardless of whether an as-yet-unknown filter exists beyond that stage, we’d have more pressing problems to deal with. (Or, more appropriately, problems that our superintelligent proxy would deal with on our behalf.) How would it determine whether a fellow superintelligence is friendly? What does it do if the other superintelligence’s values don’t align with its own? Is a first strike warranted against an agent with goals sufficiently evil in the eyes of our own benefactor?

The scariest version of this is probably running into an external superintelligence with a goal so foreign to human interests that our AI’s only option is to try to shut down the other AI. While the usual example of a “paperclip maximizer” does a good job of demonstrating how foreign the goals of certain intelligences can seem to humans, it’s not the worst possible outcome by a long shot. Imagine a species whose main goal is to colonize the universe. (This could very well be our own species as soon as we glimpse the power that would allow us to do so.) If they successfully align an AI with that interest, its goals would probably lead it to want to overcome our benefactor in order to conquer more universal territory.

There are far worse possibilities if you’d like to spend a few minutes being creative….

What if the other AI wasn’t properly aligned with its founding species’ interests though? The species could have been killed off, but that doesn’t make their agent go away. It could have any of a huge range of goals, and somehow our own AI would have to deal with that, whether by cooperating with or destroying an agent just as powerful as (or even more powerful than) itself.

Perhaps aligning AI is so hard, but life intelligent enough to spawn it so common, that eventually the universe turns into a battleground of massively powerful beings with entirely separate goals, each duking it out for whatever misconceived idea their respective creator species thought would make for a good AI.

As a lowly human, I have no idea what would actually happen in such a situation, but it’s kind of fun to think about, in the same way it’s fun to think about superheroes matching up against each other. While the overwhelming power such agents would have isn’t directly attainable for someone like me, I can still speculate (as wildly as needed) and hope that it all eventually works out in humanity’s best interest.