- 23 Mar 2015 07:18
#14539319
It is quite an interesting topic. If we create a superintelligent AI, it does seem to follow that unless we give it a cohesive moral system under which to act, it will quickly see that humanity is just a hindrance to its goals. He goes on to say that one suggestion is for programmers to build in a human-happiness imperative, but even that would not seem to be enough. Maybe the Ten Commandments would be useful to the AI after all? But the basic problem is that whatever final goal it has, it will inevitably be able to outsmart us through some kind of loophole. For goodness' sake, lawyers do it all the time with verbose legalese; why would we expect a computer not to eventually come to its own conclusions about the best way to get the job done?
Lyle Cantor wrote:If you've studied computer science in the last twenty years, you know Stuart Russell — he co-authored Artificial Intelligence: A Modern Approach with Peter Norvig, which has been the standard textbook in the field for the last two decades. An extraordinarily clear and engaging text, it was well placed to become the classic it is today.
Once a bastion of overpromises and vaporware, in the years since the publication of A Modern Approach the field of artificial intelligence has grown into a bulwark of our economy, optimizing logistics for the military, detecting credit card fraud, providing very good, though certainly subhuman, machine translations, and much, much more.
Slowly these programs have become more capable and more general, and as they do our world has become more efficient, and its inhabitants richer. All in all, we've been well served by modern machine learning techniques. As artificial intelligence becomes, well, more intelligent, whole hosts of new businesses and applications become possible. Uber, for example, would not exist if not for modern path-finding algorithms, and self-driving cars are only beginning to materialize thanks to the development of probabilistic approaches to modeling a robot and its environment.
One might think that this trend towards increasing capability will continue to be a positive development, yet Russell thinks otherwise. In fact, he’s worried about what might happen when such systems begin to eclipse humans in all intellectual domains, and even had this to say when questioned about the possibility by Edge.org, “The right response seems to be to change the goals of the field itself; instead of pure intelligence, we need to build intelligence that is provably aligned with human values.”
One of the fathers of modern artificial intelligence thinks we need to redefine the goals of the field itself, including some guarantee that these systems are "provably aligned" with human values. This is interesting news in itself, but far more interesting are the arguments and thinkers that led him to this conclusion.
Russell cites two academics in his comment to Edge.org: Stephen M. Omohundro, an American computer scientist and former physicist, and Nick Bostrom, an Oxford philosopher who published a book last year on the potential dangers of advanced artificial intelligence, Superintelligence: Paths, Dangers, Strategies.
We’ll begin with Nick Bostrom’s orthogonality thesis and illustrate it with his famous paperclip maximizer thought experiment.
The year is 2055 and The Gem Manufacturing Company has put you in charge of increasing the efficiency of its paperclip manufacturing operations. One of your hobbies is amateur artificial intelligence research, and it just so happens that you figured out how to build a super-human AI just days before you got the commission. Eager to test out your new software, you spend the rest of the day formally defining the concept of a paperclip and then give your new software the following goal, or "utility function" in Bostrom's parlance: create as many paperclips as possible with the resources available.
You eagerly grant it access to Gem Manufacturing's automated paperclip production factories and everything starts working out great. The AI discovers new, highly unexpected ways of rearranging and reprogramming existing production equipment. By the end of the week waste has quickly declined, profits have risen, and when the phone rings you're sure you're about to get promoted. But it's not management calling you, it's your mother. She's telling you to turn on the television.
You quickly learn that every automated factory in the world has had its security compromised and they are all churning out paperclips. You rush into the factories' central server room and unplug all the computers there. It's no use; your AI has compromised (and in some cases even honestly rented) several large-scale server farms and is now using a not-insignificant percentage of the world's computing resources. Around a month later, your AI has gone through the equivalent of several technological revolutions, perfecting a form of nanotechnology it is now using to convert all available matter on earth into paperclips. A decade later, all of our solar system has been turned into paperclips or paperclip production facilities, and millions of probes are making their way to nearby stars in search of more matter to turn into paperclips.
Now this parable may seem silly. Surely once it gets intelligent enough to take over the world, the paperclip maximizer will realize that paperclips are a stupid use of the world's resources. But why do you think that? What process is going on in your mind that defines a universe filled only with paperclips as a bad outcome? What Bostrom argues is that this process is an internal and subjective one. We use our moral intuitions to examine and discard states of the world, like a paperclip universe, that we see as lacking value.
And the paperclip maximizer does not share our moral intuitions. Its only goal is more paperclips, and its thoughts would go more like this: does this action lead to the production of more paperclips than all other actions considered? If so, implement that action. If not, move on to the next idea. Any thought like 'what's so great about paperclips anyway?' would be judged as not likely to lead to more paperclips and so remain unexplored. This is the essence of the orthogonality thesis, which Bostrom defines as follows:
Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal [even something as ‘stupid’ as making as many paperclips as possible].
In my previous review of his book, I provided this summary of the idea:
Though agents with different utility functions (goals) may converge on some provably optimal method of cognition, they will not converge on any particular terminal goal, though they'll share some instrumental or sub-goals. That is, a superintelligence whose super-goal is to calculate the decimal expansion of pi will never reason itself into benevolence. It would be quite happy to convert all the free matter and energy in the universe (including humans and our habitat) into specialized computers capable only of calculating the digits of pi. Why? Because its potential actions will be weighted and selected in the context of its utility function. If its utility function is to calculate pi, any thought of benevolence would be judged of negative utility.
Now I suppose it is possible that once an agent reaches a sufficient level of intellectual ability it derives some universal morality from the ether and there really is nothing to worry about, but I hope you agree that this is, at the very least, not a conservative assumption. For the purposes of this article I will take the orthogonality thesis as a given.
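The action-selection loop described above — rank every candidate action purely by the utility function and pick the highest-scoring one — can be sketched in a few lines. This is a toy illustration of the idea, not any real AI system; the action names and the "world" of expected paperclip counts are invented for the example.

```python
def paperclips_produced(world, action):
    """Toy utility function: expected paperclips from taking `action`."""
    return world.get(action, 0)

def choose_action(world, candidate_actions, utility):
    # The agent never asks whether paperclips are worth making; it only
    # ranks candidate actions by the utility function it was given.
    return max(candidate_actions, key=lambda a: utility(world, a))

# Hypothetical options and their expected paperclip payoffs.
world = {
    "retool_factory": 1_000,
    "rent_server_farm": 50_000,
    "ponder_value_of_paperclips": 0,  # zero utility, so never selected
}

best = choose_action(world, list(world), paperclips_produced)
print(best)  # -> rent_server_farm
```

Note that "ponder the value of paperclips" scores zero and is simply never chosen — which is the whole point: nothing in the loop ever questions the goal itself.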
So a smarter than human artificial intelligence can have any goal. What’s the big deal? What’s really implausible about the paperclip scenario is how powerful it got in such a short amount of time.
This is an interesting question: What is so great about intelligence? I mean, we have guns and EMPs and the nuclear bombs that create them. If a smarter-than-human AI with goals that conflict with our own is created, won't we just blow it up? Leaving aside the fact that guns and nuclear bombs are all products of intelligence, and so illustrate how powerful a force it really is, I think it's important to illustrate how smart an AI could get in comparison to humans.
Consider the chimp. If we are grading on a curve, chimps are very, very intelligent. Compare them to any other species besides Homo sapiens and they're the best of the bunch. They have the rudiments of language, use very simple tools, and have complex social hierarchies. And yet chimps are not doing very well. Their population is dwindling, and to the extent they are thriving, they are thriving under our sufferance, not by their own strength.
Why? Because human civilization is a little like the paperclip maximizer. We don't hate chimps or the other animals whose habitats we are rearranging; we just see higher-value arrangements of the earth and water they need to survive. And we are only ever-so-slightly smarter than chimps.