In both the logical-planning and rational-agent views of AI, the machine’s objective—whether in the form of a goal, a utility function, or a reward function (as in reinforcement learning)—is specified exogenously.
In Wiener’s words, this is “the purpose put into the machine.” Indeed, it has been one of the tenets of the field that AI systems should be general purpose—i.e., capable of accepting a purpose as input and then achieving it—rather than special purpose, with their goal implicit in their design. For example, a self-driving car should accept a destination as input instead of having one fixed destination. However, some aspects of the car’s “driving purpose” are fixed, such as that it shouldn’t hit pedestrians. This is built directly into the car’s steering algorithms rather than being explicit: No self-driving car in existence today “knows” that pedestrians prefer not to be run over.
Putting a purpose into a machine that optimizes its behavior according to clearly defined algorithms seems an admirable approach to ensuring that the machine’s “conduct will be carried out on principles acceptable to us!” But, as Wiener warns, we need to put in the right purpose.
We might call this the King Midas problem: Midas got exactly what he asked for—namely, that everything he touched would turn to gold—but too late he discovered the drawbacks of drinking liquid gold and eating solid gold. The technical term for putting in the right purpose is value alignment. When it fails, we may inadvertently imbue machines with objectives counter to our own.
Tasked with finding a cure for cancer as fast as possible, an AI system might elect to use the entire human population as guinea pigs for its experiments. Asked to de-acidify the oceans, it might use up all the oxygen in the atmosphere as a side effect.
This is a common characteristic of systems that optimize: Variables not included in the objective may be set to extreme values to help optimize that objective.
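This effect can be illustrated with a minimal sketch (all names here are hypothetical, invented for illustration): an optimizer scored only on one variable will happily drive a second, unmodeled variable to its limit, because nothing in the objective penalizes doing so.

```python
# Minimal sketch (hypothetical names): an optimizer told to maximize
# "cure_speed" alone, with "test_subjects" left out of the objective,
# drives the unmodeled variable to its extreme.

def objective(cure_speed, test_subjects):
    # Only cure_speed is rewarded; test_subjects is invisible to the score.
    return cure_speed

def optimize(candidates):
    # Naive exhaustive search over candidate (cure_speed, test_subjects) pairs.
    return max(candidates, key=lambda c: objective(*c))

# Suppose faster cures require more test subjects -- a side effect
# the objective never mentions.
candidates = [(speed, speed * 1000) for speed in range(1, 101)]

best = optimize(candidates)
print(best)  # -> (100, 100000): the optimum uses the most test subjects
```

The point is not the toy search procedure but the structure: any variable omitted from the objective is free to take whatever value most benefits the variables that are included.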
—Stuart Russell, in Possible Minds, edited by John Brockman