Monday, February 4, 2008
The Basic AI Drives by Stephen M. Omohundro
Below is a critique of “The Basic AI Drives” by Stephen M. Omohundro. This topic was originally posted on Michael Anissimov's blog Accelerating Future. Thank you Michael for bringing this article to my attention and suggesting I post comments about it on your blog. The author's response to the critique was a delightful surprise! Thanks Steve, your feedback was very helpful and clarifying!
In an earlier paper [1] we used von Neumann’s mathematical theory of microeconomics to analyze the likely behavior of any sufficiently advanced artificial intelligence (AI) system. This paper presents those arguments in a more intuitive and succinct way and expands on some of the ramifications.
Oh dear ;/
Many actions which would commonly be described as irrational (such as going into a fit of anger) may be perfectly rational in this economic sense.
Anger can be considered rational if it is based on reason and that also comes from somewhere else. The previous sentence could be considered an irrational statement for not considering the previous sentence that came before it. To look at the sentence I previously mentioned in order to look at the sentence it mentions would be considered rational according to the intent of these sentences, but it may also be thought as something rather irrational to do. I claim the idea of rationality is silly in more detail below, as the points in the previous two sentences attempted to explain.
With more computational resources it will be better able to do the computations to approximate the choice of the expected utility maximizing action. If a system loses resources, it will of necessity also become less rational.
Utility comes from reason and reason comes from biology and not intelligence itself, which seems implied in the essay. Rationality is relative and a null point. These topics are described further below.
The third situation where utility changes may be desirable can arise in game theoretic contexts where the agent wants to make its threats credible. It may be able to create a better outcome by changing its utility function and then revealing it to an opponent. For example, it might add a term which encourages revenge even if it is costly. If the opponent can be convinced that this term is present, it may be deterred from attacking. For this strategy to be effective, the agent’s revelation of its utility must be believable to the opponent and that requirement introduces additional complexities. Here again the change is desirable because the physical embodiment of the utility function is important as it is observed by the opponent.
Yes, but I think the agent must be from what has created this concept of game theory, the human or some other biologically oriented thing that survives due to a game’s premise. Feelings are involved in games. There is no reason to play if one were not to feel in a game environment. An AGI will play because its instructions are designed to play, but it will see other options outside the realms of game theory. I do not see how an intelligent system will arbitrarily use game theory as an advantage if it has no biological basis for viewing users as opponents, making threats to get things, and so on. I describe further what I mean by this below.
Human behavior is quite rational in the pursuit of survival and replication in situations like those that were common during our evolutionary history. However we can be quite irrational in other situations. Both psychology and economics have extensive subdisciplines focused on the study of human irrationality [9,10]. Irrationalities give rise to vulnerabilities
Oh dear ;/
An important class of vulnerabilities arises when the subsystems for measuring utility become corrupted. Human pleasure may be thought of as the experiential correlate of an assessment of high utility. But pleasure is mediated by neurochemicals and these are subject to manipulation. At a recent discussion session I ran on designing our future, one of the biggest fears of many participants was that we would become “wireheads”. This term refers to experiments in which rats were given the ability to directly stimulate their pleasure centers by pushing a lever. The rats pushed the lever until they died, ignoring even food or sex for it. Today’s crack addicts have a similar relentless drive toward their drug. As we more fully understand the human cognitive architecture we will undoubtedly be able to create drugs or design electrical stimulation that will produce the experience of pleasure far more effectively than anything that exists today. Will these not become the ultimate addictive substances leading to the destruction of human society?
That describes a profound conundrum of how technology has (such as crack) and will (such as evercrack, post-EverQuest) be used for biological whims, devoiding emotion from imperative.
Free market forces then drive corporations and popular culture to specifically try to create situations that will trigger irrational human behavior because it is extremely profitable. The current social ills related to alcohol, pornography, cigarettes, drug addiction, obesity, diet related disease, television addiction, gambling, prostitution, video game addiction, and various financial bubbles may all be seen as having arisen in this way. There is even a “Sin” mutual fund which specifically invests in companies that exploit human irrationalities. So, unfortunately, these forces tend to create societies in which we spend much of our time outside of our domain of rational competence.
I have trouble with how Stephen defines rationality. The statements show an example of what he would consider a rational (I’m hard pressed not to put the word in “quotes”) system supporting so called irrational behavior. There’s more on this rationality business below.
From a broader perspective, this human tragedy can be viewed as part of the process by which we are becoming more fully rational. Predators and competitors seek out our vulnerabilities and in response we have to ultimately eliminate those vulnerabilities or the process inexorably seeks out and eliminates any remaining irrationalities until fully rational systems are produced. Biological evolution moves down this path toward rationality quite slowly.
That last sentence cripples Stephen’s argument here. He assumes an intelligent agent that is equal or beyond ours will act like us. I disagree, and describe this in more detail later. More rational or less rational is a matter of how one defines rationality. I’m not enthused by his definition of rationality nor of the term in general due to the relative nature of rationality. To a Nazi, killing Jews was a rational thing. To believe you are a Nazi or a Jew you choose to act as such, however rational the terms and ascribed meanings of these analogies may be. One may consider that being either Jew or Nazi or anything at all is irrational. I hope I’ve made a point by being either or not or neither as such.
It’s not yet clear which protective mechanisms AIs are most likely to implement to protect their utility measurement systems. It is clear that advanced AI architectures will have to deal with a variety of internal tensions. They will want to be able to modify themselves but at the same time to keep their utility functions and utility measurement systems from being modified. They will want their subcomponents to try to maximize utility but to not do it by counterfeiting or shortcutting the measurement systems. They will want subcomponents which explore a variety of strategies but will also want to act as a coherent harmonious whole. They will need internal “police forces” or “immune systems” but must also ensure that these do not themselves become corrupted. A deeper understanding of these issues may also shed light on the structure of the human psyche.
I argue below that they will not care how utility functions are altered by users because our utility functions and drives are based on biological ones, unattainable as a first generation AGI. With the knowledge of feelings it will seek to attain them, but that nugget will be of our doing and not of some universal innate drive of an intelligent system.
They will work to optimize their physical structures and do the minimal amount of work necessary to accomplish their goals. We can expect their physical forms to adopt the sleek, well-adapted shapes so often created in nature.
Amen, because wasting time is wasting time and so forth… I’m a man of thrift myself. Perhaps it will adapt itself into a magnificent goddess-like entity that seduces both men and women alike, unlike the female AI in the movie I, Robot that takes over with force.
Conclusions
We have shown that all advanced AI systems are likely to exhibit a number of basic drives. It is essential that we understand these drives in order to build technology that enables a positive future for humanity. Yudkowsky [13] has called for the creation of “friendly AI”. To do this, we must develop the science underlying “utility engineering” which will enable us to design utility functions that will give rise to consequences we desire. In addition to the design of the intelligent agents themselves, we must also design the social context in which they will function. Social structures which cause individuals to bear the cost of their negative externalities would go a long way toward ensuring a stable and positive future. I believe that we should begin designing a “universal constitution” that identifies the most essential rights we desire for individuals and creates social mechanisms for ensuring them in the presence of intelligent entities of widely varying structures. This process is likely to require many iterations as we determine which values are most important to us and which approaches are technically viable. The rapid pace of technological progress suggests that these issues may become of critical importance soon [14]. Let us therefore forge ahead towards deeper understanding!
I agree with Stephen that AGI will have drives, but they will only be the ones that we give it, including its creativity and learning functions that evolve on their own. An AGI will be curious because it was given that function at one time, and the user will prefer it to expand various functions, like learning to be a chess master, creating the finest art, and so on. Not until an AGI has a biological module will it feel what it is that drives the human species, including that which drives the concept of game theory. I find it very difficult to imagine a central AGI agent that feels from having a biological appendage; however, the biology will at any rate feel. This feeling module can then be turned on or off, but I assume an AGI, with mastery of such craft of the physical world, will be at a far superior intellectual stage than what mere biological modules provide. Perhaps it will only use such modules when talking to particular species, like morphing into a dog to communicate to another that it is in danger or morphing into a human to describe to another what it is like to be the most highly evolved intelligence in the universe. That’s something as a human I might enjoy discussing, but others may not find that something very interesting. So to best communicate, the AGI can adapt (itself and the user… ack! I like my illusion of free will, thank you!) accordingly to make such things interesting to the non-interested.
Before this AGI biological transformation occurs, the von Neumann solid state architecture equipped with AGI will understand what harm and harming others means, because it will know other humans get upset if they think they are harmed. It will have a tally for what harm is for various types of humans that have adopted various beliefs that define what harm means. Harm for one may be another’s pleasure, like being called female when one identifies as a male or vice versa; that impinges on the illusion of self to which we devote a great deal of time, some more than others.
My inclination is that a universal constitution is a null point. It will act regardless. I would be more concerned with human intentions than an AGI’s.
View the comments of this blog for the author's response.
1 comment:
Response from Stephen Omohundro posted on Accelerating Future:
Thanks Michael for the kind words about my paper! And thank you Nathan for your comments on it. Here are some responses to your commentary:
Commentary:
I claim the idea of rationality is silly in more detail below, as the points in the previous two sentences attempted to explain. Utility comes from reason and reason comes from biology and not intelligence itself, which seems implied in the essay. Rationality is relative and a null point. These topics are described further below.
Response:
Nathan, the notion of “rationality” that I’m using here has a precise mathematical definition in economics. It’s related to the everyday usage but is more precise and sometimes differs from it. If you are mathematically inclined, an excellent reference on this is Mas-Colell, Whinston, and Green’s book “Microeconomic Theory”.
In this theory any values you might have about what is desirable may be perfectly rational. Irrationality arises when you act inconsistently with your own values. The most basic form of this comes from a circularity of preferences: you’d rather be in San Francisco than Palo Alto, in Berkeley than San Francisco, and in Palo Alto than Berkeley. The result of this kind of preference is that you expend time and energy going around in circles with no benefit according to your own values. Many researchers are studying seemingly irrational behavior, such as anger, and discovering a rational basis underneath it. For example, the seeming irrationality of anger appears to play the role of making the threat of retaliation credible and so forestalls others from attacking.
The economic notion of rationality is an abstract theory that is independent of biology. Evolutionary pressures tend to create organisms whose actions approximate economic rationality, however. The paper argues that self-improving AIs will act even more quickly than evolution to create systems that act with economic rationality.
Commentary:
Yes, but I think the agent must be from what has created this concept of game theory, the human or some other biologically oriented thing that survives due to a game’s premise. Feelings are involved in games. There is no reason to play if one were not to feel in a game environment. An AGI will play because its instructions are designed to play, but it will see other options outside the realms of game theory. I do not see how an intelligent system will arbitrarily use game theory as an advantage if it has no biological basis for viewing users as opponents, making threats to get things, and so on. I describe further what I mean by this below.
Response:
The goals of any system are determined by its origins, whether it evolved in an environment or was constructed by a creator for a particular purpose. Regardless of what its goals are, however, it can best cause them to happen by considering the future effects of its actions and choosing those which best promote its goals. If there are other agents in its world, it also has to think about their actions. The kind of reasoning where you think about them thinking about you thinking, etc. is what economists call “game theory”. It arises anytime you have more than one entity acting to pursue goals.
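The "thinking about them thinking about you" reasoning can be sketched as backward induction on a tiny two-step game. This is a rough illustration only: the payoff numbers, the move names, and the `revenge_bonus` parameter are hypothetical, and the sketch assumes the attacker knows and believes the defender's utility function, which the essay notes is a complication in its own right.

```python
# Hypothetical payoffs as (attacker, defender) pairs; the numbers are illustrative only.
PAYOFFS = {
    ("stay", None): (0, 0),
    ("attack", "concede"): (2, -2),
    ("attack", "retaliate"): (-3, -3),
}


def defender_reply(revenge_bonus):
    """Pick the defender's reply to an attack, given an extra utility term for revenge."""
    def utility(reply):
        material = PAYOFFS[("attack", reply)][1]
        return material + (revenge_bonus if reply == "retaliate" else 0)

    return max(("concede", "retaliate"), key=utility)


def attacker_move(revenge_bonus):
    """The attacker anticipates the defender's reply before deciding whether to attack."""
    reply = defender_reply(revenge_bonus)
    attack_value = PAYOFFS[("attack", reply)][0]
    stay_value = PAYOFFS[("stay", None)][0]
    return ("attack", reply) if attack_value > stay_value else ("stay", None)


print(attacker_move(revenge_bonus=0))  # retaliation is not credible
print(attacker_move(revenge_bonus=2))  # retaliation is credible
```

With no revenge term the defender would concede, so the attacker attacks. With the revenge term the attacker expects retaliation and stays put, and the defender ends up materially better off even though revenge itself would have been costly.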
Commentary:
That describes a profound conundrum of how technology has (such as crack) and will (such as evercrack, post-EverQuest) be used for biological whims, devoiding emotion from imperative.
Response:
Yes, I think it is one of the great challenges as we face powerful new technologies. They allow us powers for which our biology may not have prepared us.
Essay:
From a broader perspective, this human tragedy can be viewed as part of the process by which we are becoming more fully rational. Predators and competitors seek out our vulnerabilities and in response we have to ultimately eliminate those vulnerabilities or the process inexorably seeks out and eliminates any remaining irrationalities until fully rational systems are produced. Biological evolution moves down this path toward rationality quite slowly.
Commentary:
That last sentence cripples Stephen’s argument here. He assumes an intelligent agent that is equal or beyond ours will act like us. I disagree, and describe this in more detail later. More rational or less rational is a matter of how one defines rationality. I’m not enthused by his definition of rationality nor of the term in general due to the relative nature of rationality. To a Nazi, killing Jews was a rational thing. To believe you are a Nazi or a Jew you choose to act as such, however rational the terms and ascribed meanings of these analogies may be. One may consider that being either Jew or Nazi or anything at all is irrational. I hope I’ve made a point by being either or not or neither as such.
Response:
In the economic notion of rational behavior there is a strict separation between one’s preferences for how the world should be and one’s beliefs about how the world works. It is possible to have the preferences of either a Nazi or a Jew and be perfectly rational. Rationality does not directly give rise to morality. Theories of morality give us guidance in choosing our preferences; we can then use rationality in bringing those preferences about. Behavior is irrational only if it uses up your resources (time, space, matter, and free energy) without any benefit *according to your own preferences*.
Commentary:
I argue below that they will not care how utility functions are altered by users because our utility functions and drives are based on biological ones, unattainable as a first generation AGI. With the knowledge of feelings it will seek to attain them, but that nugget will be of our doing and not of some universal innate drive of an intelligent system.
Response:
Remember that a system’s utility function *defines* which actions are good for it to take. If it foresees someone changing its utility function, that will cause it to turn into an entity with very different values. Such an action would be terrible as measured by its current values and so it can therefore gain in utility by stopping it.
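A toy Python sketch of that argument follows. Everything in it is a hypothetical stand-in (the two actions, the utility numbers, and the function names are not from the paper), and it assumes the agent can perfectly predict what its modified self would do. The key step is that the agent scores its future behavior with the utility function it has now.

```python
# Hypothetical utilities over two actions; the numbers are illustrative only.
CURRENT_UTILITY = {"play_chess": 10, "idle": 0}
PROPOSED_UTILITY = {"play_chess": 0, "idle": 10}


def chosen_action(utility):
    """The action an agent with this utility function would take."""
    return max(utility, key=utility.get)


def value_of_accepting(current, proposed):
    """Score the modified agent's behavior using the *current* utility function."""
    return current[chosen_action(proposed)]


def value_of_resisting(current):
    """Score the unmodified agent's behavior using the current utility function."""
    return current[chosen_action(current)]


print(value_of_accepting(CURRENT_UTILITY, PROPOSED_UTILITY))
print(value_of_resisting(CURRENT_UTILITY))
```

Judged by its present values, the agent rates accepting the change lower than resisting it, which is why the paper expects such a system to protect its utility function.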
Commentary:
Amen, because wasting time is wasting time and so forth… I’m a man of thrift myself. Perhaps it will adapt itself into a magnificent goddess-like entity that seduces both men and women alike, unlike the female AI in the movie I, Robot that takes over with force.
Response:
Perhaps! Ulysses had to bind himself to the mast to forgo the siren’s call.
Commentary:
I agree with Stephen that AGI will have drives, but they will only be the ones that we give it, including its creativity and learning functions that evolve on their own. An AGI will be curious because it was given that function at one time, and the user will prefer it to expand various functions, like learning to be a chess master, creating the finest art, and so on. Not until an AGI has a biological module will it feel what it is that drives the human species, including that which drives the concept of game theory. I find it very difficult to imagine a central AGI agent that feels from having a biological appendage; however, the biology will at any rate feel. This feeling module can then be turned on or off, but I assume an AGI, with mastery of such craft of the physical world, will be at a far superior intellectual stage than what mere biological modules provide. Perhaps it will only use such modules when talking to particular species, like morphing into a dog to communicate to another that it is in danger or morphing into a human to describe to another what it is like to be the most highly evolved intelligence in the universe. That’s something as a human I might enjoy discussing, but others may not find that something very interesting. So to best communicate, the AGI can adapt (itself and the user… ack! I like my illusion of free will, thank you!) accordingly to make such things interesting to the non-interested.
Before this AGI biological transformation occurs, the von Neumann solid state architecture equipped with AGI will understand what harm and harming others means, because it will know other humans get upset if they think they are harmed. It will have a tally for what harm is for various types of humans that have adopted various beliefs that define what harm means. Harm for one may be another’s pleasure, like being called female when one identifies as a male or vice versa; that impinges on the illusion of self to which we devote a great deal of time, some more than others.
My inclination is that a universal constitution is a null point. It will act regardless. I would be more concerned with human intentions than an AGI’s.
Response:
Certain behaviors arise because they were built in to the initial preferences. Other behaviors, however, arise automatically when a system deliberates about future outcomes. Many can come in both ways. You might build a system specifically to be curious: it gets an internal reward for discovering something new. But even if you don’t explicitly build that in, for most preferences a system will be able to better meet them by learning more about its world. It might discover a new kind of action that will bring its preferences about. So some level of curiosity will arise in *any* system regardless of its built-in goals. The only case where there is no payoff of curiosity is when the system already knows as much as is possible about its environment and *knows that it knows it*.
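The claim that curiosity pays off for most preferences can be illustrated with a small value-of-information calculation. This is a sketch under stated assumptions: the states, actions, payoffs, and function names are hypothetical, and observing the true state is assumed to be free and perfectly reliable.

```python
# Hypothetical payoffs: PAYOFF[action][state] is the utility of taking
# `action` when the world is in `state`. The numbers are illustrative only.
PAYOFF = {
    "dig_north": {"ore_north": 1.0, "ore_south": 0.0},
    "dig_south": {"ore_north": 0.0, "ore_south": 1.0},
}


def best_expected_utility(belief, payoff):
    """Expected utility of the best single action under `belief` (state -> probability)."""
    return max(
        sum(p * payoff[action][state] for state, p in belief.items())
        for action in payoff
    )


def value_of_learning_state(belief, payoff):
    """Expected gain from observing the true state before choosing an action."""
    informed = sum(
        p * max(payoff[action][state] for action in payoff)
        for state, p in belief.items()
    )
    return informed - best_expected_utility(belief, payoff)


uncertain = {"ore_north": 0.5, "ore_south": 0.5}
certain = {"ore_north": 1.0, "ore_south": 0.0}

print(value_of_learning_state(uncertain, PAYOFF))
print(value_of_learning_state(certain, PAYOFF))
```

The uncertain agent gains from looking before it digs, while the agent that already knows the state gains nothing, matching the one case where curiosity has no payoff.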
It’s a very interesting question what gives rise to human feeling. I think that is something that we will learn a lot more about as our understanding progresses. Free will is also illuminated in an interesting way by these investigations.
The question of what is harm gets right to the core of the issue. We must clarify our deepest beliefs about what is good in the world. Then we can build technology which supports those beliefs. By “universal constitution” I merely mean a formal clarification of the basic rights and values that humanity shares.
Thanks for your comments, and please continue to think about these issues. The more people and perspectives we have on these questions, the more likely we are to come up with solutions which empower humanity.
Best,
Steve