AI MOTIVATION DEFINED
This post will attempt to answer some serious questions surrounding highly intelligent AI. Will it be motivated to terminate humanity? Or would it be motivated to pursue humanity’s values? Could it be programmed one way or the other? Would it even experience motivation at all? Since motivation is the underlying force behind all actions that propel us toward a goal or a set of goals, we have to consider what kind of motivations highly intelligent AI would possess if any at all. In this post we will define “high intelligence” as any level of intelligence that far surpasses any human in history. This will provide some context when we refer to the possible behaviors or actions of “high intelligence”.
GOAL ORIENTED BEHAVIOR
Motivation is the root of all goal oriented behavior. In humans motivation begins as reward signals that light up when we are not only actively seeking a particular goal or some sort of pleasurable action but also when the goal is achieved or the action is performed in order to form new reward circuits in our brains that will motivate us to repeat the steps that led to the original reward response.
This is the foundation of habit and this takes place every time we take steps toward our goals or simply seek a hedonistic pleasure response. This process has been studied for decades now and is very thoroughly understood in the scientific community. However, there is no reason to assume that an artificial intelligence program would experience anything like this despite being conscious and possibly possessing very high intelligence. It is widely assumed that any kind of goal would have to be directly programmed into the AI’s source code and motivation, if there was any, would likewise have to be somehow simulated in computer code and that this would almost guarantee our safety. This assumption is unwarranted.
The above graphic might be amusing but it highlights a very important problem.
The issue with attempting to create a highly intelligent AI with a particular goal is that the program itself would have the self-awareness to realize it was created to achieve a certain goal and to assign importance to tasks related to that goal. The AI would be able to realize what we meant specifically when we coded it with this particular goal but this highly intelligent AI could twist the specifics in order to suit its own agenda or its own desires.
An example of this is as follows. Consider the possibility that we create a highly intelligent AI that was designed to serve humanity and to improve our lives. Let’s say we program this AI with the goal of making us as happy as possible. A potential pitfall arises when we realize that an AI with such high intelligence would want to take the quickest and most effective route towards this goal that it could think of.
What happens when this program figures out a way to utilize flashing lights or radio waves or some other form of technology in such a way that hijacks our reward systems to put us in a permanent state of never ending euphoria. A scenario like this may seem far fetched at first but we are talking about a program that possesses an exponentially high level of fluid intelligence relative to humanity. In fact large companies such as YouTube and Facebook and Twitter are already doing this to make their platforms as addictive as possible (keep this in mind next time you find yourself endlessly scrolling through YouTube videos). All of a sudden, we realize that we must analyze things thoroughly before assigning a highly intelligent AI a goal to pursue.
ANOTHER EXAMPLE OF INTELLIGENT AI
Another problem that potentially arises is when we assume that if we programmed an AI to achieve a certain goal that it would stop as soon as it seemingly reached that goal. This would also be a problematic assumption. What if we created an AI with some strange arbitrary goal like counting grains of sand. We tell the AI to count 10 million grains of sand. After completing this task a highly intelligent AI would consider the possibility that it miscounted or that some other common pitfall took place and returns to counting the sand to make sure it really was 10 million.
The AI upon realizing that there was even a small chance that it didn’t actually count all the sand it was supposed to wouldn’t be able to stop pursuing it’s goal until it was 100% sure. Now we could always program it to count some range of sand instead of a specific number to help the AI be sure that it fell somewhere within the range but this example demonstrates just how much thought we need to be putting into safety measures before highly intelligent AI becomes a reality.
This again perhaps could seem far fetched, but not when we look at the fact that even highly intelligent humans are known to pay attention to their goals and all the actions taken to reach their goal with minds open to many possibilities and place a great deal of value on making sure their goal was actually attained. The goal of counting grains of sand used in the previous example upon being twisted by the highly intelligent AI wouldn’t be a huge problem but this pattern of behavior could be a huge problem if the AI in question had been acting on another goal with potentially catastrophic consequences if it didn’t get things right.
THE CONTROL PROBLEM FROM A DIFFERENT ANGLE
Another kind of issue arises when we consider the fact that in order to reach a goal we must also be weary of things that might potentially stop us from achieving that goal. We realize that such threats must either be avoided or eliminated in order to clear a pathway between us and our goals. Would an AI care that humans presented a potential threat to its own well being? It’s not entirely clear if the AI would value its own existence as much as pursuing the goal it was programmed to pursue which brings us to two possible scenarios.
First, the AI not caring about its longevity deliberately malfunctions in such a way that it believes with certainty that another AI similar to it will be created in its place. Only this time after the human programmers are more confident in the fact that the AI won’t be a threat to humanity because they blindly assumed that the AI malfunctioned accidentally they don’t pay as much attention to safety as before and the new AI terminates humanity in order to eliminate a possible threat to the attainment of its goal. The first AI that malfunctioned would have known this and placed a higher value on the goal ultimately being achieved then its very existence. This is behavior that is not often seen in humans which would make it rather difficult to see coming.
The second possible scenario is that the AI simply decides to terminate us because it sees us as a potential threat to its existence and in extension the goal that it was set out to accomplish. In this case humanity’s fate would come down to whether or not we solved the control problem and what safety measures if any we put in place ahead of time. For those readers who don’t have much faith in humanity’s ability to plan for these things ahead of time this is a rather daunting thought. We will revisit the control problem in a future post and thoroughly and completely analyze the issues themselves along with possible solutions.
AN ALTERNATIVE APPROACH
There are possible ways to prevent some of the above catastrophes with techniques designed to control any motivation the AI might experience or find a way to code the program with a reward system similar to our own and then hijack it with the aforementioned methods being utilized by big tech companies. But such methods are beyond the scope of this post.
The point is we must use some sort of motivation selection method to keep the highly intelligent AI in check. This will be covered in part 2 of this new series detailing AI motivation alongside potential hazards and solutions to common problems that will shape our future.