Rupam Mahmood
As we continue to speculate about the plausibility of superhuman AI, some of us have become concerned about our future in the face of such technologically advanced entities. These concerns are chiefly of two forms. Some concerns are about AI systems that may achieve goals or exhibit behaviors much different from what the designer originally intended. This may happen due to our inability to understand the technical issues involving complex systems, resulting in design errors. If the capabilities of these systems generally surpass those of humans, their actions may spin out of control and become dangerous.
Other concerns regard the possibility of designers themselves providing AI systems with goals that potentially conflict with the goals of some humans. This is seen as problematic even if these AI systems abide by their intended goals and behaviors. The culmination of this concern is AI systems possessing self-interested goals, that is, systems whose purpose is not to serve humans but to take care of themselves. Here, I will address only this particular AI-safety concern. More specifically, I will reveal the root cause of our fear of self-interested AI, debunk some of our misconceptions about it, and make a case for its existence.
The fear of entities with self-interested and potentially conflicting goals is not new. Animals naturally fear other organisms with different goals, especially their peers. In the face of this fear, animals often try to kill or subjugate other organisms. With humans, this has extended to wars and slavery. However, animals have also succeeded in avoiding violence and establishing cooperation and coexistence with kin and other organisms in their vicinity. With humans, this has extended to the formation of tribes and institutions. Humans have also succeeded in cooperating and coexisting with humans outside their own tribes or geographical vicinity through trade and global institutions.
Therefore, when confronted with other entities possessing different and potentially conflicting goals, it is valuable to remember that fear and violence are natural but not necessarily measured, judicious responses, and they do not necessarily bring the best outcome. Such responses have been much too common in our history and have often done more harm than good, especially when we acted this way against people with different ethnicities, backgrounds, or creeds. On the other hand, cooperation and coexistence have quite often produced better long-term outcomes, although achieving them requires much more cognitive effort. Thus, when deliberating the possibility of self-interested AI, it is important to resist these natural responses.
On that note, I would like to conclude this line of thought by observing that when we talk about entities with self-interested goals, AI should not be our foremost concern. It is rather our failure to coexist peacefully and efficiently with others that remains the main problem. AI does not bring any novel challenge to our society in this regard; rather, it may exacerbate our existing failure to create a peaceful society consisting of people with different ambitions and goals. We had better act fast to mend our existing societal problems rather than thwart the possibility of novel goals and ambitions.
The fear of others is not, however, the only source of concern regarding self-interested AI. Some people argue against having self-interested AI because they do not find such entities useful in any manner. A typical belief along this line of thought is that the only way we as humans can benefit from AI is by keeping them subservient to us. The aim, therefore, is to create AI systems, presumably with superhuman intellect, whose sole purpose would be to work for the well-being of some particular humans or of humanity in general.
At this point, it is appropriate to clarify and distinguish between two separate but related concepts: the technological augmentation of humans, often known as intelligence amplification (IA), and autonomous AI systems. Although IA is one of the most desirable outcomes of technological advancement, it is not the most natural example of AI; it is, rather, commonly contrasted with AI. The prototypical example of AI is an autonomous robot operating under its own goals. Perhaps, then, what the proponents of “servitude-only AI” really seek is IA. It is in IA that we get the fullest servitude of machines, which are treated more as our extensions or tools, over which we retain substantial control, than as autonomous systems.
Another possibility is that the proponents of servitude-only AI actually desire special-purpose robots. To be servile, such robots will likely be cognitively mutilated or restricted so that we can retain substantial control over them. Without such control over an entity, expecting its fullest use or servitude is wishful thinking, and we should not indulge in such a confused and illusory thought. As servile special-purpose systems are by design required to be cognitively restricted, expecting general superhuman intellect from them seems incoherent.
It is worth noting that I am opposed neither to IA nor to special-purpose robots. What I am opposed to is the idea of servile AI that otherwise possesses general superhuman intellect. The thought of seeking servitude from an AI system that can think in an open-ended manner with superhuman capacity seems not only dystopian but also incoherent. How can we have an autonomous system that both has general superhuman capabilities and remains under the substantial control of humans?
We may need to give up our desire for either servitude-only AI or AI with general superhuman capabilities. But how can we give up our hope of having superhuman AI? And how can we guarantee both superhuman AI and a safe future for humans? The future is precarious, and any promise of safety with certitude is questionable. What we can certainly do, however, is deliberate on different possibilities for the future. Superhuman AI, over which we will have no significant control, will likely not be subservient to us. More ominously, it may have self-interested goals. However, a violent demise of humans is by no means the only outcome of having superhuman AI. In fact, having self-interested AI is not only feasible; it may also be beneficial to us. Let us consider this possibility.
Those who fear that coexisting with self-interested AI entities is infeasible because of their potentially conflicting goals should consider the fact that we already coexist with other self-interested entities. Bankers, cab drivers, and bartenders, from whom we receive useful services, are all self-interested beings with their own separate goals. The fact that serving us is not their sole purpose does not preclude us from benefiting from their services. This serves as proof by example that coexisting with entities possessing self-interested, independent goals can be feasible as well as useful. And examples of coexistence and cooperation are not confined to entities with peer capabilities. With more cognitive effort, we have been able to live with other animals without exterminating them, and our achievements along this path continue to grow.
Moreover, it can be desirable to allow even special-purpose robots to be somewhat self-interested, or at least to assume responsibility for their own livelihood and maintenance. This is chiefly a matter of practicality and scalability. If we produce a million units of a particular kind of robot, maintaining them and providing for their livelihood will become a tall order. Who should be responsible for all of them? The manufacturers, the governments, or the owners? In each of these cases, the reliance is ultimately either on humans or on a centrally organized system, and the latter is not known for scaling well. If we seek scalability beyond reliance on humans, handing these responsibilities over to the AI unit itself seems appropriate. As autonomous AI systems, they can naturally be expected to take care of these essential tasks concerning their own lives. This is also a natural solution that allows large-scale production.
And finally, the coexistence of entities with different goals, if we can achieve it, is morally far more appealing than a monolithic society with an army of robots driven by a singular objective. The necessity of diversity and peaceful coexistence is increasingly felt even in our current society. Why would the inclusion of superhuman entities suddenly render such coexistence undesirable or implausible? A desire to coexist may, in fact, become more important when these newcomers are included in our society.
By pointing out that self-interested superhuman AI can be feasible, beneficial, and morally appealing, I do not intend to disregard our AI-safety concerns. My objective here has rather been to expose those of our safety concerns that are unfounded, rooted in our ingrained natural responses to alien entities and our incorrect assumptions about them. My intention has been to show that perhaps the most desirable outcomes of having superhuman self-interested AI have so far been presented and viewed as the least desirable ones. The concept of self-interested AI has long been subject to suspicion and fear-mongering. It is high time we carefully deliberated on this possibility and prepared ourselves to embrace it gracefully while we still have the chance.