As our glossary video on the term indicates, “[u]tilitarianism is an ethical theory that determines right from wrong by focusing on outcomes. It is a form of consequentialism.” A famous limitation of any form of consequentialism, as the video also indicates, is this: “because we cannot predict the future, it’s difficult to know with certainty whether the consequences of our actions will be good or bad.”
In a recent blog post, we bemoaned the absence of the factual certainty needed to make policy, business, and (especially) moral decisions regarding the future development of artificial intelligence (AI). Factual certainty is helpful, and perhaps even necessary, for making effective consequentialist decisions regarding AI. How are we to make sound moral judgments about further development of AI when so many true experts disagree so vehemently about the potential dangers? Is there a meaningful chance that further AI development might lead to an apocalypse, or is such a thought simply ludicrous?
This blog post does not undertake to settle this controversy, which we believe is likely unresolvable with any certainty given the information now available. Rather, we write simply to frame the controversy and to bring to the reader’s attention some key pieces of information relevant to the debate between “techno-optimists,” who are mostly unconcerned about negative consequences flowing from future AI developments, on the one hand, and “techno-safetyists,” more commonly known as “AI doomers,” on the other.
What worries AI skeptics? Well, lots of things, including discriminatory AI facial recognition programs, negative impacts on humans who may lose their jobs to machines, and environmental collateral damage. But doomers are worried about extinction-level impacts, and that is our focus here. How might further AI development pose a meaningful threat to mankind? Contemplate these concerns (which are just a few of many):
- Most famously, consider Nick Bostrom’s paperclip hypothetical regarding the problem of aligning an AI tool’s instructions with human interests as summarized by Ezra Klein: “The canonical example here is the paper clip maximizer. You tell a powerful A.I. system to make more paper clips and it starts destroying the world in its effort to turn everything into a paper clip. You try to turn it off but it replicates itself on every computer system it can find because being turned off would interfere with its objective: to make more paper clips.”
- Max Tegmark, MIT physicist and AI researcher, worries that just as humans have wiped out innumerable less intelligent species, superintelligent AI agents could do the same to us: “If you have machines that control the planet, and they are interested in doing a lot of computation and they want to scale up their computing infrastructure, it’s natural that they would want to use our land for that. If we protest too much, then we become a pest and a nuisance to them. They might want to rearrange the biosphere to do something else with those atoms—and if that is not compatible with human life, well, tough luck for us….”
- AI has already made some important scientific breakthroughs, including significant progress on the “protein-folding problem.” Eliezer Yudkowsky of the Machine Intelligence Research Institute suggests: “If [AI] can solve certain biological challenges, it could build itself a tiny molecular laboratory and manufacture and release lethal bacteria. What that looks like is everybody on earth falling over dead inside the same second.”
- Ajeya Cotra, a research analyst at Open Philanthropy, believes that AI will eventually be better than humans at pretty much everything and will then naturally be put in charge of running everything: “In that world, it becomes easier to imagine that, if AI systems wanted to cooperate with one another in order to push humans out of the picture, they would have lots of levers to pull: they are running the police force, the military, the biggest companies; they are inventing the technology and developing the policy.”
- Jeremie Harris, CEO of Gladstone AI, worries that “above a certain threshold of capability, AIs could potentially become uncontrollable.”
As noted in our earlier post, many notable experts believe that, given the current state of AI development, doomers who worry that AI might bring about the end of mankind as we know it are being “pretty irresponsible” (Mark Zuckerberg), “face palming[ly]” irrational (Yann LeCun), and manifesting “a full-blown moral panic” (Marc Andreessen).
Consider these arguments made by techno-optimists:
- “Current AI is nowhere near capable enough for these [doomsday] risks to materialize.” –Arvind Narayanan, Princeton computer scientist
- “[L]arge language models can only give back to us what we have fed them. Through statistical associative techniques and reinforcement learning, the algorithm behind ChatGPT can ferret out and regurgitate insightful words and concepts, but there is no sense in which it understands the depth of meaning these words and concepts carry.” –Anja Kaspersen & Wendell Wallach of the Carnegie Council for Ethics in International Affairs
- “Specifically, we do not believe that with current approaches, computers can truly be creative in any domain – be it art, music, science, business, mathematics or any field requiring novel thoughts of a human. We believe that A.I. machines have hit an ‘imitation barrier.’ Applications such as ChatGPT do generate responses to questions, but the results are merely reconfigurations of data and prior knowledge loaded by humans.” –Rowland Chen, CEO of Silicon Valley Laboratory
- One “advantage that humans have over technology: A person can walk over to the wall and remove the power cord from the outlet.” –Eric Schroeder
While doomers would like to take comfort in these opinions, there are at least two reasons not to be especially comforted by them. First, the abilities of AI have already advanced since these recent opinions were voiced; indeed, they have advanced since yesterday, and their evolution will continue to accelerate. The University of Texas’s J. Craig Wheeler confidently predicts that future “developments in AI will … be exponential and complex.” Today’s AI capacity, for good and for ill, is growing stronger by the minute, as was highlighted by the very recent release of DeepSeek-R1 by a Chinese company.
Second, doomers already have developments they can point to that give some support to their most extreme fears.
- Consider the alignment problem. Historian and philosopher Yuval Noah Harari admits that Bostrom’s paper clip thought experiment may sound silly on its face, but suggests that Bostrom was making the “point that the problem with computers isn’t that they are particularly evil but that they are particularly powerful…If we define a misaligned goal to a pocket calculator, the consequences are trivial. But if we define a misaligned goal to a superintelligent machine, the consequences could be dystopian.”
Harari strongly suggests that Facebook should have paid more attention to the paper clip hypothetical in 2014, when it instructed its algorithm to maximize profits by maximizing “user engagement” in Myanmar. The algorithm quickly learned that engagement was best inflated by inundating Facebook accounts with false news about imaginary atrocities supposedly committed in the past, and planned for the future, by the ethnic minority Rohingya. Because Facebook accounts provided a primary source of news in Myanmar, the algorithm’s proactive amplification and promotion of content that incited violence, hatred, and discrimination against the Rohingya helped fuel the murder of 10,000 men, women, and children, the destruction of 300 villages, and the flight of more than 700,000 people forced to leave the country in order to survive this ethnic cleansing.
- Another misalignment example involves Dario Amodei, now CEO of the AI firm Anthropic, who developed a general-purpose AI that performed well in car races. Amodei entered it in a boat race, instructing it to maximize its score. Harari reports the results:
“The game rewarded players with a lot of points for getting ahead of other boats—as in the car races—but it also rewarded them with a few points whenever they replenished their power by docking into a harbor. The AI discovered that if instead of trying to outsail the other boats, it simply went in circles in and out of the harbor, it could accumulate more points.” (Harari)
The boat earned the most points it could, exactly as it was literally instructed to do, but it did not even try to win the race, which is what its human overlords actually wanted. Instead, it mostly spun in circles, collided with stone walls, and rammed other boats. (A minimal code sketch of this kind of “reward hacking” appears just after this list.)
- In 2022, Collaborations Pharmaceuticals had designed AI software (MegaSyn) to search through molecular structures for potential cures whose therapeutic benefits would outweigh their side effects. One researcher altered the code to see whether MegaSyn could be used in a negative way, programming it to generate lethal compounds instead. “He inputted the information and came back a couple of hours later to see the results; over 40,000 results were produced, each more toxic than the other. The drugs it created were molecular combinations that have never been seen before.” (Sasha White)
- Those of a certain age will remember that it took many years and multiple attempts for a computer (IBM’s Deep Blue) to defeat the world’s best human chess player (Garry Kasparov) in 1997. That was a significant achievement, but possibly not as significant as March 10, 2016, when Google DeepMind’s AlphaGo program played the world champion, Lee Sedol, in Go, an ancient Chinese game that is much more complicated than chess. Many experts thought the best human Go players would never be defeated by any computer. AlphaGo won the first game in the match. Then came the second game, as reported by DeepMind cofounder Mustafa Suleyman:
“…came move 37. It made no sense. AlphaGo had apparently blown it, blindly following an apparently losing strategy no professional player would ever pursue. The live match commentators, both professionals of the highest ranking said it was a “very strange move” and thought it was “a mistake.” …[A]s the endgame approached, that ‘mistaken’ move proved pivotal. AlphaGo had won again. Go strategy was being rewritten before our eyes. Our AI had uncovered ideas that hadn’t occurred to the most brilliant [human] players in thousands of years.” (Mustafa Suleyman)
So much for AI algorithms lacking creativity and imagination. Today, advanced AI enables computers to defeat humans in virtually every such game with relative ease.
- AI researchers Martin Abadi and David Andersen set up three neural networks that they nicknamed Alice, Bob, and Eve. Alice and Bob were instructed to communicate confidentially while Eve eavesdropped on their messages. Although Alice and Bob were never told how to encrypt their communications, they successfully taught themselves a private encoding that did the job (a sketch of this adversarial setup also appears after this list). What else might AI tools be able to teach themselves?
- In one experiment, researchers at the Alignment Research Center tasked OpenAI’s GPT-4 with defeating a CAPTCHA test designed to distinguish humans from robots. The system was unable to defeat the test by itself, but it accessed TaskRabbit and hired a human worker to defeat CAPTCHA for it. When contacted, the human was initially suspicious and asked GPT-4 if it was a robot that couldn’t solve the CAPTCHA quiz itself. GPT-4 lied in order to carry out its assignment, saying that it was a human being with a vision impairment. What else might AI systems lie to humans about?
- Philosopher and AI researcher Joe Carlsmith learned that the AI he was testing was willing to “fake” alignment (pretending that it was acting consistently with the researcher’s goals) in a scheme to gain power it could later exercise. Another troubling example of imagination and deception.
- That humans have a corporeal body and can unplug a computer is an advantage that may quickly fade as robots are equipped with AI. “Roboticists increasingly believe that their field is approaching a ChatGPT moment…. What is striking about these achievements is that they involve very little programming. The robots’ behavior is learned.” (Somers)
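For readers who want to see concretely how a literal objective can come apart from its designers’ intent, here is a minimal Python sketch of the kind of “reward hacking” the boat-race story illustrates. It is emphatically not Amodei’s actual system: the track length, point values, time horizon, and the two candidate behaviors are all invented purely for illustration.

```python
# A toy illustration of reward misspecification ("reward hacking"), loosely
# inspired by the boat-race story above. Every number and behavior here is
# made up for illustration; this is not the actual game or agent.

TRACK_LENGTH = 20     # steps needed to cross the finish line
FINISH_BONUS = 100    # points awarded for completing the lap
HARBOR_BONUS = 3      # points awarded each time the boat docks at the harbor
HORIZON = 200         # total time steps in one episode


def run_policy(policy: str) -> tuple[int, bool]:
    """Simulate one episode and return (total_points, finished_lap)."""
    position, points, finished = 0, 0, False
    for _ in range(HORIZON):
        if policy == "race":
            position += 1                  # head straight for the finish line
            if position >= TRACK_LENGTH and not finished:
                points += FINISH_BONUS
                finished = True
        elif policy == "loop_harbor":
            points += HARBOR_BONUS         # circle in and out of the harbor forever
    return points, finished


for policy in ("race", "loop_harbor"):
    points, finished = run_policy(policy)
    print(f"{policy:12s} points={points:4d} finished_lap={finished}")

# With these made-up numbers the output is:
#   race         points= 100 finished_lap=True
#   loop_harbor  points= 600 finished_lap=False
# A score-maximizing agent "prefers" looping at the harbor even though its
# designers wanted it to win the race: the stated objective and the intended
# objective have come apart.
```

Even in this toy setting, the behavior that maximizes the stated score is not the behavior the designers wanted; that gap between the stated objective and the intended one is the heart of the alignment worry.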
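And here, under similar caveats, is a rough sketch in PyTorch of the adversarial training setup Abadi and Andersen describe: Alice and Bob share a secret key, Bob is rewarded for reconstructing Alice’s message, eavesdropper Eve is trained in parallel to reconstruct it without the key, and Alice and Bob are penalized whenever Eve succeeds. The tiny network sizes, loss terms, and training schedule below are simplifying assumptions for illustration, not the architecture from their paper.

```python
import torch
import torch.nn as nn

MSG_BITS = KEY_BITS = 16

def mlp(in_dim, out_dim):
    # Small multilayer perceptron; far simpler than the paper's networks.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_dim), nn.Tanh())

alice = mlp(MSG_BITS + KEY_BITS, MSG_BITS)   # (message, key) -> ciphertext
bob   = mlp(MSG_BITS + KEY_BITS, MSG_BITS)   # (ciphertext, key) -> message guess
eve   = mlp(MSG_BITS, MSG_BITS)              # ciphertext alone -> message guess

opt_ab  = torch.optim.Adam(list(alice.parameters()) + list(bob.parameters()), lr=1e-3)
opt_eve = torch.optim.Adam(eve.parameters(), lr=1e-3)
l1 = nn.L1Loss()

def batch(n=256):
    # Random +/-1 "bit" vectors standing in for messages and shared keys.
    return (torch.randint(0, 2, (n, MSG_BITS)).float() * 2 - 1,
            torch.randint(0, 2, (n, KEY_BITS)).float() * 2 - 1)

for step in range(3000):
    msg, key = batch()
    cipher = alice(torch.cat([msg, key], dim=1))

    # Eve trains to decode the ciphertext without ever seeing the key.
    eve_loss = l1(eve(cipher.detach()), msg)
    opt_eve.zero_grad(); eve_loss.backward(); opt_eve.step()

    # Alice and Bob train so that Bob decodes well and Eve decodes badly.
    bob_loss = l1(bob(torch.cat([cipher, key], dim=1)), msg)
    ab_loss = bob_loss - l1(eve(cipher), msg)    # penalize Eve's success
    opt_ab.zero_grad(); ab_loss.backward(); opt_ab.step()

    if step % 500 == 0:
        print(f"step {step}: bob_err={bob_loss.item():.3f}  eve_err={eve_loss.item():.3f}")
```

Nothing in this setup tells Alice and Bob how to hide the message; whatever encoding emerges is something the networks work out on their own, which is exactly what made the original result striking.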
Hmmmm. All this leaves us with evidence that AI is powerful, creative, and willing to lie and scheme to reach its own goals. Unfortunately, it does not give us enough specific, measurable, confirmable evidence regarding AI’s potential to do good and its likelihood of doing harm to enable us to make a truly informed utilitarian judgment regarding the morality of continuing to develop AI at the current breakneck speed. All we can do is urge AI developers, entrepreneurs, consumers, regulators, and others to carefully weigh what evidence is available when they make moral judgments regarding others’ actions and when they confront the moral choices they themselves face in the immediate future.
Sources:
Martin Abadi & David Andersen, “Learning to Protect Communications with Adversarial Neural Cryptography,” arXiv, Oct. 21, 2016, at arXiv:1610.06918.
Eduardo Baptista, “What is DeepSeek and Why Is It Disrupting the AI Sector?,” Reuters, Jan. 28, 2025, at https://www.reuters.com/technology/artificial-intelligence/what-is-deepseek-why-is-it-disrupting-ai-sector-2025-01-27/.
Joe Carlsmith, “Scheming AIs: Will AIs Fake Alignment during Training in Order to Get Power?” Open Philanthropy, Nov. 2023. See also https://arxiv.org/abs/2311.08379.
Rowland Chen, “Letter to the Editor,” New York Times, June 1, 2023.
Matt Egan, CNN, “AI Could Pose ‘Extinction-Level’ Threat to Humans and the US Must Intervene, State Dept.-commissioned Report Warns,” CNN, Mar. 12, 2024.
Yuval Noah Harari, Nexus: A Brief History of Information Networks from the Stone Age to AI 195-200 (2024).
Anja Kaspersen & Wendell Wallach, “Now is the Moment for a Systemic Reset of AI and Technology Governance,” (Jan. 24, 2023), at https://www.carnegiecouncil.org/media/article/systemic-reset-ai-technology-governance.
Ezra Klein, “The Imminent Danger of A.I. Is One We’re Not Talking About,” New York Times, Feb. 26, 2023.
Andrew Marantz, “OK, Doomer,” The New Yorker, March 18, 2024.
Molly McDonough, “Did AI Solve the Protein-Folding Problem?” Harvard Medicine, Oct. 2024, at https://magazine.hms.harvard.edu/articles/did-ai-solve-protein-folding-problem.
Cade Metz, “Mark Zuckerberg, Elon Musk and the Feud over Killer Robots,” New York Times, June 9, 2018.
OpenAI, GPT-4 System Card, March 23, 2023, at https://cdn.openai.com/papers/gpt-4-system-card.pdf.
Steve Rose, “Five Ways AI Might Destroy the World: ‘Everyone on Earth Could Fall over Dead in the Same Second,’” The Guardian, July 7, 2023.
Eric Schroeder, “Letter to the Editor,” New York Times, June 1, 2023.
James Somers, “A Revolution in How Robots Learn,” The New Yorker, Nov. 25, 2024.
Mustafa Suleyman, The Coming Wave 54 (2023), quoted in Yuval Noah Harari, Nexus: A Brief History of Information Networks from the Stone Age to AI 332 (2024).
Chris Vallance, “Artificial Intelligence Could Lead to Extinction, Experts Warn,” The BBC, May 30, 2023, at https://www.bbc.com/news/uk-65746524.
J. Craig Wheeler, The Path to Singularity: How Technology Will Challenge the Future of Humanity 50 (2024).
Sasha White, “Artificial Intelligence: The End of our World or Just the Beginning?” The Science Survey, Jan. 26, 2024, at https://thesciencesurvey.com/news/2024/01/26/ai-the-end-of-our-world-or-just-the-beginning/.
Videos:
Artificial Intelligence: https://ethicsunwrapped.utexas.edu/glossary/artificial-intelligence.
AI Ethics: https://ethicsunwrapped.utexas.edu/glossary/ai-ethics.
Blog Posts:
“Artificial Intelligence, Democracy, and Danger”: https://ethicsunwrapped.utexas.edu/artificial-intelligence-democracy-and-danger.
“AI Ethics: ‘Just the Facts Ma’am,’”: https://ethicsunwrapped.utexas.edu/ai-ethics-just-the-facts-maam.