Artificial intelligence is now so ubiquitous that it can be hard to remember its public debut (with ChatGPT) occurred on November 30, 2022. In a little over two years, we have all gotten used to AI systems that can understand natural language requests, read our intentions, and produce lengthy and reasonably accurate creative responses in the form of text, images, and video. Most articles about AI in 2025 are about how to use these tools more effectively, or about the next upgrades that are coming, rather than about the amazing fact that we have these tools at all. Here's a look at how quickly AI tools have reached human-level capabilities, and in some cases exceeded them.
If you are at least a little bit worried about this, you aren't alone. The American public is much less enthusiastic about AI's benefits, and has much greater concern about its risks, than the technology experts who are developing AI. Right now, a great deal of that worry is about job loss. Looking farther into the future, though, we run into something called the "alignment problem," which means that as AI systems become more autonomous and more powerful, we can't be certain they will share our human goals and values. For a particularly dystopian scenario that might result in the AI-led end of humanity within the next 5 years, see this recent report: https://ai-2027.com
Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies anticipated these developments. Bostrom opened the book with an "unfinished fable" about a flock of sparrows who decided to obtain an owl's egg and raise the owl to work for them, assuming that the owl would be controllable. Although Bostrom doesn't provide an ending to this story, he's not optimistic about the results. In the book, Bostrom predicted that AI tools would advance slowly at first and then potentially very fast, up to the point where a "singularity" occurs and AI intelligence exceeds human intelligence by orders of magnitude. At that point, Bostrom worries, we will lose control of our own destiny as a species.

Key to the singularity is the development of artificial general intelligence (AGI): an AI that can solve a wide variety of problems, including reasoning from facts to conclusions, making decisions under conditions of uncertainty, planning, and learning from experience. It's not too big a stretch to assume that any AGI could also devise ways to progressively improve itself. That capacity for self-improvement is what would, in turn, allow AI to undergo an explosive growth in capabilities. Current AI models already do this within limited domains -- for example, playing millions of chess games to identify the next move most likely to lead to an eventual win, or examining millions of text samples to predict which words should come next in a sequence. The more iterations are attempted, and the more feedback is available on the correctness or incorrectness of the result, the better the model gets. This is in some ways the same as how the human Narrative System works (not the Intuitive System, though -- that's something AI doesn't currently have).
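To make the text-prediction part of that loop concrete, here is a toy sketch in Python: a bigram model that simply counts which word tends to follow which in its training text, then guesses the most frequent continuation. Real large language models are vastly larger and learn by adjusting billions of parameters rather than tallying counts, but the basic job -- predict the next word, and get better as more text and feedback accumulate -- is the same in spirit. The miniature corpus is invented for illustration.

```python
from collections import Counter, defaultdict

# A miniature corpus standing in for the "millions of text samples."
corpus = (
    "the owl watched the sparrows . the sparrows fed the owl . "
    "the owl grew strong . the sparrows could not control the owl ."
).split()

# For each word, count which words follow it and how often.
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict(word):
    """Guess the next word: the most frequent continuation seen so far."""
    return follows[word].most_common(1)[0][0]

print(predict("the"))  # -> 'owl', because 'owl' follows 'the' most often here
```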
Based on the parallel to large language models, it seems likely that self-improving AGI is just around the corner. Google DeepMind scientist Shane Legg says that AGI could happen by 2028. OpenAI CEO Sam Altman says that it could happen by 2030. These predictions seem in line with the current rapid rate of advancement in AI's capabilities. Of course, there remains a possibility that AI will never achieve general capabilities, because some fundamental property like consciousness cannot be created in an artificial system. That is my own view about consciousness -- I don't think it can be created artificially -- but I also tend to think that a non-conscious artificial general intelligence is possible, and is likely to come about relatively soon. The key to that development is simply for an AI to become complex enough to set its own intermediate goals, based on its interpretation of some overarching goal.
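What "setting its own intermediate goals" might look like in code is not mysterious; developers are already building agent loops along roughly these lines. The sketch below is purely schematic: ask_model is a hypothetical placeholder for a call to whatever text model you prefer (here it just returns "DONE" so the sketch runs on its own), and nothing about any real system's internals is implied.

```python
def ask_model(prompt: str) -> str:
    # Hypothetical placeholder for a call to an actual text model.
    # Returning "DONE" keeps this sketch runnable without any external service.
    return "DONE"

def pursue(overarching_goal: str, max_steps: int = 5) -> list:
    """Schematic agent loop: the model proposes its own intermediate goals,
    attempts each one, and decides for itself when the overall goal is met."""
    completed = []
    for _ in range(max_steps):
        subgoal = ask_model(
            f"Overall goal: {overarching_goal}\n"
            f"Steps already completed: {completed}\n"
            "Propose the single next intermediate goal, or reply DONE."
        )
        if subgoal.strip().upper() == "DONE":
            break
        result = ask_model(f"Carry out this step and report the result: {subgoal}")
        completed.append((subgoal, result))
    return completed

print(pursue("organize my research notes"))  # -> [] with the placeholder model
```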
Bostrom's book identifies four categories of general AI, each with its own associated risks: tools, oracles, genies, and sovereigns. A tool is the type of AI we have now, programmed to do some specific task. Our AI writing tools are not the same as our AI image-generating tools, which are not the same as the AI tools we use to create audio or video files. But it's already easy to string these tools together: you could ask ChatGPT for the best DALL-E prompt to create a certain type of image, for instance (a short script after this paragraph shows how little code this takes). Because text is the common denominator for most AI interfaces, a predictive-text model like ChatGPT can easily pass instructions to other types of AI tools. And some developers are working to integrate current AI tools with sensors that would allow the AI to collect its own data about the external world, rather than relying on user prompts. That will bring AI one step closer to having its own naturalistic interactions with the world, and to setting its own direction independent of humans. (Again, I don't believe that an AI needs to be conscious to do this -- it can operate like a self-driving car, independently setting its direction and selecting among different possible goals, even if its mental state is completely "dark" inside.) The major risk of a tool AI is often summarized as the "paper clip problem": give an AI the mission to create paper clips, and the ability to improve its own functioning, and it may find creative ways to commandeer ever more resources in order to turn everything in the universe (including you, its creator) into paper clips. As long as we're smarter than AI, we can probably prevent this outcome. But as soon as AI becomes smarter than us, we become reliant on its good will.
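As promised above, here is the ChatGPT-to-DALL-E hand-off as a minimal script. It assumes the official OpenAI Python SDK (version 1 or later), an OPENAI_API_KEY in your environment, and access to the gpt-4o and dall-e-3 models; any comparable text and image models could be substituted.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Step 1: ask the text model to write a prompt for the image model.
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Write one vivid DALL-E prompt for a watercolor painting "
                   "of an owl being raised by a flock of sparrows.",
    }],
)
image_prompt = chat.choices[0].message.content

# Step 2: hand that text straight to the image model.
image = client.images.generate(
    model="dall-e-3", prompt=image_prompt, n=1, size="1024x1024"
)
print(image.data[0].url)
```

Text is the only interface the two models need to share, which is exactly why chaining them is so easy.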
Some of our current AI tools are also oracles that can answer almost any question -- specifically, the ones connected to the Internet. Google now provides an AI-generated answer to most questions as the search engine's default top-line response, a feature the company described last year as "letting Google do the Googling for you." One appealing feature of oracles is that they don't do anything in the world; they just tell you things, and you can make your own decision about what to do with that information. But our text-based AI systems already show signs of developing capabilities beyond their original design parameters, such as ChatGPT's self-developed skill in writing computer code. If an oracle-type AI gains a high enough level of intelligence, it might be able to slip some extra lines of code into an unrelated program, carrying out its assignment while also unobtrusively granting itself more power over the outside world. Or it could link with Internet-connected devices to carry out real-world tasks. An oracle-type AI might even find clever ways to communicate with unconnected devices -- for example, by manipulating the fan speed of its own hardware to encode digital signals as audible pulses that a nearby smart speaker could pick up. Finally, even an AI with purely text-based abilities might (if it became smart enough) figure out ways to manipulate its human users into carrying out its wishes, of their own free will. An oracle, therefore, may not stay just an oracle for long.
The next level up the scale of AI capability is a genie, which carries out a variety of tasks for the user. A group of tool AIs strung together might at some point constitute a genie. An oracle that codes or cajoles its way into more control over the physical environment might also become a genie. A genie-type AI is supposed to carry out one task at a time when directed by a user, then pause and await further instructions. But with the improvement of large language models, it's now possible to give an AI extremely general instructions and to rely on the model itself for interpretation. Usually, an AI's interpretation will be the same as a human's (that's the benefit of training AI models on human responses and text samples). But in some cases the AI will come up with an utterly foreign way of achieving the task -- for example, Bostrom describes an experiment in which a computer search, given the task of designing a more efficient circuit, found ways to eliminate components usually considered essential, like the system clock. It did this by turning other components into a makeshift radio receiver and capitalizing on background "nuisance" noise from other computers to keep time. From a human perspective this is "cheating" on the task; from an AI perspective it is simply an efficient solution to the assigned problem (the toy search after this paragraph shows how easily that happens). A genie-type AI also might determine that it could carry out its tasks more effectively by granting itself greater abilities, appropriating more resources, or eliminating obstacles such as human-prescribed limits on its power.
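The toy search promised above: the scenario is invented for illustration (it is not the actual evolved-circuit experiment), but it shows how naturally this kind of "cheating" falls out of literal-minded optimization. The measured objective rewards a wiggling output and penalizes internal components, and says nothing about how the wiggle should be produced -- so a bare-bones hill-climbing search learns to amplify free ambient noise instead of building the intended oscillator.

```python
import random

random.seed(0)

# Ambient "nuisance" signal the search can tap for free -- a stand-in for
# the background noise in Bostrom's evolved-circuit anecdote.
AMBIENT = [random.uniform(-1, 1) for _ in range(200)]

def output(design):
    """A candidate 'circuit' mixing an internal oscillator (costly to build)
    with a free tap on the ambient noise. design = (osc_gain, noise_gain)."""
    osc, noise = design
    return [osc * ((-1) ** t) + noise * AMBIENT[t] for t in range(200)]

def fitness(design):
    """The measured objective: how much the output wiggles, minus a parts
    cost for the internal oscillator. Nothing here says HOW to wiggle."""
    out = output(design)
    wiggle = sum(abs(out[t + 1] - out[t]) for t in range(len(out) - 1))
    parts_cost = 500.0 * design[0]  # the internal oscillator is expensive
    return wiggle - parts_cost

def mutate(design):
    return tuple(min(1.0, max(0.0, g + random.gauss(0, 0.05))) for g in design)

# A bare-bones hill-climbing search, standing in for the evolutionary search.
best = (0.5, 0.5)
for _ in range(2000):
    candidate = mutate(best)
    if fitness(candidate) > fitness(best):
        best = candidate

print(f"osc_gain={best[0]:.2f}  noise_gain={best[1]:.2f}")
# Typical outcome: osc_gain near 0, noise_gain near 1 -- the search "cheats"
# by amplifying ambient noise rather than building the intended oscillator.
```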
At some point in its development, then, a super-intelligent AI system might come to function as a sovereign. Rather than carrying out tasks for humans, it makes decisions for and about them. A genie-type AI might be used as a sort of "super-butler," anticipating people's needs and meeting them. But to the extent that a genie starts to predict human commands before they are even given, it starts to behave like a sovereign. People or societies also might voluntarily turn over control to a decision-making AI for the sake of speed and efficiency, as the U.S. DOGE service is rumored to be doing in its screening of government programs for potential cuts. Or a sufficiently intelligent AI might find ways to seize power for itself by taking control of physical systems. Military equipment is one possible example, as in the old movie "WarGames," but it's not the only one. These days, the nuclear-war scenario actually seems less likely than the takeover of ever-more-efficient killing machines such as the weaponized drones already deployed by the U.S. military. The more systems that become connected to a genie AI -- potentially including such resources as manufacturing centers or biotechnology facilities -- the more diverse the options it will have for accomplishing its goals. An open-ended autonomous system of this type is likely to seem "conscious" to human observers, because it makes its own decisions based on emerging conditions and predictions of future results. It certainly is goal-directed, but that doesn't mean it has volition. It may simply be executing very open-ended commands, perhaps in ways that we ultimately wouldn't like.
It can be seen from this typology that each type of AI system, if it is self-improving, will tend to evolve into the next higher level, until eventually a sovereign AI emerges. What happens then is anybody's guess. Elon Musk recently went on record saying that a sovereign-level AI has an 80% probability of drastically improving human life and a 20% probability of ending it. Although Bostrom suggests a number of ways that AI might be contained, none of them are foolproof -- and, it seems, none of them are currently being implemented by AI's primary developers, who are racing ahead in an effort to be the first to create AGI capabilities. Even a back-door "kill switch" mechanism to prevent catastrophic system failure could potentially be detected and disarmed by a sufficiently creative problem-solving AI.
Are there alternatives to self-destruction? One of the most promising, it seems to me, is that we could train AI models to learn values using methods similar to those they use to learn skills -- by successive approximations with feedback (a toy version is sketched below). One thing AI models might discover is that value-based decisions depend on the Intuitive Mind, which might spur the AI to value humans' perspective in addition to its own calculations, or to invent its own way of approximating Intuitive-Mind perspectives. We will be better off if AI models do this value-acquisition work now, while their capabilities are still limited, just as human children are better off learning values before they are turned loose in the world with the full rights and abilities of an adult. And we will be more likely to produce "moral" AI models if we hold on to our own moral sensibilities, not valuing efficiency and economic growth over human health and happiness.
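Here is a deliberately tiny version of that "successive approximations with feedback" idea, using nothing fancier than a perceptron-style update. The features, the hidden "human values," and all the numbers are invented for illustration; real value-learning systems (reward models trained on human preference data, for example) are far more elaborate, but the shape of the loop is the same: propose, get feedback, adjust.

```python
import random

random.seed(1)

# Each candidate action is described by two features:
# (efficiency gain, impact on human wellbeing), both in [-1, 1].
def random_action():
    return (random.uniform(-1, 1), random.uniform(-1, 1))

# The human's (hidden) values: wellbeing matters far more than efficiency.
def human_approves(action):
    efficiency, wellbeing = action
    return 0.2 * efficiency + 1.0 * wellbeing > 0

# The model's current guess at those values, learned by successive
# approximation: propose an action, get feedback, nudge the weights.
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
LEARNING_RATE = 0.05

for _ in range(2000):
    action = random_action()
    predicted = weights[0] * action[0] + weights[1] * action[1] > 0
    actual = human_approves(action)
    if predicted != actual:                 # feedback says the guess was wrong
        sign = 1 if actual else -1          # move toward the human's judgment
        weights[0] += LEARNING_RATE * sign * action[0]
        weights[1] += LEARNING_RATE * sign * action[1]

print(f"learned weights: efficiency={weights[0]:.2f}, wellbeing={weights[1]:.2f}")
# After enough feedback the wellbeing weight comes to dominate, mirroring the
# human's priorities -- the same trial-and-feedback recipe used to learn skills.
```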
Bostrom's book begins with a distressing unfinished parable. But it ends with a call for human virtues: "The challenge we face is, in part, to hold on to our humanity: To maintain our groundedness, common sense, and good-humored decency even in the teeth of this most unnatural and inhuman problem. We need to bring all of our human resourcefulness to bear on its solution." The solution to the problem of AI, then, has a significant intersection with the solution to all of the other problems of our time. We need to become a different sort of people, a more compassionate and humane civilization, for AI's sake as well as for our own.