If we successfully choose not to supplant humanity with machines – at least for a while! – what can we do instead? Do we give up the huge promise of AI as a technology? At some level the answer is a simple no: close the Gates to uncontrollable AGI and superintelligence, but do build many other forms of AI, as well as the governance structures and institutions we’ll need to manage them.
But there’s still a great deal to say; making this happen would be a central occupation of humanity. This section explores several key themes:
The triple-intersection diagram gives a good way to delineate what we can call “Tool AI”: AI that is a controllable tool for human use, rather than an uncontrollable rival or replacement. The least problematic AI systems are those that are autonomous but not general or super capable (like an auction bidding bot), or general but not autonomous or capable (like a small language model), or capable but narrow and very controllable (like AlphaGo).124 Those with two intersecting features have wider application but higher risk, and will require major efforts to manage. (Just because an AI system is more of a tool does not mean it is inherently safe, merely that it isn’t inherently unsafe – consider a chainsaw versus a pet tiger.) The Gate must remain closed to (full) AGI and superintelligence at the triple intersection, and enormous care must be taken with AI systems approaching that threshold.
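To make the taxonomy concrete, here is a minimal, purely illustrative Python sketch that tiers hypothetical systems by which of the three properties (autonomy, generality, high capability) they combine; the names, flags, and tier labels are assumptions for illustration, not a proposed standard.

```python
# Illustrative sketch only: toy risk tiering of AI systems by the three
# properties discussed above (autonomy, generality, capability).
from dataclasses import dataclass

@dataclass
class AISystem:
    name: str
    autonomous: bool       # acts without a human in the loop
    general: bool          # competent across a wide range of domains
    highly_capable: bool   # superhuman or near-superhuman at what it does

def risk_tier(sys: AISystem) -> str:
    count = sum([sys.autonomous, sys.general, sys.highly_capable])
    if count == 3:
        return "closed Gate: full AGI / superintelligence territory"
    if count == 2:
        return "high risk: usable, but requires major management effort"
    return "tool AI: manageable with ordinary safety engineering"

for s in [
    AISystem("auction bidding bot", autonomous=True, general=False, highly_capable=False),
    AISystem("small language model", autonomous=False, general=True, highly_capable=False),
    AISystem("AlphaGo-style game player", autonomous=False, general=False, highly_capable=True),
]:
    print(f"{s.name}: {risk_tier(s)}")
```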
But this leaves a lot of powerful AI! We can get huge utility out of smart and general passive “oracles” and narrow systems, general systems at human but not superhuman level, and so on. Many tech companies and developers are actively building these sorts of tools and should continue; like most people they are implicitly assuming the Gates to AGI and superintelligence will be closed.125
As well, AI systems can be effectively combined into composite systems that maintain human oversight while enhancing capability. Rather than relying on inscrutable black boxes, we can build systems where multiple components – including both AI and traditional software – work together in ways that humans can monitor and understand.126 While some components might be black boxes, none would be close to AGI – only the composite system as a whole would be both highly general and highly capable, and in a strictly controllable way.127
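As a rough illustration of what such a composite system might look like, the following Python sketch (with hypothetical stand-in functions) chains narrow components behind a human approval gate, with every step logged for oversight; it shows the shape of the idea, not a real implementation.

```python
# Illustrative sketch (hypothetical components): a composite pipeline in which
# narrow AI components and ordinary software are chained, with a human approval
# gate before any consequential action. No single component is close to AGI;
# only the logged, inspectable composite is broadly capable.
from typing import Callable

def summarize(document: str) -> str:          # stand-in for a narrow model
    return document[:200]

def draft_action(summary: str) -> str:        # stand-in for another component
    return f"Proposed action based on: {summary!r}"

def human_approves(proposal: str) -> bool:    # the human stays in the loop
    return input(f"{proposal}\nApprove? [y/N] ").strip().lower() == "y"

def run_pipeline(document: str, execute: Callable[[str], None]) -> None:
    audit_log = []
    summary = summarize(document)
    audit_log.append(("summarize", summary))
    proposal = draft_action(summary)
    audit_log.append(("draft", proposal))
    if human_approves(proposal):              # key decision is never delegated
        execute(proposal)
        audit_log.append(("executed", proposal))
    else:
        audit_log.append(("rejected", proposal))
    for step, detail in audit_log:            # every step is monitorable
        print(step, "->", detail)
```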
What does “strictly controllable” mean? A key idea of the “Tool” framework is to allow systems – even quite general and powerful ones – that are guaranteed to remain under meaningful human control. This entails two aspects. First is a design consideration: humans should be deeply and centrally involved in what the system is doing, without delegating key decisions to the AI. This is the character of most current AI systems. Second, to the degree that AI systems are autonomous, they must come with guarantees that limit their scope of action. A guarantee should be a number characterizing the probability of something happening, and a reason to believe that number. This is what we demand in other safety-critical fields, where numbers like “mean time between failures” and expected numbers of accidents are computed, supported, and published in safety cases.128 The ideal number of failures is zero, of course. And the good news is that we might get quite close, albeit with quite different AI architectures, by drawing on ideas from formally verified properties of programs (including AI). The idea, explored at length by Omohundro, Tegmark, Bengio, Dalrymple, and others (see here and here), is to construct a program with certain properties (for example: that a human can shut it down) and formally prove that those properties hold. This can be done now for quite short programs and simple properties, but the (coming) power of AI-powered proof software could allow it for much more complex programs (e.g. wrappers) and even AI itself. This is a very ambitious program, but as pressure grows on the Gates, we’re going to need some powerful materials reinforcing them. Mathematical proof may be one of the few that are strong enough.
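To give a flavor of what a formally proved property looks like, here is a toy Lean 4 sketch (entirely illustrative, with invented names) in which a trivial controller's "a shutdown command always halts it" property is stated and machine-checked; real verified-AI proposals aim at vastly richer programs and properties.

```lean
-- Illustrative only: a toy controller for which the "a human can shut it down"
-- property is stated and machine-checked, in the spirit of the verified-AI
-- proposals cited above. All names here are invented for the example.
inductive Cmd where
  | run
  | shutdown

structure SysState where
  halted : Bool

-- One step of the controller: a shutdown command always sets the halted flag.
def step (s : SysState) (c : Cmd) : SysState :=
  match c with
  | Cmd.shutdown => { s with halted := true }
  | Cmd.run      => s

-- The safety property, proved for every possible prior state.
theorem shutdown_always_halts (s : SysState) :
    (step s Cmd.shutdown).halted = true := rfl
```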
With AI progress redirected, Tool AI would still be an enormous industry. In terms of hardware, even with compute caps to prevent superintelligence, training and inference in smaller models will still require huge amounts of specialized components. On the software side, defusing the explosion in AI model and computation size should simply lead companies to redirect resources toward making smaller systems better, more diverse, and more specialized, rather than simply bigger.129 There would be plenty of room – more, probably – for all those money-making Silicon Valley startups.130
Intelligence, whether biological or machine, can be broadly considered as the ability to plan and execute activities that bring about futures more in line with a set of goals. As such, intelligence is of enormous benefit when used in pursuit of wisely chosen goals. Artificial intelligence is attracting huge investments of time and effort largely because of its promised benefits. So we should ask: to what degree would we still garner the benefits of AI if we prevent its runaway to superintelligence? The answer: we may lose surprisingly little.
Consider first that current AI systems are already very powerful, and we have really only scratched the surface of what can be done with them.131 They are reasonably capable of “running the show” in terms of “understanding” a question or task presented to them, and what it would take to answer this question or do that task.
Next, much of the excitement about modern AI systems is due to their generality; but some of the most capable AI systems – such as ones that generate or recognize speech or images, do scientific prediction and modeling, play games, etc. – are much narrower and well “within the Gates” in terms of computation.132 These systems are super-human at the particular tasks they do. They may have edge-case133 (or exploitable) weaknesses due to their narrowness; however, totally narrow and fully general are not the only options: there are many architectures in between.134
These AI tools can greatly speed advancement in other positive technologies, without AGI. To do better nuclear physics, we don’t need AI to be a nuclear physicist – we have those! If we want to accelerate medicine, give the biologists, medical researchers, and chemists powerful tools. They want them and will use them to enormous gain. We don’t need a server farm full of a million digital geniuses; we have millions of humans whose genius AI can help bring out. Yes, it will take longer to get immortality and the cure to all diseases. This is a real cost. But even the most promising health innovations would be of little use if AI-driven instability leads to global conflict or societal collapse. We owe it to ourselves to give AI-empowered humans a go at the problem first.
And suppose there is, in fact, some enormous upside to AGI that cannot be obtained by humanity using in-Gate tools. Do we forfeit it by never building AGI and superintelligence? In weighing the risks and rewards here, there is an enormous asymmetric benefit in waiting versus rushing: if we wait until AGI can be built in a guaranteed safe and beneficial way, almost everyone will still get to reap the rewards; if we rush, it could be – in the words of OpenAI CEO Sam Altman – lights out for all of us.
But if non-AGI tools are potentially so powerful, can we manage them? The answer is a clear…maybe.
But it will not be easy. Current cutting-edge AI systems can greatly empower people and institutions in achieving their goals. This is, in general, a good thing! However, there are natural dynamics of having such systems at our disposal – suddenly, and without much time for society to adapt – that pose serious risks which need to be managed. It is worth discussing a few major classes of such risks, and how they may be diminished, assuming a Gate closure.
One class of risks is of high-powered Tool AI granting access to knowledge or capability that had previously been tied to a person or organization, making a combination of high capability and high loyalty available to a very broad array of actors. Today, with enough money, a person of ill intent could hire a team of chemists to design and produce new chemical weapons – but it isn’t so easy to have that money, or to find and assemble the team and convince them to do something clearly illegal, unethical, and dangerous. To prevent AI systems from playing such a role, improvements on current methods may well suffice,135 as long as those systems, and access to them, are responsibly managed. On the other hand, if powerful systems are released for general use and modification, any built-in safety measures are likely removable. So to avoid risks in this class, strong restrictions on what can be publicly released – analogous to restrictions on details of nuclear, explosive, and other dangerous technologies – will be required.136
A second class of risks stems from the scaling up of machines that act like or impersonate people. At the level of harm to individual people, these risks include much more effective scams, spam, and phishing, and the proliferation of non-consensual deepfakes.137 At a collective level, they include disruption of core social processes like public discussion and debate, our societal information and knowledge gathering, processing, and dissemination systems, and our political choice systems. Mitigating this risk is likely to involve (a) laws restricting the impersonation of people by AI systems, and holding liable AI developers that create systems that generate such impersonations, (b) watermarking and provenance systems that identify and classify (responsibly) generated AI content, and (c) new socio-technical epistemic systems that can create a trusted chain from data (e.g. cameras and recordings) through facts, understanding, and good world-models.138 All of this is possible, and AI can help with some parts of it.
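As a small illustration of item (b), the following Python sketch attaches a signed provenance record to a piece of generated content and verifies it later; the key, field names, and format are hypothetical, and real provenance standards (such as C2PA) are far more elaborate.

```python
# Illustrative sketch only: attaching a verifiable provenance record to
# AI-generated content with an HMAC signature. Key and field names are
# hypothetical; real systems use richer metadata and public-key signatures.
import hashlib, hmac, json, time

PROVIDER_KEY = b"hypothetical-secret-key-held-by-the-provider"

def sign_generated_content(text: str, model_id: str) -> dict:
    record = {
        "content_sha256": hashlib.sha256(text.encode()).hexdigest(),
        "model_id": model_id,
        "generated_at": int(time.time()),
        "ai_generated": True,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_record(text: str, record: dict) -> bool:
    claimed = dict(record)
    signature = claimed.pop("signature", "")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(PROVIDER_KEY, payload, hashlib.sha256).hexdigest()
    return (hmac.compare_digest(signature, expected)
            and claimed["content_sha256"] == hashlib.sha256(text.encode()).hexdigest())
```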
A third general risk is that to the degree some tasks are automated, the labor of the humans presently doing those tasks loses financial value. Historically, automating tasks has made the things enabled by those tasks cheaper and more abundant, while sorting the people previously doing those tasks into those still involved in the automated version (generally at higher skill and pay) and those whose labor is worth less or little. On net, it is difficult to predict whether more or less human labor will be required in the resulting larger but more efficient sector. In parallel, the automation dynamic tends to increase inequality and general productivity, decrease the cost of certain goods and services (via efficiency increases), and increase the cost of others (via cost disease). For those on the disfavored side of the inequality increase, it is deeply unclear whether the cost decrease in some goods and services outweighs the increase in others and leads to overall greater well-being. So how will this go for AI? Because of the relative ease with which human intellectual labor can be replaced by general AI, we can expect a rapid version of this dynamic with human-competitive general-purpose AI.139 If we close the Gate to AGI, many fewer jobs will be wholesale replaced by AI agents; but huge labor displacement is still probable over a period of years.140 To avoid widespread economic suffering, it will likely be necessary both to implement some form of universal basic assets or income, and to engineer a cultural shift toward valuing and rewarding human-centric labor that is harder to automate (rather than letting labor prices drop as labor pushed out of other parts of the economy becomes available). Other constructs, such as “data dignity” (in which the human producers of training data are automatically accorded royalties for the value created by that data in AI), may also help. Automation by AI also has a second potential adverse effect: inappropriate automation. Along with applications where AI simply does a worse job, this would include those where AI systems are likely to violate moral, ethical, or legal precepts – for example in life-and-death decisions, and in judicial matters. These must be treated by applying and extending our current legal frameworks.
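As a toy illustration of the “data dignity” royalties mentioned above, this Python sketch splits a fixed share of revenue among data contributors in proportion to an attribution score; all names, numbers, and the attribution scores themselves are hypothetical, and estimating attribution fairly is the genuinely hard part.

```python
# Illustrative sketch only: a toy "data dignity" royalty split, distributing a
# fixed share of AI revenue to data contributors in proportion to an assumed,
# externally estimated attribution score. All figures are hypothetical.
def distribute_royalties(revenue: float, royalty_share: float,
                         attribution: dict[str, float]) -> dict[str, float]:
    pool = revenue * royalty_share
    total = sum(attribution.values())
    return {contributor: pool * score / total
            for contributor, score in attribution.items()}

payouts = distribute_royalties(
    revenue=1_000_000.0,
    royalty_share=0.05,   # 5% of revenue set aside for data providers
    attribution={"alice": 3.0, "bob": 1.0, "news_archive": 6.0},
)
print(payouts)   # {'alice': 15000.0, 'bob': 5000.0, 'news_archive': 30000.0}
```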
Finally, a significant threat of in-Gate AI is its use in personalized persuasion, attention capture, and manipulation. We have seen in social media and other online platforms the growth of a deeply entrenched attention economy (where online services battle fiercely for user attention) and of “surveillance capitalism” (in which user information and profiling are added to the commodification of attention). It is all but certain that more AI will be put into the service of both. AI is already heavily used in addictive feed algorithms, but this will evolve into addictive AI-generated content, customized to be compulsively consumed by a single person. And that person’s input, responses, and data will be fed back into the attention/advertising machine to continue the vicious cycle. Moreover, as AI helpers provided by tech companies become the interface for more of online life, they will likely replace search engines and feeds as the mechanism by which persuasion and monetization of customers occur. Our society’s failure to control these dynamics so far does not bode well. Some of this dynamic may be lessened via regulations concerning privacy, data rights, and manipulation. Getting more to the problem’s root may require different perspectives, such as that of loyal AI assistants (discussed below).
The upshot of this discussion is one of hope: in-Gate, tool-based systems – at least as long as they stay comparable in power and capability to today’s most cutting-edge systems – are probably manageable if there is the will and coordination to do so. Decent human institutions, empowered by AI tools,141 can do it. We could also fail at it. But it is hard to see how allowing more powerful systems would help – other than by putting them in charge and hoping for the best.
Races for AI supremacy – driven by national security or other motivations – drive us toward uncontrolled powerful AI systems that would tend to absorb, rather than bestow, power. An AGI race between the US and China is a race to determine which nation superintelligence gets first.
So what should those in charge of national security do instead? Governments have strong experience in building controllable and secure systems, and they should double down on doing so in AI, supporting the sort of infrastructure projects that succeed best when done at scale and with government imprimatur.
Instead of a reckless “Manhattan project” toward AGI,142 the US government could launch an Apollo project for controllable, secure, trustworthy systems. This could include for example:
In general, there is an enormous attack surface on our society that makes us vulnerable to risks from AI and its misuse. Protecting from some of these risks will require government-sized investment and standardization. These would provide vastly more security than pouring gasoline on the fire of races toward AGI. And if AI is going to be built into weaponry and command-and-control systems, it is crucial that the AI be trustworthy and secure, which current AI simply is not.
This essay has focused on the idea of human control of AI and its potential failure. But another valid lens through which to view the AI situation is concentration of power. The development of very powerful AI threatens to concentrate power either into the very few and very large corporate hands that have developed and will control it, or into governments using AI as a new means to maintain their own power and control, or into the AI systems themselves. Or some unholy mix of the above. In any of these cases most of humanity loses power, control, and agency. How might we combat this?
The very first and most important step, of course, is a Gate closure to smarter-than-human AGI and superintelligence. These systems can directly replace humans and groups of humans. If they are under corporate or government control they will concentrate power in those corporations or governments; if they are “free” they will concentrate power in themselves. So let’s assume the Gates are closed. Then what?
One proposed solution to power concentration is “open-source” AI, in which model weights are freely or widely available. But as mentioned earlier, once a model is open, most safety measures or guardrails can be (and generally are) stripped away. So there is an acute tension between decentralization on the one hand, and safety, security, and human control of AI systems on the other. There are also reasons to be skeptical that open models will by themselves meaningfully combat power concentration in AI, any more than they have in operating systems (still dominated by Microsoft, Apple, and Google despite open alternatives).144
Yet there may be ways to square this circle – to centralize and mitigate risks while decentralizing capability and economic reward. This requires rethinking both how AI is developed and how its benefits are distributed.
New models of public AI development and ownership would help. This could take several forms: government-developed AI (subject to democratic oversight),145 nonprofit AI development organizations (like Mozilla for browsers), or structures enabling very widespread ownership and governance. The key is that these institutions would be explicitly chartered to serve the public interest while operating under strong safety constraints.146 Well-crafted regulatory and standards/certification regimes will also be vital, so that AI products offered by a vibrant market stay genuinely useful rather than exploitative toward their users.
In terms of economic power concentration, we can use provenance tracking and “data dignity” to ensure economic benefits flow more widely. In particular, most AI power now (and in the future if we keep the Gates closed) stems from human-generated data, whether direct training data or human feedback. If AI companies were required to compensate data providers fairly,147 this could at least help distribute the economic rewards more broadly. Beyond this, another model could be public ownership of significant fractions of large AI companies. For example, governments able to tax AI companies could invest a fraction of receipts into a sovereign wealth fund that holds stock in the companies, and pays dividends to the populace.148
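Purely as illustrative arithmetic for the sovereign wealth fund idea, the following Python sketch shows how tax receipts might flow into a fund and out as a per-capita dividend; every figure is hypothetical.

```python
# Illustrative arithmetic only (all figures hypothetical): a sovereign wealth
# fund seeded by a fraction of AI tax receipts, holding equity and paying a
# per-capita dividend, as sketched in the paragraph above.
def annual_dividend_per_person(ai_tax_receipts: float,
                               fraction_invested: float,
                               fund_value: float,
                               payout_rate: float,
                               population: int) -> tuple[float, float]:
    fund_value += ai_tax_receipts * fraction_invested   # new contribution this year
    payout = fund_value * payout_rate                    # sustainable annual draw
    return fund_value, payout / population

fund, dividend = annual_dividend_per_person(
    ai_tax_receipts=50e9,     # hypothetical AI-related tax revenue
    fraction_invested=0.5,    # half goes into the fund
    fund_value=200e9,         # existing fund holdings
    payout_rate=0.04,         # 4% annual payout
    population=330_000_000,
)
print(f"fund: ${fund/1e9:.0f}B, dividend: ${dividend:.2f} per person per year")
```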
Crucial in these mechanisms is to use the power of AI itself to help distribute power better, rather than simply fighting AI-driven power concentration using non-AI means. One powerful approach would be through well-designed AI assistants that operate with genuine fiduciary duty to their users – putting users’ interests first, especially above corporate providers’.149 These assistants must be truly trustworthy, technically competent yet appropriately limited based on use case and risk level, and widely available to all through public, nonprofit, or certified for-profit channels. Just as we would never accept a human assistant who secretly works against our interests for another party, we should not accept AI assistants that surveil, manipulate, or extract value from their users for corporate benefit.
Such a transformation would fundamentally alter the current dynamic in which individuals are left to negotiate alone with vast (AI-powered) corporate and bureaucratic machines that prioritize value extraction over human welfare. While there are many possible approaches to redistributing AI-driven power more broadly, none will emerge by default: they must be deliberately engineered and governed, with mechanisms like fiduciary requirements, public provision, and tiered access based on risk.
Approaches to mitigate power concentration can face significant headwinds from incumbent powers.150 But there are paths toward AI development that don’t require choosing between safety and concentrated power. By building the right institutions now, we could ensure that AI’s benefits are widely shared while its risks are carefully managed.
Our current governance structures are struggling: they are slow to respond, often captured by special interests, and increasingly distrusted by the public. Yet this is not a reason to abandon them – quite the opposite. Some institutions may need replacing, but more broadly we need new mechanisms that can enhance and supplement our existing structures, helping them function better in our rapidly evolving world.
Much of our institutional weakness stems not from formal government structures, but from degraded social institutions: our systems for developing shared understanding, coordinating action, and conducting meaningful discourse. So far, AI has accelerated this degradation, flooding our information channels with generated content, pointing us to the most polarizing and divisive content, and making it harder to distinguish truth from fiction.
But AI could actually help rebuild and strengthen these social institutions. Consider three crucial areas:
First, AI could help restore trust in our epistemic systems – our ways of knowing what is true. We could develop AI-powered systems that track and verify the provenance of information, from raw data through analysis to conclusions. These systems could combine cryptographic verification with sophisticated analysis to help people understand not just whether something is true, but how we know it’s true.151 Loyal AI assistants could be charged with following the details to ensure that they check out.
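As a minimal illustration of such a provenance trail, this Python sketch chains hashed records from raw data through analysis to a conclusion, so that any tampering or missing link is detectable; the field names and example steps are hypothetical, and a real system would add cryptographic signatures and identities.

```python
# Illustrative sketch only: a hash-chained provenance trail from raw data
# through analysis to a published conclusion, so a reader (or a loyal AI
# assistant) can check that each step references exactly the step before it.
import hashlib, json

def _digest(record: dict) -> str:
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append_step(chain: list[dict], kind: str, content: str) -> list[dict]:
    prev = chain[-1]["digest"] if chain else None
    record = {"kind": kind, "content": content, "prev": prev}
    record["digest"] = _digest({k: record[k] for k in ("kind", "content", "prev")})
    return chain + [record]

def verify_chain(chain: list[dict]) -> bool:
    prev = None
    for rec in chain:
        body = {"kind": rec["kind"], "content": rec["content"], "prev": rec["prev"]}
        if rec["prev"] != prev or rec["digest"] != _digest(body):
            return False
        prev = rec["digest"]
    return True

chain = []
chain = append_step(chain, "raw_data", "sensor readings 2025-01-01")
chain = append_step(chain, "analysis", "monthly average computed from the readings")
chain = append_step(chain, "conclusion", "temperatures rose 0.1C month over month")
print(verify_chain(chain))   # True
```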
Second, AI could enable new forms of large-scale coordination. Many of our most pressing problems – from climate change to antibiotic resistance – are fundamentally coordination problems. We’re stuck in situations that are worse than they could be for nearly everyone, because no individual or group can afford to make the first move. AI systems could help by modeling complex incentive structures, identifying viable paths to better outcomes, and facilitating the trust-building and commitment mechanisms needed to get there.
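As a toy example of the kind of incentive structure an AI mediator might model (at vastly larger scale), this Python sketch encodes a two-player stag hunt payoff table, with hypothetical payoffs, and checks which strategy profiles are stable.

```python
# Illustrative sketch only: a toy coordination ("stag hunt") game and a check
# of which strategy profiles are stable. Payoffs are hypothetical; real
# coordination problems involve many parties and much richer structure.
from itertools import product

payoffs = {   # (row strategy, column strategy) -> (row payoff, column payoff)
    ("cooperate", "cooperate"): (4, 4),
    ("cooperate", "defect"):    (0, 3),
    ("defect",    "cooperate"): (3, 0),
    ("defect",    "defect"):    (2, 2),
}
strategies = ["cooperate", "defect"]

def is_equilibrium(row: str, col: str) -> bool:
    r, c = payoffs[(row, col)]
    best_row = all(payoffs[(alt, col)][0] <= r for alt in strategies)
    best_col = all(payoffs[(row, alt)][1] <= c for alt in strategies)
    return best_row and best_col

for row, col in product(strategies, strategies):
    if is_equilibrium(row, col):
        print(f"stable: ({row}, {col}) with payoffs {payoffs[(row, col)]}")
# Both (cooperate, cooperate) and (defect, defect) are stable: the coordination
# problem is moving the group from the worse stable point to the better one.
```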
Perhaps most intriguingly, AI could enable entirely new forms of social discourse. Imagine being able to “talk to a city”152 – not just viewing statistics, but having a meaningful dialogue with an AI system that processes and synthesizes the views, experiences, needs, and aspirations of millions of residents. Or consider how AI could facilitate genuine dialogue between groups that currently talk past each other, by helping each side better understand the other’s actual concerns and values rather than their caricatures of each other.153 Or AI could offer skilled, credibly neutral intermediation of disputes between people or even large groups of people (who could all interact with it directly and individually!). Current AI is totally capable of doing this work, but the tools to do so will not come into being by themselves, or via market incentives.
These possibilities might sound utopian, especially given AI’s current role in degrading discourse and trust. But that’s precisely why we must actively develop these positive applications. By closing the Gates to uncontrollable AGI and prioritizing AI that enhances human agency, we can steer technological progress toward a future where AI serves as a force for empowerment, resilience, and collective advancement.