
Appendixes

Supplementary information, including technical details of compute accounting, an example implementation of a 'gate closure', details for a strict AGI liability regime, and a tiered approach to AGI safety & security standards

Appendix A: Compute accounting technical details

Meaningful compute-based controls require a detailed method for establishing the "ground truth" of the total compute used in training and inference, as well as good approximations to it. Here is an example of how the "ground truth" could be tallied at a technical level; a schematic code sketch follows the definitions and examples below.
Definitions:
Compute causal graph: For a given output O of an AI model, the compute causal graph is the set of digital computations whose results, if changed, could potentially change O. (Inclusion should be assumed conservatively: a computation that occurs earlier in time and has a physically possible causal path to O should be excluded only if there is a clear reason to believe O is independent of it.) This includes computation done by the AI model during inference, as well as computations that went into its input, data preparation, and training. Because any of these may itself be output from an AI model, the graph is constructed recursively, cut off wherever a human has provided a significant change to the input.
Training Compute: The total compute, in FLOP or other units, entailed by the compute causal graph of a neural network (including data preparation, training, fine-tuning, and any other computations).
Output Compute: The total compute in the compute causal graph of a given AI output, including all neural networks (and including their Training Compute) and other computations going into that output.
Inference Compute Rate: In a series of outputs, the rate of change (in FLOP/s or other units) of Output Compute between outputs, i.e. the compute used to produce the next output divided by the time interval between the outputs.
Examples and approximations:
  • For a single neural network trained on human-created data, the Training Compute is simply the total training compute as customarily reported.
  • For such a neural network doing inference at a steady rate, the Inference Compute Rate is approximately the total computation speed, in FLOP/s, of the cluster performing the inference.
  • For model fine-tuning, Training Compute of the complete model is given by the Training Compute of the non-fine-tuned model plus the computation done during fine-tuning and to prepare any data used in fine-tuning.
  • For a distilled model, the Training Compute of the complete model includes training of both the distilled model and the larger model used to provide synthetic data or other training input.
  • If several models are trained, but many “trials” are discarded on the basis of human judgment, these do not count toward the Training or Output Compute of the retained model.
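To make these definitions concrete, here is a minimal sketch, in Python, of how Training Compute, Output Compute, and the Inference Compute Rate could be tallied over a compute causal graph. The node names, FLOP figures, and data structure are illustrative assumptions rather than part of any proposed standard.

```python
# Minimal sketch of the compute-accounting definitions above. The graph
# structure, node names, and FLOP figures are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ComputeNode:
    """One computation in the compute causal graph (e.g. data preparation,
    a training run, or an inference call)."""
    name: str
    flop: float                      # compute performed by this node itself
    parents: List["ComputeNode"] = field(default_factory=list)
    human_cutoff: bool = False       # True where a human significantly changed the input

def output_compute(node: ComputeNode, seen=None) -> float:
    """Output Compute: total FLOP in the causal graph of an output, recursing
    through precursor computations until a human-provided cutoff is reached."""
    if seen is None:
        seen = set()
    if id(node) in seen:             # count a shared precursor (e.g. a base model) once
        return 0.0
    seen.add(id(node))
    total = node.flop
    if not node.human_cutoff:
        for parent in node.parents:
            total += output_compute(parent, seen)
    return total

def inference_compute_rate(prev_output_flop: float, next_output_flop: float,
                           dt_seconds: float) -> float:
    """Inference Compute Rate: change in Output Compute between successive
    outputs, divided by the time between them (FLOP/s)."""
    return (next_output_flop - prev_output_flop) / dt_seconds

# Illustrative example: fine-tuning adds to the base model's Training Compute.
data_prep  = ComputeNode("data preparation", flop=1e21, human_cutoff=True)
base_train = ComputeNode("base training run", flop=3e25, parents=[data_prep])
fine_tune  = ComputeNode("fine-tuning", flop=5e23, parents=[base_train])
one_query  = ComputeNode("single inference call", flop=1e12, parents=[fine_tune])
print(f"Training Compute of fine-tuned model: {output_compute(fine_tune):.3g} FLOP")
print(f"Output Compute of one query: {output_compute(one_query):.3g} FLOP")
```

In this framing, the discarded-trials rule in the last bullet corresponds to simply omitting those runs from the retained model's graph.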

 

Appendix B: Example implementation of a gate closure

Here is one example of how a gate closure could work, given a limit of 10²⁷ FLOP for training and 10²⁰ FLOP/s for inference (running the AI):
1. Pause: For reasons of national security, the US Executive branch asks all companies based in the US, doing business in the US, or using chips manufactured in the US, to cease and desist from any new AI training runs that might exceed the 10²⁷ FLOP Training Compute limit. The US should commence discussions with other countries hosting AI development, strongly encouraging them to take similar steps and indicating that the US pause may be lifted should they choose not to comply.

2. US oversight and licensing: By executive order or action of an existing regulatory agency, the US requires that within (say) one year:

  • All AI training runs estimated above 10²⁵ FLOP done by companies operating in the US be registered in a database maintained by a US regulatory agency. (Note: A slightly weaker version of this had already been included in the now-rescinded 2023 US executive order on AI, requiring registration for models above 10²⁶ FLOP.)
  • All AI-relevant hardware manufacturers operating in the US or doing business with the USG adhere to a set of requirements on their specialized hardware and the software driving it. (Many of these requirements could be built into software and firmware updates to existing hardware, but long-term and robust solutions would require changes to later generations of hardware.) Among these is a requirement that hardware belonging to a high-speed-interconnected cluster capable of executing 10¹⁸ FLOP/s of computation be subject to a higher level of verification, which includes regular permission from a remote "governor" who receives both telemetry and requests to perform additional computation (a schematic sketch of such a permission check follows this list).
  • The custodian reports the total computation performed on its hardware to the agency maintaining the US database.
  • Stronger requirements are phased in to allow both more secure and more flexible oversight and permissioning.
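As a rough illustration of the cluster-level "governor" requirement above, the sketch below shows what a permission check might look like from the custodian's side. The endpoint URL, message fields, and threshold handling are hypothetical assumptions; a real implementation would live in hardware, firmware, and signed attestations rather than application code.

```python
# Hypothetical sketch of a cluster-side check against a remote "governor".
# The endpoint, message fields, and threshold handling are assumptions.
import json
import urllib.request

CLUSTER_LIMIT_FLOPS = 1e18                              # 10^18 FLOP/s verification threshold
GOVERNOR_URL = "https://governor.example.org/permit"    # hypothetical governor endpoint

def request_permission(cluster_id: str, cluster_flops: float,
                       requested_flop: float, telemetry: dict) -> bool:
    """Send telemetry plus a request to perform additional computation;
    return whether the governor grants it."""
    if cluster_flops < CLUSTER_LIMIT_FLOPS:
        return True                                     # below the threshold: no permission needed
    payload = json.dumps({
        "cluster_id": cluster_id,
        "cluster_flops": cluster_flops,
        "requested_flop": requested_flop,
        "telemetry": telemetry,                         # e.g. recent utilization, workload hashes
    }).encode()
    req = urllib.request.Request(GOVERNOR_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("approved", False)
```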

3. International oversight:

  • The US, China, and any other countries hosting advanced chip manufacturing capability negotiate an international agreement.
  • This agreement creates a new international agency, analogous to the International Atomic Energy Agency, charged with overseeing AI training and execution.
  • Signatory countries must require their domestic AI hardware manufacturers to comply with a set of requirements at least as strong as those imposed in the US.
  • Custodians are now required to report AI computation numbers both to the agencies in their home countries and to a new office within the international agency.
  • Additional countries are strongly encouraged to join the existing international agreement: export controls by signatory countries restrict access to high-end hardware by non-signatories, while signatories can receive technical support in managing their AI systems.

4. International verification and enforcement:

  • The hardware verification system is updated so that it reports computation usage to the original custodian and also directly to the international agency office.
  • The agency, via discussion with the signatories of the international agreement, agrees on computation limitations which then take legal force in the signatory countries.
  • In parallel, a set of international standards may be developed so that training and running of AIs above a threshold of computation (but below the limit) are required to adhere to those standards.
  • The agency can lower the computation limit if necessary, for example to compensate for improved algorithms, or raise it if doing so is deemed safe and advisable (say, given provable safety guarantees).

 

Appendix C: Details for a strict AGI liability regime

  • Creation and operation of an advanced AI system that is highly general, capable, and autonomous is considered an "abnormally dangerous" activity.
  • As such, the default liability level for training and operating such systems is strict, joint and several liability (or its non-US equivalent) for any harms done by the model or its outputs/actions.
  • Personal liability will be imposed on executives and board members in cases of gross negligence or willful misconduct. This should include criminal penalties for the most egregious cases.
  • There are numerous safe harbors under which liability reverts to the default (fault-based, in the US) liability to which people and companies would normally be subject:
    • Models trained and operated below some compute threshold (which would be at least 10x lower than the caps described above).
    • AI that is “weak” (roughly, below human expert level at the tasks for which it is intended) and/or
    • AI that is “narrow” (having a fixed and quite limited scope of tasks and operations that it is specifically designed and trained for) and/or
    • AI that is “passive” (very limited in its ability – even under modest modification – to take actions or perform complex multistep tasks without direct human involvement and control.)
    • AI that is guaranteed to be safe, secure, and controllable (provably safe, or for which a risk analysis indicates a negligible level of expected harm).
  • Safe harbors may be claimed on the basis of a safety case prepared by the AI developer and approved by an agency or auditor credentialed by an agency. To claim a safe harbor based on compute, the developer need only supply credible estimates of total Training Compute and maximal Inference Compute Rate (a rough estimation sketch follows this list).
  • Legislation would explicitly outline situations under which injunctive relief from the development of AI systems with a high risk of public harm would be appropriate.
  • Company consortia, working with NGOs and government agencies, should develop standards and norms defining these terms, how regulators should grant safe harbors, how AI developers should prepare safety cases, and how courts should interpret liability where safe harbors are not proactively claimed.
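For the compute-based safe harbor, a "credible estimate" could rest on the common dense-transformer rules of thumb of roughly 6 FLOP per parameter per training token and 2 FLOP per parameter per generated token. The sketch below applies those approximations; the model size, token counts, and safe-harbor threshold are illustrative assumptions.

```python
# Rough sketch of the estimates a developer might attach to a compute-based
# safe-harbor claim, using common dense-transformer rules of thumb
# (~6*N*D training FLOP, ~2*N FLOP per generated token). The model figures
# and threshold below are illustrative assumptions.

def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total Training Compute for a dense transformer."""
    return 6.0 * n_params * n_tokens

def estimate_inference_flop_rate(n_params: float, tokens_per_second: float) -> float:
    """Approximate maximal Inference Compute Rate for serving the model."""
    return 2.0 * n_params * tokens_per_second

SAFE_HARBOR_TRAINING_FLOP = 1e26        # assumed: at least 10x below the 10^27 cap

training = estimate_training_flop(n_params=7e10, n_tokens=2e12)   # hypothetical 70B-parameter model, 2T tokens
inference_rate = estimate_inference_flop_rate(n_params=7e10, tokens_per_second=1e5)
print(f"Estimated Training Compute: {training:.2e} FLOP "
      f"({'within' if training < SAFE_HARBOR_TRAINING_FLOP else 'outside'} the compute safe harbor)")
print(f"Estimated maximal Inference Compute Rate: {inference_rate:.2e} FLOP/s")
```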

 

Appendix D: A tiered approach to AGI safety & security standards

Risk classifications and safety/security standards, with tiers based on compute thresholds as well as combinations of high autonomy, generality, and intelligence:

  • Strong autonomy applies if the system is able to perform, or can easily be made to perform, many-step tasks and/or take complex actions that are real-world relevant, without significant human oversight or intervention. Examples: autonomous vehicles and robots; financial trading bots. Non-examples: GPT-4; image classifiers.
  • Strong generality indicates a wide scope of application, performance of tasks for which the model was not deliberately and specifically trained, and significant ability to learn new tasks. Examples: GPT-4; MuZero. Non-examples: AlphaFold; autonomous vehicles; image generators.
  • Strong intelligence corresponds to matching human expert-level performance on the tasks for which the model performs best (and, for a general model, across a broad range of tasks). Examples: AlphaFold; MuZero; o3. Non-examples: GPT-4; Siri.
A tiered approach to AGI safety & security standards:
  • RT-0. Trigger: AI weak in autonomy, generality, and intelligence. Requirements for training: none. Requirements for deployment: none.
  • RT-1. Trigger: AI strong in one of autonomy, generality, and intelligence. Requirements for training: none. Requirements for deployment: based on risk and use, potentially safety cases approved by national authorities wherever the model can be used.
  • RT-2. Trigger: AI strong in two of autonomy, generality, and intelligence. Requirements for training: registration with the national authority with jurisdiction over the developer. Requirements for deployment: a safety case bounding risk of major harm below authorized levels, plus independent safety audits (including black-box and white-box red-teaming) approved by national authorities wherever the model can be used.
  • RT-3. Trigger: AGI strong in autonomy, generality, and intelligence. Requirements for training: pre-approval of a safety and security plan by the national authority with jurisdiction over the developer. Requirements for deployment: a safety case guaranteeing bounded risk of major harm below authorized levels, as well as required specifications including cybersecurity, controllability, a non-removable killswitch, alignment with human values, and robustness to malicious use.
  • RT-4. Trigger: any model that also exceeds either 10²⁷ FLOP of Training Compute or 10²⁰ FLOP/s Inference Compute Rate. Training and deployment: prohibited pending an internationally agreed lift of the compute cap.
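To illustrate how the tiers above compose, here is a minimal sketch of the table's logic. Whether a system is "strong" in autonomy, generality, or intelligence is assumed to be judged by human and regulatory evaluation; the code only encodes the mapping from those judgments and the compute thresholds to a tier.

```python
# Minimal sketch of the risk-tier mapping in the table above. Whether a system
# is "strong" in each property is assumed to be judged by evaluators; only the
# table's mapping to a tier is encoded here.

TRAINING_CAP_FLOP = 1e27        # RT-4 Training Compute threshold
INFERENCE_CAP_FLOPS = 1e20      # RT-4 Inference Compute Rate threshold (FLOP/s)

def risk_tier(strong_autonomy: bool, strong_generality: bool, strong_intelligence: bool,
              training_flop: float, inference_flop_rate: float) -> str:
    """Return the risk tier (RT-0 through RT-4) implied by the table above."""
    if training_flop > TRAINING_CAP_FLOP or inference_flop_rate > INFERENCE_CAP_FLOPS:
        return "RT-4"            # prohibited pending an internationally agreed lift of the cap
    n_strong = sum([strong_autonomy, strong_generality, strong_intelligence])
    return f"RT-{n_strong}"      # zero to three "strong" properties map to RT-0 through RT-3

# Example: a general and intelligent but non-autonomous system below the caps lands in RT-2.
print(risk_tier(strong_autonomy=False, strong_generality=True, strong_intelligence=True,
                training_flop=5e25, inference_flop_rate=1e17))
```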

 
