International AI projects and differential AI development
Released on 26th January 2026
This note was written as part of a research avenue that I don’t currently plan to pursue further. It’s more of a work in progress than Forethought’s usual publications, but I’m sharing it because I think some people may find it useful.
Introduction
There have been many proposals for an international AI project to manage risks from advanced AI. Something that’s missing from these proposals to date (including my own) is the idea of differential AI development, and in particular, the importance of differentially accelerating helpful AI capabilities.1
Some AI capabilities pose major risks (e.g. agentic superintelligence with wide real-world powers; LLMs that can provide instructions to manufacture bioweapons). But others are generally societally beneficial (e.g. AI for cancer screening), and others are actively helpful for addressing the risks posed by other AI capabilities (e.g. AI for forecasting). We want to limit those that pose major risks, permit those that are generally beneficial, and encourage those that are actively helpful for addressing other risks.
However, existing proposals for an international AI project usually propose that any frontier AI development (often defined as development above a certain compute threshold) take place under the auspices of the international project, which would prohibit the development of the most-helpful AI capabilities as well as the most-dangerous ones.
Instead, I think proposals should probably: (i) be more surgical, focusing on limiting the most-dangerous capabilities; and (ii) try to actively encourage the most-helpful capabilities.
Limiting the most dangerous AI capabilities
I think that the ability to do AI R&D (such as ML research and engineering, and chip design) is the most worrying capability.
I think the primary challenge from AI arises from the fact that, once AI can fully automate AI R&D, progress in AI capabilities likely becomes extremely fast, following a super-exponential progress curve. It’s only because of this that you quickly move from pre-AGI that is easy to control to superintelligence that might be very difficult to control. It’s only because of this that you get extremely rapid technological change, creating a large number of non-alignment-related challenges with little time to respond. And it’s primarily because of this that AI could lead to intense concentration of power, where a small lead in capabilities can turn into a decisive advantage over rivals.
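To make the “super-exponential” claim slightly more concrete, here is one toy model of the feedback loop. This is my own illustrative sketch, not something from this note: the capability variable C, the rate constant k, and the exponent ε are assumptions chosen only to show the qualitative shape of the dynamic.

```latex
% Toy model: automated AI R&D feeds AI capability back into the rate of
% AI progress. C(t) is AI capability, k > 0 a rate constant, and
% epsilon > 0 the strength of the feedback beyond simple proportionality.
% All three are illustrative assumptions.
\[
  \frac{dC}{dt} \;=\; k\, C^{\,1+\epsilon}, \qquad \epsilon > 0 .
\]
% Separating variables and integrating from C(0) = C_0 gives
\[
  C(t) \;=\; C_0 \bigl(1 - \epsilon k C_0^{\epsilon}\, t\bigr)^{-1/\epsilon},
  \qquad t^{*} \;=\; \frac{1}{\epsilon k C_0^{\epsilon}},
\]
% where C(t) diverges as t approaches t*. With epsilon = 0 the same
% equation gives ordinary exponential growth; any epsilon > 0 makes
% doubling times shrink over time, which is the sense in which progress
% would be super-exponential rather than merely exponential.
```

The only point of the sketch is that stronger-than-proportional feedback from AI capability into AI R&D is qualitatively different from ordinary exponential progress: it is what compresses the timeline from easy-to-control pre-AGI to hard-to-control superintelligence.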
For this reason, I’d suggest that any international project should have a monopoly only on the training of AI that is both above a certain FLOP threshold AND aimed at producing AI that can meaningfully automate ML research and engineering or chip design, or that can produce other potentially-catastrophic technologies like engineered pathogens.
There are different ways of implementing this proposal. On one model, companies would need permission to do training runs over the FLOP threshold, with permission granted if the lab is trusted, agrees to oversight, and agrees not to try to meaningfully improve the automation of AI R&D. An alternative is to have no permission process at all, but to make it illegal to build an AI system, trained with more compute than the FLOP threshold, that meaningfully improves the automation of AI R&D. The former is more restrictive; the latter is riskier.
This might seem hard to enforce. But, because the FLOP threshold is high and such large training runs are so costly, these restrictions would only apply to a handful of actors. This means that fairly intense oversight would be feasible: for example, requiring capability audits from the international project at regular intervals throughout training, or requiring some international project supervisors to be employed at the company.
What’s more, the incentives for companies to violate this agreement would be weak: any attempt to escape oversight would be very risky and likely to fail (for example, through detection or whistleblowers); the penalties for violating the agreement could be severe (such as losing the ability to train further AI models, or even jail time); and such companies could still make major profits from AI that cannot do AI R&D. For this reason, I think this looks enforceable even if one cannot precisely specify which capabilities are prohibited (in just the same way that financial fraud is illegal even though the law cannot precisely specify all the conditions under which fraud occurs).
Encouraging the most helpful AI capabilities
“Helpful” here refers in particular to capabilities that help governments, companies, and broader society respond to the risks posed by rapid AI-driven technological progress.
Some helpful AI capabilities include:
- AI for forecasting and strategic foresight, to help human decision-makers know what’s coming, including which capabilities would soon come online from continued AI progress.
- AI for policy analysis and advice, to provide a much better understanding of what policy responses are available, and what the effects of those policy options would be.
- AI for ethical deliberation, to help reason through the ethical quandaries that such new developments might pose (for example, around digital sentience), and/or quickly aggregate the preferences of a wider swathe of the electorate or humanity as a whole than is normally possible.
- AI that assists in making trades or agreements, to identify positive-sum trades or treaties that could be agreed-upon, whether that’s between labs, between labs and governments, or between governments.
- AI for rapid education and tuition, to help decision-makers and society at large get up to speed on the latest technological developments and geopolitical changes in what would be an extremely fast-changing world.
We could call the set of such capabilities “artificial wisdom” rather than “artificial intelligence”.
It would be highly desirable if we could get narrow superintelligence in these domains (just as we have narrow superintelligence in Chess and Go) before the point at which we have more generally capable, or more dangerous, AI systems.
Governments could choose to deliberately incentivise work on helpful AI capabilities by: (i) giving grants or subsidies to companies that are producing helpful AI capabilities; (ii) awarding prizes to companies that produce helpful AI capabilities; (iii) creating Advance Market Commitments, agreeing in advance to pay a certain amount for access to AI with certain capabilities (perhaps as measured on technical benchmarks); or (iv) directly building such capabilities as part of an international project.
Advantages of this approach
The approach I’ve suggested limits only the most-dangerous capabilities, while actively encouraging the most-helpful ones.
I think that this approach has a number of advantages over the blanket-ban approach:
- Differential acceleration of AI capabilities can help us be more prepared to manage rapid AI-enabled change. When faced with the prospect of an intelligence explosion, a key challenge is that human intellectual capabilities will not speed up much compared to the rapid increase in the pace of AI capabilities and technological change. Differentially advancing these beneficial AI capabilities would augment human intelligence, enabling human decision-makers to keep up for longer and to make better decisions in the early stages of the intelligence explosion.
- It would be more acceptable to industry. Rather than stifling cutting-edge AI research outside of the international project, it would merely be steering the trajectory and sequencing of AI research. Companies outside of the international project could still do cutting-edge research, and profit from it. If the plan were combined with subsidies and the ability to purchase equity in the joint project, then these other companies might actively benefit compared to the status quo.
- It would decrease the incentives for countries/actors outside the international project to race. They would get access to many advanced AI capabilities, and so there would be less incentive for them to build up their own AI industry. They would be actively benefiting from AI progress, which would reduce the feeling that they are being left out; it would also make them richer and their decision-makers wiser as AI progresses, and so they would have more to lose by threatening the status quo.
- For some capabilities, it can be beneficial for your adversaries to have them too. For example, if both the US and a rival country have access to “market-making” AI that can help identify and enact positive-sum trades or treaties, that is good for the US as well as for the rival country.
Risks of this approach
This approach isn’t without its challenges, including:
- Ensuring that all sufficiently dangerous capabilities are covered.
- Ensuring that models with allowed capabilities can’t easily be augmented by third parties (e.g. via fine-tuning) into models with prohibited capabilities.
Depending on how AI progresses, it might turn out that these challenges are too difficult to get around. But given the magnitude of the potential benefits from helpful AI capabilities (including for existential risk reduction), it could well be worth increasing the risk from dangerous capabilities a little, to increase the benefits from helpful ones.
Thanks to many people for comments and discussion, and to Rose Hadshar for help with editing.
Footnotes
The international AGI project series
Article Series · Part 4 of 7
This is a series of papers and research notes on the idea that AGI should be developed as part of an international collaboration between governments. We aim to (i) assess how desirable an international AGI project is, and (ii) assess what the best version of an international AGI project would look like, taking feasibility into account.