Precursory Capabilities: A Refinement to Pre-deployment Information Sharing and Tripwire Capabilities


In our new papers, ‘Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities’ and ‘Towards Frontier Safety Policies Plus,’ we explore methods to better operationalise existing capability thresholds through what we call ‘precursory capabilities.’

We think of precursory capabilities as smaller preliminary components of high-impact capabilities that an AI model needs in order to unlock more advanced capabilities. In this sense, precursory capabilities are ‘but for’ skills: skills without which a certain action by an AI model would be impossible. Precursory capabilities are causally connected to one another along a spectrum that leads from basic core capabilities to high-impact capabilities and, hence, from ‘less close’ to ‘closer’ to potential catastrophic risk thresholds.

More specifically, we see precursory capabilities as located in what we call a ‘zoning taxonomy’: a gradient in which each precursory component may bring us closer to unacceptable risk, or ‘red lines’ (Figure A).

Figure A. Illustrative Example of Precursory Capabilities to Scheming. We note that the purpose of Figure A is to exemplify our recommendations and not to provide a comprehensive taxonomy for scheming.


We believe that a taxonomy of precursory capabilities has certain advantages over current capability thresholds. Among other things, such a taxonomy could:

  1. Provide a clearer ‘early warning’ signal of progress in AI capabilities.

  2. Support the building of more concrete consensus, in turn enabling a timely trigger for relevant mitigations and governance interventions.

  3. Offer a more granular and exhaustive perspective on risk levels and foreseeable capabilities, enabling societal and systemic preparedness.

  4. Support the harmonization of ‘early warning’ signs between frontier AI developers.


We articulate our thinking in more detail below.

Precursory Capabilities as Early Warning Signs Ahead of ‘Red Lines’

In our paper on ‘Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities,’ we built on the Frontier AI Safety Commitments and explained how a zoning taxonomy of precursory capabilities could represent an advancement for information sharing and transparency. In particular, such a taxonomy can help AI developers determine which information they should share, when, with whom, and how.

At its core, we think developers should share precursory capabilities to high-impact capabilities as soon as they are first identified, and before model deployment. This would support early awareness, giving visibility into warning signs long before red lines are reached (Figure B).

Figure B. An Illustrative Comparison Between Red Lines and the Zoning Taxonomy.

In the paper, we also advanced other recommendations that would complement a transparent and functional pre-deployment information-sharing regime. In particular, we suggest that: 

  • AISIs and AISI-like institutions (such as the European Union AI Office) receive and coordinate pre-deployment information on precursory capabilities.

  • AISIs and AISI-like institutions manage the information received from AI developers with increasing sensitivity depending on its relative location within the zoning taxonomy and safeguard it by classifying it or marking it as ‘controlled,’ if necessary. 

  • AISIs and AISI-like institutions establish an early warning pipeline and alert each other when a precursory capability identified in the zoning taxonomy has been reached, provided that the recipient AISI or AISI-like institution guarantees appropriate levels of information security.
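The handling logic in these recommendations could be sketched as a small routing function: sensitivity markings escalate with a capability’s zone index, and alerts go only to peer institutions that guarantee appropriate information security. This is a hypothetical sketch, not a proposal from the paper; the marking names, `Institution` type, and `secure` flag are all illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical sensitivity markings, escalating with closeness to the red line.
MARKINGS = ["public", "internal", "controlled", "classified"]

@dataclass
class Institution:
    """An AISI or AISI-like body; `secure` flags appropriate information security."""
    name: str
    secure: bool

def marking_for(zone_index: int) -> str:
    """Map a zone index to a sensitivity marking, capping at the highest level."""
    return MARKINGS[min(zone_index, len(MARKINGS) - 1)]

def alert_recipients(peers: list[Institution], zone_index: int) -> list[str]:
    """Alert peers that a precursory capability has been reached, but only
    those institutions that guarantee appropriate information security."""
    marking = marking_for(zone_index)
    return [f"{p.name}: {marking}" for p in peers if p.secure]

peers = [
    Institution("UK AISI", True),
    Institution("EU AI Office", True),
    Institution("UnverifiedBody", False),
]
print(alert_recipients(peers, zone_index=2))
```

The point of the sketch is the coupling: the same zone index that locates a capability in the taxonomy also determines how the information about it is marked and who may receive it.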

Precursory Capabilities as Fine-Grained Tripwires for Frontier Safety Policies

In our paper on ‘Towards Frontier Safety Policies Plus,’ we examined how a fine-grained taxonomy of precursory capabilities could serve as more granular triggers for AI companies’ safety commitments in frontier safety policies (FSPs). Converting precursory capabilities into FSP tripwires could remedy the lack of specificity, the insufficient depth of vertical risk levels, and the limited external verifiability that the AI research community has criticized in FSPs.

In this context, while assessing reasonable counterarguments, we analyzed arguments in support of standardising precursory capabilities through a consensus-driven process led by international (e.g., ISO, IEC) or domestic (e.g., NIST, CEN-CENELEC, BIS) standardisation bodies. While these bodies engage stakeholders from industry, government, and academia in their work towards standardisation, we suggest that the Frontier Model Forum lead the way in establishing preliminary consensus among frontier developers.


This blog post summarises our recent thinking on pre-deployment information sharing and frontier safety policies (‘FSPs’). Our paper on ‘Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities’ was selected for presentation at the UK AISI’s Conference on Frontier AI Safety Frameworks. Our paper on ‘Towards Frontier Safety Policies Plus’ was selected for presentation at the International Association for Safe and Ethical AI Conference.
