An Early Warning System for AI-Powered Threats to National Security and Public Safety
In just a few years, state-of-the-art artificial intelligence (AI) models have gone from not reliably counting to 10 to writing software, generating photorealistic videos on demand, combining language and image processing to guide robots, and even advising heads of state in wartime. If responsibly developed and deployed, AI systems could benefit society enormously. However, emerging AI capabilities could also pose severe threats to public safety and national security. AI companies are already evaluating their most advanced models to identify dual-use capabilities, such as the capacity to conduct offensive cyber operations, enable the development of biological or chemical weapons, and autonomously replicate and spread. These capabilities can arise unpredictably, and go undetected, both during development and after deployment.
To better manage these risks, Congress should set up an early warning system for novel AI-enabled threats to provide defenders maximal time to respond to a given capability before information about it is disclosed or leaked to the public. This system should also be used to share information about defensive AI capabilities. To develop this system, we recommend:
- Congress should assign and fund the Bureau of Industry and Security (BIS) to act as an information clearinghouse to receive, triage, and distribute reports on dual-use AI capabilities. In parallel, Congress should require developers of advanced models to report dual-use capability evaluation results and other safety-critical information to BIS.
- Congress should task specific agencies to lead working groups of government agencies, private companies, and civil society to take coordinated action to mitigate risks from novel threats.
Challenge and Opportunity
In just the past few years, advanced AI has surpassed human capabilities across a range of tasks. Rapid progress in AI systems will likely continue for several years, as leading model developers like OpenAI and Google DeepMind plan to spend tens of billions of dollars to train more powerful models. As models gain more sophisticated capabilities, some of these could be dual-use, meaning they will “pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters”—but in some cases may also be applied to defend against serious risks in those domains.
New AI capabilities can emerge unexpectedly. AI companies are already evaluating models to check for dual-use capabilities, such as the capacity to enhance cyber operations, enable the development of biological or chemical weapons, and autonomously replicate and spread. These capabilities could be weaponized by malicious actors to threaten national security or could lead to brittle, uncontrollable systems that cause severe accidents. Despite the use of evaluations, it is not clear what should happen when a dual-use capability is discovered.
An early-warning system would allow the relevant actors to access evaluation results and other details of dual-use capability reports to strengthen responses to novel AI-powered threats. Various actors could take concrete actions to respond to risks posed by dual-use AI capabilities, but they need lead time to coordinate and develop countermeasures. For example, model developers could mitigate immediate risks by restricting access to models. Governments could work with private-sector actors to use new capabilities defensively or employ enhanced, targeted export controls to prevent foreign adversaries from accessing strategically relevant capabilities.
A warning system should ensure secure information flow between three types of actors:
- Finders: the parties that can initially identify dual-use capabilities in models. These include AI company staff, government evaluators such as the U.S. AI Safety Institute (USAISI), contracted evaluators and red-teamers, and independent security researchers.
- Coordinators: the parties that provide the infrastructure for collecting, triaging, and directing dual-use AI capability reports.
- Defenders: the parties that could take concrete actions to mitigate threats from dual-use capabilities or leverage them for defensive purposes, such as advanced AI companies and various government agencies.
While this system should cover a variety of finders, defenders, and capability domains, one example of early warning and response in practice might look like the following:
- Discovery: An AI company identifies a novel capability in one of its latest models during the development process. They find the model is able to autonomously detect, identify, and exploit cyber vulnerabilities in simulated IT systems.
- Reporting to coordinator: The company is concerned that this capability could be used to attack critical infrastructure systems, so they report the relevant information to a government coordinator.
- Triage and reporting to working groups: The coordinator processes this report and passes it along to the Cybersecurity and Infrastructure Security Agency (CISA), the lead agency for handling AI-enabled cyber threats to critical infrastructure.
- Verification and response: CISA verifies that this system can identify specific types of vulnerabilities in some legacy systems and creates a priority contract with the developer and critical infrastructure providers to use the model to proactively and regularly identify vulnerabilities across these systems for patching.
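For illustration only, the routing step in this flow can be sketched in a few lines of Python. In this hypothetical sketch, the risk-area categories and the biological lead agency are placeholder assumptions; the cyber, nuclear, and interim assignments reflect examples given elsewhere in this memo.

```python
from dataclasses import dataclass

# Hypothetical routing table. CISA (cyber), DOE/NNSA (nuclear), and USAISI (interim
# catch-all) reflect examples in this memo; the biological lead is a placeholder.
LEAD_AGENCIES = {
    "cyber": "CISA",
    "nuclear": "DOE/NNSA",
    "biological": "TBD lead agency",
    "other": "USAISI",
}

@dataclass
class CapabilityReport:
    finder: str       # e.g., an AI company, USAISI, a contracted red-teamer, or an independent researcher
    risk_area: str    # e.g., "cyber"
    summary: str      # evaluation results and supporting detail

def route(report: CapabilityReport) -> str:
    """Coordinator triage: send the report to the working group lead for its risk area."""
    return LEAD_AGENCIES.get(report.risk_area, LEAD_AGENCIES["other"])

report = CapabilityReport(
    finder="AI company",
    risk_area="cyber",
    summary="Model autonomously detects, identifies, and exploits vulnerabilities in simulated IT systems.",
)
print(route(report))  # -> CISA
```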
The current environment contains some elements of a functional early-warning system, such as reporting requirements for AI developers described in Executive Order 14110, and existing interagency mechanisms for information-sharing and coordination like the National Security Council and the Vulnerabilities Equities Process.
However, gaps exist across the current system:
- There is a lack of clear intake channels and standards for capability reporting to the government outside of mandatory reporting under EO 14110. Also, parts of the Executive Order that mandate reporting may be overturned in the next administration, or this specific use of the Defense Production Act (DPA) could be successfully struck down in the courts.
- Various legal and operational barriers mean that premature public disclosure, or no disclosure at all, is likely to happen. This might look like an independent researcher publishing details about a dangerous offensive cyber capability online, or an AI company failing to alert appropriate authorities due to concerns about trade secret leakage or regulatory liability.
- BIS intakes mandatory dual-use capability reports, but it is not tasked to be a coordinator and is not adequately resourced for that role, and information-sharing from BIS to other parts of government is limited.
- There is also a lack of clear, proactive ownership of response around specific types of AI-powered threats. Unless these issues are resolved, AI-powered threats to national security and public safety are likely to arise unexpectedly without giving defenders enough lead time to prepare countermeasures.
Plan of Action
Improving the U.S. government’s ability to rapidly respond to threats from novel dual-use AI capabilities requires actions from across government, industry, and civil society. The early warning system detailed below draws inspiration from “coordinated vulnerability disclosure” (CVD) and other information-sharing arrangements used in cybersecurity, as well as the federated Sector Risk Management Agency (SRMA) approach used to organize protections around critical infrastructure. The following recommended actions are designed to address the issues with the current disclosure system raised in the previous section.
First, Congress should assign and fund an agency office within BIS to act as a coordinator: an information clearinghouse for receiving, triaging, and distributing reports on dual-use AI capabilities. In parallel, Congress should require developers of advanced models to report dual-use capability evaluation results and other safety-critical information to BIS (more detail can be found in the FAQ). This creates a clear structure for finders looking to report to the government and provides capacity to triage reports and figure out what information should be sent to which working groups.
This coordinating office should establish operational and legal clarity to encourage voluntary reporting and facilitate mandatory reporting. This should include the following:
- Set up a reporting protocol where finders can report dual-use capability-related information: a full accounting of dual-use capability evaluations run on the model, details on mitigation measures, and information about the compute used to train affected models.
- Outline criteria for Freedom of Information Act (FOIA) disclosure exemptions for reported information in order to manage concerns from companies and other parties around potential trade secret leakage.
- Adapt relevant protections for whistleblowers against retaliation from their employers or contracting parties.
- If the relevant legal mechanism is not fit for purpose, Congress should include equivalent mechanisms in legislation. This can draw from similar legislation in the cybersecurity domain, such as the Cybersecurity Information Sharing Act of 2015's provisions protecting reporting organizations from antitrust liability and specific kinds of regulatory liability.
BIS is suited to house this function because it already receives reports on dual-use capabilities from companies via DPA authority under EO 14110. Additionally, it has in-house expertise on AI and hardware from administering export controls on critical emerging technology, and it has relationships with key industry stakeholders, such as compute providers. (There are other candidates that could house this function as well. See the FAQ.)
To fulfill its role as a coordinator, this office would need an initial annual budget of $8 million to handle triaging and compliance work for an annual volume of between 100 and 1,000 dual-use capability reports. We provide a budget estimate below:
The office should leverage the direct hire authority outlined by the Office of Personnel Management (OPM) and associated flexible pay and benefits arrangements to attract staff with appropriate AI expertise. We expect most of the initial reports would come from the 5 to 10 companies developing the most advanced models. Later, if there is more evidence that near-term systems have capabilities with national security implications, this office could be scaled up adaptively to allow for more fine-grained monitoring (see FAQ for more detail).
Second, Congress should task specific agencies to lead working groups of government agencies, private companies, and civil society to take coordinated action to mitigate risks from novel threats. These working groups would be responsible for responding to threats arising from reported dual-use AI capabilities. They would also work to verify and validate potential threats from reported dual-use capabilities and develop incident response plans. Each working group would be risk-specific and correspond to different risk areas associated with dual-use AI capabilities:
- Chemical weapons research, development, and acquisition
- Biological weapons research, development, and acquisition
- Cyber-offense research, development, and acquisition
- Radiological and nuclear weapons research, development, and acquisition
- Deception, persuasion, manipulation, and political strategy
- Model autonomy and loss of control
- For dual-use capabilities that fall into a category not covered by other lead agencies, USAISI acts as the interim lead until a more appropriate owner is identified.
This working group structure enables interagency and public-private coordination in the style of SRMAs and Government Coordination Councils (GCCs) used for critical infrastructure protection. This approach distributes responsibilities for AI-powered threats across federal agencies, allowing each lead agency to be appointed based on the expertise it can leverage to deal with its specific risk area. For example, the Department of Energy (specifically the National Nuclear Security Administration) would be an appropriate lead when it comes to the intersection of AI and nuclear weapons development. In cases of very severe and pressing risks, such as threats of hundreds or thousands of fatalities, the responsibility for coordinating an interagency response should be escalated to the President and the National Security Council system.
Conclusion
Dual-use AI capabilities can amplify threats to national security and public safety but can also be harnessed to safeguard American lives and infrastructure. An early-warning system should be established to ensure that the U.S. government, along with its industry and civil society partners, has maximal time to prepare for AI-powered threats before they occur. Congress, working together with the executive branch, can lay the foundation for a secure future by establishing a government coordinating office to manage the sharing of safety-critical information across the ecosystem and tasking various agencies to lead working groups of defenders focused on specific AI-powered threats.
The longer research report this memo is based on can be accessed here.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
This plan recommends that companies developing and deploying dual-use foundation models be mandated to report safety-critical information to specific government offices. However, we expect these requirements to only apply to a few large tech companies that would be working with models that fulfill specific technical conditions. A vast majority of businesses and models would not be subject to mandatory reporting requirements, though they are free to report relevant information voluntarily.
The few companies that are required to report should have the resources to comply. An important consideration behind our plan is to, where possible and reasonable, reduce the legal and operational friction around reporting critical information for safety. This can be seen in our recommendation that relevant parties from industry and civil society work together to develop reporting standards for dual-use capabilities. Also, we suggest that the coordinating office should establish operational and legal clarity to encourage voluntary reporting and facilitate mandatory reporting, which is done with industry and other finder concerns in mind.
This plan does not place restrictions on how companies conduct their activities. Instead, it aims to ensure that all parties that have equities and expertise in AI development have the information needed to work together to respond to serious safety and security concerns. Instead of expecting companies to shoulder the responsibility of responding to novel dangers, the early-warning system distributes this responsibility to a broader set of capable actors.
Several agencies could house the coordinating function:
Bureau of Industry and Security (BIS), Department of Commerce
- Already intakes reports on dual-use capabilities via DPA authority under EO 14110
- USAISI will have significant AI safety-related expertise and also sits under Commerce
- Internal expertise on AI and hardware from administering export controls
US AI Safety Institute (USAISI), Department of Commerce
- USAISI will have significant AI safety-related expertise
- Part of NIST, which is not a regulator, so there may be fewer concerns on the part of companies when reporting
- Experience coordinating relevant civil society and industry groups as head of the AI Safety Institute Consortium
Cybersecurity and Infrastructure Security Agency (CISA), Department of Homeland Security
- Experience managing info-sharing regime for cyber threats that involve most relevant government agencies, including SRMAs for critical infrastructure
- Experience coordinating with private sector
- Located within DHS, which has responsibilities covering counterterrorism, cyber and infrastructure protection, domestic chemical, biological, radiological, and nuclear protection, and disaster preparedness and response. That portfolio seems like a good fit for work handling information related to dual-use capabilities.
- Option of Federal Advisory Committee Act exemption for DHS Federal Advisory Committees would mean working group meetings could be nonpublic and would not require representation from all industry stakeholders
Office of Critical and Emerging Technologies, Department of Energy (DOE)
- Access to DOE expertise and tools on AI, including evaluations and other safety and security-relevant work (e.g., classified testbeds in DOE National Labs)
- Links to relevant defenders within DOE, such as the National Nuclear Security Administration
- Partnerships with industry and academia on AI
- This office is much smaller than the alternatives, so would require careful planning and management to add this function.
Based on dual-use capability evaluations conducted on today’s most advanced models, there is no immediate concern that these models can meaningfully enhance the ability of malicious actors to threaten national security or cause severe accidents. However, as outlined in earlier sections of the memo, model capabilities have evolved rapidly in the past, and new capabilities have emerged unintentionally and unpredictably.
This memo recommends initially putting in place a lean and flexible system to support responses to potential AI-powered threats. This would serve a “fire alarm” function if dual-use capabilities emerge and would be better at reacting to larger, more discontinuous jumps in dual-use capabilities. This also lays the foundation for reporting standards, relationships between key actors, and expertise needed in the future. Once there is more concrete evidence that models have major national security implications, Congress and the president can scale up this system as needed and allocate additional resources to the coordinating office and also to lead agencies. If we expect a large volume of safety-critical reports to pass through the coordinating office and a larger set of defensive actions to be taken, then the “fire alarm” system can be shifted into something involving more fine-grained, continuous monitoring. More continuous and proactive monitoring would tighten the Observe, Orient, Decide, and Act (OODA) loop between working group agencies and model developers, by allowing agencies to track gradual improvements, including from post-training enhancements.
While incident reporting is also valuable, an early-warning system focused on capabilities aims to provide a critical function not addressed by incident reporting: preventing or mitigating the most serious AI incidents before they even occur. Essentially, an ounce of prevention is worth a pound of cure.
Sharing information on vulnerabilities in AI systems and infrastructure and threat information (e.g., information on threat actors and their tactics, techniques, and procedures) is also important, but distinct. We think there should be processes established for this as well, which could be based on Information Sharing and Analysis Centers, but it is possible that this could happen via existing infrastructure for sharing this type of information. Information sharing around dual-use capabilities, though, is specific to the AI context and requires special attention to build out the appropriate processes.
While this memo focuses on the role of Congress, an executive branch that is interested in setting up or supporting an early warning system for AI-powered threats could consider the following actions.
Our second recommendation—tasking specific agencies to lead working groups to take coordinated action to mitigate risks from advanced AI systems—could be implemented by the president via Executive Order or a Presidential Directive.
Also, the National Institute of Standards and Technology could work with other organizations in industry and academia, such as advanced AI developers, the Frontier Model Forum, and security researchers in different risk domains, to standardize dual-use capability reports, making it easier to process reports coming from diverse types of finders. A common language around reporting would make it less likely that reported information is inconsistent across reports or is missing key decision-relevant elements; standardization may also reduce the burden of producing and processing reports. One example of standardization is narrowing down thresholds for sending reports to the government and taking mitigating actions. One product that could be generated from this multi-party process is an AI equivalent to the Stakeholder-Specific Vulnerability Categorization system used by CISA to prioritize decision-making on cyber vulnerabilities. A similar system could be used by the relevant parties to process reports coming from diverse types of finders and by defenders to prioritize responses and resources according to the nature and severity of the threat.
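As a purely illustrative sketch of what such an AI analogue might look like, the triage function below borrows the Track/Attend/Act outcome labels from CISA's Stakeholder-Specific Vulnerability Categorization; the specific decision factors are assumptions for illustration, not an agreed standard.

```python
# Hypothetical SSVC-style triage for dual-use AI capability reports.
# The decision factors and outcome labels are illustrative assumptions only.

def prioritize(capability_verified: bool, uplift: str, mitigations_in_place: bool) -> str:
    """Map a few decision factors to a response priority.

    capability_verified: the reported capability has been independently reproduced
    uplift: estimated uplift to malicious actors ("low", "moderate", or "severe")
    mitigations_in_place: the developer has already restricted or mitigated the capability
    """
    if not capability_verified:
        return "track"    # monitor and request verification from the working group
    if uplift == "severe" and not mitigations_in_place:
        return "act"      # escalate for an immediate interagency response
    if uplift in ("moderate", "severe"):
        return "attend"   # working group develops countermeasures on a set timeline
    return "track"

print(prioritize(capability_verified=True, uplift="severe", mitigations_in_place=False))  # -> act
```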
The government has a responsibility to protect national security and public safety, hence its central role in this scheme. Also, many specific agencies have relevant expertise and authorities on risk areas like biological weapons development and cybersecurity that are difficult to access outside of government.
However, it is true that the private sector and civil society have a large portion of the expertise on dual-use foundation models and their risks. The U.S. government is working to develop its in-house expertise, but this is likely to take time.
Ideally, relevant government agencies would play central roles as coordinators and defenders. However, our plan recognizes the important role that civil society and industry play in responding to emerging AI-powered threats as well. Industry and civil society can take a number of actions to move this plan forward:
- An entity like the Frontier Model Forum can convene other organizations in industry and academia, such as advanced AI developers and security researchers in different risk domains, to standardize dual-use capability reports independent of NIST.
- Dual-use foundation model (DUFM) developers should establish clear policies and intake procedures for independent researchers reporting dual-use capabilities.
- DUFM developers should work to identify capabilities that could help working groups to develop countermeasures to AI threats, which can be shared via the aforementioned information-sharing infrastructure or other channels (e.g., pre-print publication).
- In the event that a government coordinating office cannot be created, there could be an independent coordinator that fulfills a role as an information clearinghouse for dual-use AI capabilities reports. This could be housed in organizations with experience operating federally funded research and development centers like MITRE or Carnegie Mellon University’s Software Engineering Institute.
- If it is responsible for sharing information between AI companies, this independent coordinator may need to be coupled with a safe harbor provision around antitrust litigation specifically pertaining to safety-related information. This safe harbor could be created via legislation, like a similar provision in the Cybersecurity Information Sharing Act of 2015, or via a no-action letter from the Federal Trade Commission.
We suggest that reporting requirements should apply to any model trained using computing power greater than 10^26 floating-point operations. These requirements would only apply to a few companies working with models that fulfill specific technical conditions. However, it will be important to establish an appropriate authority within law to dynamically update this threshold as needed. For example, revising the threshold downwards (e.g., to 10^25) may be needed if algorithmic improvements allow developers to train more capable models with less compute or other developers devise new "scaffolding" that enables them to elicit dangerous behavior from already-released models. Alternatively, revising the threshold upwards (e.g., to 10^27) may be desirable due to societal adaptation or if it becomes clear that models at this threshold are not sufficiently dangerous. The following information should be included in dual-use AI capability reports, though the specific format and level of detail will need to be worked out in the standardization process outlined in the memo (a schematic example follows the lists below):
- Name and address of model developer
- Model ID information (ideally standardized)
- Indicator of sensitivity of information
- A full accounting of the dual-use capabilities evaluations run on the model at the training and pre-deployment stages, their results, and details of the size and scope of safety-testing efforts, including parties involved
- Details on current and planned mitigation measures, including up-to-date incident response plans
- Information about compute used to train models that have triggered reporting (e.g., amount of compute and training time required, quantity and variety of chips used and networking of compute infrastructure, and the location and provider of the compute)
Some elements would not need to be shared beyond the coordinating office or working group lead (e.g., personal identifying information about parties involved in safety testing or specific details about incident response plans) but would be useful for the coordinating office in triaging reports.
The following information should not be included in reports in the first place since it is commercially sensitive and could plausibly be targeted for theft by malicious actors seeking to develop competing AI systems:
- Information on model architecture
- Datasets used in training
- Training techniques
- Fine-tuning techniques
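To make the fields above concrete, here is a minimal sketch of what a standardized report record might look like. The field names and structure are illustrative assumptions, not a prescribed schema; the actual format would come out of the standardization process described above, and the 10^26 FLOP trigger is the threshold suggested in this memo.

```python
from dataclasses import dataclass, field

# Threshold suggested in this memo; the designated authority should be able to update it.
REPORTING_THRESHOLD_FLOP = 1e26

@dataclass
class DualUseCapabilityReport:
    developer_name: str
    developer_address: str
    model_id: str                       # ideally standardized
    sensitivity: str                    # indicator of how sensitive the reported information is
    evaluations: list = field(default_factory=list)      # evaluations run, stages, results, parties involved
    mitigations: list = field(default_factory=list)      # current and planned measures, incident response plans
    training_compute_flop: float = 0.0
    compute_details: dict = field(default_factory=dict)  # chip counts, networking, location, provider
    # Deliberately excluded: model architecture, training datasets, training and fine-tuning techniques.

def reporting_required(training_compute_flop: float) -> bool:
    """Check whether a training run crosses the reporting threshold."""
    return training_compute_flop >= REPORTING_THRESHOLD_FLOP

print(reporting_required(3e26))  # -> True
```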