Message Incoming: Establish an AI Incident Reporting System
What if an artificial intelligence (AI) lab found their model had a novel dangerous capability? Or a susceptibility to manipulation? Or a security vulnerability? Would they tell the world, confidentially notify the government, or quietly patch it up before release? What if a whistleblower wanted to come forward – where would they go?
Congress has the opportunity to proactively establish a voluntary national AI Incident Reporting Hub (AIIRH) to identify and share information about AI system failures, accidents, security breaches, and other potentially hazardous incidents with the federal government. This reporting system would be managed by a designated federal agency, most likely the National Institute of Standards and Technology (NIST). It would be modeled after successful incident reporting and information-sharing systems operated by the National Cybersecurity FFRDC (funded by the Cybersecurity and Infrastructure Security Agency (CISA)), the Federal Aviation Administration (FAA), and the Food and Drug Administration (FDA). The system would encourage reporting by allowing for confidentiality and guaranteeing that only government agencies could access sensitive AI system specifications.
AIIRH would provide a standardized and systematic way for companies, researchers, civil society, and the public to provide the federal government with key information on AI incidents, enabling analysis and response. Because of its statutory mandate, it would also give the public reliable access to some of these data, albeit often at a coarser level of granularity than the government would receive. Nongovernmental and international organizations, including the Responsible AI Collaborative (RAIC) and the Organisation for Economic Co-operation and Development (OECD), already maintain incident reporting systems, cataloging incidents such as facial recognition systems identifying the wrong person for arrest and trading algorithms causing market dislocations. However, these two systems have limitations in scope and reliability that make them better suited to public accountability than to government use.
By establishing this system, Congress can enable better identification of critical AI risk areas before widespread harm occurs. The proposal would build public trust and, if implemented successfully, help relevant agencies recognize emerging patterns and take preemptive action through standards, guidance, notifications, or rulemaking.
Challenge and Opportunity
While AI systems have the potential to produce significant benefits across industries like healthcare, education, environmental protection, finance, and defense, they also have the potential to cause serious harm to individuals and groups. It is crucial that the federal government understand the risks posed by AI systems and develop standards, best practices, and legislation around their use.
AI risks and harms can take many forms, from representational (such as women CEOs being underrepresented in image searches), to financial (such as automated trading systems or AI agents crashing markets), to possibly existential (such as through the misuse of AI to advance chemical, biological, radiological, and nuclear (CBRN) threats). As these systems become more powerful and interact with more aspects of the physical and digital worlds, a material increase in risk is all but inevitable in the absence of a sensible governance framework. However, in order to craft public policy that maximizes the benefits of AI and ameliorates harms, government agencies and lawmakers must understand the risks these systems pose.
There have been notable efforts by agencies to catalog types of risks, such as NIST’s 2023 AI Risk Management Framework, and to combat the worst of them, such as the Department of Homeland Security’s (DHS) efforts to mitigate AI CBRN threats. However, the U.S. government does not yet have an adequate resource for tracking and understanding specific harmful AI incidents that have occurred or are likely to occur in the real world. While entities like the RAIC and the OECD manage AI incident reporting efforts, these systems primarily collect publicly reported incidents from the media, which are likely a small fraction of the total. These databases serve more as a source of public accountability for developers of problematic systems than as a comprehensive repository suitable for government use and analysis. The OECD system lacks a proper taxonomy for different incident types and contexts, and while the RAIC database applies two external taxonomies to its data, it does so only at an aggregated level. Additionally, the OECD and RAIC systems depend on their parent organizations’ continued support, whereas AIIRH would be statutorily guaranteed.
The U.S. government should do all it can to facilitate reporting of AI incidents and risks that is as comprehensive as possible, enabling policymakers to make informed decisions and respond flexibly as the technology develops. As it has done in the cybersecurity space, it is appropriate for the federal government to act as a focal point for the collection, analysis, and dissemination of data that is nationally distributed, multi-sectoral, and national in impact. Many federal agencies are also equipped to handle sensitive and valuable data appropriately, as is the case with AI system specifications. Compiling this kind of comprehensive dataset would constitute a national public good.
Plan of Action
We propose a framework for a voluntary Artificial Intelligence Incident Reporting Hub, inspired by existing public initiatives in cybersecurity, like the Common Vulnerabilities and Exposures (CVE) list funded by CISA, and in aviation, like the FAA’s confidential Aviation Safety Reporting System (ASRS).
AIIRH should cover a broad swath of what could be considered an AI incident in order to give agencies maximal data for setting standards, establishing best practices, and exploring future safeguards. Since there is no universally agreed-upon definition of an AI safety “incident,” AIIRH would (at least initially) utilize the OECD definitions of “AI incident” and “AI hazard,” as follows:
- An AI incident is an event, circumstance or series of events where the development, use or malfunction of one or more AI systems directly or indirectly leads to any of the following harms:
- (a) injury or harm to the health of a person or groups of people;
- (b) disruption of the management and operation of critical infrastructure;
- (c) violations of human rights or a breach of obligations under the applicable law intended to protect fundamental, labour and intellectual property rights;
- (d) harm to property, communities or the environment.
- An AI hazard is an event, circumstance or series of events where the development, use or malfunction of one or more AI systems could plausibly lead to an AI incident, i.e., any of the following harms:
- (a) injury or harm to the health of a person or groups of people;
- (b) disruption of the management and operation of critical infrastructure;
- (c) violations of human rights or a breach of obligations under the applicable law intended to protect fundamental, labour and intellectual property rights;
- (d) harm to property, communities or the environment.
With this scope, the system would cover a wide range of confirmed harms and situations likely to cause harm, including dangerous capabilities like CBRN threats. Having an expansive repository of incidents also positions organizations like NIST to create and iterate on future taxonomies of the space, unifying language for developers, researchers, and civil society. This broad approach does introduce some overlap with the voluntary cybersecurity incident reporting under the expanded CVE and National Vulnerability Database (NVD) systems proposed by Senators Warner and Tillis in their Secure AI Act. However, the CVE provides no analysis of incidents, so it should be viewed as a starting point that feeds into AIIRH, and the NVD applies only traditional cybersecurity metrics, whereas AIIRH could accommodate a much broader, holistic analysis.
Reporting submitted to AIIRH should highlight key issues, including whether the incident occurred organically or as the result of intentional misuse, along with details of the harm caused or deemed plausible. Importantly, reporting forms should allow contributors to share as much information as possible while requiring as little as possible, in order to encourage industry reporting without fear of leaking sensitive information and to lower the transaction costs of reporting. While as much data on these incidents as possible should be shared broadly to build public trust, there should be guarantees that any confidential information and sensitive system details remain secure. Contributors should also have the option to reveal their identity only to AIIRH staff and otherwise remain anonymous.
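To make this concrete, the sketch below illustrates one way a reporting form could separate a minimal set of required fields from optional, confidentiality-sensitive detail. It is a hypothetical Python schema, not a proposed standard; all field names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class IncidentType(Enum):
    """Rough OECD-style split used in this sketch (an assumption, not an official schema)."""
    AI_INCIDENT = "incident"  # harm has already occurred
    AI_HAZARD = "hazard"      # harm is plausible but has not yet occurred

@dataclass
class IncidentReport:
    # Required core: kept deliberately small to lower the transaction cost of reporting.
    incident_type: IncidentType
    summary: str                    # plain-language description of what happened
    intentional_misuse: bool        # organic failure vs. deliberate misuse
    harm_description: str           # harm caused, or harm deemed plausible

    # Optional detail: contributors share only what they are comfortable disclosing.
    system_name: Optional[str] = None        # may be withheld for confidentiality
    technical_details: Optional[str] = None  # sensitive specifications, government-only access
    reporter_identity: Optional[str] = None  # revealed only to AIIRH staff, if at all
    share_redacted_publicly: bool = False    # consent to anonymized public release
```

A schema along these lines keeps the mandatory burden low while still capturing the distinctions (incident versus hazard, organic failure versus intentional misuse) that later analysis would rely on.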
NIST is the natural candidate to serve as the reporting agency, as it has taken a larger role in AI standards-setting since the release of the Biden Administration’s Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. NIST also has experience with incident reporting through its NVD, which contains agency experts’ analysis of CVE entries. Finally, much as the National Aeronautics and Space Administration (NASA) operates the FAA’s confidential reporting system, ASRS, as a neutral third party, NIST is a non-enforcing agency with excellent industry relationships built through its collaborations on standards and practices. CISA is another option, as it funds and manages several incident reporting systems and would cover AI security incidents if the Warner-Tillis bill passes, but there is no reason to believe CISA has the expertise to address harms such as algorithmic discrimination or CBRN threats.
While NIST might be a trusted party to maintain a confidential system, employees reporting credible threats to AIIRH should have additional guarantees against retaliation from current or former employers in the form of whistleblower protections. These are particularly relevant in light of reports that OpenAI, an AI industry leader, has allegedly neglected safety and prevented employee disclosure through restrictive nondisparagement agreements. A potential model is the whistleblower protections introduced in California SB 1047, which forbid employers from preventing, or retaliating on the basis of, the disclosure of an AI incident to an appropriate government agent.
To further incentivize reporting, contributors could be granted advanced, real-time, or more complete access to AIIRH reporting data. While the goal is to encourage the active exchange of threat vectors, in acknowledgment of the aforementioned confidentiality concerns, reporters could opt out of having their data shared in this way, forgoing their own advanced access. If they instead allow a redacted version of their incident to be shared anonymously with other contributors, they would retain that access.
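The incentive structure above reduces to a simple access rule, sketched here for illustration only; the tier names and function are assumptions, not part of the proposal.

```python
def contributor_access_tier(shares_full_report: bool, shares_redacted_anonymized: bool) -> str:
    """Map a contributor's sharing choices to an illustrative data-access tier."""
    if shares_full_report or shares_redacted_anonymized:
        # Sharing a full report, or at least a redacted and anonymized version,
        # preserves advanced, real-time access to the pooled AIIRH reporting data.
        return "advanced"
    # Contributors who fully opt out of sharing forgo advanced access.
    return "standard"
```

For example, a lab that opts out of full sharing but permits a redacted, anonymized summary of its incident would still qualify for the advanced tier.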
Key stakeholders include:
- NIST’s Information Services Office
- NIST’s Artificial Intelligence Safety Institute
- DHS/CISA Stakeholder Engagement Division and Cybersecurity Division
- NIST’s Congressional and Legislative Affairs Office and Office of Information Systems Management
Related proposed bills include:
- Secure AI Act – Senators Warner and Tillis
- California Senate Bill 1047 (SB 1047) – Senators Wiener, Roth, Rubio, and Stern
The proposal would likely require congressional action to appropriate funds for the creation and implementation of AIIRH. Creating and maintaining the hub would cost an estimated $10–25 million annually, with a pay-for to be determined.
Conclusion
An AI Incident Reporting System would enable informed policymaking as the risks of AI continue to develop. By allowing organizations to report information on serious risks that their systems may pose in areas like CBRN, illegal discrimination, and cyber threats, this proposal would enable the U.S. government to collect and analyze high-quality data and, if needed, promulgate standards to prevent the proliferation of dangerous capabilities to non-state actors. By incentivizing voluntary reporting, we can preserve innovative and high-value uses of AI for society and the economy, while staying up-to-date with the quickly evolving frontier in cases where regulatory oversight is paramount.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
NIST has institutional expertise with incident reporting, having maintained the National Vulnerability Database and the Disaster Data Portal. As a standard-setting body that frequently collaborates with companies without regulating them, NIST is well placed to keep pace with developments in new areas of technology and to act as a trusted home for cross-industry collaboration on sensitive issues. Public-private collaboration in standards development also increases the likelihood that companies can adopt the standards without being overly burdened. In the Biden Administration’s Executive Order on AI, NIST was given authority over establishing testbeds and guidance for testing and red-teaming of AI systems, making it a natural home for the closely related work proposed here.
AIIRH staff should be empowered to conduct follow-ups on credible threat reports and to share information on those reports with leadership at the Department of Commerce, the Department of Homeland Security, the Department of Defense, and other agencies.
AIIRH staff could work with others at NIST to build a taxonomy of AI incidents, which would provide a helpful shared language for standards and regulations. Additionally, staff might share incidents as relevant with interested agencies such as CISA, the Department of Justice, and the Federal Trade Commission, although steps should be taken to minimize retribution against organizations that voluntarily disclose incidents (in contrast to whistleblower cases).
Similar to the logic of companies disclosing cybersecurity vulnerabilities and incidents, voluntary reporting builds public trust, earns companies favor with enforcement agencies, and increases safety broadly across the community. The confidentiality guarantees provided by AIIRH should make the prospect more appealing as well. Separately, individuals at organizations like OpenAI and Google have demonstrated a propensity towards disclosure through whistleblower complaints when they believe their employers are acting unsafely.