
Accelerating R&D for Critical AI Assurance and Security Technologies
The opportunities presented by advanced artificial intelligence are immense, from accelerating cutting-edge scientific research to improving key government services. However, for these benefits to be realized, both the private and public sectors need confidence that AI tools are reliable and secure. This will require R&D effort to solve urgent technical challenges related to understanding and evaluating emergent AI behaviors and capabilities, securing AI hardware and infrastructure, and preparing for a world with many advanced AI agents.
To secure global adoption of U.S. AI technology and ensure America’s workforce can fully leverage advanced AI, the federal government should take a strategic and coordinated approach to support AI assurance and security R&D by: clearly defining AI assurance and security R&D priorities; establishing an AI R&D consortium and deploying agile funding mechanisms for critical R&D areas; and establishing an AI Frontier Science Fellowship to ensure a pipeline of technical AI talent.
Challenge and Opportunity
AI systems have progressed rapidly in the past few years, demonstrating human-level and even superhuman performance across diverse tasks. Yet, they remain plagued by flaws that produce unpredictable and potentially dangerous failures. Frontier systems are vulnerable to attacks that can manipulate them into executing unintended actions, hallucinate convincing but incorrect information, and exhibit other behaviors that researchers struggle to predict or control.
As AI capabilities rapidly advance toward more consequential applications—from medical diagnosis to financial decision-making to military systems—these reliability issues could pose increasingly severe risks to public safety and national security while discouraging beneficial adoption. Recent polling shows that just 32% of Americans trust AI, and this limited trust will slow the uptake of impactful AI use-cases that could drive economic growth and enhance national competitiveness.
The federal government has an opportunity to secure America’s technological lead and promote global adoption of U.S. AI by catalyzing research to address urgent AI reliability and security challenges—challenges that align with broader policy consensus reflected in the National Security Commission on AI’s recommendations and bipartisan legislative efforts like the VET AI Act. Recent research has surfaced substantial expert consensus around priority research areas that address the following three challenges.
The first challenge involves understanding emergent AI capabilities and behaviors. As AI systems grow larger, a process referred to as “scaling,” they develop unexpected capabilities and reasoning patterns that researchers cannot predict, making it difficult to anticipate risks or ensure reliable performance. Addressing this means advancing the science of AI scaling and evaluations.
This research aims to build a scientific understanding of how AI systems learn, reason, and exhibit diverse capabilities. This involves not only studying specific phenomena like emergence and scaling but, more broadly, employing and refining evaluations as the core empirical methodology for characterizing all facets of AI behavior. It includes evaluations in high-stakes areas such as chemical, biological, radiological, and nuclear (CBRN) weapons, cybersecurity, and deception, as well as broader research on AI evaluations to ensure that AI systems can be accurately assessed and understood. Example work includes Wijk et al. (2024) and McKenzie et al. (2023).
The second challenge is securing AI hardware and infrastructure. AI systems require robust protection of model weights, secure deployment environments, and resilient supply chains to prevent theft, manipulation, or compromise by malicious actors seeking to exploit these powerful technologies. Addressing this means advancing hardware and infrastructure security for AI.
Ensuring the security of AI systems at the hardware and infrastructure level involves protecting model weights, securing deployment environments, maintaining supply chain integrity, and implementing robust monitoring and threat detection mechanisms. Methods include confidential computing, rigorous access controls, specialized hardware protections, and continuous security oversight. Example work includes Nevo et al. (2024) and Hepworth et al. (2024).
The third challenge involves preparing for a world with many AI agents—AI models that can act autonomously. Alongside their potentially immense benefits, the increasing deployment of AI agents creates critical blind spots, as agents could coordinate covertly beyond human oversight, amplify failures into system-wide cascades, and combine capabilities in ways that circumvent existing safeguards. Addressing this means advancing agent metrology, infrastructure, and security.
This research involves developing a deeper understanding of agentic behavior in LLM-based systems, including how LLM agents learn over time, respond to underspecified goals, and engage with their environments. It also includes research to ensure safe multi-agent interactions, such as detecting and preventing malicious collective behaviors, studying how transparency affects agent interactions, and developing evaluations for agent behavior and interaction. Example work includes Lee and Tiwari (2024) and Chan et al. (2024).
While academic and industry researchers have made progress on these problems, this progress is not keeping pace with AI development and deployment. The market is likely to underinvest in research that is experimental or lacks immediate commercial application. The U.S. government, long the world’s leading sponsor of fundamental research, has an opportunity to unlock AI’s transformative potential by accelerating assurance and security research.
Plan of Action
The rapid pace of AI advancement demands a new strategic, coordinated approach to federal R&D for AI assurance and security. Given financial constraints, it is more important than ever to make sure that the impact of every dollar invested in R&D is maximized.
Much of the critical technical expertise now resides in universities, startups, and leading AI companies rather than traditional government labs. To harness this distributed talent, we need R&D mechanisms that move at the pace of innovation, leverage academic research excellence, engage early-career scientists who drive breakthroughs, and partner with industry leaders who can share access to essential compute resources and frontier models. Traditional bureaucratic processes risk leaving federal efforts perpetually behind the curve.
The U.S. government should implement a three-pronged plan to advance the above R&D priorities.
Recommendation 1. Clearly define AI assurance and security R&D priorities
The Office of Science and Technology Policy (OSTP) and the National Science Foundation (NSF) should highlight critical areas of AI assurance and security as R&D priorities by including them in the 2025 update of the National AI R&D Strategic Plan and the forthcoming AI Action Plan. All federal agencies conducting AI R&D should engage in the construction of these plans to explain how their expertise could best contribute to these goals. For example, the Defense Advanced Research Projects Agency (DARPA)’s Information Innovation Office could leverage its expertise in AI security to design secure interaction protocols and environments for AI agents that mitigate risks from rogue agents.
These priorities would help coordinate government R&D activities: they would give funding agencies a common agenda, direct public research institutes such as the National Labs toward fundamental R&D, inform Congress’s legislative decisions, and serve as a guide for industry R&D.
Additionally, given the dynamic nature of frontier AI research, OSTP and NSF should publish an annual survey of progress in critical AI assurance and security areas and identify which challenges are the highest priority.
Recommendation 2. Establish an AI R&D consortium and deploy agile funding mechanisms for critical R&D
As noted by OSTP Director Michael Kratsios, “prizes, challenges, public-private partnerships, and other novel funding mechanisms, can multiply the impact of targeted federal dollars. We must tie grants to clear strategic targets, while still allowing for the openness of scientific exploration.” Federal funding agencies should develop and implement agile funding mechanisms for AI assurance and security R&D in line with established priorities. Congress should include reporting language in its Commerce, Justice, Science (CJS) appropriations bill that supports accelerated R&D disbursements for investment into prioritized areas.
A central mechanism should be the creation of an AI Assurance and Security R&D Consortium, jointly led by DARPA and NSF, bringing together government, AI companies, and universities. In this model:
- Government provides funding for personnel and administrative support, and manages the consortium’s strategic direction
- AI companies contribute model access, compute credits, and engineering expertise
- Universities provide researchers and facilities for conducting fundamental research
This consortium structure would enable rapid resource sharing, collaborative research projects, and accelerated translation of research into practice. It would operate under flexible contracting mechanisms using Other Transaction Authority (OTA) to reduce administrative barriers.
Beyond the consortium, funding agencies should leverage OTA and Prize Competition Authority to flexibly contract and fund research projects in priority areas. New public-private grant vehicles focused on funding fundamental research in priority areas should be set up via existing foundations linked to funding agencies, such as the NSF Foundation, DOE’s Foundation for Energy Security and Innovation, or the proposed NIST Foundation.
Specific funding mechanisms should be chosen based on the target technology’s maturity level. For example, the NSF can support more fundamental research through fast grants via its EAGER and RAPID programs. Previous fast-grant programs, such as SGER, were found to be remarkably effective, with “transformative research results tied to more than 10% of projects.”
For research areas where clear, well-defined technical milestones are achievable, such as developing secure cluster-scale environments for large AI training workloads, the government can support the creation of focused research organizations (FROs) and implement advance market commitments (AMCs) to carry technologies across the ‘valley of death’. DARPA and IARPA can administer higher-risk, more ambitious R&D programs with national security applications.
Recommendation 3. Establish an AI Frontier Science Fellowship to ensure a pipeline of technical AI talent that can contribute directly to R&D and support fast-grant program management
It is critical to ensure that America has a growing pool of talented researchers entering the field of AI assurance and security, given its strategic importance to American competitiveness and national security.
The NSF should launch an AI Frontier Science Fellowship targeting early-career researchers in critical AI assurance and security R&D. Drawing on proven models like the CyberCorps Scholarship for Service, COVID-19 Fast Grants, and proposals for “micro-ARPAs”, this program would operate on two tracks:
- Frontier Scholars: This track would provide comprehensive research support for PhD students and post-docs conducting relevant research on priority AI security and reliability topics. This includes computational resources, research rotations at government labs and agencies, and financial support.
- Rapid Grant Program Managers (PM): This track recruits researchers to serve fixed terms as Rapid Grant PMs, responsible for administering EAGER/RAPID grants focused on AI assurance and security.
This fellowship solves multiple problems at once. It builds the researcher pipeline while creating a nimble, decentralized approach to science funding that is more in line with the dynamic nature of the field. This should improve administrative efficiency and increase the surface area for innovation by allowing for more early-stage high-risk projects to be funded. Also, PMs who perform well in administering these small, fast grants can then become full-fledged program officers and PMs at agencies like the NSF and DARPA. This program (including grant budget) would cost around $40 million per year.
Conclusion
To unlock AI’s immense potential, from research to defense, we must ensure these tools are reliable and secure. This demands R&D breakthroughs to better understand emergent AI capabilities and behaviors, secure AI hardware and infrastructure, and prepare for a multi-agent world. The federal government must lead by setting clear R&D priorities, building foundational research talent, and injecting targeted funding to fast-track innovation. This unified push is key to securing America’s AI leadership and ensuring that American AI is the global gold standard.
This memo was written by an AI Safety Policy Entrepreneurship Fellow over the course of a six-month, part-time program that supports individuals in advancing their policy ideas into practice. You can read more policy memos and learn about Policy Entrepreneurship Fellows here.
Frequently Asked Questions
Are these recommendations achievable with existing budgets and authorities?
Yes. The recommendations are achievable by reallocating existing budgets and using existing authorities, though this would likely mean accepting a smaller initial scale.
In terms of authorities, OSTP and NSF can already update the National AI R&D Strategic Plan and establish AI assurance and security priorities through normal processes. To implement agile funding mechanisms, agencies can use OTA and Prize Competition Authority. Fast grants require no special statute and can be done under existing grant authorities.
In terms of budget, agencies can reallocate 5-10% of existing AI research funds towards security and assurance R&D. The Frontier Science Fellowship could start as a $5-10 million pilot under NSF’s existing education authorities, e.g. drawing from NSF’s Graduate Research Fellowship Program.
While agencies have flexibility to begin this work, achieving the memo’s core objective – ensuring AI systems are trustworthy and reliable for workforce and military adoption – requires dedicated funding. Congress could provide authorization and appropriation for a named fellowship, which would make the program more stable and allow it to survive personnel turnover.
Won’t the market solve these problems on its own?
Market incentives drive companies to fix AI failures that directly impact their bottom line, e.g., chatbots giving bad customer service or autonomous vehicles crashing. More visible, immediate problems are likely to be prioritized because customers demand fixes or because of liability concerns. This memo focuses on R&D areas that the private sector is less likely to tackle adequately.
The private sector will address some security and reliability issues, but significant gaps are likely to remain. Understanding emergent model capabilities demands costly fundamental research that generates little immediate commercial return. Likewise, securing AI infrastructure against nation-state attacks will likely require multi-year R&D efforts, and companies may fail to coordinate on developing these technologies without a clear demand signal. Finally, systemic dangers arising from multi-agent interactions might be left unmanaged because these failures emerge from complex dynamics with unclear liability attribution.
The government can step in to fund the foundational research that the market is likely to undersupply by default and help coordinate the key stakeholders in the process.
Why would AI companies participate in government-funded assurance and security research?
Companies need security solutions to access regulated industries and enterprise customers. Collaborating on government-funded research provides these solutions while sharing costs and risks.
The proposed AI Assurance and Security R&D Consortium in Recommendation 2 would create a structured framework for cooperation. Companies contribute model access and compute credits while receiving:
- Government-funded researchers working on their deployment challenges
- Shared IP rights under consortium agreements
- Early access to security and reliability innovations
- Risk mitigation through collaborative cost-sharing
Under the consortium’s IP framework, companies retain full commercial exploitation rights while the government receives unlimited rights for government purposes. In the absence of a consortium agreement, an alternative arrangement could be a patent pool, in which companies access the pool’s patented technologies through a single agreement. These structures, combined with the fellowship program’s government-funded researchers, create strong incentives for private sector participation while advancing critical public research objectives.