FAS Receives $1.5 Million Grant on The Artificial Intelligence / Global Risk Nexus

Grant Funds Research of AI’s Impact on Nuclear Weapons, Biosecurity, Military Autonomy, Cyber, and other global issues

Washington, D.C. – September 11, 2024 – The Federation of American Scientists (FAS) has received a $1.5 million grant from the Future of Life Institute (FLI) to investigate the implications of artificial intelligence on global risk. The 18-month project supports FAS’s efforts to bring together the world’s leading security and technology experts to better understand and inform policy on the nexus between AI and several global issues, including nuclear deterrence and security, bioengineering, autonomy and lethality, and cyber security-related issues.

FAS’s CEO Daniel Correa noted that “understanding and responding to how new technology will change the world is why the Federation of American Scientists was founded. Against this backdrop, FAS has embarked on a critical journey to explore AI’s potential. Our goal is not just to understand these risks, but to ensure that as AI technology advances, humanity’s ability to understand and manage the potential of this technology advances as well.

“When the inventors of the atomic bomb looked at the world they helped create, they understood that without scientific expertise and broader perspectives, humanity would never realize the potential benefits they had helped bring about. They founded FAS to ensure the voice of objective science was at the policy table, and we remain committed to that effort after almost 80 years.”

“We’re excited to partner with FLI on this essential work,” said Jon Wolfsthal, who directs FAS’ Global Risk Program. “AI is changing the world. Understanding this technology and how humans interact with it will affect the pressing global issues that will determine the fate of all humanity. Our work will help policy makers better understand these complex relationships. No one fully understands what AI will do for us or to us, but having all perspectives in the room and working to protect against negative outcomes and maximizing positive ones is how good policy starts.”

“As the power of AI systems continues to grow unchecked, so too does the risk of devastating misuse and accidents,” writes FLI President Max Tegmark. “Understanding the evolution of different global threats in the context of AI’s dizzying development is instrumental to our continued security, and we are honored to support FAS in this vital work.”

The project will comprise a series of activities: high-level, focused workshops with world-leading experts and officials on different aspects of artificial intelligence and global risk; policy sprints and fellows; and directed research. It will conclude with a global summit on global risk and AI in Washington in 2026.

###

ABOUT FAS

The Federation of American Scientists (FAS) works to advance progress on a broad suite of contemporary issues where science, technology, and innovation policy can deliver dramatic progress, and seeks to ensure that scientific and technical expertise have a seat at the policymaking table. Established in 1945 by scientists in response to the atomic bomb, FAS continues to work on behalf of a safer, more equitable, and more peaceful world. More information at fas.org.

ABOUT FLI

Founded in 2014, the Future of Life Institute (FLI) is a leading nonprofit working to steer transformative technology towards benefiting humanity. FLI is best known for its 2023 open letter calling for a six-month pause on advanced AI development, endorsed by experts such as Yoshua Bengio and Stuart Russell, as well as its work on the Asilomar AI Principles and the recent EU AI Act.

Public Comment on the U.S. Artificial Intelligence Safety Institute’s Draft Document: NIST AI 800-1, Managing Misuse Risk for Dual-Use Foundation Models

Public comments serve the executive branch by informing more effective, efficient program design and regulation. As part of our commitment to evidence-based, science-backed policy, FAS staff leverage public comment opportunities to embed science, technology, and innovation into policy decision-making.

The Federation of American Scientists (FAS) is a non-partisan organization dedicated to using science and technology to benefit humanity through equitable and impactful policy. With a strong track record in AI governance, FAS has actively contributed to the development of AI standards and frameworks, including providing feedback on NIST AI 600-1, the Generative AI Profile. Our work spans advocating for federal AI testbeds, recommending policy measures for frontier AI developers, and evaluating industry adoption of the NIST AI Risk Management Framework. We are members of the U.S. AI Safety Institute Research Consortium, and we responded to NIST’s request for information earlier this year concerning its responsibilities under sections 4.1, 4.5, and 11 of the AI Executive Order.

We commend NIST’s U.S. Artificial Intelligence Safety Institute for developing the draft guidance on “Managing Misuse Risk for Dual-Use Foundation Models.” This document represents a significant step toward establishing robust practices for mitigating catastrophic risks associated with advanced AI systems. The guidance’s emphasis on comprehensive risk assessment, transparent decision-making, and proactive safeguards aligns with FAS’s vision for responsible AI development.

In our response, we highlight several strengths of the guidance, including its focus on anticipatory risk assessment and the importance of clear documentation. We also identify areas for improvement, such as the need for harmonized language and more detailed guidance on model development safeguards. Our key suggestions include recommending a more holistic socio-technical approach to risk evaluation, strengthening language around halting development for unmanageable risks, and expanding the range of considered safeguards. We believe these adjustments will further strengthen NIST’s crucial role in shaping responsible AI development practices.

Background and Context

The rapid advancement of AI foundation models has spurred novel industry-led risk mitigation strategies. Leading AI companies have voluntarily adopted frameworks like Responsible Scaling Policies and Preparedness Frameworks, outlining risk thresholds and mitigation strategies for increasingly capable AI systems. (Our response to NIST’s February RFI was largely an exploration of these policies, their benefits and drawbacks, and how they could be strengthened.)

Managing misuse risks in foundation models is of paramount importance given their broad applicability and potential for dual use. As these models become more powerful, they may inadvertently enable malicious actors to cause significant harm, including facilitating the development of weapons, enabling sophisticated cyber attacks, or generating harmful content. The challenge lies not only in identifying current risks but also in anticipating future threats that may emerge as AI capabilities expand.

NIST’s new guidance on “Managing Misuse Risk for Dual-Use Foundation Models” builds upon these industry initiatives, providing a more standardized and comprehensive approach to risk management. By focusing on objectives such as anticipating potential misuse, establishing clear risk thresholds, and implementing robust evaluation procedures, the guidance creates a framework that can be applied across the AI development ecosystem. This approach is crucial for ensuring that as AI technology advances, appropriate safeguards are in place to protect against potential misuse while still fostering innovation.

Strengths of the guidance

1. Comprehensive Documentation and Transparency

The guidance’s emphasis on thorough documentation and transparency represents a significant advancement in AI risk management. For every practice under every objective, the guidance indicates appropriate documentation; this approach is more thorough in advancing transparency than any comparable guidance to date. The creation of a paper trail for decision-making and risk evaluation is crucial for both internal governance and potential external audits.

The push for transparency extends to collaboration with external stakeholders. For instance, practice 6.4 recommends providing “safe harbors for third-party safety research,” including publishing “a clear vulnerability disclosure policy for model safety issues.” This openness to external scrutiny and feedback is essential for building trust and fostering collaborative problem-solving in AI safety. (FAS has published a legislative proposal calling for enshrining “safe harbor” protections for AI researchers into law.)

2. Lifecycle Approach to Risk Management

The guidance excels in its holistic approach to risk management, covering the entire lifecycle of foundation models from pre-development assessment through to post-deployment monitoring. This comprehensive approach is evident in the structure of the document itself, which follows a logical progression from anticipating risks (Objective 1) through to responding to misuse after deployment (Objective 6).

The guidance demonstrates a proactive stance by recommending risk assessment before model development. Practice 1.3 suggests to “Estimate the model’s capabilities of concern before it is developed…”, which helps anticipate and mitigate potential harms before they materialize. The framework for red team evaluations (Practice 4.2) is particularly robust, recommending independent external experts and suggesting ways to compensate for gaps between red teams and real threat actors. The guidance also emphasizes the importance of ongoing risk assessment. Practice 3.2 recommends to “Periodically revisit estimates of misuse risk stemming from model theft…” This acknowledgment of the dynamic nature of AI risks encourages continuous vigilance.

3. Strong Stance on Model Security and Risk Tolerance

The guidance takes a firm stance on model security and risk tolerance, particularly in Objective 3. It unequivocally states that models relying on confidentiality for misuse risk management should only be developed when theft risk is sufficiently mitigated. This emphasizes the critical importance of security in AI development, including considerations for insider threats (Practice 3.1).

The guidance also demonstrates a realistic approach to the challenges posed by different deployment strategies. In Practice 5.1, it notes, “For example, allowing fine-tuning via API can significantly limit options to prevent jailbreaking and sharing the model’s weights can significantly limit options to monitor for misuse (Practice 6.1) and respond to instances of misuse (Practice 6.2).” This candid discussion of the limitations of safety interventions for open weight foundation models is crucial for fostering realistic risk assessments.

Additionally, the guidance promotes a conservative approach to risk management. Practice 5.3 recommends to “Consider leaving a margin of safety between the estimated level of risk at the point of deployment and the organization’s risk tolerance.” It further suggests considering “a larger margin of safety to manage risks that are more severe or less certain.” This approach provides an extra layer of protection against unforeseen risks or rapid capability advancements, which is crucial given the uncertainties inherent in AI development.
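One way to read the margin-of-safety recommendation in Practice 5.3 is as a simple pre-deployment check. The Python sketch below is illustrative only: the numeric risk scale, the `required_margin` heuristic, and every function and parameter name are assumptions made for this example, not anything specified in the NIST guidance.

```python
def required_margin(base_margin: float, severity: float, uncertainty: float) -> float:
    """Hypothetical heuristic: widen the margin as severity or uncertainty grows.

    severity and uncertainty are assumed to be scores in [0, 1].
    """
    return base_margin * (1.0 + severity + uncertainty)


def within_margin_of_safety(estimated_risk: float, risk_tolerance: float, margin: float) -> bool:
    """Return True only if the estimated risk sits at least `margin` below the tolerance."""
    return estimated_risk + margin <= risk_tolerance


# Illustrative numbers only: a severe, fairly uncertain risk calls for a wider margin.
margin = required_margin(base_margin=0.05, severity=0.8, uncertainty=0.6)
ok_to_deploy = within_margin_of_safety(estimated_risk=0.30, risk_tolerance=0.40, margin=margin)
```

With these made-up inputs the widened margin pushes the estimate past the tolerance, so the check fails even though the raw estimate is below the tolerance. In practice an organization would need its own justified scale and margin rule.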

These elements collectively demonstrate NIST’s commitment to promoting realistic and robust risk management practices that prioritize safety and security in AI development and deployment. However, while the NIST guidance demonstrates several important strengths, there are areas where it could be further improved to enhance its effectiveness in managing misuse risks for dual-use foundation models.

Areas for improvement

1. Need for a More Comprehensive Socio-technical Approach to Measuring Misuse Risk

Objective 4 of the guidance demonstrates a commendable effort to incorporate elements of a socio-technical approach in measuring misuse risk. The guidance recognizes the importance of considering both technical and social factors, emphasizes the use of red teams to assess potential misuse scenarios, and acknowledges the need to consider different levels of access and various threat actors. Furthermore, it highlights the importance of avoiding harm during the measurement process, which is crucial in a socio-technical framework.

However, the guidance falls short in fully embracing a comprehensive socio-technical perspective. While it touches on the importance of external experts, it does not sufficiently emphasize the value of diverse perspectives, particularly from individuals with lived experiences relevant to specific risk scenarios. The guidance also lacks a structured approach to exploring the full range of potential misuse scenarios across different contexts and risk areas. Finally, the guidance does not mention measuring absolute versus marginal risks (i.e., how much total misuse risk a model poses in a specific context versus how much marginal risk it poses compared to existing tools). These gaps limit the effectiveness of the proposed risk measurement approach in capturing the full complexity of AI system interactions with human users and broader societal contexts.

Specific recommendations for improving socio-technical approach

The NIST guidance in Practice 1.3 suggests estimating model capabilities by comparison to existing models, but provides little direction on how to conduct these comparisons effectively. To improve this, NIST could incorporate the concept of “available affordances.” This concept emphasizes that an AI system’s risk profile depends not just on its absolute capabilities, but also on the environmental resources and opportunities for affecting the world that are available to it.

Additionally, Kapoor et al. (2024) emphasize the importance of assessing the marginal risk of open foundation models compared to existing technologies or closed models. This approach aligns with a comprehensive socio-technical perspective by considering not just the absolute capabilities of AI systems, but also how they interact with existing technological and social contexts. For instance, when evaluating cybersecurity risks, they suggest considering both the potential for open models to automate vulnerability detection and the existing landscape of cybersecurity tools and practices. This marginal risk framework helps to contextualize the impact of open foundation models within broader socio-technical systems, providing a more nuanced understanding of their potential benefits and risks.
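The distinction between absolute and marginal risk can be made concrete with a small calculation. The following Python sketch is a simplified illustration, not the method of Kapoor et al. or of NIST: it assumes a hypothetical study that measures how often participants complete a harmful task with and without model access, and it uses task success rate as a stand-in for risk. The class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class UpliftStudy:
    """Hypothetical study outcome: task success rates in [0, 1]."""
    success_with_existing_tools: float  # baseline, e.g. web search only
    success_with_model: float           # same task, with model access


def absolute_risk(study: UpliftStudy) -> float:
    """Risk proxy for the model considered on its own."""
    return study.success_with_model


def marginal_risk(study: UpliftStudy) -> float:
    """Risk proxy attributable to the model beyond what existing tools already enable."""
    return study.success_with_model - study.success_with_existing_tools


# Illustrative numbers only.
study = UpliftStudy(success_with_existing_tools=0.5, success_with_model=0.625)
```

Here the absolute figure (0.625) looks high, while the marginal figure (0.125) shows that most of that capability was already available without the model. Reporting both gives a fuller picture than either alone.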

NIST could recommend that organizations assess both the absolute capabilities of their AI systems and the affordances available to them in potential deployment contexts. This approach would provide a more comprehensive view of potential risks than simply comparing models in isolation. For instance, the guidance could suggest evaluating how a system’s capabilities might change when given access to different interfaces, actuators, or information sources.

Similarly, Weidinger et al. (2023) argue that while quantitative benchmarks are important, they are insufficient for comprehensive safety evaluation. They suggest complementing quantitative measures with qualitative assessments, particularly at the human interaction and systemic impact layers. NIST could enhance its guidance by providing more specific recommendations for integrating qualitative evaluation methods alongside quantitative benchmarks.

NIST should acknowledge potential implementation challenges with a comprehensive socio-technical approach. Organizations may struggle to create benchmarks that accurately reflect real-world misuse scenarios, particularly given the rapid evolution of AI capabilities and threat landscapes. Maintaining up-to-date benchmarks in a fast-paced field presents another ongoing challenge. Additionally, organizations may face difficulties in translating quantitative assessments into actionable risk management strategies, especially when dealing with novel or complex risks. NIST could enhance the guidance by providing strategies for navigating these challenges, such as suggesting collaborative industry efforts for benchmark development or offering frameworks for scalable testing approaches.

OpenAI’s approach of using human participants to evaluate AI capabilities provides both a useful model for more comprehensive evaluation and an example of quantification challenges. While their evaluation attempted to quantify biological risk increase from AI access, they found that, as they put it, “Translating quantitative results into a meaningfully calibrated threshold for risk turns out to be difficult.” This underscores the need for more research on how to set meaningful thresholds and interpret quantitative results in the context of AI safety.

2. Inconsistencies in Risk Management Language

There are instances where the guidance uses varying levels of strength in its recommendations, particularly regarding when to halt or adjust development. For example, Practice 2.2 recommends to “Plan to adjust deployment or development strategies if misuse risks rise to unacceptable levels,” while Practice 3.2 uses stronger language, suggesting to “Adjust or halt further development until the risk of model theft is adequately managed.” This variation in language could lead to confusion and potentially weaker implementation of risk management strategies.

Furthermore, while the guidance emphasizes the importance of managing risks before deployment, it does not provide clear criteria for what constitutes “adequately managed” risk, particularly in the context of development rather than deployment. More consistent and specific language around these critical decision points would strengthen the guidance’s effectiveness in promoting responsible AI development.

Specific recommendations for strengthening language on halting development for unmanageable risks

To address the inconsistencies noted above, we suggest the following changes:

1. Standardize the language across the document to consistently use strong phrasing such as “Adjust or halt further development” when discussing responses to unacceptable levels of risk.

The current guidance uses varying levels of strength in its recommendations regarding development adjustments. For instance, Recommendation 4 of Practice 2.2 uses the phrase “Plan to adjust deployment or development strategies,” while Recommendation 3 of Practice 3.2 more strongly suggests to “Adjust or halt further development.” Consistent language would emphasize the critical nature of these decisions and reduce potential confusion or weak implementation of risk management strategies. This could be accomplished by changing the language of Practice 2.2, Recommendation 4 to “Plan to adjust or halt further development or deployment if misuse risks rise to unacceptable levels before adequate security and safeguards are available to manage risk.”

The need for stronger language regarding halting development is reflected both in NIST’s other work and in commitments that many frontier AI developers have publicly agreed to. For instance, the NIST AI Risk Management Framework, section 1.2.3 (Risk Prioritization), suggests: “In some cases where an AI system presents the highest risk – where negative impacts are imminent, severe harms are actually occurring, or catastrophic risks are present – development and deployment should cease in a safe manner until risks can be sufficiently mitigated.” Further, the AI Seoul Summit frontier AI safety commitments explicitly state that organizations should “set out explicit processes they intend to follow if their model or system poses risks that meet or exceed the pre-defined thresholds.” Importantly, these commitments go on to specify that “In the extreme, organisations commit not to develop or deploy a model or system at all, if mitigations cannot be applied to keep risks below the thresholds.”

2. Add to the list of transparency documentation for Practice 2.2 the following: “A decision-making framework for determining when risks have become truly unmanageable, considering factors like the severity of potential harm, the likelihood of the risk materializing, and the feasibility of mitigation strategies.”

While the current guidance emphasizes the importance of managing risks before deployment (e.g., in Practice 5.3), it does not provide clear criteria for what constitutes “adequately managed” risk, particularly in the context of development rather than deployment. A decision-making framework would provide clearer guidance on when to take the serious step of halting development. This addition would help prevent situations where development continues despite unacceptable risks due to a lack of clear stopping criteria. This recommendation aligns with the approach suggested by Alaga and Schuett (2023) in their paper on coordinated pausing, where they emphasize the need for clear thresholds and decision criteria to determine when AI development should be halted due to unacceptable risks.

3. Gaps in Model Development Safeguards

The guidance’s treatment of safeguards, particularly those related to model development, lacks sufficient detail to be practically useful. This is most evident in Appendix B, which lists example safeguards. While this appendix is a valuable addition, the safeguards related to model training (“Improve the model’s training”) are notably lacking in detail compared to the safeguards around model security and detecting misuse.

While the guidance covers many aspects of risk management comprehensively, especially model security, it does not provide enough specific recommendations for technical approaches to building safer models during the development phase. This gap could limit the practical utility of the guidance for AI developers seeking to implement safety measures from the earliest stages of model creation.

Specific recommendations for additional safeguards for model development

For some safeguards, we recommend that the misuse risk guidance explicitly reference relevant sections of NIST AI 600-1, the Generative Artificial Intelligence Profile. Specifically, the GAI Profile offers more comprehensive guidance on data-related and monitoring safeguards. For instance, the profile emphasizes documenting training data curation policies (MP-4.1-004) and establishing policies for data collection, retention, and quality (MP-4.1-005), which are crucial for managing misuse risk from the earliest stages of development. Additionally, the profile suggests implementing real-time monitoring processes for analyzing generated content performance and trustworthiness characteristics (MG-3.2-006), which could significantly enhance ongoing risk management during development. These references to the GAI Profile on model development safeguards could take the form of an additional item in Appendix B, or be incorporated into the relevant sections earlier in the guidance.

Beyond pointing to the model development safeguards included in the GAI Profile, we also recommend expanding Appendix B to include further safeguards for the model development phase. Both the GAI Profile and the current misuse risk guidance lack specific recommendations for two key model development safeguards: iterative safety testing throughout development and staged development/release processes. Below are two proposed additions to Appendix B:

Safeguard: Implement iterative safety testing throughout development.
Possible implementation methods:
* Develop and continuously update a comprehensive suite of safety tests covering identified risk areas.
* Establish quantitative safety benchmarks and ensure the model meets predefined thresholds before progressing to next development stages.
* Conduct regular adversarial testing, updating the test suite based on discovered vulnerabilities or emerging threats.

Safeguard: Consider a staged development and release process.
Possible implementation methods:
* Define clear safety criteria that must be met before advancing to each subsequent stage of model development or deployment.
* Implement a phased release strategy, incrementally increasing model capabilities or access only after thorough safety evaluations at each stage.
* If possible, maintain the capability to rapidly revert to previous versions or restrict access if safety issues are identified post-release.

The proposed safeguard “Implement iterative safety testing throughout development” addresses the current guidance’s limited detail on model training and development safeguards. This approach aligns with the emphasis on proactive and ongoing risk assessment in Barrett et al.’s AI Risk-Management Standards Profile for General-Purpose AI Systems and Foundation Models (the “GPAIS Profile”). Specifically, the Profile recommends identifying “GPAIS impacts…and risks (including potential uses, misuses, and abuses), starting from an early AI lifecycle stage and repeatedly through new lifecycle phases or as new information becomes available” (Barrett et al., 2023, p. 19). The GPAIS Profile further suggests that for larger models, developers should “analyze, customize, reanalyze, customize differently, etc., then deploy and monitor” (Barrett et al., 2023, p. 19), where “analyze” encompasses probing, stress testing, and red teaming. This iterative safety testing would integrate safety considerations throughout development, aligning with the guidance’s emphasis on proactive risk management and anticipating potential misuse risk.

Similarly, the proposed safeguard “Consider a staged development and release process” addresses a significant gap in the current guidance. While Practice 5.1 discusses pre-deployment risk assessment, it lacks a structured approach to incrementally increasing model capabilities or access. Solaiman et al. (2023) propose a “gradient of release” framework for generative AI, a phased approach to model deployment that allows for iterative risk assessment and mitigation. This aligns with the guidance’s emphasis on ongoing risk management and could enhance the ‘margin of safety’ concept in Practice 5.3. Implementing such a staged process would introduce multiple risk assessment checkpoints throughout development and deployment, potentially improving safety outcomes.
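To illustrate how predefined safety criteria could gate a staged release, the following Python sketch walks an ordered list of stages and stops at the first one whose thresholds are not met. It is a minimal sketch under stated assumptions: the stage names, benchmark names, threshold values, and the convention that higher scores are safer are all hypothetical, and nothing here is drawn from the NIST guidance or the cited frameworks.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    thresholds: dict  # hypothetical benchmark name -> minimum passing score


def furthest_permitted_stage(stages, results):
    """Return the name of the furthest stage whose criteria, and all earlier ones, are met.

    Returns None if even the first stage's criteria are not satisfied.
    A benchmark missing from `results` is treated as a failure.
    """
    permitted = None
    for stage in stages:
        passed = all(
            results.get(benchmark, 0.0) >= minimum
            for benchmark, minimum in stage.thresholds.items()
        )
        if not passed:
            break  # later stages cannot be reached by skipping a failed gate
        permitted = stage.name
    return permitted


# Illustrative stages and scores only.
stages = [
    Stage("internal_testing", {"refusal_rate": 0.90}),
    Stage("limited_api", {"refusal_rate": 0.95, "red_team_pass": 0.80}),
    Stage("general_release", {"refusal_rate": 0.99, "red_team_pass": 0.95}),
]
results = {"refusal_rate": 0.96, "red_team_pass": 0.85}
permitted = furthest_permitted_stage(stages, results)
```

With these example scores the model clears the first two gates but not the third, so access would stop at the limited API stage until further evaluation results are available.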

Conclusion

NIST’s guidance on “Managing Misuse Risk for Dual-Use Foundation Models” represents a significant step forward in establishing robust practices for mitigating catastrophic risks associated with advanced AI systems. The document’s emphasis on comprehensive risk assessment, transparent decision-making, and proactive safeguards demonstrates a commendable commitment to responsible AI development. However, to more robustly contribute to risk mitigation, the guidance must evolve to address key challenges, including a stronger approach to measuring misuse risk, consistent language on halting development, and more detailed model development safeguards.

As the science of AI risk assessment advances, this guidance should be regularly updated to address emerging risks and incorporate new best practices. While voluntary guidance is crucial, it is important to recognize that it cannot replace the need for robust policy and regulation. A combination of industry best practices, government oversight, and international cooperation will be necessary to ensure the responsible development of high-risk AI systems.

We appreciate the opportunity to provide input on this important document. FAS stands ready to continue assisting NIST in refining and implementing this guidance, as well as in developing further resources for responsible AI development. We believe that close collaboration between government agencies, industry leaders, and civil society organizations is key to realizing the benefits of AI while effectively mitigating its most serious risks.

Recent Advances in Artificial Intelligence and the Department of Energy’s Role in Ensuring U.S. Competitiveness and Security in Emerging Technologies

Statement For The Record

Chairman Manchin, Ranking Member Barrasso, and members of the Senate Energy and Natural Resources Committee: I appreciate the opportunity to submit this statement in support of the Department of Energy’s vision for shaping our strategic investments in AI.

The Federation of American Scientists (FAS) is a catalytic, non-partisan, and nonprofit organization committed to using science and technology to benefit humanity by delivering on the promise of equitable and impactful policy. FAS believes that society benefits from a federal government that harnesses science, technology, and innovation to meet ambitious policy goals and deliver impact to the public.

I am the Associate Director for Emerging Technologies and National Security at FAS, where I lead our work on emerging technology policy through the lens of our national security innovation base and focus on the strategic competition between the United States and the Chinese Communist Party. I wish to commend your work in bringing the Committee together to discuss the Department of Energy (DOE)’s role in ensuring U.S. competitiveness and security in emerging technologies. This hearing could not have come at a more opportune time.

In March, the Chinese Communist Party (CCP) held its yearly “two sessions” meeting—referring to the coming together of China’s principal political bodies, the National People’s Congress (NPC) and the National Committee of the Chinese People’s Political Consultative Conference (CPPCC)—during which they not only confirmed Xi Jinping’s third term as president but also introduced a set of new policies and government appointments. During this meeting, Xi emphasized the importance of self-reliance in science and technology as a strategic goal to combat Western influence. Meanwhile, the Central Committee revealed plans to restructure the Chinese government to better position China’s national innovation system for driving advancements in both commercial and dual-purpose military-civilian technologies. This latest initiative underscores two decades of unwavering CCP commitment toward indigenous innovation, calibrated specifically to outflank its Western competitors like the United States. And it’s getting results: a recent analysis by the Australian Strategic Policy Institute found that China now leads in 37 out of 44 critical technology areas globally, while Chinese production of high-value patents in the global marketplace has increased by 400% over the past decade.

The Committee’s hearing is exploring a question that is of vital national interest. Two proposals could change this trajectory for the better: creating an Office of Critical and Emerging Technology within the DOE, and establishing the Frontiers in Artificial Intelligence for Science, Security and Technology (FASST) initiative.

First, consider the creation of an Office of Critical and Emerging Technology within the DOE. This office would enable a robust assessment of U.S. technological competitiveness and prepare us for emerging technology surprises that could threaten national security. It would refine our strategic direction and facilitate rapid threat-response coordination through interagency collaboration with entities such as DoD, DNI, and NSF, while advancing proactive countermeasure strategies.

The Office should serve as a hub for innovative practices across all 17 National Labs and 34 user facilities that the DOE stewards. The DOE labs and user facilities have expertise and capabilities that are important to national and international science policy challenges. The Office should promote greater participation from our labs to better inform these discussions, fostering a diversity of perspectives in national science policy discourse and international forums. This is critical given the growing competition from nations including China and Russia in domains like AI, quantum computing, and biotechnology.

Second, the FASST initiative (Frontiers in Artificial Intelligence for Science, Security, and Technology) is another imperative. AI’s transformative potential is undeniable, but the technology demands substantial improvement in fundamental aspects such as explainability, trustworthiness, and reliability, especially for mission-critical applications and privacy-sensitive issues.

The DOE, with its high-performance computing prowess, is uniquely positioned to deliver secure and dependable AI solutions for the challenging problems of the century. By leveraging DOE’s world-leading exascale computing capabilities while working synergistically with key stakeholders from academia, industry, and interagency groups, we can unlock groundbreaking AI innovations.

Efforts must be made to accelerate integrated math and science R&D, particularly foundational AI research to develop secure, trustworthy techniques. Rigorous verification and validation processes, guided by scientific validity, can vet new technologies for their societal implications before widespread deployment.

Moreover, expanding on foundational research in physics-informed AI could lead to better integration of AI models with our understanding of real-world phenomena. This involves cooperative research among diverse specialties, an endeavor DOE labs and associated universities are equipped for.

The proposed multi-billion-dollar annual program involving DOE Office of Science, National Nuclear Security Administration, and applied energy programs aims to leverage unique leadership capabilities in computing to create transformative AI hubs focused on solving grand challenge problems, innovate world-class AI technologies, and harness cutting-edge testbeds for developing energy-efficient AI hardware platforms in concert with US industry.

Adding to the testimony, I would like to emphasize the pivotal role the FASST initiative will play in the development of unique open and secure foundation models for discovery and national security. The objective is to harness unique and highly-curated datasets to foster advancements and ensure that the United States remains at the helm of science and technology.

The creation of uniquely crafted models, possible only through supercomputing, will offer unprecedented insights into complex processes, such as the molecular dynamics crucial for additive manufacturing and the dynamics of the power grid, leading to a more resilient energy infrastructure. Moreover, it’s crucial for the DOE to develop classified models to manage threats to our national security, from maintaining space situational awareness to advancing biodefense, nuclear deterrence, and nonproliferation efforts. However, I would also urge caution: a frontier model trained on classified data could give our adversaries a single point of attack for extracting that data if they were to gain access to the model.

We are observing an unprecedented deployment of large language models and other advanced AI models, such as AlphaFold 2 and AlphaGo, across the country. AI tools and foundational models developed by the DOE could be used to test and validate these systems. This capability is imperative to ensuring that deployed AI models meet safety and ethical standards that align with our societal values. Furthermore, it will allow DOE to assess risks posed by other AI models that are outside of U.S. regulatory jurisdictions.

In terms of tool and software development, FASST could develop common platforms for safe, trustworthy AI suitable for high-stakes usage scenarios. This would involve crafting tools and methodologies that enhance the trustworthiness and reliability of AI systems while preserving privacy. It would also involve an acute focus on cybersecurity, establishing classified platforms capable of evaluating potential adversarial AI systems.

The harnessing of both classified and unclassified scientific datasets will be instrumental in this endeavor. By transforming DOE’s leading-edge facilities into a nationwide integrated research infrastructure, we will cultivate a common platform for training and evaluation, thereby deriving valuable findings from the world’s largest volumes of scientific data.

Furthermore, FASST will be instrumental in bolstering state-of-the-art production capabilities for our nuclear stockpile by advancing foundation models to rapidly validate AI technologies addressing emerging nuclear security missions. In addition, FASST’s aim to develop new foundation models for unique types of data, such as seismic and electromagnetic data, is worthy of support, as these are areas where current capabilities are lacking.

Through these concerted efforts, we aim to combine the strides in AI innovation with critical missions in science, security, and technology—encompassing scientific discovery, energy sustainability, and national security. We will continue to boldly ride the tidal wave of AI evolution while ensuring that we stay ahead of possible detriments that could compromise our nation’s security and leadership in technology.

Ultimately, the transformation of DOE facilities into a nationwide integrated research infrastructure can stimulate the deployment of advanced AI research across sectors, enhance resource utility, drive unprecedented growth potential, and reinforce U.S. techno-economic leadership.

In conclusion, championing these proposed provisions underscores the urgent need for research, development, and deployment to ensure our ongoing global competitiveness in critical and emerging technology fields. Proactive investments today promise substantial strategic dividends for our nation’s future by maintaining its vital role in technological innovation while robustly addressing the risks tied to these breakthroughs. At the same time, we must proceed with caution, as our adversaries try to gain access to our classified information every hour of every day. Creating frontier models with classified information could provide significant benefits to our national security apparatus, but it could also give our adversaries an easier path to our secrets. We must therefore do it in a way that ensures our systems are safe, secure, and reliable.

In the end, this is not just about maintaining a competitive edge. It is about national security and establishing ethical guidelines for technology use, about mission-critical deployments where failure is unimaginable, and about enhancing our global standing through technological supremacy.

We believe this strategic investment in critical and emerging technologies will empower our nation to confront 21st-century challenges with solutions that are timely, scientifically rigorous, and security-enhancing. We express our unwavering support for these provisions and encourage their decisive endorsement.

Thank you for considering our views on these pressing topics.

If you have any questions, please reach out to me at dkaushik@fas.org.

Divyansh Kaushik

Associate Director for Emerging Technologies and National Security

Federation of American Scientists

Strengthening the Integrity of Government Payments Using Artificial Intelligence

Summary

Tens of billions of taxpayer dollars are lost every year due to improper payments by the federal government. These improper payments arise from agency and claimant errors as well as outright fraud. Data analytics can help identify errors and fraud, but often identifies improper payments only after they have already been issued.

Artificial intelligence (AI) in general—and machine learning (ML) in particular (AI/ML)—could substantially improve the accuracy of federal payment systems. The next administration should launch an initiative to integrate AI/ML into federal agencies’ payment processes. As part of this initiative, the federal government should work extensively with non-federal entities—including commercial firms, nonprofits, and academic institutions—to address major enablers and barriers pertaining to applications of AI/ML in federal payment systems. These include the incidence of false positives and negatives, perceived and actual fairness and bias issues, privacy and security concerns, and the use of ML for predicting the likelihood of future errors and fraud.
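
To make the false-positive and false-negative concern concrete, the following is a minimal sketch in Python of how the two error rates might be measured for a model that flags payments for review. Everything in it is an illustrative assumption: the function name `error_rates`, the record layout, and the sample data are hypothetical and are not drawn from any federal payment system.

```python
from typing import Iterable, Tuple


def error_rates(records: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    Each record is a pair (actually_improper, flagged_by_model).
    This layout is a hypothetical simplification for illustration.
    """
    tp = fp = fn = tn = 0
    for actually_improper, flagged in records:
        if flagged and actually_improper:
            tp += 1  # improper payment correctly flagged
        elif flagged and not actually_improper:
            fp += 1  # proper payment wrongly flagged (false positive)
        elif not flagged and actually_improper:
            fn += 1  # improper payment missed (false negative)
        else:
            tn += 1  # proper payment correctly passed
    # Guard against empty classes to avoid division by zero.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Made-up sample: two improper payments and four proper ones.
SAMPLE = [
    (True, True),    # improper, flagged
    (True, False),   # improper, missed
    (False, False),  # proper, passed
    (False, False),  # proper, passed
    (False, True),   # proper, wrongly flagged
    (False, False),  # proper, passed
]

if __name__ == "__main__":
    fpr, fnr = error_rates(SAMPLE)
    print(f"false positive rate: {fpr:.2f}")
    print(f"false negative rate: {fnr:.2f}")
```

In this made-up sample, one of four proper payments is wrongly flagged and one of two improper payments is missed. A false positive can delay or deny a legitimate payment, while a false negative lets an improper payment go out, so an agency would likely need to weigh the two rates against each other rather than minimize either one alone.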