Below, we analyze AI R&D grants from the National Science Foundation’s Computer and Information Science and Engineering (NSF CISE) directorate, estimating the share that supports “trustworthy AI” research. NSF has not published an overview of its funding for such research within AI. By reviewing a random sample of grants awarded in fiscal years 2018-2022, we estimate that roughly 10-15% of annual AI funding supports trustworthy AI research directions, including interpretability, robustness, privacy preservation, and fairness, despite an increased focus on trustworthy AI in NSF’s strategic plan and in public statements by key NSF and White House officials. Robustness receives the largest share (~6% annually), while interpretability and fairness each receive ~2%. Funding for privacy-preserving machine learning has risen significantly, from under 1% to over 6%. We suggest that NSF increase funding for trustworthy AI, including through specific programs and solicitations addressing critical AI trustworthiness issues. We also recommend that NSF consider trustworthiness in all AI grant application assessments and prioritize projects that enhance the safety of foundation models.
Background on Federal AI R&D
Federal R&D funding has been critical to AI research, especially a decade ago, when machine learning (ML) tools had less potential for wide use and received limited private investment. Much of the early AI development occurred in academic labs that were largely federally funded, laying the groundwork for modern ML and attracting large-scale private investment. With private-sector investment now outstripping public investment and driving notable AI advances, federal funding agencies are reevaluating their role in the field. The key question is how public investment can complement private finance to advance AI research that benefits American wellbeing.
The Growing Importance of Trustworthy AI R&D
A growing priority within national AI strategy is the advancement of “trustworthy AI”. Per the National Institute of Standards and Technology (NIST), trustworthy AI refers to AI systems that are safe, reliable, interpretable, and robust, that respect privacy, and that have harmful biases mitigated. Though terms such as “trustworthy AI”, “safe AI”, “responsible AI”, and “beneficial AI” are not precisely defined, they are an important part of the government’s characterization of high-level AI R&D strategy. In this report, we aim to make these concepts more concrete, focusing on specific research directions aimed at bolstering these desirable attributes in ML models. We start by discussing a growing emphasis on such goals in government strategy documents and certain program solicitations.
This increased focus is reflected in many government strategy documents from recent years. Both the 2016 National AI R&D Strategic Plan and its 2019 update from the National Science and Technology Council identified trustworthiness in AI as a crucial objective. The recent 2023 revision reiterated this even more emphatically, stressing the confidence and reliability of AI systems as especially significant objectives. The plan also underlined how the burgeoning number of AI models has made efforts to enhance AI safety more urgent. Public feedback on previous versions of the plan highlights an expanded priority across academia, industry, and society at large for AI models that are safe, transparent, and equitable, and that do not violate privacy norms. NSF’s FY2024 budget proposal articulated a primary intention of advancing “the frontiers of trustworthy AI”, a departure from earlier years’ emphasis on seeding future advances across a broad range of human pursuits.
Concrete manifestations of this increasing emphasis on trustworthy AI can be seen not only in high-level strategy, but also in specific programs designed to advance trustworthiness in AI models. One of the seven new NSF AI institutes established recently focuses exclusively on “trustworthy AI”. Other programs, like NSF’s Fairness in Artificial Intelligence and Safe Learning-Enabled Systems programs, focus chiefly on cultivating dimensions of trustworthy AI research.
Despite their value, these individual programs focused on AI trustworthiness represent only a small fraction of NSF’s total AI R&D funding: around $20 million per year, against nearly $800 million per year for AI R&D overall. It remains unclear how much the mounting concern about trustworthy and responsible AI is shaping NSF’s broader funding commitments. In this paper, we provide an initial investigation of this question by estimating the proportion of grants over the past five fiscal years (FY 2018-2022) from NSF’s CISE directorate (the primary funder of AI R&D within NSF) that support a few key research directions within trustworthy AI: interpretability, robustness, fairness, and privacy preservation.
These approximations should be treated cautiously; they are neither exact nor conclusive answers to this question. Our methodology relies heavily on individual judgment calls in categorizing ambiguous grants within a sample of the overall pool. Our goal is to offer an initial view of federal funding trends for trustworthy AI research.
We used NSF’s online database of awarded grants from the CISE directorate for our research. First, we identified a broad set of AI R&D-focused grants (“AI grants”) funded by NSF’s CISE directorate across fiscal years 2018-2022. We then drew a random sample of these grants and manually classified them according to predetermined research directions relevant to trustworthy AI. An overview of this process is given below, with details on each step of our methodology provided in the Appendix.
- Search: Using NSF’s online award search feature, we extracted a near-comprehensive collection of abstracts of grants approved by NSF’s CISE directorate during fiscal years 2018-2022. Since the search function relies on keywords, we prioritized high recall over high precision, yielding an over-inclusive result set of close to 1000 grants annually. We believe this initial set encompasses nearly all AI grants from NSF’s CISE directorate, while also incorporating numerous non-AI-centric R&D awards.
- Sample: For each fiscal year, we drew a random subset of 100 abstracts (approximately 10% of the abstracts extracted). This sample size strikes a balance between manageability for manual categorization and sufficient numbers for reasonable funding estimates.
- Sort: Based on prevailing definitions of trustworthy AI, we defined four clusters of research directions: i) interpretability/explainability, ii) robustness/safety, iii) fairness, iv) privacy preservation. To provide useful contrasts with trustworthy AI funding, we designated two additional categories: v) capabilities and vi) applications of AI. Here, “capabilities” covers work pushing forward model performance, and “application of AI” refers to work leveraging existing AI techniques to make progress in other domains. Non-AI grants were sorted out of our sample and marked as “other” at this stage. We manually classified each sampled grant into one or more of these research directions based on its primary focus, noting secondary or tertiary objectives where applicable; additional specifics on this sorting process are given in the Appendix.
Based on our sorting process, we estimate the proportion of AI grant funds from NSF’s CISE directorate which are primarily directed at our trustworthy AI research directions.
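This estimation step can be sketched as follows. All category labels and award amounts here are hypothetical, for illustration only; the actual sample and sorting are described in the Appendix:

```python
from collections import defaultdict

# Hypothetical sampled grants: (primary research direction, award amount in dollars).
sample = [
    ("robustness", 500_000),
    ("capabilities", 1_200_000),
    ("application", 800_000),
    ("privacy", 300_000),
    ("capabilities", 700_000),
    ("other", 400_000),  # non-AI grant, excluded from the denominator
]

# Total AI funds exclude grants sorted as "other" (non-AI).
ai_total = sum(amount for direction, amount in sample if direction != "other")

# Sum award amounts by primary research direction.
by_direction = defaultdict(int)
for direction, amount in sample:
    if direction != "other":
        by_direction[direction] += amount

# Estimated share of AI funds primarily directed at each research direction.
shares = {direction: amount / ai_total for direction, amount in by_direction.items()}
```

Because shares are computed within the sample, the estimate is of the proportion of AI funding, not its absolute dollar amount.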
As depicted in Figure 2, the collective proportion of CISE funds allocated to trustworthy AI research directions usually varies from approximately 10% to around 15% of the total AI funds per annum. However, there are no noticeable positive or negative trends in this overall metric, indicating that over the five-year period examined, there were no dramatic shifts in the funding proportion assigned to trustworthy AI projects.
Considering secondary and tertiary research directions
As previously noted, several grants under consideration appeared to have secondary or tertiary focuses or seemed to strive for research goals which bridge different research directions. We estimate that over the five-year evaluation period, roughly 18% of grant funds were directed to projects having at least a partial focus on trustworthy AI.
Specific Research Directions
Presently, ML systems tend to fail unpredictably when confronted with situations considerably different from their training scenarios (non-i.i.d. settings). These failures can cause serious harm, especially in high-stakes environments. To reduce such risks, robustness- and safety-related research aims to enhance system reliability in new domains and mitigate catastrophic failure in untrained situations.1 This category also encompasses projects that identify potential risks and failure modes to inform further safety improvements.
Over the past five years, our analysis shows that robustness is typically the most-funded trustworthy AI direction, representing about 6% of the total funds CISE allocates to AI research. We identified no definite trend in robustness funding over this period.
Explaining why a machine learning model outputs certain predictions for a given input is still an unsolved problem.2 Research on interpretability or explainability aspires to devise methods for better understanding the decision-making processes of machine learning models and designing more easily interpretable decision systems.
Over the investigated years, funding for interpretability and explainability does not show substantial growth, accounting on average for approximately 2% of all AI funds.
ML systems often reflect and exacerbate biases present in their training data. Research on fairness or non-discrimination works toward systems that avoid such biases. This area frequently involves exploring ways to reduce dataset biases, developing bias-assessment metrics for current models, and devising other bias-reduction strategies for ML models.3
Funding in this area also generally accounts for around 2% of annual AI funds. Our data did not reveal any discernible trend in fairness/non-discrimination-oriented funding over the examined period.
Training AI systems typically requires large volumes of data that can include personal information, making privacy preservation crucial. In response, privacy-preserving machine learning research aims to develop methods that safeguard private information.4
Over the studied years, funding for privacy-preserving machine learning exhibits significant growth, from under 1% in 2018 (the smallest share among our examined research directions) to over 6% in 2022 (the largest among the trustworthy AI research directions we examined). The increase begins around fiscal year 2020; its cause remains unclear.
NSF should continue to carefully consider the role that its funding can play in an overall AI R&D portfolio, taking into account both private and public investment. Trustworthy AI research presents a strong opportunity for public investment. Many of the lines of research within trustworthy AI may be under-incentivized within industry investments, and can be usefully pursued by academics. Concretely, NSF could:
- Build on its existing work by introducing more focused programs and solicitations for specific problems in trustworthy AI, and scaling these programs to be a significant fraction of its overall AI budget.
- Include the consideration of trustworthy and responsible AI as a component of the “broader impacts” for NSF AI grants. NSF could also consider introducing a separate statement for every application for funding for an AI project, which explicitly asks researchers to identify how their project contributes to trustworthy AI. Reviewers could be instructed to favor work which offers potential benefits on some of these core trustworthy AI research directions.
- Publish a Dear Colleague Letter (DCL) inviting proposals and funding requests for specific trustworthy AI projects, and/or a DCL seeking public input on potential new research directions in trustworthy AI.
- Encourage or require researchers to follow the NIST AI Risk Management Framework (AI RMF) when conducting their research.
- In all of the above, NSF should consider a specific focus on supporting the development of techniques and insights which will be useful in making large, advanced foundation models, such as GPT-4, more trustworthy and reliable. Such systems are advancing and proliferating, and government funding could play an important role in helping to develop techniques which proactively guard against risks of such systems.
For this investigation, we aim to estimate the proportion of AI grant funding from NSF’s CISE directorate which supports research that is relevant to trustworthy AI. To do this, we rely on publicly-provided data of awarded grants from NSF’s CISE directorate, accessed via NSF’s online award search feature. We first aim to identify, for each of the examined fiscal years, a set of AI-focused grants (“AI grants”) from NSF’s CISE directorate. From this set, we draw a random sample of grants, which we manually sort into our selected trustworthy AI research directions. We go into more detail on each of these steps below.
How did we choose this question?
We touch on some of the motivation for this question in the introduction above. We investigate NSF’s CISE directorate because it is the primary directorate within NSF for AI research, and because focusing on one directorate (rather than some broader focus, like NSF as a whole) allows for a more focused investigation. Future work could examine other directorates within NSF or other R&D agencies for which grant awards are publicly available.
We focus on estimating trustworthy AI funding as a proportion of total AI funding, with our goal being to analyze how trustworthy AI is prioritized relative to other AI work, and because this information could be more action-guiding for funders like NSF who are choosing which research directions within AI to prioritize.
Search (identifying a list of AI grants from NSF’s CISE Directorate)
To identify a set of AI grants from NSF’s CISE directorate, we used the advanced award search feature on NSF’s website. We conducted the following search:
- For the NSF organization window, we selected “CSE – Direct for Computer & Info Science”
- For “Keyword”, we entered the following list of terms:
- AI, “computer vision”, “Artificial Intelligence”, “Machine Learning”, ML, “Natural language processing”, NLP, “Reinforcement learning”, RL
- We included both active and expired awards.
- We set the range for each search to capture the fiscal years of interest (e.g. 10/01/2017 to 09/30/2018 for FY18, 10/01/2018 to 9/30/2019 for FY19, and so on).
This search yielded a set of ~1000 grants for each fiscal year. This set was over-inclusive, containing many grants not focused on AI, because we aimed for high recall rather than high precision when choosing our keywords; our goal was a set that would include all relevant AI grants made by NSF’s CISE directorate. We sort out false positives, i.e. grants not focused on AI, in the subsequent “sorting” phase.
We assigned a random number to each grant returned by our initial search, then sorted the grants from smallest to largest. For each year, we copied the 100 grants with the smallest randomly assigned numbers into a new spreadsheet, which we used for the subsequent “sorting” step.
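The spreadsheet procedure just described (assign each grant a random number, sort, keep the 100 smallest per year) amounts to simple random sampling without replacement. A minimal sketch, with hypothetical award IDs:

```python
import random

def draw_sample(grants, k=100, seed=2018):
    """Assign each grant a random number and keep the k with the smallest
    values, mirroring the manual spreadsheet procedure."""
    rng = random.Random(seed)  # fixed seed only so this sketch is reproducible
    keyed = sorted((rng.random(), grant) for grant in grants)
    return [grant for _, grant in keyed[:k]]

# Hypothetical: ~1000 award IDs returned by the search for one fiscal year.
grants_fy18 = [f"award-{i:04d}" for i in range(1000)]
sample_fy18 = draw_sample(grants_fy18)  # 100 grants, each equally likely
```

This is functionally equivalent to `random.Random(seed).sample(grants, k)`; the explicit version simply mirrors the steps taken in the spreadsheet.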
We now had a random sample of 500 grants (100 for each FY) from the larger set of ~5000 grants identified in the search phase. We chose this sample size because it was manageable for manual sorting, and we did not anticipate large shifts in relative proportions were we to expand from a ~10% sample to, say, 20% or 30%.
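As a rough check on whether ~100 grants per fiscal year suffice (a back-of-the-envelope calculation we add for illustration; it is not part of the original methodology), the standard error of a proportion p estimated from n equally weighted draws is sqrt(p(1-p)/n):

```python
import math

def proportion_se(p: float, n: int) -> float:
    """Standard error of a sample proportion p estimated from n draws."""
    return math.sqrt(p * (1 - p) / n)

# With a true trustworthy-AI funding share of ~12.5% and n = 100 grants,
# the standard error is about 3.3 percentage points.
se = proportion_se(0.125, 100)
```

This treats grants as equally weighted; dollar-weighted estimates, as used in this report, would carry somewhat more uncertainty, since award sizes vary.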
Identifying Trustworthy AI Research Directions
We aimed to identify a set of broad research directions which would be especially useful for promoting trustworthy properties in AI systems, which could serve as our categories in the subsequent manual sorting phase. We consulted various definitions of trustworthy AI, relying most heavily on the definition provided by NIST: “characteristics of trustworthy AI include valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed.” We also consulted some lists of trustworthy AI research directions, identifying research directions which appeared to us to be of particular importance for trustworthy AI. Based on the above process, we identify the following clusters of trustworthy AI research:
- Interpretability/explainability
- Robustness/safety
- Fairness and non-discrimination
- Privacy-preserving machine learning
It is important to note here that none of these research areas are crisply defined, but we thought that these clusters provided a useful, high-level, way to break trustworthy AI research down into broad categories.
In the subsequent steps, we aim to compare the amount of grant funds that are specifically aimed at promoting the above trustworthy AI research directions with the amount of funds which are directed towards improving AI systems’ capabilities in general, or simply applying AI to other classes of problems.
For our randomly sampled set of 500 grants, we aimed to sort each grant according to its intended research direction.
For each grant, we a) read the title and the abstract of the grant and b) assigned the grant a primary research direction, and if applicable, a secondary and tertiary research direction. Secondary and tertiary research directions were not selected for each grant, but were chosen for some grants which stood out to us as having a few different objectives. We provide examples of some of these “overlapping” grants below.
We sorted grants into the following categories:
- Capabilities.
- This category was used for projects that are primarily aimed at advancing the capabilities of AI systems, by making them more competent at some task, or for research which could be used to push forward the frontier of capabilities for AI systems.
- This category also includes investments in resources that are generally useful for AI research, e.g. computing clusters at universities.
- Example: A project which aims to develop a new ML model which achieves SOTA performance on a computer vision benchmark.
- Application of AI/ML.
- This category was used for projects which apply existing ML/AI techniques to research questions in other domains.
- Example: A grant which uses some machine learning techniques to analyze large sets of data on precipitation, temperature, etc. to test a hypothesis in climatology.
- Interpretability/explainability.
- This category was used for projects which aim to make AI systems more interpretable or explainable, allowing for a better understanding of their decision-making processes. Here, we included both projects which offer methods for better interpreting existing models and projects which offer new training methods that yield more easily interpretable models.
- Example: A project which determines the features of a resume that make it more or less likely to be scored positively by a resume-ranking algorithm.
- Robustness/safety.
- This category was used for projects which aim to make AI systems more robust to distribution shifts and adversarial inputs, and more reliable in unfamiliar circumstances. Here, we include both projects which introduce methods for making existing systems more robust, and those which introduce new techniques that are more robust in general.
- Example: A project which explores new methods for providing systems with training data that causes a computer vision model to learn robustly useful patterns from data, rather than spurious ones.
- Fairness/non-discrimination.
- This category was used for projects which aim to make AI systems less likely to entrench or reflect harmful biases. Here, we focus on work directly aimed at making models themselves less biased. Many project abstracts described efforts to include researchers from underrepresented populations in the research process; we chose not to count these, given our focus on model behavior.
- Example: A project which aims to design techniques for “training out” certain undesirable racial or gender biases.
- Privacy preservation
- This category was used for projects which aim to make AI systems less privacy-invading.
- Example: A project which provides a new algorithm that allows a model to learn desired behavior without using private data.
- Other.
- This category was used for grants which are not focused on AI. As mentioned above, the random sample included many grants which were not AI grants, and these were removed as “other.”
Some caveats and clarifications on our sorting process
This sorting focuses on the apparent intentions and goals of the research as stated in the abstracts and titles, as these are the aspects of each grant the NSF award search feature makes readily viewable. Our process may therefore miss research objectives which are outlined in the full grant application (and not within the abstract and title).
A focus on specific research directions
We chose to focus on specific research agendas within trustworthy and responsible AI, rather than sorting grants into a binary of “trustworthy” or “not trustworthy”, in order to bring greater clarity to our sorting process. We still make judgment calls about which research agendas individual grants promote, but we hope this approach allows for greater agreement.
As mentioned above, we also assigned secondary and tertiary research directions to some of these grants. You can view the grants in the sample and how we sorted each here. Below, we offer some examples of the kinds of grants which we would sort into these categories.
Examples of Grants with Multiple Research Directions
- Primary: Capabilities, Secondary: Application of ML.
- A project which aims to introduce a novel ML approach that is useful for making progress on a research problem in another domain would be categorized as having a primary purpose of Capabilities and a Secondary purpose of Application of ML.
- Primary: Application of ML, Secondary: Capabilities
- This is similar to the above, except that the “application” seems more central to the research objective than the novel capabilities do. Of course, this weighing of which research objectives were most and least central is subjective and ultimately our decisions on many were judgment calls.
- Primary: Capabilities, Secondary: Interpretability
- A project which introduces a novel method that achieves better performance on some benchmark while also being more interpretable.
- Primary: Interpretability, Secondary: Robustness
- A project which aims to introduce methods for making AI systems both more interpretable and more robust.
To summarize: in the sorting phase, we read the title and abstract of each grant in our random sample, and assigned these grants to a research direction. Many grants received only a “primary” research direction, though some received secondary and tertiary research directions as well. This sorting was based on our understanding of the main goals of the project, based on the description provided by the project title and abstract.
The Pentagon has turned innovation into a buzzword, and everyone agrees on the need for faster innovation. It seems a new innovation office is created every week. Yet when it comes to AI, the DoD is still moving too slowly, hampered by a slow procurement process. How can it make innovation core to the organization and leverage the latest technological developments?
We have to first understand what type of innovation is needed. As Harvard Business School professor Clayton Christensen wrote, there are two types of innovation: sustaining and disruptive. Sustaining innovation makes existing products and services better. It’s associated with incremental improvements, like adding new features to a smartphone or boosting the performance of the engine on a car, in pursuit of better performance and higher profits.
Disruptive innovation occurs when a firm with fewer resources challenges one of the bigger incumbents, typically either with a lower-cost business model or by targeting a new customer segment. Disruptive firms can start with fewer resources because they have less overhead and fewer fixed costs, and they often leverage new technologies.
Initially, a disruptor goes unnoticed by an incumbent, who is focused on capturing more profitable customers through incremental improvements. Over time, though, the disruptor grows enough to capture large market share, threatening to replace the incumbent altogether.
Intel Illustrates Both Types of Innovation
Intel serves as an illustrative example of both types of innovation. It was the first company to manufacture DRAM memory chips, creating a whole new market. However, as it focused on sustaining innovation, it was disrupted by low-cost Japanese firms that were able to offer the same DRAM memory chips at a lower cost. Intel then pivoted to focus on microprocessors, disrupting the personal computer industry. However, more recently, Intel is at risk of being disrupted again, this time by lower-power microprocessors, like ARM, and application-specific processors, like Nvidia GPUs.
The DoD, like the large incumbent it is, has become good at sustaining innovation. Its acquisitions process first outlines the capabilities it needs, then sets budgets, and finally purchases what external partners provide. Each part of this – the culture, the procedures, the roles, the rules – has been optimized over time for sustaining innovation. This lengthy, three-part process has allowed the Pentagon to invest in steadily improving hardware, like submarines and airplanes, and the defense industrial base has followed suit, consolidating to just five major defense contractors that can provide the desired sustaining innovation.
The problem is that we are now in an era of disruptive innovation, and a focus on sustaining innovation doesn’t work for disruptive innovation. As a result of decreasing defense budgets in the 1990s and a parallel increase in private-sector funding, companies now lead the way on innovation. With the private sector advancing emerging technologies like drones, artificial intelligence, and quantum computing every month, a years-long process to outline capabilities and define budgets won’t work: by the time the requirements are defined and shared, the technology will have moved on, rendering them obsolete. To illustrate the speed of change, consider that the National Security Commission on Artificial Intelligence’s lengthy 2021 report on how the U.S. can win in the AI era made no mention of generative AI or large language models, which have seen revolutionary advances in just the past few years. Innovation is happening faster than our ability to write reports or define capabilities.
The Small, Daring, and Nimble Prevail
So how does an organization respond to the threat of disruptive innovation? It must create an entirely new business unit to respond, with new people, processes, and culture. The existing organization has been optimized to the current threat in every way, so in many ways it has to start over while still leveraging the resources and knowledge it has accumulated.
Ford learned this lesson the hard way. After trying to intermix production of internal combustion cars and electric vehicles for years, Ford recently carved out the EV group into a separate business unit. The justification? The “two businesses required different skills and mind-sets that would clash and hinder each area if they remained parts of one organization”, reported the New York Times after speaking with Jim Farley, the CEO of Ford.
When the personal computer was first introduced by Apple, IBM took it seriously and recognized the threat to its mainframe business. Due to bureaucratic and internal controls, however, its product development process took four or five years. The industry was moving too quickly for that. To respond, the CEO created a secretive, independent team of just 40 people. The result? The IBM personal computer was ready to ship just one year later.
One of the most famous examples of creating a new business unit comes from the defense space: Skunkworks. Facing the threat of German aircraft in World War II, the Army Air Forces asked Lockheed to design a plane that could fly at 600 mph – 200 mph faster than Lockheed’s existing planes – and wanted a working prototype in just 180 days. With the company already at capacity, a small group of engineers, calling themselves Skunkworks, set up shop in a different building with limited resources – and miraculously hit the goal ahead of schedule. Their speed was attributed to their ability to avoid Lockheed’s bureaucratic processes. Skunkworks would expand over the years and go on to build some of the most famous Air Force planes, including the U-2 and SR-71.
DoD’s Innovation Approach to Date
The DoD appears to be re-learning these lessons today. Its own innovation pipeline is bogged down by bureaucracy and internal controls. Faced with the threat of a Chinese military that is investing heavily in AI and moving toward AI-enabled warfare, the DoD has finally realized that it cannot rely on sustaining innovation to win. It must reorganize itself to respond to the disruptive threat.
It has created a wave of new pathways to accelerate the adoption of emerging technologies. SBIR open topics, the Defense Innovation Unit, SOFWERX, the Office of Strategic Capital, and the National Security Innovation Capital program are all initiatives created in the spirit of Skunkworks or the “new business unit”. Major commands are doing it too, with the emergence of innovation units like Navy Task Force 59 in CENTCOM.
These initiatives are all attempts to respond to the disruption by opening up alternative pathways to fund and acquire technology. SBIR open topics, for example, have been found to be more effective than traditional approaches because they don’t require the DoD to list requirements up front, instead allowing it to quickly follow along with commercial and academic innovation.
Making the DoD More Agile
Some of these initiatives will work, others won’t. The advantage of DoD is that it has the resources and institutional heft to create multiple such “new business units” that try a variety of approaches, provided Congress continues to fund them.
From there, it must learn which approaches work best for accelerating the adoption of emerging technologies, pick a winner, and scale that approach to replace its core acquisitions process. These new pathways must be integrated into the main organization; otherwise they risk remaining fringe programs with limited impact. The best contractors from these new pathways will also have to scale up, disrupting the defense industrial base. It is only with these new operating and business models – along with new funding policies and culture – that the DoD can become proficient at acquiring the latest technologies. Scaling up the new business units is the only way to do so.
The path forward is clear. The hard work to reform the acquisitions process must begin by co-opting the strengths of these new innovation pathways. The good news is that the DoD, through its large and varied research programs, partnerships, and funding, has clear visibility into emerging and future technologies. Now it must figure out how to scale the new innovation programs or risk getting disrupted.
As both the House and Senate gear up to vote on the National Defense Authorization Act (NDAA), FAS is launching this live blog post to track all proposals around artificial intelligence (AI) that have been included in the NDAA. In this rapidly evolving field, these provisions indicate how AI now plays a pivotal role in our defense strategies and national security framework. This tracker will be updated as major developments occur.
Senate NDAA. This table summarizes the provisions related to AI from the version of the Senate NDAA that advanced out of committee on July 11. Links to the section of the bill describing these provisions can be found in the “section” column. Provisions that have been added in the manager’s package are in red font.
House NDAA. This table summarizes the provisions related to AI from the version of the House NDAA that advanced out of committee. Links to the section of the bill describing these provisions can be found in the “section” column.
Funding Comparison. The following tables compare the funding requested in the President’s budget to funds that are authorized in current House and Senate versions of the NDAA. All amounts are in thousands of dollars.
The White House Office of Science and Technology Policy (OSTP) has sought public input for the Biden administration’s National AI Strategy, acknowledging the potential benefits and risks of advanced AI. The Federation of American Scientists (FAS) was happy to recommend specific actions for federal agencies to safeguard Americans’ rights and safety. With U.S. companies creating powerful frontier AI models, the federal government must guide this technology’s growth toward public benefit and risk mitigation.
Recommendation 1: OSTP should work with a suitable agency to develop and implement a pre-deployment risk assessment protocol that applies to any frontier AI model.
Before launching a frontier AI system, developers must ensure safety, trustworthiness, and reliability through pre-deployment risk assessment. This protocol aims to thoroughly analyze potential risks and vulnerabilities in AI models before deployment.
We advocate for increased funding for the National Institute of Standards and Technology (NIST) to enhance its risk measurement capacity and develop robust benchmarks for AI model risk assessment. Building upon NIST’s AI Risk Management Framework (AI RMF) would standardize evaluation metrics while accommodating cases that differ from flagship releases like OpenAI’s GPT-4, such as open-source models, academic research, and fine-tuned models.
We propose that the Federal Trade Commission (FTC), under Section 5 of the FTC Act, implement and enforce this pre-deployment risk assessment strategy. The FTC’s mandate to prevent unfair or deceptive practices in commerce aligns with mitigating potential risks from AI systems.
Recommendation 2: Adherence to the appropriate risk management framework should be compulsory for any AI-related project that receives federal funding.
The U.S. government, as a significant funder of AI through contracts and grants, has both a responsibility and an opportunity: a responsibility to ensure that its AI applications meet a high bar for risk management, and an opportunity to promote a culture of safety in AI development more broadly. Adherence to a risk management framework should be a prerequisite for AI projects seeking federal funds.
Currently, voluntary guidelines such as NIST’s AI RMF exist, but we propose making these compulsory. Agencies should require contractors to document and verify the risk management practices in place for the contract. For agencies that do not have their own guidelines, the NIST AI RMF should be used. The NSF should likewise require documentation of the grantee’s compliance with the NIST AI RMF in grant applications for AI projects. This approach will ensure all federally funded AI initiatives maintain a high bar for risk management.
Recommendation 3: NSF should increase its funding for “trustworthy AI” R&D.
“Trustworthy AI” refers to AI systems that are reliable, safe, transparent, privacy-enhanced, and unbiased. While NSF is a key non-military funder of AI R&D in the U.S., our rough estimates indicate that its investment in fields promoting trustworthiness has remained relatively static, accounting for only 10-15% of all AI grants. Given its $800 million annual AI-related budget, we recommend that NSF direct a larger share of grants towards research in trustworthy AI.
To enable this shift, NSF could stimulate trustworthy AI research through specific solicitations; launch targeted programs in this area; and incorporate a “trustworthy AI” section in funding applications, prompting researchers to outline the trustworthiness of their projects. This would help evaluate AI project impacts and promote proposals with significant potential in trustworthy AI. Lastly, researchers could be encouraged or required to apply the NIST AI RMF during their studies.
Recommendation 4: FedRAMP should be broadened to cover AI applications contracted for by the federal government.
The Federal Risk and Authorization Management Program (FedRAMP) is a government-wide initiative that standardizes security protocols for cloud services. Given the rising utilization of AI services in federal operations, a similar system of security standards should apply to these services, since they are responsible for managing highly sensitive data related to national security and individual privacy.
Expanding FedRAMP’s mandate to include AI services is a logical next step in ensuring the secure integration of advanced technologies into federal operations. Applying a framework like FedRAMP to AI services would involve establishing robust security standards specific to AI, such as secure data handling, model transparency, and robustness against adversarial attacks. The expanded FedRAMP program would streamline AI integration into federal operations and avoid repetitive security assessments.
Recommendation 5: The Department of Homeland Security should establish an AI incidents database.
The Department of Homeland Security (DHS) should create a centralized AI Incidents Database, detailing AI-related breaches, failures, and misuse across industries. Its existing authorization under the Homeland Security Act of 2002 positions DHS well for this role. This database would increase understanding, mitigate risks, and build trust in AI systems’ safety and security.
Voluntary reporting from AI stakeholders should be encouraged while preserving data confidentiality. For effectiveness, anonymized or aggregated data should be shared with AI developers, researchers, and policymakers to better understand AI risks. DHS could draw on existing databases such as those maintained by the Partnership on AI and the Center for Security and Emerging Technology, as well as adapt reporting methods from global initiatives like the Financial Services Information Sharing and Analysis Center.
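To make the proposal concrete, here is a minimal sketch in Python of what a single incident record and its anonymized, shareable view might look like. All field names and categories are hypothetical illustrations, not a DHS specification.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from enum import Enum


class Severity(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass
class AIIncidentRecord:
    """Hypothetical schema for one entry in an AI incidents database."""
    incident_id: str
    reported_on: date
    sector: str            # e.g. "healthcare", "finance"
    incident_type: str     # e.g. "model failure", "misuse", "breach"
    severity: Severity
    summary: str
    anonymized: bool = True            # confidentiality-preserving by default
    tags: list = field(default_factory=list)

    def to_export(self) -> dict:
        """Produce the anonymized view shared with researchers and policymakers."""
        record = asdict(self)
        record["severity"] = self.severity.value
        if self.anonymized:
            record.pop("incident_id")  # strip the identifier before sharing
        return record


# Example: a fictional incident report and its shareable form
example = AIIncidentRecord(
    incident_id="2023-0042",
    reported_on=date(2023, 6, 1),
    sector="healthcare",
    incident_type="model failure",
    severity=Severity.MODERATE,
    summary="Diagnostic model produced systematically biased outputs.",
    tags=["bias", "medical-imaging"],
)
shared = example.to_export()
```

The design choice worth noting is that anonymization happens at export time, so the database can retain full records internally while sharing only de-identified views, consistent with the confidentiality goal above.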
Recommendation 6: OSTP should work with agencies to streamline the process of granting Interested Agency Waivers to AI researchers on J-1 visas.
The ongoing global competition in AI necessitates attracting and retaining a diverse, highly skilled talent pool. The US J-1 Exchange Visitor Program, often used by visiting researchers, requires some participants to return home for two years before applying for permanent residence.
Federal agencies can waive this requirement for certain individuals via an “Interested Government Agency” (IGA) request. Agencies should establish a transparent, predictable process for AI researchers to apply for such waivers. The OSTP should collaborate with agencies to streamline this process. Taking cues from the Department of Defense’s structured application process, including a dedicated webpage, application checklist, and sample sponsor letter, could prove highly beneficial for improving the transition of AI talent to permanent residency in the US.
Review the details of these proposals in our public comment.
The 2019 defense authorization act directed the Secretary of Defense to produce a definition of artificial intelligence (AI) by August 13, 2019 to help guide law and policy. But that was not done.
Therefore “no official U.S. government definition of AI yet exists,” the Congressional Research Service observed in a newly updated report on the subject.
But plenty of other unofficial and sometimes inconsistent definitions do exist. And in any case, CRS noted, “AI research is underway in the fields of intelligence collection and analysis, logistics, cyber operations, information operations, command and control, and in a variety of semiautonomous and autonomous vehicles. Already, AI has been incorporated into military operations in Iraq and Syria.”
“The Central Intelligence Agency alone has around 140 projects in development that leverage AI in some capacity to accomplish tasks such as image recognition and predictive analytics.” CRS surveys the field in Artificial Intelligence and National Security, updated November 21, 2019.
* * *
The 2018 financial audit of the Department of Defense, which was the first such audit ever, cost a stunning $413 million to perform. Its findings were assessed by CRS in another new report. See Department of Defense First Agency-wide Financial Audit (FY2018): Background and Issues for Congress, November 27, 2019.
* * *
The Arctic region is increasingly important as a focus of security, environmental and economic concern. So it is counterintuitive — and likely counterproductive — that the position of U.S. Special Representative for the Arctic has been left vacant since January 2017. In practice it has been effectively eliminated by the Trump Administration. See Changes in the Arctic: Background and Issues for Congress, updated November 27, 2019.
* * *
Other noteworthy new and updated CRS reports include the following (which are also available through the CRS public website at crsreports.congress.gov).
Resolutions to Censure the President: Procedure and History, updated November 20, 2019
Immigration: Recent Apprehension Trends at the U.S. Southwest Border, November 19, 2019
Air Force B-21 Raider Long Range Strike Bomber, updated November 13, 2019
Precision-Guided Munitions: Background and Issues for Congress, November 6, 2019
Intelligence Community Spending: Trends and Issues, updated November 6, 2019
While many countries recognize freedom of speech as a fundamental value, every country also imposes some legal limits on free speech.
A new report from the Law Library of Congress surveys the legal limitations on free expression in thirteen countries: Argentina, Brazil, Canada, China, Israel, Japan, Germany, France, New Zealand, Sweden, the Netherlands, the United Kingdom, and Ukraine.
“In particular, the report focuses on the limits of protection that may apply to the right to interrupt or affect in any other way public speech. The report also addresses the availability of mechanisms to control foreign broadcasters working on behalf of foreign governments,” wrote Ruth Levush in the document summary. See Limits on Freedom of Expression, Law Library of Congress, June 2019.
Some other noteworthy recent reports from the Law Library of Congress include the following.
Artificial intelligence (AI) technologies such as machine learning are already being used by the Department of Defense in operations in Iraq and Syria, and they have many potential uses in intelligence processing, military logistics, cyber defense, as well as autonomous weapon systems.
The range of such applications for defense and intelligence is surveyed in a new report from the Congressional Research Service.
The CRS report also reviews DoD funding for AI, international competition in the field, including Chinese investment in US AI companies, and the foreseeable impacts of AI technologies on the future of combat. See Artificial Intelligence and National Security, April 26, 2018.
“We’re going to have self-driving vehicles in theater for the Army before we’ll have self-driving cars on the streets,” Michael Griffin, the undersecretary of defense for research and engineering told Congress last month (as reported by Bloomberg).
Other new and updated reports from the Congressional Research Service include the following.
Foreign Aid: An Introduction to U.S. Programs and Policy, April 25, 2018
OPIC, USAID, and Proposed Development Finance Reorganization, April 27, 2018
OPEC and Non-OPEC Crude Oil Production Agreement: Compliance Status, CRS Insight, April 26, 2018
What Is the Farm Bill?, updated April 26, 2018
Navy Aegis Ballistic Missile Defense (BMD) Program: Background and Issues for Congress, updated April 27, 2018
African American Members of the United States Congress: 1870-2018, updated April 26, 2018
The field of artificial intelligence is habitually susceptible to exaggerated claims and expectations. But when it comes to new applications in health care, some of those claims may prove to be valid, says a new report from the JASON scientific advisory panel.
“Overall, JASON finds that AI is beginning to play a growing role in transformative changes now underway in both health and health care, in and out of the clinical setting.”
“One can imagine a day where people could, for instance, 1) use their cell phone to check their own cancer or heart disease biomarker levels weekly to understand their own personal baseline and trends, or 2) ask a partner to take a cell-phone-based HIV test before a sexual encounter.”
Already, automated skin cancer detection programs have demonstrated performance comparable to human dermatologists.
The JASON report was requested and sponsored by the U.S. Department of Health and Human Services. See Artificial Intelligence for Health and Health Care, JSR-17-Task-002, December 2017.
Benefits aside, there are new opportunities for deception and scams, the report said.
“There is potential for the proliferation of misinformation that could cause harm or impede the adoption of AI applications for health. Websites, apps, and companies have already emerged that appear questionable based on information available.”
Fundamentally, the JASONs said, the future of AI in health care depends on access to private health data.
“The availability of and access to high quality data is critical in the development and ultimate implementation of AI applications. The existence of some such data has already proven its value in providing opportunities for the development of AI applications in medical imaging.”
“A major initiative is just beginning in the U.S. to collect a massive amount of individual health data, including social behavioral information. This is a ten year, $1.5B National Institutes of Health (NIH) Precision Medicine Initiative (PMI) project called All of Us Research Program. The goal is to develop a 1,000,000 person-plus cohort of individuals across the country willing to share their biology, lifestyle, and environment data for the purpose of research.”
But all such efforts raise knotty questions of data security and personal privacy.
“PMI has recognized from the start of this initiative that no amount of de-identification (anonymization) of the data will guarantee the privacy protection of the participants.”
Lately, the US Government has barred access by non-US researchers to a National Cancer Institute database concerning Medicare recipients, according to a story in The Lancet Oncology. See “International access to major US cancer database halted” by Bryant Furlow, January 18, 2018 (sub. req’d.).