Public Comment on Executive Branch Agency Handling of CAI containing PII
Public comments serve the executive branch by informing more effective, efficient program design and regulation. As part of our commitment to evidence-based, science-backed policy, FAS staff leverage public comment opportunities to embed science, technology, and innovation into policy decision-making.
The Federation of American Scientists (FAS) is a non-partisan, nonprofit organization committed to using science and technology to benefit humanity by delivering on the promise of equitable and impactful policy. FAS believes that society benefits from a federal government that harnesses science, technology, and innovation to meet ambitious policy goals and deliver impactful results to the public.
We are writing in response to your Request for Information on the Executive Branch Agency Handling of Commercially Available Information (CAI) Containing Personally Identifiable Information (PII). Specifically, we will be answering questions 2 and 5 in your request for information:
2. What frameworks, models, or best practices should [the White House Office of Management and Budget] consider as it evaluates agency standards and procedures associated with the handling of CAI containing PII and considers potential guidance to agencies on ways to mitigate privacy risks from agencies’ handling of CAI containing PII?
5. Agencies provide transparency into the handling of PII through various means (e.g., policies and directives, Privacy Act statements and other privacy notices at the point of collection, Privacy Act system of records notices, and privacy impact assessments). What, if any, improvements would enhance the public’s understanding of how agencies handle CAI containing PII?
Background
In the digital landscape, commercially available information (CAI) represents a vast ecosystem of personal data that can be easily obtained, sold, or licensed to various entities. The Executive Order on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence (EO 14110) defines CAI broadly as information about individuals or groups that is available or obtainable commercially, encompassing details like device information and location data.
A 2017 report in the Georgetown Law Technology Review found that 63% of Americans can be uniquely identified using just three basic attributes (gender, birth date, and ZIP code), and subsequent research has estimated that 99.98% of individuals could be re-identified from a dataset containing only 15 demographic attributes. This vulnerability underscores the critical challenges of data privacy in an increasingly interconnected world.
CAI takes on heightened significance in the context of artificial intelligence (AI) deployment, as these systems enable both data collection and the use of advanced inference models to analyze datasets and produce predictions, insights, and assumptions that reveal patterns or relationships not directly evident in the data. Some AI systems can allow the intentional or unintentional reidentification of supposedly anonymized private data. These capabilities raise questions about privacy, consent, and the potential for unprecedented levels of personal information aggregation and analysis, challenging existing data protection frameworks and individual rights.
The United States federal government is one of the largest customers of commercial data brokers. Government entities increasingly use CAI to strengthen public programs, enabling federal agencies to augment decision-making, policy development, and resource allocation and to enrich research and innovation goals with large yet granular datasets. For example, the National Institutes of Health has discussed in its data strategies how to incorporate commercially available data into research projects. Commercially available electronic health records can be essential for understanding social inequalities within the healthcare system, but they include sensitive personal data that must be protected.
However, government agencies face significant public scrutiny over their use of CAI in areas including law enforcement, homeland security, immigration, and tax administration. This scrutiny stems from concerns about privacy violations, algorithmic bias, and the risks of invasive surveillance, profiling, and discriminatory enforcement practices that could disproportionately harm vulnerable populations. For example, federal agencies like Immigration and Customs Enforcement (ICE) and Customs and Border Protection (CBP) have used broker-purchased location data to track individuals without warrants, raising constitutional concerns.
In 2020, the American Civil Liberties Union filed a Freedom of Information Act lawsuit against several Department of Homeland Security (DHS) agencies, arguing that the DHS’s use of cellphone data and data from smartphone apps constitutes unreasonable searches without a warrant and violates the Fourth Amendment. A report by the Electronic Frontier Foundation found that CAI was used for mass surveillance practices, including geofence warrants that query all phones in specific locations, further challenging constitutional protections.
While the Privacy Act of 1974 covers the use of federally collected personal information by agencies, there is no explicit guidance governing federal use of third-party data. The bipartisan Fourth Amendment Is Not For Sale Act (H.R. 4639) would bar certain technology providers, such as remote computing service and electronic communication service providers, from sharing the contents of stored electronic communications with anyone (including government actors) and from sharing customer records with government agencies. The bill passed the House of Representatives in the 118th Congress but has yet to pass the Senate as of December 2024. Without protections in statute, it is imperative that the federal government craft clear guidance on the use of CAI containing PII in AI systems. In this response to the Office of Management and Budget’s (OMB) request for information, FAS outlines three policy ideas that can improve how federal agencies navigate the use of CAI containing PII, including in AI use.
Summary of Recommendations
The federal government is responsible for ensuring the safety and privacy of the processing of personally identifiable information within commercially available information used for the development and deployment of artificial intelligence systems. For this RFI, FAS brings three proposals to increase government capacity in ensuring transparency and risk mitigation in how CAI containing PII is used, including in agency use of AI:
- Enable FedRAMP to Create an Authorization System for Third-Party Data Sources: An authorization framework for CAI containing PII would ensure a standardized approach to data collection, management, and contracting, mitigating risks and ensuring ethical data use.
- Expand Existing Privacy Impact Assessments (PIA) to Incorporate Additional Requirements and Periodic Evaluations: Regular public reports on CAI sources and usage will enable stakeholders to monitor federal data practices effectively.
- Build Government Capacity for the Use of Privacy Enhancing Technologies to Bolster Anonymization Techniques: Existing resources such as the United States Digital Service (USDS) can be harnessed to help agencies adopt these tools.
Recommendation 1. Enable FedRAMP to Create an Authorization System for Third-Party Data Sources
Government agencies utilizing CAI should implement a pre-evaluation process before acquiring large datasets to ensure privacy and security. OMB, along with the other agencies on the governing board of the Federal Risk and Authorization Management Program (FedRAMP), should direct FedRAMP to create an authorization framework for third-party data sources that contract with government agencies, especially data brokers that provide CAI containing PII, to ensure that these vendors comply with privacy and security requirements. FedRAMP is uniquely positioned for this task because of its existing mandate to ensure the safety of cloud service providers used by the federal government and its recent expansion of that mandate to standardize AI technologies. The program could additionally harmonize its new CAI requirements with its forthcoming AI authorization framework.
When designing the content of the CAI authorization, a useful benchmark for evaluation criteria is the Ag Data Transparent (ADT) certification process. Companies applying for this certification must submit their contracts and respond to 11 questions about data collection, usage, and sharing. As in the FedRAMP authorization process, a third-party administrator reviews these materials for consistency, granting the ADT seal only if the company’s practices align with its contracts. Any discrepancies must be corrected, promoting transparency and protecting farmers’ data rights. The ADT is a voluntary certification and therefore does not provide a good model for enforcement. However, it does provide a framework for the kind of documentation that should be required. The CAI authorization should thus include the following information required by the ADT certification process (a schematic sketch of how these fields might be captured appears after the list):
- Data source: The origin or provider of the data, such as a specific individual, organization, database, device, or system, that supplies information for analysis or processing, as well as the technologies, platforms, or applications used to collect data. For example, the authorization framework should identify if an AI system collected, compiled, or aggregated a CAI dataset.
- Data categories: The classification of data based on its format or nature, such as structured (e.g., spreadsheets), unstructured (e.g., text or images), personal (e.g., names, Social Security numbers), or non-personal (e.g., aggregated statistics).
- Data ownership: A description of any agreements in place that define which individual or organization owns the data and what happens when that ownership is transferred.
- Third-party data collection contractors: An explanation of whether or not partners or contractors associated with the vendor have to follow the company’s data governance standards.
- Consent and authorization to sell to third-party contractors: A description of whether data subjects (e.g., individuals using an application) have explicitly agreed that their data can be collected and sold to the government or another entity for different purposes, such as training or deploying an AI system, along with a description of the consent that has been obtained for that use.
- Opt out and deletion: Whether the data can be deleted at the request of a data subject, and whether the data subject can opt out of certain data uses. A description of the existing mechanisms through which individuals can decline or withdraw consent for their data to be collected, processed, or used, ensuring they retain control over their personal information.
- Security safeguards and breach notifications: The measures and protocols implemented to protect data from unauthorized access, breaches, and misuse. These include encryption, access controls, secure storage, vulnerability testing, and compliance with industry security standards.
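To make these documentation requirements concrete, the sketch below shows how an authorization submission might be represented as a structured record. It is illustrative only: the field names are hypothetical, and any actual FedRAMP schema would be defined through the program’s own processes.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CAIAuthorizationRecord:
    """Hypothetical record of the documentation a CAI vendor would submit."""
    data_source: str                        # origin/provider and collection technology
    collected_by_ai_system: bool            # whether an AI system compiled or aggregated the data
    data_categories: List[str]              # e.g., ["structured", "personal", "non-personal"]
    ownership_agreements: str               # who owns the data and what happens on transfer
    contractors_follow_governance: bool     # third-party collectors bound by vendor standards
    consent_to_sell_to_third_parties: bool  # data subjects agreed to sale to government or others
    consent_scope: Optional[str]            # e.g., "AI training", "program analytics"
    supports_opt_out: bool                  # data subjects can decline certain uses
    supports_deletion_requests: bool        # data can be deleted on request
    security_safeguards: List[str]          # e.g., ["encryption at rest", "access controls"]
    breach_notification_policy: str         # how and when breaches are disclosed
```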
Unlike the ADT, a FedRAMP authorization process can be strictly enforced. FedRAMP is mandatory for all cloud service providers working with the executive branch and follows a detailed authorization process with evaluations and third-party auditors. Bringing that assessment rigor to federal agency use of CAI would be valuable and would help provide clarity to commercial vendors.
The authorization framework should also document the following specific protocols for the use of CAI within AI systems:
- Dataset aggregation and data minimization: A detailed explanation of which datasets were aggregated and what efforts were made to minimize data. According to a report by the Information Systems Audit and Control Association (ISACA), singular data points, when combined, can compromise anonymity, especially when run through an AI system with inference capabilities.
- De-identification or anonymization technique: The type of de-identification or anonymization technique used. Providing this information helps agencies assess whether additional measures are necessary, particularly when using AI systems capable of recognizing patterns that could re-identify individuals.
By setting these standards, this authorization could help agencies understand privacy risks and ensure the reliability of CAI data vendors before deploying purchased datasets within AI systems or other information systems, positioning them to create appropriate mitigation strategies.
By encouraging data brokers to follow best practices, this recommendation would allow agencies to focus on authorized datasets that meet privacy and security standards. Public availability of this information could drive market-wide improvements in data governance and elevate trust in responsible data usage. This approach would support ethical data governance in AI projects and create a more transparent, publicly accountable framework for CAI use in government.
Recommendation 2. Expand Privacy Impact Assessments (PIA) to Incorporate Additional Requirements and Periodic Evaluations
Public transparency regarding the origins and details of government-acquired CAI containing PII is critical, especially given the largely unregulated nature of the data broker industry at the federal level. Privacy Impact Assessments (PIAs) are mandated under Section 208 of the 2002 E-Government Act and OMB Memo M-03-22, and can serve as a vital policy tool for ensuring such transparency. Agencies must complete PIAs at the outset of any new electronic information collection process that includes “information in identifiable form for ten or more persons.” Under direction from Executive Order 14110 on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, OMB issued a request for information in April 2024 to explore updating PIA guidance for AI-era privacy concerns, although new guidance has not yet been issued.
To ensure that PIAs can effectively provide transparency into government practices on CAI that contains PII, we recommend that OMB provide updated guidance requiring agencies to regularly review and update their PIAs at least every three years, and also require agencies to report more comprehensive information in PIAs. We provide more details on these recommendations below.
First, OMB should guide agencies to periodically update their PIAs to ensure evolutions in agency data practices are publicly captured, which is increasingly important as data-driven AI systems are adopted by government actors and create novel privacy concerns. Under OMB Memo M-03-22, agencies must initiate or update PIAs when new privacy risks or factors emerge that affect the collection and handling of PII, including when agencies incorporate PII obtained from commercial or public sources into existing information systems. However, a public comment submitted by the Electronic Privacy Information Center (EPIC) pointed out that many agencies fail to publish and update required PIAs in a timely manner, indicating that a stricter schedule is needed to maintain accountability for PIA reporting requirements. As data privacy risks evolve through the advancement of AI systems, increased cybersecurity risks, and new legislation, it is essential that a minimum standard schedule for updating PIAs is created to ensure agencies provide the public with an up-to-date understanding of the potential risks resulting from using CAI that includes PII. For example, guidance under the European Union’s General Data Protection Regulation (Art. 35) recommends that data protection impact assessments be revisited at least every three years.
Second, agency PIAs should report more detailed information on the CAI’s source, vendor information, contract agreements, and licensing arrangements. A frequent critique of existing PIAs is that they contain too little information to inform the public of relevant privacy harms. Such a lack of transparency risks damaging public trust in government. One model for expanded reporting frameworks for CAI containing PII is the May 2024 Policy Framework for CAI, established for the Intelligence Community (IC) by the Office of the Director of National Intelligence (ODNI). This framework requires the IC to document and report “the source of the Sensitive CAI and from whom the Sensitive CAI was accessed or collected” and “any licensing agreements and/or contract restrictions applicable to the Sensitive CAI”. OMB should incorporate these reporting practices into agency PIA requirements and explicitly require agencies to identify the CAI data vendor in order to provide insight into the source and quality of purchased data.
Many of these elements are also present in Recommendation 1’s proposed FedRAMP authorization framework. However, that framework would not cover existing agency projects that use CAI, nor agencies that acquire CAI datasets outside of the FedRAMP authorization. Including this information within the PIA framework also allows for an iterative understanding of privacy risks throughout the lifecycle of a project using CAI.
By obligating agencies to provide more frequent PIA updates and to include additional details on the source, vendor, contract, and licensing arrangements for CAI containing PII, OMB would give the public valuable insight into how government agencies acquire, use, and manage sensitive data. These updates to PIAs would allow civil society groups, journalists, and other external stakeholders to track government data management practices over time, at a critical juncture when federal uptake of AI systems is rapidly increasing.
Recommendation 3. Build Government Capacity for the Use of Privacy Enhancing Technologies to Bolster Anonymization Techniques
Privacy Enhancing Technologies (PETs) are a diverse set of tools that can be used throughout the data lifecycle to ensure privacy by design. They can also be powerful tools for ensuring that PII within CAI is adequately anonymized and secured. OMB should collect information on current agency PET usage, gather best practices, and identify deployment gaps. To address these gaps, OMB should collaborate with agencies like the USDS to establish capacity-building programs, leveraging initiatives like the proposed “Responsible Data Sharing Core” to provide expert consultations and enhance responsible data-sharing practices.
Meta’s Open Loop project identified eight types of PETs that are ripe to be deployed in AI systems, categorizing them into maturity levels, context of deployment, and limitations. One type of PET is differential privacy, a mathematical framework designed to protect individuals’ privacy in datasets by introducing controlled noise to the data. This ensures that the output of data analysis or AI models does not reveal whether a specific individual’s information is included in the dataset. The noise is calibrated to balance privacy with data utility, allowing meaningful insights to be derived without compromising personal information. Differential privacy is particularly useful in AI models that rely on large-scale data for training, as it prevents the inadvertent exposure of PII during the learning process. Within the federal government, the U.S. Census Bureau is using differential privacy to anonymize data while preserving its aggregate utility, ensuring compliance with privacy regulations and reducing re-identification within datasets.
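As a simple illustration of the mechanism described above, the sketch below adds calibrated Laplace noise to a count query, one of the most basic forms of differential privacy. It is a minimal, assumed example; production systems such as the Census Bureau’s disclosure avoidance system use substantially more sophisticated algorithms.

```python
import numpy as np

def dp_count(values, threshold, epsilon, rng=None):
    """Release a differentially private count of records above a threshold.

    Adding or removing one person changes the true count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon yields
    epsilon-differential privacy. Smaller epsilon means more noise and
    stronger privacy, at the cost of accuracy.
    """
    rng = rng or np.random.default_rng()
    true_count = int(np.sum(np.asarray(values) > threshold))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical usage: a noisy count of incomes above $50,000.
incomes = np.random.default_rng(0).normal(45_000, 15_000, size=10_000)
print(dp_count(incomes, threshold=50_000, epsilon=0.5))
```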
Scaling the use of PETs in other agencies has been referenced in several U.S. government strategy documents, such as the National Strategy to Advance Privacy-Preserving Data Sharing and Analytics, which encourages federal agencies to adopt and invest in the development of PETs, and the Executive Order (EO) on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, which calls for federal agencies to identify where they could use PETs. As a continuation of this EO, the National Science Foundation and the Department of Energy established a Research Coordination Network on PETs that will “address the barriers to widespread adoption of PETs, including regulatory considerations.”
Although the ongoing research and development of PETs is vital to this growing field, there is an increasing need to ensure these technologies are implemented across the federal government. To begin, OMB should collect detailed information on how agencies currently use PETs, especially in projects that use CAI containing PII. This effort should include gathering best practices from agencies with successful PET implementations, such as the U.S. Census Bureau’s use of differential privacy described above. Additionally, OMB should identify gaps in PET deployment, assessing barriers such as technical capacity, funding, and awareness of relevant PETs. To address these gaps, OMB should collaborate with other federal agencies to design and implement capacity-building programs, equipping personnel with the knowledge and tools needed to integrate PETs effectively. For example, a forthcoming FAS Day One Project publication, “Increasing Responsible Data Sharing Capacity throughout Government,” seeks to harness existing government capabilities to build capacity in deploying PETs. This proposal aims to enhance responsible data sharing in government by creating a capacity-building initiative called the “Responsible Data Sharing Core” (RDSC). Managed by the USDS, the RDSC would deploy fellows and industry experts to agencies to consult on data use and sharing decisions and to advise on which PETs are appropriate for different contexts.
Conclusion
The federal government’s increasing reliance on CAI containing PII presents significant privacy challenges. The current landscape of data procurement and AI deployment by agencies like ICE, CBP, and others raises critical concerns about potential Fourth Amendment violations, discriminatory profiling, and lack of transparency.
The ideas proposed in this memo—implementing FedRAMP authorization for data brokers, expanding privacy impact assessment requirements, and developing capacity-building programs for privacy-enhancing technologies—represent crucial first steps in addressing these systemic risks. As AI systems become increasingly integrated into government processes, maintaining a delicate balance between technological advancement and fundamental constitutional protections will be paramount to preserving individual privacy, promoting responsible adoption, and maintaining public trust.
We appreciate the opportunity to contribute to this Request for Information on Executive Branch Agency Handling of Commercially Available Information Containing Personally Identifiable Information. Please contact clangevin@fas.org if you have any questions or need additional information.
Teacher Education Clearinghouse for AI and Data Science
The next presidential administration should develop a teacher education and resource center that includes vetted, free, self-guided professional learning modules, resources to support data-based classroom activities, and instructional guides pertaining to different learning disciplines. This would provide critical support to teachers to better understand and implement data science education and the use of AI tools in their classrooms. Initial resource topics would be:
- An Introduction to AI, Data Literacy, and Data Science
- AI & Data Science Pedagogy
- AI and Data Science for Curriculum Development & Improvement
- Using AI Tools for Differentiation, Assessment & Feedback
- Data Science for Ethical AI Use
In addition, this resource center would develop and host free, pre-recorded, virtual training sessions to support educators and district professionals to better understand these resources and practices so they can bring them back to their contexts. This work would improve teacher practice and cut administrative burdens. A teacher education resource would lessen the digital divide and ensure that our educators are prepared to support their students in understanding how to use AI tools so that each and every student can be college and career ready and competitive at the global level. This resource center would be developed using a process similar to the What Works Clearinghouse, such that it is not endorsing a particular system or curriculum, but is providing a quality rating, based on the evidence provided.
Challenge and Opportunity
AI is an incredible technology that has the power to revolutionize many areas, especially how educators teach and prepare the next generation to be competitive in higher education and the workforce. A recent RAND study found that education leaders see promise in using AI to adapt instructional content to the level of their students and to generate instructional materials and lesson plans. While this technology holds a wealth of promise, the field has developed so rapidly that people across the workforce do not understand how best to take advantage of AI-based technologies. Education is one of the most crucial areas where this gap appears. AI-enabled tools have the potential to improve instruction, curriculum development, and assessment, but most educators have not received adequate training to feel confident using them in their pedagogy. In a Spring 2024 pilot study (Beiting-Parrish & Melville, in preparation), initial results indicated that 64.3% of educators surveyed had not had any professional development or training in how to use AI tools. In addition, more than 70% of educators surveyed felt they did not know how to pick AI tools that are safe for use in the classroom and felt they were not able to detect biased tools. The RAND study likewise indicated that only 18% of educators reported using AI tools for classroom purposes, and approximately half of those educators used AI because they had been specifically recommended or directly provided a tool for classroom use. This suggests that educators need substantial support in choosing and deploying tools for classroom use. Providing guidance and resources to support vetting tools for safe, ethical, appropriate, and effective instruction is one of the cornerstone missions of the Department of Education. This responsibility should not rest on the shoulders of individual educators, who have varying levels of technical and curricular knowledge, especially veteran teachers who have been teaching for more than a decade.
If the teachers themselves do not have enough professional development or expertise to select and teach new technology, they cannot be expected to thoroughly prepare their students to understand emerging technologies, such as AI, nor the underpinning concepts necessary to understand these technologies, most notably data science and statistics. As such, students’ futures are being put at risk by a lack of emphasis on data literacy that is apparent across the nation. Recent results from the National Assessment of Educational Progress (NAEP) show a shocking decline in student performance in data literacy, probability, and statistics skills, outpacing declines in other content areas. In 2019, the NAEP High School Transcript Study (HSTS) revealed that only 17% of students completed a course in statistics and probability, and less than 10% of high school students completed AP Statistics. Furthermore, the HSTS study showed that less than 1% of students completed a dedicated course in modern data science or applied data analytics in high school. Students are graduating with record-low proficiency in data, statistics, and probability, and without learning modern data science techniques. While students’ data and digital literacy are faltering, AI-generated content is proliferating online; students are not building the critical thinking skills and discerning eye needed to determine what is real versus what has been AI-generated, and they are not prepared to enter the workforce in sectors that are booming. The future the nation’s students will inherit is one in which experience with AI tools and Big Data will be expected in order to be competitive in the workforce.
Whether students aren’t getting the content because it isn’t given its due priority, or because teachers aren’t comfortable teaching the content, AI and Big Data are here, and our educators don’t have the tools to help students get ready for a world in the midst of a data revolution. Veteran educators and preservice education programs alike may not have an understanding of the essential concepts in statistics, data literacy, or data science that allow them to feel comfortable teaching about and using AI tools in their classes. Additionally, many standard assessment and practice tools are no longer fit for use in a world where any student can generate an A-quality paper in seconds with proper prompting. The rise of AI-generated content has created a new frontier in information literacy: students need to know to question the output of publicly available LLM-based tools, such as ChatGPT, and to be more critical of what they see online given the rise of AI-generated deepfakes, and educators need to understand how to either incorporate these tools into their classrooms or teach about them effectively. Whether educators are ready or not, the existing Digital Divide has the potential to widen, depending on whether they know how to help students understand how to use AI safely and effectively and have access to the resources and training to do so.
The United States finds itself at a crossroads in the global data boom. Demand in the economic marketplace, along with threats to national security by way of artificial intelligence and mal-, mis-, and disinformation, leaves educators facing an urgent problem in need of an immediate solution. In August of 1958, 66 years ago, Congress passed the National Defense Education Act (NDEA), emphasizing teaching and learning in science and mathematics. Passed specifically in response to the launch of Sputnik, the law supplied massive funding to “insure trained manpower of sufficient quality and quantity to meet the national defense needs of the United States.” The U.S. Department of Education, in partnership with the White House Office of Science and Technology Policy, must make bold moves now to create such a solution, as Congress did once before.
Plan of Action
In the years since the Space Race, one problem with STEM education persists: K-12 classrooms still teach students largely the same content; for example, the progression of high school mathematics through algebra, geometry, and trigonometry is largely unchanged. We are no longer in a race to space; we are now in a race to keep pace with data. Data security, artificial intelligence, machine learning, and other mechanisms of our new information economy are all connected to national security, yet we do not have educators with the capacity to properly equip today’s students with the skills to combat current challenges on a global scale. Without a resource center to house the urgent professional development and classroom activities America’s educators are calling for, progress and leadership in spaces where AI and Big Data are being used will continue to dwindle, and our national security will continue to be at risk. It’s beyond time for a new take on the NDEA that emphasizes more modern topics in the teaching and learning of mathematics and science, by way of data science, data literacy, and artificial intelligence.
Previously, the Department of Education has created resource repositories to support the dissemination of information to the larger educational praxis and research community. One such example is the What Works Clearinghouse (WWC), a federally vetted library of resources on educational products and empirical research that can support the larger field. The WWC was created to help cut through the noise of many different educational product claims and ensure that only high-quality tools and research were being shared. A similar situation exists now with AI and data science resources: there are many resources online, but many are of dubious quality or even spread erroneous information.
To combat this, we suggest the creation of something similar to the WWC, with a focus on vetted materials for educator and student learning around AI and data science. We propose the creation of the Teacher Education Clearinghouse (TEC) under the Institute of Education Sciences, in partnership with the Office of Educational Technology. Currently, the WWC costs approximately $2,500,000 to run, so we anticipate a similar budget for the TEC website. The resource vetting process would begin with a Request for Information (RFI) from the larger field that would encourage educators and administrators to submit high-quality materials. These materials would be vetted using an evaluation framework designed to identify high-quality resources and materials.
For example, the RFI might request example materials or lesson goals for the following subjects:
- An Introduction to AI, Data Literacy, and Data Science
- Introduction to AI & Data Science Literacy & Vocabulary
- Foundational AI Principles
- Cross-Functional Data Literacy and Data Science
- LLMs and How to Use Them
- Critical Thinking and Safety Around AI Tools
- AI & Data Science Pedagogy
- AI and Data Science for Curriculum Development & Improvement
- Using AI Tools for Differentiation, Assessment & Feedback
- Data Science for Safe and Ethical AI Use
- Characteristics of Potentially Biased Algorithms and Their Shortcomings
A framework for evaluating how useful these contributions might be for the Teacher Education Clearinghouse would consider the following principles:
- Accuracy and relevance to subject matter
- Availability of existing resources vs. creation of new resources
- Ease of instructor use
- Likely classroom efficacy
- Safety, responsible use, and fairness of proposed tool/application/lesson
Additionally, the clearinghouse would include a series of quick-start guidebooks, broken down by topic, with resources on foundational topics such as “Introduction to AI” and “Foundational Data Science Vocabulary.”
When complete, this process would result in a national resource library housing a free series of asynchronous professional learning opportunities and classroom materials, activities, and datasets. This work could be promoted through the larger Department of Education as well as through the Regional Educational Laboratory program and state-level stakeholders. The professional learning would consist of prerecorded virtual trainings and related materials (e.g., slide decks, videos, and interactive lesson components). The materials would include educator-facing materials to support professional development in Big Data and AI alongside student-facing lessons on AI literacy that teachers could use to support their students. All materials would be publicly available for download on an ED-owned website. This would allow educators from any district, and at any level of experience, to access materials that improve their understanding and pedagogy. This especially benefits educators from less-resourced environments because they can still access the training they need to adequately support their students, regardless of local capacity for potentially expensive training and resource acquisition. Now is the time to create such a resource center because there currently is no set of vetted and reliable resources available and accessible to the larger educator community, and teachers desperately need these resources to support themselves and their students in using these tools thoughtfully and safely. The successful development of this resource center would result in increased educator understanding of AI and data science, such that the standing of U.S. students improves on international measurements such as the International Computer and Information Literacy Study (ICILS), along with increased participation in STEAM fields that rely on these skills.
Conclusion
The field of education is at a turning point; the rise of advancements in AI and Big Data necessitates increased focus on these areas in the K-12 classroom, yet most educators do not have the preparation needed to adequately teach these topics and fully prepare their students. For the United States to continue to be a competitive global power in technology and innovation, we need a workforce that understands how to use, apply, and develop new innovations using AI and data science. This proposal for a library of high-quality, open-source, vetted materials would support democratization of professional development for all educators and their students.
Modernizing AI Fairness Analysis in Education Contexts
The 2022 release of ChatGPT and subsequent foundation models sparked a generative AI (GenAI) explosion in American society, driving rapid adoption of AI-powered tools in schools, colleges, and universities nationwide. Education technology was one of the first application areas used to develop and test ChatGPT in a real-world context. A recent national survey indicated that nearly 50% of teachers, students, and parents use GenAI chatbots in school, and over 66% of parents and teachers believe that GenAI chatbots can help students learn more and faster. While this innovation is exciting and holds tremendous promise to personalize education, educators, families, and researchers are concerned that AI-powered solutions may not be equally useful, accurate, and effective for all students, in particular students from minoritized populations. It is possible that bias will be addressed as this technology further develops; however, to ensure that students are not harmed as these tools become more widespread, it is critical for the Department of Education to provide guidance for education decision-makers to evaluate AI solutions during procurement, to support EdTech developers in detecting and mitigating bias in their applications, and to develop new fairness methods to ensure that these solutions serve the students with the most to gain from our educational systems. Creating this guidance will require leadership from the Department of Education to declare this issue a priority and to resource an independent organization with the expertise needed to deliver these services.
Challenge and Opportunity
Known Bias and Potential Harm
There are many examples of the use of AI-based systems introducing more bias into an already-biased system. One example with widely varying results for different student groups is the use of GenAI tools to detect AI-generated text as a form of plagiarism. Liang et al. found that several GPT-based plagiarism checkers frequently identified the writing of students for whom English is not their first language as AI-generated, even though their work was written before ChatGPT was available. The same errors did not occur with text written by native English speakers. However, in a publication by Jiang (2024), no bias against non-native English speakers was found in distinguishing human-authored essays from ChatGPT-generated essays written in response to analytical writing prompts from the GRE, an example of how thoughtful AI tool design and representative sampling in the training set can achieve fairer outcomes and mitigate bias.
Beyond bias, researchers have raised additional concerns about the overall efficacy of these tools for all students; a better understanding of differing results for subpopulations and of potential instances of bias is nonetheless a critical part of deciding whether these tools should be used by teachers in classrooms. For AI-based tools to be usable in high-stakes educational contexts such as testing, detecting and mitigating bias is critical, particularly when the consequences of being incorrect are so high, such as for students from minoritized populations who may not have the resources to recover from an error (e.g., failing a course or being prevented from graduating).
Another example of algorithmic bias that predates the widespread emergence of GenAI, and that illustrates potential harms, is the Wisconsin Dropout Early Warning System. This AI-based tool was designed to flag students who may be at risk of dropping out of school; however, an analysis of the outcomes of these predictions found that the system disproportionately flagged African American and Hispanic students as likely to drop out of school when most of these students were not at risk of dropping out. When teachers learn that one of their students is at risk, this may change how they approach that student, which can cause further negative treatment and consequences for that student, creating a self-fulfilling prophecy and denying that student the educational opportunities and confidence they deserve. These examples are only two of many consequences of using systems that have underlying bias, and they demonstrate how critical it is to conduct fairness analysis before these systems are used with actual students.
Existing Guidance on Fair AI & Standards for Education Technology Applications
Guidance for Education Technology Applications
Given the harms that algorithmic bias can cause in educational settings, there is an opportunity to provide national guidelines and best practices that help educators avoid these harms. The Department of Education is already responsible for protecting student privacy and provides guidelines via the Every Student Succeeds Act (ESSA) Evidence Levels to evaluate the quality of EdTech solution evidence. The Office of Educational Technology, through the support of a private non-profit organization (Digital Promise), has developed guidance documents for teachers and administrators and another for education technology developers (U.S. Department of Education, 2023, 2024). In particular, “Designing for Education with Artificial Intelligence” includes guidance for EdTech developers, including an entire section called “Advancing Equity and Protecting Civil Rights” that describes algorithmic bias and suggests that “Developers should proactively and continuously test AI products or services in education to mitigate the risk of algorithmic discrimination” (p. 28). While this is a good overall guideline, the document is, critically, not sufficient to help developers conduct these tests.
Similarly, the National Institute of Standards and Technology has released a publication on identifying and managing bias in AI. While this publication highlights some areas of the development process and several fairness metrics, it does not provide specific guidelines for using these fairness metrics, nor is it exhaustive. Finally, demonstrating the interest of industry partners, the EDSAFE AI Alliance, a philanthropically funded alliance representing a diverse group of companies in educational technology, has also created guidance in the form of the 2024 SAFE (Safety, Accountability, Fairness, and Efficacy) Framework. Within the Fairness section of the framework, the authors highlight the importance of using fair training data, monitoring for bias, and ensuring accessibility of any AI-based tool. But again, this framework does not provide specific actions that education administrators, teachers, or EdTech developers can take to ensure these tools are fair and are not biased against specific populations. The risk to these populations and the limits of existing efforts demonstrate the need for further work to develop new approaches that can be used in the field.
Fairness in Education Measurement
As AI is becoming increasingly used in education, the field of educational measurement has begun creating a set of analytic approaches for finding examples of algorithmic bias, many of which are based on existing approaches to uncovering bias in educational testing. One common tool is called Differential Item Functioning (DIF), which checks that test questions are fair for all students regardless of their background. For example, it ensures that native English speakers and students learning English have an equal chance to succeed on a question if they have the same level of knowledge. When differences are found, this indicates that a student’s performance on that question is not based on their knowledge of the content.
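A minimal sketch of one classical DIF statistic, the Mantel-Haenszel common odds ratio, is shown below. The inputs are hypothetical; operational testing programs pair statistics like this with significance tests and effect-size classifications before flagging an item.

```python
import numpy as np

def mantel_haenszel_odds_ratio(item_correct, focal_group, total_score, n_strata=5):
    """Compare the odds of answering one item correctly for a focal group
    (e.g., English learners) vs. a reference group, matching students on
    total test score. Values near 1.0 suggest little differential functioning."""
    item_correct = np.asarray(item_correct)
    focal_group = np.asarray(focal_group)
    # Match students on ability by binning total scores into strata.
    edges = np.quantile(total_score, np.linspace(0, 1, n_strata + 1)[1:-1])
    strata = np.digitize(total_score, edges)
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum(m & (focal_group == 0) & (item_correct == 1))  # reference, correct
        b = np.sum(m & (focal_group == 0) & (item_correct == 0))  # reference, incorrect
        c = np.sum(m & (focal_group == 1) & (item_correct == 1))  # focal, correct
        d = np.sum(m & (focal_group == 1) & (item_correct == 0))  # focal, incorrect
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    return num / den if den else float("nan")
```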
While DIF checks have been used for several decades as a best practice in standardized testing, a comparable process in the use of AI for assessment purposes does not yet exist. There also is little historical precedent indicating that for-profit educational companies will self-govern and self-regulate without a larger set of guidelines and expectations from a governing body, such as the federal government.
We are at a critical juncture as school districts begin adopting AI tools with minimal guidance or guardrails, and all signs point to an increase of AI in education. The U.S. Department of Education has an opportunity to take a proactive approach to ensuring AI fairness through strategic programs of support for school leadership, developers in educational technology, and experts in the field. It is important for the larger federal government to support all educational stakeholders under a common vision for AI fairness while AI adoption in education is still at a relatively early stage.
Plan of Action
To address this situation, the Department of Education’s Office of the Chief Data Officer should lead development of a national resource that provides direct technical assistance to school leadership, supports software developers and vendors of AI tools in creating quality technology, and invests resources to create solutions that can be used by both school leaders and application developers. This office is already responsible for data management and asset policies, and it provides resources on grants and artificial intelligence for the field. The implementation of these resources would likely be carried out via grants to external actors with sufficient technical expertise, given the rapid pace of innovation in the private and academic research sectors. Leading the effort from this office ensures that these advances answer the most important questions and can be integrated into policy standards and requirements for education solutions. Congress should allocate additional funding to the Department of Education to support the development of a technical assistance program for school districts, establish new grants for fairness evaluation tools that span the full development lifecycle, and pursue an R&D agenda for AI fairness in education. While it is hard to provide an exact estimate, similar existing programs currently cost the Department of Education between $4 million and $30 million a year.
Action 1. The Department of Education Should Provide Independent Support for School Leadership Through a Fair AI Technical Assistance Center (FAIR-AI-TAC)
School administrators are hearing about the promise and concerns of AI solutions in the popular press, from parents, and from students. They are also being bombarded by education technology providers with new applications of AI within existing tools and through new solutions.
These busy school leaders do not have time to learn the details of AI and bias analysis, nor do they have the technical background required to conduct deep technical evaluations of fairness within AI applications. Leaders are forced to either reject these innovations or implement them and expose their students to significant potential risk with the promise of improved learning. This is not an acceptable status quo.
To address these issues, the Department of Education should create an AI Technical Assistance Center (the Center) that is tasked with providing direct guidance to state and local education leaders who want to incorporate AI tools fairly and effectively. The Center should be staffed by a team of professionals with expertise in data science, data safety, ethics, education, and AI system evaluation. Additionally, the Center should operate independently of AI tool vendors to maintain objectivity.
There is precedent for this type of technical support. The U.S. Department of Education’s Privacy Technical Assistance Center (PTAC) provides guidance related to data privacy and security procedures and processes to meet FERPA guidelines; they operate a help desk via phone or email, develop training materials for broad use, and provide targeted training and technical assistance for leaders. A similar kind of center could be stood up to support leaders in education who need support evaluating proposed policy or procurement decisions.
This Center should provide a structured consulting service offering different levels of expertise based on each stakeholder’s needs and on the potential impact on learners of the system or tool being evaluated; this should include everything from basic AI literacy support to active support in choosing technological solutions for educational purposes. The Center should partner with external organizations to develop a certification system for high-quality AI educational tools that have passed a series of fairness checks. Creating a fairness certification (operationalized by third-party evaluators) would make it much easier for school leaders to recognize and adopt fair AI solutions that meet student needs.
Action 2. The Department of Education Should Provide Expert Services, Data, and Grants for EdTech Developers
There are many educational technology developers with AI-powered innovations. Even when well-intentioned, some of these tools do not achieve their desired impacts or may be unintentionally unsafe due to a lack of processes and tests for fairness and safety.
Educational Technology developers generally operate under significant constraints when incorporating AI models into their tools and applications. Student data is often highly detailed and deeply personal, potentially containing financial, disability, and educational status information that is currently protected by FERPA, which makes it unavailable for use in AI model training or testing.
Developers need safe, legal, and quality datasets that they can use for testing for bias, as well as appropriate bias evaluation tools. There are several promising examples of these types of applications and new approaches to data security, such as the recently awarded NSF SafeInsights project, which allows analysis without disclosing the underlying data. In addition, philanthropically funded organizations such as the Allen Institute for AI have released LLM evaluation tools that could be adapted and provided to education technology developers for testing. A vetted set of evaluation tools, along with more detailed technical resources and instructions for how to use them, would encourage developers to incorporate bias evaluations early and often. Currently, there are very few market incentives or existing requirements that push developers to invest the necessary time or resources into this type of fairness analysis. Thus, the government has a key role to play here.
The Department of Education should also fund a new grant program that tasks grantees with developing a robust and independently validated third-party evaluation system that checks for fairness violations and biases throughout the model development process from pre-processing of data, to the actual AI use, to testing after AI results are created. This approach would support developers in ensuring that the tools they are publishing meet an agreed-upon minimum threshold for safe and fair use and could provide additional justification for the adoption of AI tools by school administrators.
Action 3. The Department of Education Should Develop Better Fairness R&D Tools with Researchers
There is still no consensus on best practices for how to ensure that AI tools are fair. As AI capabilities evolve, the field needs an ongoing vetted set of analyses and approaches that will ensure that any tools being used in an educational context are safe and fair for use with no unintended consequences.
The Department of Education should lead the creation of a working group or task force composed of subject matter experts from education, educational technology, educational measurement, and the larger AI field to identify the state of the art in existing fairness approaches for education technology and assessment applications, with a focus on modernized conceptions of identity. This proposed task force would be an inter-organizational group that would include representatives from several different federal government offices, such as the Office of Educational Technology and the Chief Data Office, as well as prominent experts from industry and academia. An initial convening could be conducted alongside leading national conferences that already attract thousands of attendees conducting cutting-edge education research (such as the American Educational Research Association and the National Council on Measurement in Education).
The working group’s mandate should include creating a set of recommendations for federal funding to advance research on evaluating AI educational tools for fairness and efficacy. This research agenda would likely span multiple agencies, including NIST, the Institute of Education Sciences (IES) of the U.S. Department of Education, and the National Science Foundation. There are existing models for funding early-stage research and development with applied approaches, including the IES “Accelerate, Transform, Scale” programs, which integrate learning sciences theory with efforts to scale through applied education technology programs, and Generative AI research centers that have the existing infrastructure and mandates to conduct this type of applied research.
Additionally, the working group should recommend the selection of a specialized group of researchers who would contribute ongoing research into new empirically based approaches to AI fairness that would continue to be used by the larger field. This innovative work might look like developing new datasets that deliberately probe for instances of bias and stereotypes, such as the CrowS-Pairs dataset. It may build on current cutting-edge research into the specific contributions of variables and elements of LLM models that directly contribute to biased AI scores, such as the work being done by the AI company Anthropic. It may compare different foundation LLMs and demonstrate specific areas of bias within their output. It may also look like a collaborative effort between organizations, such as the development of the RSM-Tool, which looks for biased scoring. Finally, it may be an improved auditing tool for any portion of the model development pipeline. In general, the field does not yet have a set of universally agreed upon, actionable tools and approaches that can be used across contexts and applications; this research team would help create these for the field.
Finally, the working group should recommend policies and standards that would incentivize vendors and developers working on AI education tools to adopt fairness evaluations and share their results.
Conclusion
As AI-based tools continue being used for educational purposes, there is an urgent need to develop new approaches to evaluating these solutions for fairness that include modern conceptions of student belonging and identity. This effort should be led by the Department of Education, through the Office of the Chief Data Officer, given the technical nature of the services and the relationship with sensitive data sources. While the Chief Data Officer should provide direction and leadership for the project, partnering with external organizations through federal grant processes would provide necessary capacity boosts to fulfill the mandate described in this memo. As we move into an age of widespread AI adoption, AI tools for education will be increasingly used in classrooms and in homes. Thus, it is imperative that robust fairness approaches are deployed before a new tool is used, in order to protect our students and also to protect developers and administrators from potential litigation, loss of reputation, and other negative outcomes.
This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.
PLEASE NOTE (February 2025): Since publication several government websites have been taken offline. We apologize for any broken links to once accessible public data.
When AI is used to grade student work, fairness is evaluated by comparing the scores assigned by AI with those assigned by human graders across different demographic groups. This is often done using statistical metrics, such as the standardized mean difference (SMD), to detect any additional bias introduced by the AI. A common benchmark for SMD is 0.15; values above this threshold suggest potential machine bias relative to human scores. However, more guidance is needed on how to address cases where SMD values exceed this threshold.
In addition to SMD, other metrics like exact agreement, exact + adjacent agreement, correlation, and Quadratic Weighted Kappa are often used to assess the consistency and alignment between human and AI-generated scores. While these methods provide valuable insights, further research is needed to ensure these metrics are robust, resistant to manipulation, and appropriately tailored to specific use cases, data types, and varying levels of importance.
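To make these metrics concrete, the short sketch below shows one common way SMD and Quadratic Weighted Kappa might be computed for a single demographic subgroup. The exact formulations vary across the literature, and the scores and the 0.15 flag here are purely illustrative.

```python
# A minimal sketch (not an official standard): comparing AI scores with
# human scores for one subgroup using SMD and Quadratic Weighted Kappa.
import numpy as np

def standardized_mean_difference(ai_scores, human_scores):
    """One common formulation: mean (AI - human) difference for a subgroup,
    scaled by the pooled standard deviation of the two score distributions."""
    diff = np.mean(ai_scores) - np.mean(human_scores)
    pooled_sd = np.sqrt((np.var(ai_scores, ddof=1) + np.var(human_scores, ddof=1)) / 2)
    return diff / pooled_sd

def quadratic_weighted_kappa(a, b, n_categories):
    """Agreement between two raters on ordinal scores 0..n_categories-1,
    penalizing larger disagreements quadratically."""
    observed = np.zeros((n_categories, n_categories))
    for i, j in zip(a, b):
        observed[i, j] += 1
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / len(a)
    weights = np.array([[(i - j) ** 2 for j in range(n_categories)]
                        for i in range(n_categories)]) / (n_categories - 1) ** 2
    return 1 - (weights * observed).sum() / (weights * expected).sum()

# Illustrative use with made-up scores for one demographic subgroup:
human = np.array([2, 3, 1, 4, 2, 3])
ai    = np.array([2, 4, 1, 4, 3, 3])
smd = standardized_mean_difference(ai, human)
if abs(smd) > 0.15:  # the benchmark cited above
    print(f"Potential machine bias flagged: SMD = {smd:.2f}")
print("QWK:", round(quadratic_weighted_kappa(human, ai, n_categories=5), 3))
```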
Existing approaches to demographic post hoc analysis of fairness assume that there are two discrete populations that can be compared: for example, students from African-American families vs. those not from African-American families, students from an English language learner family background vs. those who are not, and other known family characteristics. In practice, however, people do not experience these discrete identities. Since at least the 1980s, contemporary sociological theories have emphasized that a person’s identity is contextual, hybrid, and fluid. One current approach to identity that integrates concerns of equity and has been applied to AI is “intersectional identity” theory. This approach has begun to yield promising new methods that bring contemporary understandings of identity into automated evaluations of AI fairness. Measuring every interaction between identity variables quickly produces subgroups too small to analyze; these interactions can instead be prioritized using theory or design principles, or with more advanced statistical techniques (e.g., dimensional data reduction).
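As a rough illustration of how such subgroup checks could extend to intersectional groups, the sketch below computes a machine-versus-human SMD for every combination of a few identity variables and skips cells that are too small to estimate reliably. The column names and minimum cell size are hypothetical choices, not prescribed standards.

```python
# A hedged sketch of extending subgroup bias checks to intersectional groups;
# the column names and the minimum-cell-size rule below are illustrative only.
import numpy as np
import pandas as pd

def subgroup_smd(df, group_cols, min_n=30):
    """Compute machine-vs-human SMD for every combination of the given
    identity variables, skipping cells too small to estimate reliably."""
    results = []
    for keys, cell in df.groupby(group_cols):
        if len(cell) < min_n:
            results.append({"group": keys, "n": len(cell), "smd": None})
            continue
        diff = cell["ai_score"].mean() - cell["human_score"].mean()
        pooled_sd = np.sqrt((cell["ai_score"].var() + cell["human_score"].var()) / 2)
        results.append({"group": keys, "n": len(cell), "smd": diff / pooled_sd})
    return pd.DataFrame(results)

# e.g., a two-way intersection of hypothetical identity variables:
# report = subgroup_smd(scores_df, ["english_learner", "rural"])
```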
Driving Equitable Healthcare Innovations through an AI for Medicaid (AIM) Initiative
Artificial intelligence (AI) has transformative potential in the public health space – in an era when millions of Americans have limited access to high-quality healthcare services, AI-based tools and applications can enable remote diagnostics, drive efficiencies in implementation of public health interventions, and support clinical decision-making in low-resource settings. However, innovation driven primarily by the private sector today may be exacerbating existing disparities by training models on homogenous datasets and building tools that primarily benefit high socioeconomic status (SES) populations.
To address this gap, the Center for Medicare and Medicaid Innovation (CMMI) should create an AI for Medicaid (AIM) Initiative to distribute competitive grants to state Medicaid programs (in partnership with the private sector) for pilot AI solutions that lower costs and improve care delivery for rural and low-income populations covered by Medicaid.
Challenge & Opportunity
In 2022, the United States spent $4.5 trillion on healthcare, accounting for 17.3% of total GDP. Despite spending far more on healthcare per capita than other high-income countries, the United States has significantly worse outcomes, including lower life expectancy, higher death rates due to avoidable causes, and less access to healthcare services. Further, the 80 million low-income Americans reliant on state-administered Medicaid programs often have below-average health outcomes and the least access to healthcare services.
AI has the potential to transform the healthcare system – but innovation driven solely by the private sector exacerbates the previously described inequities. Algorithms in general are often trained on datasets that do not represent the underlying population – in many cases, these training biases result in tools and models that perform poorly for racial minorities, people living with comorbidities, and people of low SES. For example, until January 2023, the model used to prioritize patients for kidney transplants systematically ranked Black patients lower than White patients – the race component was identified and removed due to advocacy efforts within the medical community. AI models, while significantly more powerful than traditional predictive algorithms, are also more difficult to understand and engineer, increasing the likelihood that such biases will be further perpetuated.
Additionally, startups innovating the digital health space today are not incentivized to develop solutions for marginalized populations. For example, in FY 2022, the top 10 startups focused on Medicaid received only $1.5B in private funding, while their Medicare Advantage (MA)-focused counterparts received over $20B. Medicaid’s lower margins are not attractive to investors, so digital health development targets populations that are already well-insured and have higher degrees of access to care.
The Federal Government is uniquely positioned to bridge the incentive gap between developers of AI-based tools in the private sector and American communities who would benefit most from said tools. Accordingly, the Center for Medicare and Medicaid Innovation (CMMI) should launch the AI for Medicaid (AIM) Initiative to incentivize and pilot novel AI healthcare tools and solutions targeting Medicaid recipients. Precedents in other countries demonstrate early success in state incentives unlocking health AI innovations – in 2023, the United Kingdom’s National Health Service (NHS) partnered with Deep Medical to pilot AI software that streamlines services by predicting and mitigating missed appointment risk. The successful pilot is now being adopted more broadly and is projected to save the NHS over $30M annually in the coming years.
The AIM Initiative, guided by the structure of the former Medicaid Innovation Accelerator Program (IAP), President Biden’s executive order on integrating equity into AI development, and HHS’ Equity Plan (2022), will encourage the private sector to partner with State Medicaid programs on solutions that benefit rural and low-income Americans covered by Medicaid and drive efficiencies in the overall healthcare system.
Plan of Action
CMMI will launch and operate the AIM Initiative within the Department of Health and Human Services (HHS). $20M of HHS’ annual budget request will be allocated towards the program. State Medicaid programs, in partnership with the private sector, will be invited to submit proposals for competitive grants. In addition to funding, CMMI will leverage the former structure of the Medicaid IAP program to provide state Medicaid agencies with technical assistance throughout their participation in the AIM Initiative. The programs ultimately selected for pilot funding will be monitored and evaluated for broader implementation in the future.
Sample Detailed Timeline
- 0-6 months:
- HHS Secretary to announce and launch the AI for Medicaid (AIM) Initiative within CMMI (e.g., delineating personnel responsibilities and engaging with stakeholders to shape the program)
- HHS to include AIM funding in annual budget request to Congress ($20M allocation)
- 6-12 months:
- CMMI to engage directly with state Medicaid agencies to support proposal development and facilitate connections with private sector partners
- CMMI to complete solicitation period and select ~7-10 proposals for pilot funding of ~$2-5M each by end of Year 1
- Year 2-7: Launch and roll out selected AI projects, led by state Medicaid agencies with continued technical assistance from CMMI
- Year 8: CMMI to produce an evaluative report and provide recommendations for broader adoption of AI tools and solutions within Medicaid-covered and other populations
Risks and Limitations
- Participation: Success of the initiative relies on state Medicaid programs and private sector partners’ participation. To mitigate this risk, CMMI will engage early with the National Association of Medicaid Directors (NAMD) to generate interest and provide technical assistance in proposal development. These conversations will also include input and support from the HHS Office of the Chief AI Officer (OCAIO) and its AI Council/Community of Practice. Further, startups in the healthcare AI space will be invited to engage with CMMI on identifying potential partnerships with state Medicaid agencies. A secondary goal of the initiative will be to ensure a number of private sector partners are involved in AIM.
- Oversight: AI is at the frontier of technological development today, and it is critical to ensure guardrails are in place to protect patients using AI technologies from potential adverse outcomes. To mitigate this risk, state Medicaid agencies will be required to submit detailed evaluation plans with their proposals. Additionally, informed consent and the ability to opt out of data sharing when engaging with personally identifiable information (PII) and diagnostic or therapeutic technologies will be required. Technology partners (whether private, academic, or public sector) will further be required to demonstrate (1) adequate testing to identify and reduce bias in their AI tools to reasonable standards, (2) engagement with beneficiaries in the development process, and (3) use of testing environments that reflect the particular context of the Medicaid population. Finally, all proposals must adhere to AI guidelines adopted by HHS and the federal government more broadly, such as the CMS AI Playbook, the HHS Trustworthy AI Playbook, and any forthcoming regulations.
- Longevity: As a pilot grant program, the initiative does not promise long-term results for the broader population and will only facilitate short-term projects at the state level. Consequently, HHS leadership must remain committed to program evaluation and a long-term outlook on how AI can be integrated to support Americans more broadly. AI technologies or tools considered for acquisition by state Medicaid agencies or federal agencies after pilot implementation should ensure compliance with OMB guidelines.
Conclusion
The AI for Medicaid Initiative is an important step in ensuring the promise of artificial intelligence in healthcare extends to all Americans. The initiative will enable the piloting of a range of solutions at a relatively low cost, engage with stakeholders across the public and private sectors, and position the United States as a leader in healthcare AI technologies. Leveraging state incentives to address a critical market failure in the digital health space can additionally unlock significant efficiencies within the Medicaid program and the broader healthcare system. The rural and low-income Americans reliant on Medicaid have too often been an afterthought in access to healthcare services and technologies – the AIM Initiative provides an opportunity to address this health equity gap.
This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.
PLEASE NOTE (February 2025): Since publication several government websites have been taken offline. We apologize for any broken links to once accessible public data.
Accelerating Materials Science with AI and Robotics
Innovations in materials science enable innumerable downstream innovations: steel enabled skyscrapers, and novel configurations of silicon enabled microelectronics. Yet progress in materials science has slowed in recent years. Fundamentally, this is because there is a vast universe of potential materials, and the only way to discover which among them are most useful is to experiment. Today, those experiments are largely conducted by hand. Innovations in artificial intelligence and robotics will allow us to accelerate the search process using foundation AI models for science research and automate much of the experimentation with robotic, self-driving labs. This policy memo recommends the Department of Energy (DOE) lead this effort because of its unique expertise in supercomputing, AI, and its large network of National Labs.
Challenge and Opportunity
Take a look at your smartphone. How long does its battery last? How durable is its frame? How tough is its screen? How fast and efficient are the chips inside it?
Each of these questions implicates materials science in fundamental ways. The limits of our technological capabilities are defined by the limits of what we can build, and what we can build is defined by what materials we have at our disposal. The early eras of human history are named for materials: the Stone Age, the Bronze Age, the Iron Age. Even today, the cradle of American innovation is Silicon Valley, a reminder that even our digital era is enabled by finding innovative ways to assemble matter to accomplish novel things.
Materials science has been a driver of economic growth and innovation for decades. Improvements to silicon purification and processing—painstakingly worked on in labs for decades—fundamentally enabled silicon-based semiconductors, a $600 billion industry today that McKinsey recently projected would double in size by 2030. The entire digital economy, conservatively estimated by the Bureau of Economic Analysis (BEA) at $3.7 trillion in the U.S. alone, in turn, rests on semiconductors. Plastics, another profound materials science innovation, are estimated to have generated more than $500 billion in economic value in the U.S. last year. The quantitative benefits are staggering, but even qualitatively, it is impossible to imagine modern life without these materials.
However, present-day materials are beginning to show their age. We need better batteries to accelerate the transition to clean energy. We may be approaching the limits of traditional methods of manufacturing semiconductors in the next decade. We require exotic new forms of magnets to bring technologies like nuclear fusion to life. We need materials with better thermal properties to improve spacecraft.
Yet materials science and engineering—the disciplines of discovering and learning to use new materials—have slowed down in recent decades. The low-hanging fruit has been plucked, and the easy discoveries are old news. We’re approaching the limits of what our materials can do because we are also approaching the limits of what the traditional practice of materials science can do.
Today, materials science proceeds at much the same pace as it did half a century ago: manually, with small academic labs and graduate students formulating potential new combinations of elements, synthesizing those combinations, and studying their characteristics. Because there are more ways to configure matter than there are atoms in the universe, manually searching through the space of possible materials is an impossible task.
Fortunately, AI and robotics present an opportunity to automate that process. AI foundation models for physics and chemistry can be used to simulate potential materials with unprecedented speed and low cost compared to traditional ab initio methods. Robotic labs (also known as “self-driving labs”) can automate the manual process of performing experiments, allowing scientists to synthesize, validate, and characterize new materials twenty-four hours a day at dramatically lower costs. The experiments will generate valuable data for further refining the foundation models, resulting in a positive feedback loop. AI language models like OpenAI’s GPT-4 can write summaries of experimental results and even help ideate new experiments. The scientists and their grad students, freed from this manual and often tedious labor, can do what humans do best: think creatively and imaginatively.
Achieving this goal will require a coordinated effort, significant investment, and expertise at the frontiers of science and engineering. Because much of materials science is basic R&D—too far from commercialization to attract private investment—there is a unique opportunity for the federal government to lead the way. As with much scientific R&D, the economic benefits of new materials science discoveries may take time to emerge. One literature review estimated that it can take roughly 20 years for basic research to translate to economic growth. Research indicates that the returns—once they materialize—are significant. A study from the Federal Reserve Bank of Dallas suggests a return of 150-300% on federal R&D spending.
The best-positioned department within the federal government to coordinate this effort is the DOE, which has many of the key ingredients in place: a demonstrated track record of building and maintaining the supercomputing facilities required to make physics-based AI models, unparalleled scientific datasets with which to train those models collected over decades of work by national labs and other DOE facilities, and a skilled scientific and engineering workforce capable of bringing challenging projects to fruition.
Plan of Action
Achieving the goal of using AI and robotics to simulate potential materials with unprecedented speed and low cost, and benefit from the discoveries, rests on five key pillars:
- Creating large physics and chemistry datasets for foundation model training (estimated cost: $100 million)
- Developing foundation AI models for materials science discovery, either independently or in collaboration with the private sector (estimated cost: $10-100 million, depending on the nature of the collaboration);
- Building 1-2 pilot self-driving labs (SDLs) aimed at establishing best practices, building a supply chain for robotics and other equipment, and validating the scientific merit of SDLs (estimated cost: $20-40 million);
- Making self-driving labs an official priority of the DOE’s preexisting FASST initiative (described below);
- Directing the DOE’s new Foundation for Energy Security and Innovation (FESI) to prioritize establishing fellowships and public-private partnerships to support items (1) and (2), both financially and with human capital.
The total cost of the proposal, then, is estimated at between $130 million and $240 million. The potential return on this investment, though, is far higher. Moderate improvements to battery materials could drive tens or hundreds of billions of dollars in value. Discovery of a “holy grail” material, such as a room-temperature, ambient-pressure superconductor, could create trillions of dollars in value.
Creating Materials Science Foundation Model Datasets
Before a large materials science foundation model can be trained, vast datasets must be assembled. DOE, through its large network of scientific facilities including particle colliders, observatories, supercomputers, and other experimental sites, collects enormous quantities of data–but this, unfortunately, is only the beginning. DOE’s data infrastructure is out-of-date and fragmented between different user facilities. Data access and retention policies make sharing and combining different datasets difficult or impossible.
All of these policy and infrastructural decisions were made far before training large-scale foundation models was a priority. They will have to be changed to capitalize on the newfound opportunity of AI. Existing DOE data will have to be reorganized into formats and within technical infrastructure suited to training foundation models. In some cases, data access and retention policies will need to be relaxed or otherwise modified.
In other cases, however, highly sensitive data will need to be integrated in more sophisticated ways. A 2023 DOE report, recognizing the problems with DOE data infrastructure, suggests developing federated learning capabilities–an active area of research in the broader machine learning community–which would allow for data to be used for training without being shared. This would, the report argues, “allow access and connections to the information through access control processes that are developed explicitly for multilevel privacy.”
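The toy sketch below illustrates the federated averaging idea the report gestures at: each facility trains on its own data locally, and only model parameters, never raw records, are exchanged. It is a conceptual simplification, not a description of DOE's actual systems.

```python
# Toy illustration of federated averaging: sites share model weights,
# never their underlying data. Purely a conceptual sketch.
import numpy as np

def local_update(weights, X, y, lr=0.01, steps=100):
    """One site's local training: a few gradient steps of linear
    regression on data that never leaves the site."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, site_datasets):
    """Each site refines the global model on its private data; the
    coordinator averages the returned weights, weighted by site size."""
    updates, sizes = [], []
    for X, y in site_datasets:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(np.stack(updates), axis=0, weights=np.array(sizes, dtype=float))

# e.g., three facilities holding private (X, y) arrays:
# global_w = np.zeros(n_features)
# for _ in range(20):
#     global_w = federated_round(global_w, [(X1, y1), (X2, y2), (X3, y3)])
```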
This work will require deep collaboration between data scientists, machine learning scientists and engineers, and domain-specific scientists. It is, by far, the least glamorous part of the process–yet it is the necessary groundwork for all progress to follow.
Building AI Foundation Models for Science
Fundamentally, AI is a sophisticated form of statistics. Deep learning, the broad approach that has undergirded all advances in AI over the past decade, allows AI models to uncover deep patterns in extremely complex datasets, such as all the content on the internet, the genomes of millions of organisms, or the structures of thousands of proteins and other biomolecules. Models of this kind are sometimes loosely referred to as “foundation models.”
Foundation models for materials science can take many different forms, incorporating various aspects of physics, chemistry, and even—for the emerging field of biomaterials—biology. Broadly speaking, foundation models can help materials science in two ways: inverse design and property prediction. Inverse design allows scientists to input a given set of desired characteristics (toughness, brittleness, heat resistance, electrical conductivity, etc.) and receive a prediction for what material might be able to achieve those properties. Property prediction is the opposite flow of information, inputting a given material and receiving a prediction of what properties it will have in the real world.
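As a purely illustrative sketch, the two hypothetical interfaces below capture the opposite information flows described above; real foundation models would be learned networks with far richer inputs and outputs than these invented signatures.

```python
# Illustrative interfaces only; the dataclasses and function signatures
# are hypothetical, not the design of any actual DOE foundation model.
from dataclasses import dataclass

@dataclass
class Material:
    composition: dict[str, float]   # e.g., {"Li": 0.2, "Fe": 0.3, "P": 0.5}
    structure: str                  # e.g., a crystal-structure identifier

@dataclass
class Properties:
    conductivity: float
    hardness: float
    heat_resistance: float

def predict_properties(material: Material) -> Properties:
    """Property prediction: a material goes in, predicted behavior comes out."""
    ...

def inverse_design(target: Properties, n_candidates: int = 10) -> list[Material]:
    """Inverse design: desired behavior goes in, candidate materials come out,
    ranked by how closely the model expects them to match the target."""
    ...
```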
DOE has already proposed creating AI foundation models for materials science as part of its Frontiers in Artificial Intelligence for Science, Security and Technology (FASST) initiative. While this initiative contains numerous other AI-related science and technology objectives, supporting it would enable the creation of new foundation models, which can in turn be used to support the broader materials science work.
DOE’s long history of stewarding America’s national labs makes it the best-suited home for this proposal. DOE labs and other DOE sub-agencies have decades of data from particle accelerators, nuclear fusion reactors, and other specialized equipment rarely seen in other facilities. These labs have performed hundreds of thousands of experiments in physics and chemistry over their lifetimes, and over time, DOE has created standardized data collection practices. AI models are defined by the data that they are trained with, and DOE has some of the most comprehensive physics and chemistry datasets in the country—if not the world.
The foundation models created by DOE should be made available to scientists. The extent of that availability should be determined by the sensitivity of the data used to train the model and other potential risks associated with broad availability. If, for example, a model was created using purely internal or otherwise sensitive DOE datasets, it might have to be made available only to select audiences with usage monitored; otherwise, there is a risk of exfiltrating sensitive training data. If there are no such data security concerns, DOE could choose to fully open source the models, meaning their weights and code would be available to the general public. Regardless of how the models themselves are distributed, the fruits of all research enabled by both DOE foundation models and self-driving labs should be made available to the academic community and broader public.
Scaling Self-Driving Labs
Self-driving labs are largely automated facilities that allow robotic equipment to autonomously conduct scientific experiments with human supervision. They are well-suited to relatively simple, routine experiments—the exact kind involved in much of materials science. Recent advancements in robotics have been driven by a combination of cheaper hardware and enhanced AI models. While fully autonomous humanoid robots capable of automating arbitrary manual labor are likely years away, it is now possible to configure facilities to automate a broad range of scripted tasks.
Many experiments in materials science involve making iterative tweaks to variables within the same broad experimental design. For example, a grad student might tweak the ratios of the elements that constitute the material, or change the temperature at which the elements are combined. These are highly automatable tasks. Furthermore, by allowing multiple experiments to be conducted in parallel, self-driving labs allow scientists to rapidly accelerate the pace at which they conduct their work.
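A hypothetical sketch of such a loop is shown below; the synthesis and characterization calls stand in for whatever interface a real self-driving lab would expose, and the parameter grid is invented for illustration.

```python
# Hypothetical closed-loop sketch: the lab calls below are invented for
# illustration; a real self-driving lab would expose its own interface.
import itertools
import random

def propose_conditions():
    """A grid of simple tweaks a grad student might otherwise try by hand:
    element ratios and synthesis temperatures."""
    ratios = [0.2, 0.4, 0.6, 0.8]
    temperatures_c = [400, 600, 800]
    return list(itertools.product(ratios, temperatures_c))

def run_campaign(synthesize, characterize, budget=12):
    """Autonomously run up to `budget` experiments, keeping the best result;
    `synthesize` and `characterize` stand in for robot and instrument calls."""
    best = None
    for ratio, temp in random.sample(propose_conditions(), budget):
        sample = synthesize(ratio=ratio, temperature_c=temp)   # robot step
        result = characterize(sample)                          # instrument step
        if best is None or result["conductivity"] > best["conductivity"]:
            best = {"ratio": ratio, "temperature_c": temp, **result}
    return best
```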
Creating a successful large-scale self-driving lab will require collaboration with private sector partners, particularly robot manufacturers and the creators of AI models for robotics. Fortunately, the United States has many such firms. Therefore, DOE should initiate a competitive bidding process for the robotic equipment that will be housed within its self-driving labs. Because DOE has experience in building lab facilities, it should directly oversee the construction of the self-driving lab itself.
The United States already has several small-scale self-driving labs, primarily led by investments at DOE National Labs. The small size of these projects, however, makes it difficult to achieve the economies of scale that are necessary for self-driving labs to become an enduring part of America’s scientific ecosystem.
AI creates additional opportunities to expand automated materials science. Frontier language and multi-modal models, such as OpenAI’s GPT-4o, Anthropic’s Claude 3.5, and Google’s Gemini family, have already been used to ideate scientific experiments, including directing a robotic lab in the fully autonomous synthesis of a known chemical compound. These models would not operate with full autonomy. Instead, scientists would direct the inquiry and the design of the experiment, with the models autonomously suggesting variables to tweak.
Modern frontier models have substantial knowledge in all fields of science, and can hold all of the academic literature relevant to a specific niche of materials science within their active attention. This combination means that they have—when paired with a trained human—the scientific intuition to iteratively tweak an experimental design. They can also write the code necessary to direct the robots in the self-driving lab. Finally, they can write summaries of the experimental results—including the failures. This is crucial, because, given the constraints on their time, scientists today often only report their successes in published writing. Yet failures are just as important to document publicly to avoid other scientists duplicating their efforts.
Once constructed, this self-driving lab infrastructure can be a resource made available as another DOE user facility to materials scientists across the country, much as DOE supercomputers are today. DOE already has a robust process and infrastructure in place to share in-demand resources among different scientists, again underscoring why the Department is well-positioned to lead this endeavor.
Conclusion
Taken together, materials science faces a grand challenge, yet an even grander opportunity. Room-temperature, ambient-pressure superconductors—permitted by the laws of physics but as-yet undiscovered—could transform consumer electronics, clean energy, transportation, and even space travel. New forms of magnets could enable a wide range of cutting-edge technologies, such as nuclear fusion reactors. High-performance ceramics could improve reusable rockets and hypersonic aircraft. The opportunities are limitless.
With a coordinated effort led by DOE, the federal government can demonstrate to Americans that scientific innovation and technological progress can still deliver profound improvements to daily life. It can pave the way for a new approach to science firmly rooted in modern technology, creating an example for other areas of science to follow. Perhaps most importantly, it can make Americans excited about the future—something that has been sorely lacking in American society in recent decades.
AI is a radically transformative technology. Contemplating that transformation in the abstract almost inevitably leads to anxiety and fear. There are legislative proposals, white papers, speeches, blog posts, and tweets about using AI to positive ends. Yet merely talking about positive uses of AI is insufficient: the technology is ready, and the opportunities are there. Now is the time to act.
This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.
PLEASE NOTE (February 2025): Since publication several government websites have been taken offline. We apologize for any broken links to once accessible public data.
Compared to “cloud labs” for biology and chemistry, the risks associated with self-driving labs for materials science are low. In a cloud lab equipped with nucleic acid synthesis machines, for example, genetic sequences need to be screened carefully to ensure that they are not dangerous pathogens—a nontrivial task. There are not analogous risks for most materials science applications.
However, given the dual-use nature of many novel materials, any self-driving lab would need to have strong cybersecurity and intellectual property protections. Scientists using self-driving lab facilities would need to be carefully screened by DOE – fortunately, DOE already has such screening infrastructure in place for determining access to its supercomputing facilities.
Not all materials involve easily repeatable, and hence automatable, experiments for synthesis and characterization. But many important classes of materials do, including:
- Thin films and coatings
- Photonic and optoelectronic materials such as perovskites (used for solar panels)
- Polymers and monomers
- Battery and energy storage materials
Over time, additional classes of materials can be added.
DOE can and should be creative and resourceful in finding additional resources beyond public funding for this project. Collaborations on both foundation AI models and scaling self-driving labs between DOE and private sector AI firms can be uniquely facilitated by DOE’s new Foundation for Energy Security and Innovation (FESI), a private foundation created by DOE to support scientific fellowships, public-private partnerships, and other key mission-related initiatives.
Some private firms have recently demonstrated the promise of this approach. In late 2023, Google DeepMind unveiled GNoME, a materials science model that identified thousands of new potential materials (though they still need to be experimentally validated). Microsoft’s GenMatter model pushed in a similar direction. Both models were developed in collaboration with DOE National Labs (Lawrence Berkeley in the case of DeepMind, and Pacific Northwest in the case of Microsoft).
America’s Teachers Innovate: A National Talent Surge for Teaching in the AI Era
Thanks to Melissa Moritz, Patricia Saenz-Armstrong, and Meghan Grady for their input on this memo.
Teaching our young children to be productive and engaged participants in our society and economy is, alongside national defense, the most essential job in our country. Yet the competitiveness and appeal of teaching in the United States has plummeted over the past decade. At least 55,000 teaching positions went unfilled this year, with long-term annual shortages set to double to 100,000 annually. Moreover, teachers have little confidence in their self-assessed ability to teach critical digital skills needed for an AI enabled future and in the profession at large. Efforts in economic peer countries such as Canada or China demonstrate that reversing this trend is feasible. The new Administration should announce a national talent surge to identify, scale, and recruit into innovative teacher preparation models, expand teacher leadership opportunities, and boost the profession’s prestige. “America’s Teachers Innovate” is an eight-part executive action plan to be coordinated by the White House Office of Science and Technology Policy (OSTP), with implementation support through GSA’s Challenge.Gov and accompanied by new competitive priorities in existing National Science Foundation (NSF), Department of Education (ED), Department of Labor (DoL), and Department of Defense education (DoDEA) programs.
Challenge and Opportunity
Artificial Intelligence may add an estimated $2.6 trillion to $4.4 trillion annually to the global economy. Yet, if the U.S. is not able to give its population the proper training to leverage these technologies effectively, the U.S. may witness a majority of this wealth flow to other countries over the next few decades while American workers are displaced, rather than empowered, by AI deployment within their sectors. The students who gain the digital, data, and AI foundations to work in tandem with these systems – currently only 5% of graduating high school students in the U.S. – will fare better in a modern job market than the majority who lack them. Across countries and communities alike, the AI skills gap will supercharge existing digital divides and dramatically compound economic inequality.
China, India, Germany, Canada, and the U.K. have all made investments to dramatically reshape the student experience for the world of AI and to train teachers to educate a modern, digitally prepared workforce. While the U.S. made early research and development investments in computer science and data science education through the National Science Foundation, we do not have a teacher workforce ready to implement these innovations in curriculum or educational technology. The number of individuals completing a teacher preparation program has fallen 25% over the past decade; long-term forecasts suggest annual shortages of at least 100,000 teachers; teachers themselves are discouraging others from joining the profession (especially in STEM); and preparing to teach digital skills such as computer science was the least popular option for prospective educators to pursue. In 2022, even Harvard discontinued its Undergraduate Teacher Education Program entirely, citing low interest and enrollment. There is still consistent evidence that young people and current professionals remain interested in teaching as a possible career, but only if we create the conditions to translate that interest into action. U.S. policymakers have a narrow window to leverage the strong interest in AI to energize the education workforce and ensure our future graduates are globally competitive for the digital frontier.
Plan of Action
America’s teaching profession needs a coordinated national strategy to reverse decades of decline and concurrently reinvigorate the sector for a new (and digital) industrial revolution now moving at an exponential pace. Key levers for this work include expanding the number of leadership opportunities for educators; identifying and scaling successful evidence-based models such as UTeach, residency-based programs, or National Writing Project’s peer-to-peer training sites; scaling registered apprenticeship programs or Grow Your Own programs along with the nation’s largest teacher colleges; and leveraging the platform of the President to boost recognition and prestige of the teaching profession.
The White House Office of Science and Technology Policy (OSTP) should coordinate a set of Executive Actions within the first 100 days of the next administration, including:
Recommendation 1. Launch a Grand Challenge for AI-Era Teacher Preparation
Create a national challenge via www.Challenge.Gov to identify the most innovative teacher recruitment, preparation, and training programs to prepare and retain educators for teaching in the era of AI. Challenge requirements should be minimal and flexible to encourage innovation, but could include the creation of teacher leadership opportunities, peer-network sites for professionals, and digital classroom resource exchanges. A challenge prompt could replicate the model of 100Kin10 or even leverage the existing network.
Recommendation 2. Update Areas of National Need
To enable existing scholarship programs to support AI readiness, the U.S. Department of Education should add “Artificial Intelligence,” “Data Science,” and “Machine Learning” to GAANN Areas of National Need under the Computer Science and Mathematics categories to expand eligibility for Masters-level scholarships for teachers to pursue additional study in these critical areas. The number of higher education programs in Data Science education has significantly increased in the past five years, with a small but increasing number of emerging Artificial Intelligence programs.
Recommendation 3. Expand and Simplify Key Programs for Technology-Focused Training
The President should direct the U.S. Secretary of Education, the National Science Foundation Director, and the Department of Defense Education Activity Director to add “Artificial Intelligence, Data Science, Computer Science” as competitive priorities where appropriate for existing grant or support programs that directly influence the national direction of teacher training and preparation, including the Teacher Quality Partnerships program (ED), SEED (ED), the Hawkins Program (ED), the STEM Corps (NSF), the Robert Noyce Scholarship Program (NSF), the DoDEA Professional Learning Division, and the Apprenticeship Building America grants from the U.S. Department of Labor. These terms could be added under prior “STEM” competitive priorities, such as those established for “Computer Science” by the STEM Education Acts of 2014 and 2015, and framed under “Digital Frontier Technologies.”
Additionally, the U.S. Department of Education should increase funding allocations for proposals at the ESSA “Demonstrates a Rationale” evidence tier, expanding the flexibility of existing grant programs to align with emerging technology proposals. Because AI systems update quickly, few applicants have the opportunity to conduct rigorous evaluation studies or randomized controlled trials (RCTs) within the timespan of an ED grant program application window.
Additionally, the National Science Foundation should relaunch the 2014 Application Burden Taskforce to identify the greatest barriers in NSF application processes, update digital review infrastructure, review or modernize application criteria to recognize present-day technology realities, and set a 2-year deadline for recommendations to be implemented agency-wide. This ensures earlier-stage projects and non-traditional applicants (e.g. nonprofits, local education agencies, individual schools) can realistically pursue NSF funding. Recommendations may include a “tiered” approach for requirements based on grant size or applying institution.
Recommendation 4. Convene 100 Teacher Prep Programs for Action
The White House Office of Science & Technology Policy (OSTP) should host a national convening of nationally representative colleges of education and teacher preparation programs to 1) catalyze modernization efforts of program experiences and training content, and 2) develop recruitment strategies to revitalize interest in the teaching profession. A White House summit would help call attention to falling enrollment in teacher preparation programs; highlight innovative training models to recruit and retrain additional graduates; and create a deadline for states, districts, and private philanthropy to invest in teacher preparation programs. By leveraging the convening power of the White House, the Administration could make a profound impact on the teacher preparation ecosystem.
The administration should also consider announcing additional incentives or planning grants for regional or state-level teams in 1) catalyzing K-12 educator Registered Apprenticeship Program (RAPs) applications to the Department of Labor and 2) enabling teacher preparation program modernization for incorporating introductory computer science, data science, artificial intelligence, cybersecurity, and other “digital frontier skills,” via the grant programs in Recommendation 3 or via expanded eligibility for the Higher Education Act.
Recommendation 5. Launch a Digital “White House Data Science Fair”
Despite a bipartisan commitment to continue the annual White House Science Fair, the tradition ended in 2017. OSTP and the Committee on Science, Technology, Engineering, and Mathematics Education (CoSTEM) should resume the White House Science Fair and add a national “White House Data Science Fair,” a digital rendition of the Fair for the AI era. K-12 and undergraduate student teams would have the opportunity to submit creative or customized applications of AI tools, machine-learning projects (similar to Kaggle competitions), applications of robotics, and data analysis projects centered on their own communities or on global problems (climate change, global poverty, housing, etc.), under the mentorship of K-12 teachers. Similar to the original White House Science Fair, this recognition could draw from existing student competitions that have arisen over the past few years, including in Cleveland, Seattle, and nationally via AP courses and out-of-school contexts. Partner Federal agencies should be encouraged to contribute their own educational resources and datasets through FC-STEM coordination, enabling students to work on a variety of topics across domains and interests (e.g., NASA, the U.S. Census Bureau, the Bureau of Labor Statistics, etc.).
Recommendation 6. Announce a National Teacher Talent Surge at the State of the Union
The President should launch a national teacher talent surge under the banner of “America’s Teachers Innovate,” a multi-agency communications campaign to reinvigorate the teaching profession and increase the number of teachers completing undergraduate or graduate degrees each year by 100,000. This announcement would follow the First 100 Days in office, allowing Recommendations 1-5 to be implemented and/or planned. The “America’s Teachers Innovate” campaign would include:
- A national commitments campaign for investing in the future of American teaching, facilitated by the White House and involving State Education Agencies (SEAs) and Governors, the 100 largest school districts, industry, and philanthropy. Many U.S. education organizations are ready to take action. Commitments could include targeted scholarships to incentivize students to enter the profession, new grant programs for summer professional learning, and restructuring teacher payroll into salaried annual jobs instead of nine-month compensation (see Discover Bank: “Surviving the Summer Paycheck Gap”).
- Expansion of the Presidential Awards for Excellence in Mathematics and Science Teaching (PAEMST) program to include Data Science, Cybersecurity, AI, and other emerging technology areas, or a renaming of the program for wider eligibility across today’s STEM umbrella. Additionally, the PAEMST program should resume in-person award ceremonies beyond existing press releases; these ceremonies were discontinued during COVID disruptions and have not been offered since. Several national STEM organizations and teacher associations have requested that these events return.
- Student loan relief through the Teacher Loan Forgiveness (TLF) program for teachers who commit to five or more years in the classroom. New research suggests the lifetime return on a college degree for education majors is near zero, higher only than a degree in Fine Arts. The administration should, via executive order, add “computer science, data science, and artificial intelligence” to the list of subjects under which “highly qualified teachers” receive $17,500 in loan forgiveness.
- An annual recruitment drive at college campus job fairs, facilitated directly under the banner of the White House Office of Science & Technology Policy (OSTP), to grow awareness of the aforementioned programs among undergraduate students at formative career choice-points.
Recommendation 7. Direct IES and BLS to Support Teacher Shortage Forecasting Infrastructure
The IES Commissioner and BLS Commissioner should 1) establish a special joint task force to better link existing Federal data across agencies and enable cross-state collaboration on the teacher workforce, 2) support state capacity-building for interoperable teacher workforce data systems through competitive grant priorities in the State Longitudinal Data Systems (SLDS) program at IES and the Apprenticeship Building America (ABA) Program (Category 1 grants), and 3) recommend a review criteria question on education workforce data and forecasting in future EDA Tech Hub phases. The vast majority of states do not currently have adequate data systems in place to track total demand (teacher vacancies), likely supply (teachers completing preparation programs), and retention and mobility (teachers leaving the profession or relocating) based on near- or real-time information. Even creating the estimates for this memo was challenging and subject to uncertainty. Without this visibility into the nuances of teacher supply, demand, and retention, school systems cannot accurately forecast and strategically fill classrooms.

Recommendation 8. Direct the NSF to Expand Focus on Translating Evidence on AI Teaching to Schools and Districts.
The NSF Discovery Research PreK-12 Program Resource Center on Transformative Education Research and Translation (DRK-12 RC) program is intended to select intellectual partners as NSF seeks to enhance the overall influence and reach of the DRK-12 Program’s research and development investments. The DRK-12 RC program could be used to work with multi-sector constituencies to accelerate the identification and scaling of evidence-based practices for AI, data science, computer science, and other emerging tech fields. Currently, the program anticipates making only a single DRK-12 RC award; the program should be scaled to establish at least three centers, one each for AI, data science, and computer science, to ensure digitally powered STEM education for all students.
Conclusion
China ranked #1 in the most recent Global Teacher Status Index, which measures the prestige, respect, and attractiveness of the teaching profession in a given country; the United States, meanwhile, ranked just below Panama. The speed of AI means educational investments made by other countries have an exponential impact, and any misstep can place the United States far behind – if we aren’t already. Emerging digital threats from other major powers, the increasing fluidity of talent and labor, and a remote-work economy make our education system the primary lever for keeping America competitive in a fast-changing global environment. The time is ripe for a new Nation at Risk-level effort, if not action on the scale of the original National Defense Education Act of 1958 or the more recent America COMPETES Act. The next administration should take decisive action to rebuild our country’s teacher workforce and prepare our students for a future that may look very different from our current one.
This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.
PLEASE NOTE (February 2025): Since publication several government websites have been taken offline. We apologize for any broken links to once accessible public data.
This memo was developed in partnership with the Alliance for Learning Innovation, a coalition dedicated to advocating for building a better research and development infrastructure in education for the benefit of all students. Read more education R&D memos developed in partnership with ALI here.
Approximately 100,000 more teachers are needed per year. The U.S. has 3.2 million public school teachers and 0.5 million private school teachers (NCES, 2022). According to U.S. Department of Education data, 8% of public and 12% of private school teachers exit the profession each year (-316,000), a rate that has remained relatively steady since 2012, while long-term estimates of re-entry continue to hover near 20% (+63,000). Unfortunately, the number of new teachers completing either traditional or alternative preparation programs has steadily declined over the past decade to 159,000+ per year. As a result of this gap, active vacancies continue to increase each year, and more than 270,000 educators are now cumulatively underqualified for their current roles, presumably filling in for the absences caused by the widening gap. These predictions were made as early as 2016 (p. 2) and now appear to have become a reality. Absent any changes, the total shortage of vacant or underqualified teaching positions could reach a deficit of between 700,000 and 1,000,000 by 2035.
The above shortage estimate assumes a base of 50,000 vacancies and 270,000 underqualified teachers as of the most recent available data, and a net flow of -94,000 in 2023-2024 (entries minus exits annually, including re-entrants). The range incorporates uncertainty around a slight (3%-5%) annual improvement in preparation over the status quo, driven by the growth of alternative licensure pathways such as Grow Your Own and apprenticeship programs, through 2035. For the exit rate, the most conservative estimates suggest 5% and the highest estimates reach 50%; assembled state-level data, however, suggest a 7.9% exit rate, similar to the NCES estimate (8%). Population forecasts for K-12 students (individuals aged 14-17) imply slight declines by 2035, based on U.S. Census estimates. Taken together, more optimistic assumptions result in a net cumulative shortage closer to 700,000 teachers, while worst-case estimates may exceed 1,000,000.
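For transparency, the sketch below reconstructs the headline arithmetic under simplified assumptions drawn from the figures above. It illustrates how the projected range arises; it is not the authors' exact forecasting model, and it omits changes in re-entry and enrollment-driven shifts in demand.

```python
# An illustrative reconstruction of the arithmetic above; the projection is a
# simplified sketch of the memo's stated assumptions, not an official model.
public_teachers, private_teachers = 3_200_000, 500_000
exits = 0.08 * public_teachers + 0.12 * private_teachers   # ~316,000 leave each year
reentrants = 0.20 * exits                                   # ~63,000 return
completers = 159_000                                        # finish preparation programs
print(f"Net annual flow: {completers + reentrants - exits:,.0f}")   # ~ -94,000

backlog = 50_000 + 270_000        # current vacancies + underqualified positions
for growth in (0.05, 0.03):       # assumed annual growth in program completers
    gap, c = backlog, completers
    for _ in range(2035 - 2024):
        gap += exits - reentrants - c
        c *= 1 + growth
    print(f"{growth:.0%} completer growth -> ~{round(gap, -4):,.0f} shortage by 2035")
```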
Early versions of AI-powered tutoring have significant promise but have not yet lived up to expectations. Automated tutors have produced frustrating experiences for users, led students to perform worse on tests than peers who used no outside support, and have yet to successfully integrate other school subject areas (such as mathematics). We should expect AI tools to improve over time and become more additive for learning specific concepts, particularly repetitive or generalizable tasks requiring frequent practice, such as sentence writing or paragraph structure, which has the potential to make classroom time more useful and higher-impact. However, AI will struggle to replace other critical classroom needs for elementary- and middle-school-aged children, including classroom behavioral management, social motivation to learn, mentorship relationships, facilitating collaboration between students for project-based learning, and improving quality of work beyond accuracy or pre-prompted, rubric-based scoring. Teachers consistently report student interest as a top barrier to continued learning; digital curricula and AI automation may sustain that interest effectively for a short period, but cannot do so for the full twelve-year duration of a student’s K-12 experience.
These proposed executive actions complement a bipartisan legislative proposal, “A National Training Program for AI-Ready Students,” which would invest in a national network of training sites for in-service teachers, provide grant dollars to support the expansion of teacher preparation programs, and help reset teacher payroll structures from 9 months to 12 months. Either proposal can be implemented independently of the other, but they are stronger together.
Three Artificial Intelligence Bills Endorsed by Federation of American Scientists Advance from the House Committee
Proposed bills advance research ecosystems, economic development, and education access and move now to the U.S. House of Representatives for a vote
Washington, D.C. – September 12, 2024 – Three proposed artificial intelligence bills endorsed by the Federation of American Scientists (FAS), a nonpartisan science think tank, advanced out of a House Science, Space, and Technology Committee markup held on September 11, 2024. The bills received bipartisan support and will now be reported to the full chamber. The three bills are: H.R. 9403, the Expanding AI Voices Act, co-sponsored by Rep. Vince Fong (CA-20) and Rep. Andrea Salinas (OR-06); H.R. 9197, the Small Business AI Act, co-sponsored by Rep. Mike Collins (GA-10) and Rep. Haley Stevens (MI-11); and the National Science Foundation Artificial Intelligence Education Act (NSF AI Education Act), co-sponsored by Rep. Valerie Foushee (NC-04) and Rep. Frank Lucas (OK-03).
“FAS endorsed these bills based on the evaluation of their strengths. Among these are the development of infrastructure to develop AI safely and responsibly; the deployment of resources to ensure development benefits more equitably across our economy; and investment in the talent pool necessary for this consequential, emerging technology,” says Dan Correa, CEO of FAS.
“These three bills pave a vision for the equitable and safe use of AI in the U.S. Both the Expanding AI Voices Act and the NSF AI Education Act will create opportunities for underrepresented voices to have a say in how AI is developed and deployed. Additionally, the Small Business AI Act will ensure that an important sector of our society feels empowered to use AI safely and securely,” says Clara Langevin, FAS AI Policy Specialist.
Expanding AI Voices Act
The Expanding AI Voices Act will support a broad and diverse interdisciplinary research community for the advancement of artificial intelligence and AI-powered innovation through partnerships and capacity building at certain institutions of higher education to expand AI capacity in populations historically underrepresented in STEM.
Specifically, the Expanding AI Voices Act of 2024 will:
- Codify and expand the ExpandAI program at the National Science Foundation (NSF), which supports artificial intelligence (AI) capacity-building projects for eligible entities including Minority Serving Institutions (MSIs), Historically Black Colleges and Universities (HBCUs), and Tribal Colleges and Universities (TCUs).
- Broaden the ExpandAI program in scope and types of activities it supports to further build and enhance partnerships between eligible entities and awardees of the National AI Research Institutes ecosystem to broaden AI research and development.
- Direct the National Science Foundation to engage in outreach to increase its pool of applications and address common barriers that prevent these organizations from submitting an application.
Small Business AI Act
Emerging science is central to new and established small businesses, across industries and around the country. This bill will require the Director of the National Institute of Standards and Technology (NIST) to develop resources for small businesses in utilizing artificial intelligence, and for other purposes.
- This bill amends the NIST Organic Act, as amended by the National AI Initiative Act, and directs NIST, in coordination with the Small Business Administration, to consider the needs of America’s small businesses and develop AI resources for best practices, case studies, benchmarks, methodologies, procedures, and processes for small businesses to understand, apply, and integrate AI systems.
- It will connect Small Businesses with existing Federal educational resources, such as the risk management framework and activities from the national cybersecurity awareness and education program under the Cybersecurity Enhancement Act of 2014.
- This bill aligns with FAS’s mission to broaden AI use and access as a catalyst for economic development.
National Science Foundation Artificial Intelligence Education Act of 2024 (NSF AI Education Act)
The NSF AI Education Act builds on the National Artificial Intelligence Initiative Act of 2020 (15 U.S.C. 9451) to bolster educational skills in AI through new learning initiatives and workforce training programs. Specifically, the bill will:
- Allow NSF to award AI scholarships in critical sectors such as education, agriculture and advanced manufacturing.
- Authorize the NSF to conduct outreach and encourage applications from rural institutions, Tribal Colleges and Universities, and institutions located in Established Program to Stimulate Competitive Research (EPSCoR) jurisdictions to promote research competitiveness.
- Award fellowships for teachers, school counselors, and other school professionals to attend professional development programs, providing skills and training, in collaboration with industry partners, on the teaching and application of artificial intelligence in K-12 settings.
- This bill aligns with FAS’s commitment to STEM education and equity as powerful levers for our nation to compete on the global stage.
###
ABOUT FAS
The Federation of American Scientists (FAS) works to advance progress on a broad suite of contemporary issues where science, technology, and innovation policy can deliver dramatic progress, and seeks to ensure that scientific and technical expertise have a seat at the policymaking table. Established in 1945 by scientists in response to the atomic bomb, FAS continues to work on behalf of a safer, more equitable, and more peaceful world. More information at fas.org.
GenAI in Education Research Accelerator (GenAiRA)
The United States faces a critical challenge in addressing the persistent learning opportunity gaps in math and reading, particularly among disadvantaged student subgroups. According to the 2022 National Assessment of Educational Progress (NAEP) data, only 37% of fourth-grade students performed at or above the proficient level in math, and 33% in reading. The rapid advancement of generative AI (GenAI) technologies presents an unprecedented opportunity to bridge these gaps by providing personalized learning experiences and targeted support. However, the current mismatch between the speed of GenAI innovation and the lengthy traditional research pathways hinders the thorough evaluation of these technologies before widespread adoption, potentially leading to unintended negative consequences.
Failure to adapt our research and regulatory processes to keep pace with the development of GenAI technologies could expose students to ineffective or harmful educational tools, exacerbate existing inequities, and hinder our ability to prepare all students for success in an increasingly complex and technology-driven world. The education sector must act with urgency to establish the necessary infrastructure, expertise, and collaborative partnerships to ensure that GenAI-powered tools are rigorously evaluated, continuously improved, and equitably implemented to benefit all students.
To address this challenge, we propose three key recommendations for congressional action:
- Establish the GenAI in Education Research Accelerator Program (GenAiRA) within the Institute of Education Sciences (IES) to support and expedite efficacy research on GenAI-powered educational tools.
- Adapt IES research and evaluation processes to create a framework for the rapid assessment of GenAI-enabled educational technology, including alternative research designs and evidence standards.
- Support the establishment of a GenAI Education Research and Innovation Consortium, bringing together schools, researchers, and education technology (EdTech) developers to participate in rapid cycle studies and continuous improvement of GenAI tools.
By implementing these recommendations, Congress can foster a more responsive and evidence-based ecosystem for GenAI-powered educational tools, ensuring that they are equitable, effective, and safe for all students. This comprehensive approach will help unlock the transformative potential of GenAI to address persistent learning opportunity gaps and improve outcomes for all learners, while maintaining scientific rigor and prioritizing student well-being.
During the preparation of this work, the authors used the tool Claude 3 Opus (by Anthropic) to help clarify, synthesize, and add accessible language around concepts and ideas generated by members of the team. The authors reviewed and edited the content as needed and take full responsibility for the content of this publication.
Challenge and Opportunity
Widening Learning Opportunity Gap
NAEP data reveals that many U.S. students, especially those from disadvantaged subgroups, are not achieving proficiency in math and reading. In 2022, only 37% of fourth-graders performed at or above the NAEP proficient level in math, and 33% in reading—the lowest levels in over a decade. Disparities are more profound when disaggregated by race, ethnicity, and socioeconomic status; for example, only 17% of Black students and 21% of Hispanic students reached reading proficiency, compared to 42% of white students.
Rapid AI Evolution
GenAI is a transformative technology that enables rapid development and personalization of educational content and tools, addressing unmet needs in education such as limited resources, limited 1:1 teaching time, and uneven teacher quality. However, that rapid pace also raises concerns about premature adoption of unvetted tools, which could negatively impact students’ educational achievement. Unvetted GenAI tools may introduce misconceptions, provide incorrect guidance, or be misaligned with curriculum standards, leading to gaps in students’ understanding of foundational concepts. If used for an extended period, particularly with vulnerable learners, these tools could have a long-term impact on learning foundations that may be difficult to remedy.
On the other hand, carefully designed, trained, and vetted GenAI models that have undergone rapid cycle studies and design iterations based on data have the potential to effectively address students’ misconceptions, build solid learning foundations, and provide personalized, adaptive support to learners. These tools could accelerate progress and close learning opportunity gaps at an unprecedented scale.
Slow Vetting Processes
The rapid pace of AI development poses significant challenges for traditional research and evaluation processes in education. Efficacy research, particularly studies sponsored by the IES or other Department of Education entities, is a lengthy, resource-intensive, and often onerous process that can take years to complete. Randomized controlled trials and longitudinal studies struggle to keep up with the speed of AI innovation: by the time a study is completed, the AI-powered tool may have already undergone multiple iterations or been replaced.
It can be difficult to recruit and sustain school and teacher participation in efficacy research due to the significant time and effort required from educators. Moreover, obtaining certifications and approvals for research can be complex and time-consuming, as researchers must navigate institutional review boards, data privacy regulations, and ethical guidelines, which can delay the start of a study by months or even years.
Many EdTech developers find themselves in a catch-22 situation, where their products are already being adopted by schools and educators, yet they are simultaneously expected to participate in lengthy and expensive research studies to prove efficacy. The time and resources required to engage in such research can be a significant burden for EdTech companies, especially start-ups and small businesses, which may prefer to focus on iterating and improving their products based on real-world feedback. As a result, many EdTech developers may be reluctant to participate in traditional efficacy research, further exacerbating the disconnect between the rapid pace of AI innovation and the slow process of evaluating the effectiveness of these tools in educational settings.
Gaps in Existing Efforts and Programs
While federal initiatives like SEERNet and ExpandAI have made strides in supporting AI and education research and development, they may not be fully equipped to address the specific challenges and opportunities presented by GenAI for several reasons:
- GenAI has the ability to generate novel content and interact with users in unpredictable and personalized ways.
- GenAI-powered educational technologies involve unique considerations in terms of data training, prompt engineering, and output evaluation, especially when considering the developmental stages of PreK-12 students.
- GenAI raises specific ethical concerns, such as the potential for biased or inappropriate content generation, ensuring the accuracy and quality of generated responses, and protecting student privacy and agency.
- GenAI is evolving at an unprecedented pace.
Traditional approaches to efficacy research and evaluation may not be well-suited to evaluating the potential benefits and outcomes associated with GenAI-powered tools in the short term, particularly when assessing whether a program shows enough promise to warrant wider deployment with students.
A New Approach
To address these challenges and bridge the gap between GenAI innovation and efficacy research, we need a new approach to streamline the research process, reduce the burden on educators and schools, and provide timely and actionable insights into the effectiveness of GenAI-powered tools. This may involve alternative study designs, such as rapid cycle evaluations or single-case research, and developing new incentive structures and support systems to encourage and facilitate the participation of teachers, schools, and product developers in research studies.
GenAiRA aims to tackle these challenges by providing resources, guidance, and infrastructure to support more agile and responsive efficacy research in the education sciences. By fostering collaboration among researchers, developers, and educators, and promoting innovative approaches to evaluation, this program can help ensure that the development and adoption of AI-powered tools in education are guided by rigorous, timely, and actionable evidence—while simultaneously mitigating risks to students.
Learning from Other Sectors
Valuable lessons can be drawn from other fields that have faced similar balancing acts between innovation, research, and safety. Two notable examples are the U.S. Food and Drug Administration’s (FDA) expedited review pathways for drug development and the National Institutes of Health’s (NIH) Clinical and Translational Science Awards (CTSA) program for accelerating medical research.
Example 1: The FDA Model
The FDA’s expedited review programs, such as Fast Track, Breakthrough Therapy, Accelerated Approval, and Priority Review, are designed to speed up the development and approval of drugs that address unmet medical needs or provide significant improvements over existing treatments. These pathways recognize that, in certain cases, the benefits of bringing a potentially life-saving drug to market quickly may outweigh the risks associated with a more limited evidence base at the time of approval.
Key features include:
- Early and frequent communication between the FDA and drug developers to provide guidance and feedback throughout the development process.
- Flexibility in clinical trial design and evidence requirements, such as allowing the use of surrogate endpoints or single-arm studies in certain cases.
- Rolling review of application materials, allowing drug developers to submit portions of their application as they become available rather than waiting for the entire package to be complete.
- Shortened review timelines, with the FDA committing to reviewing and making a decision on an application within a specified timeframe (e.g., six months for Priority Review).
These features can accelerate the development and approval process while still ensuring that drugs meet standards for safety and effectiveness. They also acknowledge that the evidence base for a drug may evolve over time, with post-approval studies and monitoring playing a crucial role in confirming the drug’s benefits and identifying any rare or long-term side effects.
Example 2: The CTSA Program
The NIH’s CTSA program established a national network of academic medical centers, research institutions, and community partners to accelerate the translation of research findings into clinical practice and improve patient outcomes.
Key features include:
- Collaborative research infrastructure, consisting of a network of institutions and partners that work together to conduct translational research, share resources and expertise, and disseminate best practices.
- Streamlined research processes with standardized protocols, templates, and tools to facilitate the rapid design, approval, and implementation of research studies across the network.
- Training and development of researchers and clinicians to build a workforce equipped to conduct innovative and rigorous translational research.
- Community engagement in the research process to ensure that studies are responsive to real-world needs and priorities.
By learning from the successes and principles of the FDA’s expedited review pathways and the NIH’s CTSA program, the education sector can develop its own innovative approach to accelerating the responsible development, evaluation, and deployment of GenAI-powered tools, as outlined in the following plan of action.
Plan of Action
To address the challenges and opportunities presented by GenAI in education, we propose the following three key recommendations for congressional action and the evolution of existing programs.
Recommendation 1. Establish the GenAI in Education Research Accelerator Program (GenAiRA).
Congress should establish the GenAiRA, housed in the IES, to support and expedite efficacy research on GenAI-powered educational tools and programs. This program will:
- Provide funding and resources to researchers and educators to conduct rigorous, timely, and cost-effective efficacy studies on promising AI-based solutions that address achievement gaps.
- Create guidelines and offer webinars and technical assistance to researchers, educators, and developers to build expertise in the responsible design, implementation, and evaluation of GenAI-powered tools in education.
- Foster collaboration and knowledge-sharing among researchers, educators, and GenAI developers to facilitate the rapid translation of research findings into practice and continuously improve GenAI-powered tools.
- Develop and disseminate best practices, guidelines, and ethical frameworks for responsible development and deployment of GenAI-enabled educational technology tools in educational settings, focusing on addressing bias, accuracy, privacy, and student agency issues.
Recommendation 2. Under the auspices of GenAiRA, adapt IES research and evaluation processes to create a framework to evaluate GenAI-enabled educational technology.
In consultation with experts in educational research and AI, IES will develop a framework that:
- Identifies existing research designs and creates alternative research designs (e.g., quasi-experimental studies, rapid short evaluations) suitable for generating credible evidence of effectiveness while being more responsive to the rapid pace of AI innovation.
- Establishes evidence-quality guidelines for rapid evaluations, including minimum sample sizes, study duration, effect sizes, and target populations (an illustrative sample-size calculation follows this list).
- Funds replication studies and expansion studies to determine impact in different contexts or with different populations (e.g., students with IEPs and English learners).
- Provides guidance to districts on how to interpret and apply evidence from different types of studies to inform decision-making around adopting and using AI technologies in education.
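For illustration, here is the kind of calculation a minimum-sample-size guideline implies. This is only a sketch, assuming the open-source statsmodels library; the smallest effect of interest (Cohen's d = 0.2), the significance level, and the power target are hypothetical thresholds, not figures from this memo.

```python
# Illustrative power analysis for a rapid evaluation (assumes statsmodels).
from statsmodels.stats.power import TTestIndPower

power_analysis = TTestIndPower()
students_per_group = power_analysis.solve_power(
    effect_size=0.2,  # smallest effect (Cohen's d) the study should reliably detect
    alpha=0.05,       # significance level
    power=0.8,        # probability of detecting the effect if it exists
    ratio=1.0,        # equal treatment and comparison group sizes
)
print(students_per_group)  # roughly 390-400 students per group
```

Guidelines of this kind would let districts and developers see, before a rapid cycle study begins, whether a proposed sample is large enough to produce credible evidence.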
Recommendation 3. Establish a GenAI Education Research and Innovation Consortium.
Congress should provide funding and incentives for IES to establish a GenAI Education Research and Innovation Consortium that brings together a network of “innovation schools,” research institutions, and EdTech developers committed to participating in rapid cycle studies and continuous improvement of GenAI tools in education. This approach will ensure that AI tools are developed and implemented in a way that is responsive to the needs and values of educators, students, and communities.
To support this consortium, Congress should:
- Allocate funds for the IES to provide grants and resources to schools, research institutions, and EdTech developers that meet established criteria for participation in the consortium, such as demonstrated commitment to innovation, research capacity, and ethical standards.
- Direct IES to work with programs like SEERNet and ExpandAI to identify and match potential consortium members, provide guidance and oversight to ensure that research studies meet rigorous standards for quality and ethics, and disseminate findings and best practices to the broader education community.
- Encourage the development of standardized protocols and templates for data sharing, privacy protection, and informed consent within the consortium, to reduce the time and effort required for each individual study and streamline administrative processes.
- Incentivize participation in the consortium by offering resources and support for schools, researchers, and developers, such as access to funding opportunities, technical assistance, and professional development resources.
- Require the establishment of a central repository of research findings and best practices generated through rapid cycle evaluations conducted within the consortium, to facilitate the broader dissemination and adoption of effective GenAI-powered tools.
Conclusion
Persistent learning opportunity gaps in math and reading, particularly among disadvantaged students, are a systemic challenge requiring innovative solutions. GenAI-powered educational tools offer potential for personalizing learning, identifying misconceptions, and providing tailored support. However, the mismatch between the pace of GenAI innovation and lengthy traditional research pathways impedes thorough vetting of these technologies to ensure they are equitable, effective, and safe before widespread adoption.
GenAiRA and development of alternative research frameworks provide a comprehensive approach to bridge the divide between GenAI’s rapid progress and the need for thorough evaluation in education. Leveraging existing partnerships, research infrastructure, and data sources can expedite the research process while maintaining scientific rigor and prioritizing student well-being.
The plan of action creates a roadmap for responsibly harnessing GenAI’s potential in education. Identifying appropriate congressional mechanisms for establishing the accelerator program, such as creating a new bill or incorporating language into upcoming legislation, can ensure this critical initiative receives necessary funding and oversight.
This comprehensive strategy charts a path toward equitable, personalized learning facilitated by GenAI while upholding the highest standards of evidence. Aligning GenAI innovation with rigorous research and prioritizing the needs of underserved student populations can unlock the transformative potential of these technologies to address persistent achievement gaps and improve outcomes for all learners.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
Implementing AI and GenAI-powered educational tools without sufficient evidence of their effectiveness or safety could lead to the widespread use of ineffective interventions. If these tools fail to improve student outcomes or even hinder learning progress, they can have long-lasting negative consequences for students’ academic attainment and self-perception as learners.
When students are exposed to ineffective educational tools, they may struggle to grasp key concepts, leading to gaps in their knowledge and skills. Over time, these gaps can compound, leaving students ill-prepared for future learning challenges and limiting their academic and career opportunities. Moreover, repeated experiences of frustration and failure with educational technologies can erode students’ confidence, motivation, and engagement with learning.
This erosion of learner identity can be particularly damaging for students from disadvantaged backgrounds, who may already face additional barriers to academic success. If AI-powered tools fail to provide effective support and personalization, these students may fall even further behind their peers, exacerbating existing educational inequities.
A Safe Harbor for AI Researchers: Promoting Safety and Trustworthiness Through Good-Faith Research
Artificial intelligence (AI) companies disincentivize safety research by implicitly threatening to ban independent researchers who demonstrate safety flaws in their systems. While Congress encourages companies to provide bug bounties and protections for security research, this is not yet the case for AI safety research. Without independent research, we do not know if the AI systems that are being deployed today are safe or if they pose widespread risks that have yet to be discovered, including risks to U.S. national security. While companies conduct adversarial testing in advance of deploying generative AI models, they fail to adequately test their models after they are deployed as part of an evolving product or service. Therefore, Congress should promote the safety and trustworthiness of AI systems by establishing bug bounties for AI safety via the Chief Digital and Artificial Intelligence Office and creating a safe harbor for research on generative AI platforms as part of the Platform Accountability and Transparency Act.
Challenge and Opportunity
In July 2023, the world’s top AI companies signed voluntary commitments at the White House, pledging to “incent third-party discovery and reporting of issues and vulnerabilities.” Almost a year later, few of the signatories have lived up to this commitment. While some companies do reward researchers for finding security flaws in their AI systems, few companies strongly encourage research on safety or provide concrete protections for good-faith research practices. Instead, leading generative AI companies’ Terms of Service legally prohibit safety and trustworthiness research, in effect threatening anyone who conducts such research with bans from their platforms or even legal action.
In March 2024, over 350 leading AI researchers and advocates signed an open letter calling for “a safe harbor for independent AI evaluation.” The researchers noted that generative AI companies offer no legal protections for independent safety researchers, even though this research is critical to identifying safety issues in AI models and systems. The letter stated: “whereas security research on traditional software has established voluntary protections from companies (‘safe harbors’), clear norms from vulnerability disclosure policies, and legal protections from the DOJ, trustworthiness and safety research on AI systems has few such protections.”
In the months since the letter was released, companies have continued to be opaque about key aspects of their most powerful AI systems, such as the data used to build their models. If a researcher wants to test whether AI systems like ChatGPT, Claude, or Gemini can be jailbroken such that they pose a threat to U.S. national security, they are not allowed to do so as companies proscribe such research. Developers of generative AI models tout the safety of their systems based on internal red-teaming, but there is no way for the federal government or independent researchers to validate these results, as companies do not release reproducible evaluations.
Generative AI companies also impose barriers on their platforms that limit good-faith research. Unlike much of the web, the content on generative AI platforms is not publicly available, meaning that users need accounts to access AI-generated content and these accounts can be restricted by the company that owns the platform. In addition, companies like Google, Amazon, Microsoft, and OpenAI block certain requests that users might make of their AI models and limit the functionality of their models to prevent researchers from unearthing issues related to safety or trustworthiness.
Similar issues plague social media, as companies take steps to prevent researchers and journalists from conducting investigations on their platforms. Social media researchers face liability under the Computer Fraud and Abuse Act and Section 1201 of the Digital Millennium Copyright Act among other laws, which has had a chilling effect on such research and worsened the spread of misinformation online. The stakes are even higher for AI, which has the potential not only to turbocharge misinformation but also to provide U.S. adversaries like China and Russia with material strategic advantages. While legislation like the Platform Accountability and Transparency Act would enable research on recommendation algorithms, proposals that grant researchers access to platform data do not consider generative AI platforms to be in scope.
Congress can safeguard U.S. national security by promoting independent AI safety research. Conducting pre-deployment risk assessments is insufficient in a world where tens of millions of Americans are using generative AI—we need real-time assessments of the risks posed by AI systems after they are deployed as well. Big Tech should not be taken at its word when it says that its AI systems cannot be used by malicious actors to generate malware or spy on Americans. The best way to ensure the safety of generative AI systems is to empower the thousands of cutting-edge researchers at U.S. universities who are eager to stress test these systems. Especially for general-purpose technologies, small corporate safety teams are not sufficient to evaluate the full range of potential risks, whereas the independent research community can do so thoroughly.

Figure 1. What access protections do AI companies provide for independent safety research? Source: Longpre et al., “A Safe Harbor for AI Evaluation and Red Teaming.”
Plan of Action
Congress should enable independent AI safety and trustworthiness researchers by adopting two new policies. First, Congress should incentivize AI safety research by creating algorithmic bug bounties for this kind of work. AI companies often do not incentivize research that could reveal safety flaws in their systems, even though the government will be a major client for these systems. Even small incentives can go a long way, as there are thousands of AI researchers capable of demonstrating such flaws. This would also entail establishing mechanisms through which safety flaws or vulnerabilities in AI models can be disclosed, akin to a help line for AI systems.
Second, Congress should require AI platform companies, such as Google, Amazon, Microsoft, and OpenAI to share data with researchers regarding their AI systems. As with social media platforms, generative AI platforms mediate the behavior of millions of people through the algorithms they produce and the decisions they enable. Companies that operate application programming interfaces used by tens of thousands of enterprises should share basic information about their platforms with researchers to facilitate external oversight of these consequential technologies.
Taken together, vulnerability disclosure incentivized through algorithmic bug bounties and protections for researchers enabled by safe harbors would substantially improve the safety and trustworthiness of generative AI systems. Congress should prioritize mitigating the risks of generative AI systems and protecting the researchers who expose them.
Recommendation 1. Establish algorithmic bug bounties for AI safety.
As part of the FY2024 National Defense Authorization Act (NDAA), Congress established “Artificial Intelligence Bug Bounty Programs” requiring that within 180 days “the Chief Digital and Artificial Intelligence Officer of the Department of Defense shall develop a bug bounty program for foundational artificial intelligence models being integrated into the missions and operations of the Department of Defense.” However, these bug bounties extend only to security vulnerabilities. In the FY2025 NDAA, this bug bounty program should be expanded to include AI safety. See below for draft legislative language to this effect.
Recommendation 2. Create legal protections for AI researchers.
Section 9 of the proposed Platform Accountability and Transparency Act (PATA) would establish a “safe harbor for research on social media platforms.” This likely excludes major generative AI platforms such as Google Cloud, Amazon Web Services, Microsoft Azure, and OpenAI’s API, meaning that researchers have no legal protections when conducting safety research on generative AI models via these platforms. PATA and other legislative proposals related to AI should incorporate a safe harbor for research on generative AI platforms.
Conclusion
The need for independent AI evaluation has garnered significant support from academics, journalists, and civil society. Safe harbor for AI safety and trustworthiness researchers is a minimum fundamental protection against the risks posed by generative AI systems, including risks related to national security. Congress has an important opportunity to act before it’s too late.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
The authors of this memorandum as well as the academic paper underlying it submitted a comment to the Copyright Office in support of an exemption to DMCA for AI safety and trustworthiness research. The Computer Crime and Intellectual Property Section of the U.S. Department of Justice’s Criminal Division and Senator Mark Warner have also endorsed such an exemption. However, a DMCA exemption regarding research on AI bias, trustworthiness, and safety alone would not be sufficient to assuage the concerns of AI researchers, as they may still face liability under other statutes such as the Computer Fraud and Abuse Act.
Much of this research is currently conducted by research labs with direct connections to the AI companies they are assessing. Researchers who are less well connected, of which there are thousands, may be unwilling to take the legal or personal risk of violating companies’ Terms of Service. See our academic paper on this topic for further details on this and other questions.
See draft legislative language below, building on Sec. 1542 of the FY2024 NDAA:
SEC. X. EXPANSION OF ARTIFICIAL INTELLIGENCE BUG BOUNTY PROGRAMS.
(a) Update to Program for Foundational Artificial Intelligence Products Being Integrated Within Department of Defense.—
(1) Development required.—Not later than 180 days after the date of the enactment of this Act and subject to the availability of appropriations, the Chief Digital and Artificial Intelligence Officer of the Department of Defense shall expand its bug bounty program for foundational artificial intelligence models being integrated into the missions and operations of the Department of Defense to include unsafe model behaviors in addition to security vulnerabilities.
(2) Collaboration.—In expanding the program under paragraph (1), the Chief Digital and Artificial Intelligence Officer may collaborate with the heads of other Federal departments and agencies with expertise in cybersecurity and artificial intelligence.
(3) Implementation authorized.—The Chief Digital and Artificial Intelligence Officer may carry out the program developed under paragraph (1).
(4) Contracts.—The Secretary of Defense shall ensure, as may be appropriate, that whenever the Secretary enters into any contract, such contract allows for participation in the bug bounty program under paragraph (1).
(5) Rule of construction.—Nothing in this subsection shall be construed to require—
(A) the use of any foundational artificial intelligence model; or
(B) the implementation of the program developed under paragraph (1) for the purpose of the integration of a foundational artificial intelligence model into the missions or operations of the Department of Defense.
Update COPPA 2.0 to Strengthen Children’s Online Voice Privacy in the AI Era
Emerging technologies like artificial intelligence (AI) are changing the way humans interact with machines. As AI technology has made huge progress over the last decade, the processing of modalities such as text, voice, image, and video data has shifted to data-driven large AI models. These models were primarily designed to let machines comprehend various data and perform tasks without human intervention. Now, with the emergence of generative AI like ChatGPT, these models are capable of generating data such as text, voice, images, or video. Policymakers across the globe are struggling to draft policies to govern the ethical use of data as well as to regulate the creation of safe, secure, and trustworthy AI models.
Data privacy is a major concern with the advent of AI technology. Actions by the US Congress, such as the proposed American Privacy Rights Act, aim to enforce strict data privacy rights. With emerging AI applications for children, the privacy of children and the safekeeping of their personal information are also a legislative challenge.
Congress must act to protect children’s voice privacy before it’s too late. Companies that store children’s voice recordings and use them for profit-driven applications (or advertising) without parental consent pose serious privacy threats to children and families. The proposed revisions to the Children’s Online Privacy Protection Act (COPPA) aim to restrict companies’ capacity to profit from children’s data and transfer the responsibility of compliance from parents to companies. However, several measures in the proposed legislation need more clarity and additional guidelines.
Challenge and Opportunity
Human voice is one of the most popular modalities for AI technology. Advancements in voice AI technology such as voice AI assistants (Siri, Google, Bixby, Alexa, etc.) in smartphones have made many day-to-day activities easier; however, there are also emerging threats from voice AI and a lack of regulations governing voice data and voice AI applications. One example is AI voice impersonation scams. Using the latest voice AI technology, a high-quality personalized voice recording can be generated with as little as 15 seconds of the speaker’s recorded voice. A technology rat race among Big Tech has begun, as companies are trying to achieve this with voice recordings only a few seconds long. Scammers have increasingly been using this technology for their benefit. OpenAI, the creator of ChatGPT, recently developed a product called Voice Engine but refrained from commercializing it, acknowledging that this technology poses “serious risks,” especially in an election year.
A voice recording contains very personal information about a speaker and can be used to identify a target speaker from recordings of multiple speakers. Emerging research in voice AI technology shows that voice recordings can support medical and health-related inferences, as well as identification of age, height, and more. When using cloud-based applications, privacy concerns also arise during voice data transfer and from data storage leaks, due to noncompliance with data collection and storage requirements. Therefore, the threats from misuse of voice data and voice AI technology are enormous.
Social media services, educational technology, online games, and smart toys are just a few services for children that have started adopting voice technology (e.g., Alexa for Kids). Any service operator (or company) collecting and using children’s personal information, including their voice, is bound by the Children’s Online Privacy Protection Act (COPPA). The Federal Trade Commission (FTC) is the enforcing federal agency for COPPA. However, several companies have recently violated COPPA by collecting personal information from children without parental consent and using it for advertising and to maximize their platform profits. “Amazon’s history of misleading parents, keeping children’s recordings indefinitely, and flouting parents’ deletion requests violated COPPA and sacrificed privacy for profits,” said Samuel Levine of the FTC’s Bureau of Consumer Protection. The FTC alleges that Amazon maintained records of children’s data, disregarded parents’ deletion requests, and trained its voice AI algorithms on that data.
Children’s spoken characteristics are different from those of adults; thus, developing voice AI technology for children is more challenging. Most commercial voice-AI-enabled services work smoothly for adults, but their accuracy in understanding children’s voices is often limited. Another challenge is the relatively sparse availability of children’s voice data to train AI models. Therefore, Big Tech is looking for ways to acquire as much children’s voice data as possible to train AI voice models. This challenge is prevalent not only in industry but also in academic research, due to very limited data availability and children’s varying spoken-language skills. However, misuse of acquired data, especially without consent, is not a solution, and operators must be penalized for such actions.
Considering the recent violations of COPPA by operators, and with the goal of strengthening compliance and preventing misuse of personal information such as voice, Congress is updating COPPA with new legislation. The COPPA updates propose to extend and update the definitions of “operator,” “personal information” (including voice prints), “consent,” and “website/service/application” (including devices connected to the internet), as well as guidelines for the “collection, use, disclosure, and deletion of personal information.” These updates are especially critical because users’ (or consumers’) personal information can serve as valuable data for operators’ profit-driven applications and can be misused in the absence of federal regulation. The FTC acknowledges that the current version of COPPA is insufficient; these updates would also enable the FTC to take strict action against noncompliant operators.
Plan of Action
The Children and Teens’ Online Privacy Protection Act (COPPA 2.0) has been proposed in both the Senate and House to update COPPA for the modern internet age, with a renewed focus on limiting misuse of children’s personal data (including voice recordings). This proposed legislation has gained momentum and bipartisan support. However, the text in this legislation could still be updated to ensure consumer privacy and support future innovation.
Recommendation 1. Clarify the exclusion clause for audio files.
An exclusion clause has been added in this legislation particularly for audio files containing a child’s voice, declaring that the collected audio file is not considered personal information if it meets certain criteria. This was added to adopt a more expansive audio file exception, particularly to allow operators to provide some features to their users (or consumers).
While limiting the exclusion to the text “only uses the voice within the audio file solely as a replacement for written words” might be overly restrictive for voice-based applications, the phrase “to perform a task” might open the use of audio files to any task that could be beneficial to operators. The task should only be related to performing a request or providing a service to the user, and that needs to be clarified in the text. Potential misuse of this text could be (1) to train AI models for tasks that might help operators provide a service to the user (especially for personalization), or (2) to extract and store “audio features” (most voice AI models are trained using audio features instead of the raw audio itself). Operators might argue that extracting audio features is necessary as part of the algorithm that assists in providing a service to the user. Therefore, the phrasing “to perform a task” in this exclusion is open-ended and should be modified as suggested:
Current text: “(iii) only uses the voice within the audio file solely as a replacement for written words, to perform a task, or engage with a website, online service, online application, or mobile application, such as to perform a search or fulfill a verbal instruction or request; and”
Suggested text: “(iii) only uses the voice within the audio file solely as a replacement for written words, to only perform a task to engage with a website, online service, online application, or mobile application, such as to perform a search or fulfill a verbal instruction or request; and”
On a similar note, legislators should consider adding the term “audio features.” Audio features are enough to train voice AI models and develop any voice-related application, even if the original audio file is deleted. Therefore, the deletion provision in the exclusion clause should be modified as suggested:
Current text: “(iv) only maintains the audio file long enough to complete the stated purpose and then immediately deletes the audio file and does not make any other use of the audio file prior to deletion.”
Suggested text: “(iv) only maintains the audio file long enough to complete the stated purpose and then immediately deletes the audio file and any extracted audio-based features and does not make any other use of the audio file (or extracted audio features) prior to deletion.”
Adding more clarity to the exclusion will help avoid misuse of children’s voices for any task that companies might still find beneficial and also ensure that operators delete all forms of the audio which could be used to train AI models.
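To make concrete why extracted audio features deserve the same treatment as the audio itself, the following is a minimal sketch, assuming the open-source librosa and NumPy libraries; the file names are hypothetical. It shows that model-ready features can be computed and retained even after the original recording is deleted.

```python
# Minimal illustration (hypothetical file names; assumes librosa and NumPy):
# features extracted from a recording survive deletion of the recording itself.
import os
import numpy as np
import librosa

AUDIO_PATH = "child_request.wav"  # hypothetical collected recording

# Load the raw audio and compute MFCCs, a common model-ready feature representation.
waveform, sample_rate = librosa.load(AUDIO_PATH, sr=16000)
mfcc = librosa.feature.mfcc(y=waveform, sr=sample_rate, n_mfcc=13)

np.save("child_request_features.npy", mfcc)  # persist only the features
os.remove(AUDIO_PATH)                        # the raw audio file is now gone...

# ...but the saved features can still be used to train or fine-tune a voice model,
# which is why the suggested text extends deletion to extracted audio features.
```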
Recommendation 2. Add guidelines on the deidentification of audio files to enhance innovation.
A deidentified audio file is one that cannot be used to identify the speaker whose voice is recorded in that file. The legislative text of COPPA 2.0 does not mention or have any guidelines on how to deidentify an audio file. These guidelines would not only protect the privacy of users but also allow operators to use deidentified audio files to add features and improve their products. The guidelines could include steps to be followed by operators as well as additional commitment from operators.
The steps include:
- Each audio file collected by an application should be stored with an anonymous identifier.
- Each audio file collected from the same user account (a child) or a device (e.g., smartphone, tablets/iPad, laptop/computer) should be treated as an individual file and stored with anonymous identifiers. This will avoid linking multiple audio files from the same user or device.
- If any audio file contains any personally identifiable information, that audio file must be deleted instantly, and the user (parent of the child) must be informed. There is no guarantee that an audio recording will not contain any personally identifiable information, and therefore this step is critical.
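A minimal sketch of how an operator might implement these storage steps, assuming only Python’s standard library; the function names, directory layout, and PII flag are hypothetical placeholders, and real PII detection and parental notification are out of scope here.

```python
# Hypothetical sketch of per-file anonymous storage (standard library only).
import shutil
import uuid
from pathlib import Path

def store_deidentified(audio_path: Path, storage_dir: Path) -> str:
    """Store a collected audio file under a random, anonymous identifier.

    Each file gets its own identifier, so recordings from the same child or
    device cannot be linked to one another through the stored filename.
    """
    anonymous_id = uuid.uuid4().hex                 # carries no user or device information
    destination = storage_dir / f"{anonymous_id}.wav"
    shutil.copy(audio_path, destination)
    Path(audio_path).unlink()                       # remove the original, identifiable copy
    return anonymous_id

def delete_if_pii(stored_path: Path, contains_pii: bool) -> None:
    """Delete a stored recording immediately if it is found to contain PII."""
    if contains_pii:
        stored_path.unlink()
```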
The commitments include:
- An operator should commit not to reidentify the speaker from any given audio file, whether anonymized or not (a task often called “speaker identification”).
- An operator should seek consent (or approval) from the child’s parent before using deidentified audio files for product development; absent that consent, the operator should immediately delete the audio file once the task for which it was collected is complete.
Following these guidelines might be expensive for operators; however, it is crucial to take as many precautions as possible. The deidentification steps currently followed by operators are not sufficient, and there have been numerous instances in which anonymized data has been reidentified, according to a statement released by a group of State Attorneys General. These proposed guidelines could allow operators to deidentify audio files and use those files for product development. This would allow innovation in voice AI technology for children to flourish.
Recommendation 3. Add AI-generated avatars in the definition of personal information.
With the emerging applications of generative AI and growing virtual reality use for education (in classrooms) and for leisure (in online games), “AI-based avatar generation from a child’s image, audio, or video” should be added to the legislative definition of “personal information.” Virtual reality is a growing space, and digital representations of the human user (an avatar) are increasingly used to allow the user to see and interact with virtual reality environments and other users.
Conclusion
As new applications of AI emerge, operators must ensure compliance in the collection and use of consumers’ personal information and safety in the design of their products using that data, especially when dealing with vulnerable populations like children. Since the original passage of COPPA in 1998, how consumers use online services for day-to-day activities, including educational technology and amusement for children, has changed dramatically. This ever-changing scope and reach of online services require strong legislative action to bring online privacy standards into the 21st century. Without a doubt, COPPA 2.0 will lead this regulatory drive not only to protect children’s personal information collected by online services and operators from misuse but also to ensure that the burden of compliance rests on the operators rather than on parents. These recommendations will help strengthen the protections of COPPA 2.0 even further while leaving open avenues for innovation in voice AI technology for children.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
A National Training Program for AI-Ready Students
In crafting future legislation on artificial intelligence (AI), Congress should introduce a Digital Frontier and AI Readiness Act of 2025 to create educator training sites in emerging technology to ensure our students can graduate AI-ready. Computing, data, and AI basics will be critical for every student, yet our education system does not have the capacity to impart them. A national mobilization for the education workforce would ensure U.S. leadership in the global AI talent race, address mounting challenges in teacher shortages and retention, and fill critical workforce preparedness gaps not addressed by the CHIPS and Science Act. The legislation would include three components: (1) a prestigious national fellowship program for classroom educators with extended summer pay; (2) an evidence-based national network of training sites for peer-based learning; and (3) a modernization competition for teacher college programs to sustain long-term improvement in our education workforce.
Investing in effective educators has an outsized impact: one high-quality teacher can significantly boost lifetime incomes, degree attainment, and other life satisfaction measures for many classrooms of students. These programs would be facilitated through the National Science Foundation (NSF), including through simplified application procedures, expanded eligibility, and new evaluation approaches.
Challenge and Opportunity
If AI is positioned to dramatically transform our economy, from the production line to the c-suite, then everyone must be prepared to leverage its power. AI alone may add an estimated $2.6 trillion to $4.4 trillion annually to the global economy and may automate 60% to 70% of task-time within existing jobs, rather than replacing them outright. Earlier studies estimated that emerging technologies will increase the technology intensity of existing careers across all sectors. A report by the Burning Glass Institute found that 22% of all current open jobs in the U.S. economy include at least one “data science skill,” with the highest share of data-skill job postings in utilities, manufacturing, and agriculture. Not every worker will build the next AI algorithm or become a data scientist, but nearly every American will need to leverage data and AI to maintain a competitive edge in their sector or risk losing entire industries to other countries that do the same. This unprecedented economic growth will only be captured by the countries whose workers are prepared in data and AI basics.
U.S. educators are largely unsupported in teaching students about AI and other emerging technologies. A national analysis of math educators found that teachers are least confident teaching data and statistics, as well as technology integration, compared to other content categories. Computer science was the least popular credential for K-12 educators to pursue as recently as the 2018–2019 school year. These challenges carry through to student opportunities and outcomes. As of 2023, only 5.8% of our high school students are enrolled in foundational computer science courses. Introductory basics in data or AI are typically not covered even where they appear in state standards. Nationally, students’ foundational data literacy has declined between one and three grade levels steadily over the past decade, varying disproportionately by race and geography, with losses only accelerated by the pandemic.
Moreover, our teacher workforce capacity is declining. Teacher entry, preparation, and retention rates remain at historical lows across the country and have not meaningfully recovered since the pandemic. Over the past decade, the number of individuals completing a teacher preparation program has fallen 25%, with only modest recovery since the pandemic; at least 55,000 positions remain unfilled this year, and long-term forecasts reach at least 100,000 shortages annually. Factors including low pay, low prestige, and difficult environments create a perception challenge for the profession: less than 1 in 5 Americans would encourage a young person to become a teacher. These challenges compound over time, as more graduate schools of education close or cut their programming. In 2022, Harvard discontinued its Undergraduate Teacher Program completely, citing low interest and enrollment numbers, one of many such closures.
What if the concurrent challenges of digital upskilling and teacher shortages could help solve one another? The teaching profession is facing a perception problem just as AI has made education more important than ever before. In the global information age, U.S. worker skills and talent are our greatest weapons. The expectations of teachers and teaching must change. Major U.S. economic peers, including Canada, Germany, China, India, New Zealand, and the United Kingdom, have all announced similar national efforts to make robust investments in teacher upskilling in high-value technology areas. In our new AI era, U.S. policymakers now have the opportunity to develop the infrastructure, 21st-century training, and prestigious social recognition to properly value education as an economic and national security priority. A recent report from Goldman Sachs identified “a narrow window of opportunity – what we call the inter-AI years,” in which policymaker “decisions made today will determine what is possible in the future. A generative world order will emerge.” Inaction today risks the United States falling quickly behind tomorrow.

Teacher preparation program enrollment by program and year, 2010–2018. Source: CAP, 2019.
Plan of Action
A Digital Frontier Teaching Corps (DFT Corps) would mobilize a new generation of teachers who are fluent in, adaptive to, and resilient to fast-changing technology, equipped to help our students become the same. The DFT Corps would re-norm the job of teaching to become a full-year profession, making the summer months an essential part of the job of adaptive 21st-century teaching with regular training intensives. Currently, educators only work and are paid for nine months of the year.
Upon acceptance by application, selected teachers would enter a three-year fellowship program to participate in training intensives facilitated at local institutes of higher education, nonprofits, educational service agencies, or industry partners. Scholarships facilitated through the National Science Foundation would extend educator pay and hours from nine months to a full annualized salary. DFT Corps members would also be eligible for substantial federal loan forgiveness in return for their additional time investment.
After three rotations, members would become eligible to serve as DFT Corps site leaders, responsible for program design at new or existing training sites. These roles would bring greater compensation, prestige, and retention through leadership opportunities, concurrently addressing systemic talent challenges in education at their root and creating an adaptive mechanism for faster upskilling. Additional program components, including licensure incentives and teacher college innovation grants, would further sustain long-term impacts. By year three of the program, 50,000 educators would be on the path to preparing our students for the future of work, 500 inaugural Corps members would become state or local site leaders to expand the mobilization, and the perception of teaching would further shift from childcare to a critical and respected national service.
To accomplish this vision, Congress should authorize the National Science Foundation to create:
1. A national Digital Frontier Teaching Corps, a three-year “talent surge” fellowship opportunity covering summertime pay for high-potential educators to conduct intensive study in AI, data science, and computing foundations. The DFT Corps would be a prestigious and materially meaningful program to both impart digital technical skills and transform the social perception of the teaching profession. The DFT Corps would include:
- Educator scholarships for full summertime pay
- A low-barrier digital application process for individuals to apply for funding directly from the NSF or an intermediary third-party organization
- Automatic enrollment in local site-based training sites, including college-level coursework (computer science, data science, artificial intelligence, quantum computing) and peer-based pedagogical training
- Tax-based verification for program completion status
- Federal loan forgiveness upon successful program completion
- Federal recognition awards for schools employing DFT Corps educators
- Eligibility to apply for DFT Corps training site leadership upon successful program completion
2. DFT Corps training sites, a national network of university-based, locally led professional development sites in collaboration with local education agencies, based on the evidence-based model of the National Writing Project. Competitive five-year grants would support the creation of Corps sites, one per state, with the opportunity for renewal. DFT Corps training sites would:
- Provide college-level coursework (computer science, data science, artificial intelligence, quantum computing) and peer-based pedagogy training to 100 educators per summer
- Be co-designed with local faculty, researchers, or industry experts, along with prior DFT Corps members
- Be evaluated on an annual basis, with criteria determined by the NSF, with all evaluation results published publicly and submitted to the NSF
- Expand with matching funds from state, industry, or philanthropic support as resources allow
3. Teacher College Innovation Grants, a competitive NSF grant program for modernizing teacher preparation programs and teacher licensure models. Teacher College Innovation Grants would provide research funding and capacity to evaluate DFT Corps training sites and ensure lessons learned are quickly integrated back into teacher preparation programs. Competitive priorities would be made for:
- Proposals that develop and/or scale new methods-style courses for teaching digital technology (i.e., computational thinking, data literacy, AI literacy) to students across traditional school subjects
- Proposals that develop new models for teacher preparation, leveraging online learning or other digital training systems that result in lower program costs
- Proposals that focus on building or modernizing teacher preparation capacity in majority-rural or majority-Indigenous communities
The DFT Corps program is intended to be catalytic. Should the program find success in early scaling, state and local funding could support further adoption of the model over time, so that teaching transforms to an annualized profession across subject areas and grade-levels.
Conclusion
In the new era of AI, education is a national security issue. Advancing our population’s ability to effectively deploy AI and other emerging technology will uniquely determine U.S. leadership and economic competitiveness in the coming years and decades. Education investments made by states within the next few years will all but determine local long-term economic trajectories.
In the 1950s and 1960s, education and competitiveness were one and the same. One year after the Soviets launched Sputnik, Congress took action and passed the National Defense Education Act, a $1 billion spending package to advance teaching and learning in science, mathematics, and foreign languages. At one time, we respected teachers as critical to the national mission, leading the charge to prepare our next generation to lead, and we took swift action to support their mission. We must take the same bold action now.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
The scale of this national challenge requires meaningful appropriations to raise teacher pay, ensure high-quality training opportunities with sufficient expertise, and sustain a long-term strategy to address deeply rooted sector challenges. A short-term, one-shot approach will simply waste money and generate minimal impact.
Moreover, the program's creation necessitates a significant simplification of National Science Foundation application processes to reduce grant application length, burden, and paperwork. It also creates a targeted exception for the NSF to support broader nonresearch activities that are otherwise sector-critical for national scientific and educational endeavors. If enacted, this legislation could help reduce overhead for program administration and redirect more resources to supporting quality state and local implementation rather than program compliance.
Once scaled to all 50 states, the recurring annual costs of the proposed legislation would be $250 million:
- $150 million for DFT Corps member scholarships (5,000 teachers per year)
- $50 million for DFT Corps training sites (one site per state at $1 million each)
- $50 million for Teacher College Innovation Grants (one site per state at $1 million each)
In the first five years, the cost would gradually increase to that total, starting at a base of $25 million for five states ($15 million for 500 scholarships, $5 million for training sites, $5 million for Innovation Grants).
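To make the budget arithmetic explicit, the sketch below reproduces both the $25 million pilot figure and the $250 million steady-state figure from the per-state components stated above (a $30,000 per-teacher scholarship, 100 educators trained per site each summer, and $1 million each per state for a training site and an Innovation Grant). The function and variable names are illustrative only, not part of the proposal.

```python
# Illustrative cost arithmetic for the DFT Corps proposal, using only figures
# stated in the text: $150M covering 5,000 scholarships at full scale, one
# training-site grant and one Innovation Grant per state at $1M each, and
# 100 educators trained per site each summer.

SCHOLARSHIP = 150_000_000 / 5_000   # $30,000 per teacher per year
TEACHERS_PER_STATE = 100            # one site trains 100 educators per summer
SITE_GRANT = 1_000_000              # DFT Corps training site, per state
INNOVATION_GRANT = 1_000_000        # Teacher College Innovation Grant, per state

def annual_cost(states: int) -> float:
    """Annual program cost in dollars when `states` states participate."""
    scholarships = states * TEACHERS_PER_STATE * SCHOLARSHIP
    grants = states * (SITE_GRANT + INNOVATION_GRANT)
    return scholarships + grants

print(f"Pilot (5 states):       ${annual_cost(5) / 1e6:.0f}M")   # $25M
print(f"Full scale (50 states): ${annual_cost(50) / 1e6:.0f}M")  # $250M
```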
Creating an AI-ready workforce is a critical national priority to maintain U.S. economic competitiveness, mitigate risks to U.S. AI primacy, and ensure the next generation can successfully navigate the complex technology landscape it will graduate into. McKinsey projects that successful integration of AI across more than 63 business use cases would add between $13.6 trillion and $22.1 trillion to the global economy. A recent National Institutes of Health analysis suggests that, for any country to successfully specialize in AI, there must be general preexisting technological capabilities and a strong scientific knowledge base. AI readiness must be a population-wide goal. Given that 60% of Americans do not complete a bachelor's degree, AI readiness must begin early in K-12 education and in community colleges.
Estimated return on investment: $250 million represents less than 1.5% of the last annual appropriation level (2020) under the Every Student Succeeds Act, the nation's primary federal education funding mechanism. If this legislation increased the share of forecasted economic growth from effectively harnessing AI by only five percent, it would conservatively add $171 billion to the U.S. economy each year.
Most educator training programs are too short, are offered only during the busy school year, and have no opportunity to improve over multiple years within a given school. Early iterations of the National Writing Project, on which this program is based, determined that "although schools may see results from C3WP in a single school year, a longer-term investment may produce a greater impact." Even if sustained during a school year, researchers have found that "absent a surrounding context that is highly supportive of teacher learning and change, 1 year of PD cannot sufficiently alter instructional practices enough to impact student outcomes." While earlier evaluation studies saw no impact on student achievement, the National Writing Project is now one of the most lauded and effective educator training models trialed in the United States, made possible by a long-term and consistent investment in professional learning.
A three-year program will allow educators to advance from novice (year 1) to intermediate (year 2) to mentor or facilitator (year 3). By year 4, graduating educators would be prepared to serve as site leaders, dramatically increasing the available talent pool for sustaining and growing DFT Corps sites nationally. Additional time will also enable a local site to improve its own programming and align tightly with multi-year school and district planning.
The DFT Corps is an accelerated investment in the creation of locally led professional development sites, uniquely designed with (1) direct support for current classroom educators to participate; (2) a replicated network model for summer-based, in-service training; and (3) innovation grants to research aligned training improvements and best practices. No current federal program does all three at once for current classroom educators.
Existing teacher training grant programs, such as Teacher Quality Partnerships or Supporting Effective Educator Development, carry strong evidence requirements or incompatible competitive preferences. Given that AI is new and little research exists on effective teaching practices, these requirements significantly limit proposals on emerging topics. Grants also vary widely by institution.
Existing educator scholarship programs, such as the Robert Noyce Teacher Scholarship Program, focus mostly on recruiting new teachers and provide only limited support for existing teachers pursuing or holding a master's degree; 40% of U.S. teachers do not have a master's degree. A targeted national focus on AI readiness would also require several higher-education institutions across states to organically propose training programs to the Noyce program at the same time, with the same model.
AI technology development is moving faster than the education sector can respond. In order to accelerate site creation, reduce application burden, and modernize grant distribution, the DFT Corps program would direct the NSF to:
- Allow nonresearch activities to be funded under the program, including educator salary support
- Remove and centralize all program evaluation requirements away from individual grantees, reallocating evaluation activities to external researchers across sites
- Centrally manage disbursement of DFT salary supplements, potentially via tax credits
- Modernize data management plan requirements for present-day technology
- Limit total grant application length to 10 pages or fewer. In other fields, NSF grant applications take investigators over 171 hours to prepare, despite little relation between time invested and actual funding outcomes in some cases. Another study found that 42% of investigators' time is spent on administrative and reporting tasks to support the execution of an NSF grant.
Yes, with appropriations. Under new 2023 guidance, the Robert Noyce Teacher Scholarship Program has expanded salary supplement options and enabled two-summer support. An executive action version of this proposal would expand the Robert Noyce Teacher Scholarship Program by (1) increasing support for Track 3 with lower degree requirements (i.e., a bachelor's rather than a master's degree); (2) stipulating a competitive priority for AI readiness and emerging technology education (defined as computer science, computational thinking, data science, and artificial intelligence literacy across the curriculum); and (3) directing the White House Office of Science & Technology Policy to launch a multi-agency, public-facing communications and recruitment effort for the DFT Corps program, in collaboration with the 50 largest teacher colleges and other participating Noyce program institutions.
The proposed DFT Corps mirrors a long-running evidence-based model, the National Writing Project (NWP), which has trained over 95,000 teachers in high-quality writing instruction across 2,000 school districts since 1974. Three independent evaluation studies over multiple years across 20 states found "positive and statistically significant effects on student achievement" across all measured components of writing. The evidence base supporting NWP is "unusually robust" for education research, employing randomized controlled trials and meeting ESSA Tier 1 evidence criteria. A 2023 replication study focusing on rural schools found positive results on "all attributes measured," a similar priority for the proposed DFT Corps program.
Similar to the Robert Noyce Scholarship program, the DFT Corps program would waive tuition costs and provide scholarship funds in exchange for a multi-year teaching commitment. Each year's participation in the program would extend an educator's teaching commitment by two additional years. A 2013 evaluation of the Noyce program found this model worked, producing higher retention rates compared to new teachers graduating from the same institutions.
Recodes a nine-month profession to annual pay, and annual expectations: A primary change advanced by the DFT Corps is converting the typical teacher job from a nine-month term to an annual salary, similar to lawyers, doctors, and other high-prestige professions. In a recent RAND report on why teachers wanted to leave the profession, salary was the #2 reason, hours worked outside the school day was #3, and total hours worked was #4. Teachers are promised a flexible, part-year job on paper, when the reality is very different. Nine-month pay challenges are so extreme that several U.S. banks host articles on "surviving the summer paycheck gap." Many teachers take second (non-academic) jobs. And the popular #NoSummersOff hashtag gained a significant following amongst educators pre-pandemic. Concurrently, the pace of technology and curriculum change demands more professional learning time than schools and districts typically provide. Summer professional learning is often optional and highly variable across states. Our expectations are far too low for one of our most critical knowledge jobs. DFT Corps members would be paid during the summer for intensive study to update curriculum, plan content, and incorporate new education research on how students learn. Full-time summer work would remove pressure for administrators to "squeeze in" short, one-day professional development sessions during the school year, which study after study has shown to be a waste of time and money. Many, from current classroom educators to the former U.S. Secretary of Education, continue to question these existing PD approaches.
Creates a leadership ladder: Leadership opportunities for classroom educators are few and far between. Teaching is often described as a “flat” profession, and nearly half of educators leaving the field point to a perceived lack of leadership or decision-making opportunities as contributing factors. Concurrently, new teachers who have the opportunity to collaborate with teacher-leaders within their own school generate stronger academic gains for their students. The DFT Corps would create state-wide leadership opportunities at Corps summer sites that do not disrupt school-year teaching, allowing educators to remain in the classroom during the other nine months of the year but still access visible leadership and mentor roles during the summer.
Leverages peer-based learning: Beyond the opportunity to positively impact students and student learning, 63% of educators report that strong relationships with other teachers are a top reason for staying in the classroom. The DFT Corps would leverage peer-based professional development over multiple years, reallocating the summer months to joint study and creating stronger educator networks statewide. One of the DFT Corps' precedent peer-based models, the National Writing Project, "has a legacy as being the best professional development model for K-12 teachers" precisely due to a targeted focus on peer exchange. In post-training interviews, researchers found that educators "immediately changed several of their teaching practices and felt a renewed sense of enthusiasm towards the teaching of writing after participating in the NWP… a renewed sense of authority that quickly transferred to agency, these teachers possessed the self-efficacy to share what they knew and had learned with other teachers, administrators, district leaders, fellow graduate students, and most importantly, the students who would enter their classrooms in the fall."
Builds needed prestige for the profession: The DFT Corps program advances a reinvigorated national prioritization of the education field. In the information economy, educators are one of our most critical professions, and a greater determinant of gross domestic product than any individual semiconductor or algorithm. Under a DFT Corps communications rollout, teaching would be separated from any prior stereotypes of "caretakers," positioned instead as essential to the economic, technology, and security fabric that advances societal progress. Research consistently suggests that the low prestige of the profession pushes high achievers away from teaching, is closely correlated with falling rates of both preparation and retention, and may even directly affect student achievement. In China, where educators have long enjoyed high prestige for their profession, researchers found that an expansion of the country's Free Teacher Education program helped to increase application competitiveness, extend retention rates, and enhance self-identity for program participants in a pre-publication evaluation study. In a 2018 "Global Teacher Status Index," China was the only country to score 100, while the United States scored under 40 points. The United States is falling behind in our education culture, and we have little time to make up for lost ground.
The long-term vision for this proposal also extends beyond the NSF AI Education Act and suggests a new mechanism for federal education support in the Every Student Succeeds Act.
National Security AI Entrepreneur Visa: Creating a New Pathway for Elite Dual-Use Technology Founders to Build in America
NVIDIA, Anthropic, OpenAI, HuggingFace, and scores of other American startups helping cement America’s leadership in the race for artificial intelligence (AI) dominance all have one thing in common: they have at least one immigrant co-founder. In fact, in 2023, the National Foundation for American Policy released a policy analysis on the role of immigrants in the top American AI companies. According to their research, 65% of the companies appearing on the Forbes AI 50 list were founded or co-founded by at least one immigrant. Immigrant entrepreneurs are critical to America’s economic success, and as the private sector takes an increasing role in developing critical dual-use technologies like AI, they will be critical to America’s defense.
According to a Brookings Institution report, "China sees talent as central to its technological advancement; President Xi Jinping has repeatedly called talent 'the first resource' in China's push for 'independent innovation.'" It's easy to understand why the CCP sees talent as critical in its efforts to dominate key dual-use technologies relevant to national and economic security – in today's knowledge economy, those who can innovate faster win. A company like SpaceX, which almost single-handedly reinvigorated America's spacefaring economy, would likely not exist without Elon Musk. The list of companies and dual-use technologies critical to American national and economic security that would likely not have been created without the right people behind them is a long one. America needs these entrepreneurs more than ever as competition with China for global leadership in key fields like AI heats up.
Given increased competition for talent – from allies like the United Kingdom to competitors and adversaries like China – in critical technology areas like AI, Congress must act to support high-skilled entrepreneurs by creating a National Security Startup Visa specifically targeted at founders of AI firms whose technology is inherently dual-use and critical for America's economic leadership and national security. To maximize the potential economic benefits of such a visa for all Americans, it can be narrowly tailored, focusing only on entrepreneurs who (1) have raised significant capital from accredited American investors and venture capitalists (VCs), (2) are willing to physically reside and start their business in an Opportunity Zone, and (3) will hire at least five Americans within the first year of operation. Immigration may be a complex issue, but there is no doubt that immigrant founders are the not-so-secret ingredient that has helped fuel America's rise as a tech superpower. Developing a narrowly scoped visa targeted at a critical technology segment means that America can ensure its continued dominance in AI, a technology that the CEO of Google has said may be as profound as fire or electricity.
Challenge and Opportunity
While the United States has long been the preferred destination for immigrant entrepreneurs, America has never had more competition for global talent. Countries like Canada, Germany, and Estonia have created visas to attract entrepreneurs, and they appear to be working. After the introduction of a Canadian startup visa in 2013, the program increased the likelihood of previously U.S.-based immigrants creating a startup in Canada by 69%. These are immigrants who were already in America to study or work, and it should have been an obvious choice for them to stay and build their companies in the United States. This means the United States is losing out on hundreds of new companies and likely thousands of high-paying jobs that would come along with them. The fact that Canada, thanks to a streamlined immigration process for founders, was able to attract so many who were already in the United States should serve as a serious warning as to how the competition for talent is heating up.

Figure: Canada demonstrates how a startup visa enhances immigrant entrepreneurship (source: National Bureau of Economic Research).
Historically, the United States—and Silicon Valley in particular—was the undisputed leader for venture capital fundraising and the place to start a potential unicorn (a company valued at over $1 billion). However, America's dominance has shrunk, and VC dollars, along with unicorns, are increasingly found across the world in tech hotspots from China to India to the United Kingdom, showing it is increasingly easy for entrepreneurs to build a successful startup elsewhere. This is critical, because when America was the only place to build a leading company, entrepreneurs had little choice but to wade through the labyrinth that is the American immigration system. Now, top talent has many choices, and the United States must compete to become not just the premier destination to build a company and raise capital but one that is accessible to startup founders who can't afford high-priced immigration lawyers or years of waiting until a visa is granted.
While America’s largest geopolitical competitor may suffer from extreme difficulties in attracting foreign entrepreneurs to its shores, China has a massive population advantage. This can be seen directly in the STEM space and AI in particular. According to a CSIS report, “By 2025, Chinese universities are projected to produce more than 77,000 STEM PhD graduates per year, more than double the 2010 level of about 34,000 STEM PhD graduates. In comparison, the United States is projected to graduate only approximately 40,000 STEM PhD students in 2025, a figure that includes over 16,000 international students.”
China has already outpaced the United States in the number of AI-related research articles published, and its domestic tech champions are global leaders in AI-enabled technology like facial recognition. Given the strong domestic showing of Chinese researchers and entrepreneurs in AI, with local AI startups raising billions of dollars in 2023 despite a broader slowdown in Chinese VC funding, China presents a strategic threat to America's leadership in the AI space. America is on the cusp of losing its leadership in AI to China, but this policy creates clear opportunities to expeditiously regain lost ground by bringing in AI entrepreneurs who have already raised venture funding and are able to immediately hire American workers.
However daunting the challenge China presents, America has long had a superpower: attracting the best and brightest to our shores to build innovative global businesses. And while many leading American AI startups have an immigrant co-founder, for every entrepreneur coming to the United States today, many more are turned away or dissuaded from applying. Take Erdal Arikan, a Turkish MIT and CalTech graduate who had difficulty staying in America to continue his research and returned to Turkey. According to Graham Allison and Eric Schmidt, “It turned out that Arikan’s insight was the breakthrough needed to leap from 4G telecommunications networks to much faster 5G mobile internet services. Four years later, China’s national telecommunications champion, Huawei, was using Arikan’s discovery to invent some of the first 5G technologies. Today, Huawei holds over two-thirds of the patents related to Arikan’s solution… Had the United States been able to retain Arikan—simply by allowing him to stay in the country instead of making his visa contingent on immediately finding a sponsor for his work—this history might well have been different.”
By creating a narrowly tailored AI National Security Entrepreneur Visa, the United States has a unique opportunity to recruit founders in a field deemed "critical and emerging" by the White House and help the nation maintain both its economic and national security competitiveness. And while many are concerned about the potential economic dislocation from AI, one way to mitigate such a risk is by helping entrepreneurship flourish in the United States, especially in underserved communities like those found in Opportunity Zones across every state. With hundreds or thousands of new businesses creating high-paying jobs in rural and underserved communities, Americans outside the existing tech hubs of New York City and San Francisco could finally see the real economic benefits of the tech boom.
The economic potential for such a visa is tremendous. According to a 2024 report from the Center for Growth and Opportunity at Utah State University, a startup visa could have a significant impact: “Data collected at the state level suggests that when the population’s share of immigrant college graduates increases by 1 percent, patents per capita increase by 9 to 18 percent” with the report going on to say that (depending on the number of entrepreneurs brought in) “Census and industrial data predict an increase of 500,000 to 1.6 million new jobs from young start-up visa companies in the United States after 10 years of operation.”
The time for an AI startup visa is now. It will help create American jobs and revitalize local economies, cement American global leadership, and ensure that we beat China in the AI race.
Plan of Action
Create a 10-year pilot AI Entrepreneur Visa program for a select group of countries to demonstrate the potential efficacy of the visa.
The AI National Security Entrepreneur Visa will be narrowly tailored to founders from friendly nations, who have already raised significant capital for their companies from accredited American investors and are willing to physically reside in an Opportunity Zone. This will minimize risks of visa overstays and espionage while maximizing the potential economic benefits by bringing companies that have capital ready to deploy to the United States.
Visa Characteristics
- Initially available for nationals of Israel, the European Union, Japan, South Korea, Australia, New Zealand, the United Kingdom, Canada, Ukraine, Armenia, Colombia, India, and Nigeria, based on these countries' close ties to the U.S. and/or high rates of entrepreneurship.
- A two-year visa with an opportunity for a single three-year extension, plus a pathway to citizenship after the third year of residence in the United States.
- Also applies to immediate family members only (children and spouse).
- Allows for physical residency within any designated Opportunity Zone for the visa holder and, if applicable, their spouse and children.
- Rapid issuance by default, with a maximum timeline of 30 days from the submission of a completed application.
- Allows for multiple foreign cofounders of a company to submit an application for a visa.
Initial Visa Application Requirements
- Must demonstrate that the company has raised at least $500,000 from an accredited American venture investor or firm.
- Have a startup focused on developing an AI or AI-enabled solution.
- To be considered eligible, each applicant must own at least 20% of the startup.
- Applicants must have demonstrated AI-related technology skills including software engineering, AI research, applied AI research, or tech product management OR have a bachelor’s or graduate degree in software engineering, physics, or mathematics from an accredited college or university.
Visa Extension Requirements
- The applicant must demonstrate that their company has created a minimum of five jobs employing American citizens within the first year of operation.
- The applicant must demonstrate that the company has grown revenues by at least 10% month over month on average during the period in which the visa was held (see the sketch after this list for what that growth rate compounds to) OR have raised at least an additional $1.5 million from accredited American investors.
- The applicant must own 10% or more of the company at time of application for the visa extension.
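To give a sense of scale for the revenue test above, here is a minimal arithmetic sketch, assuming the 10% average month-over-month requirement is measured across a full two-year initial visa term; the variable names are illustrative only.

```python
# Compounding implied by a 10% average month-over-month revenue growth
# requirement (assumption: measured across the full two-year initial visa).

monthly_growth = 0.10

one_year = (1 + monthly_growth) ** 12    # ~3.1x starting revenue after 12 months
two_years = (1 + monthly_growth) ** 24   # ~9.8x starting revenue after 24 months

print(f"Implied revenue multiple after 1 year:  {one_year:.2f}x")
print(f"Implied revenue multiple after 2 years: {two_years:.2f}x")
```

Because the requirement is written as an OR condition, the alternative $1.5 million fundraising path offers a second route for companies that grow more slowly.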
Recommended Timeline
- Congress should mandate that the U.S. Citizenship and Immigration Services (USCIS) begin accepting applications for the pilot program within 180 days of the bill’s passage.
- Congress may consider adding a sunset provision 10 years after the program formally begins accepting applications in order to assess the program’s efficacy and determine whether it should be continued as is, scaled up, scaled down, or deprecated.
Miscellaneous Recommendations
- Consider creating a volunteer board of current and former American investors and entrepreneurs who can provide training and recommendations to the Department of Homeland Security/USCIS while guidelines and rules are being created for the new program.
- Example: An immigration officer may have difficulty understanding how prestigious an accelerator program such as Y Combinator or StartX is, leading them to downgrade an applicant who actually has a high likelihood of building a competitive startup.
- Require an independent economic impact assessment to be conducted every three years to understand the current and projected future impact of the program over time.
- Make the program fee-based, so that applicant fees pay for USCIS’s operating costs and no additional costs are passed on to taxpayers.
- Consider creating a cap of under $10,000 for total federal fees to make it accessible to early-stage startups.
- Ensure that it is possible for existing visa holders (student, work) to easily switch to the AI Entrepreneur Visa assuming they meet the relevant criteria.
- Consider allowing governors to opt in to the program, so that only states that would like to participate in the program do so.
Conclusion
America is in a race for global talent, especially when it comes to AI. The data shows that the majority of leading AI companies in America were created with at least one immigrant founder—but our immigration system makes it incredibly difficult for experts to come and build their companies in America, a serious strategic disadvantage compared to China, which produces dramatically more STEM graduates. By creating an AI National Security Entrepreneur Visa targeting high-skill founders who have already raised funds, Congress can quickly close the gap with China, bringing the best and brightest from around the world to America to build their companies. Not only will this help create jobs across the United States, it will make America the undisputed superpower in AI, allowing us to set standards and control the development of a technology whose impact may surpass those of all other innovations in recent decades.
This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.
Yes. Take it from Yahoo co-founder and naturalized American citizen Jerry Yang, who said, "If I had to worry about a visa, maybe Yahoo wouldn't have gotten started," and that "There are more places around the world where entrepreneurship has taken off… so founders have more choices. And to the extent that our immigration policies are not so welcoming, people don't want to come."
Created under President Trump's Tax Cuts and Jobs Act, Opportunity Zones are designated areas across all 50 states deemed economically distressed by the Internal Revenue Service. Many previous technology booms have created outsized benefits for existing wealthy tech hubs like San Francisco and New York City thanks to positive agglomeration and network effects. By pushing entrepreneurs to found their businesses in an Opportunity Zone, which by its nature is an economically distressed area, the visa will help bring new jobs and opportunities to areas that previously had a difficult time attracting tech entrepreneurs and high-growth startups.
The Economic Innovation Group has written extensively about the concept of a “heartland visa,” which would allow counties to decide on specific new immigration pathways based on their distinct needs. The AI Entrepreneur Visa could be structured similarly, with states or localities opting in to the program and deciding the number and type of AI entrepreneurs they would like to bring to their communities.
Yes. Some options to further narrow the visa:
- Decrease the number of countries eligible for the pilot visa program.
- Create a cap for the number of potential founders per year (recommended minimum of 10,000 to create a sample size large enough for an economic impact assessment).
- Create a mandatory sunset for the program, requiring it to be renewed after five or 10 years.
- Increase equity ownership requirements or implement a maximum number of applicants per company.
- Allow individual states or counties to opt in to the program rather than it being available for the entire nation’s Opportunity Zones at the start.
Yes. Some options to further expand the visa:
- Increase the number of countries eligible to apply for the visa.
- Expand the technologies/industries eligible for the visa.
- Decrease or eliminate the threshold for the amount of funds raised to be eligible.
- Decrease or eliminate equity ownership requirements.
- Relax or remove the requirement that the company's primary physical place of business be within an Opportunity Zone.