Building a Comprehensive NEPA Database to Facilitate Innovation

The Inflation Reduction Act and the Infrastructure Investment and Jobs Act are set to drive $300 billion in energy infrastructure investment by 2030. Without permitting reform, lengthy review processes threaten to make these federal investments one-third less effective at reducing greenhouse gas emissions. That’s why Congress has been grappling with reforming the National Environmental Policy Act (NEPA) for almost two years. Yet, despite the urgency to reform the law, there is a striking lack of available data on how NEPA actually works. Under these conditions, evidence-based policymaking is simply impossible. With access to the right data and with thoughtful teaming, the next administration has a golden opportunity to create a roadmap for permitting software that maximizes the impact of federal investments.

Challenge and Opportunity

NEPA is a cornerstone of U.S. environmental law, requiring nearly all federally funded projects—like bridges, wildfire risk-reduction treatments, and wind farms—to undergo an environmental review. Despite its widespread impact, NEPA’s costs and benefits remain poorly understood. Although academics and the Council on Environmental Quality (CEQ) have conducted piecemeal studies using limited data, even the most basic data points, like the average duration of a NEPA analysis, remain elusive. Even the Government Accountability Office (GAO), when tasked with evaluating NEPA’s effectiveness in 2014, was unable to determine how many NEPA reviews are conducted annually, resulting in a report aptly titled “National Environmental Policy Act: Little Information Exists on NEPA Analyses.”

The lack of comprehensive data is not due to a lack of effort or awareness. In 2021, researchers at the University of Arizona launched NEPAccess, an AI-driven program aimed at aggregating publicly available NEPA data. While successful at scraping what data was accessible, the program could not create a comprehensive database because many NEPA documents, particularly Environmental Assessments (EAs) and Categorical Exclusions (CEs), are either not publicly available or too hard to access. The Pacific Northwest National Laboratory (PNNL) also built a language model to analyze NEPA documents but confined its analysis to the least common yet most complex category of environmental review: Environmental Impact Statements (EISs).

Fortunately, much of the data needed to populate a more comprehensive NEPA database does exist. Unfortunately, it is stored in a complex network of incompatible software systems, limiting both public access and interagency collaboration. Each agency responsible for conducting NEPA reviews operates its own NEPA software. Even the most advanced systems, the Forest Service’s SOPA and the Bureau of Land Management’s ePlanning, do not automatically publish performance data.

Analyzing NEPA outcomes isn’t just an academic exercise; it’s an essential foundation for reform. Efforts to improve NEPA software have garnered bipartisan support from Congress. CEQ recently published a roadmap outlining important next steps to this end. In the report, CEQ explains that organized data would not only help guide development of better software but also foster broad efficiency in the NEPA process. In fact, CEQ even outlines the project components that would be most helpful to track (including unique ID numbers, level of review, document type, and project type).

Put simply, meshing this complex web of existing software systems into a single tracking database would be nearly impossible (not to mention expensive). Luckily, advances in large language models, like the ones used by NEPAccess and PNNL, offer a simpler and more effective solution. With properly formatted files of all NEPA documents in one place, a small team of software engineers could harness PolicyAI’s existing program to build a comprehensive analysis dashboard.
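To make this concrete, here is a minimal sketch of the kind of tracking schema CEQ’s recommended fields imply, assuming a simple relational store; the table and column names are illustrative, not CEQ’s official terms.

```python
# Minimal sketch of a NEPA tracking schema built around the project
# components CEQ recommends tracking (unique ID, level of review,
# document type, project type). Names are illustrative assumptions.
import sqlite3

conn = sqlite3.connect("nepa_reviews.db")
conn.execute("""
CREATE TABLE IF NOT EXISTS nepa_reviews (
    review_id       TEXT PRIMARY KEY,  -- unique ID number for the review
    lead_agency     TEXT NOT NULL,     -- e.g., 'Forest Service', 'BLM'
    level_of_review TEXT CHECK (level_of_review IN ('CE', 'EA', 'EIS')),
    document_type   TEXT,              -- e.g., 'Draft EIS', 'Final EA'
    project_type    TEXT,              -- e.g., 'wind farm', 'bridge'
    date_initiated  DATE,
    date_completed  DATE               -- NULL while the review is ongoing
)
""")
conn.commit()

# With start and end dates in one place, basic statistics that are
# currently elusive (like average review duration by level of review)
# reduce to a single query.
rows = conn.execute("""
SELECT level_of_review,
       AVG(julianday(date_completed) - julianday(date_initiated)) AS avg_days
FROM nepa_reviews
WHERE date_completed IS NOT NULL
GROUP BY level_of_review
""").fetchall()
```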

Plan of Action

The greatest obstacles to building an AI-powered tracking dashboard are accessing the NEPA documents themselves and organizing their contents to enable meaningful analysis. Although the administration could address the availability of these documents by compelling agencies to release them, inconsistencies in how they’re written and stored would still pose a challenge. That means building a tracking dashboard will require open, ongoing collaboration between technologists and agencies.

Conclusion

The stakes are high. With billions of dollars in federal climate and infrastructure investments on the line, a sluggish and opaque permitting process threatens to undermine national efforts to cut emissions. By embracing cutting-edge technology and prioritizing transparency, the next administration can not only reshape our understanding of the NEPA process but bolster its efficiency too.

This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.

PLEASE NOTE (February 2025): Since publication several government websites have been taken offline. We apologize for any broken links to once accessible public data.

Frequently Asked Questions
Why is it important to have more data about Environmental Assessments and Categorical Exclusions?

It’s estimated that only 1% of NEPA analyses are Environmental Impact Statements (EISs), while 5% are Environmental Assessments (EAs) and 94% are Categorical Exclusions (CEs). EISs cover the most complex and contentious projects, so relying on them alone paints an extremely narrow, and potentially misleading, picture of the true scope and effectiveness of NEPA reviews.


The vast majority of projects either undergo an EA or are afforded a CE, making these categories far more representative of the typical environmental review under NEPA. EAs and CEs often address smaller projects, like routine infrastructure improvements, which are critical to the nation’s broader environmental and economic goals. Ignoring these reviews means disregarding a significant portion of federal environmental decision-making; as a result, policymakers, agency staff, and the public are left with an incomplete view of NEPA’s efficiency and impact.

Using Home Energy Rebates to Support Market Transformation

Without market-shaping interventions, federal and state subsidies for energy-efficient products like heat pumps often lead to higher prices, leaving the overall market worse off when rebates end. This is a key challenge that must be addressed as the Department of Energy (DOE) and states implement the Inflation Reduction Act’s Home Electrification and Appliance Rebates (HEAR) program. 

DOE should prioritize the development of evidence-based market-transformation strategies that states can implement with their HEAR funding. The DOE should use its existing allocation of administrative funds to create a central capability to (1) develop market-shaping toolkits and an evidence base on how state programs can improve value for money and achieve market transformation and (2) provide market-shaping program implementation assistance to states.

There are proven market-transformation strategies that can reduce costs and save consumers billions of dollars. DOE can look to the global public health sector for an example of what market-shaping interventions could do for heat pumps and other energy-efficient technologies. In that arena, the Clinton Health Access Initiative (CHAI) has shown how public funding can support market-based transformation, leading to sustainably lower drug and vaccine prices, new types of “all-inclusive” contracts, and improved product quality. Agreements negotiated by CHAI and the Bill and Melinda Gates Foundation have generated over $4 billion in savings for publicly financed health systems and improved healthcare for hundreds of millions of people. 

Similar impact can be achieved in the market for heat pumps if DOE and states can supply information to empower consumers to purchase the most cost-effective products, offer higher rebates for those cost-effective products, and seek supplier discounts for heat pumps eligible for rebates. 

Challenge and Opportunity 

HEAR received $4.5 billion in appropriations from the Inflation Reduction Act and provides consumers with rebates to purchase and install high-efficiency electric appliances. Heat pumps, the primary eligible appliance, present a huge opportunity for lowering overall greenhouse gas emissions from heating and cooling, which makes up over 10% of global emissions. In the continental United States, studies have shown that heat pumps can reduce carbon emissions by up to 93% compared to gas furnaces across their lifetime.

However, direct-to-consumer rebate programs have been shown to enable suppliers to increase prices unless the subsidies are used to reward innovation and reduce cost. If subsidies are disbursed under a program design that is not aligned with a market-transformation strategy, the result will be a short-term boost in demand followed by a fall-off in consumer interest as prices increase and the rebates are no longer available. This is a problem because program funding will support only ~500,000 heat pump projects over the life of the program, while more than 50 million households will need to convert to heat pumps to decarbonize the sector; the rebates can directly reach roughly 1% of the need.

HEAR aims to address this through Market Transformation Plans, which states are required to submit to DOE within a year of receiving their awards and which must be approved by DOE before implementation. We see several challenges with the current implementation of HEAR.

Despite these challenges, DOE has a clear opportunity to increase the impact of HEAR rebates by providing program design support to states for market-transformation goals. To ensure a competitive market and better value for money, state programs need guidance on how to overcome barriers created by information asymmetry: HVAC contractors have a much better understanding of the technical and cost/benefit aspects of heat pumps than consumers do. Consumers cannot work with contractors to select a heat pump that represents the best value for money if they do not understand how technical performance and operating costs are affected by the Seasonal Energy Efficiency Rating (SEER), coefficient of performance, and utility rates. Currently, consumers lack easy access to such critical information, for example, the tradeoff between the higher up-front cost of a higher-SEER unit and the resulting savings on monthly utility bills. If consumers are not well-informed, market outcomes will not be efficient.
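To illustrate the kind of comparison consumers currently cannot make easily, here is a back-of-envelope sketch of the SEER tradeoff; the cooling load, electricity rate, and SEER values are assumptions chosen for the example, not program data.

```python
# Back-of-envelope sketch of the SEER/utility-bill tradeoff. All inputs
# (annual cooling load, electricity rate) are illustrative assumptions.

def annual_cooling_cost(load_btu_per_year: float, seer: float,
                        rate_per_kwh: float) -> float:
    """SEER is BTU of cooling delivered per watt-hour of electricity,
    so annual kWh = annual BTU load / SEER / 1000."""
    kwh = load_btu_per_year / seer / 1000
    return kwh * rate_per_kwh

# Example: a home with a 24 million BTU annual cooling load at $0.16/kWh.
LOAD, RATE = 24_000_000, 0.16
for seer in (14, 16, 18):
    print(f"SEER {seer}: ${annual_cooling_cost(LOAD, seer, RATE):,.0f}/year")
# SEER 14 costs ~$274/year versus ~$213/year at SEER 18. Whether that
# ~$60/year saving justifies a higher up-front price is exactly the
# comparison consumers need transparent data to make.
```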

Overcoming information asymmetry will also help lower soft costs, which is critical to lowering the overall cost of heat pumps. Studies conducted by New York State, the Solar Energy Industries Association, and DOE find that soft costs run over 60% of project costs in some cases and have increased over the past 10 years.

There is still time to act, as thus far only a few states have received approval to begin issuing rebates and state market-transformation plans are still in the early stages of development.

Plan of Action 

Recommendation 1. Establish a central market transformation team to provide resources and technical assistance to states.

To limit cost and complexity at the state level for designing and staffing market-transformation initiatives, the DOE should set up central resources and capabilities. This could either be done by a dedicated team within the Office of State and Community Energy Programs or through a national lab. Funding would come from the 3% of program funds that DOE is allowed to use for administration and technical assistance. 

Data collection, analysis, and consistent reporting are at the heart of what this central team could provide to states. The DOE data and tools requirements guide already asks states to provide information on the invoice, equipment and materials, and installation costs for each rebate transaction. It is critical that DOE and state programs coordinate on how to collect and structure this data so that it benefits consumers across all state programs.

A central team could provide resources and technical assistance to State Energy Offices (SEOs) on how to implement market-shaping strategies in a phased approach.

Phase 1. Create greater price transparency and set benchmarks for pricing on the most common products supported by rebates.

The central market-transformation team should provide technical support to states on how to develop benchmarking data on prices available to consumers for the most common product offerings. Consumers should be able to evaluate pricing for heat pumps like they do for major purchases such as cars, travel, or higher education. State programs could facilitate these comparisons by having rebate-eligible contractors and suppliers provide illustrative bids for a set of 5–10 common heat pump installation scenarios, for example, installing a ductless mini-split in a three-bedroom home.

States should also require contractors to provide hourly rates for different types of labor, since installation costs are often ~70% of total project costs. Contractors should only be designated as recommended or preferred service providers (with access to HEAR rebates) if they are willing to share cost data.

In addition, the central market-transformation team could facilitate information-sharing and data aggregation across states to limit confusion and duplication of data. This will increase price transparency and limit the work required at the state level to find price information and integrate with product technical performance data.

Phase 2. Encourage price and service-level competition among suppliers by providing consumers with information on how to judge value for money.

A second way to improve market outcomes is to promote competition. Price transparency supports this goal, but to achieve market transformation, programs need to go further and help consumers understand which products, specific to their circumstances, offer the best value for money.

In the case of a heat pump installation, this means taking account of fuel source, energy prices, house condition, and other factors that drive the overall value-for-money equation when pursuing improved energy efficiency. Again, information asymmetry is at play. Many energy-efficiency consultants and HVAC contractors offer to advise on these topics but have an inherent bias toward promoting their own products and services. And there are no easily available public sources of reliable benchmark price/performance data for ducted and ductless heat pumps for homes ranging from 1,500 to 2,700 square feet, a range that covers 75% of the single-family homes in the United States.

In contrast, the commercial building sector benefits from very detailed cost information published on virtually every type of building material and specialty trade procedure. Sources such as RSMeans provide pricing and unit cost information for ductwork and electrical wiring, as well as mean hourly wage rates for HVAC technicians by region. Builders of newly constructed single-family homes use similar systems to estimate and manage the costs of every aspect of the new construction process. But a homeowner seeking to retrofit a heat pump into an existing structure has none of this information. Since virtually all rebates are likely to fund retrofit installations, states and DOE have a unique interest in making this market more competitive by developing and publishing cost/performance benchmarking data.

State programs have considerable leverage that can be used to obtain the information needed from suppliers and installers. The central market-transformation team should use that information to create a tool that provides states and consumers with estimates of potential bill savings from installation of heat pumps in different regions and under different utility rates. This information would be very valuable to low- and middle-income (LMI) households, who are to receive most of the funding under HEAR.

Phase 3. Use the rebate program to lower costs and promote best-value products by negotiating product and service-level agreements with suppliers and contractors and awarding a higher level of rebate to installations that represent best value for money.

By subsidizing and consolidating demand, SEOs will have significant bargaining power to achieve fair prices for consumers.

First, by leveraging relationships with public and private sector stakeholders, SEOs can negotiate agreements with best-value contractors, offering guaranteed minimum volumes in return for discounted pricing and/or longer warranty periods for participating consumers. This is especially important for LMI households, whose limited home improvement budgets and disproportionately higher energy burdens help explain the limited uptake of heat pumps to date. In return, contractors gain access to a guaranteed number of additional projects that can offset the seasonal nature of the business.

Second, as states design the formulas used to distribute rebates, they should be encouraged to create systems that allocate a higher proportion of rebates to projects quoted at or below benchmark costs and a smaller proportion, or none at all, to projects quoted above the benchmark. This will incentivize contractors to offer better value for money, as most projects will not proceed unless they receive a substantial rebate. States should also adopt a process similar to New York’s and Wisconsin’s by creating a list of approved contractors that adhere to “reasonable price” thresholds.
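A benchmark-linked rebate formula of the kind described above could be as simple as the following sketch; the linear taper, the 120% cutoff, and the dollar amounts are hypothetical, not drawn from any state program or DOE guidance.

```python
# Hypothetical benchmark-linked rebate schedule: full rebate at or below
# the benchmark cost, tapering linearly to zero at 120% of the benchmark.

def rebate_amount(quote: float, benchmark: float, max_rebate: float) -> float:
    if quote <= benchmark:
        return max_rebate
    if quote >= 1.2 * benchmark:
        return 0.0
    # Linear taper between 100% and 120% of the benchmark cost.
    overage_fraction = (quote - benchmark) / (0.2 * benchmark)
    return max_rebate * (1 - overage_fraction)

# Against a $12,000 benchmark with an $8,000 maximum rebate:
print(rebate_amount(11_500, 12_000, 8_000))  # 8000.0 (full rebate)
print(rebate_amount(13_200, 12_000, 8_000))  # 4000.0 (half rebate)
print(rebate_amount(14_500, 12_000, 8_000))  # 0.0 (no rebate)
```

Because most projects will not proceed without a substantial rebate, a schedule like this pushes quotes toward the benchmark without fixing prices outright.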

Recommendation 2. For future energy rebate programs, Congress and DOE can make market transformation more central to program design. 

In future clean energy legislation, Congress should direct DOE to incorporate the principles recommended above into the design of energy rebate programs, whether implemented by DOE or states. Ideally, that direction would come with either greater funding for administration and technical assistance or dedicated funding for market-transformation activities in addition to the rebate program funding.

For future rebate programs, DOE could take market transformation a step further by establishing benchmarking data for “fair and reasonable” prices from the beginning and requiring that, as part of their applications, states must have service-level agreements in place to ensure that only contractors that are at or below ceiling prices are awarded rebates. Establishing this at the federal level will ensure consistency and adoption at the state level.

Conclusion

The DOE should prioritize funding evidence-based market-transformation strategies to increase the return on investment for rebate programs. Drawing on lessons from U.S.-funded global public health programs, DOE can apply a similar approach to the markets for energy-efficient appliances supported under the HEAR program. Market shaping can tip the balance toward more cost-effective and better-value products and prevent rebates from driving up prices. Successful market shaping will lead to sustained uptake of energy-efficient appliances by households across the country.


Frequently Asked Questions
Why are prices driven up by subsidies?

There is compelling evidence that federal and state subsidies for energy-efficient products can lead to price inflation, particularly in the clean energy space. The federal government has offered tax credits in the residential solar space for many years. While the ex-factory price of residential photovoltaic modules has fallen 64%, the total installed cost of residential solar has increased. Soft costs, including installation, have risen over the same period and now make up ~65% or more of total project costs.


In 2021, a National Bureau of Economic Research study of subsidies for cell phones in China linked consumer subsidies to firms charging higher prices. The researchers found that introducing competition for eligibility, through techniques such as commitments to price ceilings, mitigated price increases and in some cases even reduced prices, creating more consumer surplus. This research, along with the price increases observed after solar tax credits, shows the risk that government subsidies without market-shaping interventions will have detrimental long-term impacts.

In which contexts has market-shaping/transformation work succeeded in the global health sector?

CHAI has negotiated over 140 agreements for health commodities supplied to low- and middle-income countries (LMICs) with over 50 different companies. These market-shaping agreements have generated $4 billion in savings for health systems and touched millions of lives.


For example, CHAI collaborated with Duke University and Bristol Myers Squibb to combat hepatitis C, which affects 71 million people, 80% of whom are in LMICs, mostly in Southeast Asia and Africa [see footnote]. The approval in 2013 of two new antiviral drugs transformed treatment in high-income countries, but the drugs were not marketed or affordable in LMICs. Through its partnerships and programming, CHAI was able to achieve initial pricing of $500 per treatment course for LMICs. Prices fell over the next six years to under $60 per treatment course, while the cost in the West remained over $50,000 per treatment course. This was accomplished through ceiling price agreements and access programs with guaranteed volume considerations.


CHAI has also worked closely with the Bill and Melinda Gates Foundation to develop a novel market-shaping intervention called a volume guarantee (VG), in which a drug or diagnostic test supplier agrees to a price discount in exchange for guaranteed volume (backstopped by the guarantor if not achieved). Together, they negotiated a six-year fixed-price VG with Bayer and Merck for contraceptive implants that reduced the price by 53% for 40 million units, making family planning more accessible for millions and generating $500 million in procurement savings.
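As a back-of-envelope consistency check on those figures, treating the $500 million and 40 million units as round numbers:

```latex
\frac{\$500\text{M in savings}}{40\text{M units}} = \$12.50 \text{ saved per unit}, \qquad
0.53\,p \approx \$12.50 \;\Rightarrow\; p \approx \$23.60
```

That implies a pre-guarantee price of roughly $23.60 per unit and a guaranteed price of roughly $11.10, consistent with the 53% discount.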


Footnote: Hanafiah et al., “Global epidemiology of hepatitis C virus infection: New estimates of age-specific antibody to HCV seroprevalence,” J Hepatol (2013), 57(4): 1333–1342; Gower E, Estes C, Blach S, et al., “Global epidemiology and genotype distribution of the hepatitis C virus infection,” J Hepatol (2014), 61(1 Suppl): S45–S57; World Health Organization, Global Hepatitis Report 2017 (drawing on work by the London School of Hygiene and Tropical Medicine).

How are states implementing HEAR?

Many states are in the early stages of setting up the program, so they have not yet released their implementation plans. However, New York and Wisconsin indicate which contractors are eligible to receive rebates through approved contractor networks on their websites. Once a household applies for the program, they are put in touch with a contractor from the approved state network, which they are required to use if they want access to the rebate. Those contractors are approved based on completion of training and other basic requirements such as affirming that pricing will be “fair and reasonable.” Currently, there is no detail about specific price thresholds that suppliers need to meet (as an indication of value for money) to qualify.

How can states get benchmark data given the variation between homes for heat pump installation?

DOE’s Data and Tools Requirements document lays out the guidelines for states to receive federal funding for rebates. This includes transaction-level data that must be reported to the DOE monthly, including the specs of the home, the installation costs, and the equipment costs. Given that states already have to collect this data from contractors for reporting, this proposal recommends that SEOs streamline data collection and standardize it across all participating states, and then publish summary data so consumers can get an accurate sense of the range of prices.


There will be natural variation between homes, but by collecting a sufficient sample size and overlaying efficiency metrics like Seasonal Energy Efficiency Rating, Heating Seasonal Performance Factor, and coefficient of performance, states will be able to gauge value for money. Rewiring America and other nonprofits have software that can quickly make these calculations to help consumers understand the return on investment for higher-efficiency (and higher-cost) heat pumps given their location and current heating/cooling costs.
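As a sketch of how standardized transaction-level data could be rolled up into published benchmarks, consider the following; the column names and values are invented for illustration, and DOE’s actual reporting schema may differ.

```python
# Sketch: roll standardized rebate transactions up into published
# benchmarks per installation scenario and efficiency tier. Column
# names and values are invented for illustration.
import pandas as pd

transactions = pd.DataFrame({
    "state": ["NY", "NY", "WI", "WI", "NY"],
    "scenario": ["ductless_3br"] * 5,   # one of the 5-10 common scenarios
    "seer": [15, 18, 16, 18, 15],
    "installed_cost": [14_200, 16_800, 13_500, 15_900, 13_900],
})

# Publish median and interquartile range per scenario and SEER tier so a
# consumer can see where a contractor's quote falls relative to the market.
benchmarks = (
    transactions
    .groupby(["scenario", "seer"])["installed_cost"]
    .describe(percentiles=[0.25, 0.5, 0.75])
    [["count", "25%", "50%", "75%"]]
)
print(benchmarks)
```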

What impact would price transparency and benchmark data have?

In the global public health markets, CHAI has promoted price transparency for drugs and diagnostic tests by publishing market surveys that include product technical specifications and links to product performance studies. These surveys show the actual prices paid for similar products in different countries and by different procurement agencies. All this information has helped public health programs migrate to best-in-class products and improve value for money. States could do the same to empower consumers to choose best-in-class and best-value products and contractors.

Driving Product Model Development with the Technology Modernization Fund

The Technology Modernization Fund (TMF) currently funds multiyear technology projects to help agencies improve their service delivery. However, many agencies abdicate responsibility for project outcomes to vendors, lacking the internal leadership and project development teams necessary to apply a product model approach focused on user needs, starting small, learning what works, and making adjustments as needed. 

To promote better outcomes, TMF could make three key changes to help agencies shift from simply purchasing static software to acquiring ongoing capabilities that can meet their long-term mission needs: (1) provide education and training to help agencies adopt the product model; (2) evaluate investments based on their use of effective product management and development practices; and (3) fund the staff necessary to deliver true modernization capacity. 

Challenge and Opportunity

Technology modernization is a continual process of addressing unmet needs, not a one-time effort with a defined start and end. Too often, when agencies attempt to modernize, they purchase “static” software, treating it like any other commodity, such as computers or cars. But software is fundamentally different. It must continuously evolve to keep up with changing policies, security demands, and customer needs. 

Presently, agencies tend to rely on available procurement, contracting, and project management staff to lead technology projects. However, it is not enough to focus on the art of getting things done (project management); it is also critically important to understand the art of deciding what to do (product management). A product manager is empowered to make real-time decisions on priorities and features, including deciding what not to do, to ensure the final product effectively meets user needs. Without this role, development teams typically march through a vast, undifferentiated, unprioritized list of requirements, which is how information technology (IT) projects result in unwieldy failures. 

By contrast, the product model fosters a continuous cycle of improvement, essential for effective technology modernization. It empowers a small initial team with the right skills to conduct discovery sprints, engage users from the outset and throughout the process, and continuously develop, improve, and deliver value. This approach is ultimately more cost effective, results in continuously updated and effective software, and better meets user needs.

However, transitioning to the product model is challenging. Agencies need more than just infrastructure and tools to support seamless deployment and continuous software updates – they also need the right people and training. A lean team of product managers, user researchers, and service designers who will shape the effort from the outset can have an enormous impact on reducing costs and improving the effectiveness of eventual vendor contracts. Program and agency leaders, who truly understand the policy and operational context, may also require training to serve effectively as “product owners.” In this role, they work closely with experienced product managers to craft and bring to life a compelling product vision. 

These internal capacity investments are not expensive relative to the cost of traditional IT projects in government, but they are currently hard to make. Placing greater emphasis on building internal product management capacity will enable the government to more effectively tackle the root causes that lead to legacy systems becoming problematic in the first place. By developing this capacity, agencies can avoid future costly and ineffective “modernization” efforts.

Plan of Action

The General Services Administration’s Technology Modernization Fund plays a crucial role in helping government agencies transition from outdated legacy systems to modern, secure, and efficient technologies, strengthening the government’s ability to serve the public. However, changes to TMF’s strategy, policy, and practice could incentivize the broader adoption of product model approaches and make its investments more impactful.

The TMF should shift from investments in high-cost, static technologies that will not evolve to meet future needs towards supporting the development of product model capabilities within agencies. This requires a combination of skilled personnel, technology, and user-centered approaches. Success should be measured not just by direct savings in technology but by broader efficiencies, such as improvements in operational effectiveness, reductions in administrative burdens, and enhanced service delivery to users.

While successful investments may result in lower costs, the primary goal should be to deliver greater value by helping agencies better fulfill their missions. Ultimately, these changes will strengthen agency resilience, enabling them to adapt, scale, and respond more effectively to new challenges and conditions.

Recommendation 1. The Technology Modernization Board, responsible for evaluating proposals, should: 

  1. Assess future investments based on the applicant’s demonstrated competencies and capacities in product ownership and management, as well as their commitment to developing these capabilities. This includes assessing proposed staffing models to ensure the right teams are in place.
  2. Expand assessment criteria for active and completed projects beyond cost savings, to include measurements of improved mission delivery, operational efficiencies, resilience, and adaptability. 

Recommendation 2. The TMF Program Management Office, responsible for stewarding investments from start to finish, should: 

  1. Educate and train agencies applying for funds on how to adopt and sustain the product model. 
  2. Work with the General Services Administration’s 18F to incorporate TMF project successes and lessons learned into a continuously updated product model playbook for government agencies that includes guidance on the key roles and responsibilities needed to successfully own and manage products in government.
  3. Collaborate with the Office of Personnel Management (OPM) to ensure that agencies have efficient and expedited pathways for acquiring the necessary talent, utilizing appropriate assessments to identify and onboard skilled individuals. 

Recommendation 3. Congress should: 

  1. Encourage agencies to set up their own working capital funds under the authorities outlined in the TMF legislation. 
  2. Explore the barriers to product model funding in the current budgeting and appropriations processes for the federal government as a whole and develop proposals for fitting them to purpose. 
  3. Direct OPM to reduce procedural barriers that hinder swift and effective hiring. 

Conclusion 

The TMF should leverage its mandate to shift agencies towards a capabilities-first mindset. Changing how the program educates, funds, and assesses agencies will build internal capacity and deliver continuous improvement. This approach will lead to better outcomes, both in the near and long terms, by empowering agencies to adapt and evolve their capabilities to meet future challenges effectively.


Frequently Asked Questions
What is the Technology Modernization Fund and what does it do?

Congress established TMF in 2018 “to improve information technology, and to enhance cybersecurity across the federal government” through multiyear technology projects. Since then, more than $1 billion has been invested through the fund across dozens of federal agencies in four priority areas.

Why is the TMF uniquely positioned to lead product management adoption across the federal government?

The TMF represents an innovative funding model that offers agencies resource flexibility outside the traditional budget cycle for priority IT projects. The TMF team can leverage agency demand for its support to shape not only what projects agencies pursue but how they do them. Through the ongoing demonstration of successful product-driven projects, the TMF can drive momentum toward making the product model approach standard practice within agencies.

Introducing Certification of Technical Necessity for Resumption of Nuclear Explosive Testing

The United States currently observes a voluntary moratorium on explosive nuclear weapons testing. At the same time, the National Nuclear Security Administration (NNSA) is required by law to maintain the capability to conduct an underground nuclear explosive test at the Nevada National Security Site, if directed to do so by the U.S. president. 

Restarting U.S. nuclear weapons testing would have very negative security implications for the United States unless it were determined to be an absolute technical or security necessity. A restart of U.S. testing for any reason could open the door for China, Russia, Pakistan, and India to do the same, and would make it even harder to condemn North Korea for its testing program. This would have significant security consequences for the United States as well as global environmental impacts.

The United States conducted over 1,000 nuclear weapons tests before its testing moratorium took effect in 1992. It did so with the world’s most advanced diagnostic and data collection equipment, which enabled the United States to conduct advanced computer simulations after the end of testing. Neither Russia nor China conducted as many tests, and far fewer of their tests yielded such advanced data, hampering those countries’ ability to match American simulation capabilities. Enabling Russia and China to resume testing could narrow the technical advantage the United States has held in testing data since the moratorium began.

Aside from the security loss, nuclear testing would also have long-lasting radiological effects at the test site itself, including radiation contamination in soil and groundwater and the chance of venting into the atmosphere. Despite these downsides, a future president has the legal authority, for political or other reasons, to order a resumption of nuclear testing. Making any such decision more democratic and subject to broader political accountability would require a more integrated approval process grounded in scientific or security needs. To this end, Congress should pass legislation requiring the NNSA administrator to certify that an explosive nuclear test is technically necessary to rectify an existing problem or doubt in U.S. nuclear surety before a test can be conducted.

Challenges and Opportunities

The United States is party to the 1963 Limited Test Ban Treaty, which prohibits atmospheric tests, and the 1974 Threshold Test Ban Treaty, which limits underground tests to yields of no more than 150 kilotons. In 1992, the United States also established a legal moratorium on nuclear testing through the Hatfield-Exon-Mitchell Amendment, passed during the George H.W. Bush Administration. After extending this moratorium in 1993, the United States, along with Russia and China, signed the Comprehensive Nuclear Test Ban Treaty (CTBT) in 1996, which prohibits nuclear explosions. However, the treaty has not entered into force because a number of Annex 2 (nuclear-capable) states, including the United States and China, have never ratified it.

Since halting nuclear explosive tests in 1992, the United States has benefited from a comparative advantage over other nuclear-armed states, given its advanced simulation and computing technologies coupled with extensive data collected from over 1,000 explosive nuclear tests across nearly five decades. The NNSA’s Stockpile Stewardship Program uses computer simulations to combine new scientific research with data from past nuclear explosive tests to assess the reliability, safety, and security of the U.S. stockpile without returning to nuclear explosive testing. Congress has mandated that the NNSA provide a yearly report on the reliability of the nuclear weapons stockpile to the Nuclear Weapons Council, which reports to the president. The NNSA also maintains the capability to test nuclear weapons at the Nevada Test Site, as directed by President Clinton in Presidential Decision Directive 15 (PDD-15). National Security Memorandum 7 requires the NNSA to be able to conduct an underground explosive test with limited diagnostics within 36 months, but the NNSA has asserted in its Stockpile Stewardship and Management Plan that domestic and international laws and regulations could slow this timeline. A 2011 report to Congress from the Department of Energy stated that a small test conducted for political reasons could take only 6–10 months.

For the past 27 years, the NNSA administrator and the three directors of the national laboratories have annually certified—following a lengthy assessment process—that “there is no technical reason to conduct nuclear explosive testing.” Now, some figures, including former President Trump’s National Security Advisor, have called for a resumption of U.S. nuclear testing for political reasons. Specifically, testing advocates suggest—despite a lack of technical justification—that a return to testing is necessary in order to maintain the reliability of the U.S. nuclear stockpile and to intimidate China and other adversaries at the bargaining table. 

A 2003 study by Sandia National Laboratories found that conducting an underground nuclear test would cost between $76 million and $84 million in then-year dollars, approximately $132 million to $146 million today. In addition to financial cost, explosive nuclear testing could also be costly to both humans and the environment even if conducted underground. For example, at least 32 underground tests performed at the Nevada Test Site were found to have released considerable quantities of radionuclides into the atmosphere through venting. Underground testing can also lead to contamination of land and groundwater. One of the most significant impacts of nuclear testing in the United States is the disproportionately high rate of thyroid cancer in Nevada and surrounding states due to radioactive contamination of the environment.
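A quick consistency check on that conversion: the then-year and present-day ranges differ by a cumulative inflation factor of roughly 1.74, i.e.

```latex
\$76\text{M} \times 1.74 \approx \$132\text{M}, \qquad
\$84\text{M} \times 1.74 \approx \$146\text{M}
```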

In addition to health and environmental concerns, the resumption of nuclear tests in the United States would likely trigger nuclear testing by other states—all of which would have comparatively more to gain and learn from testing. When the CTBT was signed, the United States had already conducted far more nuclear tests than China or Russia with better technology to collect data, including fiber optic cables and supercomputers. A return to nuclear testing would also weaken international norms on nonproliferation and, rather than coerce adversaries into a preferred course of action, likely instigate more aggressive behavior and heightened tensions in response.

Plan of Action

In order to ensure that, if resumed, explosive nuclear testing is done for technical rather than political reasons, Congress should amend existing legislation to implement checks and balances on the president’s ability to order such a resumption. Per section 2530 of title 50 of the United States Code, “No underground test of nuclear weapons may be conducted by the United States after September 30, 1996, unless a foreign state conducts a nuclear test after this date, at which time the prohibition on United States nuclear testing is lifted.” Congress should amend this legislation to stipulate that, prior to any nuclear test being conducted, the NNSA administrator must first certify that the objectives of the test cannot be achieved through simulation and are important enough to warrant an end to the moratorium. A new certification should be required for every individual test, and the amendment should require that the certification be provided as a publicly available, unclassified report to Congress, in addition to a classified report. In the absence of such an amendment, the president should issue a Presidential Decision Directive requiring, before any nuclear test is conducted, both a certification by the NNSA administrator and a public hearing under oath affirming that the same results cannot be achieved through scientific simulation.

Conclusion

The United States should continue its voluntary moratorium on all types of explosive nuclear weapons tests and implement further checks on the president’s ability to call for a resumption of nuclear testing.


Protecting U.S. Critical Infrastructure with Resilience Caches of Reusable Respirators

To help protect U.S. critical infrastructure workers from future pandemics and other biological threats, the next presidential administration should use the federal government’s grantmaking power to ensure ample supplies of high-quality respiratory personal protective equipment (PPE). The administration can take five concrete actions:

  1. The Office of Pandemic Preparedness and Response Policy (OPPR) can coordinate requirements for federal agencies and recipients of federal emergency/disaster preparedness funding to maintain access to at least one reusable respirator per critical employee.
  2. The Department of Labor’s Occupational Safety and Health Administration (OSHA) can initiate an occupational safety rule on reusable respirator resilience caches.
  3. The Department of Health and Human Services’ Administration for Strategic Preparedness and Response (ASPR) can require PPE manufacturers receiving federal funding to demonstrate their robustness to extreme pandemics.
  4. ASPR’s Strategic National Stockpile can start stockpiling reusable respirators.
  5. The Federal Emergency Management Agency (FEMA) can leverage its public outreach experience to increase “peacetime” adoption of reusable respirators.

These actions would complete the Biden Administration’s existing portfolio of efforts to reduce the likelihood of dangerous PPE shortages in the future, reaffirming executive commitment to protecting vulnerable workers, building a resilient national supply chain, and encouraging innovation.

Challenge and Opportunity

The next pandemic could strike at any time, and our PPE supply chain is not ready. Experts predict that the chance of a severe natural epidemic could perhaps triple in the next few decades, and advances in synthetic biology are increasing the risk of deliberate biological threats. As the world witnessed in 2020, disposable PPE can quickly become scarce in a crisis. Inadequate stockpiles left millions of workers with insufficient access to respiratory protection and often higher death rates than the general public—especially the critical infrastructure workers who operate the supply chains for our food, healthcare, public safety, and other essential goods and services. In future pandemics, which could have a 4% to 11%+ chance of occurring in the next 20 years based on historical extrapolations, PPE shortages could cause unnecessary infections, deaths, and burnout among critical infrastructure workers.

Figure 1. Notional figure from Blueprint Biosecurity’s Next-Gen PPE Blueprint demonstrating the need for stockpiling PPE in advance of future pandemics.

Recognizing the vulnerability of our PPE supply chain to future pandemics, Section 3.3 of the National Biodefense Strategy and Implementation Plan directs the federal government to:

Establish resilient and scalable supply and manufacturing capabilities for PPE in the United States that can: (a) enable a containment response for; and (b) meet U.S. peak projected demand for healthcare and other essential critical infrastructure workers during a nationally or internationally significant biological incident.

At a high level, securing the supply of PPE during crises is already understood as a national priority. However, despite the federal government’s past efforts to invest in domestic PPE manufacturing, production capacity will still take time to ramp up in future scenarios, and our current stockpiles aren’t large enough to bridge that gap. Some illustrative math: there are approximately 50 million essential workers in the United States, but as of 2022 our Strategic National Stockpile only had about 540 million disposable N95 respirators. This is barely enough to last 10 days, assuming each worker uses only one per day. (One per day is even a stretch: extended use and reuse of disposable N95s often leads to air leakage around wearers’ faces.) State- and local-level stockpiles may help, but many states have already started jettisoning their PPE stocks as purchases from 2020 expire and paying for storage becomes harder to justify. PPE shortages may happen again.
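Spelling out that illustrative math with the memo’s own numbers:

```latex
\frac{540 \text{ million N95s}}{50 \text{ million workers} \times 1 \text{ mask per worker-day}}
\approx 10.8 \text{ days of coverage}
```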

Fortunately, there is existing technology that can reduce the likelihood of shortages while also protecting workers better and reducing costs: reusable respirators, like elastomeric half-mask respirators (EHMRs). 

A single EHMR typically costs between $20 and $40. While the up-front cost of an EHMR is higher than the ~$1 cost of a disposable N95, a single EHMR can reliably last a worker for thousands of shifts over the entirety of a pandemic. Compared to disposable N95s, EHMRs are also better at protecting workers from infection, and workers prefer them to disposable N95s in risky environments. EHMR facepieces often have a 10-year shelf life, and filter cartridges typically have the same five-year shelf life as a typical disposable N95. A supply of EHMRs also takes up an estimated 1.5% of the warehouse space of an equivalent supply of disposable N95s.

Figure 2. The relative size of equivalent PPE stockpiles. Source: Blueprint Biosecurity’s Next-Gen PPE Blueprint.

Some previous drawbacks of EHMRs were their lack of filtration for exhaled air and the unclear efficacy of disinfecting them between uses. Both of those problems are on their way to being solved. The newest generation of EHMRs on the market (products like the Dentec Comfort-Air Nx and the ElastoMaskPro) provide filtration on both inhalation and exhalation, and initial results from ongoing studies presented by the National Institute for Occupational Safety and Health (NIOSH) have demonstrated that they can be safely disinfected. (Product links are for illustrative purposes, not endorsement.)

Establishing stable demand for the newest generation of EHMRs could drive additional innovation in product design or material use. This innovation could further reduce worker infection rates by eliminating the need for respirator fit testing, improving comfort and communication, and enabling self-disinfection. It could also increase the number of critical infrastructure workers coverable with a fixed stockpile budget by increasing shelf lives and reducing cost per unit. Making reusable respirators more protective, ergonomic, and storable would improve the number of lives they are able to save in future pandemics while lowering costs. For further information on EHMRs, the National Academies has published studies that explore the benefits of reusable respirators.

The next administration, led by the new OPPR, can require critical infrastructure operators that receive federal emergency/disaster preparedness funding to maintain resilience caches of at least one reusable respirator per critical infrastructure worker in their workplaces—enough to protect those workers during future pandemics.

These resilience caches would have two key benefits:

  1. Because many U.S. critical infrastructure operators, from healthcare to electricity providers, receive federal emergency preparedness funds, these requirements would bolster our nation’s mission-critical functions against pandemics or other inhalation hazards like wildfire smoke. At the same time, the requirements would be tied to a source of funding that could be used to meet them. 
  2. By creating large, sustainable private-sector demand for domestic respirators, these requirements would help substantially grow the domestic industrial base for PPE manufacturing, without relying on future warm-basing payments like those that Congress recently rescinded.

By taking action, the next administration has an opportunity to reduce the future burden on taxpayers and the federal government, help keep workers safe, and increase the robustness of domestic critical infrastructure.

Plan of Action

Recommendation 1. Require federal agencies and recipients of federal emergency/disaster preparedness funding to maintain access to at least one reusable respirator per critical employee.

OPPR can coordinate a process to define the minimum target product profile of reusable respirators that employers must procure. To incentivize continual respirator innovation, OPPR’s process can regularly raise the minimum performance standards of PPE in these resilience caches, and these standards could be published alongside regular PPE demand forecasts. As products expire every 5 or 10 years, employers would be required to procure replacements that meet the new, higher standard.

OPPR can also convene representatives from each agency that administers emergency/disaster preparedness funding programs to critical infrastructure sectors and align those agencies on common language for these requirements.

The Cybersecurity and Infrastructure Security Agency (CISA) within the Department of Homeland Security (DHS) can update its definition of essential workers and set guidelines for which employees would need a reusable respirator.

FEMA’s Office of National Continuity Programs can recommend reusable respirator stocks for critical staff at federal departments and agencies, and the Centers for Medicare and Medicaid Services (CMS) can also set a requirement for healthcare facilities as a condition of participation for receiving Medicare reimbursement.

Recommendation 2. Initiate an occupational safety rule on reusable respirator resilience caches.

To cover any critical infrastructure workplaces that are not affected by the requirements in Recommendation 1, OSHA can also require employers to maintain these resilience caches. This provision could be incorporated into a broader rule on pandemic preparedness, as a former OSHA director has suggested.

OSHA should also develop preemptive guidance on the scenarios in which it would likely relax its other rules. In normal times, employers are usually required to implement a full, costly Respiratory Protection Program (RPP) whenever they hand an employee an EHMR. An RPP typically includes complex, time-consuming steps like medical evaluations that may impede PPE access in crises. OSHA already has experience relaxing RPP rules in pandemics, and preemptive guidance on when those rules might be relaxed in the future would help employers better understand possible regulations around using their resilience caches.

Recommendation 3. Require PPE manufacturers receiving federal funding to demonstrate their robustness to extreme pandemics.

The DHS pandemic response plan notes that workplace absenteeism rates during extreme pandemics are projected to reach 40%.

U.S. PPE manufacturers supported by federal industrial base expansion programs, such as the investments managed by ASPR, should be required to demonstrate that they can remain operational in extreme conditions in order to continue receiving funding. 

To demonstrate their pandemic preparedness, these manufacturers should have concrete plans in place for remaining operational under such conditions.

Recommendation 4. Start stockpiling reusable respirators in the Strategic National Stockpile.

Inside ASPR, the Strategic National Stockpile should ensure that the majority of its new PPE purchases are for reusable respirators, not disposable N95s. The stockpile can also encourage further innovation by making advance market commitments for next-generation reusable respirators.

Recommendation 5. Leverage FEMA’s public outreach experience to increase “peacetime” adoption of reusable respirators.

To complement work on growing reusable respirator stockpiles and hardening manufacturing, FEMA can also help familiarize the workforce with these products in advance of a crisis. FEMA can use Ready.gov to encourage the general public to adopt reusable respirators in household emergency preparedness kits. It can also develop partnerships with professional groups like the American Industrial Hygiene Association (AIHA) or the Association for Health Care Resource & Materials Management (AHRMM) to introduce workers to reusable respirators and instruct them in their use cases during both business as usual and crises.

Conclusion

Given the high and growing risk of another pandemic, ensuring that we have an ample supply of highly protective respiratory PPE should be a national priority. With new reusable respirators hitting the market, the momentum around pandemic preparedness after the COVID-19 pandemic, and a clear opportunity to reaffirm prior commitments, the time is ripe for the next administration to make sure our workers are safe the next time a pandemic strikes.


Frequently Asked Questions
Are critical infrastructure workers interested in using reusable respirators?

Yes. Throughout COVID-19, critical infrastructure worker unions have consistently advocated for EHMRs. Examples include the New York State Nurses Association (NYSNA) and SEIU Local 121RN.


Unions have also been at the forefront of broader calls for securing PPE access in future pandemics; the California Nurses Association was the driving force behind California’s most recent PPE stockpiling laws.


Studies have also shown that workers prefer reusable respirators to disposable N95s in risky environments.

Will these requirements increase costs for employers?

Not in the long run. As a single employee infection can cost $340 per day, it is more cost-effective for most employers to spend around $3 per critical employee per year for reusable respirators. For hospitals in states like California or New York, which mandate one- to three-month PPE stockpiles, switching those stockpiles to reusable respirators would likely be cost-saving, as demonstrated by past case studies. Most of these hospitals are still meeting those requirements with disposable N95s largely because of slow product choice re-evaluation cycles.
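A back-of-the-envelope comparison makes the break-even point explicit. The sketch below is purely illustrative: it reuses the per-day infection cost and per-employee cache cost cited above, while the workforce size is a hypothetical assumption.

```python
# Back-of-the-envelope employer cost comparison (illustrative only;
# the two dollar figures come from the estimates cited in this memo,
# and the workforce size is a hypothetical assumption).
INFECTION_COST_PER_DAY = 340   # $ lost per infected-employee day
CACHE_COST_PER_EMPLOYEE = 3    # $ per critical employee per year for EHMRs

employees = 100                # hypothetical critical workforce
annual_cache_cost = employees * CACHE_COST_PER_EMPLOYEE

# Avoided infection-days needed each year to pay for the cache
break_even_days = annual_cache_cost / INFECTION_COST_PER_DAY
print(f"Annual cache cost for {employees} employees: ${annual_cache_cost}")
print(f"Break-even: ~{break_even_days:.1f} avoided infection-days per year")
# Avoiding roughly one infection-day per year already covers the cache.
```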


Managing these resilience caches would also pose a minimal burden on employers. Most EHMRs can be comfortably stored in most indoor workplaces, taking up around the volume of a large coffee mug for each employee. Small workplaces with fewer than 50 employees could likely fit their entire resilience cache in a cardboard box in a back closet, and large workplaces will likely already have systems for managing emergency products that expire, like AEDs, first-aid kits, and fire extinguishers. As with other consumables like printer ink cartridges, PPE manufacturers can send reminders to employers when the products they purchased are about to expire.


To put this into perspective, fire alarm units should generally be replaced every 10 years at $20 to $30 each, and typically require new batteries once or twice per year. We readily accept the burden and minor cost of fire alarm maintenance, even though all U.S. fire deaths over the last 10 years amount to only 3% of COVID-19's U.S. death toll.

What about workers who can’t wear EHMRs?

While EHMRs fit most workers, there may be some workers who aren’t able to wear them due to religious norms or assistive devices.


Those workers can instead wear another type of reusable respirator: powered air-purifying respirators (PAPRs). While PAPRs are even more effective than EHMRs at keeping workers safe, they cost significantly more and can be very loud. Employers and government stockpiles can include a small number of PAPRs for workers who can't wear EHMRs, and can encourage eventual cost reductions and user-experience improvements with advance market commitments and incremental increases in procurement standards.

Aren’t physical stockpiles inefficient?

Yes, but they reduce the risk of any lags in PPE access. Every day that workers are exposed to pathogens without adequate PPE, their likelihood of infection goes up. Any unnecessary exposures speed the spread of the pandemic. Also, PPE manufacturing ramp-up could be slowed by employee absences due to infection or caring for infected loved ones.


To accommodate some employers’ reluctance to build physical stockpiles, the administration can enable employers to satisfy the resilience cache requirement in multiple ways, such as:



  • On-site resilience caches in their workplaces

  • Agreements with distributors to manage resilience cache inventory as a rotating supply bubble

  • Agreements with third-party resilience cache managers

  • Purchase options with manufacturers that have demonstrated enough capacity to rapidly manufacture resilience cache inventory at the start of a pandemic


Purchase options would function like a "virtual" resilience cache: they would incentivize manufacturers to build extra warm-base surge capacity and test their ability to rapidly ramp up manufacturing pace. However, they would increase the risk that workers are exposed to infectious disease hazards before their PPE arrives, especially in a severe pandemic, when logistics systems could be disrupted.

Would these resilience caches be usable for any other hazards besides pandemics?

Yes. Employers could use respirators from their resilience cache to protect workers from localized incidents like seasonal flu outbreaks, wildfires, or smog days, and put them back into storage when they’re no longer needed.

Could this be expanded to other pandemic hardening activities, beyond PPE?

Yes. Federal emergency/disaster preparedness funding could be tied to other requirements, like:



  • Installing the capacity to turn up workplace air ventilation or filtration significantly

  • Maintaining and regularly exercising pandemic response protocols

  • Investing in passive transmission suppression technology (e.g., germicidal ultraviolet light)

How might these requirements affect post-emergency PPE spending?

The next time there’s a pandemic, having these requirements in place could help ensure that any post-emergency funding (e.g., Hazard Mitigation Assistance Program grants) will be spent on innovative PPE that aligns with the federal government’s broader PPE supply chain strategies.


If the Strategic National Stockpile receives additional post-emergency funding from Congress, it could also align its purchases with the target product profiles that critical infrastructure operators are already procuring against.

Scaling Proven IT Modernization Strategies Across the Federal Government

Ten years after the creation of the U.S. Digital Service (USDS) and 18F (an organization within the General Services Administration that helps other government agencies build, buy, and share technology products), the federal government still struggles to buy, build, and operate technology in a speedy, modern, scalable way. Cybersecurity remains a continuous challenge, in part due to the lack of modernization of legacy technology systems. As data fuels the next transformative phase, the federal government has an opportunity to leverage modern practices to leap forward in scaling IT modernization.

While there have been success stories, like the IRS's Direct File tool and electronic passport renewal, most government technology and delivery practices remain antiquated, and the replacement process remains too slow. Many obstacles to modernization have been removed in theory, yet in practice Chief Information Officers (CIOs) still struggle to exercise their authority to achieve meaningful results. Procurement and hiring processes, as well as insufficient modernization budgets, remain barriers.

The DoD failed to modernize its 25-year-old Defense Travel System (DTS) after spending $374 million, while the IRS relies on hundreds of outdated systems, including a key taxpayer data processing system built in the 1960s, with full replacement not expected until 2030. The GAO identified 10 critical systems across various agencies, ranging from 8 to 51 years old, that provide essential services like emergency management, health care, and defense and cost $337 million annually to operate and maintain; many of these systems use outdated code and unsupported hardware, posing major security and reliability risks. Despite the establishment of the Technology Modernization Fund (TMF) with a $1.23 billion appropriation, most TMF funds have been expended on a small number of programs, many of which did not solve legacy modernization problems. Meanwhile, the urgency of modernizing antiquated legacy systems to prevent service breakdowns continues to increase.

This memo proposes a new effort to rapidly scale proven IT modernization strategies across the federal government. The result will be a federal government with the structure and culture in place to buy, build, and deliver technology that meets the needs of Americans today and into the future. 

Challenge and Opportunity 

Government administrations typically arrive with a significant policy agenda and a limited management agenda. The management agenda often receives minimal focus until the policy agenda is firmly underway; as a result, it is rarely well implemented, if it is implemented at all. There are signs of progress in this area: the Biden-Harris Administration published its management agenda in its first year, while the Trump Administration did not publish one until its second year. However, even when the management agenda is published early, alignment, accountability, and senior White House and departmental leadership focus on the management agenda are far weaker than for the policy agenda.

Even when a President's Management Agenda (PMA) has been published and alignment is achieved among stakeholders within the Executive Office of the President (EOP), the PMA is simply not a priority for departmental and agency leadership, and there is little focus on it among Secretaries and Administrators. Each department and agency is responsible for a policy agenda, and unless IT or other management agenda items are core to the delivery of that policy agenda, as at the VA, departmental political leadership pays little attention to the PMA or related activities such as IT and procurement.

An administration's failure to implement a management agenda and improve government operations jeopardizes the success of that administration's policy agenda, as poor government technology inhibits successful implementation of many policies. This has been clear during the Biden-Harris administration, as departments have struggled to rapidly deliver IT systems to support loan, grant, and tax programs, sometimes delaying or slowing the implementation of those programs.

The federal government as a whole spends about 80% of its IT budget on maintenance of outdated systems—a percentage that is increasing, not declining. Successful innovations in federal technology and service delivery have not scaled, leaving pockets of success throughout the government that are constantly at risk of disappearing with changes in staff or leadership. 

The Obama administration created USDS and 18F/Technology Transformation Services (TTS) to begin addressing the federal government’s technology problems through improved adoption of modern Digital Services. The Trump administration created the Office of American Innovation (OAI) to further advance government technology management. As adoption of AI accelerates, it becomes even more imperative for the federal government to close the technology gap between where we are and where we need to be to provide the government services that the American people deserve. 

The Biden administration has adapted IT modernization efforts to the pivot toward AI by having groups like USDS, 18F/TTS, and DoD software factories increasingly focus on data adoption and AI. With the Executive Order on AI and the consortium dedicated to AI safety, the Biden-Harris administration is establishing guidelines to adopt and properly govern data and AI. These are positive highlights for IT modernization, but these efforts need to deliver real productivity. Citizens' expectations continue to rise: services that take months should take weeks, weeks should take days, and days should take hours. That level of improvement cannot be reached across the majority of government services until modernization occurs at scale. While multiple laws designed to enhance CIO authorities and accelerate digital transformation have been passed in recent years, departmental CIOs still do not have the tools to drive change, especially in large, federated departments where CIOs do not have substantial budget authority.

As the government's digital transformation pivots to data, modernized agencies and departments can leap forward, while others remain stuck with antiquated systems, unable to derive value from data. For more digitally mature agencies and departments, the pivot to data-driven decisions, automation, and AI offers the best chance for a leap in productivity and quality gains. AI will fuel the next opportunity to leap forward by shifting focus from the process of delivering digital services (as those become norms) to the data-based insights those services ingest and create. For the agencies and departments "left behind," data-driven decisions, automation, and AI could drive rapid transformation and provide new tools for legacy system modernization.

The Department of Energy's "Scaling IT Modernization Playbook" offers key approaches to scaling IT modernization: prioritizing mission outcomes, driving data adoption, coordinating at scale across government, and valuing speed and agility because "we underrate speed as value." Government operations have become too complacent with slow processes and modernization; we are increasingly outpaced by faster-developing innovations. Essentially, Moore's Law (Gordon Moore's observation that the number of transistors in an integrated circuit doubles roughly every two years at little added cost, an observation since generalized to many advanced technologies) is outpacing successful policy implementation.

As a result, the government and the public continue to struggle with dysfunctional legacy systems that make government services difficult to use under normal circumstances and can be crippling in a crisis. The solution is to boldly and rapidly scale emerging modernization efforts across the federal enterprise, embracing the opportunity of data- and AI-fueled transformation.

Some departments have delivered notably successful modern systems, such as DHS's Global Entry site and the State Department's online passport renewal service. While these solutions are clearly less complex than the IRS's tax processing system, which the IRS has struggled to modernize, they demonstrate that the government can deliver modern digital services under the right conditions.

Until the management and leadership practices associated with modern delivery are rapidly adopted at scale across government, and efforts and programs are retained between administrations, failed technology implementation and modernization will continue to produce failed policy implementation.

Plan of Action 

Recommendation 1. Prioritize Policy Delivery through the Office of Management and Budget (OMB) and the General Services Administration (GSA) 

First, the Administration should elevate the position of Federal CIO to be a peer of the Deputy Directors at OMB and move the Federal CIO outside of OMB, while remaining within the Executive Office of the President, to ensure that the Federal CIO, and therefore the IT and cybersecurity priorities and needs of the departments and agencies, has a true seat at the table. The Federal CIO represents positions that are as important as, but different from, those of the OMB Deputy Directors and the National Security Advisor, and should therefore be their peer, just as CIOs are within departments and agencies, where they are required to report to the Secretary or Administrator.

Second, the Administration should elevate the role of the GSA Administrator to a Cabinet-level position and formally recognize GSA as the federal government's "Operations & Implementation" agency. These actions will effectively make the GSA Administrator the federal government's Chief Operating Officer (COO). Policy, financial oversight, and governance will remain the purview of OMB. Operations and implementation will become the responsibility of GSA, aligning existing GSA authorities over TTS, quality proven shared services, acquisitions, and asset management with a renewed focus on mission-centric government-service delivery. The GSA Administrator will collaborate with the President's Management Council (PMC), OMB, and agency-level CIOs to align policy delivery strategy with delivery responsibility, integrating existing modernization and transformation efforts from the GSA Project Management Office (PMO) toward a common mission prioritizing rapid transformation.

For the government to improve government services, it needs high-level leaders charged with prioritizing operations and implementation—as a COO does for a commercial organization. Elevating the Federal CIO to an OMB Deputy Director and the GSA Administrator to a Cabinet-level position tasked with overseeing “Operations & Implementation” would ensure that management and implementation best practices go hand in hand with policy development, dramatically reducing the delivery failures that put even strong policy agendas at risk.

Recommendation 2. Guide Government Leaders with the Rapid Agency Transformation Playbook 

Building on the success of the Digital Services Playbook, and influenced by the DOE's "Scaling IT Modernization Playbook," the Federal CIO should develop a set of "plays" for rapidly scaling technology and service delivery improvements across an entire agency. The Rapid Agency Transformation Playbook will act both as a guide to advise agency leaders in scaling best practices and as a standard against which modernization efforts can be assessed. The government-wide "plays" will be based on practices that have proven successful in the private and public sectors and will address concepts such as fostering innovation, rapid transformation, data adoption, modernizing or sunsetting legacy systems, and continually improving work processes infused with AI. Where the Digital Services Playbook has helped innovate practices in pockets of government, the Rapid Agency Transformation Playbook will help scale those practices across government as a whole.

A Rapid Agency Transformation Playbook will provide a living document to guide leadership and management, helping align policy implementation with policy content. The Playbook will also clearly lay out expected practices for Federal employees and contractors who collaborate on policy delivery. 

Recommendation 3. Fuel Rapid Transformation by Creating Rapid Transformation Funds

Congress should create Rapid Transformation Funds (RTFs) under the control of each Cabinet-level CIO, as well as the most senior IT leader in smaller departments and independent agencies. These funds would be placed in a Working Capital Fund (WCF) controlled by that CIO or senior IT leader, and they must be established through legislation. For departments that do not currently have a working capital fund under the control of the CIO, the legislation should create that fund rather than depending on each department or agency to make its own legislative request for an IT WCF.

This structure will give the CIO of each department or agency direct control of the funds. All RTFs must be under the control of the most senior IT leader in each organization, and the authority to manage these funds must not be delegable. The TMF, by contrast, puts funds under the control of the Office of the Federal Chief Information Officer (OFCIO) and a board that must juggle priorities across individual departments and agencies. Direct control will streamline decision-making and fund disbursement, and it will create a carrot to complement existing Federal Information Technology Acquisition Reform Act (FITARA) authorities (the stick). In addition, Congress should evaluate how CIO authorities are measured under FITARA to ensure that CIOs have a true seat at the table.

The legislation will provide the CIO the authority to sweep both expiring and canceling funds into the new WCF. Seed funding equal to 10% of each department/agency budget will also be provided. CIOs will have discretion to distribute the funds to modernization projects throughout their department or agency and to determine the payback model(s) that best suit their organization, including the option to reduce or waive payback for individual projects, with cost reimbursement as the overarching model.
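As a rough illustration of how such a fund might operate, consider the following minimal model. Everything here is a hypothetical sketch of the mechanism described above: the class name, amounts, and payback rules are assumptions, not statutory language.

```python
# Minimal illustrative model of a Rapid Transformation Fund (RTF).
# All names, amounts, and rules are hypothetical sketches of the
# mechanism described in the text, not statutory language.
from dataclasses import dataclass, field

@dataclass
class RapidTransformationFund:
    balance: float = 0.0
    loans: dict = field(default_factory=dict)  # project -> outstanding balance

    def sweep(self, expiring_funds: float) -> None:
        """CIO sweeps expiring/canceling appropriations into the WCF."""
        self.balance += expiring_funds

    def fund_project(self, name: str, amount: float, payback: bool = True) -> None:
        """Disburse modernization funds; the CIO may waive payback per project."""
        if amount > self.balance:
            raise ValueError("insufficient RTF balance")
        self.balance -= amount
        if payback:
            self.loans[name] = amount  # cost-reimbursement model

    def reimburse(self, name: str, amount: float) -> None:
        """A component repays the fund out of realized savings."""
        owed = self.loans.get(name, 0.0)
        repaid = min(amount, owed)
        self.loans[name] = owed - repaid
        self.balance += repaid

rtf = RapidTransformationFund()
rtf.sweep(expiring_funds=25_000_000)                   # swept year-end funds
rtf.fund_project("legacy-payroll-migration", 10_000_000)
rtf.reimburse("legacy-payroll-migration", 2_000_000)   # partial repayment
```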

The RTF will enhance the CIO's ability to drive change within their own organization. While Congress has expanded CIO authorities through legislation three times in recent years, no legislation has redirected funding to CIOs; most Cabinet-level CIOs control a single-digit percentage of their department's IT budget. For example, the Department of Energy CIO directly controls about 5% of DOE's IT spending. Direct control of a meaningfully sized pool of money that can be allocated to component IT teams enables Cabinet-level CIOs to drive critical priorities, including modernization and security. Without funding, CIO authorities amount to unfunded mandates. The RTF will allow CIOs to reinforce their authority by directly funding new initiatives, and a reevaluation of the metrics associated with CIO authorities would ensure that CIOs have a true seat at the table.

Recommendation 4. Ensure transformation speed through continuity by establishing a Transformation Advisory Board and department/agency management councils. 

First, OMB should establish a Transformation Advisory Board (TAB) within the EOP, sponsored by the Federal CIO and composed of senior, well-respected individuals appointed to fixed terms not tied to any presidential administration. The TAB will be chartered to shape management and technology policy across the government and to recommend changes to governance that impedes rapid modernization and transformation. Modeled after the Defense Innovation Board, the TAB will focus on entrenching rapid modernization efforts across administrations and on supporting, protecting, and enhancing existing digital-transformation capabilities. Second, each department and agency should be directed to establish a management council composed of the leaders of its administrative functions, including at least IT, finance, human resources, and acquisition, under the leadership of the deputy secretary or deputy administrator. In large departments this may require creating a new deputy secretary or undersecretary position to ensure meaningful focus on these priorities rather than simply holding perfunctory council meetings. This council will ensure that collaborative management attention is given to departmental/agency administration and that leaders beyond the CIO understand IT challenges and opportunities.

A Transformation Advisory Board will ensure continuity across administrations and changes in agency leadership to prevent the loss of good practices, enabling successful transformative innovations to take root and grow without breaks and gaps in administration changes. The management council will ensure that modernization is a priority of departmental/agency leadership beyond the CIO.

Ann Dunkin contributed to an earlier version of this memo.

This idea was originally published on November 13, 2020; we’ve re-published this updated version on October 22, 2024.

Frequently Asked Questions
We have given CIOs lots of authority and nothing has changed. Why should we do this now? What difference will it make?

While things have not changed as much as we would like, departments and agencies have made progress in modernizing their technology products and processes. Elevating the GSA Administrator to the cabinet level, adding a Transformation Advisory Board, elevating the Federal CIO, reevaluating how CIO authorities are measured, creating departmental/agency management councils, and providing modernization funds directly to CIOs through working capital funds will provide agencies and departments with the management attention, expertise, support, and resources needed to scale and sustain that progress over time. Additionally, CIOs—who are responsible for technology delivery—are often siloed rather than part of a broad, holistic approach to operations and implementation. Elevating the GSA Administrator and the Federal CIO, as well as establishing the TAB and departmental/agency management councils, will provide coordinated focus on the government’s need to modernize IT.

How will this help fix and modernize the federal government’s legacy systems?

Elevating the role of the Federal CIO and the GSA Administrator will provide more authority and attention for the President’s Management Agenda, thereby aligning policy content with policy implementation. Providing CIOs with a direct source of modernization funding will allow them to direct funds to the most critical projects throughout their organizations, as well as require adherence to standards and best practices. A new focus on successful policy delivery aided by experienced leaders will drive modernization of government systems that rely on dangerously outdated technology.

How do we ensure that scaling modernization is actually part of the President’s Management Agenda?

We believe that an administration that embraces the proposal outlined here will see scaling innovation as critical. Establishing a government COO and elevating the Federal CIO, along with an appointed board that spans administrations, departmental management councils, better measurement of CIO authorities, and direct funding to CIOs, will dramatically increase the likelihood that improved technology and service delivery remain a priority for future administrations.

Is the federal government doing anything now that can be built upon to implement this proposal?

The federal government has many pockets of innovation that have proven modern methodologies can and do work in government. These pockets of innovation—including USDS, GSA TTS, 18F, the U.S. Air Force Software Factories, fellowships, the Air Force Works Program (AFWERX), Defense Advanced Research Projects Agency (DARPA), and others—are inspiring. It is time to build on these innovations, coordinate their efforts under a U.S. government COO and empowered Federal CIO, and scale solutions to modernize the government as a whole.

Is another cabinet-level agency necessary to solve this problem?

Yes. A cabinet-level chief operating officer with top-level executive authority over policy operations and implementation is needed to carry out policy agendas effectively. It is hard to imagine a high-performing organization without a COO and a focus on operations and implementation at the highest level of leadership.

A president has a great deal to think about. Why should modernizing government technology and service delivery be a priority?

The legacy of any administration is based on its ability to enact its policy agenda and to respond to national emergencies. If policy implementation and emergency response are important to the president, scaling modernization across the government is imperative.

Investing in Apprenticeships to Fill Labor-Market Talent and Opportunity Gaps

Over the last 20 years, the cost of college has skyrocketed, with tuition costs far outpacing wage growth. At the same time, many employers complain that they’re unable to find high-quality talent, in part due to an excessive focus on the signaling effect conferred by college degrees. Although the last three administrations have made significant strides towards expanding the number of pathways to high-earning jobs through apprenticeship programs, they remain under-utilized and have significant potential for growth. To maximize the potential of apprenticeship programs, the federal government should develop a cohesive approach to supporting “apprenticeships of the future,” such as those in cyber, healthcare, and advanced manufacturing. These apprenticeships provide high pay and upward mobility, support economic growth, and serve vital national interests. To maximize the benefits provided by an expansion of high-quality apprenticeships, the federal government should articulate degree pathways and credit equivalencies for individuals seeking further education, collaborate with industry associations to create standards for skills acquisition, and develop an innovation fund that supports cutting-edge labor market innovations, including those in apprenticeship programs.

Challenge & Opportunity

While recent student debt cancellation received significant attention, the key underlying driver is the spiraling cost of college: tuition at four-year universities has risen by more than 125% in the last twenty years, far outpacing inflation and leaving students with an average debt load of $28,000 by graduation. To alleviate the strain, policymakers have increasingly recognized the potential of non-degree training, particularly apprenticeships, which mix on-the-job training with targeted academic skills acquisition. Apprenticeships, which typically last between a few months and two years, enable an individual in a high school or tertiary education program to work with an employer, earning a wage while developing skills that may lead to a permanent position or enhance future employability. The Obama administration spent $260 million on apprenticeship training, the Trump administration spent $1 billion, and thus far the Biden administration has spent $730 million to expand registered apprenticeships.

Nevertheless, apprenticeships in America remain vastly underutilized compared to some of our peer economies. In Germany, 1.2 million adults are enrolled in apprenticeship programs across 330 occupations. By contrast, the U.S. has roughly half as many apprentices despite enrolling 6.5 times as many college students as Germany. Moreover, American apprentices are overwhelmingly concentrated in occupations historically classified as "skilled trades," such as electricians, machinists, and plumbers.

American employers have put a significant premium on college degrees. Research from the Harvard Business School highlights the pervasiveness of degree inflation in many middle-skill, well-paying jobs. The table below shows the "degree gap percentage": the difference between the percentage of job descriptions requiring a college degree and the percentage of current job holders who actually hold one (a minimal computation sketch follows the table).

Occupation | Degree gap %
Supervisors of Office Administrative Support Workers | 37%
Bookkeeping, Accounting, and Auditing Clerks | 27%
Secretaries and Administrative Assistants, Except Legal, Medical, and Executive | 17%
Sales Representatives, Wholesale and Manufacturing | 27%
Executive Secretaries and Executive Administrative Assistants | 47%
Supervisors of Production and Operations Workers | 51%
Supervisors of Retail Sales Workers | 22%
Supervisors of Food Preparation and Serving Workers | 26%
Supervisors of Construction Trades and Extraction Workers | 44%
Sales Representatives, Services, All Other | 22%
Supervisors of Mechanics, Installers, and Repairers | 35%
Inspectors, Testers, Sorters, Samplers, and Weighers | 25%
Childcare Workers | 20%
Computer User Support Specialists | 19%
Billing and Posting Clerks | 21%
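The metric itself is simple arithmetic, as the short sketch below shows. The posting and job-holder percentages are invented for illustration; only the resulting gaps are taken from the table above.

```python
# Degree gap % = (share of postings requiring a degree)
#              - (share of current job holders with a degree).
# The posting/holder splits below are illustrative assumptions chosen so
# the computed gaps match two rows of the table above.
occupations = {
    # occupation: (pct_postings_requiring_degree, pct_holders_with_degree)
    "Supervisors of Production and Operations Workers": (67, 16),
    "Computer User Support Specialists": (60, 41),
}

for name, (required, held) in occupations.items():
    print(f"{name}: degree gap {required - held}%")
```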

Historically, employers’ emphasis on degrees has made wide-scale adoption of apprenticeships outside of skilled trades more challenging. However, attitudes towards apprenticeships continue to change as more employers realize their versatility and applicability to a variety of industries. Over the past few years, companies have started to take action. For instance, JP Morgan Chase has provided $15 million since 2018 to create apprenticeship programs in operations, finance, and technology, while Accenture has led the way in developing apprenticeship networks across the U.S. Apprenticeships have clear momentum and strong applicability to critical, strategic jobs, and federal, state, and local officials should capitalize on the opportunity to create a coherent strategy.

Policy Framework For Strategic Jobs

To identify areas of policy synergy, policymakers should consider the following criteria for jobs that should attract government funding and policy support:

  1. Essential to economic growth: roles that are frequently employed in high-growth industries, or else required to improve the future general productivity of businesses.
  2. Necessary to protect American interests: jobs that have broader implications for American national interests, including economic competitiveness, national security, green energy, and public health.
  3. Middle-skill roles that do not require college degrees: while higher educational attainment is generally desirable, it is neither a suitable nor an affordable option for all individuals, and many roles can or should support workers with alternative credentials. Simply put, these jobs should provide pathways into the middle class without excessive education debt burdens.
  4. High current job shortages: demand for roles far exceeds current labor supply.

Using this framework, there are three areas in which the U.S. has clear, pressing needs:

  1. Tech job shortages in the United States will cost the American economy over $160 billion in revenue, driven by a shortage of over 1.2 million workers.
    1. Cyber attacks alone cost the American economy 1% to 4% of GDP, which can be partially addressed by eliminating the existing talent shortage of 350,000 cyber professionals.
    2. In addition, 50% of the federal tech workforce is over the age of 50 and just 20% is under the age of 40, indicating a large “retirement cliff” in the medium-term horizon.
  2. Although the U.S. has had a long-standing need for nurses and medical professionals, the COVID pandemic highlighted their importance and exposed systemic workforce shortages. By 2030, the country will be short over 500,000 nurses.
    1. The country also suffers from a lack of healthcare educators, with nearly 80,000 qualified nursing applicants turned away due to a lack of training capacity.
    2. While many critical healthcare roles (e.g., RNs and NPs) require at least a bachelor’s degree, apprenticeships are a great way to increase the pipeline of lower-level medical staff (e.g., medical assistants, CNAs, LVNs), who can then be upskilled into the RN role or higher.
  3. Today, the U.S. has over 600,000 unfilled manufacturing jobs, which may hamper efforts to bring back clean energy and semiconductor manufacturing despite the hundreds of billions invested through the Inflation Reduction Act, CHIPS Act, and Bipartisan Infrastructure Law. Cumulatively, this talent shortage could reduce American GDP by $1 trillion. The gap is most acute in a handful of roles, including assemblers, production supervisors, inspectors, and welders. These roles are essential to the advanced manufacturing revolution and must be filled to maximize American industrial potential.

Policy Recommendations

To maximize the potential of apprenticeship programs in key strategic areas, the next administration should focus on coordinating resources, defining standards, and convening key stakeholders, including employers and higher education providers (among them private-sector providers with demonstrated strong outcomes). To that end, it should pursue the following policies:

Recommendation 1. The Departments of Labor and Education should jointly lead the creation of a national strategy for increasing apprenticeships and blended work-learn programs in essential roles and industries. In conjunction with other government agencies, they should stand up a "Strategic Apprenticeships" Task Force. This task force would consist primarily of governmental agencies, including the Department of Defense, the Department of the Treasury, and the Federal Reserve, that have clear mandates for improving worker outcomes tied directly to national strategic priorities. It would cooperate with the Advisory Committee on Apprenticeships (a committee convened by the Department of Labor that includes labor unions, community colleges, and other institutions) to set short-, medium-, and long-term priorities, propose funding levels, and develop a coherent apprenticeship and training strategy.

Recommendation 2. Congress should commit federal funds for apprenticeships in cyber, software engineering, healthcare, and advanced trades ("apprenticeships for the future"), to be allocated by the Department of Labor as prioritized by the Strategic Apprenticeships Task Force. Given the strategic value of these roles and the existing shortages, the Department of Labor should direct at least 50% of funds to roles that (a) provide strong pathways into middle-class jobs and (b) address pressing economic and strategic shortages in our economy.

Under the Biden administration, progress has been made on higher education accountability: for example, the Gainful Employment Rule was reinstated, requiring for-profit programs to demonstrate that typical graduates' annual debt payments are less than 8% of their annual earnings, or 20% of their discretionary earnings, to maintain access to federal student aid. Moreover, the rule requires more than half of graduates to demonstrate higher earnings than a typical high school graduate.

Nonetheless, more can be done to buttress progress that has been made on higher education, particularly given stronger regulations around ROI. The policies suggested above can roll up into an “Apprenticeships of the Future” initiative jointly managed by the Departments of Labor and Education. By using a coordinated approach to apprenticeships, policymakers can ensure that more attention is paid towards strategically important industries and roles while creating clearer pathways for individuals seeking apprenticeships and for former apprentices looking to gain further skills and training in 4-year degrees and other “alt-ed” training programs. Moreover, the initiative could make diversity and economic advancement for underserved communities a core part of its mission.

This idea was originally published on November 29, 2021; we’ve re-published this updated version on October 21, 2024.


Frequently Asked Questions
Why are apprenticeships less common in the United States than in Europe? How can they be adopted to fit the American labor market?
In a nutshell, major European corporations are more likely to cooperate with local and provincial governments on labor training and development. Moreover, unlike in America, workers sit on the board of directors of many European-based companies. For example, in Germany, workers comprise nearly half the supervisory board for companies with over 2,000 employees and one-third of the board for companies with 500 – 2,000 employees. As a result, European employers are more likely to take a long-term view of talent development and are willing to invest in apprentices to bring them up the learning curve.
Are apprenticeships just another term for vocational education? What impact might this have on long-term earnings potential?

While apprenticeships have been traditionally applied to fields that most people would associate with “vocational” roles such as electricians or construction work, they are also increasingly used in “new economy” roles such as IT and software development. When properly designed, apprenticeships have excellent earning potential. For instance, Kentucky’s FAME program prepares students for advanced manufacturing careers, with graduates enjoying average earnings of nearly $100,000 within five years of program completion.

Could we fix the skill shortage simply by paying more?
In a nutshell – yes, paying more would greatly alleviate critical skill shortages, with nursing (both at the RN level and below) as an exceptional example given the challenges of the job. However, there are still critical shortages at the front of the talent acquisition funnel that require some training (for example, helping individuals acquire nursing credentials or complete a training course to move into a tech data analyst role). While increased pay would certainly encourage more individuals to move into roles with major shortages, reducing the upfront barriers and cost to entry is also essential, and intelligent deployment of apprenticeship policy can play a major role in removing these barriers.
Do apprenticeships limit students’ ability to later receive a bachelor’s degree?
No! Students can earn a bachelor’s degree at a later point, and in some cases, the academic or work components of their apprenticeship may receive some credit. The apprenticeship to university pathway is more common in Europe, however, given the more widespread use of apprentices in non-trade roles.
What is the typical return on investment for an employer when they employ an apprentice?

Employers generally enjoy a strong ROI for apprenticeships. For example, employers who ran registered apprenticeships in industrial manufacturing received $1.47 of benefits for every $1.00 they invested, with benefits generally coming in the form of improved productivity and reduced waste. Depending on the upfront investment amount, the duration of the apprenticeship, and the time required to recoup productivity gains and cost efficiencies, the IRR is somewhere between 5% and 25%.
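For readers who want to see how such an IRR figure is derived, here is a minimal sketch. Only the roughly $1.47-per-$1.00 benefit ratio comes from the study cited above; the cash-flow timing and amounts are illustrative assumptions.

```python
# Illustrative IRR calculation for an employer-run apprenticeship.
# The cash-flow timing and amounts are hypothetical assumptions; only the
# ~$1.47 benefit per $1.00 invested is taken from the study cited above.
def irr(cashflows, lo=-0.99, hi=1.0, tol=1e-6):
    """Find the discount rate where net present value crosses zero (bisection)."""
    def npv(rate):
        return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # NPV falls as the rate rises, so keep the half containing the root
        if npv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Year 0: $50k upfront investment; years 1-3: benefits totaling ~$73.5k (1.47x)
flows = [-50_000, 24_500, 24_500, 24_500]
print(f"IRR ≈ {irr(flows):.1%}")  # ~22%, within the 5%-25% range cited above
```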

What is the social return on investment for apprenticeships?
Again, this depends on the subsidy amount and the specific apprenticeship in question, but international studies suggest that the IRR is in the 5% to 12% range, which meets or exceeds many alternatives (e.g., agricultural subsidies).
Why aren’t apprenticeships more popular? Shouldn’t the private sector self-organize their own apprenticeship programs?
In the case of skilled trades, industry has largely self-organized because “learning by doing” is the best way to ensure that an individual has verifiable skills on the job. For example, a welder needs to show that he or she can operate on a harness in the air. However, in many cases, employers may find it easier in the short term to rely on signals such as college degrees to sort through applicants or simply feel that the time and money invested in apprenticeships results in too uncertain of an outcome (e.g., the apprentice isn’t as productive as desired or leaves) relative to the risk level. Thus, public support for apprenticeships to bridge the gap between individual incentives and social returns, particularly in high-need roles and geographies, can be a good way to build a strong labor supply and provide meaningful economic mobility.

Mobilizing Innovative Financial Mechanisms for Extreme Heat Adaptation Solutions in Developing Nations

Global heat deaths are projected to increase by 370% by mid-century if direct action is not taken to limit the effects of climate change. The dire implications of rising global temperatures extend across a spectrum of risks, from health crises exacerbated by heat stress, malnutrition, and disease, to economic disparities that disproportionately affect vulnerable communities in the U.S. and in low- and middle-income countries. In light of these challenges, it is imperative to prioritize a coordinated effort at both national and international levels to enhance resilience to extreme heat. This effort must focus on developing and implementing comprehensive strategies to ensure that the vulnerable developing countries facing the worst and most disproportionate effects of climate change have adequate capacity for adaptation, while wealthier, developed nations mitigate their contributions to climate change.

To address these challenges, the U.S. Agency for International Development (USAID) should mobilize finance through environmental impact bonds focused on scaling extreme heat adaptation solutions. USAID should build upon the success of the SERVIR joint initiative and expand it to include a partnership with the National Integrated Heat Health Information System (NIHHIS) to co-develop decision support tools for extreme heat. Additionally, the Bureau for Resilience, Environment, and Food Security (REFS) within USAID should take the lead in tracking and reporting climate adaptation funding data. This effort will enhance transparency and ensure that adaptation and mitigation efforts are effectively prioritized. By addressing the urgent need for comprehensive adaptation strategies, we can mitigate the impacts of climate change, increase resilience through adaptation, and protect the most vulnerable communities from the increasing threats posed by extreme heat.

Challenge 

Over the past 13 months, temperatures have hit record highs, with much of the world having just experienced its warmest June on record. Berkeley Earth predicts a 95% chance that 2024 will rank as the warmest year in history. Extreme heat drives interconnected impacts across multiple risk areas, including public health, food insecurity, health care system costs, climate migration, and the growing transmission of life-threatening diseases.

Thus, as global temperatures continue to rise, resilience to extreme heat becomes a crucial element of climate change adaptation, necessitating a strategic federal response on both domestic and international scales.

Inequitable Economic and Health Impacts 

Despite contributing least to global greenhouse gas emissions, low- and middle-income countries experience economic losses from excess heat four times higher than those of wealthier counterparts. The countries likely to suffer the most are those with the most humidity, i.e., tropical nations in the Global South. Two-thirds of global exposure to extreme heat occurs in urban areas in the Global South, where there are fewer resources to mitigate and adapt.

The health impacts associated with increased global extreme heat events are severe, with projections of up to 250,000 additional deaths annually between 2030 and 2050 due to heat stress, alongside malnutrition, malaria, and diarrheal diseases. The direct cost to the health sector could reach $4 billion per year, with 80% of the cost being shouldered by Sub-Saharan Africa. On the whole, low-and middle-income countries (LMICs) in the Global South experience a higher portion of adverse health effects from increasing climate variability despite their minimal contributions to global greenhouse emissions, underscoring a clear global inequity challenge. 

This imbalance points to a crucial need for a focus on extreme heat in climate change adaptation efforts and the overall importance of international solidarity in bolstering adaptation capabilities in developing nations. It is more cost-effective to prepare localities for extreme heat now than to deal with the impacts later. However, most communities do not have comprehensive heat resilience strategies or effective early warning systems due to the lack of resources and the necessary data for risk assessment and management — reflected by the fact that only around 16% of global climate financing needs are being met, with far less still flowing to the Global South. Recent analysis from Climate Policy Initiative, an international climate policy research organization, shows that the global adaptation funding gap is widening, as developing countries are projected to require $212 billion per year for climate adaptation through 2030. The needs will only increase without direct policy action.  

Opportunity: The Role of USAID in Climate Adaptation and Resilience

As the primary federal agency responsible for helping partner countries adapt to and build resilience against climate change, USAID announced multiple commitments at COP28 to advance climate adaptation efforts in developing nations. In December 2023, following COP28, Special Presidential Envoy for Climate John Kerry and USAID Administrator Power announced that 31 companies and partners have responded to the President’s Emergency Plan for Adaptation and Resilience (PREPARE) Call to Action and committed $2.3 billion in additional adaptation finance. Per the State Department’s December 2023 Progress Report on President Biden’s Climate Finance Pledge, this funding level puts agencies on track to reach President Biden’s pledge of working with Congress to raise adaptation finance to $3 billion per year by 2024 as part of PREPARE.

USAID’s Bureau for Resilience, Environment, and Food Security (REFS) leads the implementation of PREPARE. USAID’s entire adaptation portfolio was designed to contribute to PREPARE and align with the Action Plan released in September 2022 by the Biden Administration. USAID has further committed to better integrating adaptation in its Climate Strategy for 2022 to 2030 and established a target to support 500 million people’s adaptation efforts.  

This strategy is complemented by USAID's efforts to spearhead international action on extreme heat at the federal level, with the launch of its Global Sprint of Action on Extreme Heat in March 2024. The program started with the inaugural Global Heat Summit and ran through June 2024, calling on national and local governments, organizations, companies, universities, and youth leaders to take action to help prepare the world for extreme heat, alongside USAID Missions and the International Federation of Red Cross and Red Crescent Societies (IFRC) with its 191 member National Societies. Federal agencies and implementing partners were also advised to use the Guidance on Extreme Heat for Federal Agencies Operating Overseas and United States Government Implementing Partners.

On the whole, the USAID approach to climate change adaptation is aimed at predicting, preparing for, and mitigating the impacts of climate change in partner countries. The two main components of USAID's approach to adaptation are climate risk management and climate information services. Climate risk management involves a "light-touch, staff-led process" for assessing, addressing, and adaptively managing climate risks in non-emergency development funding. Climate information services translate data, statistical analyses, and quantitative outputs into information and knowledge to support decision-making; some of these services include early warning systems, which are designed to enable governments' early and effective action. A primary example of a tool for USAID's climate information services efforts is the SERVIR program, a joint development initiative in partnership with the National Aeronautics and Space Administration (NASA) to provide satellite meteorology information and science to partner countries.

Additionally, as the flagship finance initiative under PREPARE, the State Department and USAID, in collaboration with the U.S. International Development Finance Corporation (DFC), have opened an Adaptation Finance Window under the Climate Finance for Development Accelerator (CFDA), which aims to de-risk the development and scaling of companies and investment vehicles that mobilize private finance for climate adaptation.

Plan of Action

Recommendation 1: Mobilize private capital through results-based financing such as environmental impact bonds

Results-based financing (RBF) has long been a key component of USAID's development aid strategy, offering innovative ways to mobilize finance by linking payments to specific outcomes. In recent years, Environmental Impact Bonds (EIBs) have emerged as a promising addition to the RBF toolkit and would serve USAID well as a mechanism to mobilize and scale novel climate adaptation solutions. Thus, in alignment with the PREPARE plan, USAID should launch an EIB pilot focused on extreme heat through the CFDA, a $250 million initiative designed to mobilize $2.5 billion in public and private climate investments by 2030. An EIB piloted through the CFDA can help unlock the public and private climate financing for extreme heat adaptation solutions that is sorely needed.

With this EIB pilot, private-sector, government, and philanthropic investors would raise the upfront capital, and repayment would be contingent on the project's success in meeting predefined goals. By distributing financial risk among stakeholders in the private sector, government, and philanthropy, EIBs encourage investment in pioneering projects that might struggle to attract traditional funding due to their novel or unproven nature. This approach can effectively mobilize the necessary resources to drive climate adaptation solutions.
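A simplified sketch of the outcome-contingent payout logic may help clarify how this risk sharing works. Real EIB term sheets vary widely; the tiers and rates below are invented for illustration.

```python
# Simplified outcome-contingent payout for an environmental impact bond.
# Tier thresholds and payout rates are invented for illustration only.
def eib_payout(principal: float, kpi_achievement: float) -> float:
    """Return the outcome payment owed to investors.

    kpi_achievement: fraction of the predefined adaptation target met,
    e.g., verified reduction in indoor temperatures or energy use.
    """
    if kpi_achievement >= 1.0:      # target met: principal plus success premium
        return principal * 1.10
    if kpi_achievement >= 0.5:      # partial success: principal back, no premium
        return principal * 1.00
    return principal * 0.75         # shortfall: investors absorb part of the loss

# A $10M cool-roof EIB that verifiably meets 80% of its temperature target
print(f"${eib_payout(10_000_000, 0.8):,.0f}")  # -> $10,000,000
```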


Figure: Environmental Impact Bond structure, showing cash flows (purple and green arrows) and environmental benefits (black arrows). The EIB is designed by project developers and implemented by stakeholders to fund activities that yield quantifiable environmental benefits, which beneficiaries convert into financial returns that shape the return on investment. (Adapted from "Environmental Impact Bonds: a common framework and looking ahead.")

The USAID EIB pilot should focus on scaling projects that facilitate the uptake and adoption of affordable and sustainable cooling systems, such as solar-reflective roofing and other passive cooling strategies. In Southeast Asia alone, annual heat-related mortality is projected to increase by 295% by 2030. Lack of access to affordable and sustainable cooling in the wake of record-shattering heat waves affects public health, food and supply chains, and local economies. An EIB that funds and scales solar-reflective roofing ("cool roofs") has the potential to generate high impact for the local population by lowering indoor temperatures, reducing energy use for air conditioning, and mitigating the heat island effect in surrounding areas. Indonesia, home to 46.5 million people at high risk from a lack of access to cooling, has seen notable success in deploying solar-reflective roofing through the Million Cool Roofs Challenge, an initiative of the Clean Cooling Collaborative. The country is now planning to scale production capacity for cool roofs and set up its first testing facility for solar-reflective materials to ensure quality and performance. Given Indonesia's capacity and readiness, an EIB to scale cool roofs there could be a force multiplier, helping this cooling mechanism reach millions and spurring new manufacturing and installation jobs for the local economy.

To mainstream EIBs and other innovative financial instruments, it is essential to pilot and explore more EIB projects. Cool roofs are an ideal candidate for scaling through an EIB due to their proven effectiveness as a climate adaptation solution, their numerous co-benefits, and the relative ease with which their environmental impacts can be measured (such as indoor temperature reductions, energy savings, and heat island index improvements). Establishing an EIB can be complex and time-consuming, but the potential rewards make the effort worthwhile if executed effectively. Though not exhaustive, the following steps are crucial to setting up an environmental impact bond:

Analyze ecosystem readiness

Before launching an environmental impact bond, it is crucial to analyze what capacity already exists in a given country's private and public sectors to implement an instrument like an EIB. Additionally, working with local civil society organizations is important to ensure climate adaptation projects and solutions are centered on the local community.

Determine the financial arrangement, scope, and risk-sharing structure

Determine the financial structure of the bond, including the bond amount, interest rate, and maturity date. Establish a mechanism to manage the funds raised through the bond issuance.

Co-develop standardized, scientifically verified impact metrics and reporting mechanism 

Develop a robust system for measuring and reporting the environmental impact of projects. With key stakeholders and partner countries, define key performance indicators (KPIs) to track and report progress.
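As an illustration of what a standardized, verifiable metric record might look like, consider the sketch below. The field names and the sample KPI are assumptions, not an existing USAID or partner-country schema. The achievement fraction it produces is exactly the kind of verified figure that would trigger the outcome payment sketched under Recommendation 1.

```python
# Hypothetical schema for a standardized, verifiable EIB impact metric.
# Field names and the sample KPI are illustrative, not an existing standard.
from dataclasses import dataclass

@dataclass
class ImpactKPI:
    name: str             # what is measured
    unit: str             # measurement unit
    baseline: float       # pre-intervention value
    target: float         # agreed outcome trigger
    measured: float       # independently verified result
    verifier: str         # third party that validated the measurement

    def achievement(self) -> float:
        """Fraction of the baseline-to-target improvement actually achieved."""
        return (self.baseline - self.measured) / (self.baseline - self.target)

kpi = ImpactKPI(
    name="peak indoor temperature", unit="deg C",
    baseline=36.0, target=31.0, measured=32.0,
    verifier="independent testing facility",
)
print(f"{kpi.achievement():.0%} of target achieved")  # -> 80%
```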

USAID has already begun to incubate and pilot innovative financing mechanisms in the global health space through development impact bonds. The Utkrisht Impact Bond, for example, is the world’s first maternal and newborn health impact bond, which aims to reach up to 600,000 pregnant women and newborns in Rajasthan, India. Expanding the use case of this financing mechanism in the climate adaptation sector can further leverage private capital to address critical environmental challenges, drive scalable solutions, and enhance the resilience of vulnerable communities to climate impacts.

Recommendation 2: USAID should expand the SERVIR joint initiative to include a partnership with NIHHIS and co-develop decision support tools such as an intersectional vulnerability map. 

Building on the momentum of Administrator Power’s recent announcement at COP28, USAID should expand the SERVIR joint initiative to include a partnership with NOAA, specifically with NIHHIS, the National Integrated Heat Health Information System. NIHHIS is an integrated information system supporting equitable heat resilience, which is an important area that SERVIR should begin to explore. Expanded partnerships could begin with a pilot to map regional extreme heat vulnerability in select Southeast Asian countries. This kind of tool can aid in informing local decision makers about the risks of extreme heat that have many cascading effects on food systems, health, and infrastructure.

Intersectional vulnerabilities related to extreme heat refer to the compounding impacts of various social, economic, and environmental factors on specific groups or individuals. Understanding these intersecting vulnerabilities is crucial for developing effective strategies to address the disproportionate impacts of extreme heat. Some of these intersections include age, income/socioeconomic status, race/ethnicity, gender, and occupation. USAID should partner with NIHHIS to develop an intersectional vulnerability map that can improve local decision-making related to extreme heat and help tailor interventions and policies to where they are most needed. The intersection between extreme heat and health, for example, is under-analyzed, and work in this area will contribute to expanding the evidence base.
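As a rough illustration of how such a map might be computed, the sketch below combines normalized heat-exposure and social indicators into a weighted composite index over a raster grid. The indicator layers and weights are hypothetical assumptions; a real NIHHIS/SERVIR product would be co-developed with local partners and grounded in validated data.

```python
import numpy as np

# Illustrative composite heat-vulnerability index over a raster grid.
# Indicator layers and weights are hypothetical.

def normalize(layer: np.ndarray) -> np.ndarray:
    """Rescale a layer to [0, 1] so indicators are comparable."""
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    return (layer - lo) / (hi - lo) if hi > lo else np.zeros_like(layer)

def vulnerability_index(layers: dict[str, np.ndarray],
                        weights: dict[str, float]) -> np.ndarray:
    """Weighted sum of normalized exposure and social indicators."""
    total = sum(weights.values())
    return sum(normalize(layers[k]) * (w / total) for k, w in weights.items())

# Hypothetical 100x100 grid: one exposure layer plus social factors.
rng = np.random.default_rng(0)
layers = {
    "days_over_35C": rng.uniform(20, 90, (100, 100)),
    "share_over_65": rng.uniform(0.02, 0.2, (100, 100)),
    "share_low_income": rng.uniform(0.1, 0.6, (100, 100)),
    "share_outdoor_workers": rng.uniform(0.05, 0.4, (100, 100)),
}
weights = {"days_over_35C": 0.4, "share_over_65": 0.2,
           "share_low_income": 0.2, "share_outdoor_workers": 0.2}

index = vulnerability_index(layers, weights)  # values in [0, 1]
print("Highest-vulnerability decile threshold:", np.quantile(index, 0.9))
```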

The pilot can be modeled after the SERVIR-Mekong program, which produced 21 decision support tools over the span of the program from 2014 to 2022. The SERVIR-Mekong program led to the training of more than 1,500 people, the mobilization of $500,000 of investment in climate resilience activities, and the adoption of policies to improve climate resilience in the region. In developing these tools, engaging and co-producing with the local community will be essential.

Recommendation 3: USAID REFS and the State Department Office of Foreign Assistance should work together to develop a mechanism to consistently track and report climate funding flows. This also requires USAID and the State Department to develop clear guidelines on the U.S. approach to adaptation tracking and determination of adaptation components.

Enhancing analytical and data collection capabilities is vital for crafting effective and informed responses to the challenges posed by extreme heat. To this end, USAID REFS, along with the State Department Office of Foreign Assistance, should co-develop a mechanism to consistently track and report climate funding flows. Currently, neither USAID nor the State Department consistently reports funding data on direct and indirect climate adaptation foreign assistance. As the Department of State is required to report on its climate finance contributions annually for the Organisation for Economic Co-operation and Development (OECD) and biennially for the United Nations Framework Convention on Climate Change (UNFCCC), the two agencies should report on adaptation funding at similarly regular, set intervals and make this information accessible to the executive branch and the general public. A robust tracking mechanism can better inform and aid agency officials in prioritizing adaptation assistance and ensuring the U.S. fulfills its commitments and pledges to support global adaptation to climate change.

The State Department Office of Foreign Assistance (State F) is responsible for establishing standard program structures, definitions, and performance indicators, along with collecting and reporting allocation data on State and USAID programs. Yet even within this framework, there are no clear definitions of which foreign assistance projects qualify as climate projects versus development projects, and which qualify as both. Many adaptation projects are better understood as sitting on a continuum of adaptation and development activities. As such, this tracking mechanism should be standardized via a taxonomy of definitions for adaptation solutions.
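A minimal sketch of what a standardized tracking record could look like is shown below. The class and field names are illustrative assumptions; the EG.11 program area and “Adaptation Indirect” key issue follow the SPSD conventions described in the FAQ below, while the health program area code is a stand-in for an arbitrary non-climate area.

```python
from dataclasses import dataclass, field

# Illustrative record schema for tracking adaptation funding.
# Field names are assumptions; "EG.11" and "Adaptation Indirect"
# follow the SPSD conventions described in the FAQ.

@dataclass
class AssistanceActivity:
    activity_id: str
    operating_unit: str
    spsd_program_area: str  # e.g., "EG.11" = Climate Change—Adaptation
    key_issues: list[str] = field(default_factory=list)
    obligated_usd: float = 0.0

    @property
    def adaptation_class(self) -> str:
        """Classify funding as direct, indirect, or non-adaptation."""
        if self.spsd_program_area == "EG.11":
            return "direct"
        if "Adaptation Indirect" in self.key_issues:
            return "indirect"
        return "none"

portfolio = [
    AssistanceActivity("ACT-001", "USAID/Indonesia", "EG.11", [], 5_000_000),
    AssistanceActivity("ACT-002", "USAID/Vietnam", "HL.6",  # stand-in code
                       ["Adaptation Indirect"], 2_500_000),
]
totals: dict[str, float] = {}
for a in portfolio:
    totals[a.adaptation_class] = totals.get(a.adaptation_class, 0) + a.obligated_usd
print(totals)  # {'direct': 5000000, 'indirect': 2500000}
```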

Therefore, State F should create standardized mechanisms for climate-related foreign assistance programs to differentiate and determine the interlinkages between adaptation and mitigation action from the outset in planning, finance, and implementation — and thereby enhance co-benefits. State F relies on the technical expertise of bureaus, such as REFS, and the technical offices within them, to evaluate whether or not operating units have appropriately attributed funding that supports key issues, including indirect climate adaptation. 

Further, PREPARE, announced at COP26, is considered the largest U.S. commitment in history to support adaptation to climate change in developing nations. The Biden Administration has committed to using PREPARE to “respond to partner countries’ priorities, strengthen cooperation with other donors, integrate climate risk considerations into multilateral efforts, and strive to mobilize significant private sector capital for adaptation.” Co-led by USAID and the U.S. Department of State (State Department), the implementation of PREPARE also involves the Treasury, NOAA, and the U.S. International Development Finance Corporation (DFC). Other U.S. agencies, such as USDA, DOE, HHS, DOI, the Department of Homeland Security, EPA, FEMA, the U.S. Forest Service, the Millennium Challenge Corporation, NASA, and the U.S. Trade and Development Agency, will respond to the adaptation priorities identified by countries in National Adaptation Plans (NAPs) and nationally determined contributions (NDCs), among others.

As USAID’s REFS leads the implementation of PREPARE and hosts USAID’s Chief Climate Officer, this office should be responsible for ensuring that the agency effectively tracks and consistently reports climate funding data. The two REFS Centers that should lead the implementation of these efforts are the Center for Climate-Positive Development, which advises USAID leadership and supports the implementation of USAID’s Climate Strategy, and the Center for Resilience, which supports efforts to reduce recurrent crises — such as climate change-induced extreme weather events — through the promotion of risk management and resilience in USAID’s strategies and programming.

In making standardized processes to prioritize and track the flow of adaptation funds, USAID will be able to more effectively determine its progress toward addressing global climate hazards like extreme heat, while enhancing its ability to deliver innovative finance and private capital mechanisms in alignment with PREPARE. Additionally, standardization will enable both the public and private sectors to understand possible areas of investment and direct their flows to relevant projects.

Frequently Asked Questions
How does USAID describe, compare, and analyze its global climate adaptation efforts?

USAID uses the Standardized Program Structure and Definitions (SPSD) system — established by State F — to provide a common language for describing climate change adaptation and resilience programs, enabling the comparison and analysis of budget and performance data within a country, regionally, or globally. The SPSD system uses the following categories: (1) democracy, human rights, and governance; (2) economic growth; (3) education and social services; (4) health; (5) humanitarian assistance; (6) peace and security; and (7) program development and oversight. Since 2016, climate change has fallen under the economic growth category, and each climate change pillar has separate Program Areas and Elements.


Using the SPSD program areas and key issues, USAID categorizes and tracks its allocations related to climate adaptation as either directly or indirectly addressing it. Funding that directly addresses climate adaptation is allocated to the “Climate Change—Adaptation” program area (EG.11) for activities that enhance resilience and reduce the vulnerability of people, places, and livelihoods to climate change. Under this definition, adaptation programs may include the following elements: improving access to science and analysis for decision-making in climate-sensitive areas or sectors; establishing effective governance systems to address climate-related risks; and identifying and disseminating actions that increase resilience to climate change by decreasing exposure or sensitivity or by increasing adaptive capacity. Funding that indirectly addresses climate adaptation is not allocated to a specific SPSD program area; instead, it is allocated to another program area and attributed to the “Adaptation Indirect” key issue. These activities do not fall under Climate Change—Adaptation, but components of them also have climate adaptation effects.


In addition to the SPSD, the State Department and USAID have also identified “key issues” to help describe how foreign assistance funds are used. Key issues are topics of special interest that are not specific to one operating unit or bureau and are not identified, or only partially identified, within the SPSD. As specified in the State Department’s foreign assistance guidance for key issues, “operating units with programs that enhance climate resilience, and/or reduce vulnerability to climate variability and change of people, places, and/or livelihoods are expected to attribute funding to the Adaptation Indirect key issue.”


Operating units use the SPSD and relevant key issues to categorize funding in their operational plans. State guidance requires that any USAID operating unit receiving foreign assistance funding must complete an operational plan each year. The purpose of the operational plan is to provide a comprehensive picture of how the operating unit will use this funding to achieve foreign assistance goals and to establish how the proposed funding plan and programming supports the operating unit, agency, and U.S. government policy priorities. According to the operational plan guidance, State F does an initial screening of these plans.

What is the role of multilateral development banks (MDBs)?

MDBs play a critical role in bridging the significant funding gap faced by vulnerable developing countries that bear a disproportionate burden of climate adaptation costs—estimated to reach up to 20 percent of GDP for small island nations exposed to tropical cyclones and rising seas. MDBs offer a range of financing options, including direct adaptation investments, green financing instruments, and support for fiscal adjustments to reallocate spending towards climate resilience. To be most sustainably impactful, adaptation support from MDBs should supplement existing aid with conditionality that matches the institutional capacities of recipient countries.

What is the role of other federal agencies on an international scale?

In January 2021, President Biden issued an Executive Order (EO 14008) calling upon federal agencies and others to help domestic and global communities adapt and build resilience to climate change. In September 2022, the White House announced the launch of the PREPARE Action Plan, which specifically lays out America’s contribution to the global effort to build resilience to the impacts of the climate crisis in developing countries. Nineteen U.S. departments and agencies are working together to implement the PREPARE Action Plan: State, USAID, Commerce/NOAA, Millennium Challenge Corporation (MCC), U.S. Trade and Development Agency (USTDA), U.S. Department of Agriculture (USDA), Treasury, DFC, Department of Defense (DOD) & U.S. Army Corps of Engineers (USACE), International Trade Administration (ITA), Peace Corps, Environmental Protection Agency (EPA), Department of Energy (DOE), Federal Emergency Management Agency (FEMA), Department of Transportation (DOT), Health and Human Services (HHS), NASA, Export–Import Bank of the United States (EX/IM), and Department of Interior (DOI).

What is the role of Congress in international climate finance for adaptation?

Congress oversees federal climate financial assistance to lower-income countries, especially through the following actions: (1) authorizing and appropriating for federal programs and multilateral fund contributions, (2) guiding federal agencies on authorized programs and appropriations, and (3) overseeing U.S. interests in the programs. Congressional committees of jurisdiction include the House Committees on Foreign Affairs, Financial Services, Appropriations, and the Senate Committees on Foreign Relations and Appropriations, among others.

Critical Thinking on Critical Minerals

Access to critical minerals supply chains will be crucial to the clean energy transition in the United States. Batteries for electric vehicles, in particular, will require the U.S. to consume an order of magnitude more lithium, nickel, cobalt, and graphite than it currently consumes. Currently, these materials are sourced from around the world. Mining of critical minerals is concentrated in just a few countries for each material, but is becoming increasingly geographically diverse as global demand incentivizes new exploration and development. Processing of critical minerals, however, is heavily concentrated in a single country—China—raising the risk of supply chain disruption. 

To address this, the U.S. government has signaled its desire to onshore and diversify critical minerals supply chains through key legislation, such as the Bipartisan Infrastructure Law and the Inflation Reduction Act, and trade policies. The development of new mining and processing projects entails significant costs, however, and project financiers require developers to demonstrate certainty that projects will generate profit through securing long-term offtake agreements with buyers. This is made difficult by two factors: critical minerals markets are volatile, and, without subsidies or trade protections, domestically-produced critical minerals have trouble competing against low-priced imports, making it difficult for producers and potential buyers to negotiate a mutually agreeable price (or price floor). As a result, progress in expanding the domestic critical minerals supply may not occur fast enough to catch up to the growing consumption of critical minerals.

To accelerate project financing and development, the Department of Energy (DOE) should help generate demand certainty through backstopping the offtake of processed, battery-grade critical minerals at a minimum price floor. Ideally, this would be accomplished by paying producers the difference between the market price and the price floor, allowing them to sign offtake agreements and sell their products at a competitive market price. Offtake agreements, in turn, allow developers to secure project financing and proceed at full speed with development.

While demand-side support can help address the challenges faced by individual developers, market-wide issues with price volatility and transparency require additional solutions. Currently, the pricing mechanisms available for battery-grade critical minerals are limited to either third-party price assessments with opaque sources or the exchange-traded prices of imperfect proxies. Concerns have been raised about the reliability of these existing mechanisms, hindering market participation and complicating discussions on pricing.

As the North American critical minerals industry and market develop, DOE should support the parallel development of more transparent, North America-based pricing mechanisms to improve price discovery and reduce uncertainty. In the short and medium term, this could be accomplished through government-backed auctions, which could be combined with offtake backstop agreements. Auctions are effective mechanisms for price discovery, and data from them can help improve market price assessments. In the long term, DOE could support the creation of new market exchanges for trading critical minerals in North America. Exchange trading enables greater price transparency and provides opportunities for hedging against price volatility.

Through this two-pronged approach, DOE would simultaneously accelerate the development of the domestic critical minerals supply chain by addressing short-term market needs, while building a more transparent and reliable marketplace for the future.

Introduction

The global transportation system is currently undergoing a transition to electric vehicles (EVs) that will fundamentally transform not only our transportation system, but also domestic manufacturing and supply chains. Demand for lithium-ion batteries, the most important and expensive component of EVs, is expected to grow 600% by 2030 compared to 2023, and the U.S. currently imports a majority of its lithium batteries. To ensure a stable and successful transition to EVs, the U.S. needs to reduce its import dependence and build out its domestic supply chain for critical minerals and battery manufacturing.

Crucial to that will be securing access to battery-grade critical minerals. Lithium, nickel, cobalt, and graphite are the primary critical minerals used in EV batteries. All four were included in the 2023 Department of Energy (DOE) Critical Minerals List. Cobalt and graphite are considered at risk of shortage in the short-term (2020-2025), while all four materials are at risk in the medium-term (2025-2030).

As shown in Figure 1, the domestic supply chain for batteries and critical minerals consists primarily of downstream buyers like automakers and battery assemblers, though there are a growing number of battery cell manufacturers thanks to domestic sourcing requirements in the Inflation Reduction Act (IRA) incentives. The U.S. has major gaps in upstream and midstream activities—mining of critical minerals, refining/processing, and the production of active materials and battery components. These industries are concentrated globally in a small number of countries, presenting supply chain risks. By developing new domestic industries within these gaps, the federal government can help build out new, resilient clean energy supply chains. 

This report is organized into three main sections. The first section provides an overview of current global supply chains and the process of converting different raw materials into battery-grade critical minerals. The second section delves into the pricing and offtake challenges that projects face and proposes demand-side support solutions to provide the price and volume certainty necessary to obtain project financing. The final section takes a look at existing pricing mechanisms and proposes two approaches that the government can take to facilitate price discovery and transparency, with an eye towards mitigating market volatility in the long term. Given DOE’s central role in supporting the development of domestic clean energy industries, the policies proposed in this report were designed with DOE in mind as the main implementer.

Figure 1. Lithium-ion battery supply chain

Adapted from Li-BRIDGE

Segments highlighted in light blue indicate gaps in U.S. supply chains. See original graphic from Li-BRIDGE for more information.

Section 1. Understanding Critical Minerals Supply Chains

Global Critical Minerals Sources

Globally, 65% or more of processed lithium, cobalt, and graphite originates from a single country: China (Figure 2). This concentration is particularly acute for graphite, 91% of which was processed by China in 2023. This market concentration has made downstream buyers in the U.S. overly dependent on sourcing from a single country. The concentration of supply chains in any one country makes them vulnerable to disruptions within that country—whether they be natural disasters, pandemics, geopolitical conflict, or macroeconomic changes. Moreover, lithium, nickel, cobalt, and graphite are all expected to experience shortages over the next decade. In the event of future shortages, concentration in other countries puts U.S. access to critical minerals at risk. Rocky foreign relations and competition between the U.S. and China over the past few years have put further strain on this dependence. In October 2023, in response to the U.S.’s export restrictions on semiconductor chips to China and other “foreign entities of concern” (FEOC), China announced new export controls on graphite, though it has not yet restricted supply.

Expanding domestic processing of critical minerals and manufacturing of battery components can help reduce dependence on Chinese sources and ensure access to critical minerals in future shortages. However, these efforts will hurt Chinese businesses, so the U.S. will also need to anticipate additional protectionist measures from China.

On the other hand, mining of critical minerals—with the exception of graphite and rare earth elements—occurs primarily outside of China. These operations are also concentrated in a small handful of countries, shown in Figure 3. Consequently, geopolitical disruptions affecting any of those primary countries can significantly affect the price and supply of the material globally. For example, Russia is the third largest producer of nickel. In the aftermath of Russia’s invasion of Ukraine at the beginning of 2022, expectations of shortages triggered a historic short squeeze of nickel on the London Metal Exchange (LME), the primary global trading platform, significantly disrupting the global market. 

To address global supply chain concentration, new incentives and grant programs were passed in the IRA and the Bipartisan Infrastructure Law. These include the 30D clean vehicle tax credit, the 45X advanced manufacturing production credit, and the Battery Materials Processing Grants Program (see the Domestic Price Premium section for further discussion). Thanks to these policies, there are now on the order of a hundred North American projects in mining, processing, and active1 material manufacturing in development. The success of these and future projects will help create new domestic sources of critical minerals and batteries to feed the EV transition in the U.S. However, success is not guaranteed. A number of challenges to investment in the critical minerals supply chain will need to be addressed first.

Battery Materials Supply Chain

Critical minerals are used to make battery electrodes. These electrodes require specific forms of critical minerals for their production processes: typically lithium hydroxide or carbonate, nickel sulfate, cobalt sulfate, and a blend of coated spherical graphite and synthetic graphite.2

Lithium Hydroxide and Lithium Carbonate

Lithium hydroxide/carbonate typically comes from two sources: spodumene, a hard rock ore that is mined primarily in Australia, and lithium brine, which is primarily found in South America (Figure 3). Traditionally, lithium brine must be evaporated in large open-air pools before the lithium can be extracted, but new technologies are emerging for direct lithium extraction that significantly reduces the need for evaporation. Whereas spodumene mining and refining are typically conducted by separate entities, lithium brine operations are typically fully integrated. A third source of lithium that has yet to be put into commercial production is lithium clay. The U.S. is leading the development of projects to extract and refine lithium from clay deposits.

Nickel Sulfate

Nickel sulfate can be made from either nickel metal, which was historically the preferred feedstock, or directly from nickel intermediate products, such as mixed hydroxide precipitate and nickel matte, which are the feedstocks that most Chinese producers have switched to in the past few years (Figure 4). Though demand from batteries is driving much of the nickel project development in the U.S., since nickel metal has a much larger market than nickel sulfate, developers are designing their projects with the flexibility to produce either nickel metal or nickel sulfate.

Cobalt Sulfate

Cobalt is primarily produced in the Democratic Republic of the Congo from cobalt-copper ore. Cobalt can also be found in lesser amounts in nickel and other metallic ores. Cobalt concentrate is extracted from cobalt-bearing ore and then processed into cobalt hydroxide. At this point, the cobalt hydroxide can be further processed into either cobalt sulfate for batteries or cobalt metal and other chemicals for other purposes.

Cathode Active Materials

Battery cathodes come in a variety of chemistries: lithium nickel manganese cobalt (NMC) is the most common in lithium-ion batteries thanks to its higher energy density, while lithium iron phosphate is growing in popularity for its affordability and use of more abundantly available materials, but is not as energy dense. Cathode active material (CAM) manufacturers purchase lithium hydroxide/carbonate, nickel sulfate, and cobalt sulfate and then convert them into CAM powders. These powders are then sold to battery cell manufacturers, who coat them onto aluminum foil current collectors to produce cathodes.

Natural and Synthetic Graphite

Graphite can be synthesized from petroleum needle coke, a fossil fuel waste material, or mined from natural deposits. Natural graphite typically comes in the form of flakes and is reshaped into spherical graphite to reduce its particle size and improve its material properties. Spherical graphite is then coated with a protective layer to prevent unwanted chemical reactions when charging and discharging the battery.

Anode Active Material

The majority of battery anodes on the market are made using just graphite, so there is no intermediate step between processors and battery cell manufacturers. Producers of battery-grade synthetic graphite and coated spherical graphite sell these materials directly to cell manufacturers, who coat them onto copper foil current collectors to make anodes. These battery-grade forms of graphite are also referred to as graphite anode powder or, more generally, as anode active materials. Thus, the terms graphite processor and graphite anode manufacturer are interchangeable.

Section 2. Building Out Domestic Production Capacity

Challenges Facing Project Developers

Offtake Agreements

Offtake agreements (a.k.a. supply agreements or contracts) are agreements between a producer and a buyer to purchase a future product. They are a key requirement for project financing because they provide lenders and investors with the certainty that if a project is built, there will be revenue generated from sales to pay back the loan and justify the valuation of the business. The vast majority of feedstocks and battery-grade materials are sold under offtake agreements, though small amounts are also sold on the spot market in one-off transactions. Offtake agreements are made at every step of the supply chain: between miners and processors (if they are not vertically integrated), between processors and component manufacturers, and between component manufacturers and cell manufacturers. Due to domestic automakers’ concerns about potential material shortages upstream and the desire to secure IRA incentives, many of them have also been entering into offtake agreements directly with North American miners and processors. Tesla has even started constructing its own domestic lithium processing plant.

Historically, these offtake agreements were structured as fixed-price deals. However, when prices on the spot market climb too high, sellers often find a way to get out of the contract; conversely, when spot prices fall too low, buyers often do the same. As a result, more and more offtake agreements for battery-grade lithium, nickel, and cobalt have become indexed to spot prices, with price floors and/or ceilings set as guardrails and adjustments for premiums and discounts based on other factors (e.g., IRA compliance, risk from a greenfield producer, etc.).

Graphite is the one exception where buyers and suppliers have mostly stuck to fixed-price agreements. There are two main reasons for this: graphite pricing is opaque and products exhibit much more variation, complicating attempts to index the price. As a result, cell manufacturers don’t consider the available price indexes to accurately reflect the value of the specific products they are buying.

Offtake agreements for battery cells are also typically partially indexed on the price of the critical minerals used to manufacture them. In other words, a certain amount of the price per unit of battery cell is fixed in the agreement, while the rest is variable based on the index price of critical minerals at the time of transaction.
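As a rough illustration of that split, the sketch below computes a cell price from a fixed contractual component plus a floating component tied to mineral index prices. The contract terms, material intensities, and index prices are all hypothetical.

```python
# Sketch of a partially indexed battery cell price: part of the unit
# price is fixed in the offtake agreement, the rest floats with the
# index prices of the critical minerals in the cell. All numbers are
# hypothetical.

def cell_price_usd_per_kwh(fixed_component: float,
                           mineral_content_kg_per_kwh: dict[str, float],
                           index_price_usd_per_kg: dict[str, float]) -> float:
    variable = sum(kg * index_price_usd_per_kg[m]
                   for m, kg in mineral_content_kg_per_kwh.items())
    return fixed_component + variable

price = cell_price_usd_per_kwh(
    fixed_component=55.0,  # $/kWh fixed in the contract
    mineral_content_kg_per_kwh={"lithium_hydroxide": 0.7,
                                "nickel_sulfate": 3.0,
                                "cobalt_sulfate": 0.4},
    index_price_usd_per_kg={"lithium_hydroxide": 12.0,
                            "nickel_sulfate": 4.5,
                            "cobalt_sulfate": 5.0},
)
print(f"${price:.2f}/kWh at current index prices")  # $78.90/kWh
```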

Domestic critical minerals projects face two key challenges to securing investment and offtake agreements: market volatility and a lack of price competitiveness. The price difference between materials produced domestically and those produced internationally stems from two underlying causes: the current oversupply from Chinese-owned companies and the domestic price premium. 

Market Volatility

Lithium, cobalt, and graphite have relatively low-volume markets with a small customer base compared to traditional commodities. Low-volume products experience low liquidity, meaning it can be difficult to buy or sell quickly, so slight changes in supply and demand can result in sharp price swings, creating a volatile market. Because of the higher risk and smaller market, companies and investors tend to prefer mining and processing of base metals, such as copper, which have much larger markets, resulting in underinvestment in production capacity. 

In comparison, nickel is a base metal commodity, primarily used for stainless steel production. However, due to its rapidly growing use in battery production, its price has become increasingly linked to other battery materials, resulting in greater volatility than other base metals. Moreover, the short squeeze in 2022 forced LME to suspend trading and cancel transactions for the first time in three decades. As a result, trust in the price of nickel on LME faltered, many market participants dropped out, and volatility grew due to low trading volumes.

For all four of these materials, prices reached record highs in 2022 and subsequently crashed in 2023 (Figure 4). Nickel, cobalt, and graphite experienced price declines of 30-45%, while lithium prices dropped by an enormous 75%. As discussed above, market volatility discourages investment into critical minerals production capacity. The current low prices have caused some domestic projects to be paused or canceled. For example, Jervois halted operation of its Idaho cobalt mine in March 2023 due to cobalt prices dropping below its operating costs. In January 2024, lithium giant Albemarle announced that it was delaying plans to begin construction on a new South Carolina lithium hydroxide processing plant.

Retrospective analysis suggests that mining companies, battery investors, and automakers all made overly optimistic demand projections and ramped up production too quickly. These projections assumed that EV demand would keep growing as fast as it did immediately after the pandemic and that China’s lifting of pandemic restrictions would unlock even faster growth in the largest EV market. Instead, China, which makes up over 60% of the EV market, entered an economic downturn, and global demand elsewhere didn’t grow as fast as projected as backlogs built up during the pandemic were cleared. (It is important to note that the EV market is still growing at significant rates—global EV sales increased by 35% from 2022 to 2023—just not as fast as companies had hoped.) Consequently, supply has temporarily outpaced demand. Midstream and upstream companies stopped receiving new purchase orders while automakers worked through their stock build-up. Prices fell rapidly as a result and are now bottoming out. Some companies are waiting for prices to recover before they restart construction and operation of existing projects or invest in expanding production further.

While companies are responding to short-term market signals, the U.S. government needs to act in anticipation of long-term demand growth outpacing current planned capacity. Price volatility in critical minerals markets will need to be addressed to ensure that companies and financiers continue investing in expanding production capacity. Otherwise, demand projections suggest that the supply chain will experience new shortages later this decade. 

Oversupply

The current oversupply of critical minerals has been exacerbated by below market-rate financing and subsidies from the Chinese government. Many of these policies began in 2009, incentivizing a wave of investment not just in China, but also in mineral-rich countries. These subsidies played a large role in the 2010s in building out nascent battery critical minerals supply chains. Now, however, they are causing overproduction from Chinese-owned companies, which threatens to push out competitors from other countries.

Overproduction begins with mining. Chinese companies are the primary financial backers for 80% of both the Democratic Republic of the Congo’s cobalt mines and Indonesia’s nickel mines. Chinese companies have also expanded their reach in lithium, buying half of all the lithium mines offered for sale since 2018, in addition to domestically mining 18% of global lithium. For graphite, 82% of natural graphite was mined directly in China in 2023, and nearly all natural and synthetic graphite is processed in China.

After the price crash in 2023, while other companies pulled back their production volume significantly, Chinese-owned companies pulled back much less and in some cases continued to expand their production, generating an oversupply of lithium, cobalt, nickel, and natural and synthetic graphite. Government policies enabled these decisions by making it financially viable for Chinese companies to sell materials at low prices that would otherwise be unsustainable. 

Domestic Price Premium (and Current Policies Addressing It) 

Domestically produced critical minerals and battery electrode active materials come with a higher cost of production than imported materials due to higher wages and stricter environmental regulations in the U.S. The IRA’s new 30D and 45X tax credits and upcoming section 301 tariffs help address this problem by creating financial incentives for using domestically produced materials, allowing them to compete on a more even playing field with imported materials.

The 30D New Clean Vehicle Tax Credit provides up to $7,500 per EV purchased, but it requires eligible EVs to be manufactured from critical minerals and battery components that are FEOC-compliant, meaning they cannot be sourced from companies with relationships to China, North Korea, Russia, and Iran. It also requires that an increasing percentage of critical minerals used to make the EV batteries be extracted or processed in the U.S. or a Free Trade Agreement country. These two requirements apply to lithium, nickel, cobalt, and graphite. For graphite, however, since nearly all processing occurs in China and there is currently no domestic supply, the US Treasury has chosen to exempt it from the 30D tax credit’s FEOC and domestic sourcing requirements until 2027 to give automakers time to develop alternate supply chains.

The 45X Advanced Manufacturing Production Tax Credit subsidizes 10% of the production cost for each unit of critical minerals processed. The Internal Revenue Service’s proposed regulations for this tax credit interpret the legislation as applying only to the value-added production cost, meaning that the cost of purchasing raw materials and processing chemicals is not included in the covered production costs. This limits the amount of subsidy that will be provided to processors. The strength of 45X, though, is that unlike the 30D tax credit, it has no sunset clause for critical minerals, providing a long-term guarantee of support.
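To make the value-added limitation concrete, here is a minimal worked example under the interpretation described above; all cost figures are hypothetical.

```python
# Sketch of the 45X credit as interpreted in the proposed IRS regulations
# described above: 10% of value-added production cost, excluding the cost
# of purchased raw materials and processing chemicals. Figures are
# hypothetical.

def credit_45x(total_production_cost: float,
               raw_material_cost: float,
               chemicals_cost: float,
               rate: float = 0.10) -> float:
    value_added = total_production_cost - raw_material_cost - chemicals_cost
    return rate * max(value_added, 0.0)

# Hypothetical processing plant: $9,000/t all-in cost, of which $6,500/t
# is purchased feedstock and reagents, leaving $2,500/t of value added.
print(credit_45x(9_000, 5_500, 1_000))  # $250 credit per tonne
```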

In terms of tariffs, in May 2024 the Biden administration announced a new set of section 301 tariffs on Chinese products, including EVs, batteries, battery components, and critical minerals. The critical minerals tariffs include a 25% tariff on cobalt ores and concentrates that will go into effect in 2024 and a 25% tariff on natural flake graphite that will go into effect in 2026. In addition, there are preexisting 25% tariffs under section 301 on natural and synthetic graphite anode powder. These tariffs were previously waived to give automakers time to diversify their supply chains, but the U.S. Trade Representative (USTR) announced in May 2024 that the exemptions would expire for good on June 14, 2024, citing automakers’ lack of progress as the reason for not extending them.

Current State of Supply Chain Development

For lithium, despite market volatility, offtake demand for existing domestic projects has remained strong thanks to IRA incentives. Based on industry conversations, many of the projects that are developed enough to make offtake agreements have either signed away their full output capacity or are actively in the process of negotiating agreements. Strong demand combined with tax incentives has enabled producers to negotiate offtake agreements that guarantee a price floor at or above their capital and operating costs. Lithium is the only material for which the current planned mining and processing capacity for North America is expected to meet demand from planned U.S. gigafactories.

Graphite project developers report that the 25% tariff coming into force will be sufficient to close the price gap between domestically produced and imported materials, enabling them to secure offtake agreements at a sustainable price. Furthermore, the Internal Revenue Service will require 30D tax credit recipients to submit periodic reports on the progress they are making toward sourcing graphite outside of China. If automakers take these reports and the 2027 exemption deadline seriously, there will be even more motivation to work with domestic graphite producers. However, the current planned production capacity for North America still falls significantly short of demand from planned U.S. battery gigafactories. Processing capacity is the bottleneck for production output, so there is room for additional investment in processing capacity.

Pricing has been a challenge for cobalt, though. Jervois briefly opened the only primary cobalt mine in the U.S. before shutting it down a few months later due to the price crash. Jervois has said that as soon as prices for standard-grade cobalt rise above $20/pound, it will be able to reopen the mine, but that has yet to happen. Moreover, the real bottleneck is in cobalt processing, which has attracted less attention and investment than other critical minerals in the U.S. There are currently no cobalt sulfate refineries in North America; only one or two are in development in the U.S. and a few more in Canada.3

Nickel sulfate is also facing pricing challenges, and, similar to cobalt, there is an insufficient amount of nickel sulfate processing capacity being developed domestically. There is one processing plant being developed in the U.S. that will be able to produce either nickel metal or nickel sulfate and a few more nickel sulfate refineries being developed in Canada.

Policy Solutions to Support the Development of Processing Capacity

The U.S. government should prioritize the expansion of processing capacity for lithium, graphite, cobalt, and nickel. Demand from domestic battery manufacturing is expected to outpace the current planned capacity for all of these materials, and processing capacity is the key bottleneck in the supply chain. Tariffs and tax incentives have resulted in favorable pricing for lithium and graphite project developers, but cobalt and nickel processing has gotten less support and attention. 

DOE should provide demand-side support for processed, battery-grade critical minerals to accelerate the development of processing capacity and address cobalt and nickel pricing needs. The Office of Manufacturing and Energy Supply Chains (MESC) within DOE would be the ideal entity to administer such a program, given its mandate to address vulnerabilities in U.S. energy supply chains. In the immediate term, funding could come from MESC’s Battery Materials Processing Grants program, which has roughly $1.9B in remaining, uncommitted funds. Below we propose a few demand-support mechanisms that MESC could consider.

Long term, the Bipartisan Policy Center proposes that Congress establish and appropriate funding for a new government corporation that would take on the responsibility of administering demand-support mechanisms as necessary to mitigate volume and price uncertainty and ensure that domestic processing capacity grows to sufficiently meet critical minerals needs.

Offtake Backstops

Offtake backstops would commit MESC to guaranteeing the purchase of a specific amount of materials at a minimum negotiated price if producers are unable to find buyers at that price. This essentially creates a price floor for specific producers while also providing a volume guarantee. Offtake backstops help derisk project development and enable developers to access project financing. Backstop agreements should be made for at least the first five years of a plant’s operations, similar to a regular offtake agreement. Ideally, MESC should prioritize funding for critical minerals with the largest expected shortages based on current planned capacity—i.e., nickel, cobalt, and graphite.

There are two primary ways that DOE could implement offtake backstops:

First. The simplest approach would be for DOE to pay processors the difference between the spot price index (adjusted for premiums and discounts) and the pre-negotiated price floor for each unit of material, similar to how a pay-for-difference or one-sided contract-for-difference would work.4 This would enable processors to sign offtake agreements with no price floor, accelerating negotiations and thus the pace of project development. Processors could also choose to keep some of their output capacity uncommitted so that they can sell their products on the spot market without worrying about prices collapsing in the future.
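A minimal sketch of one settlement period under such a pay-for-difference backstop; the index price, premium adjustment, floor, and volume are all hypothetical.

```python
# Minimal sketch of the pay-for-difference backstop described above:
# DOE pays the producer the gap between the (premium/discount-adjusted)
# spot index and a pre-negotiated floor. Figures are hypothetical.

def backstop_payment(spot_index: float,
                     adjustment: float,
                     price_floor: float,
                     volume_t: float) -> float:
    """One settlement period of a one-sided contract-for-difference."""
    adjusted_price = spot_index + adjustment  # e.g., IRA-compliance premium
    shortfall = max(price_floor - adjusted_price, 0.0)
    return shortfall * volume_t

# Floor of $16,500/t on 1,000 t of material; index at $15,000/t
# with a $500/t compliance premium.
print(backstop_payment(15_000, 500, 16_500, 1_000))  # $1,000,000 payout
```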

A more limited form of this could look like DOE subsidizing the price floor for specific offtake agreements between a processor and a buyer. This type of intervention requires a bit more preliminary work from processors, since they would have to identify and bring a buyer to the table before applying for support.

Second. Purchasing the actual materials would be a more complex route for DOE to take, since the agency would have to be ready to receive delivery of the materials. The agency could do this by either setting up a system of warehouses suitable for storing battery-grade critical minerals or using “virtual warehousing,” as proposed by the Bipartisan Policy Center. An actual warehousing system could be set up by contracting with existing U.S. warehouses, such as those in LME and CME’s networks, to expand or upgrade their facilities to store critical minerals. These warehouses could also be made available for companies to store their private stockpiles, increasing the utility of the warehousing system and justifying the cost of setting it up. Virtual warehousing would entail DOE paying producers to store materials on-site at their processing plants.

The physical reserve provides an additional opportunity for DOE to address market volatility by choosing when it sells materials from the reserve. For example, DOE could pause sales of a material when there is an oversupply on the market and prices dip or ramp up sales when there is a shortage and prices spike. However, this can only be used to address short-term fluctuations in supply and demand (e.g. a few months to a few years at most), since these chemicals have limited shelf lives. 
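A toy decision rule for such countercyclical reserve management might look like the following; the thresholds and shelf-life bound are hypothetical assumptions.

```python
# Toy rule for countercyclical sales from a physical reserve, as sketched
# above: hold stock when the market is oversupplied and prices dip, release
# when prices spike. Thresholds are hypothetical, and shelf life bounds how
# long material can be held.

def reserve_action(spot: float, trailing_avg: float,
                   months_in_storage: float, shelf_life_months: float) -> str:
    if months_in_storage > 0.8 * shelf_life_months:
        return "sell"  # rotate stock before it degrades, whatever the price
    if spot < 0.85 * trailing_avg:
        return "hold"  # oversupply: pausing sales avoids deepening the dip
    if spot > 1.15 * trailing_avg:
        return "sell"  # shortage: releasing stock damps the spike
    return "hold"

print(reserve_action(spot=13_000, trailing_avg=16_000,
                     months_in_storage=3, shelf_life_months=18))  # "hold"
```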

A third way to implement offtake backstops that would also support price discovery and transparency is discussed in Section 3. 


Section 3. Creating Stable and Transparent Markets

Concerns about Pricing Mechanisms

Market volatility in critical minerals markets has raised concerns about just how reliable the current pricing mechanisms for these markets are. There are two main ways that prices in a market are determined: third-party price assessments and market exchanges. A third approach that has attracted renewed attention this year is auctions. Below, we walk through these three approaches and propose potential solutions for addressing challenges in price discovery and transparency. 

Index Pricing

Price reporting agencies like Fastmarkets and Benchmark Mineral Intelligence offer subscription services to help market participants assess the price of commodities in a region. These agencies develop rosters of companies for each commodity, which regularly contribute information on transaction prices. That information is then used to generate price indexes. Fastmarkets’ and Benchmark’s indexes are primarily based on prices provided by large, high-volume sellers and buyers. Smaller buyers may pay higher-than-index prices.

It can be hard to establish reliable price indexes in immature markets if there is an insufficient volume of transactions or if the majority of transactions are made by a small set of companies. For example, lithium processing is concentrated among a small number of companies in China and spot transactions are a minority share of the market. New entrants and smaller producers have raised concern that these companies have significant control over Asian spot prices reported by Fastmarkets and Benchmark, which are used to set offtake agreement prices, and that the price indexes are not sufficiently transparent.

Exchange Trading

Market exchanges are a key feature of mature markets that help reduce market volatility. They allow for a wider range of participants, improving market liquidity, and enable price discovery and transparency. Companies up and down the supply chain can use physically delivered futures and options contracts to hedge against price volatility and gain visibility into expectations for the market’s general direction to help inform decision-making. This can help derisk the effect of market volatility on investments in new production capacity.

Of the materials we’ve discussed, nickel and cobalt metal are the only two that are physically traded on a market exchange, specifically LME. Metals make good exchange commodities due to their fungibility. Other forms of nickel and cobalt are typically priced as a percentage of the payable price for nickel and cobalt metal. LME’s nickel price is used as the global benchmark for many nickel products, while the in-warehouse price of cobalt metal in Rotterdam, Europe’s largest seaport, is used as the global benchmark for many cobalt products. These pricing relationships enable companies to use nickel and cobalt metal as proxies for hedging related materials.
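A small sketch of this payables convention; the payable percentage, contained-metal share, and benchmark price are hypothetical.

```python
# Sketch of "payables" pricing: intermediate products trade at a
# percentage of the metal benchmark, applied to the metal contained
# in the product. All numbers are hypothetical.

def payable_price(metal_benchmark_usd_per_t: float,
                  payable_pct: float,
                  contained_metal_share: float) -> float:
    """Price per tonne of product, paid on contained metal."""
    return metal_benchmark_usd_per_t * payable_pct * contained_metal_share

# Hypothetical cobalt hydroxide: 60% payable on contained cobalt
# (~30% of product mass), against a $34,000/t cobalt metal benchmark.
print(payable_price(34_000, 0.60, 0.30))  # $6,120 per tonne of product
```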

After nickel trading volumes plummeted on LME in the wake of the short squeeze, doubts were raised about LME’s ability to accurately benchmark its price, sparking interest in alternative exchanges. In April 2024, UK-based Global Commodities Holdings Ltd (GCHL) launched a new trading platform for nickel metal that is only available to producers, consumers, and merchants directly involved in the physical market, excluding speculative traders. The trading platform will deliver globally “from Baltimore to Yokohama.” GCHL is using the prices on the platform to publish its own price index and is also working with Intercontinental Exchange to create cash-settled derivatives contracts. This new platform could potentially expand to other metals and critical minerals. 

Beyond LME’s troubles, changes in the battery supply chain have led to a growing divergence between the nickel and cobalt metal traded on exchanges and the actual chemicals used to make batteries. Chinese processors, who produce most of the global supply of nickel sulfate, have mostly switched from nickel metal to cheaper nickel intermediate products as their primary feedstock. Consequently, market participants say that the LME exchange price for nickel metal, which is mostly driven by stainless steel, no longer reflects market conditions for the battery sector, raising the need for new tradeable contracts and pricing mechanisms. For the cobalt industry, 75% of demand comes from batteries, which use cobalt sulfate. Cobalt metal makes up only 18% of the market, of which only 10-15% is traded on the spot market. As a result, cobalt chemicals producers have transitioned away from using the metal reference price towards fixed prices or cobalt sulfate payables.

These trends motivate the development of new exchange contracts for physically trading nickel and cobalt chemicals that can enable price discovery separate from the metals markets. There is also a need to develop exchange contracts for materials like lithium and graphite with immature markets that exhibit significant volatility. 

However, exchange trading of these materials is complicated by their nature as specialty chemicals: they have limited shelf lives and more complex storage requirements, unlike metal commodities. Lithium and graphite products also exhibit significant variations that affect how buyers can use them. For example, depending on the types and level of impurities in lithium hydroxide/carbonate, manufacturers of cathode active materials may need to conduct different chemical processes to remove them. Offtakers may also require that products meet additional specifications based on the characteristics they need for their CAM and battery chemistries.

For these reasons, major exchanges like LME, the Chicago Mercantile Exchange (CME), and the Singapore Exchange (SGX) have instead chosen to launch cash-settled contracts for lithium hydroxide/carbonate and cobalt hydroxide that allow for financial trading, but require buyers and sellers to arrange physical delivery separately from the exchange. Large firms have begun to participate increasingly in these derivatives markets to hedge against market volatility, but the lack of physical settlement limits their utility to producers who still need to physically deliver their products in order to make a profit. Nevertheless, CME’s contracts for lithium and cobalt have seen significant growth in transaction volume. LME, CME, and SGX all use Fastmarkets’ price indexes as the basis for their cash-settled contracts. 

As regional industries mature and products become more standardized, these exchanges may begin to add physically settled contracts for battery-grade critical minerals. For example, the Guangzhou Futures Exchange (GFEX) in China, where the vast majority of lithium refining currently occurs, began offering physically settled contracts for lithium carbonate in August 2023. Though the exchange exhibited significant volatility in its first few months, raising concerns, the first round of physical deliveries in January 2024 occurred successfully, and trading volumes have been substantial this year. Access to GFEX is currently limited to Chinese entities and their affiliates, but another trading platform could come to do the same for North America over the next few decades as lithium production volume grows and a spot market emerges. Abaxx Exchange, a Singapore-based startup, has also launched a physically settled futures contract for nickel sulfate with delivery points in Singapore and Rotterdam. A North American delivery point could be added as the North American supply chain matures. 

No market exchange for graphite currently exists, since products in the industry vary even more than those of other materials. Even the currently available price indexes are not seen as sufficiently robust for offtake pricing.

Auctions

In the absence of a globally accessible market exchange for lithium, and amid concerns about the transparency of index pricing, Albemarle, the top producer of lithium worldwide, has turned to auctions of spodumene concentrate and lithium carbonate as a means to improve market transparency and an “approach to price discovery that can lead to fair product valuation.” Albemarle’s first auction, of spodumene concentrate in China in March, closed at a price of $1,200/ton, which was in line with spot prices reported by Asian Metal but about 10% higher than prices provided by other price reporting agencies like Fastmarkets. Plans are in place to continue conducting regular auctions at a rate of about one per week in China and other locations like Australia. Lithium hydroxide will be auctioned as well. Auction data will be provided to Fastmarkets and other price reporting agencies to be incorporated into publicly available price indexes.

Auctions are not a new concept: in 2021 and 2022, Pilbara Minerals regularly conducted auctions of spodumene on its own platform, Battery Metals Exchange, helping to improve market sentiment. Now, though, the company says that most of its material is committed to offtakers, so auctions have mostly stopped, though it did hold an auction for spodumene concentrate in March. If other lithium producers join Albemarle in conducting auctions, the data could help improve the accuracy and transparency of price indexes. Auctions could also be used to inform the pricing of other battery-grade critical minerals.

Policy Solutions to Support Price Discovery and Transparency Across the Market

Right now, the only pricing mechanisms available to domestic project developers are spot price indexes for battery-grade critical minerals in Asia or global benchmarks for proxies like nickel and cobalt metal. Long-term, the development of new pricing mechanisms for North America will be crucial to price discovery and transparency in this new market. There are two ways that DOE could help facilitate this: one that could be implemented immediately for some materials and one that will require domestic production volume to scale up first.

First. Government-Backed Auctions: Auctions require project developers to keep a portion of their expected output uncommitted to any offtakers. There is a risk, however, that future auctions won’t generate a price sufficient to offset capital and operating expenses, so processors are unlikely to do this on their own, especially for their first domestic project. MESC could address this by providing a backstop guarantee for the portion of a producer’s output that it commits to regularly auctioning for a set timespan. If future auctions fail to generate a price above a pre-negotiated price floor, DOE would pay sellers the difference between the price floor and the highest auction price for each unit sold. Such an agreement could be made using DOE’s Other Transaction Authority. DOE could separately contract with a platform such as MetalsHub to conduct the auctions. 
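To make the payout mechanics concrete, here is a minimal sketch of the backstop calculation described above; the floor price, auction price, and volume are hypothetical placeholders rather than values proposed in this report.

    # Minimal sketch of the backstop payout described above. DOE pays the
    # gap between the pre-negotiated floor and the highest auction price,
    # per unit sold, only when auctions clear below the floor.
    def backstop_payout(price_floor: float, highest_auction_price: float,
                        units_sold: float) -> float:
        shortfall = max(0.0, price_floor - highest_auction_price)
        return shortfall * units_sold

    # Hypothetical example: $1,100/ton floor, auction clears at $950/ton,
    # 500 tons sold -> DOE pays $75,000. If the auction clears above the
    # floor, the payout is zero.
    print(backstop_payout(1100.0, 950.0, 500.0))  # -> 75000.0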

Government-backed auctions would enable the discovery of a true North American price for different battery-grade critical minerals and the raw materials used to make them, generating a useful comparison point with Asian spot prices. Such a scheme would also help address developers’ price and demand needs for project financing. These backstop-auction agreements could be complementary to the other types of backstop agreements proposed earlier and potentially more appealing than physically offtaking materials since the government would not have to receive delivery of the materials and there would be a built-in mechanism to sell the materials to an appropriate buyer. If successful, companies could continue to conduct auctions independently after the agreements expire.

Second. New Benchmark Contracts: Employ America has proposed that the Loan Programs Office (LPO) could use Section 1703 to guarantee lending to a market exchange to develop new, physically settled benchmark contracts for battery-grade critical minerals. The development of new contracts should include producers across the entire North American region: Canada also has a significant number of mines and processing plants in development, and including those projects would increase the number of participants, market volume, and liquidity of new benchmark contracts.

In order for auctions or new benchmark contracts to operate successfully, three prerequisites must be met:

  1. There must be a sufficient volume of materials available for sale (i.e. production output that is not committed to an offtaker).
  2. There must be sufficient product standardization in the industry such that materials produced by different companies can be used interchangeably by a significant number of buyers.
  3. There must be a sufficient volume of demand from buyers, brokers, and traders.

Before launching a new contract, market exchanges typically research prospective participants to understand whether the market is mature enough to meet these requirements. Interest from buyers and sellers must indicate that there would be enough trading volume for the exchange to earn a profit greater than the cost of setting up the new contract. A loan from LPO under Section 1703 could help offset some of those upfront costs, potentially making it worthwhile for an exchange to launch a new contract in a less mature market than it typically would. 

Government-backed auctions, on the other hand, solve the first prerequisite by offering guarantees to producers for keeping a portion of their production output uncommitted. Product standardization can also be less stringent, since each producer can hold separate auctions, with varying material specifications, unlike market exchanges where there must be a single set of product standards.

Given current market conditions, no battery-grade critical minerals can meet the above prerequisites for new benchmark contracts, primarily due to a lack of available volume, though there are also issues with product standardization for certain materials. However, nickel, cobalt, lithium, and graphite could be good candidates for government-backed auctions. DOE should start engaging with project developers that have yet to fully commit their output to offtakers and gauge their interest in backstop-auction agreements. 

Nickel and Cobalt

As discussed earlier, only a handful of nickel and cobalt sulfate refineries are currently being developed in North America, making it difficult to establish a North American benchmark contract. None of the project developers have yet signed offtake agreements covering their full production capacity, so backstop-auction agreements could be appealing to developers and their investors. Given that more than half of the projects in development are located in Canada, MESC and DOE’s Office of International Affairs should collaborate with the Canadian government in designing and implementing government-backed auctions. 

Lithium

Domestic companies have expressed interest in establishing North American spot markets and price indexes for lithium hydroxide and carbonate but say it will take several years before production volume is large enough to warrant them. Lithium processors have also raised concerns about product variation whenever the idea of a market exchange or public auction has come up. Lessons could be learned from the GFEX battery-grade lithium carbonate contracts. GFEX set standards on purity, moisture, loss on ignition, and the maximum content of different impurities. Some Chinese companies were able to meet these standards, while others were not, preventing them from participating in the futures market or requiring them to trade their materials as lower-purity industrial-grade lithium carbonate, which sells at a discount. Other companies, whose lithium far exceeds the GFEX standards, opted to continue selling on the spot market because they could charge a premium over the standard price. Despite some companies choosing not to participate, trading volumes on GFEX have been substantial, and the exchange weathered initial concerns about a short squeeze, suggesting that challenges with product variation can be overcome through standardization.
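As a toy illustration of why standardization matters, the sketch below checks a production lot against a single set of contract limits, the way an exchange’s deliverability rules partition producers into those who can and cannot deliver. The thresholds are invented for illustration and are not GFEX’s actual specifications.

    # Hypothetical contract limits; NOT GFEX's actual specifications.
    SPEC = {
        "purity_min_pct": 99.5,
        "moisture_max_pct": 0.25,
        "loss_on_ignition_max_pct": 0.5,
    }

    def meets_contract_spec(lot: dict) -> bool:
        # A lot is deliverable only if it satisfies every contract limit.
        return (
            lot["purity_pct"] >= SPEC["purity_min_pct"]
            and lot["moisture_pct"] <= SPEC["moisture_max_pct"]
            and lot["loss_on_ignition_pct"] <= SPEC["loss_on_ignition_max_pct"]
        )

    lot = {"purity_pct": 99.6, "moisture_pct": 0.2, "loss_on_ignition_pct": 0.4}
    print(meets_contract_spec(lot))  # -> True: deliverable against the contract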

Analysts have proposed that spodumene could be a better candidate for exchange trading, since it is fungible and does not have the limited shelf life or storage requirements of lithium salts. About 60% of global lithium comes from spodumene, and the U.S. has some of the largest spodumene deposits in the world, making spodumene a good proxy for lithium salts in North America. However, the two domestic developers of spodumene mines plan to construct on-site processing plants to convert their spodumene into battery-grade lithium. Similarly, the two Canadian mines that currently produce spodumene plan to build their own processing plants. These vertical integration plans mean that large amounts of spodumene are unlikely to be available for sale on a market exchange in the near future.

DOE could, however, work with miners and processors to sign backstop-auction agreements for the smaller amounts of lithium hydroxide/carbonate and spodumene that they have yet to commit to offtakers. This may be especially appealing to companies that have announced delays to project development due to currently low market prices, and it could help derisk moving those timelines forward. Interest in these auctions could also help gauge the potential for developing new benchmark contracts for lithium hydroxide/carbonate further down the line.

Graphite

Natural and synthetic graphite anode material products currently exhibit wide variation and insufficient standardization, so a market exchange is not viable at the moment. As the domestic graphite industry develops, DOE should work with graphite anode material producers and battery manufacturers to understand the types and degrees of variation that exist across products and discuss avenues toward product standardization. Government-backed auctions could be a smaller-scale way to test the viability of product standards developed through that process, perhaps using several tiers or categories to group products. Natural and synthetic graphite would, of course, have to be treated separately. 

Conclusion

The current global critical minerals supply chain partially reflects more than a decade of focused industrial policy by the Chinese government. If the U.S. wants to lead the clean energy transition, critical minerals will also need to become a cornerstone of U.S. industrial policy. Developing a robust North American critical minerals industry would bolster U.S. energy security and independence and ensure a smooth energy transition. 

Promising progress has already been made in lithium, with planned processing capacity expected to meet demand from future battery manufacturing. However, market and pricing challenges remain for battery-grade nickel, cobalt, and graphite, which will fall far short of future demand without additional intervention. This report proposes that DOE take a two-pronged approach to supporting the critical minerals industry through offtake backstops, which address project developers’ current pricing dilemmas, and the development of more reliable and transparent pricing mechanisms such as government-backed auctions, which will set up markets for the future.

While the solutions proposed in this report focus on DOE as the primary implementer, Congress also has a role to play in authorizing and appropriating the new funding necessary to execute a cohesive industrial strategy on critical minerals. The policies proposed in this report can also be applied to other critical minerals crucial for the energy transition and our national security. Similar analyses of other critical minerals markets and end uses should be conducted to understand how these solutions can be tailored to those industries’ needs. 

GenAI in Education Research Accelerator (GenAiRA)

The United States faces a critical challenge in addressing the persistent learning opportunity gaps in math and reading, particularly among disadvantaged student subgroups. According to the 2022 National Assessment of Educational Progress (NAEP) data, only 37% of fourth-grade students performed at or above the proficient level in math, and 33% in reading. The rapid advancement of generative AI (GenAI) technologies presents an unprecedented opportunity to bridge these gaps by providing personalized learning experiences and targeted support. However, the current mismatch between the speed of GenAI innovation and the lengthy traditional research pathways hinders the thorough evaluation of these technologies before widespread adoption, potentially leading to unintended negative consequences.

Failure to adapt our research and regulatory processes to keep pace with the development of GenAI technologies could expose students to ineffective or harmful educational tools, exacerbate existing inequities, and hinder our ability to prepare all students for success in an increasingly complex and technology-driven world. The education sector must act with urgency to establish the necessary infrastructure, expertise, and collaborative partnerships to ensure that GenAI-powered tools are rigorously evaluated, continuously improved, and equitably implemented to benefit all students.

To address this challenge, we propose three key recommendations for congressional action:

  1. Establish the GenAI in Education Research Accelerator Program (GenAiRA) within the Institute of Education Sciences (IES) to support and expedite efficacy research on GenAI-powered educational tools.
  2. Adapt IES research and evaluation processes to create a framework for the rapid assessment of GenAI-enabled educational technology, including alternative research designs and evidence standards.
  3. Support the establishment of a GenAI Education Research and Innovation Consortium, bringing together schools, researchers, and education technology (EdTech) developers to participate in rapid cycle studies and continuous improvement of GenAI tools.

By implementing these recommendations, Congress can foster a more responsive and evidence-based ecosystem for GenAI-powered educational tools, ensuring that they are equitable, effective, and safe for all students. This comprehensive approach will help unlock the transformative potential of GenAI to address persistent learning opportunity gaps and improve outcomes for all learners, while maintaining scientific rigor and prioritizing student well-being.

During the preparation of this work, the authors used Claude 3 Opus (by Anthropic) to help clarify, synthesize, and add accessible language around concepts and ideas generated by members of the team. The authors reviewed and edited the content as needed and take full responsibility for the content of this publication.

Challenge and Opportunity

Widening Learning Opportunity Gap 

NAEP data reveals that many U.S. students, especially those from disadvantaged subgroups, are not achieving proficiency in math and reading. In 2022, only 37% of fourth-graders performed at or above the NAEP proficient level in math, and 33% in reading—the lowest levels in over a decade. Disparities are more profound when disaggregated by race, ethnicity, and socioeconomic status; for example, only 17% of Black students and 21% of Hispanic students reached reading proficiency, compared to 42% of white students.

Rapid AI Evolution

GenAI is a transformative technology that enables rapid development and personalization of educational content and tools, addressing unmet needs in education such as lack of resources, 1:1 teaching time, and teacher quality. However, that rapid pace also raises concerns about premature adoption of unvetted tools, which could negatively impact students’ educational achievement. Unvetted GenAI tools may introduce misconceptions, provide incorrect guidance, or be misaligned with curriculum standards, leading to gaps in students’ understanding of foundational concepts. If used for an extended period, particularly with vulnerable learners, these tools could have a long-term impact on learning foundations that may be difficult to remedy.

On the other hand, carefully designed, trained, and vetted GenAI models that have undergone rapid cycle studies and design iterations based on data have the potential to effectively address students’ misconceptions, build solid learning foundations, and provide personalized, adaptive support to learners. These tools could accelerate progress and close learning opportunity gaps at an unprecedented scale.

Slow Vetting Processes 

The rapid pace of AI development poses significant challenges for traditional research and evaluation processes in education. Efficacy research, particularly studies sponsored by the IES or other Department of Education entities, is a lengthy, resource-intensive, and often onerous process that can take years to complete. Randomized controlled trials and longitudinal studies struggle to keep up with the speed of AI innovation: by the time a study is completed, the AI-powered tool may have already undergone multiple iterations or been replaced.

It can be difficult to recruit and sustain school and teacher participation in efficacy research due to the significant time and effort required from educators. Moreover, obtaining certifications and approvals for research can be complex and time-consuming, as researchers must navigate institutional review boards, data privacy regulations, and ethical guidelines, which can delay the start of a study by months or even years.

Many EdTech developers find themselves in a catch-22 situation, where their products are already being adopted by schools and educators, yet they are simultaneously expected to participate in lengthy and expensive research studies to prove efficacy. The time and resources required to engage in such research can be a significant burden for EdTech companies, especially start-ups and small businesses, which may prefer to focus on iterating and improving their products based on real-world feedback. As a result, many EdTech developers may be reluctant to participate in traditional efficacy research, further exacerbating the disconnect between the rapid pace of AI innovation and the slow process of evaluating the effectiveness of these tools in educational settings.

Gaps in Existing Efforts and Programs

While federal initiatives like SEERNet and ExpandAI have made strides in supporting AI and education research and development, they may not be fully equipped to address the specific challenges and opportunities presented by GenAI.

In particular, traditional approaches to efficacy research and evaluation may not be well-suited to evaluating the potential benefits and outcomes associated with GenAI-powered tools in the short term, particularly when assessing whether a program shows enough promise to warrant wider deployment with students. 

A New Approach 

To address these challenges and bridge the gap between GenAI innovation and efficacy research, we need a new approach to streamline the research process, reduce the burden on educators and schools, and provide timely and actionable insights into the effectiveness of GenAI-powered tools. This may involve alternative study designs, such as rapid cycle evaluations or single-case research, and developing new incentive structures and support systems to encourage and facilitate the participation of teachers, schools, and product developers in research studies.

GenAiRA aims to tackle these challenges by providing resources, guidance, and infrastructure to support more agile and responsive efficacy research in the education sciences. By fostering collaboration among researchers, developers, and educators, and promoting innovative approaches to evaluation, this program can help ensure that the development and adoption of AI-powered tools in education are guided by rigorous, timely, and actionable evidence—while simultaneously mitigating risks to students.

Learning from Other Sectors 

Valuable lessons can be drawn from other fields that have faced similar balancing acts between innovation, research, and safety. Two notable examples are the U.S. Food and Drug Administration’s (FDA) expedited review pathways for drug development and the National Institutes of Health’s (NIH) Clinical and Translational Science Awards (CTSA) program for accelerating medical research.

Example 1: The FDA Model

The FDA’s expedited review programs, such as Fast Track, Breakthrough Therapy, Accelerated Approval, and Priority Review, are designed to speed up the development and approval of drugs that address unmet medical needs or provide significant improvements over existing treatments. These pathways recognize that, in certain cases, the benefits of bringing a potentially life-saving drug to market quickly may outweigh the risks associated with a more limited evidence base at the time of approval.

Key features include:

  1. Early and frequent communication between the FDA and drug developers to provide guidance and feedback throughout the development process.
  2. Flexibility in clinical trial design and evidence requirements, such as allowing the use of surrogate endpoints or single-arm studies in certain cases.
  3. Rolling review of application materials, allowing drug developers to submit portions of their application as they become available rather than waiting for the entire package to be complete.
  4. Shortened review timelines, with the FDA committing to reviewing and making a decision on an application within a specified timeframe (e.g., six months for Priority Review).

These features can accelerate the development and approval process while still ensuring that drugs meet standards for safety and effectiveness. They also acknowledge that the evidence base for a drug may evolve over time, with post-approval studies and monitoring playing a crucial role in confirming the drug’s benefits and identifying any rare or long-term side effects.

Example 2: The CTSA Program

The NIH’s CTSA program established a national network of academic medical centers, research institutions, and community partners to accelerate the translation of research findings into clinical practice and improve patient outcomes.

Key features include:

  1. Collaborative research infrastructure, consisting of a network of institutions and partners that work together to conduct translational research, share resources and expertise, and disseminate best practices.
  2. Streamlined research processes with standardized protocols, templates, and tools to facilitate the rapid design, approval, and implementation of research studies across the network.
  3. Training and development of researchers and clinicians to build a workforce equipped to conduct innovative and rigorous translational research.
  4. Community engagement in the research process to ensure that studies are responsive to real-world needs and priorities.

By learning from the successes and principles of the FDA’s expedited review pathways and the NIH’s CTSA program, the education sector can develop its own innovative approach to accelerating the responsible development, evaluation, and deployment of GenAI-powered tools, as outlined in the following plan of action.

Plan of Action

To address the challenges and opportunities presented by GenAI in education, we propose the following three key recommendations for congressional action and the evolution of existing programs.

Recommendation 1. Establish the GenAI in Education Research Accelerator Program (GenAiRA).

Congress should establish the GenAiRA, housed in the IES, to support and expedite efficacy research on AI-powered educational tools and programs. This program will:

  1. Provide funding and resources to researchers and educators to conduct rigorous, timely, and cost-effective efficacy studies on promising AI-based solutions that address achievement gaps.
  2. Create guidelines and offer webinars and technical assistance to researchers, educators, and developers to build expertise in the responsible design, implementation, and evaluation of GenAI-powered tools in education.
  3. Foster collaboration and knowledge-sharing among researchers, educators, and GenAI developers to facilitate the rapid translation of research findings into practice and continuously improve GenAI-powered tools.
  4. Develop and disseminate best practices, guidelines, and ethical frameworks for responsible development and deployment of GenAI-enabled educational technology tools in educational settings, focusing on addressing bias, accuracy, privacy, and student agency issues.

Recommendation 2. Under the auspices of GenAiRA, adapt IES research and evaluation processes to create a framework to evaluate GenAI-enabled educational technology.

In consultation with experts in educational research and AI, IES will develop a framework that:

  1. Identifies existing research designs and creates alternative research designs (e.g., quasi-experimental studies, rapid cycle evaluations) suitable for generating credible evidence of effectiveness while being more responsive to the rapid pace of AI innovation. 
  2. Establishes evidence-quality guidelines for rapid evaluation, including minimum sample sizes, study duration, effect size, and targeted population (see the sketch after this list).
  3. Funds replication and expansion studies to determine impact in different contexts or with different populations (e.g., students with IEPs and English learners).
  4. Provides guidance to districts on how to interpret and apply evidence from different types of studies to inform decision-making around adopting and using AI technologies in education.
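As a rough illustration of how these guidelines interact, the sketch below runs a standard statistical power calculation: the smallest effect a rapid evaluation is expected to detect largely determines its minimum sample size. The numbers are illustrative, not proposed IES thresholds.

    # Illustrative power analysis: required per-arm sample size for a
    # two-group comparison at conventional alpha and power. Effect sizes
    # are Cohen's d; none of these are proposed IES thresholds.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for d in (0.2, 0.3, 0.5):
        n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8)
        print(f"d={d}: ~{n:.0f} students per arm")
    # Smaller effects demand far larger samples, which is why minimum
    # sample sizes and target effect sizes must be set together.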

Recommendation 3. Establish a GenAI Education Research and Innovation Consortium.

Congress should provide funding and incentives for IES to establish a GenAI Education Research and Innovation Consortium that brings together a network of “innovation schools,” research institutions, and EdTech developers committed to participating in rapid cycle studies and continuous improvement of GenAI tools in education. This approach will ensure that AI tools are developed and implemented in a way that is responsive to the needs and values of educators, students, and communities.

To support this consortium, Congress should:

  1. Allocate funds for the IES to provide grants and resources to schools, research institutions, and EdTech developers that meet established criteria for participation in the consortium, such as demonstrated commitment to innovation, research capacity, and ethical standards.
  2. Direct IES to work with programs like SEERNet and ExpandAI to identify and match potential consortium members, provide guidance and oversight to ensure that research studies meet rigorous standards for quality and ethics, and disseminate findings and best practices to the broader education community.
  3. Encourage the development of standardized protocols and templates for data sharing, privacy protection, and informed consent within the consortium, to reduce the time and effort required for each individual study and streamline administrative processes.
  4. Incentivize participation in the consortium by offering resources and support for schools, researchers, and developers, such as access to funding opportunities, technical assistance, and professional development resources.
  5. Require the establishment of a central repository of research findings and best practices generated through rapid cycle evaluations conducted within the consortium, to facilitate the broader dissemination and adoption of effective GenAI-powered tools.

Conclusion 

Persistent learning opportunity gaps in math and reading, particularly among disadvantaged students, are a systemic challenge requiring innovative solutions. GenAI-powered educational tools offer potential for personalizing learning, identifying misconceptions, and providing tailored support. However, the mismatch between the pace of GenAI innovation and lengthy traditional research pathways impedes thorough vetting of these technologies to ensure they are equitable, effective, and safe before widespread adoption.

GenAiRA and development of alternative research frameworks provide a comprehensive approach to bridge the divide between GenAI’s rapid progress and the need for thorough evaluation in education. Leveraging existing partnerships, research infrastructure, and data sources can expedite the research process while maintaining scientific rigor and prioritizing student well-being.

The plan of action creates a roadmap for responsibly harnessing GenAI’s potential in education. Identifying appropriate congressional mechanisms for establishing the accelerator program, such as creating a new bill or incorporating language into upcoming legislation, can ensure this critical initiative receives necessary funding and oversight.

This comprehensive strategy charts a path toward equitable, personalized learning facilitated by GenAI while upholding the highest standards of evidence. Aligning GenAI innovation with rigorous research and prioritizing the needs of underserved student populations can unlock the transformative potential of these technologies to address persistent achievement gaps and improve outcomes for all learners.

This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.

Frequently Asked Questions
What makes AI and GenAI-powered educational tools different from traditional educational technologies?
AI and GenAI-powered educational tools differ from traditional educational technologies in their speed of development and deployment, as AI-generated content can be created and deployed extremely quickly, often with little time taken for thorough testing and evaluation. Additionally, AI-powered tools can generate content dynamically based on user inputs and interactions, meaning that the content presented to each student may be different every time, making it inherently more time-consuming to test and evaluate compared to fixed, pre-written content. Also, the ability of AI-powered tools to rapidly generate and disseminate educational content at scale means that any issues or flaws in the technology can have far-reaching consequences, potentially impacting large numbers of students across multiple schools and districts.
How do gaps in early grades impact students’ long-term educational outcomes and opportunities?
Students who fall behind in math and reading in the early years are more likely to struggle academically in later grades, leading to lower graduation rates, reduced college enrollment, and limited career opportunities.
What are some of the limitations of current educational interventions in addressing these learning opportunity gaps?
Current educational interventions often take a one-size-fits-all approach, failing to address the unique learning needs of individual students. They may also lack the ability to provide immediate feedback and adapt instruction in real-time based on student performance.
How has the rapid advancement of AI and GenAI technologies created new opportunities for personalized learning and targeted support?
Advancements such as machine learning and natural language processing have enabled the development of educational tools that can analyze vast amounts of student data, identify patterns in learning behavior, and provide customized recommendations and support. Personalization can include recommendations for what topics to learn and when, but also adjustments to finer details like amount and types of feedback and support provided. Further, content can be adjusted to make it more accessible to students, both from a language standpoint (dynamic translation) and a cultural one (culturally relevant contexts and characters). In the past, these types of adjustments were not feasible due to the labor involved in building them. With GenAI, this level of personalization will become commonplace and expected.
What are the potential risks or unintended consequences of implementing AI-powered educational tools without sufficient evidence of their effectiveness or safety?
Implementing AI and GenAI-powered educational tools without sufficient evidence of their effectiveness or safety could lead to the widespread use of ineffective interventions. If these tools fail to improve student outcomes or even hinder learning progress, they can have long-lasting negative consequences for students’ academic attainment and self-perception as learners.

When students are exposed to ineffective educational tools, they may struggle to grasp key concepts, leading to gaps in their knowledge and skills. Over time, these gaps can compound, leaving students ill-prepared for future learning challenges and limiting their academic and career opportunities. Moreover, repeated experiences of frustration and failure with educational technologies can erode students’ confidence, motivation, and engagement with learning.

This erosion of learner identity can be particularly damaging for students from disadvantaged backgrounds, who may already face additional barriers to academic success. If AI-powered tools fail to provide effective support and personalization, these students may fall even further behind their peers, exacerbating existing educational inequities.

How can we ensure that AI and GenAI-powered educational tools are developed and implemented in an equitable manner, benefiting all students, especially those from disadvantaged backgrounds?
By prioritizing research and funding for interventions that target the unique needs of disadvantaged student populations. We must also engage diverse stakeholders, including educators, parents, and community members, in the design and evaluation process to ensure that these tools are culturally responsive and address the specific challenges faced by different communities.
How can educators, parents, and policymakers stay informed about the latest developments in AI-powered educational tools and make informed decisions about their adoption and use?
Educators, parents, and policymakers can stay informed by engaging with resources, guidance and programs developed by organizations like the Office of Educational Technology, Institute of Education Sciences, EDSAFE AI Alliance and others on the opportunities and risks of AI/GenAI in education.

A Safe Harbor for AI Researchers: Promoting Safety and Trustworthiness Through Good-Faith Research

Artificial intelligence (AI) companies disincentivize safety research by implicitly threatening to ban independent researchers who demonstrate safety flaws in their systems. While Congress encourages companies to provide bug bounties and protections for security research, this is not yet the case for AI safety research. Without independent research, we do not know whether the AI systems being deployed today are safe or whether they pose widespread risks that have yet to be discovered, including risks to U.S. national security. While companies conduct adversarial testing in advance of deploying generative AI models, they fail to adequately test their models after they are deployed as part of an evolving product or service. Therefore, Congress should promote the safety and trustworthiness of AI systems by establishing bug bounties for AI safety via the Chief Digital and Artificial Intelligence Office and creating a safe harbor for research on generative AI platforms as part of the Platform Accountability and Transparency Act.

Challenge and Opportunity 

In July 2023, the world’s top AI companies signed voluntary commitments at the White House, pledging to “incent third-party discovery and reporting of issues and vulnerabilities.” Almost a year later, few of the signatories have lived up to this commitment. While some companies do reward researchers for finding security flaws in their AI systems, few companies strongly encourage research on safety or provide concrete protections for good-faith research practices. Instead, leading generative AI companies’ Terms of Service legally prohibit safety and trustworthiness research, in effect threatening anyone who conducts such research with bans from their platforms or even legal action.

In March 2024, over 350 leading AI researchers and advocates signed an open letter calling for “a safe harbor for independent AI evaluation.” The researchers noted that generative AI companies offer no legal protections for independent safety researchers, even though this research is critical to identifying safety issues in AI models and systems. The letter stated: “whereas security research on traditional software has established voluntary protections from companies (‘safe harbors’), clear norms from vulnerability disclosure policies, and legal protections from the DOJ, trustworthiness and safety research on AI systems has few such protections.” 

In the months since the letter was released, companies have continued to be opaque about key aspects of their most powerful AI systems, such as the data used to build their models. If a researcher wants to test whether AI systems like ChatGPT, Claude, or Gemini can be jailbroken such that they pose a threat to U.S. national security, they are not allowed to do so as companies proscribe such research. Developers of generative AI models tout the safety of their systems based on internal red-teaming, but there is no way for the federal government or independent researchers to validate these results, as companies do not release reproducible evaluations. 

Generative AI companies also impose barriers on their platforms that limit good-faith research. Unlike much of the web, the content on generative AI platforms is not publicly available, meaning that users need accounts to access AI-generated content and these accounts can be restricted by the company that owns the platform. In addition, companies like Google, Amazon, Microsoft, and OpenAI block certain requests that users might make of their AI models and limit the functionality of their models to prevent researchers from unearthing issues related to safety or trustworthiness.

Similar issues plague social media, as companies take steps to prevent researchers and journalists from conducting investigations on their platforms. Social media researchers face liability under the Computer Fraud and Abuse Act and Section 1201 of the Digital Millennium Copyright Act among other laws, which has had a chilling effect on such research and worsened the spread of misinformation online. The stakes are even higher for AI, which has the potential not only to turbocharge misinformation but also to provide U.S. adversaries like China and Russia with material strategic advantages. While legislation like the Platform Accountability and Transparency Act would enable research on recommendation algorithms, proposals that grant researchers access to platform data do not consider generative AI platforms to be in scope.

Congress can safeguard U.S. national security by promoting independent AI safety research. Conducting pre-deployment risk assessments is insufficient in a world where tens of millions of Americans are using generative AI—we need real-time assessments of the risks posed by AI systems after they are deployed as well. Big Tech should not be taken at its word when it says that its AI systems cannot be used by malicious actors to generate malware or spy on Americans. The best way to ensure the safety of generative AI systems is to empower the thousands of cutting-edge researchers at U.S. universities who are eager to stress test these systems. Especially for general-purpose technologies, small corporate safety teams are not sufficient to evaluate the full range of potential risks, whereas the independent research community can do so thoroughly.

Figure 1. What access protections do AI companies provide for independent safety research? Source: Longpre et al., “A Safe Harbor for AI Evaluation and Red Teaming.”

Plan of Action

Congress should enable independent AI safety and trustworthiness researchers by adopting two new policies. First, Congress should incentivize AI safety research by creating algorithmic bug bounties for this kind of work. AI companies often do not incentivize research that could reveal safety flaws in their systems, even though the government will be a major client for these systems. Even small incentives can go a long way, as there are thousands of AI researchers capable of demonstrating such flaws. This would also entail establishing mechanisms through which safety flaws or vulnerabilities in AI models can be disclosed, akin to a help line for AI systems.

Second, Congress should require AI platform companies, such as Google, Amazon, Microsoft, and OpenAI to share data with researchers regarding their AI systems. As with social media platforms, generative AI platforms mediate the behavior of millions of people through the algorithms they produce and the decisions they enable. Companies that operate application programming interfaces used by tens of thousands of enterprises should share basic information about their platforms with researchers to facilitate external oversight of these consequential technologies. 

Taken together, vulnerability disclosure incentivized through algorithmic bug bounties and protections for researchers enabled by safe harbors would substantially improve the safety and trustworthiness of generative AI systems. Congress should prioritize mitigating the risks of generative AI systems and protecting the researchers who expose them.

Recommendation 1. Establish algorithmic bug bounties for AI safety.

As part of the FY2024 National Defense Authorization Act (NDAA), Congress established “Artificial Intelligence Bug Bounty Programs” requiring that within 180 days “the Chief Digital and Artificial Intelligence Officer of the Department of Defense shall develop a bug bounty program for foundational artificial intelligence models being integrated into the missions and operations of the Department of Defense.” However, these bug bounties extend only to security vulnerabilities. In the FY2025 NDAA, this bug bounty program should be expanded to include AI safety. See below for draft legislative language to this effect. 

Recommendation 2. Create legal protections for AI researchers.

Section 9 of the proposed Platform Accountability and Transparency Act (PATA) would establish a “safe harbor for research on social media platforms.” This likely excludes major generative AI platforms such as Google Cloud, Amazon Web Services, Microsoft Azure, and OpenAI’s API, meaning that researchers have no legal protections when conducting safety research on generative AI models via these platforms. PATA and other legislative proposals related to AI should incorporate a safe harbor for research on generative AI platforms.

Conclusion

The need for independent AI evaluation has garnered significant support from academics, journalists, and civil society. Safe harbor for AI safety and trustworthiness researchers is a minimum fundamental protection against the risks posed by generative AI systems, including those related to national security. Congress has an important opportunity to act before it’s too late.

This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.

Frequently Asked Questions
Do companies support this idea?
Some companies are supportive of this idea, but many legal teams are risk averse, especially when there is no legal obligation to offer safe harbor. Multiple companies have indicated they will not change their policies and practices until the government compels them to do so.
Wouldn’t allowing for more safety testing come with safety risks?
Safety testing does not entail additional safety risks. In the absence of widespread safety testing, these flaws will still be found by foreign adversaries, but we would not know that these flaws existed in the first place. Security through obscurity has long been disproven. Furthermore, safe harbors only protect research that is conducted according to strict rules regarding what constitutes good-faith research.
What federal agencies have relevant authorities here?
The National Institute of Standards and Technology (NIST), the Federal Trade Commission (FTC), and the National Science Foundation (NSF) are among the most important federal entities in this area. Under President Biden’s AI executive order, NIST is responsible for drafting guidance on red teaming among other issues, which could include protections for independent researchers. FTC has jurisdiction over competition and consumer protection issues related to generative AI, both of which relate to researcher access. NSF has launched the National AI Research Resource Pilot, which can help scale up researcher access as AI companies provide compute credits via the pilot.
How does this intersect with the Copyright Office’s triennial Section 1201 DMCA proceeding?
The authors of this memorandum, as well as the academic paper underlying it, submitted a comment to the Copyright Office in support of an exemption to DMCA for AI safety and trustworthiness research. The Computer Crime and Intellectual Property Section of the U.S. Department of Justice’s Criminal Division and Senator Mark Warner have also endorsed such an exemption. However, a DMCA exemption regarding research on AI bias, trustworthiness, and safety alone would not be sufficient to assuage the concerns of AI researchers, as they may still face liability under other statutes such as the Computer Fraud and Abuse Act.

Are researchers really limited by what AI companies are doing? I see lots of academic research on these topics.
Much of this research is currently conducted by research labs with direct connections to the AI companies they are assessing. Researchers who are less well connected, of whom there are thousands, may be unwilling to take the legal or personal risk of violating companies’ Terms of Service. See our academic paper on this topic for further details on this and other questions.

How might language from the FY2024 NDAA be adapted to create bug bounties for AI safety?
See draft legislative language below, building on Sec. 1542 of the FY2024 NDAA:

SEC. X. EXPANSION OF ARTIFICIAL INTELLIGENCE BUG BOUNTY PROGRAMS.

(a) Update to Program for Foundational Artificial Intelligence Products Being Integrated Within Department of Defense.—

(1) Development required.—Not later than 180 days after the date of the enactment of this Act and subject to the availability of appropriations, the Chief Digital and Artificial Intelligence Officer of the Department of Defense shall expand its bug bounty program for foundational artificial intelligence models being integrated into the missions and operations of the Department of Defense to include unsafe model behaviors in addition to security vulnerabilities.

(2) Collaboration.—In expanding the program under paragraph (1), the Chief Digital and Artificial Intelligence Officer may collaborate with the heads of other Federal departments and agencies with expertise in cybersecurity and artificial intelligence.

(3) Implementation authorized.—The Chief Digital and Artificial Intelligence Officer may carry out the program under subsection (a).

(4) Contracts.—The Secretary of Defense shall ensure, as may be appropriate, that whenever the Secretary enters into any contract, such contract allows for participation in the bug bounty program under paragraph (1).

(5) Rule of construction.—Nothing in this subsection shall be construed to require—

(A) the use of any foundational artificial intelligence model; or

(B) the implementation of the program developed under paragraph (1) for the purpose of the integration of a foundational artificial intelligence model into the missions or operations of the Department of Defense.

Update COPPA 2.0 to Strengthen Children’s Online Voice Privacy in the AI Era

Emerging technologies like artificial intelligence (AI) are changing the way humans interact with machines. As AI has made huge progress over the last decade, the processing of modalities such as text, voice, image, and video data has shifted to data-driven large AI models. These models were primarily aimed at enabling machines to comprehend various data and perform tasks without human intervention. Now, with the emergence of generative AI like ChatGPT, these models are capable of generating data such as text, voice, images, or video. Policymakers across the globe are struggling to draft legislation that governs the ethical use of data and regulates the creation of safe, secure, and trustworthy AI models. 

Data privacy is a major concern with the advent of AI technology. Actions by the U.S. Congress, such as the proposed American Privacy Rights Act, aim to enforce strict data privacy rights. With emerging AI applications for children, protecting children’s privacy and safekeeping their personal information is also a legislative challenge. 

Congress must act to protect children’s voice privacy before it’s too late. Companies that store children’s voice recordings and use them for profit-driven applications (or advertising) without parental consent pose serious privacy threats to children and families. The proposed revisions to the Children’s Online Privacy Protection Act (COPPA) aim to restrict companies’ capacity to profit from children’s data and transfer the responsibility of compliance from parents to companies. However, several measures in the proposed legislation need more clarity and additional guidelines.

Challenge and Opportunity 

The human voice is one of the most popular modalities for AI technology. Advancements in voice AI, such as the voice assistants in smartphones (Siri, Google, Bixby, Alexa, etc.), have made many day-to-day activities easier; however, there are also emerging threats from voice AI and a lack of regulations governing voice data and voice AI applications. One example is AI voice impersonation scams. Using the latest voice AI technology, a high-quality personalized voice recording can be generated from as little as 15 seconds of the speaker’s recorded voice. A technology rat race among Big Tech has begun, as companies try to achieve this with voice recordings less than a few seconds long. Scammers have increasingly been using this technology for their benefit. OpenAI, the creator of ChatGPT, recently developed a product called Voice Engine—but refrained from commercializing it, acknowledging that this technology poses “serious risks,” especially in an election year. 

A voice recording contains very personal information about a speaker and can be used to identify a target speaker from recordings of multiple speakers. Emerging research in voice AI suggests that voice recordings could support medical and health-related applications and reveal attributes such as age, height, and much more. When using cloud-based applications, privacy concerns also arise during voice data transfer and from storage leaks caused by noncompliance with data collection and storage requirements. Therefore, the threats from misuse of voice data and voice AI technology are enormous.

Social media services, educational technology, online games, and smart toys are just a few services for children that have started adopting voice technology (e.g., Alexa for Kids). Any service operator (or company) collecting and using children’s personal information, including their voice, is bound by the Children’s Online Privacy Protection Act (COPPA). The Federal Trade Commission (FTC) is the enforcing federal agency for COPPA. However, several companies have recently violated COPPA by collecting personal information from children without parental consent and used it for advertising and maximizing their platform profits. “Amazon’s history of misleading parents, keeping children’s recordings indefinitely, and flouting parents’ deletion requests violated COPPA and sacrificed privacy for profits,” said Samuel Levine of the FTC’s Bureau of Consumer Protection. The FTC alleges that Amazon maintained records of children’s data, disregarding parents’ deletion requests, and trained its Voice AI algorithms on that data.

Children’s spoken characteristics are different from those of adults; thus, developing voice AI technology for children is more challenging. Most commercial voice-AI-enabled services work smoothly for adults, but their accuracy in understanding children’s voices is often limited. Another challenge is the relatively sparse availability of children’s voice data to train AI models. Therefore, Big Tech is looking to acquire as much children’s voice data as possible to train AI voice models. This challenge is prevalent not only in industry but also in academic research on the subject, due to very limited data availability and children’s varying spoken-language skills. However, misuse of acquired data, especially without consent, is not a solution, and operators must be penalized for such actions. 

Considering the recent violations of COPPA by operators, and with the goal of strengthening compliance, safeguarding personal information such as voice, and preventing its misuse, Congress is updating COPPA with new legislation. The proposed COPPA updates extend and update the definitions of “operator,” “personal information” (including voice prints), “consent,” and “website/service/application” (including devices connected to the internet), as well as guidelines for the “collection, use, disclosure, and deletion of personal information.” These updates are especially critical when the personal information of users (or consumers) can serve as valuable data for operators’ profit-driven applications and be misused without any federal regulation. The FTC acknowledges that the current version of COPPA is insufficient; these updates would also enable the FTC to take strict action against noncompliant operators. 

Plan of Action 

The Children and Teens’ Online Privacy Protection Act (COPPA 2.0) has been proposed in both the Senate and House to update COPPA for the modern internet age, with a renewed focus on limiting misuse of children’s personal data (including voice recordings). This proposed legislation has gained momentum and bipartisan support. However, the text in this legislation could still be updated to ensure consumer privacy and support future innovation.

Recommendation 1. Clarify the exclusion clause for audio files. 

An exclusion clause has been added to this legislation specifically for audio files containing a child’s voice, declaring that a collected audio file is not considered personal information if it meets certain criteria. This was added to adopt a more expansive audio file exception, particularly to allow operators to provide certain features to their users (or consumers).  

While having only the text “only uses the voice within the audio file solely as a replacement for written words” might be overly restrictive for voice-based applications, the text “to perform a task” might open the use of audio files for any task that could benefit operators. The task should relate only to performing a request or providing a service to the user, and that needs to be clarified in the text. Potential misuse of this text could be (1) to train AI models for tasks that might help operators provide a service to the user, especially for personalization, or (2) to extract and store “audio features” (most voice AI models are trained on audio features rather than the raw audio itself). Operators might argue that extracting audio features is necessary as part of the algorithm that assists in providing a service to the user. Therefore, the phrasing “to perform a task” in this exclusion might be open-ended and should be modified as suggested: 

Current text: “(iii) only uses the voice within the audio file solely as a replacement for written words, to perform a task, or engage with a website, online service, online application, or mobile application, such as to perform a search or fulfill a verbal instruction or request; and”

Suggested text: “(iii) only uses the voice within the audio file solely as a replacement for written words, to only perform a task to engage with a website, online service, online application, or mobile application, such as to perform a search or fulfill a verbal instruction or request; and” 

On a similar note, legislators should consider adding the term “audio features.” Audio features are enough to train voice AI models and develop any voice-related application, even if the original audio file is deleted. Therefore, the deletion argument in the exclusion clause should be modified as suggested: 

Current text: “(iv) only maintains the audio file long enough to complete the stated purpose and then immediately deletes the audio file and does not make any other use of the audio file prior to deletion.”

Suggested text: “(iv) only maintains the audio file long enough to complete the stated purpose and then immediately deletes the audio file and any extracted audio-based features and does not make any other use of the audio file (or extracted audio-based features) prior to deletion.”

Adding more clarity to the exclusion will help avoid misuse of children’s voices for any task that companies might still find beneficial and will also ensure that operators delete all forms of the audio that could be used to train AI models. 
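To see why the exclusion must cover extracted features and not just the file itself, consider the minimal sketch below: once standard spectral features are computed, deleting the original recording does nothing to prevent model training. It assumes the open-source librosa library; the file path and parameters are placeholders.

    # Extracted audio features remain usable for training voice AI models
    # even after the source recording is deleted. Assumes librosa is
    # installed; "child_recording.wav" is a placeholder path.
    import os
    import librosa

    y, sr = librosa.load("child_recording.wav", sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # spectral features

    os.remove("child_recording.wav")  # the audio file is gone...
    print(mfcc.shape)  # ...but the features survive and can still train a model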

Recommendation 2. Add guidelines on the deidentification of audio files to enhance innovation. 

A deidentified audio file is one that cannot be used to identify the speaker whose voice is recorded in that file. The legislative text of COPPA 2.0 does not mention or provide any guidelines on how to deidentify an audio file. Such guidelines would not only protect the privacy of users but also allow operators to use deidentified audio files to add features and improve their products. The guidelines could include steps to be followed by operators as well as additional commitments from operators. 

The steps include: 

The commitments include: 

Following these guidelines might be expensive for operators; however, it is crucial to take as many precautions as possible. The deidentification steps operators currently apply to audio files are not sufficient, and there have been numerous instances in which anonymized data was reidentified, according to a statement released by a group of State Attorneys General. These proposed guidelines could allow operators to deidentify audio files and use them for product development, allowing innovation in voice AI technology for children to flourish. 
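As a purely illustrative example of why naive approaches fall short, the snippet below applies a simple pitch shift, one common first step in voice deidentification. Transformations like this are often re-identifiable, which is exactly why the guidelines should pair technical steps with binding operator commitments. It assumes the librosa and soundfile libraries; paths and parameters are placeholders.

    # Naive, illustrative deidentification step: shift pitch so the speaker
    # is harder to recognize. Simple transforms like this can often be
    # re-identified, per the State Attorneys General statement cited above.
    import librosa
    import soundfile as sf

    y, sr = librosa.load("original.wav", sr=16000)              # placeholder input
    shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=4)  # +4 semitones
    sf.write("deidentified.wav", shifted, sr)                   # placeholder output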

Recommendation 3. Add AI-generated avatars in the definition of personal information.

With the emerging applications of generative AI and growing virtual reality use for education (in classrooms) and for leisure (in online games), “AI-based avatar generation from a child’s image, audio, or video” should be added to the legislative definition of “personal information.” Virtual reality is a growing space, and digital representations of the human user (an avatar) are increasingly used to allow the user to see and interact with virtual reality environments and other users. 

Conclusion 

As new applications of AI emerge, operators must ensure compliance in the collection and use of consumers’ personal information and safety in the design of products that use that data, especially when dealing with vulnerable populations like children. Since the original passage of COPPA in 1998, how consumers use online services for day-to-day activities, including educational technology and entertainment for children, has changed dramatically. This ever-changing scope and reach of online services require strong legislative action to bring online privacy standards into the 21st century. Without a doubt, COPPA 2.0 will lead this regulatory drive, not only protecting children’s personal information from misuse by online services and operators but also ensuring that the burden of compliance rests on operators rather than parents. These recommendations will strengthen the protections of COPPA 2.0 even further while leaving open avenues for innovation in voice AI technology for children.

This idea is part of our AI Legislation Policy Sprint. To see all of the policy ideas spanning innovation, education, healthcare, and trust, safety, and privacy, head to our sprint landing page.