Supporting Data Center Development by Reducing Energy System Impact
In the last decade, American data center energy use has tripled. By 2028, the Department of Energy predicts it will either double or triple again. To meet growing tech industry energy demands without imposing a staggering toll on individual energy consumers, and to best position the United States to benefit from the advancements of artificial intelligence (AI), Congress should invest in innovative approaches to powering data centers. Namely, Congress should create a pathway for data centers to be viably integrated into Thermal Energy Networks (TENs) in order to curb costs, increase efficiency, and support grid resilience and reliability for all customers.
Congress should invest in American energy security and maximize benefits from data center use by:
- Authorizing a new competitive TEN pilot grant program that ties awards to performance metrics such as reducing the cost of installing underground infrastructure,
- Including requirements for data centers related to Power Usage Effectiveness (PUE) in the National Defense Authorization Act for Fiscal Year 2026, and
- Updating the 2018 Commercial Buildings Energy Consumption Survey (CBECS) Data Center Pilot to increase data center participation.
These actions will position the federal government to deploy innovative approaches to energy infrastructure while unlocking technological advancement and economic growth from AI.
Challenge and Opportunity
By 2028, American data center energy demands are expected to account for up to 12% of the country’s electricity consumption, up from 4.4% in 2023. The development of artificial intelligence (AI) technologies is driving this increase because AI workloads consume far more compute, and therefore energy, than other technologies. As a result of their significant energy demand, data centers face two hurdles to development: (1) interconnection delays due to infrastructure development requirements and (2) the resulting costs borne by consumers in those markets, which heighten resident resistance to siting data centers nearby.
Interconnection timelines across the country are lengthy. In 2023, the typical period from interconnection request to commercial operation was five years for power plant projects. In states like Virginia, widely known as the “Data Center Capital of the World,” waits can stretch to seven years for data centers specifically. These interconnection timelines have grown over time and are expected to continue growing based on queue lengths.
Interconnection is also costly. The primary cost drivers are upgrade requirements to the broader transmission system. Unlike upgrades for energy generators, which the generators themselves typically pay for, the costs of interconnecting large new energy consumers such as data centers are spread across everyone around them: experts warn that by socializing the costs of new data center infrastructure, utilities are passing these costs on to ratepayers.
Efforts are underway to minimize data center energy costs while improving operational efficiency. One way to do that is to reclaim the energy that data centers consume by repurposing waste heat through thermal energy networks (TENs). TENs are shared networks of pipes that move heat between locations; they may incorporate any number of heat sources, including data centers. Data centers can not only generate heat for these systems, but also benefit from cooling—a major source of current data center energy consumption—provided by integrated systems.
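To make the scale of the heat-recovery opportunity concrete, the back-of-envelope sketch below uses hypothetical figures and assumes, as a rough approximation, that essentially all of a data center’s electrical draw is ultimately rejected as heat. It is illustrative only, not a design calculation.

```python
# Back-of-envelope estimate of data center heat available to a thermal energy
# network (TEN). All numbers are hypothetical; the key assumption is that
# nearly all electrical input to a data center is ultimately rejected as heat.

it_load_mw = 50.0          # hypothetical IT equipment load (MW)
pue = 1.3                  # hypothetical power usage effectiveness
recovery_fraction = 0.7    # hypothetical share of rejected heat a TEN can capture

total_facility_load_mw = it_load_mw * pue      # total electrical draw
heat_rejected_mw = total_facility_load_mw      # assume ~all power ends up as heat
recoverable_heat_mw = heat_rejected_mw * recovery_fraction

# Express annual recoverable heat in megawatt-hours for comparison with
# district heating demand.
annual_recoverable_mwh = recoverable_heat_mw * 8760

print(f"Total facility load: {total_facility_load_mw:.0f} MW")
print(f"Recoverable heat for the TEN: {recoverable_heat_mw:.0f} MW "
      f"(~{annual_recoverable_mwh:,.0f} MWh/year)")
```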
Like other energy infrastructure projects, TENs require significant upfront financial investment to reap long-term rewards. However, they can potentially offset some of those upfront costs by shortening interconnection timelines based on demonstrated lower energy demand and reduced grid load. Avoiding larger traditional grid infrastructure upgrades would also avert the skyrocketing consumer costs described above.
At a community or utility level, TENs also offer other benefits. They improve grid resiliency and reliability: the network loops that compose a TEN increase redundancy, reducing the likelihood that a single point of failure will cascade into systemic failure, especially in light of the increasing energy demands brought about by weather events such as extreme heat. Further, TENs allow utilities to decrease and shift electrical demand, offering a way to balance peak loads. As TENs become more prevalent, especially in rural areas, they offer building tradespeople such as pipefitters “plentiful and high-paying jobs.” They also provide employment paths for utility and natural gas company employees with expertise in underground infrastructure. By creating jobs, reducing water stress and grid strain, and decreasing the risk of quickly rising utility costs, investing in TENs to bolster data center development would counter the current trend of community resistance to development. Many of these benefits extend to non-data center TEN participants, like nearby homes and businesses, as well.
Federal coordination is essential to accelerating the creation of TENs in data-center-heavy areas. Some states, like New York and Colorado, have passed legislation to promote TEN development. However, the states with the densest data center markets, many of which also rank poorly on grid reliability, are not all making efforts to develop TENs. Because the U.S. grid spans multiple regions and its interstate transmission is regulated by the Federal Energy Regulatory Commission, the federal government is uniquely well positioned to invest in grid resiliency improvements through TENs and to make the U.S. a world leader in this technology.
Plan of Action
The Trump Administration and Congress can promote data center development while improving grid resiliency and reliability and reducing consumers’ financial burden through a three-part strategy:
Recommendation 1. Create a new competitive grant program to help states launch TEN pilots.
Congress should create a new TEN pilot competitive grant program administered by the Department of Energy. The federal TEN program should allow states to apply for funding to run their own TEN programs administered by states’ energy offices and organizations. This program could build on two strong precedents:
- The Department of Energy’s 2022 funding opportunity for Community Geothermal Heating and Cooling Design and Deployment. This opportunity supported geothermal heating and cooling networks, which are a type of TEN that relies on the earth’s constant temperature and heat pumps to heat or cool buildings. Though this program generated significant interest, an opportunity remains for the federal government to invest in non-geothermal TEN projects. These would be projects that rely on exchanging heat with other sources, such as bodies of water, waste systems, or even energy-intensive buildings like data centers. The economic advantages are promising: one funded project reported expecting “savings of as much as 70% on utility bills” for beneficiaries of the proposed design.
- New York State’s Large-Scale Thermal program, run by the New York State Energy Research and Development Authority (NYSERDA), has offered multiple funding opportunities that specifically include the development of TENs. In 2021, NYSERDA launched the Community Heat Pump Systems program (PON 4614), which has since funded multiple projects that include data centers. One project reported that its design would save roughly $2.4 million, or 77%, in annual operating costs.
Congress should authorize a new pilot program with $30 million to be distributed to state TEN programs, which states could disburse via grants and performance-based contracts. Such a program would support the Trump administration’s goal of fast-tracking AI data center development.
To ensure that the funding benefits both grant recipients and their host communities, these grants should carry requirements that incentivize consumer benefits such as reduced electricity or heating bills, improved air quality, and decreased pollution. Grant awards should be prioritized according to performance metrics such as projected reductions in the cost of drilling and installing underground infrastructure and gains in operational efficiency.
Recommendation 2. Include power usage effectiveness in the amendments to the National Defense Authorization Act for Fiscal Year 2026 (2026 NDAA).
In the National Defense Authorization Act for Fiscal Year 2024, Sec. 5302 (“Federal Data Center Consolidation Initiative amendments”) amended Section 834 of the Carl Levin and Howard P. “Buck” McKeon National Defense Authorization Act for Fiscal Year 2015 by specifying minimum requirements for new data centers. Sec. 5302(b)(2)(b)(2)(A)(ii) currently reads:
[…The minimum requirements established under paragraph (1) shall include requirements relating to—…] “the use of new data centers, including costs related to the facility, energy consumption, and related infrastructure;.”
To couple data center development with improved grid resilience and stability, the 2026 NDAA should amend Sec. 5302(b)(2)(b)(2)(A)(ii) as follows:
[…The minimum requirements established under paragraph (1) shall include requirements relating to—…] “the use of new data centers, including power usage effectiveness, costs related to the facility, energy consumption, and related infrastructure.”
Power usage effectiveness (PUE) is a common metric for measuring the efficiency of data center power use. It is the ratio of the total power used by the facility to the amount of that power dedicated to IT services. The PUE metric has limitations, such as its inability to provide an apples-to-apples comparison of data center energy efficiency given variability in underlying technology, and its lack of precision, especially with the growth of AI data centers. However, introducing PUE into the regulatory framework for data centers would give new builds a specific target, making it easier for both developers and policymakers to measure progress. Requirements related to PUE would also encourage developers to invest in technologies that increase energy efficiency without unduly hurting their bottom lines. In the future, legislators should continue to amend this section of the NDAA as newer, more accurate, and more useful efficiency metrics develop.
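As an illustration of the metric described above, here is a minimal sketch of the PUE calculation with hypothetical readings; a value closer to 1.0 means less non-IT overhead.

```python
def power_usage_effectiveness(total_facility_power_kw: float,
                              it_equipment_power_kw: float) -> float:
    """PUE = total facility power / IT equipment power (dimensionless, >= 1.0)."""
    if it_equipment_power_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_power_kw / it_equipment_power_kw

# Hypothetical readings: 12 MW total draw, 10 MW of it delivered to IT equipment.
pue = power_usage_effectiveness(total_facility_power_kw=12_000,
                                it_equipment_power_kw=10_000)
print(f"PUE = {pue:.2f}")  # 1.20: cooling and other overhead add 20% on top of IT load
```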
Recommendation 3. The U.S. Energy Information Administration (EIA) should update the 2018 Commercial Buildings Energy Consumption Survey (CBECS) Data Center Pilot.
To facilitate community acceptance and realize benefits like better financing terms based on lower default risk, data center developers should seek to benchmark their facilities’ energy consumption. Energy consumption benchmarking, the process of analyzing consumption data and comparing it to both past performance and the performance of similar facilities, yields operational cost savings. These savings amplify the economic benefits of vehicles like TENs for cost-sensitive developers and lower the potential increase in community utility costs.
Data center developers should create industry-standard benchmarking tools, much as other industries have. However, it is challenging for them to embark on this work without accurate and current information that supports useful models and targets, especially in such a fast-changing field. The data sources used to create benchmarks for other industries are largely unavailable for data centers. One popular source, the CBECS, does not include data centers as a separate building type. This issue is longstanding; in 2018, the EIA released a report detailing the results of the data center pilot it undertook to address this gap. The pilot cited three main hurdles to accurately accounting for data centers’ energy consumption: the lack of a comprehensive frame or list of data centers, low cooperation rates, and a high rate of nonresponse to important survey questions.
With the proliferation of data centers since the pilot, it has become only more pressing to differentiate this building type so that data centers are accurately represented and industry benchmarks can be developed. To address the frame issue, CBECS should use a commercial data source like Data Center Map. At the time, the EIA considered this source “unvalidated,” but it has been used as a data source by the U.S. Department of Commerce and the International Energy Agency. The EIA should also perform the “cognitive research and pretests” recommended in the pilot to find ways to encourage complete responses, then rerun the pilot and seek an improved outcome.
Conclusion
Data center energy demand has exploded in recent years and continues to climb, due in part to the advent of widespread AI development. Data centers need access to reliable energy without creating grid instability or dramatically increasing utility costs for individual consumers. This creates a unique opportunity for the federal government to develop and implement innovative technology such as TENs in areas working to support changing energy demands. The government should also seize this moment to define and update standards for site developers to ensure they are building cost-effective and operationally efficient facilities. By advancing systems and tools that benefit other area energy consumers, down to the individual ratepayer, the federal government can transform data centers from infrastructural burdens into good neighbors.
This budget was calculated by taking the allocation for the NYSERDA Large-Scale Thermal pilot program ($10 million) and multiplying it by three (for a three-year pilot). Because NYSERDA’s program funded projects at over 50 sites, this initial pilot would aim to fund roughly 150 projects across the states.
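As a cross-check, a minimal sketch of that sizing arithmetic, using only the figures cited in this footnote:

```python
# Budget sizing for the proposed federal TEN pilot, based on the NYSERDA
# Large-Scale Thermal precedent cited above.
nyserda_allocation = 10_000_000   # NYSERDA pilot allocation (USD)
pilot_years = 3                   # proposed federal pilot duration

proposed_authorization = nyserda_allocation * pilot_years   # $30M

nyserda_sites_funded = 50         # "over 50 sites" under the NYSERDA program
projects_per_dollar = nyserda_sites_funded / nyserda_allocation
expected_projects = proposed_authorization * projects_per_dollar  # ~150

print(f"Proposed authorization: ${proposed_authorization:,.0f}")
print(f"Expected projects at comparable funding intensity: ~{expected_projects:.0f}")
```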
Performance-based contracts differ from other types of contracts in that they focus on what work is to be performed rather than how specifically it is accomplished. Solicitations include either a Performance Work Statement or Statement of Objectives and resulting contracts include measurable performance standards and potentially performance incentives.
Rebuild Corporate Research for a Stronger American Future
The American research enterprise, long the global leader, faces intensifying competition and mounting criticism regarding its productivity and relevance to societal challenges. At the same time, a vital component of a healthy research enterprise has been lost: corporate research labs, epitomized by the iconic Bell Labs of the 20th century. Such labs uniquely excelled at reverse translational research, where real-world utility and problem-rich environments served as powerful inspirations for fundamental learning and discovery. Rebuilding such labs in a 21st century “Bell Labs X” form would restore a powerful and uniquely American approach to technoscientific discovery—harnessing the private sector to discover and invent in ways that fundamentally improve U.S. national and economic competitiveness. Moreover, new metaresearch insights into “how to innovate how we innovate” provide principles that can guide their rebuilding. The White House Office of Science and Technology Policy (OSTP) can help turn these insights into reality by convening a working group of stakeholders (philanthropy, business, and science agency leaders), alongside policy and metascience scholars, to make practical recommendations for implementation.
Challenge and Opportunity
The American research enterprise faces intensifying competition and mounting criticism regarding its productivity and relevance to societal challenges. While a number of reasons have been proposed, among the most important is that corporate research labs, a vital piece of a healthy research enterprise, are missing. Exemplified by Bell Labs, these labs dominated the research enterprise of the first half of the 20th century but became defunct in the second half. The reason: the formalization of profit as the prime goal of corporations, which is incompatible with research, particularly the basic research that produces public-goods science and technology. Instead, academic research is now dominant. The reason: the rise of federal agencies like the National Science Foundation (NSF) with a near-total focus on academia. This dynamic, however, is not fundamental: federal agencies could just as easily fund research at corporations, not just in academia.
Moreover, there is a compelling reason to do so. Utility and learning are cyclical and build on each other. In one direction, learning serves as a starting point for utility. Academia excels at such translational research. In the other direction, utility serves as a starting point for learning. Corporations in principle excel at such reverse translational research. Corporations are where utility lives and breathes and where real-world problem-rich environments and inspiration for learning thrive. This reverse translational half of the utility-learning cycle, however, is currently nearly absent; it is a critical void that corporate research could fill.
For example, at Bell Labs circa WWII, Claude Shannon’s exposure to real-world problems in cryptography and noisy communications inspired his surprising idea to treat information as a quantifiable and manipulable entity independent of its physical medium, revolutionizing information science and technology. Similarly, Mervyn Kelly’s exposure to the real-world benefit of compact and reliable solid-state amplifiers inspired him to create a research activity at Bell Labs that invented the transistor and discovered the transistor effect. These advances, inspired by real-world utility, laid the foundations for our modern information age.
Importantly, these advances were given freely to the nation because Bell Labs’ host corporation, the AT&T of the 20th century, was a monopoly and could be altruistic with its research. Now, in the 21st century, corporations, even when they have dominant market power, are subject to intense competitive pressure on their bottom lines, which makes it difficult for them to engage in research that is given freely to the nation. But to throw away corporate research along with the monopolies that could afford it is to throw the baby out with the bathwater. Instead, the challenge is to rebuild corporate research in a 21st century “Bell Labs X” form without relying on monopolies, using public-private partnerships instead.
Moreover, new insights into the nature and nurture of research provide principles that can guide the creation of such public-private partnerships for the purpose of public-goods research.
- Inspire, but Don’t Constrain, Research by Particular Use. Reverse-translational research should start with real-world challenges but not be constrained by them as it seeks the greatest advances in learning—advances that surprise and contradict prevailing wisdom. This principle combines Donald Stokes’ “use-inspired research,” Ken Stanley and Joel Lehman’s “why greatness cannot be planned,” and Gold Standard Science’s informed contrariness and dissent.
- Fund and Execute Research at the Institution, not Individual Researcher, Level. This would be very different from the dominant mode of research funding in the U.S.: matrix-funding to principal investigators (PIs) in academia. Here, instead, research funding would be to research institutes that employ researchers rather than contract with researchers employed by other institutions. Leadership would be empowered to nurture and orchestrate the people, culture, and organizational structure of the institute for the singular purpose of empowering researchers to achieve groundbreaking discoveries.
- Evolve Research Institutions by Retrospective, Competitive Reselection. There should be many research institutes and none should have guaranteed perpetual funding. Instead, they should be subject to periodic evaluation “with teeth” where research institutions only continue to receive support if they are significantly changing the way we think and/or do. This creates a dynamic market-like ecosystem within which the population of research institutes evolves in response to a competitive re-selection pressure towards ever-increasing research productivity.
Plan of Action
The White House Office of Science and Technology Policy (OSTP) should convene a working group of stakeholders, alongside policy and metaresearch scholars, to make practical recommendations for public-private partnerships that enable corporate research akin to the Bell Labs of the 20th century, but in a 21st century “Bell Labs X” form.
Among the stakeholders would be government agencies, corporations and philanthropies—perhaps along the lines of the Government-University-Industry-Philanthropy Research Roundtable (GUIPRR) of the National Academies of Sciences, Engineering and Medicine (NASEM).
Importantly, the working group does not need to start from scratch. A high-level funding and organizational model was recently articulated.
Its starting point is the initial selection of ten or so Bell Labs Xs based on their potential for major advances in public-goods science and technology. Each Bell Labs X would be hosted and cost-shared by a corporation that brings with it its problem-rich use environment and state-of-the-art technological context, but majority block-funded by a research funder (federal agencies and/or philanthropies) with broad societal benefit in mind. To establish a sense of scale, we might imagine each Bell Labs X having a $120M/year operating budget with a roughly 20% corporate cost share—$20M/year coming from the corporate host and $100M/year coming from the research funder.
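A minimal sizing sketch follows, reading the 20% cost share as relative to the research funder’s contribution (one consistent reading of the figures above); the lab count and totals are illustrative assumptions, not program specifications.

```python
# Sizing sketch for a single hypothetical "Bell Labs X", reading the 20% cost
# share as relative to the research funder's contribution (which makes the
# $20M/$100M split in the text consistent).
funder_contribution = 100_000_000      # per year, from federal agencies/philanthropy
cost_share_rate = 0.20                 # corporate host share, relative to funder amount

host_contribution = funder_contribution * cost_share_rate   # $20M/year
operating_budget = funder_contribution + host_contribution  # $120M/year

num_labs = 10                          # "ten or so" Bell Labs Xs
total_funder_outlay = funder_contribution * num_labs        # ~$1B/year across the network

print(f"Per-lab operating budget: ${operating_budget:,.0f}/year")
print(f"Total funder outlay across {num_labs} labs: ${total_funder_outlay:,.0f}/year")
```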
This plan also envisions a market-like competitive renewal structure for these corporate research labs. At the end of a period appropriate for long-term basic research (say, ten years), all ten or so Bell Labs Xs would be evaluated for their contributions to public-goods science and technology, independent of their contributions to the host corporation’s commercial applications. Only the most productive seven or eight of the ten would be renewed. Between selection and each subsequent re-selection, the leadership of each Bell Labs X would be free to nurture its people, culture, and organizational structure as it believes will maximize research productivity. Each Bell Labs X would thus be an experiment in research institution design. And each Bell Labs X would make its own bet on the knowledge domain it believes is ripe for the greatest disruptive advances. Government’s role would be largely confined to retrospectively rewarding, or declining to renew, Bell Labs Xs based on how well their bets paid off, without itself making bets.
Conclusion
Imagine a private institution whose researchers routinely disrupted knowledge and changed the world. That’s the story of Bell Labs—a legendary research institute that gave us scientific and technological breakthroughs we now take for granted. In its heyday in the mid-20th century, Bell Labs was a crucible of innovation where brilliant minds were exposed to and inspired by real-world problems, then given the freedom to explore those problems in deep and fundamental ways, often pivoting to and solving unanticipated new problems of even greater importance.
Recreating that innovative environment is possible, and its impact on American research productivity would be profound. By innovating how we innovate, we would leapfrog other nations that are investing heavily in their own research productivity but largely copying the structure of the current U.S. research enterprise. The resulting network of Bell Labs Xs would flip the relationship between corporations and the nation’s public-goods science and technology: asking not what the nation’s public-goods science and technology can do for corporations, but what corporations can do for the nation’s public-goods science and technology. Disruptive and useful ideas are not getting harder to find; our current research enterprise is just not well optimized to find them.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Bounty Hunters for Science
Fraud in scientific research is more common than we’d like to think. Fraudulent research can mislead entire scientific fields for years, driving futile and wasteful follow-up studies and slowing down real scientific discoveries. To truly push the boundaries of knowledge, researchers should be able to base their theories and decisions on a more trustworthy scientific record.
Currently there are insufficient incentives to identify fraud and correct the record. Meanwhile, fraudsters can continue to operate with little chance of being caught. That should change: Scientific funders should establish one or more bounty programs aimed at rewarding people who identify significant problems with federally-funded research, and should particularly reward fraud whistleblowers whose careers are on the line.
Challenge and Opportunity
In 2023 it was revealed that 20 papers from Hoau-Yan Wang, an influential Alzheimer’s researcher, were marred by doctored images and other scientific misconduct. Shockingly, his research led to the development of a drug that was tested on 2,000 patients. A colleague described the situation as “embarrassing beyond words”.
There is a common belief that science is self-correcting. But what’s interesting about this case is that the scientist who uncovered Wang’s fraud was not driven by the usual academic incentives. He was being paid by Wall Street short sellers who were betting against the drug company!
This was not an isolated incident. The most notorious example of Alzheimer’s research misconduct – doctored images in Sylvain Lesné’s papers – was also discovered with the help of short sellers. And as reported in Science, Lesné’s “paper has been cited in about 2,300 scholarly articles—more than all but four other Alzheimer’s basic research reports published since 2006, according to the Web of Science database. Since then, annual NIH support for studies labeled ‘amyloid, oligomer, and Alzheimer’s’ has risen from near zero to $287 million in 2021.” While not all of that research was motivated by Lesné’s paper, it is hard to believe that a paper with that many citations had no effect on the direction of the field.
These cases show how a critical part of the scientific ecosystem – the exposure of faked research – can be undersupplied by ordinary science. Unmasking fraud is a difficult and awkward task, and few people want to do it. But financial incentives can help close those gaps.
Plan of Action
People who witness scientific fraud often stay silent due to perceived pressure from their colleagues and institutions. Whistleblowing is an undersupplied part of the scientific ecosystem.
We can correct these incentives by borrowing an idea from the Securities and Exchange Commission, whose bounty program around financial fraud pays whistleblowers 10-30% of the fines imposed by the government. The program has been a huge success, catching dozens of fraudsters and reducing the stigma around whistleblowing. The Department of Justice has recently copied the model for other types of fraud, such as healthcare fraud. The model should be extended to scientific fraud.
- Funder: Any U.S. government funding agency, such as NIH or NSF
- Eligibility: Research employees with insider knowledge from having worked in a particular lab
- Cost: The program should ultimately pay for itself, both through the recoupment of grant expenditures and through the impacts on future funding, including, potentially, the trajectory of entire academic fields.
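As a rough illustration of how an SEC-style award band might translate to a research-fraud case, the sketch below applies the 10-30% range cited above to a hypothetical recovery of grant funds; the dollar figure is invented for illustration.

```python
# Illustrative SEC-style whistleblower award applied to a hypothetical
# research-fraud recovery. The 10-30% band mirrors the SEC program cited above;
# the recovery amount is made up for illustration.
recovered_grant_funds = 5_000_000   # hypothetical grant dollars clawed back
award_band = (0.10, 0.30)           # SEC-style payout range

min_award = recovered_grant_funds * award_band[0]
max_award = recovered_grant_funds * award_band[1]

print(f"Whistleblower award range: ${min_award:,.0f} to ${max_award:,.0f}")
```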
The amount of the bounty should vary with the scientific field and the nature of the whistleblower in question. For example, compare the following two situations:
- An undergraduate whistleblower who identifies a problem in a psychology or education study that hardly anyone had cited, let alone implemented in the real world
- A graduate student or postdoc who calls out their own mentor for academic fraud related to influential papers on Alzheimer’s disease or cancer.
The stakes are higher in the latter case. Few graduate students or post-docs will ever be willing to make the intense personal sacrifice of whistleblowing on their own mentor and adviser, potentially forgoing approval of their dissertation or future recommendation letters for jobs. If we want such people to be empowered to come forward despite the personal stakes, we need to make it worth their while.
Suppose that one of Lesné’s students in 2006 had been rewarded with a significant bounty for direct testimony about the image manipulation and fraud that was occurring. That reward might have saved tens of millions in future NIH spending, and would have been more than worth it. In actuality, as we know, none of Lesné’s students or postdocs ever had the courage to come forward in the face of such immense personal risk.
The Office of Research Integrity at the Department of Health and Human Services should be funded to create a bounty program for all HHS-funded research at NIH, CDC, FDA, or elsewhere. ORI’s budget is currently around $15 million per year. That should be increased by at least $1 million to account for a significant number of bounties plus at least one full-time employee to administer the program.
Conclusion
Some critics might say that science works best when it’s driven by people who are passionate about truth for truth’s sake, not for the money. But by this point it’s clear that like anyone else, scientists can be driven by incentives that are not always aligned with the truth. Where those incentives fall short, bounty programs can help.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Confirming Hope: Validating Surrogate Endpoints to Support FDA Drug Approval Using an Inter-Agency Approach
To enable more timely access to new drugs and biologics, clinical trials are increasingly using surrogate markers in lieu of traditional clinical outcomes that directly measure how patients feel, function, or survive. Surrogate markers, such as imaging findings or laboratory measurements, are expected to predict clinical outcomes of interest. In comparison to clinical outcomes, surrogate markers offer an advantage in reducing the duration, size, and total cost of trials. Surrogate endpoints are considered to be “validated” if they have undergone extensive testing that confirms their ability to predict a clinical outcome. However, reviews of “validated” surrogate markers used as primary endpoints in trials supporting U.S. Food and Drug Administration (FDA) approvals suggest that many lack sufficient evidence of being associated with a clinical outcome.
Since 2018, FDA has regularly updated the publicly available “Table of Surrogate Endpoints That Were the Basis of Drug Approval or Licensure”, which includes over 200 surrogate markers that have been or would be accepted by the agency to support approval of a drug or biologic. Not included within the table is information regarding the strength of evidence for each surrogate marker and its association with a clinical outcome. As surrogate markers are increasingly being accepted by FDA to support approval of new drugs and biologics, it is imperative that patients and clinicians understand whether such novel endpoints are reflective of meaningful clinical benefits. Thus, FDA, in collaboration with other agencies, should take steps to increase transparency regarding the strength of evidence for surrogate endpoints used to support product approvals, routinely reassess the evidence behind such endpoints to continue justifying their use in regulatory decision-making, and sunset those that fail to show association with meaningful clinical outcomes. Such transparency would not only benefit the public, clinicians, and the payers responsible for coverage decisions, but also help shape the innovation landscape for drug developers to design clinical trials that assess endpoints truly reflective of clinical efficacy.
Challenge and Opportunity
To receive regulatory approval by FDA, new therapeutics are generally required to be supported by “substantial evidence of effectiveness” from two or more “adequate and well-controlled” pivotal trials. However, FDA has maintained a flexible interpretation of this guidance to enable timely access to new treatments. New drugs and biologics can be approved for specific disease indications based on pivotal trials measuring clinical outcomes (how patients feel, function, or survive). They can also be approved based on pivotal trials measuring surrogate markers that are meant to be proxy measures and expected to predict clinical outcomes. Examples of such endpoints include changes in tumor size as seen on imaging or blood laboratory tests such as cholesterol.
Surrogate markers are considered “validated” when sufficient evidence demonstrates that the endpoint reliably predicts clinical benefit. Such validated surrogate markers are typically the basis of traditional FDA approvals. However, FDA has also accepted “unvalidated” surrogate endpoints that are reasonably likely to predict clinical benefit as the basis for approving new therapeutics, particularly those intended to treat or prevent a serious or life-threatening disease. Under expedited review pathways such as accelerated approval, which grant drug manufacturers faster FDA market authorization based on unvalidated surrogate markers, manufacturers are required to complete an additional clinical trial after approval to confirm the predicted clinical benefit. Should the manufacturer fail to do so, FDA has the authority to withdraw approval of that drug’s particular indication.
For drug developers, the use of surrogate markers in clinical trials can shorten the duration, size, and total cost of the pivotal trial. Over time, FDA has increasingly allowed for surrogate markers to be used as primary endpoints in pivotal trials, allowing for shorter clinical trial testing periods and thus faster market access. Moreover, use of unvalidated surrogate markers has grown outside of expedited review pathways such as accelerated approval. One analysis of FDA approved drugs and biologics that received “breakthrough therapy designation” found that among those that received traditional approval, over half were based on pivotal trials using surrogate markers.
While basing FDA approval on surrogate markers can enable more timely market access to novel therapeutics, such endpoints also involve certain trade-offs, including the risk of making erroneous inferences and diminishing certainty about the medical product’s long-term clinical effect. In oncology, evidence suggests that most validation studies of surrogate markers find low correlations with meaningful clinical outcomes such as overall survival or a patient’s quality of life. For instance, in a review of 15 surrogate validation studies conducted by the FDA for oncologic drugs, only one was found to demonstrate a strong correlation between surrogate markers and overall survival. Another study suggested that there are weak or missing correlations between surrogate markers for solid tumors and overall survival. A more recent evaluation found that most surrogate markers used as primary endpoints in clinical trials to support FDA approval of drugs treating non-oncologic chronic disease lack high-strength evidence of associations with clinical outcomes.
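To illustrate what such validation studies measure, the sketch below uses synthetic data (not from any real trial) to compute a trial-level correlation between treatment effects on a surrogate marker and treatment effects on overall survival, the kind of association the cited reviews often found to be weak.

```python
# Trial-level surrogate validation sketch: correlate each trial's treatment
# effect on a surrogate marker with its treatment effect on the clinical
# outcome (e.g., overall survival). Data are synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_trials = 15  # hypothetical number of randomized trials in the meta-analysis

# Synthetic per-trial treatment effects: the survival effect is only weakly
# related to the surrogate effect, mimicking a weak surrogate.
surrogate_effect = rng.normal(loc=0.15, scale=0.10, size=n_trials)
survival_effect = 0.3 * surrogate_effect + rng.normal(scale=0.08, size=n_trials)

r = np.corrcoef(surrogate_effect, survival_effect)[0, 1]
print(f"Trial-level correlation: r = {r:.2f}, r^2 = {r**2:.2f}")
# Values of r^2 near 1 support surrogacy; values near 0 argue against relying
# on the surrogate marker for approval decisions.
```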
Section 3011 of the 21st Century Cures Act of 2016 amended the Federal Food, Drug, and Cosmetic Act to mandate that FDA publish a list of “surrogate endpoints which were the basis of approval or licensure (as applicable) of a drug or biological product” under both accelerated and traditional approval pathways. While FDA has posted surrogate endpoint tables for adult and pediatric disease indications that fulfill this legislative requirement, the tables lack any justification for surrogate selection, including evidence supporting validation. Without this information, patients, prescribers, and payers are left uncertain about the actual clinical benefit of therapeutics approved by FDA on the basis of surrogate markers. Meanwhile, drug developers have continued to use the table as a guide in designing their clinical trials, viewing the included surrogate markers as “accepted” by FDA regardless of the evidence (or lack thereof) undergirding them.
Plan of Action
Recommendation 1. FDA should make more transparent the strength of evidence of surrogate markers included within the “Adult Surrogate Endpoint Table” as well as the “Pediatric Surrogate Endpoint Table.”
Previously, agency officials stated that the use of surrogate markers to support traditional approvals was usually based, at a minimum, on evidence from meta-analyses of clinical trials demonstrating an association between surrogate markers and clinical outcomes. More recently, however, FDA officials have indicated that they consider a “range of sources, including mechanistic evidence that the [surrogate marker] is on the causal pathway of disease, nonclinical models, epidemiologic data, and clinical trial data, including data from the FDA’s own analyses of patient- and trial-level data to determine the quantitative association between the effect of treatment on the [surrogate marker] and the clinical outcomes.” Nevertheless, what specific evidence the agency considered, and how it weighed that evidence, is not included in the published tables of surrogate endpoints, leaving drug developers, as well as patients, clinicians, and payers, uncertain about the strength of the evidence behind such endpoints. This is an opportunity for the agency to enhance its transparency and communication with the public.
FDA should issue a guidance document detailing their current thinking about how surrogate markers should be validated and evaluated on an ongoing basis. Within the guidance, the agency could detail the types of evidence that would be considered to establish surrogacy.
FDA should also include within the tables of surrogate endpoints a summary of the evidence for each surrogate marker listed. This would provide justification (through citations to relevant articles or internal analyses) so that all stakeholders understand the evidence establishing surrogacy. Moreover, FDA should clearly indicate within the tables which clinical outcomes each listed surrogate marker is thought to predict.
FDA should also publicly report on an annual basis a list of therapeutics approved by the agency based on clinical trials using surrogate markers as primary endpoints. This coupled with the additional information around strength of evidence for each surrogate marker would allow patients and clinicians to make more informed decisions around treatments where there may be uncertainty of the therapeutic’s clinical benefit at the time of FDA approval.
Recently, FDA’s Oncology Center of Excellence, through Project Confirm, has made additional efforts to communicate the status of required postmarketing studies meant to confirm the clinical benefit of drugs that received accelerated approval for oncologic disease indications. FDA could expand this effort across therapeutic areas and approval pathways by publishing a list of ongoing postmarketing studies intended to confirm clinical benefit for therapeutics whose approval was based on surrogate markers.
FDA should also regularly convene advisory committees to allow independent experts to review and vote on recommendations around the use of new surrogate markers for disease indications. Additionally, FDA should regularly convene these advisory committees to re-evaluate the use of existing surrogate markers in light of current evidence, especially those not supported by high-strength evidence of association with clinical outcomes. At a minimum, FDA should convene such advisory committees annually to re-examine the surrogate markers listed on its publicly available tables. In 2024, FDA convened the Oncologic Drugs Advisory Committee to discuss the use of the surrogate marker minimal residual disease as an endpoint for multiple myeloma. Further such meetings, including for “unvalidated” endpoints, would give FDA the opportunity to re-examine their use in regulatory decision-making.
Recommendation 2. In collaboration with the FDA, other federal research agencies should contribute evidence generation to determine whether surrogate markers are appropriate for use in regulatory decision-making, including approval of new therapeutic products and indications for use.
Drug manufacturers that receive FDA approval for products based on unvalidated surrogate markers may have little incentive to conduct studies that could demonstrate a lack of association between those surrogate markers and clinical outcomes. To address this, the Department of Health and Human Services (HHS) should establish an interagency working group including FDA, the National Institutes of Health (NIH), the Patient-Centered Outcomes Research Institute (PCORI), the Advanced Research Projects Agency for Health (ARPA-H), the Centers for Medicare and Medicaid Services (CMS), and other agencies engaged in biomedical and health services research. These agencies could collaboratively conduct or commission meta-analyses of existing clinical trials to determine whether there is sufficient evidence to establish surrogacy. Such publicly funded studies would then be brought to FDA advisory committees for members to consider in making recommendations about the validity of various surrogate endpoints or whether endpoints lacking sufficient evidence should be sunset. NIH in particular should prioritize funding large-scale trials aimed at validating important surrogate endpoints.
Through regular collaboration and convening, FDA can help guide the direction of resources towards investigating surrogate markers of key regulatory as well as patient, clinician, and payer interest to strengthen the science behind novel therapeutics. Such information would also be invaluable to drug developers in identifying evidence-based endpoints as part of their clinical trial design, thus contributing to a more efficient research and development landscape.
Recommendation 3. Congress should build upon the provisions related to surrogate markers that passed as part of the 21st Century Cures Act of 2016 in their “Cures 2.0” efforts.
The aforementioned interagency working group convened by HHS could be explicitly authorized through legislation, coupled with funding dedicated to surrogate marker validation studies. Congress should also mandate that FDA and other federal health agencies re-evaluate listed surrogate endpoints on an annual basis, with additional reporting requirements. Through legislation, FDA could also be granted explicit authority to sunset endpoints for which there is no clear evidence of surrogacy, preventing future drug candidates from establishing efficacy based on flawed endpoints. Congress should also require routine reporting from FDA on the status of the interagency working group focused on surrogate endpoints, as well as other items, including a list of new therapeutic approvals based on surrogate markers, expansion of the existing surrogate marker tables on FDA’s website to include the evidence of surrogacy, and issuance of a guidance document detailing what scientific evidence the agency would consider in validating and re-evaluating surrogate markers.
Conclusion
FDA has increasingly allowed new drugs and biologics to be approved based on surrogate markers that are meant to predict meaningful clinical outcomes showing that patients feel better, function better, and survive longer. Although the agency has made clearer which surrogate endpoints could be or are being used to support approval, significant gaps remain in the evidence demonstrating that these novel endpoints are associated with meaningful clinical outcomes. Continued use of surrogate endpoints with little association with clinical benefit leaves patients, clinicians, and the payers responsible for coverage decisions without assurance that novel therapeutics approved by FDA are meaningfully effective. Transparency about the evidence supporting such endpoints is urgently needed to mitigate this uncertainty around new drug approvals, including for drug developers as they design clinical trials for therapeutic candidates seeking FDA approval. FDA should, in collaboration with other federal biomedical research agencies, routinely re-evaluate surrogate endpoints to determine whether their continued use in therapeutic innovation is justified. Such regular re-evaluation will strengthen FDA’s credibility and ensure the accountability of an agency tasked with ensuring the safety and efficacy of drugs and other medical products as well as with shaping the innovation landscape.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Yes. In 2016, eteplirsen (Exondys 51) was granted accelerated approval for the treatment of Duchenne muscular dystrophy (DMD) against the recommendation of an advisory committee and FDA’s own scientific staff. Concerns were raised that the approval was based on a small clinical trial showing that eteplirsen led to a small increase in the protein dystrophin, a surrogate marker. Three additional approvals for similar DMD drugs have since been granted based on the same surrogate endpoint. However, no studies have been completed confirming clinical benefit.
In 2021, aducanumab (Aduhelm) was granted accelerated approval for the treatment of Alzheimer’s disease against the recommendation of an advisory committee and FDA’s scientific staff. Concerns were raised that the approval was based on a surrogate marker, beta-amyloid levels, which has not been found to correlate with cognitive or functional changes in Alzheimer’s disease patients. In particular, FDA’s internal statistical review team found no association between changes in the surrogate marker and the clinical outcomes reported in pivotal trials.
Industry may claim that such re-evaluation and the potential removal of unvalidated surrogate endpoints would slow the pace of innovation and thus patient access to novel therapeutics. However, it is more likely that this would enable more efficient drug development by providing manufacturers, particularly smaller companies, with surrogate endpoints that not only decrease the duration and cost of clinical trials but also have strong evidence of association with meaningful clinical outcomes. It may also reduce the need for postmarketing requirements meant to confirm clinical benefit if adequate validation is conducted through FDA and other federal agencies.
No. Having FDA, in collaboration with other federal health agencies, validate surrogate endpoints would not halt the use of unvalidated surrogate endpoints that are reasonably likely to predict clinical benefit. Expedited regulatory pathways such as accelerated approval, which are codified in law and allow manufacturers to use unvalidated surrogate markers as endpoints in pivotal clinical trials, will remain available. Instead, this creates a process of re-evaluation so that unvalidated surrogate endpoints are not left unvalidated forever, but are examined in a timely manner to inform their continued use in supporting FDA approval. Ultimately, patients and clinicians want drugs that meaningfully treat or prevent a disease or condition. Routine re-evaluation and validation of surrogate endpoints would provide assurance that therapeutics approved on the basis of these novel endpoints are clinically effective.
FDA’s function as a regulator is to evaluate the evidence brought before it by industry sponsors. To do so effectively, the evidence must exist. This is often not the case for new surrogate markers, since sponsors may have little commercial incentive to generate it, particularly if, after approval, a surrogate endpoint could be found not to be associated with a meaningful clinical outcome. Thus, multiple federal biomedical research agencies, including NIH and ARPA-H, can play an instrumental role alongside FDA in conducting or funding studies that test whether there is a clear association between a surrogate marker and a clinical outcome. Already, several institutes within the NIH are engaged in biomarker development and in supporting validation. Collaboration among NIH institutes with relevant expertise, other agencies engaged in translational research, and FDA will enable the validation of surrogate markers to inform regulatory decision-making for novel therapeutics.
Under the Prescription Drug User Fee Act VII, passed in 2022, FDA was authorized to establish the Rare Disease Endpoint Advancement (RDEA) pilot program. The program is intended to foster the development of novel endpoints for rare diseases through FDA collaboration with industry sponsors, including engagement with sponsors proposing novel endpoints for a drug candidate, opportunities for stakeholders (including the public) to inform such endpoint development, and greater FDA staff capacity to help develop novel endpoints for rare diseases. Such a pilot program could be further expanded not only to develop novel endpoints but also to develop approaches for validating novel endpoints such as surrogate markers and for communicating the strength of evidence to the public.
Payers such as Medicare have also taken steps to enable postmarket evidence generation, including for drugs approved by FDA based on surrogate endpoints. Following the accelerated approval of aducanumab (Aduhelm), the Centers for Medicare and Medicaid Services (CMS) issued a national coverage determination under the coverage with evidence development (CED) program, conditioning coverage for this class of drugs, which FDA approved based on a surrogate endpoint, on participation in CMS-approved studies, with access available only through randomized controlled trials assessing meaningful clinical outcomes. Further evaluation of surrogate endpoints informing FDA approval can benefit payers as they make coverage decisions. Additionally, coverage and reimbursement could be tied to the evidence for such surrogate endpoints, providing additional incentive to complete and communicate the findings from such studies.
A Cross-Health and Human Services Initiative to Cut Wasteful Spending and Improve Patient Lives
Challenge and Opportunity
Many common medical practices do not have strong evidence behind them. In 2019, a group of prominent medical researchers—including Robert Califf, the former Food and Drug Administration (FDA) Commissioner—undertook the tedious task of looking into the level of evidence behind 2,930 recommendations in guidelines issued by the American Heart Association and the American College of Cardiology. They asked one simple question: how many recommendations were supported by multiple small randomized trials or at least one large trial? The answer: 8.5%. The rest were supported by only one small trial, by observational evidence, or just by “expert opinion only.”
For infectious diseases, a team of researchers looked at 1,042 recommendations in guidelines issued by the Infectious Diseases Society of America. They found that only 9.3% were supported by strong evidence. For 57% of the recommendations, the quality of evidence was “low” or “very low.” And to make matters worse, more than half of the recommendations considered low in quality of evidence were still issued as “strong” recommendations.
In oncology, a review of 1,023 recommendations from the National Comprehensive Cancer Network found that “…only 6% of the recommendations … are based on high-level evidence”, suggesting “a huge opportunity for research to fill the knowledge gap and further improve the scientific validity of the guidelines.”
Even worse, there are many cases where not only is a common medical treatment lacking the evidence to support it, but also one or more randomized trials have shown that the treatment is useless or even harmful! One of the most notorious examples is that of the anti-arrhythmic drugs given to millions of cardiac patients in the 1980s. Cardiologists at the time had the perfectly logical belief that since arrhythmia (irregular heartbeat) leads to heart attacks and death, drugs that prevented arrhythmia would obviously prevent heart attacks and death. In 1987, the National Institutes of Health (NIH) funded the Cardiac Arrhythmia Suppression Trial (CAST) to test three such drugs. One of the drugs had to be pulled after just a few weeks, because 17 patients had already died compared with only three in the placebo group. The other two drugs similarly turned out to be harmful, although it took several months to see that patients given those drugs were more than two times as likely to die. According to one JAMA article, “…there are estimates that 20,000 to 75,000 lives were lost each year in the 1980s in the United States alone…” due to these drugs. The CAST trial is a poignant reminder that doctors can be convinced they are doing the best for their patients, but they can be completely wrong if there is not strong evidence from randomized trials.
In 2016, randomized trials of back fusion surgery found that it does not work. But a recent analysis by the Lown Institute found that the Centers for Medicare & Medicaid Services (CMS) spent approximately $2 billion in the past 3 years on more than 200,000 of these surgeries.
There are hundreds of additional examples where medical practice was ultimately proven wrong. Given how few medical practices, even now, are actually supported by strong evidence, there are likely many more examples of treatments that either do not work or actively cause harm. This is not only wasted spending, but also puts patients at risk.
We can do better – both for patients and for the federal budget – if we reduce the use of medical practices that simply do not work.
Plan of Action
The Secretary of Health and Human Services should create a cross-division committee to develop an extensive and prioritized list of medical practices, products, and treatments that need evidence of effectiveness, and then roll out an ambitious agenda to run randomized clinical trials for the highest-impact medical issues.
That is, CMS needs to work with NIH, FDA, and the Centers for Disease Control and Prevention (CDC) to develop a prioritized list of medical treatments, procedures, drugs, and devices with little evidence behind them, for which annual spending is large and the potential health impacts are most harmful. Simultaneously, FDA needs to work with its partner agencies to identify drugs, vaccines, and devices in widespread medical use that need rigorous post-market evaluation. This includes drugs with off-label uses, oncology regimens that have never been tested against each other, surrogate outcomes that have not been validated against long-term outcomes, accelerated approvals without the required follow-up studies, and more.
With priority lists available, NIH could immediately launch trials to evaluate the effectiveness of the identified treatments and practices. The department should report to Congress annually on the number and nature of clinical trials in progress and, eventually, the results of those trials (which should also be made available on a public dashboard, along with any resulting savings). The project should be ongoing for the indefinite future, and over time HHS should explore ways to use artificial intelligence tools to identify the key unstudied medical questions that deserve a high-value clinical trial.
Expected opponents of any such effort include pharmaceutical, biotechnology, and device companies and their affiliated trade associations, whose products might come under further scrutiny, as well as professional medical associations firmly convinced that their practices should not be questioned. Their lobbying power may be considerable, but the intellectual case for rigorous and unbiased studies is unquestionable, particularly when billions of federal dollars and millions of patients’ lives and health are at stake.
Conclusion
Far too many medical practices and treatments have not been subjected to rigorous randomized trials, and the divisions of Health and Human Services should come together to fix this problem. Doing so will likely lead to billions of dollars in savings and huge improvements to patient health.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Measuring Research Bureaucracy to Boost Scientific Efficiency and Innovation
Bureaucracy has become a critical barrier to scientific progress in America. An excess of management and administration pulls researchers away from their core scientific work and consumes resources that could advance discovery. While we lack systematic measures of this inefficiency, the available data are troubling: researchers spend nearly half their time on administrative tasks, and nearly one in five dollars of university research budgets goes to regulatory compliance.
The proposed solution is a three-step effort to measure and roll back the bureaucratic burden. First, we need to create a detailed baseline by measuring administrative personnel, management layers, and associated time/costs across government funding agencies and universities receiving grant funding. Second, we need to develop and apply objective criteria to identify specific bureaucratic inefficiencies and potential improvements, based on direct feedback from researchers and administrators nationwide. Third, we need to quantify the benefits of reducing bureaucratic overhead and implement shared strategies to streamline processes, simplify regulations, and ultimately enhance research productivity.
Through this ambitious yet practical initiative, the administration could free up over a million research days annually and redirect billions of dollars toward scientific pursuits that strengthen America’s innovation capacity.
Challenge and Opportunity
Federally funded university scientists spend much of their time navigating procedures and management layers. Scientists, administrators, and policymakers widely agree that bureaucratic burden hampers research productivity and innovation, yet as the National Academy of Sciences noted in 2016 there is “little rigorous analysis or supporting data precisely quantifying the total burden and cost to investigators and research institutions of complying with federal regulations specific to the conduct of federally funded research.” This continues to be the case, despite evidence suggesting that federally funded faculty spend nearly half of their research time on administrative tasks, and nearly one in every five dollars spent on university research goes to regulatory compliance.
Judging by the steady rise in research administration requirements facing universities, the problem is getting worse. Federal rules and policies affecting research have multiplied nearly ninefold in two decades, from 29 in 2004 to 255 in 2024, with half of the increase coming in just the last five years. It is no coincidence that bureaucratic overhead is also expanding in funding agencies. At the National Institutes of Health (NIH), for instance, the growth of managers and administrators has significantly outpaced scientific roles and research funding activity (see figure).

The question is: just how much of universities’ $100 billion-plus annual research spend (more than half of it funded by the federal government) is hobbled by excess management and administration? To answer this, we must understand:
- Which bureaucratic activities are wasteful, or have a poor return on time and effort?
- How much time do bureaucratic activities take up, and what is the cost overall?
- Which activities are not required by the law or regulations, but are imposed by overly risk-averse legal counsel, compliance, and other administrators at agencies or universities?
- Which activities, rules, and processes should be eliminated or reimagined, and how?
- What portion of the overhead budget isn’t spent on research administration or management?
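Before those questions can be answered precisely, the figures already cited suggest the order of magnitude at stake. Below is a back-of-envelope sketch, not an estimate of record: the spending and time-share inputs echo the numbers above, while the researcher headcount and workdays per year are hypothetical placeholders.

```python
# Back-of-envelope sizing of the research bureaucracy footprint.
# The spending and time-share figures echo the estimates cited above;
# the researcher headcount and workdays per year are hypothetical.

annual_research_spend = 100e9        # dollars; universities' $100B+ annual research spend
compliance_share_of_spend = 0.20     # ~1 in 5 research dollars goes to compliance
admin_share_of_time = 0.45           # ~"nearly half" of funded researchers' time
researchers = 250_000                # hypothetical count of federally funded researchers
workdays_per_year = 230              # hypothetical working days per researcher

compliance_dollars = annual_research_spend * compliance_share_of_spend
admin_days = researchers * workdays_per_year * admin_share_of_time

print(f"Compliance spending: ~${compliance_dollars / 1e9:.0f}B per year")
print(f"Researcher days spent on administration: ~{admin_days / 1e6:.0f}M per year")

# Even a modest 10% reduction in that burden would free millions of
# research days annually, consistent with this memo's framing.
print(f"Days freed by a 10% reduction: ~{0.10 * admin_days / 1e6:.1f}M per year")
```

Under these assumptions, even trimming a small fraction of the administrative load frees research time on the scale this memo describes; the baseline measurement effort below would replace the hypothetical inputs with real counts.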
Plan of Action
The current administration aims to make government-funded research more efficient and productive. Recently, the director of the Office of Science and Technology Policy (OSTP) vowed to “reduce administrative burdens on federally funded researchers, not bog them down in bureaucratic box checking.” To that end, I propose a systematic effort that measures bureaucratic excess, quantifies the payoff from eliminating specific aspects of this burden, and improves accountability for results.
The president should issue an Executive Order directing the Office of Management and Budget (OMB) and Office of Science and Technology Policy (OSTP) to develop a Bureaucratic Burden report within 180 days of signing. The report should detail specific steps agencies will take to reduce administrative requirements. Agencies must participate at the leadership level, launching a government-wide push to reduce bureaucracy. Furthermore, all research agencies should work together to develop a standardized method for calculating burden within both agencies and funded institutions, create a common set of policies that will streamline research processes, and establish clear limits on overhead spending to ensure full transparency in research budgets.
OMB and OSTP should create a cross-agency Research Efficiency Task Force within the National Science and Technology Council to conduct this work. This team would develop a shared approach and lead the data gathering, analysis, and synthesis using consistent measures across agencies. The Task Force’s first step would be to establish a bureaucratic baseline, including a detailed view of the managerial and administrative footprint within federal research agencies and universities that receive research funding, broken down into core components. The measurement approach would certainly vary between government agencies and funding recipients.
Key agencies, including the NIH, National Science Foundation, NASA, the Department of Defense, and the Department of Energy, should:
- Count personnel at each level—managers, administrators, and intramural scientists—along with their compensation;
- Document management layers from executives to frontline staff, along with supervisor-to-staff ratios; and
- Calculate time spent on administrative work by all staff, including researchers, to estimate total compliance costs and overhead.
Task Force agencies should also hire one or more independent contractors to analyze the administrative burden at a representative sample of universities. Through surveys and interviews, the contractors should measure staffing, management structures, researcher time allocation, and overhead costs to size up the bureaucratic footprint across the scientific establishment.
Next, the Task Force should launch an online consultation with researchers and administrators nationwide. Participants could identify wasteful administrative tasks, quantify their time impact, and share examples of efficient practices. In parallel, agency leaders should submit to OMB and OSTP their formal assessment of which bureaucratic requirements can be eliminated, along with projected benefits.
Finally, the Task Force should produce a comprehensive estimate of the total cost of unnecessary bureaucracy and propose specific reforms. Its recommendations will identify potential savings from streamlining agency practices, statutory requirements, and oversight mechanisms. The Task Force should also examine how much overhead funding supports non-research activities, propose ways to redirect these resources to scientific research, and establish metrics and a public dashboard to track progress.
Some of this information may have already been gathered as part of ongoing reorganization efforts, which would expedite the assessment.
Within six months, the group should issue a public report that would include:
- A detailed estimate of the unnecessary costs of research bureaucracy.
- The cost gains from rolling back or adjusting specific burdens.
- A synthesis of harder-to-quantify benefits from these moves, such as faster approval cycles, better decision-making, and less conservatism in research proposals.
- A catalog of innovative research management practices, with a four-year timeline for studying and scaling them.
- A proposed approach for regular tracking and reporting on bureaucratic burden in science.
- A prioritized list of changes that each agency should make, including a clear timeline for making those changes and the estimated cost savings.
These activities would serve as the start of a series of broad reforms by the White House and research funding agencies to improve federal funding policies and practices.
Conclusion
This initiative will build an irrefutable case for reform, provide a roadmap for meaningful improvement, and create real accountability for results. Giving researchers and administrators a voice in reimagining the system they navigate daily will generate better insights and build commitment for change. The potential upside is enormous: millions of research days could be freed from paperwork for lab work, strengthening America’s capacity to innovate and lead the world. With committed leadership, this administration could transform how the US funds and conducts research, delivering maximum scientific return on every federal dollar invested.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Improving Research Transparency and Efficiency through Mandatory Publication of Study Results
Scientists are incentivized to produce positive results that journals want to publish, improving the chances of receiving more funding and the likelihood of being hired or promoted. This hypercompetitive system encourages questionable research practices and limits disclosure of all research results. As a result, the results of many funded research studies never see the light of day, and having no written record of failed research leads to systemic waste, as others go down the same wrong path. The Office of Science and Technology Policy (OSTP) should mandate that all grants lead to at least one of two outputs: 1) publication in a journal that accepts null results (e.g., Public Library of Science (PLOS) One, PeerJ, and F1000Research), or 2) public disclosure of the hypothesis, methodology, and results to the funding agency. Linking grants to results creates a more complete picture of what has been tried in any given field of research, improving transparency and reducing duplication of effort.
Challenge and Opportunity
There is ample evidence that null results are rarely published. Mandated publication would ensure all federal grants have outputs, whether hypotheses were supported or not, reducing repetition of ideas in future grant applications. More transparent scientific literature would expedite new breakthroughs and reduce wasted effort, money, and time across all scientific fields. Mandating that all recipients of federal research grants publish results would create transparency about what exactly is being done with public dollars and what the results of all studies were. It would also enable learning about which hypotheses/research programs are succeeding and which are not, as well as the clinical and pre-clinical study designs that are producing positive versus null findings.
Better knowledge of research results could be applied to myriad funding and research contexts. For example, an application for a grant could state that, in a previous grant, an experiment was not conducted because previous experiments did not support it, or alternatively, the experiment was conducted but it produced a null result. In both scenarios, the outcome should be reported, either in a publication in PubMed or as a disclosure to federal science funding agencies. In another context, an experiment might be funded across multiple labs, but only the labs that obtain positive results end up publishing. Mandatory publication would enable an understanding of how robust the result is across different laboratory contexts and nuances in study design, and also why the result was positive in some contexts and null in others.
Pressure to produce novel and statistically significant results often leads to questionable research practices, such as not reporting null results (a form of publication bias), p-hacking (a statistical practice where researchers manipulate analytical or experimental procedures to find significant results that support their hypothesis, even if the results are not meaningful), hypothesizing after results are known (HARKing), outcome switching (changes to outcome measures), and many others. The replication and reproducibility crisis in science presents a major challenge for the scientific community—questionable results undermine public trust in science and create tremendous waste as the scientific community slowly course-corrects for results that ultimately prove unreliable. Studies have shown that a substantial portion of published research findings cannot be replicated, raising concerns about the validity of the scientific evidence base.
In preclinical research, one survey of 454 animal researchers estimated that 50% of animal experiments are not published, and that one of the most important causes of non-publication was a lack of statistical significance (“negative” findings). The prevalence of these issues in preclinical research undoubtedly plays a role in poor translation to the clinic as well as duplicative efforts. In clinical trials, a recent study found that 19.2% of cancer phase 3 randomized controlled trials (RCTs) had primary end point changes (i.e., outcome switching), and 70.3% of these did not report the changes in their resulting manuscripts. These changes had a statistically significant relationship with trial positivity, indicating that they may have been carried out to present positive results. Other work examining RCTs more broadly found one-third with clear inconsistencies between registered and published primary outcomes. Beyond outcome switching, many trials include “false” data. Among 526 trials submitted to the journal Anaesthesia from February 2017 to March 2020, 73 (14%) had false data, including “the duplication of figures, tables and other data from published work; the duplication of data in the rows and columns of spreadsheets; impossible values; and incorrect calculations.”
Mandatory publication for all grants would help change the incentives that drive the behavior in these examples by fundamentally altering the research and publication processes. At the conclusion of a study that obtained null results, this scientific knowledge would be publicly available to scientists, the public, and funders. All grant funding would have outputs. Scientists could not then repeatedly apply for grants based on failed previous experiments, and they would be less likely to receive funding for research projects that have already been tried, and failed, by others. The cumulative, self-correcting nature of science cannot be fully realized without transparency around what worked and what did not work.
Adopting mandatory publication of results from federally funded grants would also position the U.S. as a global leader in research integrity, matching international initiatives such as the UK Reproducibility Network and European Open Science Cloud, which promote similar reforms. By embracing mandatory publication, the U.S. will enhance its own research enterprise and set a standard for other nations to follow.
Plan of Action
Recommendation 1. The White House should issue a directive to federal research funding agencies that mandates public disclosure of research results from all federal grants, including null results, unless doing so would reveal intellectual property or trade secrets. To ensure lasting reform of America's research enterprise, Congress could pass a law requiring such disclosures.
Recommendation 2. The National Science and Technology Council (NSTC) should develop guidelines for agencies to implement mandatory reporting. Successful implementation requires that researchers be well informed and equipped to navigate this process. NSTC should coordinate with agencies to establish common guidelines that reduce confusion and create a uniform policy. In addition, agencies should create and disseminate detailed guidance documents that outline best practices for reporting null results, including step-by-step instructions on how to prepare and submit null studies to journals (which have differing guidelines) or to federal databases.
Conclusion
Most published research is not replicated because the research system incentivizes the publication of novel, positive results. A tremendous amount of research goes unpublished because of null results, representing an enormous waste of effort, money, and time, as well as compromised progress and transparency in our scientific institutions. OSTP should mandate the publication of null results through existing agency authority and funding, and Congress should consider legislation to ensure the policy's longevity.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
It is well understood that most scientific findings cannot be taken at face value until they are replicated or reproduced. To make science more trustworthy, transparent, and replicable, we must change the incentives that currently reward publishing only positive results. Publication of null results will accelerate the advancement of science.
Scientific discovery is often unplanned and serendipitous, but it is abundantly clear that we can reduce the amount of waste it currently generates. By mandating outputs for all grants, we build a cumulative record of research in which the results of all studies are known, allowing us to see why experiments might be valid in one context but not another and to assess the robustness of findings across different experimental contexts and labs.
While many agencies prioritize hypothesis-driven research, even exploratory research will produce an output, and these outputs should be publicly available, either as an article or by public disclosure.
Studies that produce null results can still easily share data and code, to be evaluated post-publication by the community to see if code can be refactored, refined, and improved.
The “Cadillac” version of mandatory publication would be the registered reports model, where a study has its methodology peer reviewed before data are collected (Stage 1 Review). Authors are given in-principle acceptance, whereby, as long as the scientist follows the agreed-upon methodology, their study is guaranteed publication regardless of the results. When a study is completed, it is peer reviewed again (Stage 2 Review) simply to confirm the agreed-upon methodology was followed. In the absence of this registered reports model, we should at least mandate transparent publication via journals that publish null results, or via public federal disclosure.
Maintaining American Leadership through Early-Stage Research in Methane Removal
Methane is a potent greenhouse gas with increasingly alarming effects on the climate, human health, agriculture, and the economy. Rapidly rising concentrations of atmospheric methane have contributed about a third of the global warming we’re experiencing today. Methane emissions also contribute to the formation of ground-level ozone, which causes an estimated 1 million premature deaths around the world annually and poses a significant threat to staple crops like wheat, soybeans, and rice. Overall, methane emissions cost the United States billions of dollars each year.
Most methane mitigation efforts to date have rightly focused on reducing methane emissions. However, the increasingly urgent impacts of methane create an increasingly urgent need to also explore options for methane removal. Methane removal is a new field exploring how methane, once in the atmosphere, could be broken down faster than with existing natural systems alone to help lower peak temperatures, and counteract some of the impact of increasing natural methane emissions. This field is currently in the “earliest stages of knowledge discovery”, meaning that there is a tremendous opportunity for the United States to establish its position as the unrivaled world leader in an emerging critical technology – a top goal of the second Trump Administration. Global interest in methane means that there is a largely untapped market for innovative methane-removal solutions. And investment in this field will also generate spillover knowledge discovery for associated fields, including atmospheric, materials, and biological sciences.
Congress and the Administration must move quickly to capitalize on this opportunity. Following the recommendations of the National Academies of Sciences, Engineering, and Medicine (NASEM)’s October 2024 report, the federal government should incorporate early-stage methane removal research into its energy and earth systems research programs. This can be achieved through a relatively small investment of $50–80 million annually, over an initial 3–5 year phase. This first phase would focus on building foundational knowledge that lays the groundwork for potential future movement into more targeted, tangible applications.
Challenge and Opportunity
Methane represents an important stability, security, and scientific frontier for the United States. We know that this gas is increasing the risk of severe weather, worsening air quality, harming American health, and reducing crop yields. Yet too much about methane remains poorly understood, including the cause(s) of its recent accelerating rise. A deeper understanding of methane could help scientists better address these impacts – including potentially through methane removal.
Methane removal is an early-stage research field primed for new American-led breakthroughs and discoveries. To date, four potential methane-removal technologies and one enabling technology have been identified. They are:
- Ecosystem uptake enhancement: Increasing microbial consumption of methane in soils and trees, or enabling plants themselves to take up methane.
- Surface treatments: Applying special coatings that “eat” methane on panels, rooftops, or other surfaces.
- Atmospheric oxidation enhancement: Increasing atmospheric reactions conducive to methane breakdown.
- Methane reactors: Breaking down methane in closed reactors using catalysts, reactive gases, or microbes.
- Methane concentrators: A potentially enabling technology that would separate or enrich methane from other atmospheric components.

Figure 1. Atmospheric Methane Removal Technologies. (Source: National Academies Research Agenda)
Many of these proposed technologies are analogous to existing carbon dioxide removal methods and other interventions. However, much more research is needed to determine the net climate benefit, cost plausibility, and social acceptability of all proposed methane removal approaches. The United States has positioned itself to lead on assessing and developing these technologies, such as through NASEM’s 2024 report and language in the final FY24 appropriations package directing the Department of Energy to produce its own assessment of the field. The United States has also shown leadership through its civil society, which has funded some of the earliest targeted research on methane removal.
But we risk ceding our leadership position – and a valuable opportunity to reap the benefits of being a first-mover on an emergent technology – without continued investment and momentum. Indeed, investing in methane removal research could help to improve our understanding of atmospheric chemistry and thus unlock novel discoveries in air quality improvement and new breakthrough materials for pollution management. Investing in methane removal, in short, would simultaneously improve environmental quality, unlock opportunities for entrepreneurship, and maintain America’s leadership in basic science and innovation. New research would also help the United States avoid technological surprises from competitors and other foreign governments, who could otherwise outpace the United States in understanding new systems and approaches, leaving the country unprepared to assess and respond to deployment of methane removal elsewhere.
Plan of Action
The federal government should launch a five-year Methane Removal Initiative pursuant to the recommendations of the National Academies. A new five-year research initiative will allow the United States to evaluate and potentially develop important new tools and technologies to mitigate security risks arising from the dangerous accumulation of methane in the atmosphere while also helping to maintain U.S. global leadership in innovation. A well-coordinated, broad, cross-cutting federal government effort that fosters collaborations among agencies, research universities, national laboratories, industry, and philanthropy will enable the United States to lead science and technology improvements to meet these goals. To develop any new technologies on timescales most relevant for managing earth system risk, this foundational research should begin this year at $50–80 million per year. The research phase should ideally last five years and inform the more applied second-phase assessment recommended by the National Academies.
Consistent with the recommendations from the National Academies’ Atmospheric Methane Removal Research Agenda and early philanthropic seed funding for methane removal research, the Methane Removal Initiative would:
- Establish a national methane removal research and development program involving key science agencies, primarily the National Science Foundation, Department of Energy, and National Oceanic and Atmospheric Administration, with contributions from other agencies including the US Department of Agriculture, National Institute of Standards and Technology, National Aeronautics and Space Administration, Department of Interior, and Environmental Protection Agency.
- Focus early investments in foundational research to advance U.S. interests and close knowledge gaps, specifically in the following areas:
- The “sinks” and sources of methane, including both ground-level and atmospheric sinks as well as human-driven and natural sources (40% of research budget),
- Methane removal technologies, as described below (30% of research budget); and
- Potential applications of methane removal, such as demonstration and deployment systems and their interaction with other climate response strategies (30% of research budget).
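As a rough illustration only, here is how the 40/30/30 split above maps onto the low and high ends of the proposed $50–80 million annual budget; the breakdown is simple arithmetic, not a proposed budget table.

```python
# Illustrative mapping of the 40/30/30 split above onto the proposed
# $50-80 million annual budget. The split and range come from the text;
# the breakdown is arithmetic only, not a proposed allocation.

focus_areas = {
    "Methane sinks and sources": 0.40,
    "Methane removal technologies": 0.30,
    "Potential applications of methane removal": 0.30,
}

for annual_budget in (50e6, 80e6):
    print(f"Annual budget: ${annual_budget / 1e6:.0f}M")
    for area, share in focus_areas.items():
        print(f"  {area}: ${annual_budget * share / 1e6:.0f}M")
```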
The ultimate goal of this research program is to assess the need for, and viability of, new methods that could break down methane already in the atmosphere faster than natural processes do alone. This program would be funded through several appropriations subcommittees in Congress, most notably Energy & Water Development and Commerce, Justice, Science and Related Agencies. Agriculture, Rural Development, Food and Drug Administration, and Interior and Environment also have funding recommendations relevant to their subcommittees. As scrutiny of the federal government’s fiscal balance grows, it should be noted that the scale of proposed research funding for methane removal is relatively modest and that no federal funding has been allocated to this potentially critical area of research to date. Forgoing these investments could mean neglecting this area of innovation at a critical time when the United States has an opportunity to demonstrate leadership.
Conclusion
Emissions reductions remain the most cost-effective means of arresting the rise in atmospheric methane, and improvements in methane detection and leak mitigation will also help America increase its production efficiency by reducing losses, lowering costs, and improving global competitiveness. The National Academies confirm that methane removal will not replace mitigation on timescales relevant to limiting peak warming this century, but the world will still likely face “a substantial methane emissions gap between the trajectory of increasing methane emissions (including from anthropogenically amplified natural emissions) and technically available mitigation measures.” This creates a substantial security risk for the United States in the coming decades, especially given large uncertainties around the exact magnitude of heat-trapping emissions from natural systems. A modest annual investment of $50–80 million can pay much larger dividends in future years through innovative advanced materials, improved atmospheric models, and new pollution control methods, and by potentially enhancing security against these natural systems risks. The methane removal field is currently at a bottleneck: ideas for innovative research abound, but they remain resource-limited. The government has the opportunity to eliminate these bottlenecks and unleash prosperity and innovation, as it has done for many other fields in the past. The intensifying rise of atmospheric methane presents the United States with a new grand challenge that has a clear path for action.
Methane is a powerful greenhouse gas that plays an outsized role in near-term warming. Natural systems are an important source of this gas, and evidence indicates that these sources may be amplified in a warming world and emit even more. Even if we succeed in reducing anthropogenic emissions of methane, we “cannot ignore the possibility of accelerated methane release from natural systems, such as widespread permafrost thaw or release of methane hydrates from coastal systems in the Arctic.” Methane removal could potentially serve as a partial response to such methane-emitting natural feedback loops and tipping elements to reduce how much these systems further accelerate warming.
No. Aggressive emissions reductions—for all greenhouse gases, including methane—are the highest priority. Methane removal cannot be used in place of methane emissions reduction. It’s incredibly urgent and important that methane emissions be reduced to the greatest extent possible, and that further innovation to develop additional methane abatement approaches is accelerated. These have the important added benefit of improving American energy security and preventing waste.
More research is needed to determine the viability and safety of large-scale methane removal. The current state of knowledge indicates that several approaches may have the potential to remove >10 Mt of methane per year (~0.8 Gt CO₂ equivalent over a 20-year period), but the research is too early-stage to verify feasibility, safety, and effectiveness. Methane has certain characteristics that suggest large-scale and cost-effective removal could be possible, including favorable energy dynamics in turning it into CO₂ and the lack of a need for storage.
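The ~0.8 Gt CO₂-equivalent figure follows from applying methane's 20-year global warming potential (GWP20), roughly 80–83 in IPCC AR6; a minimal sketch, using 82.5 as an illustrative midpoint:

```python
# How 10 Mt of methane removed per year maps to ~0.8 Gt CO2-equivalent.
# GWP20 (methane's 20-year global warming potential) is roughly 80-83 per
# IPCC AR6; 82.5 is used here as an illustrative midpoint.

GWP20_METHANE = 82.5            # tonnes CO2e per tonne CH4, 20-year horizon
removal_mt_per_year = 10        # Mt CH4 per year (the >10 Mt figure above)

co2e_gt = removal_mt_per_year * GWP20_METHANE / 1000   # Mt CO2e -> Gt CO2e
print(f"~{co2e_gt:.2f} Gt CO2e per year of removal, on a 20-year horizon")
```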
The volume of methane removal “needed” will depend on our overall emissions trajectory, atmospheric methane levels as influenced by anthropogenic emissions and anthropogenically amplified natural systems feedbacks, and target global temperatures. Some evidence indicates we may have already passed warming thresholds that trigger natural system feedbacks with increasing methane emissions. Depending on the ultimate extent of warming, permafrost methane release and enhanced methane emissions from wetland systems are estimated to potentially lead to ~40-200 Mt/yr of additional methane emissions and a further rise in global average temperatures (Zhang 2023, Kleinen 2021, Walter 2018, Turetsky 2020). Methane removal may prove to be the primary strategy to address these emissions.
Methane is a potent greenhouse gas, 43 times stronger than carbon dioxide molecule for molecule, with an atmospheric lifetime of roughly a decade (IPCC, calculation from Table 7.15). Methane removal permanently removes methane from the atmosphere by oxidizing or breaking down methane into carbon dioxide, water, and other byproducts, or if biological processes are used, into new biomass. These products and byproducts will remain cycling through their respective systems, but without the more potent warming impact of methane. The carbon dioxide that remains following oxidation will still cause warming, but this is no different than what happens to the carbon in methane through natural removal processes. Methane removal approaches accelerate this process of turning the more potent greenhouse gas methane into the less potent greenhouse gas carbon dioxide, permanently removing the methane to reduce warming.
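A rough sketch of the carbon bookkeeping behind that statement, using molar masses and the same illustrative GWP20 value of ~82.5 (the exact figure depends on the accounting convention used):

```python
# Carbon bookkeeping for methane oxidation: CH4 + 2 O2 -> CO2 + 2 H2O.
# One tonne of CH4 yields 44/16 = 2.75 tonnes of CO2, but the removed methane
# carries far more 20-year warming impact than the CO2 left behind.
# The GWP20 of ~82.5 is an illustrative value (IPCC AR6 reports ~80-83).

M_CH4, M_CO2 = 16.0, 44.0       # molar masses, g/mol
GWP20_METHANE = 82.5

tonnes_ch4 = 1.0
tonnes_co2_produced = tonnes_ch4 * (M_CO2 / M_CH4)       # 2.75 t CO2
ch4_warming_as_co2e = tonnes_ch4 * GWP20_METHANE         # 82.5 t CO2e

print(f"CO2 produced per tonne of CH4 oxidized: {tonnes_co2_produced:.2f} t")
print(f"20-year warming impact of that CH4:     {ch4_warming_as_co2e:.1f} t CO2e")
print(f"Ratio: ~{ch4_warming_as_co2e / tonnes_co2_produced:.0f}x")
# The residual CO2 is the same carbon that natural oxidation of the methane
# would eventually have released; removal only accelerates the conversion.
```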
The cost of methane removal will depend on the specific approach and further innovation; specific costs are not yet known at this stage. Some approaches have easier paths to cost plausibility, while others will require significant increases in catalytic, thermal, or air-processing efficiency. More research is needed to produce credible estimates, and innovation has the potential to significantly lower costs.
Greenhouse gases are not interchangeable. Methane removal cannot be used in place of carbon dioxide removal because it cannot address historical carbon dioxide emissions, manage long-term warming or counteract other effects (e.g., ocean acidification) that are results of humanity’s carbon dioxide emissions. Some methane removal approaches have characteristics that suggest that they may be able to get to scale quickly once developed and validated, should deployment be deemed appropriate, which could augment our near-term warming mitigation capacity on top of what carbon dioxide removal and emissions reductions offer.
Methane has a short atmospheric lifetime due to substantial methane sinks. The primary methane sink is atmospheric oxidation, from hydroxyl radicals (~90% of the total sink) and chlorine radicals (0–5% of the total sink). The rest is consumed by methane-oxidizing bacteria and archaea in soils (~5%). While these sinks are understood at a high level, there is substantial uncertainty in their strength and dynamics.
Up until about 2000, the growth of methane was clearly driven by growing human-caused emissions from fossil fuels, agriculture, and waste. But starting in the mid-2000s, after a brief pause during which global emissions were balanced by sinks, the level of methane in the atmosphere started growing again. At the same time, atmospheric measurements detected an isotopic signal that the new growth in methane may be from recent biological—as opposed to older fossil—origin. Multiple hypotheses exist for the drivers, including changes in global food systems, growth of wetland emissions as a result of the changing climate, a reduction in the rate of methane breakdown, and/or the growth of fracking; the answer is almost certainly some combination of these. Learn more in Spark’s blog post.
Methane has a significant warming effect for the 9–12 years it remains in the atmosphere. Given how potent methane is, and how much is currently being emitted, methane is accumulating despite its short atmospheric lifetime, and the overall warming impact of current and recent methane emissions is roughly 0.5°C. Methane removal approaches may someday be able to bring methane-driven warming down faster than natural sinks alone. Ongoing substantial methane sources, such as natural emissions from permafrost and wetlands, risk driving further accumulation. Exploring options to remove atmospheric methane is one strategy to better manage this risk.
Research into all methane removal approaches is just beginning, and there is no known timeline for their development or guarantee that they will prove to be viable and safe.
Some methane removal and carbon dioxide removal approaches overlap. Some soil amendments may have an impact on both methane and carbon dioxide removal, and are currently being researched. Catalytic methane-oxidizing processes could be added to direct air capture (DAC) systems for carbon dioxide, but more innovation will be needed to make these systems sufficiently efficient to be feasible. If all planned DAC capacity also removed methane, it would make a meaningful difference, but still fall very short of the scale of methane removal that could be needed to address rising natural methane emissions, and additional approaches should be researched in parallel.
Methane emissions destruction refers to the oxidation of methane from higher-methane-concentration air streams from sources, for example air in dairy barns. There is technical overlap between some methane emissions destruction and methane removal approaches, but each area has its own set of constraints that will also lead to non-overlapping approaches, given different methane concentrations to treat, and different form-factor constraints.
Measuring and Standardizing AI’s Energy and Environmental Footprint to Accurately Assess Impacts
The rapid expansion of artificial intelligence (AI) is driving a surge in data center energy consumption, water use, carbon emissions, and electronic waste—yet these environmental impacts, and how they will change in the future, remain largely opaque. Without standardized metrics and reporting, policymakers and grid operators cannot accurately track or manage AI’s growing resource footprint. Currently, companies often use outdated or narrow measures (like Power Usage Effectiveness, PUE) and purchase renewable credits to obscure true emissions. Their true carbon footprint may be as much as 662% higher than the figures they report. A single hyperscale AI data center can guzzle hundreds of thousands of gallons of water per day and contribute to a “mountain” of e-waste, yet only about a quarter of data center operators even track what happens to retired hardware.
This policy memo proposes a set of congressional and federal executive actions to establish comprehensive, standardized metrics for AI energy and environmental impacts across model training, inference, and data center infrastructure. We recommend that Congress direct the Department of Energy (DOE) and the National Institute of Standards and Technology (NIST) to design, collect, monitor, and disseminate uniform and timely data on AI’s energy footprint, while designating the White House Office of Science and Technology Policy (OSTP) to convene a multi-agency council that coordinates implementation. Our plan of action outlines steps for developing metrics (led by DOE, NIST, and the Environmental Protection Agency [EPA]), implementing data reporting (with the Energy Information Administration [EIA], National Telecommunications and Information Administration [NTIA], and industry), and integrating these metrics into energy and grid planning (performed by DOE’s grid offices and the Federal Energy Regulatory Commission [FERC]). By standardizing how we measure AI’s footprint, the U.S. can be better prepared for the growth in power consumption while maintaining its leadership in artificial intelligence.
Challenge and Opportunity
Inconsistent metrics and opaque reporting make future AI power‑demand estimates extremely uncertain, leaving grid planners in the dark and climate targets on the line.
AI’s Opaque Footprint
Generative AI and large-scale cloud computing are driving an unprecedented increase in energy demand. AI systems require tremendous amounts of computing power both during training (the AI development period) and inference (when AI is used in real-world applications). The rapid rise of this new technology is already straining energy and environmental systems. Data centers consumed an estimated 415 terawatt-hours (TWh) of electricity in 2024 (roughly 1.5% of global power demand), and with AI adoption accelerating, the International Energy Agency (IEA) forecasts that data center energy use could more than double to 945 TWh by 2030. This added load is comparable to powering an entire country the size of Sweden or even Germany. There is a range of projections of AI’s energy consumption, with some estimates suggesting even more rapid growth than the IEA forecasts. Estimates suggest that much of this growth will be concentrated in the United States.
The large divergence in estimates of AI-driven electricity demand stems from the different assumptions and methods used in each study. One study starts from AI query volume (the number of requests users make for AI answers); another estimates energy demand from the projected supply of AI-related hardware; others apply a compound annual growth rate (CAGR) to data center growth under different scenarios. Different authors make various assumptions about chip shipment growth, workload mix (training vs. inference), efficiency gains, and per-query energy. Amidst this fog of measurement confusion, energy suppliers are caught off guard by surges in demand from new compute infrastructure on top of existing demands from sources like electric vehicles and manufacturing. Electricity grid operators in the United States typically plan for gradual increases in power demand that can be met with incremental generation and transmission upgrades. But if the rapid build-out of AI data centers, on top of other growing power demands, pushes demand up by hundreds of terawatt-hours annually, it will shatter the steady-growth assumption embedded in today’s models. Planners need far more granular, forward-looking forecasting methods to avoid driving up costs for ratepayers, last-minute scrambles to find power, and potential electricity reliability crises.
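To make the arithmetic concrete, the sketch below contrasts two of the estimation styles described above: the compound growth rate implied by the IEA's own figures, and a toy bottom-up estimate from query volume whose parameters are hypothetical placeholders rather than published values.

```python
# Two of the estimation styles described above, side by side.
# The IEA endpoints come from the text; the query-based parameters are
# hypothetical placeholders, not published estimates.

def cagr(start, end, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1

# Top-down: growth rate implied by the IEA's 415 TWh (2024) -> 945 TWh (2030).
print(f"Implied data center CAGR: {cagr(415, 945, 6):.1%}")   # ~14.7%

# Bottom-up (toy): energy from inference query volume alone.
queries_per_day = 1e9          # hypothetical query volume
wh_per_query = 0.3             # hypothetical energy per query, watt-hours
inference_twh = queries_per_day * wh_per_query * 365 / 1e12
print(f"Toy inference-only estimate: {inference_twh:.1f} TWh/yr")

# The toy bottom-up figure is orders of magnitude below the top-down total
# because it ignores training, networking, storage, cooling overhead, and
# non-AI workloads -- exactly the kind of scope mismatch that makes
# published projections diverge so widely.
```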
This surge in power demand also threatens to undermine climate progress. Many new AI data centers require 100–1,000 megawatts (MW), equivalent to the demand of a medium-sized city, while grid operators face interconnection lead times of over two years for clean energy supplies. In response to these power bottlenecks, some regional utilities, unable to supply enough clean electricity, have even resorted to restarting retired coal plants to meet data center loads, undermining local climate goals and efficient operation. Google’s carbon emissions rose 48% over the past five years and Microsoft’s by 23.4% since 2020, largely due to cloud computing and AI.
Despite the risks to the climate, carbon emissions data are often obscured: firms claim “carbon neutrality” via purchased clean power credits while their actual local emissions go unreported. One analysis found that Big Tech (Amazon, Meta) data centers may emit up to 662% more CO₂ than they publicly report. For example, Meta’s 2022 data center operations reported only 273 metric tons of CO₂ (using market-based accounting with credits), but over 3.8 million metric tons of CO₂ when calculated by actual grid mix according to one analysis—a difference of more than four orders of magnitude. Similarly, AI’s water impacts are largely hidden. Each interactive AI query (e.g., a short session with a language model) can indirectly consume half a liter of fresh water through data center cooling, contributing to millions of gallons used by AI servers—but companies rarely disclose water usage per AI workload. This lack of transparency masks the true environmental cost of AI, hinders accountability, and impedes smart policymaking.
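The reporting gap comes down to two Scope 2 accounting conventions; below is a minimal sketch of the mechanics, with hypothetical consumption and intensity figures and a deliberately simplified treatment of credits.

```python
# Location-based vs. market-based Scope 2 accounting for a hypothetical
# data center. All figures are illustrative; the market-based method is
# simplified by treating purchased credits as zero-emission supply.

electricity_mwh = 1_000_000      # annual consumption
grid_tco2_per_mwh = 0.40         # carbon intensity of the local grid mix
credits_mwh = 990_000            # purchased renewable certificates / credits

location_based = electricity_mwh * grid_tco2_per_mwh
market_based = max(electricity_mwh - credits_mwh, 0) * grid_tco2_per_mwh

print(f"Location-based (actual grid mix): {location_based:,.0f} tCO2")
print(f"Market-based (after credits):     {market_based:,.0f} tCO2")
# Same facility, same electrons -- only the location-based figure reflects
# what the local grid actually emitted to serve the load.
```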
Outdated and Fragmented Metrics
Legacy measures like Power Usage Effectiveness (PUE) miss what is important for AI compute efficiency, such as water consumption, hardware manufacturing, and e-waste.
The metrics currently used to gauge data center efficiency are insufficient for AI-era workloads. Power Usage Effectiveness (PUE), the two-decades-old standard, gives only a coarse snapshot of facility efficiency under ideal conditions. PUE measures total power delivered to a datacenter versus how much of that power actually makes it to the IT equipment inside. The more power used (e.g. for cooling), the worse the PUE ratio will be. However, PUE does not measure how efficiently the IT equipment actually uses the power delivered to it. Think about a car that reports how much fuel reaches the engine but not the miles per gallon of that engine. You can ensure that the fuel doesn’t leak out of the line on its way to the engine, but that engine might not be running efficiently. A good PUE is the equivalent of saying that fuel isn’t leaking out on its way to the engine; it might tell you that a data center isn’t losing too much energy to cooling, but won’t flag inefficient IT equipment. An AI training cluster with a “good” PUE (around 1.1) could still be wasteful if the hardware or software is poorly optimized.
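In concrete terms, a minimal sketch (all load and throughput figures are hypothetical):

```python
# PUE = total facility energy / IT equipment energy. It flags cooling and
# power-conversion overhead, but says nothing about how productively the
# IT equipment uses its share. All figures are hypothetical.

it_energy_mwh = 100_000      # energy delivered to servers, storage, network
overhead_mwh = 10_000        # cooling, power conversion, lighting

pue = (it_energy_mwh + overhead_mwh) / it_energy_mwh
print(f"PUE = {pue:.2f}")    # 1.10 -- a "good" facility by today's standards

# Two clusters behind the same PUE can differ wildly in useful work per joule.
clusters = {
    "well-optimized cluster": 4.0e17,    # hypothetical training ops per second
    "poorly-optimized cluster": 1.0e17,
}
power_watts = 20e6                       # 20 MW sustained IT draw for each
for name, ops_per_sec in clusters.items():
    print(f"{name}: {ops_per_sec / power_watts:.1e} ops per joule")
```

Identical PUE, very different useful output: that gap is what workload-level metrics such as performance per watt are meant to expose.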
In the absence of updated standards, companies “report whatever they choose, however they choose” regarding AI’s environmental impact. Few report water usage or lifecycle emissions. Only 28% of operators track hardware beyond its use, and just 25% measure e-waste, resulting in tons of servers and AI chips quietly ending up in landfills. This data gap leads to misaligned incentives—for instance, firms might build ever-larger models and data centers, chasing AI capabilities, without optimizing for energy or material efficiency because there is no requirement or benchmark to do so.
Opportunities for Action
Standardizing metrics for AI’s energy and environmental footprint presents a win-win opportunity. By measuring and disclosing AI’s true impacts, we can manage them. With better data, policymakers can incentivize efficiency innovations (from chip design to cooling to software optimization) and target grid investments where AI load is rising. Industry will benefit too: transparency can highlight inefficiencies (e.g. low server utilization or high water-cooled heat that could be recycled) and spur cost-saving improvements. Importantly, several efforts are already pointing the way. In early 2024, lawmakers in both chambers introduced the Artificial Intelligence Environmental Impacts Act, aiming to have the EPA study AI’s environmental footprint and develop measurement standards and a voluntary reporting system via NIST. Internationally, the European Union’s upcoming AI Act will require large AI systems to report energy use, resource consumption, and other life cycle impacts, and the ISO is preparing “sustainable AI” standards for energy, water, and materials accounting. The U.S. can build on this momentum. A recent U.S. Executive Order (Jan 2025) already directed DOE to draft reporting requirements for AI data centers covering their entire lifecycle—from material extraction and component manufacturing to operation and retirement—including metrics for embodied carbon (greenhouse-gas emissions that are “baked into” the physical hardware and facilities before a single watt is consumed to run a model), water usage, and waste heat. It also launched a DOE–EPA “Grand Challenge” to push the PUE ratio below 1.1 and minimize water usage in AI facilities. These signals show that there is willingness to address the problem. Now is the time to implement a comprehensive framework that standardizes how we measure AI’s environmental impact. If we seize this opportunity, we can ensure that AI innovation is supported by clean energy and a smarter grid, and that it places less environmental and economic burden on communities.
Plan of Action
To address this challenge, Congress should authorize DOE and NIST to lead an interagency working group and a consortium of public, private and academic communities to enact a phased plan to develop, implement, and operationalize standardized metrics, in close partnership with industry.
Recommendation 1. Identify and Assign Agency Mandates
Creating and implementing this measurement framework requires concerted action by multiple federal agencies, each leveraging its mandate. The Department of Energy (DOE) should serve as the co-lead federal agency driving this initiative. Within DOE, the Office of Critical and Emerging Technologies (CET) can coordinate AI-related efforts across DOE programs, given its focus on AI and advanced tech integration. The National Institute of Standards and Technology (NIST) will also act as a co-lead for this initiative, leading the metrics development and standardization effort described above and convening experts and industry. The White House Office of Science and Technology Policy (OSTP) will act as the coordinating body for this multi-agency effort. OSTP, alongside the Council on Environmental Quality (CEQ), can ensure alignment with broader energy, environment, and technology policy. The Environmental Protection Agency (EPA) should take charge of environmental data collection and oversight. The Federal Energy Regulatory Commission (FERC) should play a supporting role by addressing grid and electricity market barriers. FERC should streamline interconnection processes for new data center loads, perhaps creating fast-track procedures for projects that commit to high efficiency and demand flexibility.
Congressional leadership and oversight will be key. The Senate Committee on Energy and Natural Resources and House Energy & Commerce Committee (which oversee energy infrastructure and data center energy issues) should champion legislation and hold hearings on AI’s energy demands. The House Science, Space, and Technology Committee and Senate Commerce, Science, & Transportation Committee (which oversee NIST and OSTP) should support R&D funding and standards efforts. Environmental committees (like Senate Environment and Public Works, House Natural Resources) should address water use and emissions. Ongoing committee oversight can ensure agencies stay on schedule and that recommendations turn into action (for example, by requiring EPA/DOE/NIST joint reports to Congress on a set schedule).
Congress should mandate a formal interagency task force or working group, co-led by the Department of Energy (DOE) and the National Institute of Standards and Technology (NIST), with the White House Office of Science and Technology Policy (OSTP) serving as the coordinating body and involving all relevant federal agencies. This body will meet regularly to track progress, resolve overlaps or gaps, and issue public updates. By clearly delineating responsibilities, the federal government can address the measurement problem holistically.
Recommendation 2. Develop a Comprehensive AI Energy Lifecycle Measurement Framework
A complete view of AI’s environmental footprint requires metrics that span the full lifecycle, including every layer from chip to datacenter, workload drivers, and knock‑on effects like water use and electricity prices.
Create new standardized metrics that capture AI’s energy and environmental footprint across its entire lifecycle—training, inference, data center operations (cooling/power), and hardware manufacturing/disposal. This framework should be developed through a multi-stakeholder process led by NIST in partnership with DOE and EPA, and in consultation with industry, academia, and state and local governments.
Key categories should include:
- Data Center Efficiency Metrics: how effectively do data centers use power?
- AI Hardware & Compute Metrics: e.g. Performance per Watt (PPW)—the throughput of AI computations per watt of power.
- Cooling and Water Metrics: How much energy and water are being used to cool these systems?
- Environmental Impact Metrics: What is the carbon intensity per AI task?
- Composite or Lifecycle Metrics: Beyond a single point in time, what are the lifetime characteristics of impact for these systems?
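As a rough illustration of what a single standardized reporting record spanning these categories might look like, the sketch below uses hypothetical field names and values; it is not a proposed federal schema.

```python
# Illustrative reporting record spanning the metric categories above.
# Field names and example values are hypothetical, not a proposed schema.

from dataclasses import dataclass

@dataclass
class AIFacilityReport:
    facility_id: str
    reporting_period: str
    pue: float                           # total facility energy / IT energy
    wue_liters_per_kwh: float            # water used per kWh of IT energy
    performance_per_watt: float          # AI throughput per watt of IT power
    carbon_kg_per_training_task: float   # location-based emissions per task
    embodied_carbon_tonnes: float        # lifecycle emissions in hardware/buildings
    ewaste_recycled_fraction: float      # share of retired hardware recycled

example = AIFacilityReport(
    facility_id="DC-EXAMPLE-01",
    reporting_period="2026-Q1",
    pue=1.12,
    wue_liters_per_kwh=0.45,
    performance_per_watt=3.1e10,
    carbon_kg_per_training_task=12.4,
    embodied_carbon_tonnes=85_000,
    ewaste_recycled_fraction=0.62,
)
print(example)
```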
Designing standardized metrics
NIST, with its measurement science expertise, should coordinate the development of these metrics in an open process, building on efforts like NIST’s AI Standards Working Group—a standing body chartered under the Interagency Committee on Standards Policy which brings together technical stakeholders to map the current AI-standards landscape, spot gaps, and coordinate U.S. positions and research priorities. The goal is to publish a standardized metrics framework and guidelines that industry can begin adopting voluntarily within 12 months. Where possible, leverage existing standards (for example, those from the Green Grid consortium on PUE and Water Usage Effectiveness (WUE), or IEEE/ISO standards for energy management) and tailor them to AI’s unique demands. Crucially, these metrics must be uniformly defined to enable apples-to-apples comparisons and periodically updated as technology evolves.
Reviewing, governing, and improving the metrics
We recommend establishing a Metrics Review Committee (led by NIST with DOE/EPA and external experts) to refine the metrics whenever needed, host stakeholder workshops, and issue public updates. This continuous improvement process will keep the framework current with new AI model types, cooling tech, and hardware advances, ensuring relevance into the future. For example, when we move from the current model of chatbots responding to queries to agentic AI systems that plan, act, remember, and iterate autonomously, traditional “energy per query” metrics no longer capture the full picture.
Recommendation 3. Operationalize Data Collection, Reporting, Analysis and Integrate it into Policy
Start with a six‑month voluntary reporting program, and gradually move towards a mandatory reporting mechanism which feeds straight into EIA outlooks and FERC grid planning.
The task force should solicit input via a Request for Information (RFI), similar to DOE’s recent RFI on AI infrastructure development, asking data center operators, AI chip manufacturers, cloud providers, utilities, and environmental groups to weigh in on feasible reporting requirements and data-sharing methods. Within 12 months of starting, the task force should complete (a) a draft AI energy lifecycle measurement framework (with standardized definitions for energy, water, carbon, and e-waste metrics across training and data center operations) and (b) an initial reporting template for technology companies, data centers, and utilities to pilot.
With standardized metrics in hand, we must shift the focus to implementation and data collection at scale. In the beginning, a voluntary AI energy reporting program can be launched by DOE and EPA (with NIST overseeing the standards). This program would provide guidance to AI developers (e.g. major model-training companies), cloud service providers, and data center operators to report their metrics on an annual or quarterly basis.
After a trial run of the voluntary program, Congress should enact legislation to create a mandatory reporting regime that borrows the best features of existing federal disclosure programs. One useful template is EPA’s Greenhouse Gas Reporting Program, which obliges any facility that emits more than 25,000 tons of CO₂ equivalent per year to file standardized, verifiable electronic reports. The same threshold logic could be adapted for data centers (e.g., those with more than 10 MW of IT load) and for AI developers that train models above a specified compute budget. A second model is DOE/EIA’s Form EIA-923 “Power Plant Operations Report,” whose structured monthly data flow straight into public statistics and planning models. An analogous “Form EIA-AI-01” could feed the Annual Energy Outlook and FERC reliability assessments without creating a new bureaucracy.

EIA could also consider adding specific questions or categories in the Commercial Buildings Energy Consumption Survey and Form EIA-861 to identify energy use by data centers and large computing loads. This may involve coordinating with the Census Bureau to leverage industrial classification data (e.g., NAICS codes for data hosting facilities) so that baseline energy/water consumption of the “AI sector” is measured in national statistics.

NTIA, which often convenes multi-stakeholder processes on technology policy, can host industry roundtables to refine reporting processes and address any concerns (e.g. data confidentiality, trade secrets). NTIA can also help ensure that reporting requirements are not overly burdensome to smaller AI startups by working out streamlined methods (for instance, aggregated reporting via cloud providers). DOE’s Grid Deployment Office (GDO) and Office of Electricity (OE), with better data, should start integrating AI load growth into grid planning models and funding decisions. For example, GDO could prioritize transmission projects that will deliver clean power to regions with clusters of AI data centers, based on EIA data showing rapid load increases. FERC, for its part, can use the reported data to update its reliability and resource adequacy guidelines and possibly issue guidance for regional grid operators (RTOs/ISOs) to explicitly account for projected large computing loads in their plans.
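A minimal sketch of how that threshold logic might work in practice, using the GHGRP-style emissions threshold and the illustrative 10 MW IT-load cutoff mentioned above (the facility records are hypothetical):

```python
# Threshold logic for a mandatory reporting regime, loosely modeled on the
# GHGRP-style emissions threshold and the IT-load example discussed above.
# Facility records are hypothetical.

CO2E_THRESHOLD_TONNES = 25_000   # tonnes CO2e per year
IT_LOAD_THRESHOLD_MW = 10        # illustrative data center threshold

facilities = [
    {"name": "hyperscale-campus-a", "tco2e": 310_000, "it_load_mw": 220},
    {"name": "regional-colo-b",     "tco2e": 18_000,  "it_load_mw": 6},
    {"name": "ai-training-site-c",  "tco2e": 22_000,  "it_load_mw": 45},
]

for f in facilities:
    must_report = f["tco2e"] > CO2E_THRESHOLD_TONNES or f["it_load_mw"] > IT_LOAD_THRESHOLD_MW
    status = "must report" if must_report else "exempt"
    print(f"{f['name']}: {status}")
```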
This transparency will let policymakers, researchers, and consumers track improvements (e.g., is the energy per AI training run decreasing over time?) and identify leaders and laggards. It will also inform mid-course adjustments: if certain metrics prove too hard to collect or not meaningful, NIST can update the standards. The Census Bureau can contribute by testing the inclusion of questions on technology infrastructure in its 2027 Economic Census and annual surveys, ensuring that economic data on the tech sector include environmental parameters (for example, data center utility expenditures, which correlate with energy use). Overall, this would establish an operational reporting system and start feeding the data into both policy and market decisions.
Through these recommendations, responsible offices have clear roles: DOE spearheads efficiency measures in data center initiatives; the Office of Electricity (OE) and Grid Deployment Office (GDO) use the data to guide grid improvements; NIST creates and maintains the measurement standards; EPA oversees environmental data and impact mitigation; EIA institutionalizes energy data collection and dissemination; FERC adapts regulatory frameworks for reliability and resource adequacy; OSTP coordinates the interagency strategy and keeps the effort a priority; NTIA works with industry to smooth data exchange and keep firms engaged; and the Census Bureau integrates these metrics into broader economic data. See the roles outlined below. Meanwhile, non-governmental actors such as utilities, AI companies, and data center operators must be not only data providers but partners. Utilities can use these data to plan investments and share insights on demand response and energy sourcing; AI developers and data center firms will implement new metering and reporting practices internally, enabling them to compete on efficiency (much as car companies compete on miles-per-gallon ratings). Together, these actions create a comprehensive approach: measuring AI’s footprint, managing its growth, and mitigating its environmental impacts through informed policy.
Conclusion
AI’s extraordinary capabilities should not come at the expense of our energy security or environmental sustainability. This memo outlines how we can effectively operationalize measuring AI’s environmental footprint by establishing standardized metrics and leveraging the strengths of multiple agencies to implement them. By doing so, we can address a critical governance gap: what isn’t measured cannot be effectively managed. Standard metrics and transparent reporting will enable AI’s growth while ensuring that data center expansion is met with commensurate increases in clean energy, grid upgrades, and efficiency gains.
The benefits of these actions are far-reaching. Policymakers will gain tools to balance AI innovation with energy and environmental goals; for example, they could require improvements when an AI service is energy-inefficient, or fast-track permits for a new data center that meets top sustainability standards. Communities will be better protected: with data in hand, we can avoid scenarios where a cluster of AI facilities suddenly strains a region’s power or water resources without local officials knowing in advance. Instead, reporting and coordination requirements can channel resources (like new transmission lines or water recycling systems) to those communities ahead of time. The AI industry itself will benefit by building trust and reducing the risk of backlash or heavy-handed regulation; a clear federal metrics framework provides predictability and a level playing field (everyone measures the same way), and it showcases responsible stewardship of technology. Moreover, emphasizing energy efficiency and resource reuse can reduce operating costs for AI companies in the long run, a crucial advantage as energy prices and supply chain concerns grow.
This memo is part of our AI & Energy Policy Sprint, a policy project to shape U.S. policy at the critical intersection of AI and energy. Read more about the Policy Sprint and check out the other memos here.
While there are existing metrics like PUE for data centers, they don’t capture the full picture of AI’s impacts. Traditional metrics focus mainly on facility efficiency (power and cooling) and not on the computational intensity of AI workloads or the lifecycle impacts. AI operations involve unique factors—for example, training a large AI model can consume significant energy in a short time, and using that AI model continuously can draw power 24/7 across distributed locations. Current standards are outdated and inconsistent: one data center might report a low PUE but could be using water recklessly or running hardware inefficiently. AI-specific metrics are needed to measure things like energy per training run, water per cooling unit, or carbon per compute task, which no standard reporting currently requires. In short, general data center standards weren’t designed for the scale and intensity of modern AI. By developing AI-specific metrics, we ensure that the unique resource demands of AI are monitored and optimized, rather than lost in aggregate averages. This helps pinpoint where AI can be made more efficient (e.g., via better algorithms or chips)—an opportunity not visible under generic metrics.
AI’s environmental footprint is a cross-cutting issue, touching on energy infrastructure, environmental impact, technological standards, and economic data. No single agency has the full expertise or jurisdiction to cover all aspects. Each agency will have clearly defined roles (as outlined in the Plan of Action). For instance, NIST develops the methodology, DOE and EPA collect and use the data, EIA disseminates it, and FERC and Congress use it to adjust policies. This collaborative approach prevents blind spots; a single-agency approach would likely miss critical elements (for instance, a purely DOE-led effort might not address e-waste or standardized methods, which NIST and EPA can). The good news is that frameworks for interagency cooperation already exist, and this initiative aligns with broader administration priorities (clean energy, a reliable grid, responsible AI). Thus, while it involves multiple agencies, OSTP and the White House will keep everyone synchronized. The result will be a comprehensive policy that each agency helps implement according to its strengths, rather than a piecemeal solution. See below:
Roles and Responsibilities to Measure AI’s Environmental Impact
- Department of Energy (DOE): DOE should serve as the co-lead federal agency driving this initiative. Within DOE, the Office of Critical and Emerging Technologies (CET) can coordinate AI-related efforts across DOE programs, given its focus on AI and advanced tech integration. DOE’s Office of Energy Efficiency and Renewable Energy (EERE) can lead on promoting energy-efficient data center technologies and practices (e.g. through R&D programs and partnerships), while the Office of Electricity (OE) and Grid Deployment Office address grid integration challenges (ensuring AI data centers have access to reliable clean power). DOE should also collaborate with utilities and FERC to plan for AI-driven electricity demand growth and to encourage demand-response or off-peak operation strategies for energy-hungry AI clusters.
- National Institute of Standards and Technology (NIST): NIST will act as co-lead for this initiative, leading the metrics development and standardization effort described above and convening experts and industry. NIST should revive or expand its AI Standards Coordination Working Group to focus on sustainability metrics, and ultimately publish technical standards or reference materials for measuring AI energy use, water use, and emissions. NIST is also well suited to host a stakeholder consortium on AI environmental impacts, working in tandem with EPA and DOE.
- White House, including the Office of Science and Technology Policy (OSTP): OSTP will act as the coordinating body for this multi-agency effort. OSTP, alongside the Council on Environmental Quality (CEQ), can ensure alignment with broader climate and tech policy (such as the U.S. Climate Strategy and AI initiatives). The Administration can also use the Federal Chief Sustainability Officer and OMB guidance to integrate AI energy metrics into federal sustainability requirements (for instance, updating OMB’s memos on data center optimization to include AI-specific measures).
- Environmental Protection Agency (EPA): EPA should take charge of environmental data collection and oversight. In the near term, EPA (with DOE) would conduct the comprehensive study of AI’s environmental impacts, examining AI systems’ lifecycle emissions, water use, and e-waste. EPA’s expertise in greenhouse gas (GHG) accounting will ensure metrics like carbon intensity are rigorously quantified (e.g. using location-based grid emissions factors rather than unreliable REC-based accounting).
- Federal Energy Regulatory Commission (FERC): FERC plays a supporting role by addressing grid and electricity market barriers. FERC should streamline interconnection processes for new data center loads, perhaps creating fast-track procedures for projects that commit to high efficiency and demand flexibility. FERC can also ensure that regional grid reliability assessments begin accounting for projected AI/data center load growth using the reported data.
- Congressional Committees: Congressional leadership and oversight will be key. The Senate Committee on Energy and Natural Resources and House Energy & Commerce Committee (which oversee energy infrastructure and data center energy issues) should champion legislation and hold hearings on AI’s energy demands. The House Science, Space, and Technology Committee and Senate Commerce, Science, & Transportation Committee (which oversee NIST and OSTP) should support R&D funding and standards efforts. Environmental committees (like Senate Environment and Public Works, House Natural Resources) should address water use and emissions. Ongoing committee oversight can ensure agencies stay on schedule and that recommendations turn into action (for example, requiring the EPA/DOE/NIST joint report to Congress in four years as the Act envisions, and then moving on any further legislative needs).
The plan requires high-level, standardized data that balances transparency with practicality. Companies running AI operations (such as cloud providers or big AI model developers) would report metrics such as: total electricity consumed for AI computations (annually), average efficiency metrics (e.g. PUE, Carbon Usage Effectiveness (CUE), and WUE for their facilities), water usage for cooling, and e-waste generated (the amount of hardware decommissioned and how it was handled). These data points are typically already collected internally for cost and sustainability tracking; the difference is that they would be reported in a consistent format and possibly to a central repository. Utilities, if involved, might report aggregated data center load in their service territories or significant new interconnections for AI projects (much of this already appears in utility planning documents). See below for examples.
Metrics to Illustrate the Types of Shared Information
- Data Center Efficiency Metrics: Power Usage Effectiveness (PUE) (refined for AI workloads), Data Center Infrastructure Efficiency (DCIE), which measures IT versus total facility power (the inverse of PUE), Energy Reuse Factor (ERF) to quantify how much waste heat is reused on-site, and Carbon Usage Effectiveness (CUE) to link energy use with carbon emissions (kg CO₂ per kWh). These give a holistic view of facility efficiency and carbon intensity, beyond just power usage (a computation sketch follows this list).
- AI Hardware & Compute Metrics: Performance per Watt (PPW)—the throughput of AI computations (like FLOPS or inferences) per watt of power, which encourages energy-efficient model training and inference. Compute Utilization—ensuring expensive AI accelerators (GPUs/TPUs) are well-utilized rather than idling (tracking average utilization rates). Training energy per model—total kWh or emissions per training run (possibly normalized by model size or training-hours). Inference efficiency—energy per 1000 queries or per inference for deployed models. Idle power draw—measure and minimize the energy hardware draws when not actively in use.
- Cooling and Water Metrics: Cooling Energy Efficiency Ratio (EER)—the output cooling power per watt of energy input, to gauge cooling system efficiency. Water Usage Effectiveness (WUE)—liters of water used per kWh of IT compute, or simply total water used for cooling per year. These help quantify and benchmark the significant water and electricity overhead for thermal management in AI data centers.
- Environmental Impact Metrics: Carbon Intensity per AI Task—CO₂ emitted per training or per 1000 inferences, which could be aggregated to an organizational carbon footprint for AI operations. Greenhouse Gas emissions per kWh—linking energy use to actual emissions based on grid mix or backup generation. Also, e-waste metrics—such as total hardware weight decommissioned annually, or a recycling ratio. For instance, tracking the tons of servers/chips retired and the fraction recycled versus landfilled can illuminate the life cycle impact.
- Composite or Lifecycle Metrics: Develop ways to combine these factors to rate overall sustainability of AI systems. For example, an “AI Sustainability Score” could incorporate energy efficiency, renewables use, cooling efficiency, and end-of-life recycling. Another idea is an “AI Energy Star” rating for AI hardware or cloud services that meet certain efficiency and transparency criteria, modeled after Energy Star appliance ratings.
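To make these facility-level definitions concrete, the sketch below computes the metrics from annual totals. The formulas follow the standard definitions referenced in this list; all input values are illustrative placeholders, not measured data.

```python
# Minimal sketch computing the facility-level metrics defined above.
# All input values are illustrative placeholders, not measured data.

def facility_metrics(total_facility_kwh: float,
                     it_equipment_kwh: float,
                     reused_energy_kwh: float,
                     total_co2_kg: float,
                     water_liters: float) -> dict:
    """Compute PUE, DCIE, ERF, CUE, and WUE from annual facility totals."""
    pue = total_facility_kwh / it_equipment_kwh    # total energy / IT energy
    dcie = it_equipment_kwh / total_facility_kwh   # inverse of PUE
    erf = reused_energy_kwh / total_facility_kwh   # share of facility energy reused
    cue = total_co2_kg / it_equipment_kwh          # kg CO2 per IT kWh
    wue = water_liters / it_equipment_kwh          # liters per IT kWh
    return {"PUE": pue, "DCIE": dcie, "ERF": erf, "CUE": cue, "WUE": wue}

# Example: a hypothetical facility drawing 60 GWh/year in total,
# 48 GWh of which reaches the IT equipment.
print(facility_metrics(total_facility_kwh=60e6,
                       it_equipment_kwh=48e6,
                       reused_energy_kwh=6e6,
                       total_co2_kg=21e6,
                       water_liters=90e6))
# -> PUE 1.25, DCIE 0.80, ERF 0.10, CUE ~0.44 kg/kWh, WUE ~1.9 L/kWh
```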
No, the intention is not to force disaggregation down to proprietary details (e.g., exactly how a specific algorithm uses energy) but rather to get macro-level indicators. Regarding trade secrets or sensitive information: the data collected (energy, water, emissions) is not about revealing competitive algorithms or datasets; it is about resource use. These figures are analogous to what many firms already publish in sustainability reports (power usage, carbon footprint), just reported more uniformly. There will be provisions to protect any sensitive facility-level data (e.g., EIA could aggregate or anonymize certain figures in public releases). The goal is transparency about environmental impact, not exposure of intellectual property.
Once collected, the data will become a powerful tool for evidence-based policymaking and oversight. At the strategic level, DOE and the White House can track whether the AI sector is becoming more efficient or not—for instance, seeing trends in energy-per-AI-training decreasing (good) or total water use skyrocketing (a flag for action).
- Energy planning: EIA will incorporate the numbers into its models, which guide national energy policy and investment. If data show that AI is driving, say, an extra 5% electricity demand growth in certain regions, DOE’s Grid Deployment Office and FERC can respond by facilitating grid expansions or reliability measures in those areas.
- Climate policy: EPA can use reported emissions data to update greenhouse gas inventories and identify whether AI/data centers are becoming a significant source; if so, that could shape future climate regulations or programs (ensuring this sector contributes to emissions reduction goals).
- Water resource management: If we see large water usage by AI in drought-prone areas, federal and state agencies can work on water recycling or alternative cooling initiatives.
- Research and incentives: DOE’s R&D programs (through ARPA-E or the National Labs) can target the pain points revealed: if e-waste volumes are high, fund research into longer-lasting hardware or recycling tech; if metrics like the Energy Reuse Factor are low, push demonstration projects for waste heat reuse.
This could inform everything from ESG investment decisions to local permitting. For example, a company planning a new data center might be asked by local authorities, “What’s your expected PUE and water usage? The national average for AI data centers is X—will you do better?” In essence, the data ensures the government and public can hold the AI industry accountable for progress (or regress) on sustainability. By integrating these data into models and policies, the government can anticipate and avert problems (like grid strain or high emissions) before they grow, and steer the sector toward solutions.
AI services and data centers are worldwide, so consistency in how we measure impacts is important. The U.S. effort will be informed by and contribute to international standards. Notably, the International Organization for Standardization (ISO) is already developing criteria for sustainable AI, including energy, raw materials, and water metrics across the AI lifecycle. NIST, which often represents the U.S. in global standards bodies, is involved and will ensure that our metrics framework aligns with ISO’s emerging standards. The EU’s AI Act likewise includes requirements for reporting AI energy and resource use. By moving early on our own metrics, the U.S. can help shape what those international norms look like, rather than react to them. This initiative will encourage U.S. agencies to engage in forums like the Global Partnership on AI (GPAI) or bilateral tech dialogues to promote common sustainability reporting frameworks. In the end, aligning metrics internationally will create a more level playing field, ensuring that AI companies cannot simply shift operations to avoid transparency. If the U.S., EU, and others all require similar disclosures, it reinforces responsible practices everywhere.
Shining a light on energy and resource use can drive new innovation in efficiency. Initially, there may be modest costs—for example, installing better sub-meters in data centers or dedicating staff time to reporting. However, these costs are relatively small in context. Many leading companies already track these metrics internally for cost management and corporate sustainability goals. We are recommending formalizing and sharing that information. Over time, the data collected can reduce costs: companies will identify wasteful practices (maybe servers idling, or inefficient cooling during certain hours) and correct them, saving on electricity and water bills. There is also an economic opportunity in innovation: as efficiency becomes a competitive metric, we expect increased R&D into low-power AI algorithms, advanced cooling, and longer-life hardware. Those innovations can improve performance per dollar as well. Moreover, policy support can offset any burdens—for instance, the government can provide technical assistance or grants to smaller firms to help them improve energy monitoring. We should also note that unchecked resource usage carries its own risks to innovation: if AI’s growth starts causing blackouts or public backlash due to environmental damage, that would seriously hinder AI progress.
Speed Grid Connection Using ‘Smart AI Fast Lanes’ and Competitive Prizes
Innovation in artificial intelligence (AI) and computing capacity is essential for U.S. competitiveness and national security. However, AI data center electricity use is growing rapidly. Data centers already consume more than 4% of U.S. electricity annually and could account for 6% to 12% of U.S. electricity by 2028. At the same time, electricity rates are rising for consumers across the country, with transmission and distribution infrastructure costs a major driver of these increases. For the first time in fifteen years, the U.S. is experiencing a meaningful increase in electricity demand. Data centers already consume more than 25% of the electricity in Virginia, which leads the world in data center installations. Data center electricity load growth has real economic and environmental impacts for local communities. It also represents a national policy test of how the U.S. responds to rising power demand from the electrification of homes, transportation, and manufacturing, which are important technology transitions for cutting carbon emissions and air pollution.
Federal and state governments need to ensure that the development of new AI and data center infrastructure does not increase costs for consumers, harm the environment, or exacerbate existing inequalities. “Smart AI Fast Lanes” is a policy and infrastructure investment framework that ensures the U.S. leads the world in AI while building an electricity system that is clean, affordable, reliable, and equitable. By leveraging innovation prizes that pay for performance, coupled with public-private partnerships, data center providers can work with the Department of Energy, the Foundation for Energy Security and Innovation (FESI), the Department of Commerce, National Labs, state energy offices, utilities, and the Department of Defense to drive innovation that increases energy security while lowering costs.
Challenge and Opportunity
Targeted policies can ensure that the development of new AI and data center infrastructure does not increase costs for consumers, harm the environment, or exacerbate existing energy burdens. Allowing new clean power sources co-located or contracted with AI computing facilities to connect to the grid quickly, and then managing any infrastructure costs associated with that new interconnection, would accelerate the addition of new clean generation for AI while lowering electricity costs for homes and businesses.
One of the biggest bottlenecks in many regions of the U.S. to adding much-needed capacity to the electricity grid is the so-called “interconnection queue”. Regions have different requirements that power plants must complete (often a number of studies on how a project affects grid infrastructure) before they are allowed to connect. Solar, wind, and battery projects represented 95% of the capacity waiting in interconnection queues in 2023. The operator of Texas’ power grid, the Electric Reliability Council of Texas (ERCOT), uses a “connect and manage” interconnection process that connects new energy supplies faster than the rest of the country. Instead of requiring each power plant to complete lengthy studies of needed system-wide infrastructure investments before connecting to the grid, the “connect and manage” approach gets power plants online quicker than a “studies first” approach; Texas manages any risks that arise through its power markets and system-wide planning efforts. The results are clear: the median time from an interconnection request to commercial operations in Texas was four years, compared to five years in New York and more than six and a half years in California.
“Smart AI Fast Lanes” expands the spirit of the Texas “connect and manage” approach nationwide for data centers and clean energy, and adds to it investment and innovation prizes to speed up the process, ensure grid reliability, and lower costs.
Data center providers would work with the Department of Energy, the Foundation for Energy Security and Innovation (FESI), the Department of Commerce, National Laboratories, state energy offices, utilities, and the Department of Defense to speed up interconnection queues, spur innovation in efficiency, and reinvest in infrastructure in order to increase energy security and lower costs.
Why FESI Should Lead ‘Smart AI Fast Lanes’
With FESI managing this effort, the process can move faster than the government acting alone. FESI is an independent, non-profit, agency-related foundation that was created by Congress in the CHIPS and Science Act of 2022 to help the Department of Energy achieve its mission and accelerate “the development and commercialization of critical energy technologies, foster public-private partnerships, and provide additional resources to partners and communities across the country supporting solutions-driven research and innovation that strengthens America’s energy and national security goals”. Congress has created many other agency-related foundations, such as the Foundation for NIH, the National Fish and Wildlife Foundation, and the National Park Foundation, which was created in 1935. These agency-related foundations have a demonstrated record of raising external funding to leverage federal resources and enabling efficient public-private partnerships. As a foundation supporting the mission of the Department of Energy, FESI has a unique opportunity to quickly respond to emergent priorities and create partnerships to help solve energy challenges.
As an independent organization, FESI can leverage the capabilities of the private sector, academia, philanthropies, and other organizations to enable collaboration with federal and state governments. FESI can also serve as an access point for additional external investment; shared-risk structures and clear rules of engagement make emerging energy technologies more attractive to institutional capital. For example, the National Fish and Wildlife Foundation awards grants that are matched with non-federal private, philanthropic, or local funding sources, multiplying the impact of any federal investment. In addition, the National Fish and Wildlife Foundation has partnered with the Department of Defense and external funding sources to enhance coastal resilience near military installations. Both AI compute capabilities and energy resilience are of strategic importance to the Department of Defense, the Department of Energy, and other agencies, and leveraging public-private partnerships is a key pathway to enhance capabilities and security. FESI leading a Smart AI Fast Lanes initiative could be a force multiplier, enabling rapid deployment of clean AI compute capabilities that are good for communities, companies, and national security.
Use Prizes to Lessen Cost and Maximize Return
The Department of Energy has long used prize competitions to spur innovation and accelerate access to funding and resources. Prize competitions with focused objectives but unstructured pathways for success enable the private sector to compete and advance innovation without requiring extensive federal capacity and involvement. Federal prize programs pay for performance and results, while also providing a mechanism to crowd in additional philanthropic and private sector investment. In the Smart AI Fast Lanes framework, FESI could use prizes to support energy innovation from AI data centers while working with the Department of Energy’s Office of Cybersecurity, Energy Security, and Emergency Response (CESER) to enable a repeatable and scalable public-private partnership program. These prizes would be structured so that the administrative and operational effort required of FESI itself is low, with other groups such as American-Made, the National Laboratories, or organizations like FAS helping to provide technical expertise to review and administer prize applications. This can ensure quality while enabling scalable growth.
Plan of Action
Here’s how “Smart AI Fast Lanes” would work. For any proposed data center investment of more than 250 MW, companies could apply to work with FESI. A successful application would leverage public, private, and philanthropic funds and technical assistance. Projects would be required to increase clean energy supplies, achieve world-leading data center energy efficiency, invest in transmission and distribution infrastructure, and/or deploy virtual power plants for grid flexibility.
Recommendation 1. Use a “Smart AI Fast Lane” Connection Fee to Quickly Connect to the Grid, Further Incentivized by a “Bring Your Own Power” Prize
New large AI data center loads choosing the “Smart AI Fast Lane” would pay a fee to connect to the grid without first completing lengthy pre-connection cost studies. Those payments would go into a fund, managed and overseen by FESI, that would be used to cover any infrastructure costs incurred by regional grids for the first three years after project completion. The fee could be a flat fee based on data center size, or structured as an auction in which the data centers bidding the highest in a region move to the front of the line, letting the market prioritize the highest-value additions. Alternatively, large load projects could choose to do the studies first and remain in the regular, and likely slower, interconnection queue to avoid the fee.
In addition, FESI could facilitate a “Bring Your Own Power” prize award, a combination of public, private, and philanthropic funds that data center developers can match to contract for new, additional zero-emission electricity generated locally that covers twice the data center’s annual electricity use. For data centers committing to this “Smart AI Fast Lane” process, both the data center and the clean energy supply would receive accelerated priority in the interconnection queue and technical assistance from the National Laboratories. This leverages economies of scale for projects, lowers the cost of locally generated clean electricity, and gets clean energy connected to the grid quicker. Prize resources would support a “connect and manage” interconnection approach by covering 75% of the costs of any infrastructure required for the local clean power projects resulting from the project. FESI prize resources could further supplement these payments to upgrade electrical infrastructure in areas of national need for new electricity supplies to maintain electricity reliability. These include areas assessed by the North American Electric Reliability Corporation to have a high risk of an electricity shortfall in the coming years, such as the Upper Midwest or Gulf Coast, or areas with an elevated risk such as California, the Great Plains, Texas, the Mid-Atlantic, or the Northeast.
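As a worked illustration of the arithmetic behind this commitment, the sketch below applies the 2x clean-generation multiple and the 75% infrastructure cost share proposed above to a hypothetical project; the consumption and cost figures are invented for illustration.

```python
# Illustrative arithmetic for the "Bring Your Own Power" commitment described
# above. Input figures are hypothetical; the 2x generation multiple and 75%
# infrastructure cost share are the parameters proposed in this memo.

annual_dc_consumption_mwh = 500_000   # hypothetical 500 GWh/yr data center
clean_energy_multiple = 2.0           # contract for 2x annual consumption
infra_cost_share = 0.75               # prize covers 75% of local grid upgrades

required_clean_mwh = annual_dc_consumption_mwh * clean_energy_multiple
local_infra_cost = 40_000_000         # hypothetical $40M in required upgrades
prize_contribution = local_infra_cost * infra_cost_share
developer_share = local_infra_cost - prize_contribution

print(f"Contracted clean generation: {required_clean_mwh:,.0f} MWh/yr")
print(f"Prize-covered infrastructure cost: ${prize_contribution:,.0f}")
print(f"Developer-covered infrastructure cost: ${developer_share:,.0f}")
```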
Recommendation 2. Create an Efficiency Prize To Establish World-Leading Energy and Water Efficiency at AI Data Centers
Data centers have different design configurations that affect how much energy and water they need to operate. Data centers use electricity for computing, but also for the cooling systems needed for computing equipment, and there are innovation opportunities to increase the efficiency of both. One historical measure of data center energy efficiency is Power Use Effectiveness (PUE), which is total facility annual energy use divided by computing equipment annual energy use, with values closer to 1.0 being more efficient. Similarly, Water Use Effectiveness (WUE) is total annual water use divided by computing equipment annual energy use, with values closer to zero being more efficient. We should continue to push for improvement in PUE and WUE, but these metrics are incomplete drivers of deep innovation because they do not reflect how much computing work is delivered and do not assess impacts on the broader energy system. While multiple metrics for data center energy efficiency have been proposed over the past several years, what matters for innovation is improving how much AI computing work we get for the amount of energy and water used. Just as efficiency in a car is measured in miles per gallon (MPG), we need to measure the “MPG” of how AI data centers perform work and then create incentives and competition for continuous improvement. There could be different metrics for different types of AI training and inference workloads, but a starting point could be tokens per kilowatt-hour of electricity used. A token is a word or portion of a word that AI foundation models use for analysis. Another approach is to measure computing performance, or FLOPS, per kilowatt-hour. The more analysis an AI model or data center can perform using the same amount of energy, the more energy efficient it is.
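A minimal sketch of this “AI MPG” idea is shown below, computing tokens per kilowatt-hour alongside PUE and WUE; all workload and metering figures are hypothetical placeholders.

```python
# Minimal sketch of the workload-level "AI MPG" metrics discussed above.
# All inputs are hypothetical placeholders for metered values.

tokens_processed = 2.5e12     # tokens served or trained over the period
it_energy_kwh = 1.8e6         # energy metered at the computing equipment
total_facility_kwh = 2.2e6    # total facility energy over the same period
water_liters = 3.5e6          # cooling water use over the same period

tokens_per_kwh = tokens_processed / it_energy_kwh   # "MPG" for AI work
pue = total_facility_kwh / it_energy_kwh            # facility overhead
wue = water_liters / it_energy_kwh                  # liters per IT kWh

print(f"Tokens per kWh: {tokens_per_kwh:,.0f}")
print(f"PUE: {pue:.2f}  WUE: {wue:.2f} L/kWh")
```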
FESI could deploy sliding-scale innovation prizes, based on data center size, for new facilities that demonstrate leading-edge AI data center “MPG.” These could be based on efficiency targets for tokens per kilowatt-hour, FLOPS per kilowatt-hour, top-performing PUE, or other metrics of energy efficiency. Similar prizes could be provided for water use efficiency, within different classes of cooling technologies, for facilities that exceed best-in-class performance. These prizes could be modeled after the FFAR Egg-Tech Prize run by USDA’s agency-related foundation, a program that was easy to administer and has had great success. A secondary benefit of an efficiency innovation prize is continuous competition for improvement, and open information about best-in-class data center facilities.

Fig. 1. Power Use Effectiveness (PUE) and Water Use Effectiveness (WUE) values for data centers. Source: LBNL 2024
Recommendation 3. Create Prizes to Maximize Transmission Throughput and Upgrade Grid Infrastructure
FESI could award prizes for rapid deployment of reconductoring, new transmission, or grid-enhancing technologies that increase transmission capacity for any project in DOE’s Coordinated Interagency Authorizations and Permit Program. Similarly, FESI could award prizes for utilities to upgrade local distribution infrastructure beyond the direct needs of the project in order to reduce future electricity rate cases, which will keep electricity costs affordable for residential customers. The Department of Energy already has authority to finance up to $2.5 billion through the Transmission Facilitation Program, a revolving fund administered by the Grid Deployment Office (GDO) that helps support transmission infrastructure. These funds could be used for public-private partnerships in a national interest electric transmission corridor where upgrades are necessary to accommodate an increase in electricity demand across more than one state or transmission planning region.
Recommendation 4. Develop Prizes That Reward Flexibility and End-Use Efficiency Investments
Flexibility in how and when data centers use electricity can meaningfully reduce the stress on the grid. FESI should award prizes to data centers that demonstrate best-in-class flexibility through smart controls and operational improvements. Prizes could also be awarded to utilities hosting data centers that reduce summer and winter peak loads in the local service territory. Prizes for utilities that meet home weatherization targets and deploy virtual power plants could help reduce costs and grid stress in local communities hosting AI data centers.
Conclusion
The U.S. is facing the risk of electricity demand outstripping supplies in many parts of the country, which would be severely detrimental to people’s lives, to the economy, to the environment, and to national security. “Smart AI Fast Lanes” is a policy and investment framework that can rapidly increase clean energy supply, infrastructure, and demand management capabilities.
It is imperative that the U.S. address the growing demand from AI and data centers so that the U.S. remains on the cutting edge of innovation in this important sector. How the U.S. approaches and solves the challenge of new demand from AI is a broader test of how the country prepares its infrastructure for increased electrification of vehicles, buildings, and manufacturing, as well as how it addresses both carbon pollution and the impacts of climate change. The “Smart AI Fast Lanes” framework and FESI-run prizes will enable U.S. competitiveness in AI, keep energy costs affordable, reduce pollution, and prepare the country for new opportunities.
This memo is part of our AI & Energy Policy Sprint, a policy project to shape U.S. policy at the critical intersection of AI and energy. Read more about the Policy Sprint and check out the other memos here.
A Holistic Framework for Measuring and Reporting AI’s Impacts to Build Public Trust and Advance AI
As AI becomes more capable and integrated throughout the United States economy, its growing demand for energy, water, land, and raw materials is driving significant economic and environmental costs, from increased air pollution to higher costs for ratepayers. A recent report projects that data centers could consume up to 12% of U.S. electricity by 2028, underscoring the urgent need to assess the tradeoffs of continued expansion. To craft effective, sustainable resource policies, we need clear standards for estimating data centers’ true energy needs and for measuring and reporting the specific AI applications driving their resource consumption. Local and state-level bills calling for more oversight of utility rates and impacts to ratepayers have received bipartisan support, and this proposal builds on that momentum.
In this memo, we draw on research proposing a holistic evaluation framework for characterizing AI’s environmental impacts, which establishes three categories of impacts arising from AI: (1) Computing-related impacts; (2) Immediate application impacts; and (3) System-level impacts. Concerns around AI’s computing-related impacts, e.g. energy and water use due to AI data centers and hardware manufacturing, have become widely known, with corresponding policy starting to be put into place. However, AI’s immediate application and system-level impacts, which arise from the specific use cases to which AI is applied and the broader socio-economic shifts resulting from its use, remain poorly understood, despite their greater potential for societal benefit or harm.
To ensure that policymakers have visibility into the full range of AI’s environmental impacts, we recommend that the National Institute of Standards and Technology (NIST) oversee the creation of frameworks to measure those impacts. The frameworks should rely on quantitative measurements of AI’s computing and application-related impacts and on qualitative data drawn from engagements with the stakeholders most affected by the construction of data centers. NIST should produce these frameworks through convenings that include academic researchers, corporate governance personnel, developers, utility companies, vendors, and data center owners, in addition to civil society organizations. Participatory workshops will yield new guidelines, tools, methods, protocols, and best practices to facilitate the evolution of industry standards for measuring the social costs of AI’s energy infrastructures.
Challenge and Opportunity
Resource consumption associated with AI infrastructure is expanding quickly, and this has negative impacts, including asthma from air pollution associated with diesel backup generators, noise pollution, light pollution, excessive water and land use, and financial impacts to ratepayers. A lack of transparency regarding these outcomes, and of public participation to minimize them, risks losing the public’s trust, which in turn will inhibit the beneficial uses of AI. Despite a huge amount of capital expenditure and massive forecasted growth in power consumption, there remains a lack of transparency and scientific consensus around the measurement of AI’s environmental impacts with respect to data centers and their related negative externalities.
A holistic evaluation framework for assessing AI’s broader impacts requires empirical evidence, both qualitative and quantitative, to influence future policy decisions and establish more responsible, strategic technology development. Focusing narrowly on carbon emissions or energy consumption arising from AI’s computing related impacts is not sufficient. Measuring AI’s application and system-level impacts will help policymakers consider multiple data streams, including electricity transmission, water systems and land use in tandem with downstream economic and health impacts.
Regulatory and technical attempts so far to develop scientific consensus and international standards around the measurement of AI’s environmental impacts have focused on documenting AI’s computing-related impacts, such as the energy use, water consumption, and carbon emissions required to build and use AI. Measuring and mitigating AI’s computing-related impacts is necessary, and it has received attention from policymakers (e.g. the introduction of the AI Environmental Impacts Act of 2024 in the U.S., provisions for environmental impacts of general-purpose AI in the EU AI Act, and data center sustainability targets in the German Energy Efficiency Act). However, research by Kaack et al. (2022) highlights that impacts extend beyond computing. AI’s application impacts, which arise from the specific use cases for which AI is deployed (e.g. AI-enabled emissions from applying AI to oil and gas drilling), have a much greater potential scope for positive or negative impacts than AI’s computing impacts alone, depending on how AI is used in practice. Finally, AI’s system-level impacts, which include even broader, cascading social and economic impacts associated with AI energy infrastructures, such as increased pressure on local utility infrastructure leading to increased costs to ratepayers, or health impacts to local communities due to increased air pollution, have the greatest potential for positive or negative impacts while being the most challenging to measure and predict. See the figure below for an overview.

Figure from Kaack et al. (2022). Effectively understanding and shaping AI’s impacts will require going beyond impacts arising from computing alone, and requires consideration and measurement of impacts arising from AI’s uses (e.g. in optimizing power systems or agriculture) and from how AI’s deployment throughout the economy leads to broader systemic shifts, such as changes in consumer behavior.
Effective policy recommendations require more standardized measurement practices, a point raised by the Government Accountability Office’s recent report on AI’s human and environmental effects, which explicitly calls for increased corporate transparency and innovation in technical methods for improved data collection and reporting. But data collection should also include multi-stakeholder engagement to ensure that evaluation frameworks are holistic and meet the needs of specific localities, including state and local government officials, businesses, utilities, and ratepayers. Furthermore, while states and municipalities are introducing bills calling for more data transparency and responsibility, including in California, Indiana, Oregon, and Virginia, the lack of federal policy means that data center owners may move their operations to states that have fewer protections in place and similar levels of existing energy and data transmission infrastructure.
States are also grappling with the potential economic costs of data center expansion. Policy Matters Ohio found that tax breaks for data center owners are hurting tax revenue streams that should be used to fund public services. In Michigan, tax breaks for data centers are increasing the cost of water and power for the public while undermining the state’s climate goals. Some Georgia Republicans have stated that data center companies should “pay their way.” While there are arguments that data centers can provide useful infrastructure, connectivity, and even revenue for localities, a recent report shows that at least ten states each lost over $100 million a year in revenue to data centers because of tax breaks. The federal government can help create standards that allow stakeholders to balance the potential costs and benefits of data centers and related energy infrastructure. There is now an urgent need to increase transparency and accountability through multi-stakeholder engagement, maximizing economic benefits while reducing waste.
Despite the high economic and policy stakes, critical data needed to assess the full impacts (both costs and benefits) of AI and data center expansion remains fragmented, inconsistent, or entirely unavailable. For example, researchers have found that state-level subsidies for data center expansion may have negative impacts on state and local budgets, but these data have not been collected and analyzed across states because not all states publicly release data about data center subsidies. Other impacts, such as the use of agricultural land or public parks for transmission lines and data center siting, must be studied at a local and state level, and the various social repercussions require engagement with the communities who are likely to be affected. Similarly, estimates of the economic upside of AI vary widely; for example, the estimated increase in U.S. labor productivity due to AI adoption ranges from 0.9% to 15%, in large part because of a lack of relevant data on AI uses and their economic outcomes that could inform modeling assumptions.
Data centers are highly geographically clustered in the United States, more so than other industrial facilities such as steel plants, coal mines, factories, and power plants (Fig. 4.12, IEA World Energy Outlook 2024). This means that certain states and counties are experiencing disproportionate burdens associated with data center expansion. These burdens have led to calls for data center moratoriums or for the cessation of other energy development, including in states like Indiana. Improved measurement and transparency can help planners avoid overly burdensome concentrations of data center infrastructure, reducing local opposition.
With a rush to build new data center infrastructure, states and localities must also face another concern: overbuilding. For example, Microsoft recently put a hold on parts of its data center contract in Wisconsin and paused another in central Ohio, along with contracts in several other locations across the United States and internationally. These situations often stem from inaccurate demand forecasting, prompting utilities to undertake costly planning and infrastructure development that ultimately goes unused. With better measurement and transparency, policymakers will have more tools to prepare for future demands, avoiding the negative social and economic impacts of infrastructure projects that are started but never completed.
While there have been significant developments in measuring the direct, computing-related impacts of AI data centers, public participation is needed to fully capture many of their indirect impacts. Data centers can be constructed so they are more beneficial to communities while mitigating their negative impacts, e.g. by recycling data center heat, and they can also be constructed to be more flexible by not using grid power during peak times. However, this requires collaborative innovation and cross-sector translation, informed by relevant data.
Plan of Action
Recommendation 1. Develop a database of AI uses and framework for reporting AI’s immediate applications in order to understand the drivers of environmental impacts.
The first step towards informed decision-making around AI’s social and environmental impacts is understanding what AI applications are actually driving data center resource consumption. This will allow specific deployments of AI systems to be linked upstream to compute-related impacts arising from their resource intensity, and downstream to impacts arising from their application, enabling estimation of immediate application impacts.
The AI company Anthropic demonstrated a proof of concept categorizing queries to its Claude language model under the O*NET database of occupations. However, O*NET was developed to categorize job types and tasks with respect to human workers, which does not exactly align with current and potential uses of AI. To address this, we recommend that NIST work with relevant collaborators such as the U.S. Department of Labor (responsible for developing and maintaining the O*NET database) to develop a database of AI uses and applications, similar to and building off of O*NET, along with guidelines and infrastructure for reporting data center resource consumption corresponding to those uses. These data could then be used to identify the particular AI tasks that are key drivers of resource consumption.
Any entity deploying a public-facing AI model (that is, one that can produce outputs and/or receive inputs from outside its local network) should be able to easily document and report its use case(s) within the NIST framework. A centralized database will allow for collation of relevant data across multiple stakeholders including government entities, private firms, and nonprofit organizations.
Gathering data of this nature may require the reporting entity to perform analyses of sensitive user data, such as categorizing individual user queries to an AI model. However, data is to be reported in aggregate percentages with respect to use categories without attribution to or listing of individual users or queries. This type of analysis and data reporting is well within the scope of existing, commonplace data analysis practices. As with existing AI products that rely on such analyses, reporting entities are responsible for performing that analysis in a way that appropriately safeguards user privacy and data protection in accordance with existing regulations and norms.
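As a minimal sketch of this kind of privacy-preserving aggregate reporting, the example below maps categorized queries to percentage shares; the category labels and counts are invented, and a real taxonomy would come from the NIST-maintained database described above.

```python
# Minimal sketch of privacy-preserving aggregate reporting: individual queries
# are mapped to use categories, and only category percentages are reported.
# Category labels and counts are hypothetical; a real taxonomy would come from
# the NIST-maintained database of AI uses proposed above.
from collections import Counter

def aggregate_report(query_categories: list[str]) -> dict[str, float]:
    """Return the share of queries per use category, with no individual queries."""
    counts = Counter(query_categories)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

# Example: categorized queries from some reporting period (labels are invented).
sample = (["software development"] * 4200 +
          ["document drafting"] * 3100 +
          ["customer support"] * 1900 +
          ["data analysis"] * 800)
print(aggregate_report(sample))
# -> {'software development': 42.0, 'document drafting': 31.0,
#     'customer support': 19.0, 'data analysis': 8.0}
```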
Recommendation 2. NIST should create an independent consortium to develop a system-level evaluation framework for AI’s environmental impacts, while embedding robust public participation in every stage of the work.
Currently, the social costs of AI’s system-level impacts—the broader social and economic implications arising from AI’s development and deployment—are not being measured or reported in any systematic way. These impacts fall heaviest on the local communities that host the data centers powering AI: the financial burden on ratepayers who share utility infrastructure, the health effects of pollutants from backup generators, the water and land consumed by new facilities, and the wider economic costs or benefits of data-center siting. Without transparent metrics and genuine community input, policymakers cannot balance the benefits of AI innovation against its local and regional burdens. Building public trust through public participation is key when it comes to ensuring United States energy dominance and national security interests in AI innovation, themes emphasized in policy documents produced by the first and second Trump administrations.
To develop evaluation frameworks in a way that is both scientifically rigorous and broadly trusted, NIST should stand up an independent consortium via a Cooperative Research and Development Agreement (CRADA). A CRADA allows NIST to collaborate rapidly with non-federal partners while remaining outside the scope of the Federal Advisory Committee Act (FACA), and has been used, for example, to convene the NIST AI Safety Institute Consortium. Membership will include academic researchers, utility companies and grid operators, data-center owners and vendors, state, local, Tribal, and territorial officials, technologists, civil-society organizations, and frontline community groups.
To ensure robust public engagement, the consortium should consult closely with FERC’s Office of Public Participation (OPP)—drawing on OPP’s expertise in plain-language outreach and community listening sessions—and with other federal entities that have deep experience in community engagement on energy and environmental issues. Drawing on these partners’ methods, the consortium will convene participatory workshops and listening sessions in regions with high data-center concentration—Northern Virginia, Silicon Valley, Eastern Oregon, and the Dallas–Fort Worth metroplex—while also making use of online comment portals to gather nationwide feedback.
Guided by the insights from these engagements, the consortium will produce a comprehensive evaluation framework that captures metrics falling outside the scope of direct emissions alone. These system-level metrics could encompass (1) the number, type, and duration of jobs created; (2) the effects of tax subsidies on local economies and public services; (3) the placement of transmission lines and associated repercussions for housing, public parks, and agriculture; (4) the use of eminent domain for data-center construction; (5) water-use intensity and competing local demands; and (6) public-health impacts from air, light, and noise pollution. NIST will integrate these metrics into standardized benchmarks and guidance.
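One possible way to standardize such reporting is a simple structured record per site, as sketched below; the field names and example values are illustrative and are not the consortium’s benchmark definitions.

```python
# Illustrative schema for a per-site system-level impact record covering the
# metric categories listed above. Field names and example values are
# hypothetical, not the consortium's final benchmark definitions.
from dataclasses import dataclass

@dataclass
class SystemLevelImpactRecord:
    site_id: str
    permanent_jobs: int
    construction_jobs: int
    construction_months: int
    annual_tax_subsidy_usd: float
    transmission_miles_added: float
    eminent_domain_parcels: int
    water_use_megaliters_per_year: float
    generator_runtime_hours_per_year: float  # proxy for air and noise impacts

example = SystemLevelImpactRecord(
    site_id="example-county-001",
    permanent_jobs=90,
    construction_jobs=1200,
    construction_months=18,
    annual_tax_subsidy_usd=25_000_000,
    transmission_miles_added=14.5,
    eminent_domain_parcels=3,
    water_use_megaliters_per_year=450.0,
    generator_runtime_hours_per_year=120.0,
)
print(example)
```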
Consortium members will attend public meetings, engage directly with community organizations, deliver accessible presentations, and create plain-language explainers so that non-experts can meaningfully influence the framework’s design and application. The group will also develop new guidelines, tools, methods, protocols, and best practices to facilitate industry uptake and to evolve measurement standards as technology and infrastructure grow.
We estimate a cost of approximately $5 million over two years to complete the work outlined in recommendations 1 and 2, covering staff time, travel to at least twelve data-center or energy-infrastructure sites across the United States, participant honoraria, and research materials.
Recommendation 3. Mandate regular measurement and reporting on relevant metrics by data center operators.
Voluntary reporting, e.g. via corporate Environmental, Social, and Governance (ESG) reports, is the status quo, but it has so far been insufficient for gathering the necessary data. For example, while the technology firm OpenAI, best known for its highly popular ChatGPT generative AI model, holds a significant share of the search market and likely a corresponding share of the environmental and social impacts arising from the data centers powering its products, OpenAI chooses not to publish ESG reports or data in any other format regarding its energy consumption or greenhouse gas (GHG) emissions. In order to collect sufficient data at the appropriate level of detail, reporting must be mandated at the local, state, or federal level. At the state level, California’s Climate Corporate Data Accountability Act (SB 253, SB 219) requires that large companies operating within the state report their GHG emissions in accordance with the GHG Protocol, administered by the California Air Resources Board (CARB).
At the federal level, the EU’s Corporate Sustainability Reporting Directive (CSRD), which requires firms operating within the EU to report a wide variety of data related to environmental sustainability and social governance, could serve as a model for regulating companies operating within the U.S. The Environmental Protection Agency’s (EPA) GHG Reporting Program already requires emissions reporting by operators and suppliers associated with large GHG emissions sources, and the Energy Information Administration (EIA) collects detailed data on electricity generation and fuel consumption through Forms EIA-860 and EIA-923. With respect to data centers specifically, the Department of Energy (DOE) could require that developers who are granted rights to build AI data center infrastructure on public lands perform the relevant measurement and reporting; more broadly, reporting could be a requirement to qualify for any local, state, or federal funding or assistance provided to support the buildout of U.S. AI infrastructure.
Recommendation 4. Incorporate measurements of social cost into AI energy and infrastructure forecasting and planning.
There is a huge range in estimates of future data center energy use, largely driven by uncertainty around the nature of demands from AI. This uncertainty stems in part from a lack of historical and current data on which AI use cases are most energy intensive and how those workloads are evolving over time. It also remains unclear to what extent challenges in bringing new resources online, such as hardware production limits or bottlenecks in permitting, will influence growth rates. These uncertainties are even more significant when it comes to the holistic impacts (i.e. those beyond direct energy consumption) described above, making it challenging to balance costs and benefits when planning for future demands from AI.
To address these issues, accurate forecasting of demand for energy, water, and other limited resources must incorporate data gathered through the holistic measurement frameworks described above. Further, forecasts of broader system-level impacts must be incorporated into decision-making around investment in AI infrastructure. Forecasting needs to go beyond energy use alone: models should predict energy and related infrastructure needs for transmission, the social cost of carbon in terms of pollution, the effects on ratepayers, and the energy demands of chip production.
We recommend that agencies already responsible for energy-demand forecasting, such as the Energy Information Administration at the Department of Energy, integrate data on the AI workloads driving data-center electricity use into their forecasting models, in line with the NIST frameworks developed above. Agencies specializing in social impacts, such as the Department of Health and Human Services in the case of health impacts, should model those impacts and communicate them to EIA and DOE for planning purposes. In parallel, the Federal Energy Regulatory Commission (FERC) should update its new rule on long-term regional transmission planning to explicitly include consideration of the social costs corresponding to energy supply, demand, and infrastructure retirement/buildout across different scenarios.
Recommendation 5. Transparently use federal, state, and local incentive programs to reward data-center projects that deliver concrete community benefits.
Incentive programs should be tied to holistic estimates of costs and benefits collected under the frameworks above, not to promises alone. When considering incentive programs, policymakers should ask questions such as: How many jobs do data centers create, how long do those jobs last, and do they go to local residents? How does the tax revenue data centers generate for municipalities or states compare with the subsidies their owners receive? What are the social impacts of using agricultural land or public parks for data center construction or transmission lines? What are the impacts on air quality and other public health concerns? Do data centers deliver benefits such as load flexibility and sharing of waste heat?
Grid operators (Regional Transmission Organizations [RTOs] and Independent System Operators [ISOs]) can leverage interconnection queues to incentivize data center operators to demonstrate that they have sufficiently considered the impacts on local communities when proposing a new site. FERC recently approved reforms to the processing of interconnection requests, allowing RTOs to implement a "first-ready, first-served" approach rather than a first-come, first-served approach, wherein proposed projects can be fast-tracked based on their readiness. RTOs could use a similar approach to fast-track proposals that include a clear plan for benefiting local communities (e.g., through load flexibility, heat reuse, and clean energy commitments), grounded in careful impact assessment.
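As a rough illustration of how such a queue could be ordered, the sketch below scores hypothetical interconnection requests on readiness and community-benefit criteria and sorts the queue accordingly. The criteria, weights, and project names are assumptions for illustration, not FERC or RTO rules.

```python
# Illustrative "first-ready, first-served" queue ordering that also rewards
# community-benefit plans. Weights and scores are assumptions, not RTO rules.

requests = [
    # (project, readiness score 0-1, community-benefit score 0-1)
    ("campus_a", 0.9, 0.2),
    ("campus_b", 0.7, 0.9),   # e.g., committed to load flexibility and heat reuse
    ("campus_c", 0.5, 0.5),
]

READINESS_WEIGHT = 0.6
BENEFIT_WEIGHT = 0.4

def priority(req: tuple[str, float, float]) -> float:
    """Combine readiness and community-benefit scores into a single ranking value."""
    _, readiness, benefit = req
    return READINESS_WEIGHT * readiness + BENEFIT_WEIGHT * benefit

for req in sorted(requests, key=priority, reverse=True):
    print(f"{req[0]}: priority {priority(req):.2f}")
```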
States with significant existing infrastructure could also introduce state-level incentives. Such incentives could be designed in collaboration with the National Governors Association, which has been working to balance AI-driven energy needs with state climate goals.
Conclusion
Data centers have an undeniable impact on energy infrastructure and on the communities living close to them. This impact will continue to grow alongside AI infrastructure investment, which is expected to skyrocket. It is possible to shape a future where AI infrastructure is developed sustainably and in a way that responds to the needs of local communities, but more work is needed to collect the data that should inform government decision-making. We have described a framework for holistically evaluating the potential costs and benefits of AI data centers and shaping AI infrastructure buildout based on those tradeoffs. This framework includes establishing standards for measuring and reporting AI's impacts, eliciting public participation from impacted communities, and putting the gathered data into action to enable sustainable AI development.
This memo is part of our AI & Energy Policy Sprint, a policy project to shape U.S. policy at the critical intersection of AI and energy. Read more about the Policy Sprint and check out the other memos here.
Data centers are highly spatially concentrated largely due to reliance on existing energy and data transmission infrastructure; it is more cost-effective to continue building where infrastructure already exists, rather than starting fresh in a new region. As long as the cost of performing the proposed impact assessment and reporting in established regions is less than that of the additional overhead of moving to a new region, data center operators are likely to comply with regulations in order to stay in regions where the sector is established.
Spatial concentration of data centers also arises because workloads with high data transmission requirements, such as media streaming and online gaming, need close physical proximity to users to reduce latency. For AI to be integrated into these real-time services, data center operators will continue to need a presence in existing geographic regions, barring significant advances in data transmission efficiency and infrastructure.
bad for national security and economic growth. So is infrastructure growth that harms the local communities in which it occurs.
Researchers from Good Jobs First have found that many states are in fact losing tax revenue to data center expansion: “At least 10 states already lose more than $100 million per year in tax revenue to data centers…” More data is needed to determine whether data center construction projects coupled with tax incentives are economically advantageous investments on the part of local and state governments.
The DOE is opening up federal lands at 16 locations to data center construction projects in the name of strengthening America’s energy dominance and ensuring America’s role in AI innovation. But national security assessments of data center expansion should also account for the impacts on communities who live close to data centers and related infrastructure.
Data centers themselves do not automatically ensure greater national security, especially because their critical minerals and hardware components depend on international trade and manufacturing. At present, the United States is not equipped to supply the critical minerals and other materials needed to produce data center hardware, including GPUs and other components.
Federal policy can ensure that states and counties do not become overburdened by data center growth and can help different regions benefit from the potential economic and social rewards of data center construction.
Developing federal standards around transparency helps individual states plan for data center construction, allowing for a high-level, comparative look at the energy demand associated with specific AI use cases. Federal intervention is also important because data centers in one state may depend on transmission lines running through a neighboring state, producing impacts that cross jurisdictions; a national-level standard is therefore needed.
Producing reliable cost-benefit estimates is often extremely challenging. For example, while municipalities often expect economic benefits from data centers and assume construction will yield more local jobs, subsidies and short-term construction jobs do not necessarily translate into economic gains.
To improve decision makers’ ability to do quality cost-benefit analysis, the independent consortium described in Recommendation 2 will examine both qualitative and quantitative data, including permitting histories, transmission plans, land use and eminent domain cases, subsidies, jobs numbers, and health or quality-of-life impacts across various sites over time. NIST will help develop standards informed by this data collection, which can then be used in future planning processes.
Further, there is customer interest in knowing that their AI is sourced from firms implementing sustainable and socially responsible practices. These efforts can be highlighted in marketing communications and reported as socially and environmentally responsible practices in ESG reports. This serves as an additional incentive for some data center operators to participate in voluntary reporting and maintain operations in locations with increased regulation.
Advance AI with Cleaner Air and Healthier Outcomes
Artificial intelligence (AI) is transforming industries, driving innovation, and tackling some of the world’s most pressing challenges. Yet while AI has tremendous potential to advance public health, such as supporting epidemiological research and optimizing healthcare resource allocation, the public health burden of AI due to its contribution to air pollutant emissions has been under-examined. Energy-intensive data centers, often paired with diesel backup generators, are rapidly expanding and degrading air quality through emissions of air pollutants. These emissions exacerbate or cause various adverse health outcomes, from asthma to heart attacks and lung cancer, especially among young children and the elderly. Without sufficient clean and stable energy sources, the annual public health burden from data centers in the United States is projected to reach up to $20 billion by 2030, with households in some communities located near power plants supplying data centers, such as those in Mason County, WV, facing over 200 times greater burdens than others.
Federal, state, and local policymakers should act to accelerate the adoption of cleaner and more stable energy sources and to manage AI’s expansion in a way that aligns innovation with human well-being, advancing the United States’ leadership in AI while ensuring clean air and healthy communities.
Challenge and Opportunity
Forty-six percent of people in the United States breathe unhealthy levels of air pollution. Ambient air pollution, especially fine particulate matter (PM2.5), is linked to 200,000 deaths each year in the United States. Poor air quality remains the nation’s fifth highest mortality risk factor, resulting in a wide range of immediate and severe health issues that include respiratory diseases, cardiovascular conditions, and premature deaths.
Data centers consume vast amounts of electricity to power and cool the servers running AI models and other computing workloads. According to the Lawrence Berkeley National Laboratory, the growing demand for AI is projected to increase data centers’ share of the nation’s total electricity consumption to as much as 12% by 2028, up from 4.4% in 2023. Without enough sustainable energy sources like nuclear power, the rapid growth of energy-intensive data centers is likely to exacerbate ambient air pollution and its associated public health impacts.
Data centers typically rely on diesel backup generators for uninterrupted operation during power outages. While the total operation time for routine maintenance of backup generators is limited, these generators can create short-term spikes in PM2.5, NOx, and SO2 that go beyond the baseline environmental and health impacts associated with data center electricity consumption. For example, diesel generators emit 200–600 times more NOx than natural gas-fired power plants per unit of electricity produced. Even brief exposure to high-level NOx can aggravate respiratory symptoms and hospitalizations. A recent report to the Governor and General Assembly of Virginia found that backup generators at data centers emitted approximately 7% of the total permitted pollution levels for these generators in 2023. Based on the Environmental Protection Agency’s COBRA modeling tool, the public health cost of these emissions in Virginia is estimated at approximately $200 million, with health impacts extending to neighboring states and reaching as far as Florida. In Memphis, Tennessee, a set of temporary gas turbines powering a large AI data center, which has not undergone a complete permitting process, is estimated to emit up to 2,000 tons of NOx annually. This has raised significant health concerns among local residents and could result in a total public health burden of $160 million annually. These public health concerns coincide with a paradigm shift that favors dirty energy and potentially delays sustainability goals.
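The arithmetic behind reduced-form estimates like these is straightforward, even though tools such as EPA’s COBRA rely on county-level source-receptor relationships and health impact functions under the hood. The sketch below shows the basic emissions-times-damage calculation; the per-ton damage values and the example facility are illustrative assumptions, not COBRA outputs.

```python
# Reduced-form public health cost estimate: emissions (tons/yr) x damage ($/ton).
# Damage factors vary widely by pollutant and location; these are illustrative
# assumptions, not values taken from EPA's COBRA model.

DAMAGE_PER_TON = {
    "NOx":   30_000,   # assumed $/ton
    "PM2.5": 250_000,  # assumed $/ton (primary PM is far more damaging per ton)
    "SO2":   60_000,   # assumed $/ton
}

def annual_health_cost(emissions_tons: dict[str, float]) -> float:
    """Monetized annual health damages for a facility's reported emissions."""
    return sum(DAMAGE_PER_TON[p] * tons for p, tons in emissions_tons.items())

# Hypothetical generator or turbine installation emitting 2,000 tons of NOx per
# year, plus smaller amounts of primary PM2.5 and SO2.
example = {"NOx": 2_000, "PM2.5": 40, "SO2": 100}
print(f"~${annual_health_cost(example) / 1e6:.0f}M per year in health damages")
```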
In 2023 alone, air pollution attributed to data centers in the United States resulted in an estimated $5 billion in health-related damages, a figure projected to rise up to $20 billion annually by 2030. This projected cost reflects an estimated 1,300 premature deaths in the United States per year by the end of the decade. While communities near data centers and power plants bear the greatest burden, with some households facing over 200 times greater impacts than others, the health impacts of these facilities extend to communities across the nation. The widespread health impacts of data centers further compound the already uneven distribution of environmental costs and water resource stresses imposed by AI data centers across the country.
While essential for mitigating air pollution and public health risks, transitioning AI data centers to cleaner backup fuels and stable energy sources such as nuclear power presents significant implementation hurdles, including lengthy permitting processes. Clean backup generators that match the reliability of diesel remain limited in real-world applications, and multiple key issues must be addressed to fully transition to cleaner and more stable energy.
While it is clear that data centers pose public health risks, comprehensive evaluations of data center air pollution and its health impacts, which are essential to grasp the full extent of those harms, are largely absent from current practice. Washington State conducted a health risk assessment of diesel particulate pollution from multiple data centers in the Quincy area in 2020, but most states lack similar evaluations for either existing or newly proposed data centers. To safeguard public health, it is essential to establish transparency frameworks, reporting standards, and compliance requirements for data centers, enabling the assessment of PM₂.₅, NOₓ, SO₂, and other harmful air pollutants, as well as their short- and long-term health impacts. These mechanisms would also equip state and local governments to make informed decisions about where to site AI data center facilities, balancing technological progress with the protection of community health nationwide.
Finally, limited public awareness, insufficient educational outreach, and a lack of comprehensive decision-making processes further obscure the risks data centers pose to public health. Without robust transparency and community engagement mechanisms, communities housing data center facilities are left with little influence or recourse over developments that may significantly affect their health and environment.
Plan of Action
The United States can build AI systems that not only drive innovation but also promote human well-being, delivering lasting health benefits for generations to come. Federal, state, and local policymakers should adopt the multi-pronged approach outlined below to ensure that data center expansion proceeds with minimal air pollution and public health impact.
Federal-level Action
Federal agencies play a crucial role in establishing national standards, coordinating cross-state efforts, and leveraging federal resources to model responsible public health stewardship.
Recommendation 1. Incorporate Public Health Benefits to Accelerate Clean and Stable Energy Adoption for AI Data Centers
Congress should direct relevant federal agencies, including the Department of Energy (DOE), the Nuclear Regulatory Commission (NRC), and the Environmental Protection Agency (EPA), to integrate air pollution reduction and the associated public health benefits into efforts to streamline the permitting process for more sustainable energy sources, such as nuclear power, for AI data centers. Simultaneously, federal resources should be expanded to support research, development, and pilot deployment of alternative low-emission fuels for backup generators while ensuring high reliability.
- Public Health Benefit Quantification. Direct the EPA, in coordination with DOE and public health agencies, to develop standardized methods for estimating the public health benefits (e.g., avoided premature deaths, hospital visits, and economic burden) of using cleaner and more stable energy sources for AI data centers. Require lifecycle emissions modeling of energy sources and translate avoided emissions into quantitative health benefits using established tools such as the EPA’s BenMAP; a simplified version of this calculation is sketched after this list. This should:
- Include modeling of air pollution exposure and health outcomes (e.g., using tools like EPA’s COBRA)
- Incorporate cumulative risks from regional electricity generation and local backup generator emissions
- Account for spatial disparities and vulnerable populations (e.g., children, the elderly, and disadvantaged communities)
- Evaluate both short-term (e.g., generator spikes) and long-term (e.g., chronic exposure) health impacts
- Preferential Permitting. Instruct the DOE to prioritize and streamline permitting for cleaner energy projects (e.g., small modular reactors, advanced geothermal) that demonstrate significant air pollution reduction and health benefits in supporting AI data center infrastructures. Develop a Clean AI Permitting Framework that allows project applicants to submit health benefit assessments as part of the permitting package to justify accelerated review timelines.
- Support for Cleaner Backup Systems. Expand DOE and EPA R&D programs to support pilot projects and commercialization pathways for alternative backup generator technologies, including hydrogen combustion systems and long-duration battery storage. Provide tax credits or grants for early adopters of non-diesel backup technologies in AI-related data center facilities.
- Federal Guidance & Training. Provide technical assistance to state and local agencies to implement these quantification methods and standards, and fund capacity-building efforts in environmental health departments.
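For readers unfamiliar with how tools like BenMAP turn pollution changes into monetized benefits, the sketch below applies a log-linear health impact function of the kind used in benefits-mapping tools, then values avoided deaths with a value of statistical life. The coefficient, baseline mortality rate, dollar value, and example region are illustrative assumptions, not the EPA’s official inputs.

```python
import math

# Log-linear health impact function of the kind used in benefits-mapping tools:
#   avoided_deaths = baseline_rate * (1 - exp(-beta * delta_pm25)) * population
# The specific numbers below are illustrative assumptions, not EPA defaults.

BETA = 0.0058            # assumed PM2.5 all-cause mortality coefficient (per ug/m3)
BASELINE_RATE = 0.008    # assumed annual all-cause mortality rate (deaths/person/yr)
VSL = 11_000_000         # assumed value of a statistical life, dollars

def avoided_mortality_benefit(delta_pm25_ugm3: float, population: int) -> tuple[float, float]:
    """Avoided premature deaths and their monetized value for a PM2.5 reduction."""
    avoided_deaths = BASELINE_RATE * (1 - math.exp(-BETA * delta_pm25_ugm3)) * population
    return avoided_deaths, avoided_deaths * VSL

# Hypothetical example: cleaner backup power and supply reduce annual-average
# PM2.5 by 0.2 ug/m3 over a region of 1.5 million people.
deaths, dollars = avoided_mortality_benefit(0.2, 1_500_000)
print(f"~{deaths:.0f} avoided deaths, ~${dollars / 1e6:.0f}M per year")
```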
Recommendation 2. Establish a Standardized Emissions Reporting Framework for AI Data Centers
Congress should direct the EPA, in coordination with the National Institute of Standards and Technology (NIST), to develop and implement a standardized reporting framework requiring data centers to publicly disclose their emissions of air pollutants, including PM₂.₅, NOₓ, SO₂, and other hazardous air pollutants associated with backup generators and electricity use.
- Multi-Stakeholder Working Group. Task EPA with convening a multi-stakeholder working group, including representatives from NIST, DOE, state regulators, industry, and public health experts, to define the scope, metrics, and methodologies for emissions reporting.
- Standardization. Develop a federal technical standard (an illustrative facility-level disclosure record is sketched after this list) that specifies:
- Types of air pollutants that should be reported
- Frequency of reporting (e.g., quarterly or annually)
- Facility-specific disclosures (including generator use and power source profiles)
- Geographic resolution of emissions data
- Public access and data transparency protocols
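As an illustration of what a facility-level disclosure under such a standard might contain, the sketch below defines a simple record structure covering the pollutants, reporting period, power source profile, generator use, and geographic resolution named above. The field names, units, and example values are assumptions for discussion, not a proposed EPA or NIST schema.

```python
from dataclasses import dataclass, field

# Illustrative facility-level emissions disclosure record. Field names, units,
# and granularity are assumptions for discussion, not a proposed EPA/NIST schema.

@dataclass
class GeneratorEvent:
    start: str                # ISO 8601 timestamp
    duration_hours: float
    fuel: str                 # e.g., "diesel", "hydrogen", "battery"

@dataclass
class QuarterlyEmissionsReport:
    facility_id: str
    quarter: str                          # e.g., "2026-Q1"
    county_fips: str                      # geographic resolution of the data
    electricity_mwh: float                # metered consumption for the quarter
    power_source_mix: dict[str, float]    # e.g., {"grid": 0.95, "onsite_solar": 0.05}
    pollutants_tons: dict[str, float]     # e.g., {"PM2.5": 0.4, "NOx": 6.1, "SO2": 0.2}
    generator_events: list[GeneratorEvent] = field(default_factory=list)
    public_url: str = ""                  # where the disclosure is published

# Hypothetical example record for a single facility and quarter.
report = QuarterlyEmissionsReport(
    facility_id="EXAMPLE-001",
    quarter="2026-Q1",
    county_fips="51107",
    electricity_mwh=120_000,
    power_source_mix={"grid": 0.95, "onsite_solar": 0.05},
    pollutants_tons={"PM2.5": 0.4, "NOx": 6.1, "SO2": 0.2},
    generator_events=[GeneratorEvent("2026-02-14T09:00:00Z", 1.5, "diesel")],
)
print(report.facility_id, report.pollutants_tons)
```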
State-level Action
Recommendation 1. State environmental and public health departments should conduct a health impact assessment (HIA) before and after data center construction to evaluate discrepancies between anticipated and actual health impacts for existing and planned data center operations. To build and maintain trust, HIA findings, methodologies, and limitations should be made publicly available and accessible to non-technical audiences (including policymakers, local health departments, and community leaders representing impacted residents), thereby enhancing community-informed action and participation. Reports should focus on the disparate impacts between rural and urban communities, with particular attention to overburdened communities that have under-resourced health infrastructure. In addition, states should coordinate HIAs and share findings to address cross-boundary pollution risks, including impacts on nearby communities across state lines, since public health impacts do not stop at jurisdictional borders and the analysis should not either.
Recommendation 2. State public health departments should establish state-funded programs that offer community education forums where affected residents can express their concerns about how data centers impact them. These programs should lead outreach, engage communities, and contribute qualitative analysis to HIAs. Health impact assessments should serve as the basis for informed community engagement.
Recommendation 3. States should incorporate air pollutant emissions related to data centers into their implementation of the National Ambient Air Quality Standards (NAAQS) and the development of State Implementation Plans (SIPs). This ensures that affected areas can meet standards and maintain their attainment statuses. To support this, states should evaluate the adequacy of existing regulatory monitors in capturing emissions related to data centers and determine whether additional monitoring infrastructure is required.
Local-level Action
Recommendation 1. Local governments should revise zoning regulations to include stricter and more explicit health-based protections to prevent data center clustering in already overburdened communities. Additionally, zoning ordinances should address colocation factors and evaluate potential cumulative health impacts. A prominent example is Fairfax County, Virginia, which updated its zoning ordinance in September 2024 to regulate the proximity of data centers to residential areas, require noise pollution studies prior to construction, and establish size thresholds. These updates were shaped through community engagement and input.
Recommendation 2. Local governments should appoint public health experts to zoning boards so that data center placement decisions reflect community health priorities.
Conclusion
While AI can revolutionize industries and improve lives, its energy-intensive nature is also degrading air quality through emissions of air pollutants. To mitigate AI’s growing air pollution and public health risks, comprehensive assessment of AI’s health impacts and a transition of AI data centers to cleaner backup fuels and stable energy sources, such as nuclear power, are essential. By adopting more informed and cleaner AI strategies at the federal, state, and local levels, policymakers can mitigate these harms, promote healthier communities, and ensure AI’s expansion aligns with clean air priorities.
This memo is part of our AI & Energy Policy Sprint, a policy project to shape U.S. policy at the critical intersection of AI and energy. Read more about the Policy Sprint and check out the other memos here.