Establish a Network of Centers of Excellence in Human Nutrition (CEHN) to Overcome the Data Drought in Nutrition Science Research
NIH needs to invest in both the infrastructure and funding to undertake rigorous nutrition clinical trials, so that we can rapidly improve food and make progress on obesity and nutrition-related chronic disease.
The notion that ‘what we eat impacts our health’ has risen to national prominence with the rise of the “Make America Healthy Again” movement, placing nutrition at the center of American politics. This high degree of interest and enthusiasm stands in stark contrast to the limited high-quality data available to inform many key nutrition questions, a result of the limited and waning investment in controlled, experimental research on diet’s impact on health and disease. With heightened public interest and growing societal costs from diet-related diseases (>$250 billion), it is imperative to re-envision a nutrition research ecosystem capable of rapidly producing robust evidence relevant to policymakers, regulators, and the food industry. This begins with establishing a network of clinical research centers capable of undertaking controlled human nutrition intervention studies at scale. Such a network, combined with an enhanced commitment to nutrition research funding, would revolutionize nutrition science and our understanding of how food impacts obesity and metabolic health.
The proposed clinical trial network would be endowed with high-capacity metabolic wards and kitchens, with the ability to oversee long-term stays by large numbers of participants. The network would be capable of deeply phenotyping body composition, metabolic status, clinical risk factors, and the molecular mediators of diet. Such a network could publish numerous rigorous clinical trials per year, testing leading hypotheses in the literature that currently lack meaningful trial evidence. It would produce evidence of direct relevance to policymakers and the food industry, informing how best to make long-overdue progress on reforming the food system and reducing the burden of diet-related diseases like obesity and Type 2 diabetes.
Challenge and Opportunity
While commonly viewed in the modern era as a medical science, nutrition research historically began as an agricultural science. Early researchers sought to define the food composition of common agricultural products and ensure the food system supplied adequate levels of nutrients at affordable prices to meet nutrient requirements. This history firmly established the field of nutrition in universities embedded in the agricultural extension network and funded in large part by the United States Department of Agriculture (USDA) and relevant food industries. It took decades, until the late 1980s and early 1990s, for nutrition’s impact on chronic diseases like obesity and cardiometabolic disease to be taken seriously and viewed through a more medicalized lens. Despite this updated view of nutrition as ostensibly a serious medical science, the study of food has arguably never received the level of attention and resources commensurate with both its importance and the challenges of its rigorous study.
Our understanding of obesity and related chronic diseases has increased dramatically over the past 30 years. Despite improved understanding, many nutrition questions remain. For example, what is the impact of food processing and additives on health? What is the role of factors such as genetics and the microbiome (“Precision Nutrition”) in determining the effect of diet? These and more have emerged as key questions facing the field, policymakers, and industry. Unfortunately, during this same period, the capacity to undertake controlled nutrition interventions, the strongest form of evidence for establishing causal relationships, has atrophied substantially.
Early research examining nutrition and its relationship to chronic diseases (e.g., type of fat and blood cholesterol responses) benefited from the availability of the General Clinical Research Centers (GCRCs). GCRCs were largely extramurally funded clinical research infrastructure that provided the medical and laboratory services, as well as metabolic kitchens and staff funding, to conduct controlled dietary interventions. This model produced evidence that continues to serve as the backbone of existing nutritional recommendations. In the mid-2000s, the GCRC infrastructure was largely defunded and replaced with the Clinical Translational Science Awards (CTSAs). The CTSAs’ funding model is significantly less generous and provides limited if any funds for key infrastructure such as metabolic kitchens, nursing and laboratory services, and registered dietitian staff, all essential for undertaking controlled nutrition research. This model shifts the cost burden from the NIH to the study funder, a price tag that the pharmaceutical and medical device industries can bear but that is simply not met by the food and supplement industry and is beyond the limited research budgets of academia or government. Without public investment, there is simply no way for nutrition science to keep up with other fields of biomedicine, exacerbating a perception that the American medical system ignores preventive measures like nutrition and ensuring that nutrition research is rated as ‘low quality’ in systematic reviews of the evidence.
The consequences of this funding model are strikingly evident, and were readily predicted in two high-profile commentaries mourning the loss of the GCRC model. When the field systematically reviews the data, the evidence from controlled feeding trials on chronic disease risk factors was largely published between the 1980s and 2000s. More recent data are overwhelmingly observational in nature or rely on dietary interventions that educate individuals to consume a specific diet rather than providing food; both types of evidence significantly reduce confidence in results and introduce various biases that downgrade the certainty of evidence. The limited ability to generate high-quality controlled feeding trial data was most evident in the last edition of the Dietary Guidelines Advisory Committee’s report, which conducted a systematic review of ultraprocessed foods (UPFs) and obesity. This review identified only a single, small experimental study, a two-week clinical trial in 20 adults, with the rest of the literature being observational in nature and graded as too limited to draw firm conclusions about UPFs and obesity risk. This state of the literature is the expected reality for all forthcoming questions in the field of nutrition until the field receives a massive infusion of resources. Without funding for infrastructure and research, the situation will worsen, as both the facilities and the investigators trained in this work continue to wither, and academic tenure-track lines are filled instead by areas currently prioritized by funders (e.g., basic science, global health). How can we expect ‘high certainty’ conclusions in the evidence to inform dietary guidelines when we simply don’t fund research capable of producing such evidence?
While the GCRCs were far from perfect, the impact of their defunding on nutrition science over the past two decades is apparent from the quality of evidence on emerging topics and an even cursory look at the faculty at legacy academic nutrition departments. Legislators and policymakers should be alarmed at what the trajectory of the field over the last two decades means for public health.
As we deal with crisis levels of obesity and nutrition-related chronic diseases, we must face the realities of our failure to fund nutrition science seriously over the last two decades, and the data drought that this lack of funding has caused. It is a critical failure of the biomedical research infrastructure in the United States that controlled nutrition interventions have fallen by the wayside while rates of diet-related chronic diseases have only worsened. It is essential for the health of our nation and our economy to reinvest in nutrition research to a degree never before seen in American history, and to produce a state-of-the-art network of clinical trial centers capable of elucidating how food impacts health.
Several key challenges exist to produce a coordinated clinical research center network capable of producing evidence that transforms our understanding of diet’s impact on health:
Challenge 1. Few Existing Research Centers Have the Interdisciplinary Expertise Needed to Tackle Pressing Food and Nutrition Challenges
Both food and health are deeply interdisciplinary in nature, requiring the right mix of expertise across plant and animal agriculture, food science, human nutrition, and various fields of medicine to adequately tackle the pressing nutrition-related challenges facing society. However, the current ‘nutrition’ research landscape of the United States reflects its natural, uncoordinated evolution across diverse agricultural colleges and medical centers.
Any proposed clinical research network needs to bridge these divides and bring together the broad expertise needed to conduct rigorous experimental human nutrition studies in a coordinated fashion. Conquering this divide will require funding to intentionally build out research centers with the appropriate mix of researchers, staff, infrastructure, and equipment necessary to tackle key questions in large cohorts of study participants consuming controlled diets for extended time periods.
Challenge 2. The Study of Food and Nutrition is Intrinsically Challenging
Despite receiving less investment than pharmaceuticals and medical devices, rigorous nutrition science is often more costly to conduct due to its unique methodological burdens. The gold-standard pharmaceutical design, the placebo-controlled, randomized, double-blind trial, is impossible for most nutrition research questions. Many interventions cannot be blinded. Placebos do not exist for foods, necessitating comparisons between active interventions, of which there are many viable options. Foods are complex interventions that serve as vehicles for many bioactive compounds, making it challenging to isolate causal factors within a single study. Researchers must often make zero-sum decisions balancing internal against external validity, trading off rigorous inference against ‘real-world’ application.
Challenge 3. The Study of Food and Nutrition Is Practically Challenging
Historically, controlled trials, including those conducted in GCRC facilities, have been restricted to shorter-term interventions (e.g., 1-4 weeks in duration). These short-term trials are the subject of legitimate critique, both for failing to capture long-term adaptations to diet and for relying on surrogate endpoints, of which there are few with significant prognostic capacity. Observing differences in actual disease endpoints in response to food interventions is ideal, but investment in such studies has historically been limited. Attempts at a definitive ‘National Diet Heart Study’ trial to address diet-cardiovascular disease hypotheses in the 1960s were ultimately not funded beyond pilot trials. These challenges have long been used to justify underinvestment in experimental nutrition research and have exacerbated the field’s reliance on observational data. While the challenges are real, investment and innovation are needed to tackle them rather than continued avoidance.
These challenges presented by investing in a nutrition clinical research center network pale in comparison to the benefits of its successful implementation. We need only look at the current state of the scientific literature on how to modify diets and food composition to prevent obesity to understand the harms of not investing. The opportunities from doing so are many:
Benefit 1. Build Back Trust in the Scientific Process Surrounding Dietary Recommendations
The deep distrust of the scientific process and of the federal government’s dietary recommendations should be impetus alone for investing heavily in rigorous nutrition research. It is essential for the public to see that the government is taking nutrition seriously. Data alone will not fix public trust, but investing in nutrition research, engaging citizens in the process, and ensuring transparency in the conduct of studies and their results will begin to make a dent in the deep distrust that has driven the MAHA movement and many food activists over the past several decades.
Benefit 2. Produce Policy- and Formulator-relevant Data
The atrophying of the clinical research network, limited funding, and the historical separation of expertise in nutrition have led to a situation where we know little about how food influences disease risk, beyond oft-cited links between sodium and blood pressure and saturated fats and LDL-cholesterol. It should be evident from these two long-standing recommendations, which have survived many politicized criticisms, that controlled human intervention research is the critical foundation of rigorous policy.
In two decades, we need to look back and be able to say the same things about the discoveries ultimately made from studying the next generation of topics around food and food processing. Such findings will be critical not only for policymakers but also for the food industry, which has shown a willingness to reformulate products but often lacks the policy guidance and level of evidence needed to do so in an informed manner, leaving it to chase trends rather than science.
Benefit 3. Enhanced Discovery in Emerging Health Research Topics, such as the Microbiome
The potential to rigorously control and manipulate diets to understand their impact on health and disease holds great promise to improve not only public policy and dietary guidance but also our fundamental understanding of human physiology, the gut microbiome, diet-gene interactions, and the impact of environmental chemicals. The previous GCRC network was dismantled prior to numerous technical revolutions in nucleotide (DNA, RNA) sequencing, mass spectrometry, and cell biology, leaving nutrition decades behind other fields of medicine.
Benefit 4. Improved Public Health and Reduced Societal Costs
Ultimately, the funding of a clinical research center network that supports the production of rigorous data on links between diet and disease will address the substantial degree of morbidity and mortality caused by obesity and related chronic conditions. This research can be applied to reduce health risks, improve patient outcomes, and lessen the costly burden of an unhealthy nation.
Plan of Action
Congress must pass legislation that mandates the revival and revolution of experimental human nutrition research through the appropriation of funds to establish a network of Centers of Excellence in Human Nutrition (CEHN) research across the country.
Recommendation 1. Congressional Mandate to Review Historical Human Nutrition Research Funding in America and the GCRCs, and:
- Examine the current landscape of nutrition research happening across the United States;
- Investigate the deficiencies in the current funding models and clinical research infrastructure;
- Identify key existing infrastructure and infrastructure needs;
- Establish a 10 year timeline to coordinate and fund high priority nutrition research, appropriating at least $5,000,000,000 per year to the CEHN (representing 200% of the current NIH investment in nutrition research, spanning basic to clinical and epidemiological studies).
Congress should seek NIH, USDA, university, industry and public input to inform the establishment of the CEHN and the initial rounds of projects it ultimately funds. Within six months, Congress should have a clear roadmap for the CEHN, including participating centers and researchers, feasibility analyses and cost estimates, and three initial approved proposals. At least one proposal should advance existing intramurally funded work on processed foods that has identified several characteristics, including energy density and palatability, as potential drivers of energy intake.
Recommendation 2. Congress Should Mandate that CEHN Establish an Operating Charter that Details a Governing Council for the Network Composed of Multi-sector Stakeholders.
This council will oversee the network’s management and coordination. Specifically, it will:
- Identify key nutrition science stakeholders, including those with perceived contrasting viewpoints, to guide trial design and prioritization efforts;
- Engage non-nutrition science trialists and form partnerships with contract research organizations (CROs) with experience and demonstrated success managing large pharmaceutical trials to ensure innovative and rigorous trial designs are undertaken and CEHN operations are rigorous;
- Identify methods for public involvement in the proposal and funding of studies; and issue quarterly reports to inform the public of progress on studies operating within the CEHN, providing new results from its studies and ways for the public to actively participate in this research.
CEHN should be initiated by Congress. It should also explore novel funding mechanisms that pool resources from the NIH institutes that have historically supported nutrition research (such as the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Heart, Lung, and Blood Institute (NHLBI), the National Institute on Aging (NIA), and the National Institute of Child Health and Human Development (NICHD)), the USDA, the Department of Defense, agricultural commodity boards, and the food industry, to ensure cost-sharing across relevant sectors and a robust, sustainable CEHN. CEHN will ultimately study topics that may produce results perceived as adversarial to the food industry, and its independence should be protected. However, it is clear that America is not going back to a world where the majority of food is produced in the home in resource-scarce settings, as occurred when rates of overweight and obesity were low. Thus, engagement with industry actors across the food system will be critical, including food product formulators and manufacturers, restaurants, and food delivery companies. Funds should be appropriated to facilitate collaboration between the CEHN and these industries to study effective reformulations and modifications that impact the health of the population.
Conclusion
The current health, social and economic burden of obesity and nutrition-related diseases is indefensible, and necessitates a multi-sector, coordinated approach to reenvisioning our food environment. Such a reenvisioning needs to be based on rigorous science that describes causal links between food and health, and supports innovative solutions to address food and nutrition-related problems. Investment in an intentional, coordinated and well-funded research network capable of conducting rigorous and long-term nutrition intervention trials is long overdue and holds substantial capacity to revolutionize nutritional guidance, food formulation and policy. It is imperative that visionary political leaders overhaul our nutrition research landscape and invest in a network of Centers of Excellence in Human Nutrition that can meet the demands for rigorous nutrition evidence, build back trust in public health, and dramatically mitigate the impacts of nutrition- and obesity-related chronic disease.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Terminal Patients Need Better Access to Drugs and Clinical Trial Information
Editor’s note: This policy memo was written by Jake Seliger and his wife Bess Stillman. Jake passed away before meeting his child, Athena, born on October 31, 2024. Except where indicated, the first-person voice is that of Jake. This memo advocates for user-centric technology modernization and data interoperability. More crucially, he urges expanded patient rights to opt in to experimental drug trials and FDA rule revisions to enable terminal patients to take more risks on behalf of themselves, for the benefit of others.
The FDA is supposed to ensure that treatments are safe and effective, but terminal cancer patients like me are already effectively dead. If I don’t receive advanced treatment quickly, I will die. My hope, in the time I have remaining, is to promote policies that will speed up access to treatments and clinical trials for cancer patients throughout the U.S.
There are about two million cancer diagnoses and 600,000 deaths annually in the United States. Cancer treatments are improved over time via the clinical trial system: thousands of clinical trials are conducted each year (many funded by the government via the National Institutes of Health, or NIH, and many funded by pharmaceutical companies hoping to get FDA approval for their products).
But the clinical trial system is needlessly slow, and as discussed below, is nearly impossible for any layperson to access without skilled consulting. As a result, clinical trials are far less useful than they could be.
The FDA is currently “protecting” me from being harmed or killed by novel, promising, advanced cancer treatments that could save or extend my life, so that I can die from cancer instead. Like most patients, I would prefer a much faster system in which the FDA conditionally approves promising, early-phase advanced cancer treatments, even if those treatments haven’t yet been proven fully effective. Drugmakers will be better incentivized to invest in novel cancer treatments if they can begin receiving payment for those treatments sooner. The true risks to terminal patients like me are low—I’m already dying—and the benefits to both existing terminal patients and future patients of all kinds are substantial.
I would also prefer a clinical trial system that was easy for patients to navigate, rather than next-to-impossible. Easier access for patients could radically lower the cost and time of clinical trials by making recruitment far cheaper and more widespread, rather than including only about 6% of patients. In turn, speeding up the clinical-trial process means novel treatments will be approved more quickly, helping future cancer patients. About half of pharmaceutical R&D spending goes not to basic research, but to the clinical trial process. If we can cut the costs of clinical trials by streamlining the process to improve access for terminal patients, more treatments will make it to patients, and will be faster in doing so.
Cancer treatment is a non-partisan issue. To my knowledge, both left and right agree that prematurely dying from cancer is bad. Excess safety-ism and excessive caution from the FDA cost lives, including, in the near future, my own. Three concrete actions would improve clinical research, particularly for terminal cancer patients like me, but for many other patients as well:
Clinical trials should be easier and cheaper. The chief obstacles to this are recruitment and retention.
Congress and NIH should modernize the website ClinicalTrials.gov and vastly expand what drug companies and research sites are required to report there, and the time in which they must report it. Requiring timely updates that include comprehensive eligibility criteria, availability for new participants, and accurate site contact information would mean that patients and doctors have much more complete information about what trials are available and for whom.
The process of determining patient eligibility and enrolling in a trial should be easier. Due to narrow eligibility criteria, studies struggle to enroll an adequate number of local patients, which severely delays trial progression. A patient who wishes to participate in a trial must “establish care” with the hospital system hosting the trial before they are even initially screened for eligibility or told if a slot is available. Due to telemedicine practice restrictions across state lines, this means that patients who aren’t already cared for at that site—patients who are ill and for whom unnecessary travel is a huge burden—must use their limited energy to travel to a site just to find out if they can proceed to requesting a trial slot and starting further eligibility testing. Then, if approved for the study, they must be able to uproot their lives to move to, or spend extensive periods of time at, the study location.
Improved access to remote care for clinical trials would solve both these problems. First, by allowing the practice of telemedicine across state lines for visits directly related to screening and enrollment into clinical trials. Second, by incentivizing decentralization—meaning a participant in the study can receive the experimental drug and most monitoring, labs and imaging at a local hospital or infusion clinic—by accepting data from sites that can follow a standardized study protocol.
We should require the FDA to allow companies with prospective treatments for fatal diseases to bring those treatments to market after initial safety studies, with minimal delays and with a lessened burden for demonstrating benefit.
Background
[At the time of writing this] I’m a 40-year-old man whose wife is five months pregnant, and the treatment that may keep me alive long enough to meet my child is being kept from me because of current FDA policies. That drug may be in a clinical trial I cannot access. Or, it may be blocked from coming to market by requirements for additional testing to further prove efficacy that has already been demonstrated.
Instead of giving me a chance to take calculated risks on a new therapy that might allow me to live and to be with my family, current FDA regulations are choosing for me: deciding that my certain death from cancer is somehow less harmful to me than taking a calculated, informed risk that might save or prolong my life. Who is asking the patients being protected what they would rather be protected from? The FDA errs too much on the side of extreme caution around drug safety and efficacy, and that choice leads to preventable deaths.
One group of scholars attempted to model how many lives are lost versus gained from a more or less conservative FDA. They find that “from the patient’s perspective, the approval criteria in [FDA program accelerators] may still seem far too conservative.” Their analysis is consistent with the FDA being too stringent and slow in approving use of drugs for fatal diseases like cancer: “Our findings suggest that conventional standards of statistical significance for approving drugs may be overly conservative for the most deadly diseases.”
Drugmakers also find it difficult to navigate what exactly the FDA wants: “their deliberations are largely opaque—even to industry insiders—and the exact role and weight of patient preferences are unclear.” This exacerbates the difficulty drugmakers face in seeking to get treatments to patients faster. Inaction in the form of delaying patient access to drugs is killing people. Inaction is choosing death for cancer patients.
I’m an example of this; I was diagnosed with squamous cell carcinoma (SCC) of the tongue in Oct. 2022. I had none of the risk factors, like smoking or alcoholism, that predispose people to SCC. The original tumor was removed in October 2022, and I then had radiation that was supposed to cure me. It didn’t, and the cancer reappeared in April 2023. At that point, I would’ve been a great candidate for a drug like MCLA-158, which has been stuck in clinical trials, despite “breakthrough therapy designation” by the FDA and impressive data, for more than five years. This, despite the fact that MCLA-158 is much easier to tolerate than chemotherapy and arrests cancer in about 70% of patients. Current standard of care chemotherapy and immunotherapy has a positive response rate of only 20-30%.
Had MCLA-158 been approved, I might have received it in April 2023, and still have my tongue. Instead, in May 2023, my entire tongue was surgically removed in an attempt to save my life, forever altering my ability to speak and eat and live a normal life. That surgery removed the cancer, but two months later it recurred again in July 2023. While clinical-trial drugs are keeping me alive right now, I’m dying in part because promising treatments like MCLA-158 are stuck in clinical trials, and I couldn’t get them in a timely fashion, despite early data showing their efficacy. Merus, the maker of MCLA-158, is planning a phase 3 trial for MCLA-158, despite its initial successes. This is crazy: head and neck cancer patients need MCLA-158 now.

I’m only writing this brief because I was one of the few patients who was indeed, at long last, able to access MCLA-158 in a clinical trial, which incredibly halted my rapidly expanding, aggressive tumors. Without it, I’d have been dead nine months ago. Without it, many other patients already are. Imagine that you, or your spouse, or parent, or child, finds him or herself in a situation like mine. Do you want the FDA to keep testing a drug that’s already been shown to be effective and allow patients who could benefit to suffer and die? Or do you want your loved one to get the drug, and avoid surgical mutilation and death? I know what I’d choose.
Multiply this situation across hundreds of thousands of people per year and you’ll understand the frustration of dying cancer patients like me.
As noted above, about 600,000 people die annually from cancer—and yet cancer drugs routinely take a decade or more to move from lab to approval. The process is slow: “By the time a drug gets into phase III, the work required to bring it to that point may have consumed half a decade, or longer, and tens if not hundreds of millions of dollars.” If you have a fatal disease today, a treatment that is five to ten years away won’t help. Too few people participate in clinical trials partly because participation is so difficult; one study finds that “across the entire U.S. system, the estimated participation rate to cancer treatment trials was 6.3%.” Given how many people die, the prospect of life is attractive.
There is another option in between waiting decades for a promising drug to come to market and opening the market to untested drugs: Allowing terminal patients to consent to the risk of novel, earlier-phase treatments, instead of defaulting to near-certain death, would potentially benefit those patients as well as generate larger volumes of important data regarding drug safety and efficacy, thus improving the speed of drug approval for future patients. Requiring basic safety data is reasonable, but requiring complete phase 2 and 3 data for terminal cancer patients is unreasonably slow, and results in deaths that could be prevented through faster approval.
Again, imagine you, your spouse, parent, or child, is in a situation like mine: they’ve exhausted current standard-of-care cancer treatments and are consequently facing certain death. New treatments that could extend or even save their life may exist, or be on the verge of existing, but are held up by the FDA’s requirements that treatments be redundantly proven to be safe and highly effective. Do you want your family member to risk unproven but potentially effective treatments, or do you want your family member to die?
I’d make the same choice. The FDA stands in the way.
Equally important, we need to shorten the clinical trial process: As Alex Telford notes, “Clinical trials are expensive because they are complex, bureaucratic, and reliant on highly skilled labour. Trials now cost as much as $100,000 per patient to run, and sometimes up to $300,000 or even $500,000 per patient.” And as noted above, about half of pharmaceutical R&D spending goes not to basic research, but to the clinical trial process.
Cut the costs of clinical trials, and more treatments will make it to patients, faster. And while it’s not reasonable to treat humans like animal models, a lot of us who have fatal diagnoses have very little to lose and consequently want to try drugs that may help us, and people in the future with similar diseases. Most importantly, we understand the risks of potentially dying from a drug that might help us and generate important data, versus waiting to certainly die from cancer in a way that will not benefit anybody. We are capable of, and willing to give, informed consent. We can do better and move faster than we are now. In the grand scheme of things, “When it comes to clinical trials, we should aim to make them both cheaper and faster. There is as of yet no substitute for human subjects, especially for the complex diseases that are the biggest killers of our time. The best model of a human is (still) a human.” Inaction will lead to the continued deaths of hundreds of thousands of people annually.
Trials need patients, but the process of searching for a trial in which to enroll is archaic. We found ClinicalTrials.gov nearly impossible to navigate. Despite the stakes, from the patient’s perspective, the clinical trial process is impressively broken, obtuse and confusing, and one that we gather no one likes: patients don’t, their families don’t, hospitals and oncologists who run the clinical trials don’t, drug companies must not, and the people who die while waiting to get into a trial probably don’t.
I [Bess] knew that a clinical trial was Jake’s only chance. But how would we find one? I’d initially hoped that a head and neck oncologist would recommend a specific trial, preferably one that they could refer us to. But most doctors didn’t know of trial options outside their institution, or, frequently, within it, unless they were directly involved. Most recommended large research institutions that had a good reputation for hard cases, assuming they’d have more studies and one might be a match.
How were they, or we, supposed to find out what trials actually existed?
The only large-scale search option is ClinicalTrials.gov. But many oncologists I spoke with don’t engage with ClinicalTrials.gov, because the information is out-of-date, difficult to navigate, and inaccurate. It can’t be relied on. For example, I shared a summary of Jake’s relevant medical information (with his permission) in a group of physicians who had offered to help us with the clinical trial search. Ten physicians shared their top-five search results. Ominously, no two lists contained the same trials.
How is it that ten doctors can put in the same basic, relevant clinical data into an engine meant to list and search for all existing clinical trials, only for no two to surface the same study? The problem is simple: There’s a lack of keyword standardization.
Instead of a drop-down menu or click-boxes with diagnoses to choose from, the first search “filter” on ClinicalTrials.gov is a text box that says “Condition/Disease.” If I search: “Head and Neck Cancer” I get ___________ results. If I search “Tongue Cancer,” I get _________ results. Although Tongue Cancer is a subset of Head and Neck Cancer, I don’t see the studies listed as “Head and Neck Cancer” unless I type in both, or the person who created the ClinicalTrials.gov post for the study chose to type out multiple variations on a diagnosis. Nothing says they have to. If I search for both, I will still miss studies filed as “HNSCC” or “All Solid Tumors” or “Squamous Cell Carcinoma of the Tongue.”
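The fix is conceptually simple. As a rough sketch (the condition hierarchy and trial labels below are made up for illustration, not real ClinicalTrials.gov data), if every study were tagged with a term from one shared taxonomy, a single query could surface every subtype automatically:

```python
# Illustrative sketch only: the hierarchy and trial labels are invented,
# not real ClinicalTrials.gov data.

# One shared taxonomy: each condition maps to its broader parent condition.
PARENT = {
    "squamous cell carcinoma of the tongue": "tongue cancer",
    "tongue cancer": "head and neck cancer",
    "hnscc": "head and neck cancer",
    "head and neck cancer": "all solid tumors",
}

def ancestors(condition):
    """Return a condition plus every broader condition above it."""
    terms = {condition}
    while condition in PARENT:
        condition = PARENT[condition]
        terms.add(condition)
    return terms

# Each trial is tagged with exactly one canonical term from the taxonomy.
TRIALS = {
    "NCT-A": "squamous cell carcinoma of the tongue",
    "NCT-B": "hnscc",
    "NCT-C": "head and neck cancer",
}

def search(query):
    """A trial matches if the query equals its condition or any ancestor."""
    return sorted(nct for nct, cond in TRIALS.items()
                  if query in ancestors(cond))

# search("head and neck cancer") -> ["NCT-A", "NCT-B", "NCT-C"]
# search("tongue cancer")        -> ["NCT-A"]
```

Under a scheme like this, a search for “Head and Neck Cancer” would return trials filed under “HNSCC” or “Squamous Cell Carcinoma of the Tongue” without the patient needing to guess every synonym.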
The good news is that online retailers solved this problem for us years ago. It’s easier to find a dress to my exact specifications out of thousands on H&M.com than it is to find a clinical trial. I can open a search bar, click “dress,” select my material from another click box (which allows me to select from the same options the people listing the garments chose from), then click on the boxes for my desired color, dry clean or machine wash, fabric, finish, closure, and any number of other pre-selected categories before adding additional search keywords if I choose. I find a handful of options, all relevant to my desires, within a matter of minutes. H&M provides a list of standardized keywords describing what they are offering, and I can filter from there. This way, H&M and I are speaking the same language. And a dress isn’t life or death. For much more on my difficulties with ClinicalTrials.gov, see here.
Further slowing a patient’s ability to find a relevant clinical trial is the lack of comprehensive, searchable eligibility criteria. Every study has eligibility criteria, but—like keywords—those criteria aren’t standardized on ClinicalTrials.gov. Nor is an exhaustive explanation of eligibility criteria required, which can lead to a patient wasting precious weeks attempting to establish care and enroll in a trial, only to discover there were unpublished eligibility criteria they didn’t meet. Instead, the information page for each study outlines inclusion and exclusion criteria using whatever language whoever is typing feels like using. Many have overlapping inclusion and exclusion criteria, but there can be long lists of additional criteria for each arm of a study, and it’s up to the patient or their doctor to read through them line by line—if they’re even listed—to see whether prior medications, current medications, certain genomic sequencing findings, number of lines of therapy, etc., make the trial relevant.
In the end, we hired a consultant (Eileen), who applied her full-time work (helping pharmaceutical companies determine which novel compounds might be worth their R&D investment) to the problem of helping patients find potential clinical trials. She helped us narrow the thousands of trials that turned up in initial queries down to the top five candidates:
- NCT04815720: Pepinemab in Combination With Pembrolizumab in Recurrent or Metastatic Squamous Cell Carcinoma of the Head and Neck (KEYNOTE-B84)
- NCT03526835: A Study of Bispecific Antibody MCLA-158 in Patients With Advanced Solid Tumors.
- NCT05743270: Study of RP3 in Combination With Nivolumab and Other Therapy in Patients With Locoregionally Advanced or Recurrent SCCHN.
- NCT03485209: Efficacy and Safety Study of Tisotumab Vedotin for Patients With Solid Tumors (innovaTV 207)
- NCT05094336: AMG 193, Methylthioadenosine (MTA) Cooperative Protein Arginine Methyltransferase 5 (PRMT5) Inhibitor, Alone and in Combination With Docetaxel in Advanced Methylthioadenosine Phosphorylase (MTAP)-Null Solid Tumors (MTAP).
Based on the names alone, you can see why it would be difficult, if not impossible, for someone without expertise in oncology to evaluate trials. Even with Eileen’s expertise, two of the trials were stricken from our list when we discovered unlisted eligibility criteria that excluded Jake. When exceptionally motivated patients, the oncologists who care for them, and even consultants selling highly specialized assistance can’t reliably navigate a system that claims to be desperate to enroll patients into trials, there is a fundamental problem with the system. But this is a mismatch we can solve, to everyone’s benefit.
Plan of Action
We propose three major actions. Congress should:
Recommendation 1. Direct the National Library of Medicine (NLM) at the National Institutes of Health (NIH) to modernize ClinicalTrials.gov so that patients and doctors have complete and relevant information about available trials, and require more regular updates from companies on all the details of available trials.
Recommendation 2. Allow the practice of telemedicine across state lines for visits related to clinical trials.
Recommendation 3. Require the FDA to allow companies with prospective treatments for fatal diseases to bring those treatments to market after initial safety studies.
Modernizing ClinicalTrials.gov will empower patients, oncologists, and others to better understand what trials are available, where they are available, and their up-to-date eligibility criteria, using standardized search categories to make them more easily discoverable. Allowing telemedicine across state lines for clinical trial care will significantly improve enrollment and retention. Bringing treatments to market after initial safety studies will speed the clinical trial process, and get more patients treatments, sooner. In cancer, delays cause death.
To get more specific:
The FDA already has a number of accelerated approval options. Instead of the usual “right to try” legislation, we propose creating a second, provisional market for terminal patients that allows partial approval of a drug while it’s still undergoing trials, making it available to patients who are trial-ineligible (or unable to easily access a trial) and for whom standard of care doesn’t provide a meaningful chance at remission. This partial approval would ideally allow physicians to prescribe the drug to this subset of patients as they see fit: be that monotherapy or a variety of personalized combination therapies, tailored to a patient’s needs, side-effect profile, and goals. This wouldn’t just expand access for patients who are otherwise out of luck; it would also generate important data about real-world reaction and response.
As an incentive, the FDA could require pharmaceutical companies to provide drug access to terminal patients as a condition of continuing forward in the trial process on the road to a New Drug Application. While this would be federally forced compassion, it would differ from “compassionate use.” To currently access a study drug via compassionate use, a physician has to petition both the drug company and the FDA on the patient’s behalf, which comes with onerous rules and requirements. Most drug companies with compassionate use programs won’t offer a drug until there’s already a large amount of compelling phase 2 data demonstrating efficacy, a patient must have “failed” standard of care and other available treatments, and the drug must (usually) be given as a monotherapy. Even if the drugmaker says yes, the FDA can still say no. Compassionate use is an option available to very small numbers, in limited instances, and with bureaucratic barriers to overcome. Not terribly compassionate, in my opinion.
The benefits of this “terminal patient” market can and should go both ways, much as lovers should benefit from each other instead of trying to create win-lose situations. Providing the drug to patients would come with reciprocal benefits to the pharmaceutical companies. Any physician prescribing the drug to patients should be required to report data regarding how the drug was used, in what combination, and to what effect. This would create a large pool of observational, real-world data gathered from the patients who aren’t ideal candidates for trials, but better represent the large subset of patients who’ve exhausted multiple lines of therapies yet aren’t ready for the end. Promising combinations and unexpected effects might be identified from these observational data sets, and possibly used to design future trials.
Dying patients would get drugs faster, and live longer, healthier lives, if we can get better access both to information about clinical trials and to treatments for diseases. The current system errs too far toward proving effectiveness and gives too little weight to speed itself, and to the number of people who die while waiting for new treatments. Patients like me, who have fatal diagnoses and have already failed “standard of care” therapies, routinely die while waiting for new or improved treatments. Moreover, patients like me have little to lose: because cancer is going to kill us anyway, many of us would prefer to roll the dice on an unproven treatment, or a treatment that has early, incomplete data showing its potential to help, rather than wait to be killed by cancer. Right now, however, the FDA does not consider how many patients will die while waiting for new treatments. Instead, the FDA requires that drugmakers prove both the safety and efficacy of treatments prior to allowing any approval whatsoever.
As for improving ClinicalTrials.gov, we have several ideas:
First, the NIH should hire programmers with UX experience, and should be empowered to pay market rates to software developers with experience designing websites for, say, Amazon or Shein. ClinicalTrials.gov was designed as a study registry, but it needs an overhaul to be more useful to actual patients and doctors.
Second, UX is far from the only problem. Data itself can be a problem. For example, the NIH could require a patient-friendly summary of each trial; it could standardize the names of drugs and conditions so that patients and doctors could see consistent and complete search results; and it could use a consistent and machine-readable format for inclusion and exclusion criteria.
We are aware of one patient group that, in trying to build an interface to ClinicalTrials.gov, found a remarkable degree of inconsistency and chaos: “We have analyzed the inclusion and exclusion criteria for all the cancer-related trials from clinicaltrials.gov that are recruiting for interventional studies (approximately 8,500 trials). Among these trials, there are over 1,000 ways to indicate the patient must not be pregnant during a trial, and another 1,000 ways to indicate that the patient must use adequate birth control.” ClinicalTrials.gov should use a Domain Specific Language that standardizes all of these terms and conditions, so that patients and doctors can find relevant trials with 100x less effort.
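To make the idea concrete, here is a minimal sketch of what machine-readable eligibility criteria might look like, assuming a standardized vocabulary. The field names, thresholds, and drug names are invented for illustration, not drawn from any real trial:

```python
# Illustrative sketch only: field names, thresholds, and drug names are
# invented to show the shape of machine-readable eligibility criteria.

TRIAL_CRITERIA = {
    "min_age": 18,
    "max_prior_lines_of_therapy": 2,
    "excluded_prior_drugs": {"pembrolizumab"},
    "required_biomarkers": {"MTAP-null"},
}

def eligible(patient, criteria):
    """Screen a structured patient record against structured criteria."""
    if patient["age"] < criteria["min_age"]:
        return False
    if patient["prior_lines"] > criteria["max_prior_lines_of_therapy"]:
        return False
    if patient["prior_drugs"] & criteria["excluded_prior_drugs"]:
        return False  # patient took an excluded drug
    if not criteria["required_biomarkers"] <= patient["biomarkers"]:
        return False  # a required biomarker is missing
    return True

patient = {"age": 40, "prior_lines": 1,
           "prior_drugs": {"cisplatin"}, "biomarkers": {"MTAP-null"}}
# eligible(patient, TRIAL_CRITERIA) -> True
```

Once every trial expresses its criteria in one shared schema like this, screening a patient against all 8,500 recruiting trials becomes a loop rather than weeks of reading free text.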
Third, the long-run goal should be a real-time, searchable database that matches patients to studies using EMR data and allows doctors to see what spots are available and where. Pharmaceutical and biotech companies would need to be required to contribute up-to-date information on all extant clinical trials on a regular basis (e.g., monthly).
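A matching service built on such a database could be conceptually simple. The sketch below is purely illustrative; the recruiting flags, patient fields, and consent release are assumptions about how such a system might be structured:

```python
# Illustrative sketch only: the recruiting flags, consent release, and
# patient fields are assumptions about how such a system might work.

trials = [
    {"id": "NCT-1", "recruiting": True,  "condition": "hnscc", "site": "NYC"},
    {"id": "NCT-2", "recruiting": False, "condition": "hnscc", "site": "LA"},
]
patients = [
    {"id": "p1", "searching": True, "release_signed": True,
     "condition": "hnscc", "region": "NYC"},
]

def open_matches(patient):
    """What a doctor would see after 'pressing the button' for one patient."""
    if not (patient["searching"] and patient["release_signed"]):
        return []  # no signed release, no matching
    return [t["id"] for t in trials
            if t["recruiting"] and t["condition"] == patient["condition"]]

# open_matches(patients[0]) -> ["NCT-1"]
```

The same index could be flipped around so that trial sites browse opted-in patients in their region, with the signed release gating access in both directions.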
Conclusion
Indeed, we need a national clinical trial database that EMRs can connect to, in the style of Epic’s Care Everywhere. A patient signs a release to share their information, like Care Everywhere allows hospital A to download a patient’s information from hospital B if the patient signs a release. Trials with open slots could mark themselves as “recruiting” and update in real time. A doctor could press a button and identify potential open studies. A patient available for a trial could flag their EMR profile as actively searching, allowing clinical trial sites to browse patients in their region who are looking for a study. Access to that patient’s EMR would allow them to scan for eligibility.
This would be easy to do. OKCupid can do it. The tech exists. Finding a clinical trial already feels a bit like online dating, except if you get the wrong match, you die.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
The FDA has created a variety of programs that have promising-sounding names: “Currently, four programs—the fast track, breakthrough therapy, accelerated approval, and priority review designations—provide faster reviews and/or use surrogate endpoints to judge efficacy. However, published descriptions […] do not indicate any differences in the statistical thresholds used in these programs versus the standard approval process, nor do they mention adapting these thresholds to the severity of the disease.” The problem is that these programs do not appear to do much to actually accelerate getting drugs to patients. MCLA-158 is an example of the problem: the drug has been shown to be safe and effective, and yet Merus thinks it needs a Phase 3 trial to get it past the FDA and to patients.
Improve healthcare data capture at the source to build a learning health system
Studies estimate that only one in 10 recommendations made by major professional societies is supported by high-quality evidence. Medical care that is not evidence-based can result in unnecessary care that burdens public finances, harms patients, and damages trust in the medical profession. Clearly, we must do a better job of figuring out the right treatments, for the right patients, at the right time. To meet this challenge, it is essential to improve our ability to capture reusable data at the point of care that can be used to improve care, discover new treatments, and make healthcare more efficient. To achieve this vision, we will need to shift financial incentives to reward data generation, change how we deliver care using AI, and continue improving the technological standards powering healthcare.
The Challenge and Opportunity of health data
Many have hailed health data collected during everyday healthcare interactions as the solution to some of these challenges. Congress directed the U.S. Food and Drug Administration (FDA) to increase the use of real-world data (RWD) for making decisions about medical products. However, FDA’s own records show that in the most recent year for which data are available, only two out of over one hundred new drugs and biologics approved by FDA were approved based primarily on real-world data.
A major problem is that our current model in healthcare doesn’t allow us to generate reusable data at the point of care. This is even more frustrating because providers face a high burden of documentation, and patients report repetitive questions from providers and questionnaires.
To expand a bit: while large amounts of data are generated at the point of care, these data lack the quality, standardization, and interoperability to enable downstream functions such as clinical trials, quality improvement, and other ways of generating more knowledge about how to improve outcomes.
By better harnessing the power of data, including results of care, we could finally build a learning healthcare system where outcomes drive continuous improvement and where healthcare value leads the way. There are, however, countless barriers to such a transition. To achieve this vision, we need to develop new strategies for the capture of high-quality data in clinical environments, while reducing the burden of data entry on patients and providers.
Efforts to achieve this vision follow a few basic principles:
- Data should be entered only once – by the person or entity most qualified to do so – and be used many times.
- Data capture should be efficient, so as to minimize the burden on those entering the data, allowing them to focus their time on doing what actually matters, like providing patient care.
- Data generated at the point of care needs to be accessible for appropriate secondary uses (quality improvement, trials, registries), while respecting patient autonomy and obtaining informed consent where required. Data should not be stuck in any one system but should flow freely between systems, enabling linkages across different data sources.
- Data need to be used to provide real value to patients and physicians. This is achieved by developing data visualizations, automated data summaries, and decision support (e.g., care recommendations, trial matching) that allow data users to spend less time searching for data and more time on analysis, problem solving, and patient care – and help them see the value in entering data in the first place.
Barriers to capturing high-quality data at the point of care:
- Incentives: Providers and health systems are paid for performing procedures or logging diagnoses. As a result, documentation is optimized for maximizing reimbursement, but not for maximizing the quality, completeness, and accuracy of data generated at the point of care.
- Workflows: Influenced by the prevailing incentives, clinical workflows are not currently optimized to enable data capture at the point of care. Patients are often asked the same questions at multiple stages, and providers document the care provided as part of free-text notes, which are frequently required for billing but can make it challenging to find information.
- Technology: Shaped by incentives and workflows, technology has evolved to capture information in formats that frequently lack standardization and interoperability.
Plan of Action
Recommendation 1. Incentivize generation of reusable data at the point of care
Financial incentives are needed to drive the development of workflows and technology to capture high-quality data at the point of care. There are several payment programs already in existence that could provide a template for how these incentives could be structured.
For example, the Centers for Medicare and Medicaid Services (CMS) recently announced the Enhancing Oncology Model (EOM), a voluntary model for oncology providers caring for patients with common cancer types. As part of the EOM, providers are required to report certain data fields to CMS, including staging information and hormone receptor status for certain cancer types. These data fields are essential for clinical care, research, quality improvement, and ongoing care observation involving cancer patients. Yet, at present, these data are rarely recorded in a way that makes it easy to exchange and reuse this information. To reduce the burden of reporting this data, CMS has collaborated with the HHS Assistant Secretary for Technology Policy (ASTP) to develop and implement technological tools that can facilitate automated reporting of these data fields.
CMS also has a long-standing program that requires participation in evidence generation as a prerequisite for coverage, known as coverage with evidence development (CED). For example, hospitals that would like to provide Transcatheter Aortic Valve Replacement (TAVR) are required to participate in a registry that records data on these procedures.
To incentivize evidence generation as part of routine care, CMS should refine these programs and expand their use. This would involve strengthening collaborations across the federal government to develop technological tools for data capture, and increasing the number of payment models that require generation of data at the point of care. Ideally, these models should evolve to reward 1) high-quality chart preparation (assembly of structured data) 2) establishing diagnoses and development of a care plan, and 3) tracking outcomes. These payment policies are powerful tools because they incentivize the generation of reusable infrastructure that can be deployed for many purposes.
Recommendation 2. Improve workflows to capture evidence at the point of care
With the right payment models, providers can be incentivized to capture reusable data at the point of care. However, providers already report being crushed by the burden of documentation, and patients frequently fill out multiple questionnaires with the same information. To usher in the era of the learning health system (a system that includes continuous data collection to improve service delivery) without increasing the burden on providers and patients, we need to redesign how care is provided. Specifically, we must focus on approaches that integrate generation of reusable data into the provision of routine clinical care.
While the advent of AI is an opportunity to do just that, current uses of AI have mainly focused on drafting documentation in free-text formats, essentially replacing human scribes. Instead, we need to figure out how we can use AI to improve the usability of the resulting data. While it is not feasible to capture all data in a structured format on all patients, a core set of data are needed to provide high-quality and safe care. At a minimum, those should be structured and part of a basic core data set across disease types and health maintenance scenarios.
In order to accomplish this, NIH and the Advanced Research Projects Agency for Health (ARPA-H) should fund learning laboratories that develop, pilot, and implement new approaches for data capture at the point of care. These centers would leverage advances in human-centered design and artificial intelligence (AI) to revolutionize care delivery models for different types of care settings, ranging from outpatient to acute care and intensive care settings. Ideally, these centers would be linked to existing federally funded research sites that could implement the new care and discovery processes in ongoing clinical investigations.
The federal government already spends billions of dollars on grants for clinical research. Why not use some of that funding to make clinical research more efficient, and improve the experience of patients and physicians in the process?
Recommendation 3. Enable technology systems to improve data standardization and interoperability
Capturing high-quality data at the point of care is of limited utility if the data remains stuck within individual electronic health record (EHR) installations. Closed systems hinder innovation and prevent us from making the most of the amazing trove of health data.
We must create a vibrant ecosystem where health data can travel seamlessly between different systems, while maintaining patient safety and privacy. This will enable an ecosystem of health data applications to flourish. HHS has recently made progress by agreeing to a unified approach to health data exchange, but several gaps remain. To address these gaps, we must:
- Increase standardization of data elements: The federal government requires certain data elements to be standardized for electronic export from the EHR. However, this list of data elements, called the United States Core Data for Interoperability (USCDI), currently does not include enough data elements for many uses of health data. HHS could rapidly expand the USCDI by working with federal partners and professional societies to determine which data elements are critical for national priorities, like vaccine safety and use, or protection from emerging pathogens.
- Enable writeback into the EHR: While current interoperability efforts have focused on the ability to export EHR data, developing a vibrant ecosystem of health data applications available to patients, physicians, and other data users requires the capability to write data back into the EHR. This would enable a competitive ecosystem of applications that use health data generated in the EHR, much like the app store on our phones.
- Create widespread interoperability of data for multiple purposes: HHS has made great progress towards allowing health data to be exchanged between any two entities in our healthcare system, thanks to the Trusted Exchange Framework and Common Agreement (TEFCA). TEFCA could allow any two healthcare sites to exchange data, but unfortunately, participation remains spotty and TEFCA currently does not allow data exchange solely for research. HHS should work to close these gaps by allowing TEFCA to be used for research, and incentivizing participation in TEFCA, for example by making joining TEFCA a condition of participation in Medicare.
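To illustrate why standardized data elements matter, compare a free-text note with a coded element. The sketch below is invented, and its structure only loosely follows the shape of a FHIR Observation resource; LOINC 4548-4 is the standard code for hemoglobin A1c:

```python
# Illustrative sketch: the record is invented; the structure only loosely
# follows FHIR's Observation resource. LOINC 4548-4 is the standard code
# for hemoglobin A1c.

free_text_note = "pt's A1c was around 7.2 last month, discussed diet"  # hard to reuse

structured_observation = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org",
                         "code": "4548-4",
                         "display": "Hemoglobin A1c"}]},
    "valueQuantity": {"value": 7.2, "unit": "%"},
    "effectiveDateTime": "2024-05-01",
}

def a1c_value(obs):
    """A downstream system can read the coded element without parsing prose."""
    codes = {c["code"] for c in obs["code"]["coding"]}
    if "4548-4" in codes:
        return obs["valueQuantity"]["value"]
    return None

# a1c_value(structured_observation) -> 7.2
```

The free-text version requires natural-language processing (and luck) to reuse; the coded version can be exported, exchanged under TEFCA, and aggregated for research with a few lines of code.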
Conclusion
The treasure trove of health data generated during routine care has given us a huge opportunity to generate knowledge and improve health outcomes. These data should serve as a shared resource for clinical trials, registries, decision support, and outcome tracking to improve the quality of care. This is necessary for society to advance towards personalized medicine, where treatments are tailored to biology and patient preference. However, to make the most of these data, we must improve how we capture and exchange these data at the point of care.
Essential to this goal is evolving our current payment systems from rewarding documentation of complexity or time spent, to generation of data that supports learning and improvement. HHS should use its payment authorities to encourage data generation at the point of care and promote the tools that enable health data to flow seamlessly between systems, building on the success stories of existing programs like coverage with evidence development. To allow capture of this data without making the lives of providers and patients even more difficult, federal funding bodies need to invest in developing technologies and workflows that leverage AI to create usable data at the point of care. Finally, HHS must continue improving the standards that allow health data to travel seamlessly between systems. This is essential for creating a vibrant ecosystem of applications that leverage the benefits of AI to improve care.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Reduce Administrative Research Burden with ORCID and DOI Persistent Digital Identifiers
There exists a low-effort, low-cost way to reduce administrative burden for our scientists, and make it easier for everyone – scientists, funders, legislators, and the public – to document the incredible productivity of federal science agencies. If adopted throughout government research, these tools would maximize interoperability across reporting systems, reduce administrative burden and costs, and increase the accountability of our scientific community. The solution: persistent digital identifiers, namely Digital Object Identifiers (DOIs) for awards and Open Researcher and Contributor IDs (ORCIDs) for key personnel. ORCIDs are already used by most federal science agencies. We propose that federal science agencies also adopt digital object identifiers for research awards, an industry-wide standard. A practical and detailed implementation guide for this already exists.
The Opportunity
Tracking the impact and outputs of federal research awards is labor-intensive and expensive. Federally funded scientists spend over 900,000 hours a year writing interim progress reports alone. Despite that tremendous effort, our ability to analyze the productivity of federal research awards is limited. These reports only capture research products created while the award is active, but many exciting papers and data sets are not published until after the award is over, making it hard for the funder to associate them with a particular award or agency initiative. Further, these data are often not structured in ways that support easy analysis or collaboration. When it comes time for the funding agency to examine the impact of an award, a call for applications, or even an entire division, staff rely on a highly manual process that is time-intensive and expensive. Thus, such evaluations are often not done. Deep analysis of federal spending is next to impossible, and simple questions regarding which type of award is better suited for one scientific problem over another, or whether one administrative funding unit is more impactful than a peer organization with the same spending level, are rarely investigated by federal research agencies. These questions are difficult to answer without a simple way to tie award spending to specific research outputs such as papers, patents, and datasets.
To simplify tracking of research outputs, the Office of Science and Technology Policy (OSTP) directed federal research agencies to “assign unique digital persistent identifiers to all scientific research and development awards and intramural research protocols […] through their digital persistent identifiers.” This directive builds on work from the Trump White House in 2018 to reduce the burden on researchers and the National Security Strategy guidance. It is a great step forward, but it has yet to be fully implemented, and allows implementation to take different paths. Agencies are now taking a fragmented, agency-specific approach, which will undermine the full potential of the directive by making it difficult to track impact using the same metrics across federal agencies.
Without a unified federal standard, science publishers, awards management systems, and other disseminators of federal research output will continue to treat award identifiers as unstructured text buried within a long document, or as URLs tucked into acknowledgement sections or other random fields of a research product. These ad hoc methods make it difficult to link research outputs to their federal funding. They leave scientists and universities that are looking to meet requirements for multiple funding agencies relying on complex software translations of different agency nomenclatures and award persistent identifiers or, more realistically, continuing to track and report productivity by hand. It remains too confusing and expensive to provide the level of oversight our federal research enterprise deserves.
There is an existing industry standard for associating digital persistent identifiers with awards that has been adopted by the Department of Energy and other funders such as the ALS Association, the American Heart Association, and the Wellcome Trust. It is a low-effort, low-cost way to reduce administrative burden for our scientists and make it easier for everyone – scientists, federal agencies, legislators, and the public – to document the incredible productivity of federal science expenditures.
Adopting this standard means funders can automate the reporting of most award products (e.g., scientific papers, datasets), reducing administrative burden and allowing research products to be reliably tracked even after the award ends. Funders could maintain their own taxonomy linking award DOIs to specific calls for proposals, study sections, divisions, and other internal structures, making research products far easier to analyze. Further, funders would be able to answer fundamental questions about their programs that are usually too labor-intensive to even ask, such as: did a particular call for applications result in papers that answered the underlying question laid out in that call? How long should awards for a specific type of research problem last to yield the greatest scientific productivity? In light of rapid advances in artificial intelligence (AI) and other analytic tools, making the linkages between research funding and products standardized and easy to analyze opens possibilities for an even more productive and accountable federal research enterprise. In short, assigning DOIs to awards fulfills the requirements of the 2022 directive to maximize interoperability with other funder reporting systems, delivers on the promise of the 2018 NSTC report to reduce burden, and opens new possibilities for a more accountable and effective federal research enterprise.
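The automation described above reduces, at its core, to a simple join: once research products carry award DOIs in their metadata, building an award-level report is a grouping operation rather than a manual data call. The sketch below assumes a simplified, hypothetical metadata schema (an `award_dois` field on each product record); real registries use richer structures, and the field names here are illustrative only.

```python
from collections import defaultdict

def products_by_award(products):
    """Index research products by the award DOIs they acknowledge."""
    index = defaultdict(list)
    for product in products:
        for award_doi in product.get("award_dois", []):
            index[award_doi].append(product["doi"])
    return dict(index)

products = [
    {"doi": "10.1000/paper-1", "award_dois": ["10.5000/award-A"]},
    {"doi": "10.1000/dataset-2", "award_dois": ["10.5000/award-A", "10.5000/award-B"]},
]
report = products_by_award(products)
# report maps each award DOI to all products that cite it, so a funder can
# roll up an award's output without asking the PI to retype anything
```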
Plan of Action
The overall goal is to increase accountability and transparency for federal research funding agencies and dramatically reduce the administrative burden on scientists and staff. Adopting a uniform approach allows for rapid evaluation and improvement across the research enterprise. It also enables the creation of comparable data on agency performance. We propose that federal science agencies adopt the same industry-wide standard – the DOI – for awards. A practical and detailed implementation guide already exists.
These steps support the existing directive and National Security Strategy guidance issued by OSTP and build on 2018 work from the NSTC:
Recommendation 1. An interagency committee led by OSTP should coordinate and harmonize implementation to:
- Develop implementation timelines and budgets for each agency that are consistent with existing industry standards;
- Consult with other stakeholders such as scientific publishers and awardee institutions, but consider that guidance and industry standards already exist, so there is no need for lengthy consultation.
Recommendation 2. Agencies should fully adopt the industry standard persistent identifier infrastructure for research funding—DOIs—for awards. Specifically, funders should:
- Ensure they are listed in the Research Organization Registry at the administrative level (e.g., agency, division) that suits their reporting and analytic needs.
- Require and collect ORCIDs, a digital identifier for researchers widely used by academia and scientific publishers, for the key personnel of an award.
- Issue DOIs for individual awards, and link those awards to the appropriate organizational units and research funding initiatives in the metadata to facilitate evaluation.
Recommendation 3. Agencies should require the Principal Investigator (PI) to cite the award DOI in research products (e.g., scientific papers, datasets). This requirement could be included in the terms and conditions of each award. Using DOIs to automate much of progress reporting, as described below, provides a natural incentive for investigators to comply.
Recommendation 4. Agencies should use persistent identifiers from the ORCID and award DOI systems to identify research products associated with an award, reducing PI burden. Awardees would still be required to certify that the product arose directly from their federal research award. After the award and its reporting obligation end, the agency can continue to use these systems to link products to awards based on information provided by the product creators to the product distributors (e.g., authors citing an award DOI when publishing a paper), but without the direct certification of the awardee. This compromise provides the public and the funder with better information about an award’s output, but does not automatically hold the awardee liable if the product conflicts with a federal policy.
Recommendation 5. Agencies should incorporate award DOIs into their efforts to describe agency productivity and create more efficient and consistent practices for reporting research progress across all federal research funding agencies. Products attributable to an award should be searchable by individual award and by larger collections of awards, such as administrative Centers or calls for applications. As an example of this transparency, PubMed, with its publicly available indexing of the biomedical literature, supports the National Institutes of Health (NIH) RePORTER, and could serve as a model for other fields as persistent identifiers for awards and research products become more available.
Recommendation 6. Congress should issue appropriations reporting language to ensure that implementation costs are covered for each agency and that the agencies are adopting a universal standard. Given that the DOI for awards infrastructure works even for small non-profit funders, the greatest costs will be in adapting legacy federal systems, not in utilizing the industry standard itself.
Challenges
We envision the main opposition coming from the agencies themselves, as they have multiple demands on their time and might take shortcuts to implementation that meet the letter of the requirement but do not offer the full benefits of an industry standard. Such a short-sighted approach would deny the public the transparency it needs on research award performance and forgo massive time and cost savings for agencies and researchers alike.
A partial implementation of this burden-reducing workflow already exists. Data feeds from ORCID and PubMed populate federal tools such as My Bibliography, and in turn support the biosketch generator in SciENcv or an agency’s Research Performance Progress Report. These systems are feasible because they build on PubMed’s excellent metadata and curation. But PubMed does not index all scientific fields.
Adopting DOIs for awards means that persistent identifiers will provide a higher level of service across all federal research areas. DOIs work for scientific areas not supported by PubMed. And even for the sophisticated existing systems drawing from PubMed, user effort could be reduced and accuracy increased if awards were assigned DOIs. Systems such as NIH RePORTER and PubMed currently have to extract award numbers cited in the acknowledgment sections of research papers, which is more difficult and error-prone.
Conclusion
OSTP and the science agencies have put forth a sound directive to make American science funding even more accountable and impactful, and they are on the cusp of implementation. It is part of a long-standing effort to reduce burden and make the federal research enterprise more accountable and effective. Federal research funding agencies are susceptible to falling into bureaucratic fragmentation and inertia by adopting competing approaches that meet the minimum requirements set forth by OSTP, but offer minimal benefit. If these agencies instead adopt the industry standard that is being used by many other funders around the world, there will be a marked reduction in the burden on awardees and federal agencies, and it will facilitate greater transparency, accountability, and innovation in science funding. Adopting the standard is the obvious choice and well within America’s grasp, but avoiding bureaucratic fragmentation is not simple. It takes leadership from each agency, the White House, and Congress.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS.
Use Artificial Intelligence to Analyze Government Grant Data to Reveal Science Frontiers and Opportunities
President Trump challenged the Director of the Office of Science and Technology Policy (OSTP), Michael Kratsios, to “ensure that scientific progress and technological innovation fuel economic growth and better the lives of all Americans”. Much of this progress and innovation arises from federal research grants. Federal research grant applications include detailed plans for cutting-edge scientific research. They describe the hypothesis, data collection, experiments, and methods that will ultimately produce discoveries, inventions, knowledge, data, patents, and advances. They collectively represent a blueprint for future innovations.
AI now makes it possible to use these resources to create extraordinary tools for refining how we award research dollars. Further, AI can provide unprecedented insight into future discoveries and needs, shaping both public and private investment into new research and speeding the application of federal research results.
We recommend that the Office of Science and Technology Policy (OSTP) oversee a multiagency development effort to fully subject grant applications to AI analysis to predict the future of science, enhance peer review, and encourage better research investment decisions by both the public and the private sector. The federal agencies involved should include all the member agencies of the National Science and Technology Council (NSTC).
Challenge and Opportunity
The federal government funds approximately 100,000 research awards each year across all areas of science. The sheer human effort required to analyze this volume of records remains a barrier, and thus, agencies have not mined applications for deep future insight. If agencies spent just 10 minutes of employee time on each funded award, it would take 16,667 hours in total—or more than eight years of full-time work—to simply review the projects funded in one year. For each funded award, there are usually 4–12 additional applications that were reviewed and rejected. Analyzing all these applications for trends is untenable. Fortunately, emerging AI can analyze these documents at scale. Furthermore, AI systems can work with confidential data and provide summaries that conform to standards that protect confidentiality and trade secrets. In the course of developing these public-facing data summaries, the same AI tools could be used to support a research funder’s review process.
There is a long precedent for this approach. In 2009, the National Institutes of Health (NIH) debuted its Research, Condition, and Disease Categorization (RCDC) system, a program that automatically and reproducibly assigns NIH-funded projects to their appropriate spending categories. The automated RCDC system replaced a manual data call, which resulted in savings of approximately $30 million per year in staff time, and has been evolving ever since. To create the RCDC system, the NIH pioneered digital fingerprints of every scientific grant application using sophisticated text-mining software that assembled a list of terms and their frequencies found in the title, abstract, and specific aims of an application. Applications for which the fingerprints match the list of scientific terms used to describe a category are included in that category; once an application is funded, it is assigned to categorical spending reports.
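The digital fingerprint idea can be illustrated with a minimal sketch: a term-frequency profile over an application's title and abstract, with an application joining a category when its profile contains the category's terms. The real RCDC system uses curated thesauri and sophisticated text-mining software; the tokenizer, stopword list, and any-term matching rule below are purely illustrative simplifications.

```python
import re
from collections import Counter

# Toy stopword list; RCDC's actual term vocabulary is curated by experts.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on"}

def fingerprint(text):
    """Build a term -> frequency profile from a title/abstract string."""
    terms = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in terms if t not in STOPWORDS)

def in_category(fp, category_terms):
    """Assign an application to a category if its fingerprint contains any category term."""
    return any(fp[t] > 0 for t in category_terms)

fp = fingerprint("Amyloid oligomers in Alzheimer's disease: amyloid imaging methods")
# "amyloid" is counted twice; "in" is dropped as a stopword
```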
NIH staff soon found it easy to construct digital fingerprints for other entities, such as research products or even scientists, by scanning the title and abstract of a public document (such as a research paper) or by aggregating all terms found in the existing grant application fingerprints associated with a person.
NIH review staff can now match the digital fingerprints of peer reviewers to the fingerprints of the applications to be reviewed and ensure there is sufficient reviewer expertise. For NIH applicants, the RePORTER webpage provides the Matchmaker tool to create digital fingerprints of title, abstract, and specific aims sections, and match them to funded grant applications and the study sections in which they were reviewed. We advocate that all agencies work together to take the next logical step and use all the data at their disposal for deeper and broader analyses.
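Fingerprint-to-fingerprint matching, as used for reviewer assignment and the Matchmaker tool, can be sketched as a similarity ranking. Cosine similarity is an assumption here, standing in for whatever matching function the production systems actually use; fingerprints are assumed to be term -> frequency dictionaries like those above.

```python
import math

def cosine(fp_a, fp_b):
    """Cosine similarity between two term-frequency fingerprints."""
    dot = sum(fp_a[t] * fp_b.get(t, 0) for t in fp_a)
    norm_a = math.sqrt(sum(v * v for v in fp_a.values()))
    norm_b = math.sqrt(sum(v * v for v in fp_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_reviewers(application_fp, reviewer_fps, k=3):
    """Return the k reviewers whose fingerprints best match the application."""
    ranked = sorted(reviewer_fps,
                    key=lambda r: cosine(application_fp, reviewer_fps[r]),
                    reverse=True)
    return ranked[:k]

app = {"amyloid": 2, "imaging": 1}
reviewers = {"expert": {"amyloid": 3, "oligomer": 1}, "outsider": {"solar": 2}}
# "expert" shares vocabulary with the application and ranks first
```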
We offer five recommendations for specific use cases below:
Use Case 1: Funder support. Federal staff could use AI analytics to identify areas of opportunity and support administrative pushes for funding.
When making a funding decision, agencies need to consider not only the absolute merit of an application but also how it complements the existing funded awards and agency goals. There are some common challenges in managing portfolios. One is that an underlying scientific question can be common to multiple problems that are addressed in different portfolios. For example, one protein may have a role in multiple organ systems. Staff are rarely aware of all the studies and methods related to that protein if their research portfolio is restricted to a single organ system or disease. Another challenge is to ensure proper distribution of investments across a research pipeline, so that science progresses efficiently. Tools that can rapidly and consistently contextualize applications across a variety of measures, including topic, methodology, agency priorities, etc., can identify underserved areas and support agencies in making final funding decisions. They can also help funders deliberately replicate some studies while reducing the risk of unintentional duplication.
Use Case 2: Reviewer support. Application reviewers could use AI analytics to understand how an application is similar to or different from currently funded federal research projects, providing reviewers with contextualization for the applications they are rating.
Reviewers are selected in part for their knowledge of the field, but when they compare applications with existing projects, they do so based on their subjective memory. AI tools can provide more objective, accurate, and consistent contextualization to ensure that the most promising ideas receive funding.
Use Case 3: Grant applicant support. Research funding applicants could be offered contextualization of their ideas among funded projects and failed applications in ways that protect the confidentiality of federal data.
NIH has already made admirable progress in this direction with their Matchmaker tool—one can enter many lines of text describing a proposal (such as an abstract), and the tool will provide lists of similar funded projects, with links to their abstracts. New AI tools can build on this model in two important ways. First, they can help provide summary text and visualization to guide the user to the most useful information. Second, they can broaden the contextual data being viewed. Currently, the results are only based on funded applications, making it impossible to tell if an idea is excluded from a funded portfolio because it is novel or because the agency consistently rejects it. Private sector attempts to analyze award information (e.g., Dimensions) are similarly limited by their inability to access full applications, including those that are not funded. AI tools could provide high-level summaries of failed or ‘in process’ grant applications that protect confidentiality but provide context about the likelihood of funding for an applicant’s project.
Use Case 4: Trend mapping. AI analyses could help everyone—scientists, biotech, pharma, investors— understand emerging funding trends in their innovation space in ways that protect the confidentiality of federal data.
The federal science agencies have made remarkable progress in making their funding decisions transparent, even to the point of offering lay summaries of funded awards. However, the sheer volume of individual awards makes summarizing these funding decisions a daunting task that will always be out of date by the time it is completed. Thoughtful application of AI could make practical, easy-to-digest summaries of U.S. federal grants in close to real time, and could help to identify areas of overlap, redundancy, and opportunity. By including projects that were unfunded, the public would get a sense of the direction in which federal funders are moving and where the government might be underinvested. This could herald a new era of transparency and effectiveness in science investment.
Use Case 5: Results prediction tools. Analytical AI tools could help everyone—scientists, biotech, pharma, investors—predict the topics and timing of future research results and neglected areas of science in ways that protect the confidentiality of federal data.
It is standard practice in pharmaceutical development to predict the timing of clinical trial results based on public information. This approach can work in other research areas, but it is labor-intensive. AI analytics could be applied at scale to specific scientific areas, such as predictions about the timing of results for materials being tested for solar cells or of new technologies in disease diagnosis. AI approaches are especially well suited to technologies that cross disciplines, such as applications of one health technology to multiple organ systems, or one material applied to multiple engineering applications. These models would be even richer if the negative cases—the unfunded research applications—were included in analyses in ways that protect the confidentiality of the failed application. Failed applications may signal where the science is struggling and where definitive results are less likely to appear, or where there are underinvested opportunities.
Plan of Action
Leadership
We recommend that OSTP oversee a multiagency development effort to achieve the overarching goal of fully subjecting grant applications to AI analysis to predict the future of science, enhance peer review, and encourage better research investment decisions by both the public and the private sector. The federal agencies involved should include all the member agencies of the NSTC. A broad array of stakeholders should be engaged because much of the AI expertise exists in the private sector, the data are owned and protected by the government, and the beneficiaries of the tools would be both public and private. We anticipate four stages to this effort.
Recommendation 1. Agency Development
Pilot: Each agency should develop pilots of one or more use cases to test and optimize training sets and output tools for each user group. We recommend this initial approach because each funding agency has different baseline capabilities to make application data available to AI tools and may also have different scientific considerations. Despite these differences, all federal science funding agencies have large archives of applications in digital formats, along with records of the publications and research data attributed to those awards.
These use cases are relatively new applications for AI and should be empirically tested before broad implementation. Trend mapping and predictive models can be built with a subset of historical data and validated with the remaining data. Decision support tools for funders, applicants, and reviewers need to be tested not only for their accuracy but also for their impact on users. Therefore, these decision support tools should be considered as a part of larger empirical efforts to improve the peer review process.
Solidify source data: Agencies may need to enhance their data systems to support the new functions for full implementation. OSTP would need to coordinate the development of data standards to ensure all agencies can combine data sets for related fields of research. Agencies may need to make changes to the structure and processing of applications, such as ensuring that sections to be used by the AI are machine-readable.
Recommendation 2. Prizes and Public–Private Partnerships
OSTP should coordinate the convening of private sector organizations to develop a clear vision for the profound implications of opening funded and failed research award applications to AI, including predicting the topics and timing of future research outputs. How will this technology support innovation and more effective investments?
Research agencies should collaborate with private sector partners to sponsor prizes for developing the most useful and accurate tools and user interfaces for each use case refined through agency development work. Prize submissions could use test data drawn from existing full-text applications and the research outputs arising from those applications. Top candidates would be subject to standard selection criteria.
Conclusion
Research applications are an untapped and tremendously valuable resource. They describe work plans and are clearly linked to specific research products, many of which, like research articles, are already rigorously indexed and machine-readable. These applications are data that can be used to optimize research funding decisions and to develop insight into future innovations. With these data and emerging AI technologies, we will be able to understand the trajectory of our science with unprecedented breadth and insight, perhaps even approaching the accuracy with which human experts can foresee changes within a narrow area of study. However, maximizing the benefit of this information is not inevitable, because the source data is currently closed to AI innovation. It will take vision and resources to build effectively from these closed systems—our federal science agencies have both, and with some leadership, they can realize the full potential of these applications.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS.
Rebuild Corporate Research for a Stronger American Future
The American research enterprise, long the global leader, faces intensifying competition and mounting criticism regarding its productivity and relevance to societal challenges. At the same time, a vital component of a healthy research enterprise has been lost: corporate research labs, epitomized by the iconic Bell Labs of the 20th century. Such labs uniquely excelled at reverse translational research, where real-world utility and problem-rich environments served as powerful inspirations for fundamental learning and discovery. Rebuilding such labs in a 21st century “Bell Labs X” form would restore a powerful and uniquely American approach to technoscientific discovery—harnessing the private sector to discover and invent in ways that fundamentally improve U.S. national and economic competitiveness. Moreover, new metaresearch insights into “how to innovate how we innovate” provide principles that can guide their rebuilding. The White House Office of Science and Technology Policy (OSTP) can help turn these insights into reality by convening a working group of stakeholders (philanthropy, business, and science agency leaders), alongside policy and metascience scholars, to make practical recommendations for implementation.
Challenge and Opportunity
The American research enterprise faces intensifying competition and mounting criticism regarding its productivity and relevance to societal challenges. While a number of reasons have been proposed, among the most important is that corporate research labs, a vital piece of a healthy research enterprise, are missing. Exemplified by Bell Labs, these labs dominated the research enterprise of the first half of the 20th century but became defunct in the second half. The reason: the formalization of profit as the prime goal of corporations, which is incompatible with research, particularly the basic research that produces public-goods science and technology. Instead, academic research is now dominant. The reason: the rise of federal agencies like the National Science Foundation (NSF) with a near-total focus on academia. This dynamic, however, is not fundamental: federal agencies could easily fund research at corporations, not just in academia.
Moreover, there is a compelling reason to do so. Utility and learning are cyclical and build on each other. In one direction, learning serves as a starting point for utility; academia excels at such translational research. In the other direction, utility serves as a starting point for learning; corporations, in principle, excel at such reverse translational research. Corporations are where utility lives and breathes, and where problem-rich, real-world environments and inspiration for learning thrive. This reverse translational half of the utility-learning cycle, however, is currently nearly absent: a critical void that corporate research could fill.
For example, at Bell Labs circa WWII, Claude Shannon’s exposure to real-world problems in cryptography and noisy communications inspired his surprising idea to treat information as a quantifiable and manipulable entity independent of its physical medium, revolutionizing information science and technology. Similarly, Mervyn Kelly’s exposure to the real-world benefit of compact and reliable solid-state amplifiers inspired him to create a research activity at Bell Labs that invented the transistor and discovered the transistor effect. These advances, inspired by real-world utility, laid the foundations for our modern information age.
Importantly, these advances were given freely to the nation because Bell Labs’ host corporation, the AT&T of the 20th century, was a monopoly and could be altruistic with respect to its research. Now, in the 21st century, corporations, even when they have dominant market power, are subject to intense competitive pressures on bottom-line profits that make it difficult for them to engage in research that is given freely to the nation. But to throw away corporate research along with the monopolies that could afford it is to throw out the baby with the bathwater. Instead, the challenge is to rebuild corporate research in a 21st century “Bell Labs X” form without relying on monopolies, using public-private partnerships instead.
Moreover, new insights into the nature and nurture of research provide principles that can guide the creation of such public-private partnerships for the purpose of public-goods research.
- Inspire, but Don’t Constrain, Research by Particular Use. Reverse-translational research should start with real-world challenges but not be constrained by them as it seeks the greatest advances in learning—advances that surprise and contradict prevailing wisdom. This principle combines Donald Stokes’ “use-inspired research” with Ken Stanley and Joel Lehman’s “why greatness cannot be planned,” and with Gold Standard Science’s informed contrariness and dissent.
- Fund and Execute Research at the Institution, not Individual Researcher, Level. This would be very different from the dominant mode of research funding in the U.S.: matrix-funding to principal investigators (PIs) in academia. Here, instead, research funding would be to research institutes that employ researchers rather than contract with researchers employed by other institutions. Leadership would be empowered to nurture and orchestrate the people, culture, and organizational structure of the institute for the singular purpose of empowering researchers to achieve groundbreaking discoveries.
- Evolve Research Institutions by Retrospective, Competitive Reselection. There should be many research institutes and none should have guaranteed perpetual funding. Instead, they should be subject to periodic evaluation “with teeth” where research institutions only continue to receive support if they are significantly changing the way we think and/or do. This creates a dynamic market-like ecosystem within which the population of research institutes evolves in response to a competitive re-selection pressure towards ever-increasing research productivity.
Plan of Action
The White House Office of Science and Technology Policy (OSTP) should convene a working group of stakeholders, alongside policy and metaresearch scholars, to make practical recommendations for public-private partnerships that enable corporate research akin to the Bell Labs of the 20th century, but in a 21st century “Bell Labs X” form.
Among the stakeholders would be government agencies, corporations and philanthropies—perhaps along the lines of the Government-University-Industry-Philanthropy Research Roundtable (GUIPRR) of the National Academies of Sciences, Engineering and Medicine (NASEM).
Importantly, the working group does not need to start from scratch. A high-level funding and organizational model was recently articulated.
Its starting point is the initial selection of ten or so Bell Labs Xs based on their potential for major advances in public-goods science and technology. Each Bell Labs X would be hosted and cost-shared by a corporation that brings with it its problem-rich use environment and state-of-the-art technological contexts, but majority block-funded by a research funder (federal agencies and/or philanthropies) with broad societal benefit in mind. To establish a sense of scale, we might imagine each Bell Labs X having a $120M/year operating budget and a 20% cost share—so $20M/year coming from the corporate host and $100M/year coming from the research funder.
This plan also envisions a market-like competitive renewal structure for these corporate research labs. At the end of a period appropriate for long-term basic research (say, ten years), all ten or so Bell Labs Xs would be evaluated for their contributions to public-goods science and technology, independent of their contributions to the commercial applications of the host corporation. Only the most productive seven or eight of the ten would be renewed. Between these periodic reselections, the leadership of each Bell Labs X would be free to nurture its people, culture, and organizational structure as it believes will maximize research productivity. Each Bell Labs X would thus be an experiment in research institution design. And each Bell Labs X would make its own bet on the knowledge domain it believes is ripe for the greatest disruptive advances. Government’s role would be largely confined to retrospectively rewarding or penalizing the Bell Labs Xs that made better or worse bets, without itself making bets.
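The reselection mechanism can be made concrete with a toy simulation. The productivity scores and the 70% renewal rate below are illustrative assumptions chosen only to show the mechanics; they are not parameters of the proposal itself.

```python
def reselect(lab_scores, keep_fraction=0.7):
    """Renew the most productive labs; the rest lose block funding.

    lab_scores maps lab name -> productivity score for the evaluation period.
    """
    n_keep = max(1, round(len(lab_scores) * keep_fraction))
    ranked = sorted(lab_scores, key=lab_scores.get, reverse=True)
    return ranked[:n_keep]

scores = {f"lab-{i}": s for i, s in enumerate([9, 3, 7, 8, 2, 6, 5, 4, 1, 10])}
renewed = reselect(scores)
# 7 of the 10 labs are renewed; the 3 lowest scorers exit the network,
# freeing funding for new entrants in the next selection round
```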
Conclusion
Imagine a private institution whose researchers routinely disrupted knowledge and changed the world. That’s the story of Bell Labs—a legendary research institute that gave us scientific and technological breakthroughs we now take for granted. In its heyday in the mid-20th century, Bell Labs was a crucible of innovation where brilliant minds were exposed to and inspired by real-world problems, then given the freedom to explore those problems in deep and fundamental ways, often pivoting to and solving unanticipated new problems of even greater importance.
Recreating that innovative environment is possible, and its impact on American research productivity would be profound. By innovating how we innovate, we would leap-frog other nations that are investing heavily in their own research productivity but are largely copying the structure of the current U.S. research enterprise. The resulting network of Bell Labs Xs would flip the relationship between corporations and the nation’s public-goods science and technology: asking not what the nation’s public-goods science and technology can do for corporations, but what corporations can do for the nation’s public-goods science and technology. Disruptive and useful ideas are not getting harder to find; our current research enterprise is just not well optimized to find them.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS.
Bounty Hunters for Science
Fraud in scientific research is more common than we’d like to think. Fraudulent research can mislead entire scientific fields for years, driving futile and wasteful follow-up studies and slowing real scientific discoveries. To truly push the boundaries of knowledge, researchers should be able to base their theories and decisions on a more trustworthy scientific record.
Currently there are insufficient incentives to identify fraud and correct the record. Meanwhile, fraudsters can continue to operate with little chance of being caught. That should change: Scientific funders should establish one or more bounty programs aimed at rewarding people who identify significant problems with federally-funded research, and should particularly reward fraud whistleblowers whose careers are on the line.
Challenge and Opportunity
In 2023 it was revealed that 20 papers from Hoau-Yan Wang, an influential Alzheimer’s researcher, were marred by doctored images and other scientific misconduct. Shockingly, his research led to the development of a drug that was tested on 2,000 patients. A colleague described the situation as “embarrassing beyond words”.
There is a common belief that science is self-correcting. But what’s interesting about this case is that the scientist who uncovered Wang’s fraud was not driven by the usual academic incentives. He was being paid by Wall Street short sellers who were betting against the drug company!
This was not an isolated incident. The most notorious example of Alzheimer’s research misconduct – doctored images in Sylvain Lesné’s papers – was also discovered with the help of short sellers. And as reported in Science, Lesné’s “paper has been cited in about 2,300 scholarly articles—more than all but four other Alzheimer’s basic research reports published since 2006, according to the Web of Science database. Since then, annual NIH support for studies labeled ‘amyloid, oligomer, and Alzheimer’s’ has risen from near zero to $287 million in 2021.” While not all of that research was motivated by Lesné’s paper, a paper with that many citations almost certainly had some effect on the direction of the field.
These cases show how a critical part of the scientific ecosystem – the exposure of faked research – can be undersupplied by ordinary science. Unmasking fraud is a difficult and awkward task, and few people want to do it. But financial incentives can help close those gaps.
Plan of Action
People who witness scientific fraud often stay silent due to perceived pressure from their colleagues and institutions. Whistleblowing is an undersupplied part of the scientific ecosystem.
We can correct these incentives by borrowing an idea from the Securities and Exchange Commission, whose bounty program around financial fraud pays whistleblowers 10-30% of the fines imposed by the government. The program has been a huge success, catching dozens of fraudsters and reducing the stigma around whistleblowing. The Department of Justice has recently copied the model for other types of fraud, such as healthcare fraud. The model should be extended to scientific fraud.
- Funder: Any U.S. government funding agency, such as NIH or NSF
- Eligibility: Research employees with insider knowledge from having worked in a particular lab
- Cost: The program should ultimately pay for itself, both through the recoupment of grant expenditures and through the impacts on future funding, including, potentially, the trajectory of entire academic fields.
The amount of the bounty should vary with the scientific field and the nature of the whistleblower in question. For example, compare the following two situations:
- An undergraduate whistleblower who identifies a problem in a psychology or education study that hardly anyone had cited, let alone implemented in the real world
- A graduate student or postdoc who calls out their own mentor for academic fraud related to influential papers on Alzheimer’s disease or cancer.
The stakes are higher in the latter case. Few graduate students or post-docs will ever be willing to make the intense personal sacrifice of whistleblowing on their own mentor and adviser, potentially forgoing approval of their dissertation or future recommendation letters for jobs. If we want such people to be empowered to come forward despite the personal stakes, we need to make it worth their while.
Suppose that one of Lesné’s students in 2006 had been rewarded with a significant bounty for direct testimony about the image manipulation and fraud that was occurring. That reward might have saved tens of millions in future NIH spending, and would have been more than worth it. In actuality, as we know, none of Lesné’s students or postdocs ever had the courage to come forward in the face of such immense personal risk.
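For a sense of scale, here is a hedged back-of-envelope sketch. The 10–30% award range mirrors the SEC program described above, and the $287 million figure is the cited 2021 NIH amyloid-related spending; the $5 million recouped-grant amount is purely a hypothetical assumption for illustration.

```python
# Back-of-envelope: what would an SEC-style bounty cost relative to the
# spending it could redirect? All inputs are illustrative assumptions
# except the $287M subfield figure cited above.

def bounty_range(recouped_funds, low=0.10, high=0.30):
    """SEC-style award: 10-30% of funds recovered by the government."""
    return recouped_funds * low, recouped_funds * high

# Hypothetical: a fraudulent grant portfolio of $5M is clawed back.
lo, hi = bounty_range(5_000_000)
print(f"Bounty would range from ${lo:,.0f} to ${hi:,.0f}")

# Compare with one year of NIH spending on the affected subfield
# ($287M for 'amyloid, oligomer, and Alzheimer's' research in 2021).
annual_subfield_spending = 287_000_000
print(f"Even the maximum bounty is {hi / annual_subfield_spending:.1%} "
      f"of a single year's subfield spending")
```

Under these assumptions, even the largest award is a rounding error next to the misdirected research funding it could prevent.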
The Office of Research Integrity at the Department of Health and Human Services should be funded to create a bounty program for all HHS-funded research at NIH, CDC, FDA, or elsewhere. ORI’s budget is currently around $15 million per year. That should be increased by at least $1 million to account for a significant number of bounties plus at least one full-time employee to administer the program.
Conclusion
Some critics might say that science works best when it’s driven by people who are passionate about truth for truth’s sake, not for the money. But by this point it’s clear that like anyone else, scientists can be driven by incentives that are not always aligned with the truth. Where those incentives fall short, bounty programs can help.
Confirming Hope: Validating Surrogate Endpoints to Support FDA Drug Approval Using an Inter-Agency Approach
To enable more timely access to new drugs and biologics, clinical trials are increasingly using surrogate markers in lieu of traditional clinical outcomes that directly measure how patients feel, function, or survive. Surrogate markers, such as imaging findings or laboratory measurements, are expected to predict clinical outcomes of interest. In comparison to clinical outcomes, surrogate markers offer an advantage in reducing the duration, size, and total cost of trials. Surrogate endpoints are considered to be “validated” if they have undergone extensive testing that confirms their ability to predict a clinical outcome. However, reviews of “validated” surrogate markers used as primary endpoints in trials supporting U.S. Food and Drug Administration (FDA) approvals suggest that many lack sufficient evidence of being associated with a clinical outcome.
Since 2018, FDA has regularly updated the publicly available “Table of Surrogate Endpoints That Were the Basis of Drug Approval or Licensure”, which includes over 200 surrogate markers that have been or would be accepted by the agency to support approval of a drug or biologic. Not included within the table is information regarding the strength of evidence for each surrogate marker and its association with a clinical outcome. As surrogate markers are increasingly being accepted by FDA to support approval of new drugs and biologics, it is imperative that patients and clinicians understand whether such novel endpoints are reflective of meaningful clinical benefits. Thus, FDA, in collaboration with other agencies, should take steps to increase transparency regarding the strength of evidence for surrogate endpoints used to support product approvals, routinely reassess the evidence behind such endpoints to continue justifying their use in regulatory decision-making, and sunset those that fail to show association with meaningful clinical outcomes. Such transparency would not only benefit the public, clinicians, and the payers responsible for coverage decisions, but also help shape the innovation landscape for drug developers to design clinical trials that assess endpoints truly reflective of clinical efficacy.
Challenge and Opportunity
To receive regulatory approval by FDA, new therapeutics are generally required to be supported by “substantial evidence of effectiveness” from two or more “adequate and well-controlled” pivotal trials. However, FDA has maintained a flexible interpretation of this guidance to enable timely access to new treatments. New drugs and biologics can be approved for specific disease indications based on pivotal trials measuring clinical outcomes (how patients feel, function, or survive). They can also be approved based on pivotal trials measuring surrogate markers that are meant to be proxy measures and expected to predict clinical outcomes. Examples of such endpoints include changes in tumor size as seen on imaging or blood laboratory tests such as cholesterol.
Surrogate markers are considered “validated” when sufficient evidence demonstrates that the endpoint reliably predicts clinical benefit. Such validated surrogate markers are typically the basis of traditional FDA therapeutics approval. However, FDA has also accepted “unvalidated” surrogate endpoints that are reasonably likely to predict clinical benefit as the basis of approval of new therapeutics, particularly those intended to treat or prevent a serious or life-threatening disease. Under expedited review pathways such as accelerated approval, which grant manufacturers faster FDA market authorization based on unvalidated surrogate markers, manufacturers are required to complete an additional clinical trial after approval to confirm the predicted clinical benefit. Should the manufacturer fail to do so, FDA has the authority to withdraw approval of that drug’s indication.
For drug developers, the use of surrogate markers in clinical trials can shorten the duration, size, and total cost of the pivotal trial. Over time, FDA has increasingly allowed for surrogate markers to be used as primary endpoints in pivotal trials, allowing for shorter clinical trial testing periods and thus faster market access. Moreover, use of unvalidated surrogate markers has grown outside of expedited review pathways such as accelerated approval. One analysis of FDA approved drugs and biologics that received “breakthrough therapy designation” found that among those that received traditional approval, over half were based on pivotal trials using surrogate markers.
While basing FDA approval on surrogate markers can enable more timely market access to novel therapeutics, such endpoints also involve certain trade-offs, including the risk of making erroneous inferences and diminishing certainty about the medical product’s long-term clinical effect. In oncology, evidence suggests that most validation studies of surrogate markers find low correlations with meaningful clinical outcomes such as overall survival or a patient’s quality of life. For instance, in a review of 15 surrogate validation studies conducted by the FDA for oncologic drugs, only one was found to demonstrate a strong correlation between surrogate markers and overall survival. Another study suggested that there are weak or missing correlations between surrogate markers for solid tumors and overall survival. A more recent evaluation found that most surrogate markers used as primary endpoints in clinical trials to support FDA approval of drugs treating non-oncologic chronic disease lack high-strength evidence of associations with clinical outcomes.
Section 3011 of the 21st Century Cures Act of 2016 amended the Federal Food, Drug, and Cosmetic Act to mandate that FDA publish a list of “surrogate endpoints which were the basis of approval or licensure (as applicable) of a drug or biological product” under both accelerated and traditional approval pathways. While FDA has posted surrogate endpoint tables for adult and pediatric disease indications that fulfill this legislative requirement, missing within these tables is any justification for surrogate selection, including evidence supporting validation. Without this information, patients, prescribers, and payers are left uncertain about the actual clinical benefit of therapeutics approved by the FDA based on surrogate markers. Instead, drug developers have continued to use this table as a guide in designing their clinical trials, viewing the included surrogate markers as “accepted” by the FDA regardless of the evidence (or lack thereof) undergirding them.
Plan of Action
Recommendation 1. FDA should make more transparent the strength of evidence of surrogate markers included within the “Adult Surrogate Endpoint Table” as well as the “Pediatric Surrogate Endpoint Table.”
Previously, agency officials stated that the use of surrogate markers to support traditional approvals was usually based, at a minimum, on evidence from meta-analyses of clinical trials demonstrating an association between surrogate markers and clinical outcomes for validation. However, more recently, FDA officials have indicated that they consider a “range of sources, including mechanistic evidence that the [surrogate marker] is on the causal pathway of disease, nonclinical models, epidemiologic data, and clinical trial data, including data from the FDA’s own analyses of patient- and trial-level data to determine the quantitative association between the effect of treatment on the [surrogate marker] and the clinical outcomes.” Nevertheless, the published tables of surrogate endpoints do not indicate what specific evidence was considered or how the agency weighed it, leaving drug developers, as well as patients, clinicians, and payers, unclear about the strength of the evidence behind such endpoints. This is an opportunity for the agency to enhance its transparency and communication with the public.
FDA should issue a guidance document detailing their current thinking about how surrogate markers should be validated and evaluated on an ongoing basis. Within the guidance, the agency could detail the types of evidence that would be considered to establish surrogacy.
FDA should also include within the tables of surrogate endpoints a summary of evidence for each surrogate marker listed. This would provide justification (through citations to relevant articles or internal analyses) so that all stakeholders understand the evidence establishing surrogacy. Moreover, FDA can clearly indicate within the tables which clinical outcomes each listed surrogate marker is thought to predict.
FDA should also publicly report on an annual basis a list of therapeutics approved by the agency based on clinical trials using surrogate markers as primary endpoints. This coupled with the additional information around strength of evidence for each surrogate marker would allow patients and clinicians to make more informed decisions around treatments where there may be uncertainty of the therapeutic’s clinical benefit at the time of FDA approval.
Recently, FDA’s Oncology Center of Excellence, through Project Confirm, has made additional efforts to communicate the status of required postmarketing studies meant to confirm the clinical benefit of drugs for oncologic disease indications that received accelerated approval. FDA could expand this across therapeutic areas and approval pathways by publishing a list of ongoing postmarketing studies intended to confirm clinical benefit for therapeutics whose approval was based on surrogate markers.
FDA should also regularly convene advisory committees to allow independent experts to review and vote on recommendations around the use of new surrogate markers for disease indications. Additionally, FDA should regularly convene these advisory committees to re-evaluate the use of surrogate markers based on current evidence, especially those not supported by high-strength evidence demonstrating their association with clinical outcomes. At a minimum, FDA should annually convene such advisory committees to re-examine the surrogate markers listed on its publicly available tables. In 2024, FDA convened the Oncologic Drugs Advisory Committee to discuss the use of minimal residual disease, a surrogate marker, as an endpoint for multiple myeloma. Further such meetings, including for “unvalidated” endpoints, would give FDA the opportunity to re-examine their use in regulatory decision-making.
Recommendation 2. In collaboration with the FDA, other federal research agencies should contribute evidence generation to determine whether surrogate markers are appropriate for use in regulatory decision-making, including approval of new therapeutic products and indications for use.
Drug manufacturers that receive FDA approval for products based on unvalidated surrogate markers have little incentive to conduct studies that might demonstrate a lack of association between those surrogate markers and clinical outcomes. To address this, the Department of Health and Human Services (HHS) should establish an interagency working group including FDA, the National Institutes of Health (NIH), the Patient-Centered Outcomes Research Institute (PCORI), the Advanced Research Projects Agency for Health (ARPA-H), the Centers for Medicare and Medicaid Services (CMS), and other agencies engaged in biomedical and health services research. These agencies could collaboratively conduct or commission meta-analyses of existing clinical trials to determine whether there is sufficient evidence to establish surrogacy. Such publicly funded studies would then be brought to FDA advisory committees for members to consider in recommending whether various surrogate endpoints are valid or whether endpoints without sufficient evidence should be sunset. NIH in particular should prioritize funding large-scale trials aimed at validating important surrogate outcomes.
Through regular collaboration and convening, FDA can help guide the direction of resources towards investigating surrogate markers of key regulatory as well as patient, clinician, and payer interest to strengthen the science behind novel therapeutics. Such information would also be invaluable to drug developers in identifying evidence-based endpoints as part of their clinical trial design, thus contributing to a more efficient research and development landscape.
Recommendation 3. Congress should build upon the provisions related to surrogate markers that passed as part of the 21st Century Cures Act of 2016 in their “Cures 2.0” efforts.
The aforementioned interagency working group convened by HHS could be explicitly authorized through legislation, coupled with funding specifically for surrogate marker validation studies. Congress should also mandate that FDA and other federal health agencies re-evaluate listed surrogate endpoints on an annual basis, with additional reporting requirements. Additionally, legislation could grant FDA explicit authority to sunset endpoints for which there is no clear evidence of surrogacy, preventing future drug candidates from establishing efficacy based on flawed endpoints. Congress should also require routine reporting from FDA on the status of the interagency working group focused on surrogate endpoints, as well as other metrics, including a list of new therapeutic approvals based on surrogate markers, expansion of the existing surrogate marker tables on FDA’s website to include the evidence of surrogacy, and issuance of a guidance document detailing what scientific evidence the agency would consider in validating and re-evaluating surrogate markers.
Conclusion
FDA has increasingly allowed new drugs and biologics to be approved based on surrogate markers that are meant to predict meaningful clinical outcomes demonstrating that patients feel better, function better, and survive longer. Although the agency has made clearer which surrogate endpoints could be or are being used to support approval, significant gaps exist in the evidence demonstrating that these novel endpoints are associated with meaningful clinical outcomes. Continued use of surrogate endpoints with little association with clinical benefit leaves patients, clinicians, and the payers responsible for coverage decisions without assurance that novel therapeutics approved by FDA are meaningfully effective. Transparency about the evidence supporting clinical endpoints is urgently needed to mitigate this uncertainty around new drug approvals, including for drug developers as they continue clinical trials for therapeutic candidates seeking FDA approval. FDA should, in collaboration with other federal biomedical research agencies, routinely re-evaluate surrogate endpoints to determine their continued use in therapeutic innovation. Such regular re-evaluation will strengthen FDA’s credibility and ensure the accountability of an agency tasked with ensuring the safety and efficacy of drugs and other medical products, as well as with shaping the innovation landscape.
Yes. In 2016, eteplirsen (Exondys 51) was granted accelerated approval for the treatment of Duchenne muscular dystrophy (DMD) against the recommendation of an advisory committee and FDA’s own scientific staff. Concerns were raised that the approval was based on a small clinical trial showing that eteplirsen led to a small increase in the protein dystrophin, a surrogate marker. Three additional approvals for similar DMD drugs have been made based on the same surrogate endpoint. However, no studies have been completed confirming clinical benefit.
In 2021, aducanumab (Aduhelm) was granted accelerated approval for the treatment of Alzheimer’s disease against the recommendation of an advisory committee and FDA’s scientific staff. Concerns were raised that the approval was based on a surrogate marker, beta-amyloid levels, which has not been found to correlate with cognitive or function changes for Alzheimer’s disease patients. In particular, FDA’s internal statistical review team found no association between changes to the surrogate marker and the clinical outcomes reported in pivotal trials.
Industry may claim that such re-evaluation, and the potential removal of unvalidated surrogate endpoints, would slow the pace of innovation and thus patient access to novel therapeutics. More likely, it would enable more efficient drug development by providing manufacturers, particularly smaller companies, with surrogate endpoints that not only decrease the duration and cost of clinical trials but also have strong evidence of association with meaningful clinical outcomes. It may also mitigate the need for postmarketing requirements meant to confirm clinical benefit, if adequate validation is conducted through FDA and other federal agencies.
No. Having FDA collaborate with other federal health agencies to validate surrogate endpoints would not halt the use of unvalidated surrogate endpoints reasonably likely to predict clinical benefit. Expedited regulatory pathways such as accelerated approval, which are codified in law and allow manufacturers to use unvalidated surrogate markers as endpoints in pivotal clinical trials, will still be available. Instead, this creates a process of re-evaluation so that unvalidated surrogate endpoints are not left unvalidated forever, but are examined in a timely manner to inform their continued use in supporting FDA approval. Ultimately, patients and clinicians want drugs that meaningfully treat or prevent a disease or condition. Routine re-evaluation and validation of surrogate endpoints would provide assurance that therapeutics approved on the basis of these novel endpoints are clinically effective.
FDA’s function as a regulator is to evaluate the evidence brought before it by industry sponsors. To do so effectively, the evidence must be available. This is often not the case for new surrogate markers, as sponsors may have little commercial incentive to generate such evidence, especially if a surrogate endpoint might be found after approval not to be associated with a meaningful clinical outcome. Thus, multiple federal biomedical research agencies, including NIH and ARPA-H, can play an instrumental role alongside FDA in conducting or funding studies that test the association between surrogate markers and clinical outcomes. Already, several institutes within the NIH are engaged in biomarker development and in supporting validation. Collaboration between NIH institutes with relevant expertise, other agencies engaged in translational research, and FDA will enable validation of surrogate markers to inform regulatory decision-making for novel therapeutics.
Under the Prescription Drug User Fee Act VII, passed in 2022, FDA was authorized to establish the Rare Disease Endpoint Advancement (RDEA) pilot program. The program is intended to foster the development of novel endpoints for rare diseases through FDA collaboration with industry sponsors proposing novel endpoints for a drug candidate, opportunities for stakeholders including the public to inform such endpoint development, and greater FDA staff capacity to help develop novel endpoints for rare diseases. Such a pilot program could be expanded not only to develop novel endpoints, but also to develop approaches for validating novel endpoints such as surrogate markers and communicating the strength of evidence to the public.
Payers such as Medicare have also taken steps to enable postmarket evidence generation, including for drugs approved by FDA based on surrogate endpoints. Following the accelerated approval of aducanumab (Aduhelm), the Centers for Medicare and Medicaid Services (CMS) issued a national coverage determination under the coverage with evidence development (CED) program, conditioning coverage of this class of drugs, approved by FDA based on a surrogate endpoint, on participation in CMS-approved randomized controlled trials assessing meaningful clinical outcomes. Further evaluation of surrogate endpoints informing FDA approval can benefit payers as they make coverage decisions. Additionally, coverage and reimbursement could be tied to the evidence for such surrogate endpoints, providing an additional incentive to complete and communicate the findings from such studies.
A Cross-Health and Human Services Initiative to Cut Wasteful Spending and Improve Patient Lives
Challenge and Opportunity
Many common medical practices do not have strong evidence behind them. In 2019, a group of prominent medical researchers—including Robert Califf, the former Food and Drug Administration (FDA) Commissioner—undertook the tedious task of looking into the level of evidence behind 2,930 recommendations in guidelines issued by the American Heart Association and the American College of Cardiology. They asked one simple question: how many recommendations were supported by multiple small randomized trials or at least one large trial? The answer: 8.5%. The rest were supported by only one small trial, by observational evidence, or just by “expert opinion only.”
For infectious diseases, a team of researchers looked at 1,042 recommendations in guidelines issued by the Infectious Diseases Society of America. They found that only 9.3% were supported by strong evidence. For 57% of the recommendations, the quality of evidence was “low” or “very low.” And to make matters worse, more than half of the recommendations considered low in quality of evidence were still issued as “strong” recommendations.
In oncology, a review of 1,023 recommendations from the National Comprehensive Cancer Network found that “…only 6% of the recommendations … are based on high-level evidence”, suggesting “a huge opportunity for research to fill the knowledge gap and further improve the scientific validity of the guidelines.”
Even worse, there are many cases where not only is a common medical treatment lacking the evidence to support it, but also one or more randomized trials have shown that the treatment is useless or even harmful! One of the most notorious examples is that of the anti-arrhythmic drugs given to millions of cardiac patients in the 1980s. Cardiologists at the time had the perfectly logical belief that since arrhythmia (irregular heartbeat) leads to heart attacks and death, drugs that prevented arrhythmia would obviously prevent heart attacks and death. In 1987, the National Institutes of Health (NIH) funded the Cardiac Arrhythmia Suppression Trial (CAST) to test three such drugs. One of the drugs had to be pulled after just a few weeks, because 17 patients had already died compared with only three in the placebo group. The other two drugs similarly turned out to be harmful, although it took several months to see that patients given those drugs were more than two times as likely to die. According to one JAMA article, “…there are estimates that 20,000 to 75,000 lives were lost each year in the 1980s in the United States alone…” due to these drugs. The CAST trial is a poignant reminder that doctors can be convinced they are doing the best for their patients, but they can be completely wrong if there is not strong evidence from randomized trials.
In 2016, randomized trials of back fusion surgery found that it does not work. But a recent analysis by the Lown Institute found that the Centers for Medicare & Medicaid Services (CMS) spent approximately $2 billion in the past 3 years on more than 200,000 of these surgeries.
There are hundreds of additional examples where medical practice was ultimately proven wrong. Given how few medical practices, even now, are actually supported by strong evidence, there are likely many more examples of treatments that either do not work or actively cause harm. This is not only wasted spending, but also puts patients at risk.
We can do better – both for patients and for the federal budget – if we reduce the use of medical practices that simply do not work.
Plan of Action
The Secretary of Health and Human Services should create a cross-division committee to develop an extensive and prioritized list of medical practices, products, and treatments that need evidence of effectiveness, and then roll out an ambitious agenda to run randomized clinical trials for the highest-impact medical issues.
That is, CMS needs to work with NIH, FDA, and the Centers for Disease Control and Prevention (CDC) to develop a prioritized list of medical treatments, procedures, drugs, and devices with little evidence behind them, for which annual spending is large and the health impacts could be most harmful. Simultaneously, FDA needs to work with its partner agencies to identify drugs, vaccines, and devices in widespread medical use that need rigorous post-market evaluation. This includes drugs with off-label uses, oncology regimens that have never been tested against each other, surrogate outcomes that have not been validated against long-term outcomes, accelerated approvals without the needed follow-up studies, and more.
With priority lists available, the NIH could immediately launch trials to evaluate the effectiveness and safety of the identified treatments and practices. The Department should report to Congress yearly on the number and nature of clinical trials in progress, and eventually the results of those trials (which should also be made available on a public dashboard, along with any resulting savings). The project should be ongoing indefinitely, and over time HHS should explore using artificial intelligence tools to identify the key unstudied medical questions that deserve a high-value clinical trial.
Expected opponents of any such effort will be pharmaceutical, biotechnology, and device companies and their affiliated trade associations, whose products might come under further scrutiny, and professional medical associations who are firmly convinced that their practices should not be questioned. Their lobbying power might be considerable, but the intellectual case for rigorous and unbiased studies is unquestionable, particularly when billions of federal dollars and millions of patients’ lives and health are at stake.
Conclusion
Far too many medical practices and treatments have not been subjected to rigorous randomized trials, and the divisions of Health and Human Services should come together to fix this problem. Doing so will likely lead to billions of dollars in savings and huge improvements to patient health.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Measuring Research Bureaucracy to Boost Scientific Efficiency and Innovation
Bureaucracy has become a critical barrier to scientific progress in America. An excess of management and administrative effort pulls researchers away from their core scientific work and consumes resources that could advance discovery. While we lack systematic measures of this inefficiency, the available data are troubling: researchers spend nearly half their research time on administrative tasks, and nearly one in five dollars of university research budgets goes to regulatory compliance.
The proposed solution is a three-step effort to measure and roll back the bureaucratic burden. First, we need to create a detailed baseline by measuring administrative personnel, management layers, and associated time/costs across government funding agencies and universities receiving grant funding. Second, we need to develop and apply objective criteria to identify specific bureaucratic inefficiencies and potential improvements, based on direct feedback from researchers and administrators nationwide. Third, we need to quantify the benefits of reducing bureaucratic overhead and implement shared strategies to streamline processes, simplify regulations, and ultimately enhance research productivity.
Through this ambitious yet practical initiative, the administration could free up over a million research days annually and redirect billions of dollars toward scientific pursuits that strengthen America’s innovation capacity.
Challenge and Opportunity
Federally funded university scientists spend much of their time navigating procedures and management layers. Scientists, administrators, and policymakers widely agree that bureaucratic burden hampers research productivity and innovation, yet, as the National Academy of Sciences noted in 2016, there is “little rigorous analysis or supporting data precisely quantifying the total burden and cost to investigators and research institutions of complying with federal regulations specific to the conduct of federally funded research.” This continues to be the case, despite evidence suggesting that federally funded faculty spend nearly half of their research time on administrative tasks and that nearly one in every five dollars spent on university research goes to regulatory compliance.
Judging by the steady rise in research administration requirements facing universities, the problem is getting worse. Federal rules and policies affecting research have multiplied nearly ninefold in two decades, from 29 in 2004 to 255 in 2024, with half of the increase coming in just the last five years. It is no coincidence that bureaucratic overhead is also expanding in funding agencies. At the National Institutes of Health (NIH), for instance, the growth of managers and administrators has significantly outpaced scientific roles and research funding activity (see figure).
[Figure: growth of managerial and administrative roles at NIH relative to scientific roles and research funding activity]
The question is: just how much of universities’ $100 billion-plus annual research spend (more than half of it funded by the federal government) is hobbled by excess management and administration? To answer this, we must understand:
- Which bureaucratic activities are wasteful, or have a poor return on time and effort?
- How much time do bureaucratic activities take up, and what is the cost overall?
- Which activities are not required by the law or regulations, but are imposed by overly risk-averse legal counsel, compliance, and other administrators at agencies or universities?
- Which activities, rules, and processes should be eliminated or reimagined, and how?
- What portion of the overhead budget isn’t spent on research administration or management?
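The scale implied by these figures can be sketched with a rough, illustrative model. All inputs below are assumptions: the spend and share figures come from the estimates cited in this memo, while the researcher headcount and workdays per year are hypothetical placeholders, not measured values.

```python
# Rough, illustrative model of the administrative burden.
# Spend and share figures are drawn from estimates cited in this memo;
# the researcher headcount and workdays-per-year are hypothetical.
total_research_spend = 100e9   # universities' annual research spend (USD)
compliance_share = 0.20        # ~1 in 5 research dollars to regulatory compliance
admin_time_fraction = 0.45     # ~half of faculty research time on admin tasks

compliance_cost = total_research_spend * compliance_share
print(f"Compliance cost: ${compliance_cost / 1e9:.0f}B per year")  # → $20B

researchers = 150_000          # hypothetical count of federally funded researchers
workdays_per_year = 230        # hypothetical working days per researcher
admin_days = researchers * workdays_per_year * admin_time_fraction
print(f"Admin burden: {admin_days / 1e6:.1f}M researcher-days per year")
```

Even if only a small fraction of those days and dollars could be recovered, the payoff would run to millions of research days and billions of dollars per year, which is why establishing a precise baseline measurement matters.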
Plan of Action
The current administration aims to make government-funded research more efficient and productive. Recently, the director of the Office of Science and Technology Policy (OSTP) vowed to “reduce administrative burdens on federally funded researchers, not bog them down in bureaucratic box checking.” To that end, I propose a systematic effort that measures bureaucratic excess, quantifies the payoff from eliminating specific aspects of this burden, and improves accountability for results.
The president should issue an Executive Order directing the Office of Management and Budget (OMB) and the Office of Science and Technology Policy (OSTP) to develop a Bureaucratic Burden report within 180 days of signing. The report should detail specific steps agencies will take to reduce administrative requirements. Agencies must participate at the leadership level, making this a government-wide push to reduce bureaucracy. Furthermore, all research agencies should work together to develop a standardized method for calculating burden within both agencies and funded institutions, create a common set of policies that will streamline research processes, and establish clear limits on overhead spending to ensure full transparency in research budgets.
OMB and OSTP should create a cross-agency Research Efficiency Task Force within the National Science and Technology Council to conduct this work. This team would develop a shared approach and lead the data gathering, analysis, and synthesis using consistent measures across agencies. The Task Force’s first step would be to establish a bureaucratic baseline, including a detailed view of the managerial and administrative footprint within federal research agencies and universities that receive research funding, broken down into core components. The measurement approach would certainly vary between government agencies and funding recipients.
Key agencies, including the NIH, National Science Foundation, NASA, the Department of Defense, and the Department of Energy, should:
- Count personnel at each level—managers, administrators, and intramural scientists—along with their compensation;
- Document management layers from executives to frontline staff and supervisor ratios;
- Calculate time spent on administrative work by all staff, including researchers, to estimate total compliance costs and overhead.
Task Force agencies should also hire one or more independent contractors to analyze the administrative burden at a representative sample of universities. Through surveys and interviews, the contractors should measure staffing, management structures, researcher time allocation, and overhead costs to size up the bureaucratic footprint across the scientific establishment.
Next, the Task Force should launch an online consultation with researchers and administrators nationwide. Participants could identify wasteful administrative tasks, quantify their time impact, and share examples of efficient practices. In parallel, agency leaders should submit to OMB and OSTP their formal assessment of which bureaucratic requirements can be eliminated, along with projected benefits.
Finally, the Task Force should produce a comprehensive estimate of the total cost of unnecessary bureaucracy and propose specific reforms. Its recommendations will identify potential savings from streamlining agency practices, statutory requirements, and oversight mechanisms. The Task Force should also examine how much overhead funding supports non-research activities, propose ways to redirect these resources to scientific research, and establish metrics and a public dashboard to track progress.
Some of this information may have already been gathered as part of ongoing reorganization efforts, which would expedite the assessment.
Within six months, the group should issue a public report that would include:
- A detailed estimate of the unnecessary costs of research bureaucracy.
- The cost gains from rolling back or adjusting specific burdens.
- A synthesis of harder-to-quantify benefits from these moves, such as faster approval cycles, better decision-making, and less conservatism in research proposals.
- A catalog of innovative research management practices, with a four-year timeline for studying and scaling them.
- A proposed approach for regular tracking and reporting on bureaucratic burden in science.
- A prioritized list of changes that each agency should make, including a clear timeline for making those changes and the estimated cost savings.
These activities would serve as the start of a series of broad reforms by the White House and research funding agencies to improve federal funding policies and practices.
Conclusion
This initiative will build an irrefutable case for reform, provide a roadmap for meaningful improvement, and create real accountability for results. Giving researchers and administrators a voice in reimagining the system they navigate daily will generate better insights and build commitment for change. The potential upside is enormous: millions of research days could be freed from paperwork for lab work, strengthening America’s capacity to innovate and lead the world. With committed leadership, this administration could transform how the US funds and conducts research, delivering maximum scientific return on every federal dollar invested.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
Improving Research Transparency and Efficiency through Mandatory Publication of Study Results
Scientists are incentivized to produce positive results that journals want to publish, improving their chances of receiving more funding and their likelihood of being hired or promoted. This hypercompetitive system encourages questionable research practices and limits disclosure of all research results. As a result, the results of many funded research studies never see the light of day, and the absence of any written record of failed research leads to systemic waste, as others go down the same wrong path. The Office of Science and Technology Policy (OSTP) should mandate that every grant lead to at least one of two outputs: 1) publication in a journal that accepts null results (e.g., Public Library of Science (PLOS) One, PeerJ, and F1000Research), or 2) public disclosure of the hypothesis, methodology, and results to the funding agency. Linking grants to results creates a more complete picture of what has been tried in any given field of research, improving transparency and reducing duplication of effort.
Challenge and Opportunity
There is ample evidence that null results are rarely published. Mandated publication would ensure all federal grants have outputs, whether hypotheses were supported or not, reducing repetition of ideas in future grant applications. More transparent scientific literature would expedite new breakthroughs and reduce wasted effort, money, and time across all scientific fields. Mandating that all recipients of federal research grants publish results would create transparency about what exactly is being done with public dollars and what the results of all studies were. It would also enable learning about which hypotheses/research programs are succeeding and which are not, as well as the clinical and pre-clinical study designs that are producing positive versus null findings.
Better knowledge of research results could be applied to myriad funding and research contexts. For example, an application for a grant could state that, in a previous grant, an experiment was not conducted because previous experiments did not support it, or alternatively, that the experiment was conducted but produced a null result. In both scenarios, the outcome should be reported, either as a publication indexed in PubMed or as a disclosure to federal science funding agencies. In another context, an experiment might be funded across multiple labs, but only the labs that obtain positive results end up publishing. Mandatory publication would enable an understanding of how robust the result is across different laboratory contexts and nuances in study design, and also why the result was positive in some contexts and null in others.
Pressure to produce novel and statistically significant results often leads to questionable research practices, such as not reporting null results (a form of publication bias), p-hacking (a statistical practice where researchers manipulate analytical or experimental procedures to find significant results that support their hypothesis, even if the results are not meaningful), hypothesizing after results are known (HARKing), outcome switching (changes to outcome measures), and many others. The replication and reproducibility crisis in science presents a major challenge for the scientific community—questionable results undermine public trust in science and create tremendous waste as the scientific community slowly course-corrects for results that ultimately prove unreliable. Studies have shown that a substantial portion of published research findings cannot be replicated, raising concerns about the validity of the scientific evidence base.
In preclinical research, one survey of 454 animal researchers estimated that 50% of animal experiments are not published, and that one of the most important causes of non-publication was a lack of statistical significance (“negative” findings). The prevalence of these issues in preclinical research undoubtedly plays a role in poor translation to the clinic as well as duplicative efforts. In clinical trials, a recent study found that 19.2% of cancer phase 3 randomized controlled trials (RCTs) had primary end point changes (i.e., outcome switching), and 70.3% of these did not report the changes in their resulting manuscripts. These changes had a statistically significant relationship with trial positivity, indicating that they may have been carried out to present positive results. Other work examining RCTs more broadly found one-third with clear inconsistencies between registered and published primary outcomes. Beyond outcome switching, many trials include “false” data. Among 526 trials submitted to the journal Anaesthesia from February 2017 to March 2020, 73 (14%) had false data, including “the duplication of figures, tables and other data from published work; the duplication of data in the rows and columns of spreadsheets; impossible values; and incorrect calculations.”
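The two outcome-switching percentages above can be combined into a single figure. This is simple arithmetic on the numbers cited, not new data from the studies themselves:

```python
# Combine the cited rates: share of phase 3 cancer RCTs that both changed
# a primary endpoint AND did not report the change in the manuscript.
endpoint_change_rate = 0.192   # trials with primary endpoint changes
unreported_share = 0.703       # of those, the share not reporting the change

silent_switch_rate = endpoint_change_rate * unreported_share
print(f"{silent_switch_rate:.1%}")   # → 13.5%

# Sanity check on the false-data figure from Anaesthesia submissions.
false_data_rate = 73 / 526
print(f"{false_data_rate:.1%}")      # → 13.9%
```

In other words, roughly one in seven of these trials silently switched its primary endpoint, a rate comparable in scale to the false-data figure, underscoring how common undisclosed deviations are.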
Mandatory publication for all grants would help change the incentives that drive the behavior in these examples by fundamentally altering the research and publication processes. At the conclusion of a study that obtained null results, this scientific knowledge would be publicly available to scientists, the public, and funders. All grant funding would have outputs. Scientists could not then repeatedly apply for grants based on failed previous experiments, and they would be less likely to receive funding for research projects that have already been tried, and failed, by others. The cumulative, self-correcting nature of science cannot be fully realized without transparency around what worked and what did not work.
Adopting mandatory publication of results from federally funded grants would also position the U.S. as a global leader in research integrity, matching international initiatives such as the UK Reproducibility Network and European Open Science Cloud, which promote similar reforms. By embracing mandatory publication, the U.S. will enhance its own research enterprise and set a standard for other nations to follow.
Plan of Action
Recommendation 1. The White House should issue a directive to federal research funding agencies that mandates public disclosure of research results from all federal grants, including null results, unless they reveal intellectual property or trade secrets. To ensure lasting reform to America’s research enterprise, Congress could pass a law requiring such disclosures.
Recommendation 2. The National Science and Technology Council (NSTC) should develop guidelines for agencies to implement mandatory reporting. Successful implementation requires that researchers be well informed and equipped to navigate this process. NSTC should coordinate with agencies to establish common guidelines that reduce confusion and create a uniform policy. In addition, agencies should create and disseminate detailed guidance documents that outline best practices for reporting null results, including step-by-step instructions on how to prepare and submit null studies to journals (with their differing guidelines) or to federal databases.
Conclusion
Much published research is never replicated because the research system incentivizes the publication of novel, positive results. A tremendous amount of research goes unpublished because of null results, representing an enormous waste of effort, money, and time, and compromising the progress and transparency of our scientific institutions. OSTP should mandate the publication of null results through existing agency authority and funding, and Congress should consider legislation to ensure its longevity.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
It is well understood that most scientific findings cannot be taken at face value until they are replicated or reproduced. To make science more trustworthy, transparent, and replicable, we must change the incentives that reward publishing only positive results. Publication of null results will accelerate the advancement of science.
Scientific discovery is often unplanned and serendipitous, but it is abundantly clear that we can reduce the amount of waste it currently generates. By mandating outputs for all grants, we build a cumulative record of research in which the results of all studies are known, and we can see why an experiment might succeed in one context but not another, allowing the robustness of findings to be assessed across different experimental contexts and labs.
While many agencies prioritize hypothesis-driven research, even exploratory research will produce an output, and these outputs should be publicly available, either as an article or by public disclosure.
Studies that produce null results can still easily share data and code, to be evaluated post-publication by the community to see if code can be refactored, refined, and improved.
The “Cadillac” version of mandatory publication would be the registered reports model, where a study has its methodology peer reviewed before data are collected (Stage 1 Review). Authors are given in-principle acceptance, whereby, as long as the scientist follows the agreed-upon methodology, their study is guaranteed publication regardless of the results. When a study is completed, it is peer reviewed again (Stage 2 Review) simply to confirm the agreed-upon methodology was followed. In the absence of this registered reports model, we should at least mandate transparent publication via journals that publish null results, or via public federal disclosure.