Improving Research Transparency and Efficiency through Mandatory Publication of Study Results
Scientists are incentivized to produce the positive results that journals want to publish, which improves their chances of receiving more funding and of being hired or promoted. This hypercompetitive system encourages questionable research practices and discourages full disclosure of research results. Meanwhile, the results of many funded research studies never see the light of day, and the absence of any written record of failed research leads to systemic waste, as others go down the same wrong path. The Office of Science and Technology Policy (OSTP) should mandate that all grants lead to at least one of two outputs: 1) publication in a journal that accepts null results (e.g., Public Library of Science (PLOS) One, PeerJ, and F1000Research), or 2) public disclosure of the hypothesis, methodology, and results to the funding agency. Linking grants to results creates a more complete picture of what has been tried in any given field of research, improving transparency and reducing duplication of effort.
Challenge and Opportunity
There is ample evidence that null results are rarely published. Mandated publication would ensure all federal grants have outputs, whether hypotheses were supported or not, reducing repetition of ideas in future grant applications. More transparent scientific literature would expedite new breakthroughs and reduce wasted effort, money, and time across all scientific fields. Mandating that all recipients of federal research grants publish results would create transparency about what exactly is being done with public dollars and what the results of all studies were. It would also enable learning about which hypotheses/research programs are succeeding and which are not, as well as the clinical and pre-clinical study designs that are producing positive versus null findings.
Better knowledge of research results could be applied to myriad funding and research contexts. For example, an application for a grant could state that, in a previous grant, an experiment was not conducted because previous experiments did not support it, or alternatively, that the experiment was conducted but produced a null result. In both scenarios, the outcome should be reported, either as a PubMed-indexed publication or as a disclosure to federal science funding agencies. In another context, an experiment might be funded across multiple labs, but only the labs that obtain positive results end up publishing. Mandatory publication would enable an understanding of how robust the result is across different laboratory contexts and nuances in study design, and also why the result was positive in some contexts and null in others.
Pressure to produce novel and statistically significant results often leads to questionable research practices, such as not reporting null results (a form of publication bias), p-hacking (a statistical practice where researchers manipulate analytical or experimental procedures to find significant results that support their hypothesis, even if the results are not meaningful), hypothesizing after results are known (HARKing), outcome switching (changes to outcome measures), and many others. The replication and reproducibility crisis in science presents a major challenge for the scientific community—questionable results undermine public trust in science and create tremendous waste as the scientific community slowly course-corrects for results that ultimately prove unreliable. Studies have shown that a substantial portion of published research findings cannot be replicated, raising concerns about the validity of the scientific evidence base.
In preclinical research, one survey of 454 animal researchers estimated that 50% of animal experiments are not published, and that one of the most important causes of non-publication was a lack of statistical significance (“negative” findings). The prevalence of these issues in preclinical research undoubtedly plays a role in poor translation to the clinic as well as duplicative efforts. In clinical trials, a recent study found that 19.2% of cancer phase 3 randomized controlled trials (RCTs) had primary end point changes (i.e., outcome switching), and 70.3% of these did not report the changes in their resulting manuscripts. These changes had a statistically significant relationship with trial positivity, indicating that they may have been carried out to present positive results. Other work examining RCTs more broadly found one-third with clear inconsistencies between registered and published primary outcomes. Beyond outcome switching, many trials include “false” data. Among 526 trials submitted to the journal Anaesthesia from February 2017 to March 2020, 73 (14%) had false data, including “the duplication of figures, tables and other data from published work; the duplication of data in the rows and columns of spreadsheets; impossible values; and incorrect calculations.”
Mandatory publication for all grants would help change the incentives that drive the behavior in these examples by fundamentally altering the research and publication processes. At the conclusion of a study that obtained null results, this scientific knowledge would be publicly available to scientists, the public, and funders. All grant funding would have outputs. Scientists could not then repeatedly apply for grants based on failed previous experiments, and they would be less likely to receive funding for research projects that have already been tried, and failed, by others. The cumulative, self-correcting nature of science cannot be fully realized without transparency around what worked and what did not work.
Adopting mandatory publication of results from federally funded grants would also position the U.S. as a global leader in research integrity, complementing international initiatives such as the UK Reproducibility Network and the European Open Science Cloud, which promote similar reforms. By embracing mandatory publication, the U.S. will enhance its own research enterprise and set a standard for other nations to follow.
Plan of Action
Recommendation 1. The White House should issue a directive to federal research funding agencies that mandates public disclosure of research results from all federal grants, including null results, unless they reveal intellectual property or trade secrets. To ensure lasting reform to America’s research enterprise, Congress could pass a law requiring such disclosures.
Recommendation 2. The National Science and Technology Council (NSTC) should develop guidelines for agencies to implement mandatory reporting. Successful implementation requires that researchers are well-informed and equipped to navigate this process. NSTC should coordinate with agencies to establish common guidelines for all agencies to reduce confusion and establish a uniform policy. In addition, agencies should create and disseminate detailed guidance documents that outline best practices for reporting null results, including step-by-step instructions on how to prepare and submit null-result studies to journals (and their differing guidelines) or federal databases.
Conclusion
Most published research is not replicated because the research system incentivizes the publication of novel, positive results. A tremendous amount of research goes unpublished because of null results, representing an enormous waste of effort, money, and time, and compromising the progress and transparency of our scientific institutions. OSTP should mandate the publication of null results through existing agency authority and funding, and Congress should consider legislation to ensure its longevity.
This memo was produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
It is well understood that most scientific findings cannot be taken at face value until they are replicated or reproduced. To make science more trustworthy, transparent, and replicable, we must change the incentives that reward publishing only positive results. Publication of null results will accelerate the advancement of science.
Scientific discovery is often unplanned and serendipitous, but it is abundantly clear that we can reduce the amount of waste it currently generates. By mandating outputs for all grants, we expedite a cumulative record of research in which the results of all studies are known, and we can assess the robustness of findings by seeing why an experiment succeeds in one experimental context or lab but not another.
While many agencies prioritize hypothesis-driven research, even exploratory research will produce an output, and these outputs should be publicly available, either as an article or by public disclosure.
Studies that produce null results can still easily share data and code, to be evaluated post-publication by the community to see if code can be refactored, refined, and improved.
The “Cadillac” version of mandatory publication would be the registered reports model, where a study has its methodology peer reviewed before data are collected (Stage 1 Review). Authors are given in-principle acceptance, whereby, as long as the scientist follows the agreed-upon methodology, their study is guaranteed publication regardless of the results. When a study is completed, it is peer reviewed again (Stage 2 Review) simply to confirm the agreed-upon methodology was followed. In the absence of this registered reports model, we should at least mandate transparent publication via journals that publish null results, or via public federal disclosure.
Automating Scientific Discovery: A Research Agenda for Advancing Self-Driving Labs
Despite significant advances in scientific tools and methods, the traditional, labor-intensive model of scientific research in materials discovery has seen little innovation. Reliance on highly skilled but underpaid graduate students to run experiments limits the labor productivity of our scientific ecosystem. An emerging technology platform known as Self-Driving Labs (SDLs), which use commoditized robotics and artificial intelligence for automated experimentation, presents a potential solution to these challenges.
SDLs are not just theoretical constructs but have already been implemented at small scales in a few labs. An ARPA-E-funded Grand Challenge could drive funding, innovation, and development of SDLs, accelerating their integration into the scientific process. A Focused Research Organization (FRO) can also help create more modular and open-source components for SDLs and can be funded by philanthropies or the Department of Energy’s (DOE) new foundation. With additional funding, DOE national labs can also establish user facilities for scientists across the country to gain more experience working with autonomous scientific discovery platforms. In an era of strategic competition, funding emerging technology platforms like SDLs is all the more important to help the United States maintain its lead in materials innovation.
Challenge and Opportunity
New scientific ideas are critical for technological progress. These ideas often form the seed insight to creating new technologies: lighter cars that are more energy efficient, stronger submarines to support national security, and more efficient clean energy like solar panels and offshore wind. While the past several centuries have seen incredible progress in scientific understanding, the fundamental labor structure of how we do science has not changed. Our microscopes have become far more sophisticated, yet the actual synthesizing and testing of new materials is still laboriously done in university laboratories by highly knowledgeable graduate students. The lack of innovation in how we historically use scientific labor pools may account for stagnation of research labor productivity, a primary cause of concerns about the slowing of scientific progress. Indeed, analysis of scientific literature suggests that scientific papers are becoming less disruptive over time and that new ideas are getting harder to find. The slowing rate of new scientific ideas, particularly in the discovery of new materials or advances in materials efficiency, poses a substantial risk, potentially costing billions of dollars in economic value and jeopardizing global competitiveness. However, incredible advances in artificial intelligence (AI) coupled with the rise of cheap but robust robot arms are leading to a promising new paradigm of material discovery and innovation: Self-Driving Labs. An SDL is a platform where material synthesis and characterization is done by robots, with AI models intelligently selecting new material designs to test based on previous experimental results. These platforms enable researchers to rapidly explore and optimize designs within otherwise unfeasibly large search spaces.
Today, most material science labs are organized around a faculty member or principal investigator (PI), who manages a team of graduate students. Each graduate student designs experiments and hypotheses in collaboration with a PI, and then executes the experiment, synthesizing the material and characterizing its properties. Unfortunately, that last step is often laborious and the most time-consuming. This sequential method of material discovery, where highly knowledgeable graduate students spend large portions of their time doing manual wet lab work, rate-limits the number of experiments and potential discoveries by a given lab group. SDLs can significantly improve the labor productivity of our scientific enterprise, freeing highly skilled graduate students from menial experimental labor to craft new theories or distill novel insights from autonomously collected data. Additionally, they yield more reproducible outcomes, as experiments are run by code-driven motors rather than by humans, who may forget to include certain experimental details or have natural variations between procedures.
Self-Driving Labs are not a pipe dream. The biotech industry has spent decades developing advanced high-throughput synthesis and automation. For instance, while in the 1970s statins (one of the most successful cholesterol-lowering drug families) were discovered in part by a researcher testing 3800 cultures manually over a year, today, companies like AstraZeneca invest millions of dollars in automation and high-throughput research equipment (see figure 1). While drug and material discovery share some characteristics (e.g., combinatorially large search spaces and high impact of discovery), materials R&D has historically seen fewer capital investments in automation, primarily because it sits further upstream from where private investments anticipate predictable returns. There are, however, a few notable examples of SDLs being developed today. For instance, researchers at Boston University used a robot arm to test 3D-printed designs for uniaxial compression energy absorption, an important mechanical property for designing stronger structures in civil engineering and aerospace. A Bayesian optimizer was then used to iterate over 25,000 designs in a search space with trillions of possible candidates, which led to an optimized structure with the highest recorded mechanical energy absorption to date. Researchers at North Carolina State University used a microfluidic platform to autonomously synthesize >100 quantum dots, discovering formulations that were better than the previous state of the art in that material family.
These first-of-a-kind SDLs have shown exciting initial results, demonstrating their ability to discover new material designs in a haystack of thousands to trillions of possible designs, which would be too large for any human researcher to grasp. However, SDLs are still an emerging technology platform. In order to scale up and realize their full potential, the federal government will need to make significant and coordinated research investments to derisk this materials innovation platform and demonstrate the return on capital before the private sector is willing to invest.
Other nations are beginning to recognize the importance of a structured approach to funding SDLs: University of Toronto’s Alan Aspuru-Guzik, a former Harvard professor who left the United States in 2018, has created the Acceleration Consortium to deploy these SDLs and recently received $200 million in research funding, Canada’s largest-ever research grant. In an era of strategic competition and climate challenges, maintaining U.S. competitiveness in materials innovation is more important than ever. Building a strong research program to fund, build, and deploy SDLs in research labs should be a part of the U.S. innovation portfolio.
Plan of Action
While several labs in the United States are working on SDLs, they have all received small, ad hoc grants that are not coordinated in any way. A federal government funding program dedicated to self-driving labs does not currently exist. As a result, existing SDLs are constrained to the most tractable material systems (e.g., microfluidics), with the lack of patient capital hindering labs’ ability to scale these systems and realize their true potential. A coordinated U.S. research program for Self-Driving Labs should:
Initiate an ARPA-E SDL Grand Challenge: Drawing inspiration from DARPA’s previous grand challenges that have catalyzed advancements in self-driving vehicles, ARPA-E should establish a Grand Challenge to catalyze state-of-the-art advancements in SDLs for scientific research. This challenge would involve an open call for teams to submit proposals for SDL projects, with a transparent set of performance metrics and benchmarks. Successful applicants would then receive funding to develop SDLs that demonstrate breakthroughs in automated scientific research. A projected budget for this initiative is $30 million, divided among six selected teams, each receiving $5 million over a four-year period to build and validate their SDL concepts. While ARPA-E is best positioned in terms of authority and funding flexibility, other institutions like the National Science Foundation (NSF) or DARPA itself could also fund similar programs.
Establish a Focused Research Organization to open-source SDL components: This FRO would be responsible for developing modular, open-source hardware and software specifically designed for SDL applications. Creating common standards for both the hardware and software needed for SDLs will make such technology more accessible and encourage wider adoption. The FRO would also conduct research on how automation via SDLs is likely to reshape labor roles within scientific research and provide best practices on how to incorporate SDLs into scientific workflows. A proposed operational timeframe for this organization is five years, with an estimated budget of $18 million over that time period. The organization would work on prototyping SDL-specific hardware solutions and make them available on an open-source basis to foster wider community participation and iterative improvement. An FRO could be spun out of the DOE’s new Foundation for Energy Security (FESI), which would continue to establish the DOE’s role as an innovative science funder and be an exciting opportunity for FESI to work with nontraditional technical organizations. Using FESI would not require any new authorities and could leverage philanthropic funding, rather than requiring congressional appropriations.
Provide dedicated funding for the DOE national labs to build self-driving lab user facilities, so the United States can build institutional expertise in SDL operations and allow other U.S. scientists to familiarize themselves with these platforms. This funding can be specifically set aside by the DOE Office of Science or through line-item appropriations from Congress. Existing prototype SDLs, like the Argonne National Lab Rapid Prototyping Lab or Berkeley Lab’s A-Lab, that have emerged in the past several years lack sustained DOE funding but could be scaled up and supported with only $50 million in total funding over the next five years. SDLs are also one of the primary applications identified by the national labs in the “AI for Science, Energy, and Security” report, demonstrating willingness to build out this infrastructure and underscoring the recognized strategic importance of SDLs by the scientific research community.
As with any new laboratory technique, SDLs are not necessarily an appropriate tool for everything. Given that their main benefit lies in automation and the ability to rapidly iterate through designs experimentally, SDLs are likely best suited for:
- Material families with combinatorially large design spaces that lack clear design theories or numerical models (e.g., metal organic frameworks, perovskites)
- Experiments where synthesis and characterization are either relatively quick or cheap and are amenable to automated handling (e.g., UV-vis spectroscopy is a relatively simple in-situ characterization technique)
- Scientific fields where numerical models are not accurate enough to use for training surrogate models or where there is a lack of experimental data repositories (e.g., the challenges of using density functional theory in material science as a reliable surrogate model)
While these heuristics are suggested as guidelines, it will take a full-fledged program with actual results to determine what systems are most amenable to SDL disruption.
When it comes to exciting new technologies, there can be incentives to misuse terms. Self-Driving Labs can be precisely defined as the automation of both material synthesis and characterization that includes some degree of intelligent, automated decision-making in-the-loop. Based on this definition, here are common classes of experiments that are not SDLs:
- High-throughput synthesis, where synthesis automation allows for the rapid synthesis of many different material formulations in parallel (lacks characterization and AI-in-the-loop)
- Using AI as a surrogate trained over numerical models, which is based on software-only results. Using an AI surrogate model to make material predictions and then synthesizing an optimal material is also not an SDL, though certainly still quite an accomplishment for AI in science (lacks discovery of synthesis procedures and requires numerical models or prior existing data, neither of which are always readily available in the material sciences).
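The in-the-loop decision-making that separates an SDL from plain high-throughput synthesis can be illustrated with a toy closed loop. The following is a minimal Python sketch under loose assumptions: the objective function, candidate sampler, and nearest-neighbor scoring rule are all hypothetical stand-ins, not any real SDL software.

```python
import random

# Toy sketch of the closed loop that defines an SDL: propose a design,
# run the experiment, feed the result back into the next proposal.
random.seed(0)

def run_experiment(design):
    """Stand-in for robotic synthesis + characterization (hypothetical objective)."""
    x, y = design
    return -(x - 0.3) ** 2 - (y - 0.7) ** 2  # unknown-to-the-loop peak at (0.3, 0.7)

def propose_next(history, n_candidates=200):
    """Toy AI-in-the-loop chooser: score random candidates by the value of the
    nearest measured design (exploit) plus a distance bonus (explore)."""
    candidates = [(random.random(), random.random()) for _ in range(n_candidates)]

    def score(c):
        dists = [((c[0] - d[0]) ** 2 + (c[1] - d[1]) ** 2) ** 0.5 for d, _ in history]
        nearest = min(range(len(history)), key=lambda k: dists[k])
        return history[nearest][1] + 0.5 * dists[nearest]

    return max(candidates, key=score)

start = (random.random(), random.random())
history = [(start, run_experiment(start))]

for _ in range(30):  # closed loop: propose -> run -> learn
    design = propose_next(history)
    history.append((design, run_experiment(design)))

best_design, best_value = max(history, key=lambda h: h[1])
print(best_design, best_value)
```

In a real SDL, `run_experiment` would drive robotic synthesis and characterization, and the proposal step would typically be a Bayesian optimizer over a learned surrogate rather than this nearest-neighbor heuristic; the defining feature is only that each result informs the next automated choice.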
SDLs, like every other technology that we have adopted over the years, eliminate routine tasks that scientists must currently spend their time on. They will allow scientists to spend more time understanding scientific data, validating theories, and developing models for further experiments. They can automate routine tasks but not the job of being a scientist.
However, because SDLs require more firmware and software, they may favor larger facilities that can retain long-term technicians and engineers to maintain and customize SDL platforms for various applications. An FRO could help address this asymmetry by developing open-source, modular software that smaller labs can adopt with less upfront effort.