Supporting Federal Decision Making through Participatory Technology Assessment

The incoming administration needs a robust, adaptable and scalable participatory assessment capacity to address complex issues at the intersections of science, technology, and society. As such, the next administration should establish a special unit within the Science and Technology Policy Institute (STPI)—an existing federally funded research and development center (FFRDC)—to provide evidence-based, just-in-time, and fit-for-purpose capacity for Participatory Technology Assessment (pTA) to the White House Office of Science and Technology Policy and across executive branch agencies.

Robust participatory and multi-stakeholder engagement supports responsible decision making where neither science nor existing policy provides clear guidance. pTA is an established and evidence-based process to assess public values, manage sociotechnical uncertainties, integrate living and lived knowledge, and bridge democratic gaps on contested and complex science and society issues. By tapping into broader community expertise and experiences, pTA identifies plausible alternatives and solutions that may be overlooked by experts and advocates.

pTA provides critical and informed public input that is currently missing in technocratic policy- and decision-making processes. Policies and decisions will have greater legitimacy, transparency, and accountability as a result of enhanced use of pTA. When systematically integrated into research and development (R&D) processes, pTA can be used for anticipatory governance—that is, assessing socio-technical futures, engaging communities, stakeholders and publics, and directing decisions, policies, and investments toward desirable outcomes.

A pTA unit within STPI will help build and maintain a shared repository of knowledge and experience of the state of the art and innovative applications across government, and provide pTA as a design, development, implementation, integration and training service for the executive branch regarding emerging scientific and technological issues and questions. By integrating public and expert value assessments, the next administration can ensure that federal science and technology decisions provide the greatest benefit to society.

Challenge and Opportunity

Science and technology (S&T) policy problems always involve issues of public values—such as concerns for safety, prosperity, and justice—alongside issues of fact. However, few systematic and institutional processes meaningfully integrate values from informed public engagement alongside expert consultation. Existing public-engagement mechanisms such as public-comment periods, opinion surveys, and town halls have devolved into little more than “checkbox” exercises. In recent years, the transition to online commenting, intended to improve access and participation, has amplified these shortcomings. Online commenting systems have “also inadvertently opened the floodgates to mass comment campaigns, misattributed comments, and computer-generated comments, potentially making it harder for agencies to extract the information needed to inform decision making and undermining the legitimacy of the rulemaking process. Many researchers have found that a large percentage of the comments received in mass comment responses are not highly substantive, but rather contain general statements of support or opposition. Commenters are an entirely self selected group, and there is no reason to believe that they are in any way representative of the larger public. … Relatedly, the group of commenters may represent a relatively privileged group, with less advantaged members of the public less likely to engage in this form of political participation.”

Moreover, existing engagement mechanisms tend to be dominated by a small number of experts and organized interest groups: people and institutions who generally have established pathways to influence policy anyway.

Existing engagement mechanisms leave out the voices of people who may lack the time, awareness, and/or resources to voice their opinions in response to the Federal Register, such as the roofer, the hair stylist, or the bus driver. This means that important public values, meaning widely held ideas about the rights and benefits that ought to guide policy making in a democratic system, go overlooked. For S&T policy, a failure to assess and integrate public values may result in lack of R&D and complementary investments that produce market successes with limited public value, such as treatments for cancer that most patients cannot afford, or in public failure when there is no immediately available technical or market response, such as in the early stages of a global pandemic. Failure to integrate public values may also mean that little to no attention gets paid to key areas of societal need, such as developing low-cost tools and approaches for mitigating lead and other contaminants in water supplies, or designing effective policy responses, such as behavioral and logistical actions to contain viral infections and deliver vaccinations to resistant populations.

In its 2023 Letter to the President, the President’s Council of Advisors on Science and Technology (PCAST) observed that, “As a nation, we must strive to develop public policies that are informed by scientific understandings and community values. Achieving this goal will require both access to accurate and trusted scientific information and the ability to create dialogue and participatory engagement with the American people.” The PCAST letter recommended issuing “a clarion call to Federal agencies to make science and technology communication and public engagement a core component of their mission and strategy.” It also recommended the establishment of “a new office to support Federal agencies in their continuing efforts to develop and build participatory public engagement and effective science and technology communications.”

Institutionalizing pTA within the Federal Government would provide federal agencies access to the tools and resources they need to apply pTA to existing and emerging complex S&T challenges, enabling experts, publics, and decision makers to tackle pressing issues together. pTA can be applied toward resolving long-standing issues, as well as to anticipate and address questions around emerging or novel S&T issues.

pTA for Long-Standing S&T Issues

Storage and siting of disposal sites for nuclear waste is an example of the type of ongoing, intractable problem for which pTA is ideally suited. Billions of dollars have been invested to develop a government-managed site for storing nuclear waste in the United States, yet essentially no progress has been made. Entangled political and environmental concerns, such as the risks of leaving nuclear waste in a potentially unsafe state for the long term, have stalled progress. There is also genuine uncertainty and expert disagreement surrounding the safety and efficacy of various storage alternatives. Our nation’s inability to address the issue of nuclear waste has long impeded development of new and alternative nuclear power plants and thus has contributed to slowing the adoption of nuclear energy.

There are rarely unencumbered or obvious optimal solutions to long-standing S&T issues like nuclear-waste disposal. But a nuanced and informed dialogue among a diverse public, experts, and decision makers, precisely the type of dialogue enabled through pTA, can help break chronic stalemates and address misaligned or nonexistent incentives. By bringing people together to discuss options and to learn about the benefits and risks of different possible solutions, pTA enables stakeholders to better understand each other’s perspectives. Deliberative engagements like pTA often generate empathy, encouraging participants to collaborate and develop recommendations based on shared exploration of values. pTA is designed to facilitate timely, adequate, and pragmatic choices in the context of uncertainty, conflicting goals, and various real-world constraints. This builds transparency and trust across diverse stakeholders while helping move past gridlock.

pTA for Emerging and Novel Issues

pTA is also useful for anticipating controversies and governing emerging S&T challenges, such as the ethical dimensions of gene editing, artificial intelligence, or nuclear adoption. pTA helps grow institutional knowledge and expertise about complex topics as well as about public attitudes and concerns salient to those topics at scale. For example, challenges associated with COVID-19 vaccines presented several opportunities to deploy pTA. Public trust in the government’s pandemic response was uneven at best. Many Americans reported specific concerns about receiving a COVID-19 vaccine. Public opinion polls delivered mixed messages regarding willingness to receive a COVID-19 vaccine, but polls can overlook other historically significant concerns and socio-political developments in rapidly changing environments. Demands for expediency in vaccine development complicated the situation when normal safeguards and oversights were relaxed. Apparent pressure to deliver a vaccine as soon as possible raised public concern that vaccine safety was not being adequately vetted. Logistical and ethical questions about vaccine rollout also abounded: who should get vaccinated first, at what cost, and alongside what other public health measures? The nation needed a portfolio of differentiated and locally robust strategies for vaccine deployment. pTA would have helped officials anticipate equity challenges and trust deficits related to vaccine use and inform messaging and means of delivery, supporting effective and socially robust rollout strategies for different communities across the country.

pTA is an Established Practice

pTA has a history of use in the European Union and more recently in the United States. Inspired partly by the former U.S. Office of Technology Assessment (OTA), many European nations and the European Parliament operate their own technology assessment (TA) agencies. European TA took a distinctive turn from the OTA in further democratizing science and technology decision-making by developing and implementing a variety of effective and economical practices involving citizen participation (or pTA). Recent European Parliamentary Technology Assessment reports have taken on issues of assistive technologies, future of work, future of mobility, and climate-change innovation.

In the United States, a group of researchers, educators, and policy practitioners established the Expert and Citizen Assessment of Science and Technology (ECAST) network in 2010 to develop a distinctive 21st-century model of TA. Over the course of a decade, ECAST developed an innovative and reflexive participatory technology assessment (pTA) method to support democratic decision-making in different technical, social, and political contexts. After a demonstration project providing citizen input to the United Nations Convention on Biological Diversity in collaboration with the Danish Board of Technology, ECAST worked with the National Aeronautics and Space Administration (NASA) on the agency’s Asteroid Initiative. NASA-sponsored pTA activities about asteroid missions revealed important concerns about mitigating asteroid impact alongside decision support for specific NASA missions. Public audiences prioritized a U.S. role in planetary defense from asteroid impacts. These results were communicated to NASA administrators and informed the development of NASA’s Planetary Defense Coordination Office, demonstrating how pTA can identify novel public concerns to inform decision making.

This NASA pTA paved the way for pTA projects with the Department of Energy on nuclear-waste disposal and with the National Oceanic and Atmospheric Administration on community resilience. ECAST’s portfolio also includes projects on climate intervention research, the future of automated vehicles, gene editing, clean energy demonstration projects, and interim storage of spent nuclear fuel. These and other pTA projects have been supported by more than six million dollars of public and philanthropic funding over the past ten years. Strong funding support in recent years highlights a growing demand for public engagement in science and technology decision-making.

However, the current scale of investment in pTA projects is vastly outstripped by the number of agencies and policy decisions that stand to benefit from pTA. These agencies are demanding applications for use cases ranging from public education and policy decisions to public value mapping and process and institutional innovations. ECAST’s ability to partner with federal agencies is constrained by existing administrative rules and procedures on the federal side and by limited resources, capacity, and flexibility on the network side. Any external entity like ECAST will encounter difficulties in building institutional memory and in developing cooperative-agreement mechanisms across agencies with different missions as well as within agencies with different divisions. Integrating public engagement as a standard component of decision making will require aligning the interests of sponsoring agencies, publics, and pTA practitioners within the context of broad and shifting political environments. An FFRDC office dedicated to pTA would provide the embedded infrastructure, staffing, and processes necessary to achieve these challenging tasks. A dedicated home for pTA within the executive branch would also enable systematic research, evaluation, and training related to pTA methods and practices, as well as better integration of pTA tools into decision making involving public education, research, innovation, and policy actions.

Plan of Action

The next administration should support and conduct pTA across the Federal Government by expanding the scope of the Science and Technology Policy Institute (STPI) to include a special unit with a separate operating budget dedicated specifically to pTA. STPI is an existing federally funded research and development center (FFRDC) that already conducts research on emerging technological challenges for the Federal Government. STPI is strategically associated with the White House Office of Science and Technology Policy (OSTP). Integrating pTA across federal agencies aligns with STPI’s mission to provide technical and analytical support to agency sponsors on the assessment of critical and emerging technologies.

A dedicated pTA unit within STPI would (1) provide expertise and resources to conduct pTA for federal agencies and (2) document and archive broader public expertise captured through pTA. Much publicly valuable knowledge generated from one area of S&T is applicable to and usable in other areas. As part of an FFRDC associated with the executive branch, STPI’s pTA unit could collaborate with universities to help disseminate best practices across all executive agencies.

We envision that STPI’s pTA unit would conduct activities related to the general theory and practice of pTA as well as partner with other federal agencies to integrate pTA into projects large and small. Small-scale projects, such as a series of public focus groups, expert consultations, or general topic research, could be conducted directly by the pTA unit’s staff. Larger projects, such as a series of in-person or online deliberative engagements, workshops, and subsequent analysis and evaluation, would require additional funding and support from the requesting agencies. The STPI pTA unit could also establish longer-term partnerships with universities and science centers (as in the ECAST network), thereby enabling the federal government to leverage and learn from pTA exercises sponsored by non-federal entities.

The new STPI pTA unit would be funded in part through projects requested by other federal agencies. An agency would fund the pTA unit to design, plan, conduct, assess, and analyze a pTA effort on a project relevant to the agency. This model would enable the unit to distribute costs across the executive branch and would ensure that the unit has access to subject-matter experts (i.e., agency staff) needed to conduct an informed pTA effort. Housing the unit within STPI would contribute to OSTP’s larger portfolio of science and technology policy analysis, open innovation and citizen science, and a robust civic infrastructure.

Cost and Capacities

Adding a pTA unit to STPI would increase federal capacity to conduct pTA, utilizing existing pathways and budget lines to support additional staff and infrastructure for pTA capabilities. Establishing a semi-independent office for pTA within STPI would make it possible for the executive branch to share support staff and other costs. We anticipate that $3.5–5 million per year would be needed to support the core team of researchers, practitioners, leadership, small-scale projects, and operations within STPI for the pTA unit. This funding would require congressional approval.

The STPI pTA unit and its staff would be dedicated to housing and maintaining a critical infrastructure for pTA projects, including practical know-how, robust relationships with partner organizations (e.g., science centers, museums, or other public venues for hosting deliberative pTA forums), and analytic capabilities. This unit would not wholly be responsible for any given pTA effort. Rather, sponsoring agencies should provide resources and direction to support individual pTA projects.

We expect that the STPI pTA unit would be able to conduct two or three pTA projects per year initially. The unit’s capacity and agility would expand over time to meet growing demand from federal agencies. In the fifth year of the unit (the typical length of an FFRDC contract), the presidential administration should consider whether there is sufficient agency demand for pTA, and whether the STPI pTA unit has sufficiently demonstrated proof of concept, to merit establishment of a new and independent FFRDC or other government entity fully dedicated to pTA.

Operations

The process for initiating, implementing and finalizing a pTA project would resemble the following:

Pre:

Agency approaches the pTA unit with interest in conducting pTA for agency assessment and decision making for a particular subject.
pTA unit assists the agency in developing questions appropriate for pTA. This process involves input from agency decision makers and experts as well as external stakeholders.
A memorandum of understanding/agreement (MOU/MOA) is created, laying out the scope of the pTA effort.

During:

pTA unit and agency convene expert and/or public workshops (as appropriate) to inform pTA activities.
pTA unit and agency create, test, and evaluate prototype pTA activities (see FAQs below for more details on evaluation).
pTA unit and agency work with a network of pTA host institutions (e.g., science centers, universities, and nonprofit organizations) to coordinate pTA forums.
pTA unit oversees pTA forums.

Post:

pTA unit collects, assesses, and analyzes pTA forum results with iterative input and analysis from the hosting agency.
pTA unit works with stakeholders to share and finalize pTA reports on the subject, as well as a dissemination plan for sharing results with stakeholder groups.

Conclusion

Participatory Technology Assessment (pTA) is an established suite of tools and processes for eliciting and documenting informed public values and opinions to contribute to decision making around complex issues at the intersections of science, technology, and society.

However, its creative adaptation and innovative use by federal agencies in recent years demonstrate its utility beyond providing decision support: from increasing scientific literacy and social acceptability to defusing tensions and improving mutual trust. By creating capacity for pTA within STPI, the incoming administration will bolster its ability to address longstanding and emerging issues that lie at the intersection of scientific progress and societal well-being, where progress depends on aligning scientific, market, and public values. Such capacity and capabilities will be crucial to improving the legitimacy, transparency, and accountability of decisions regarding how we navigate and tackle the most intractable problems facing our society, now and for years to come.

This action-ready policy memo is part of Day One 2025 — our effort to bring forward bold policy ideas, grounded in science and evidence, that can tackle the country’s biggest challenges and bring us closer to the prosperous, equitable and safe future that we all hope for whoever takes office in 2025 and beyond.

Frequently Asked Questions

Expert panels are the best way to address complex S&T issues. Why should S&T assessments focus on involving the public and public values?

Experts can help map potential policy and R&D options and their implications. However, there will always be an element of judgment when it comes to deciding among options. This stage is often driven more by ethical and social concerns than by technical assessments. For instance, leaders may need to figure out a fair and just process to govern hazardous-waste disposal, weigh the implications of using genetically modified organisms to control diseases, or site clean energy research and demonstration projects in resistant or disadvantaged communities. Involving the public in decision-making can help counter challenges associated with expert judgment (for example, “groupthink”) while bringing in perspectives, values, and considerations that experts may overlook or discount.

How do we know that members of the public are sufficiently informed to be able to contribute to a decision?

pTA incorporates a variety of measures to inform discussion, such as background materials distributed to participants and multimedia tools to provide relevant information about the issue. The content of background materials is developed by experts and stakeholders prior to a pTA event to give the public the information they need to thoughtfully engage with the topic at hand. Evaluation tools, such as those from the informal science-education community, can be used to assess how effective background materials are at preparing the public for an informed discussion, and to identify ineffective materials that may need revision or supplementation. Evaluations of several past pTA efforts have (1) shown consistent learning among public participants and (2) documented robust processes for the creation, testing, and refinement of pTA activities that foster informed discussions among pTA participants.

Will doing pTA enhance the communications missions of federal agencies?

pTA can result in products and information, such as reports and data on public values, that are relevant and useful for the communication missions of agencies. However, pTA should avoid becoming a tool for strategic communications or a procedural “checkbox” activity for public engagement. Locating the Federal Government’s dedicated pTA unit within an FFRDC will ensure that pTA is informed by and accountable to a broader community of pTA experts and stakeholders who are independent of any mission agency.

Why does the Federal Government need in-house capacity to conduct pTA?

The work of universities, science centers, and nonpartisan think tanks has greatly expanded the tools and approaches available for using pTA to inform decision-making. Many past and current pTA efforts have been driven by such nongovernmental institutions, and have proven agile, collaborative, and low cost. These efforts, while successful, have limited or diffuse ties to federal decision making.

Embedding pTA within the federal government would help agencies overcome the opportunity and time cost of integrating public input into tight decision-making timelines. ECAST’s work with federal agencies has shown the need for a stable bureaucratic infrastructure surrounding pTA at the federal level to build organizational memory, create a federal community of practice, and productively institutionalize pTA into federal decision-making.

Importantly, pTA is a nonpartisan method that can help reduce tensions and find shared values. Involving a diversity of perspectives through pTA engagements can help stakeholders move beyond impasse and conflict. pTA engagements emphasize recruiting and involving Americans from all walks of life, including those historically excluded from policymaking.

How would a pTA unit within STPI complement existing technology assessment capacity? How would it differ from that existing capacity?

Currently, the Government Accountability Office’s Science, Technology Assessment, and Analytics team (STAA) conducts technology assessments for Congress. Technology Assessment (TA) is designed to enhance understanding of the implications of new technologies or existing S&T issues. The STAA certainly has the capacity to undertake pTA studies on key S&T issues if and when requested by Congress. However, the distinctive form of pTA developed by ECAST and exemplified in ECAST’s work with NASA, NOAA, and DOE follows a knowledge co-production model in which agency program managers work with pTA practitioners to co-design, co-develop, and integrate pTA into their decision-making processes. STAA, as a component of the legislative branch, is not well positioned to work alongside executive agencies in this way. The proposed pTA unit within STPI would make the proven ECAST model available to all executive agencies, nicely complementing the analytical TA capacity that STAA offers the federal legislature.

Why should the government establish a pTA unit within an FFRDC instead of using executive orders to conduct pTA or requiring agencies to undertake pTA?

Executive orders could support one-off pTA projects and require agencies to conduct pTA. However, establishing a pTA unit within an FFRDC like STPI would provide additional benefits that would lead to a more robust pTA capacity.

FFRDCs are a special class of research institutions owned by the federal government but operated by contractors, including universities, nonprofits, and industrial firms. The primary purpose of FFRDCs is to pursue research and development that cannot be effectively provided by the government or other sectors operating on their own. FFRDCs also enable the government to recruit and retain diverse experts without government hiring and pay constraints, providing the government with a specialized, agile workforce to respond to agency needs and societal challenges.

Creating a pTA unit in an FFRDC would provide an institutional home for general pTA know-how and capacity: a resource that all agencies could tap into. The pTA unit would be staffed by a small but highly trained staff who are well-versed in the knowledge and practice of pTA. The pTA unit would not preclude individual agencies from undertaking pTA on their own, but would provide a “help center” to help agencies figure out where to start and how to overcome roadblocks. pTA unit staff could also offer workshops and other opportunities to help train personnel in other agencies on ways to incorporate the public perspective into their activities.

Other potential homes for a dedicated federal pTA unit include the Government Accountability Office (GAO) or the National Academies of Sciences, Engineering, and Medicine. However, GAO’s association with Congress would weaken the unit’s connections to agencies. The National Academies historically conduct assessments driven purely by expert consensus, which may compromise the ability of National Academies-hosted pTA to include and/or emphasize broader public values.

How will the government evaluate the performance and outcomes of pTA efforts?

Evaluating a pTA effort means answering four questions:

First, did the pTA effort engage a diverse public not otherwise engaged in S&T policy formulation? pTA practitioners generally do not seek statistically representative samples of participants (unlike, for instance, practitioners of mass opinion polling). Instead, pTA practitioners focus on including a diverse group of participants, with particular attention paid to groups who are generally not engaged in S&T policy formulation.

Second, was the pTA process informed and deliberative? This question is generally answered through strategies borrowed from the informal science-learning community, such as “pre- and post-” surveys of self-reported learning. Qualitative analysis of the participant responses and discussions can evaluate if and how background information was used in pTA exercises. Involving decision makers and stakeholders in the evaluation process (for example, through sharing initial evaluation results) helps build the credibility of participant responses, particularly when decision makers or agencies are skeptical of the ability of lay citizens to provide informed opinions.

Third, did pTA generate useful and actionable outputs for the agency and, if applicable, stakeholders? pTA practitioners use qualitative tools for assessing public opinions and values alongside quantitative tools, such as surveys. A combination of qualitative and quantitative analysis helps to evaluate not just what public participants prefer regarding a given issue but why they hold those preferences and how they justify them. To ensure such information is useful to agencies and decision makers, pTA practitioners involve decision makers at various points in the analysis process (for example, to probe participant responses regarding a particular concern). Interviews with decision makers and other stakeholders can also assess the utility of pTA results.

Fourth, what impact did pTA have on participants, decisions and decision-making processes, decision makers, and organizational culture? This question can be answered through interviews with decision makers and stakeholders, surveys of pTA participants, and impact assessments.

How will the government evaluate the performance and outcomes of a dedicated pTA unit? How has pTA been evaluated previously?

Evaluation of a pTA unit within an existing FFRDC would likely involve similar questions as above: questions focused on the impact of the unit on decisions, decision-making processes, and the culture and attitudes of agency staff who worked with the pTA unit. An external evaluator, such as the Government Accountability Office or the National Academies of Sciences, could be tasked with carrying out such an evaluation.

How publicly accessible should the work of a pTA unit be? Should pTA results and processes be made public?

pTA results and processes should typically be made public as long as few risks are posed to pTA participants (in line with federal regulations protecting research participants). Publishing results and processes ensures that stakeholders, other members of government (e.g., Congress), and broader audiences can view and interpret the public values explored during a pTA effort. Further, making results and processes publicly available serves as a form of accountability, ensuring that pTA efforts are high quality.

Creating a Science and Technology Hub in Congress

Congress should create a new Science and Technology (S&T) Hub within the Government Accountability Office’s (GAO) Science, Technology Assessment, and Analytics (STAA) team to support an understaffed and overwhelmed Congress in addressing pressing science and technology policy questions. A new hub would connect Congress with technical experts and maintain a repository of research and information as well as translate this material to members and staff. There is already momentum building in Congress with several recent reforms to strengthen capacity, and the reversal of the Chevron doctrine infuses the issue with a new sense of urgency. The time is now for Congress to invest in itself.

Challenge and Opportunity

Congress does not have the tools it needs to contend with pressing scientific and technical questions. In the last few decades, Congress has grappled with increasingly complex science and technology policy questions, such as social media regulation, artificial intelligence, and climate change. At the same time, its staff capacity has diminished; between 1994 and 2015, the Government Accountability Office (GAO) and Congressional Research Service (CRS), key congressional support agencies, lost about a third of their employees. Staff on key science-related committees like the House Committee on Science, Space, and Technology fell by nearly half.

As a result, members frequently lack the resources they need to understand science and technology. “[T]hey will resort to Google searches, reading Wikipedia, news articles, and yes, even social media reports. Then they will make a flurry of cold calls and e-mails to whichever expert they can get on the phone,” one former science staffer noted. “You’d be surprised how much time I spend explaining to my colleagues that the chief dangers of AI will not come from evil robots with red lasers coming out of their eyes,” Representative Jay Obernolte (R-CA), who holds a master’s degree in AI, told The New York Times. And AI is just one example of a pressing science need that Congress must handle but lacks the tools to grapple with.

Moreover, reliance on external information can intensify polarization, because each side depends on a different set of facts and it is harder to find common ground. Without high-quality, nonpartisan science and technology resources, billions of dollars in funding may be allocated to technologies that do not work or policy solutions at odds with the latest science.

Additional science support could help Congress navigate complex policy questions related to emerging research, understand science and technologies’ impacts on legislative issues, and grapple with the public benefits or negative consequences of various science and technology issues.

The Supreme Court’s 2024 decision in Loper Bright Enterprises v. Raimondo instills a new sense of urgency. The reversal of the decades-old “Chevron deference,” which directed courts to defer to agency interpretations in instances where statutes were unclear or silent, means Congress will now have to legislate with more specificity. To do so, it will need the best possible experts and technical guidance.

There is momentum building for Congress to invest in itself. For the past several years, the Select Committee on the Modernization of Congress (which became a permanent subcommittee of the Committee on House Administration) advocated for increases to staff pay and resources to improve recruitment and retention. Additionally, the GAO Science, Technology Assessment, and Analytics (STAA) team has expanded to meet the moment. From 2019 to 2022, STAA’s staff grew from 49 to 129, and the team produced 46 technology assessments and short-form explainers. These investments are promising but not sufficient. Congress can draw on this energy and the urgency of a post-Chevron environment to invest in a Science and Technology Hub.

Plan of Action

Congress should create a new Science and Technology Hub in GAO STAA

Congress should create a Science and Technology Hub within the GAO’s STAA. While most of the STAA’s current work responds to specific requests from members, a new hub within the STAA would build out more proactive and exploratory work by 1) brokering long-term relationships between experts and lawmakers and 2) translating research for Congress. The new hub would maintain relationships with rank-and-file members, not just committees or leadership. The hub could start by advising Congress on emerging issues where the partisan battle lines have not been drawn, such as AI, and over time it will build institutional trust and advise on more partisan issues.

Research shows that both parties respect and use congressional support agencies, such as GAO, so they are a good place to house the necessary expertise. Housing the new hub within STAA would also build on the existing resources and support STAA already provides and capitalize on the recent push to expand this team. The Hub could have a small staff of approximately 100 employees. The success of recently created small offices such as the Office of Whistleblower Ombuds shows that a modest staff can be effective. In a post-Chevron world, this hub could also play an important role liaising with federal agencies about how different statutory formulations will change implementation of science-related legislation and helping members and staff understand the ins and outs of the passage-to-implementation process.

The Hub should connect Congress with a wider range of subject matter experts.

Studies show that researcher-policymaker interactions are most effective when they are long-term working relationships rather than ad hoc interactions. The hub could set up advisory councils of experts to guide Congress on different key areas. Though ad hoc groups of experts have advised Congress over the years, Congress does not have institutionalized avenues for soliciting information. The hub’s nonpartisan staff should also screen for potential conflicts of interest. As a starting point, these advisory councils would support committee and caucus staff as they learn about emerging issues, and over time the hub could build more capacity to manage requests from individual member offices. Organizations like the National Academies of Sciences, Engineering, and Medicine already employ the advisory council model; however, they do not serve Congress exclusively, nor do they meet staff needs for quick turnaround or consultative support. The advisory councils would build on the advisory council model of the Office of Technology Assessment (OTA), an agency that advised Congress on science between the 1970s and 1990s. The new hub could take proactive steps to center representation in its advisory councils, learning from the example of the United Kingdom Parliament’s Knowledge Exchange Unit and its efforts to increase the number of women and people of color Parliament hears from.

The Hub should help compile and translate information for Congress.

The hub could maintain a one-stop shop to help Congress find and understand data and research on different policy-relevant topics. The hub could maintain this repository and draw on it to distill large amounts of information into memos that members could digest. It could also hold regular briefings for members and staff on emerging issues. Over time, the Hub could build out a “living evidence” approach in which a body of research is maintained and updated with the best possible evidence at a regular cadence. Such a resource would help counteract the effects of understaffing and staff turnover and provide critical assistance in legislating and oversight, particularly important in a post-Chevron world.

Conclusion

Taking straightforward steps like creating an S&T hub, which brokers relationships between Congress and experts and houses a repository of research on different policy topics, could help Congress to understand and stay up-to-date on urgent science issues in order to ensure more effective decision making in the public interest.

Frequently Asked Questions

What other investments can Congress make in itself at this time?

There are a number of additional investments Congress can make that would complement the work of the proposed Science and Technology Hub, including additional capacity for other Congressional support agencies and entities beyond GAO. For example, Congress could lift the cap on the number of staff each member can hire (currently set at 18), and invest in pipelines for recruitment and retention of personal and committee staff with science expertise. Additionally, Congress could advance digital technologies available to Congress for evidence access and networking with the expert community.

Why should the Hub be placed at GAO and how can the GAO adapt to meet this need?

The Hub should be placed in GAO to build on the momentum of recent investments in the STAA team. GAO has recently invested in building human capital with expertise in science and technology that can support the development of the Hub. The GAO should seize the moment to reimagine how it supports Congress as a modern institution. The new hub in the STAA should be part of an overall evolution, and other GAO departments should also capitalize on the momentum and build more responsive and member-focused processes to support Congress.

How to Prompt New Cross-Agency and Cross-Sector Collaboration to Advance Learning Agendas

The 2018 Foundations for Evidence-Based Policymaking Act (Evidence Act) promotes a culture of evidence within federal agencies. A central part of that culture entails new collaboration between decision-makers and those with diverse forms of expertise inside and outside of the federal government. The challenge, however, is that new cross-agency and cross-sector collaborative relationships don’t always arise on their own. To overcome these challenges, federal evaluation staff can use “unmet desire surveys,” an outreach tool that prompts agency staff to reflect on how the success of their programs relates to what is happening in other agencies and outside government and how engaging with these other programs and organizations would help their work be more effective. It also prompts them to consider the situation from the perspective of potential collaborators—why should they want to engage?

The unmet desire survey is an important data-gathering mechanism that provides actionable information to create new connections between agency staff and people—such as those in other federal agencies, along with researchers, community stakeholders, and others outside the federal government—who have the information they desire. Then, armed with that information, evaluation staff can use the new Evidence Project Portal on Evaluation.gov (to connect with outside researchers) and/or other mechanisms (to connect with other potential collaborators) to conduct matchmaking that will foster new collaborative relationships. Using existing authorities and resources, agencies can pilot unmet desire surveys as a concrete mechanism for advancing federal learning agendas in a way that builds buy-in by directly meeting the needs of agency staff.

Challenge and Opportunity

A core mission of the Evidence Act is to foster a culture of evidence-based decision-making within federal agencies. Since the problems agencies tackle are often complex and multidimensional, new collaborative relationships between decision-makers in the federal government and those in other agencies and in organizations outside the federal government are essential to realizing the Evidence Act’s vision. Along these lines, Office of Management and Budget (OMB) implementation guidance stresses that learning agendas are “an opportunity to align efforts and promote interagency collaboration in areas of joint focus or shared populations or goals” (OMB M-19-23), and that more generally a culture of evidence “cannot happen solely at the top or in isolated analytical offices, but rather must be embedded throughout each agency…and adopted by the hardworking civil servants who serve on behalf of the American people” (OMB M-21-27).

New cross-agency and cross-sector collaborative relationships rarely arise on their own. They are voluntary, and between people who often start off as strangers to one another. Limited resources, lack of explicit permission, poor prior experiences, differing incentives, and stereotypes are all challenges to persuading strangers to engage with each other. In addition, agency staff may not previously have spent much time thinking about how new collaborative relationships could help answer questions posed by their learning agenda, or even that accessible mechanisms exist to form new relationships. This presents an opportunity for new outreach by evaluation staff, to expand a sense of what kinds of collaborative relationships would be both valuable and possible.

For instance, the Department of the Interior (DOI)’s 2024 Learning Agenda asks: What are the primary challenges to training a diverse, highly skilled workforce capable of delivering the department’s mission? The DOI itself has vital historical and other contextual information for answering this question. Yet officials from other departments likely have faced (or currently face) a similar challenge, and are in a position to share what they’ve tried so far, what has worked well, and what has fallen short. In addition, researchers who study human resource development could share insights from literature, as well as possibly partner on a new study to help answer this question in the DOI context.

Each department and agency is different, with its own learning agenda, decision-making processes, capacity constraints, and personnel needs. And so what is needed are new forms of informal collaboration (knowledge exchange) and/or formal collaboration (projects with shared ownership, decision-making authority, and accountability) that foster back-and-forth interaction. The challenge, however, is that agency staff may not consider such possibilities without being prompted to do so or may be uncertain how to communicate the opportunity to potential collaborators in a way that resonates with their goals.

This memo proposes a flexible tool that evaluation staff (e.g., evaluation officers at federal agencies) can use to generate buy-in among agency staff and leadership while also promoting collaboration as emphasized in OMB guidance and in the Evidence Act. The tool, which has already proven valuable in the federal government (see FAQs), local government, and in the nonprofit sector, is called an “unmet desire survey.” The survey measures unmet desires for collaboration by prompting staff to consider the following types of questions:

Which learning agenda question(s) are you focused on? Is there information about other programs within the government and/or information that outside researchers and other stakeholders have that would help answer it? What kinds of people would be helpful to connect with?
Are you looking for informal collaboration (oriented toward knowledge exchange) or formal collaboration (oriented toward projects with shared ownership, decision-making authority, and accountability)?
What hesitations (perhaps due to prior experiences, lack of explicit permission, stereotypes, and so on) do you have about interacting with other stakeholders? What hesitations do you think they might have about interacting with you?
Why should they want to connect with you?
Why do you think these connections don’t already exist?

These questions elicit critical insights about why agency staff value new connections, and the questions themselves are highly flexible. For instance, in the first question posed above, evaluation staff can choose to ask about new information that would be helpful for any program or only about information relevant to programs that are top priorities for their agency. In other words, unmet desire surveys need not add one more thing to the plate; rather, they can be used to accelerate collaboration directly tied to current learning priorities.

Unmet desire surveys also legitimize informal collaborative relationships. Too often, calls for new collaboration in the policy sphere immediately segue into overly structured meetings that fail to uncover promising areas for joint learning and problem-solving. Meetings across government agencies are often scripted presentations about each organization’s activities, providing little insight on ways they could collaborate to achieve better results. Policy discussions with outside research experts tend to focus on formal evaluations and long-term research projects that don’t surface opportunities to accelerate learning in the near term. In contrast, unmet desire surveys explicitly legitimize the idea that diverse thinkers may want to connect only for informal knowledge exchange rather than formal events or partnerships. Indeed, even single conversations can greatly impact decision-makers, and, of course, so can more intensive relationships.

Whether the goal is informal or formal collaboration, the problem that needs to be solved is both factual and relational. In other words, the issue isn’t simply that strangers do not know each other—it’s also that strangers do not always know how to talk to one another. People care about how others relate to them and whether they can successfully relate to others. Uncertainty about relationality prevents people from interacting with others they do not know. This is why unmet desire surveys also include questions that directly measure hesitations about interacting with people from other agencies and organizations, and encourage agency staff to think about interactions from others’ perspectives.

The fact that the barriers to new collaborative relationships are both factual and relational underscores why people may not initiate them on their own. That’s why measuring unmet desire is only half the battle; it’s also important to ensure that evaluation staff have a plan in place to conduct matchmaking using the data gathered from the survey. One way is to create a new posting on the Evidence Project Portal (especially if the goal is to engage with outside researchers). A second way is to field the survey as part of a convening, which already has as one of its goals the development of new collaborative relationships. A third option is to directly broker connections. Regardless of which option is pursued, note that large amounts of extra capacity are likely unnecessary, at least at first. The key point is simply to ensure that matchmaking is a valued part of the process.

In sum, by deliberately inquiring about connections with others who have diverse forms of relevant expertise—and then making those connections anew—evaluation staff can generate greater enthusiasm and ownership among people who may not consider evaluation and evidence-building as part of their core responsibilities.

Plan of Action

Using existing authorities and resources, evaluation staff (such as evaluation officers at federal agencies) can take three steps to position unmet desire surveys as a standard component of the government’s evidence toolbox.

Step 1. Design and implement pilot unmet desire surveys.

Evaluation staff are well positioned to conduct outreach to assess unmet desire for new collaborative relationships within their agencies. While individual staff can work independently to design unmet desire surveys, it may be more fruitful to work together, via the Evaluation Officer Council, to design a baseline survey template. Individuals could then work with their teams to adapt the baseline template as needed for each agency, including identifying which agency staff to prioritize as well as the best way to phrase particular questions (e.g., regarding the types of connections that employees want in order to improve the effectiveness of their work or the types of hesitancies to ask about). Given that the question content is highly flexible, unmet desire surveys can directly accelerate learning agendas and build buy-in at the same time. Thus, they can yield tangible, concrete benefits with very little upfront cost.

Step 2. Meet unmet desires by matchmaking.

After the pilot surveys are administered, evaluation staff should act on their results. There are several ways to do this without new appropriations. One way is to create a posting for the Evidence Project Portal, which is explicitly designed to advertise opportunities for new collaborative relationships, especially with researchers outside the federal government. Another way is to field unmet desire surveys in advance of already-planned convenings, which themselves are natural places for matchmaking (e.g., the Agency for Healthcare Research and Quality experience described in the FAQs). Lastly, for new cross-agency collaborative relationships along with other situations, evaluation staff may wish to engage in other low-lift matchmaking on their own. Depending upon the number of people they choose to survey, and the prevalence of unmet desire they uncover, they may also wish to bring on short-term matchmakers through flexible hiring mechanisms (e.g., through the Intergovernmental Personnel Act). Regardless of which option is pursued, the key point is that matchmaking itself must be a valued part of this process. Documenting successes and lessons learned then sets the stage for using agency-specific discretionary funds to hire one or more in-house matchmakers as longer-term or staff appointments.

Step 3. Collect information on successes and lessons learned from the pilot.

Unmet desire surveys can be tricky to field because they entail asking employees about topics they may not be used to thinking about. It often takes some trial and error to figure out the best ways to ask about employees’ substantive goals and their hesitations about interacting with people they do not know. Piloting unmet desire surveys and follow-on matchmaking can not only demonstrate value (e.g., the impact of new collaborative relationships fostered through these combined efforts) to justify further investment but also suggest how evaluation leads might best structure future unmet desire surveys and subsequent matchmaking.

Conclusion

An unmet desire survey is an adaptable tool that can reveal fruitful pathways for connection and collaboration. Indeed, unmet desire surveys leverage the science of collaboration by ensuring that efforts to broker connections among strangers consider both substantive goals and uncertainty about relationality. Evaluation staff can pilot unmet desire surveys using existing authorities and resources, and then use the information gathered to identify opportunities for productive matchmaking via the Evidence Project Portal or other methods. Ultimately, positioning the survey as a standard component of the government’s evidence toolbox has great potential to support agency staff in advancing federal learning agendas and building a robust culture of evidence across the U.S. government.

Frequently Asked Questions

Have any agencies tried using unmet desire surveys? What impact did they have?

Yes, the Agency for Healthcare Research and Quality (AHRQ) used unmet desire surveys several times in 2023 and 2024. Part of AHRQ’s mission is to improve the quality and safety of healthcare delivery. It has prioritized scaling and spreading evidence-based approaches to implementing person-centered care planning for people living with or at risk for multiple chronic conditions. This requires fostering new cross-sector collaborative relationships between clinicians, patients, caregivers, researchers, payers, agency staff and other policymakers, and many others. That’s why, in advance of several recent convenings with these diverse stakeholders, AHRQ fielded unmet desire surveys among the participants. The surveys uncovered several avenues for informal and formal collaboration that stakeholders believed were necessary and, importantly, informed the agenda for their meetings. Whereas many convenings are composed of scripted presentations about individuals’ diverse activities, conducting the surveys in advance and presenting the results during the meeting shaped these agendas in more action-oriented ways.

AHRQ’s experience demonstrates a way to seamlessly incorporate unmet desire surveys into already-planned convenings, which themselves are natural opportunities for matchmaking. While some evaluation staff may wish to hire separate matchmakers or engage in matchmaking using outside mechanisms like the Evidence Project Portal, the AHRQ experience also demonstrates another low-lift, yet powerful, avenue. Lastly, while the majority of this memo and the FAQs focus on measuring unmet desire among agency staff, the AHRQ experience also demonstrates the applicability of this idea to other stakeholders as well.

Who should unmet desire surveys be administered to?

The best place to start, especially when resources are limited, is with potential evidence champions. These are people who are already committed to answering questions on their agency’s learning agenda and are likely to have an idea of the kinds of cross-agency or cross-sector collaborative relationships that would be helpful. These potential evidence champions may not self-identify as such; rather, they may see themselves as program managers, customer-experience experts, bureaucracy hackers, process innovators, or policy entrepreneurs. Regardless of terminology, the unmet desire survey gives people who are already motivated to collaborate and connect a clear opportunity to articulate their needs. Evaluation staff can then respond by posting on the Evidence Project Portal or conducting other matchmaking on their own to stimulate new and productive relationships for those people.

Who should conduct an unmet desire survey?

The administrator should be someone with whom agency staff feel comfortable discussing their needs (e.g., a member of an agency evaluation team) and who is able to effectively facilitate matchmaking—perhaps because of their network, their reputation within the agency, their role in convenings, or their connection to the Evidence Project Portal. The latter criterion helps ensure that staff expect useful follow-up, which in turn motivates survey completion and participation in follow-on activities; it also generates enthusiasm for engaging in new collaborative relationships (as well as creating broader buy-in for the learning agenda). In some cases, it may make the most sense to have multiple people from an evaluation team surveying different agency staff or co-sponsoring the survey with agency innovation offices. Explicit support from agency leadership for the survey and follow-on activities is also crucial for achieving staff buy-in.

What questions should be asked in an unmet desire survey?

Survey content is meant to be tailored and agency-specific, so the sample questions can be adapted as follows:

Which learning agenda question(s) are you focused on? Is there information about other programs within the government and/or information that outside researchers and other stakeholders have that would help answer it? What kinds of people would be helpful to connect with?
This question can be left entirely open-ended or be focused on particular priorities and/or particular potential collaborators (e.g., only researchers, or only other agency staff, etc.).

Are you looking for informal collaboration (oriented toward knowledge exchange) or formal collaboration (oriented toward projects with shared ownership, decision-making authority, and accountability)?
This question may invite responses related to either informal or formal collaboration, or instead may only ask about knowledge exchange (a relatively lower commitment that may be more palatable to agency leadership).

What hesitations (perhaps due to prior experiences, lack of explicit permission, stereotypes, and so on) do you have about interacting with other stakeholders? What hesitations do you think they might have about interacting with you?
This question should refer to specific types of hesitancy that survey administrators believe are most likely (e.g., ask about a few hesitancies that seem most likely to arise, such as lack of explicit permission, concerns about saying something inappropriate, or concerns about lack of trustworthy information).

Why should they want to connect with you?

Why do you think these connections don’t already exist?
These last two questions can similarly be left broad or include a few examples to help spark ideas.

Evaluation staff may also choose to only ask a subset of the questions.

Who should conduct matchmaking in response to an unmet desire survey?

Again, the answer is agency-specific. In cases that will use the Evidence Project Portal, agency evaluation staff will take the first stab at crafting postings. In other cases, meeting the unmet desire may occur via already-planned convenings or matchmaking on one’s own. Formalizing this duty as a part of one or more people’s official responsibilities sends a signal about how much this work is valued. Exactly who those people are will depend on the agency’s structure, as well as on whether there are already people in a given agency who see matchmaking as part of their job. The key point is that matchmaking itself should be a valued part of the process.

When is the right time to field an unmet desire survey?

While unmet desire surveys can be done any time and on a continuous basis, it is best to field them either when there is an upcoming convening (which itself is a natural opportunity for matchmaking) or when there is identified staff capacity for follow-on matchmaking and employee willingness to build collaborative relationships.

How is this tool different from other collaboration tools?

Many evaluation officers and their staff are already forming collaborative relationships as part of developing and advancing learning agendas. Unmet desire surveys place explicit focus on what kinds of new collaborative relationships agency staff want to have with staff in other programs, either within their agency/department or outside it. These surveys are designed to prompt staff to reflect on how the success of their program relates to what is happening elsewhere and to consider who might have information that is relevant and helpful, as well as any hesitations they have about interacting with those people. Unmet desire surveys measure both substantive goals and staff uncertainty about interacting with others.

The FAS-OMB Evidence Forum: Thinking Back and Looking Forward

Until a month ago, I was an event skeptic.

When it’s as easy as a Zoom link to connect with colleagues, I found it hard to believe that getting a bunch of people together around an agenda was ever really worth the time and effort.

Point one for my colleagues at FAS and the White House Office of Management and Budget (OMB), who thought that co-hosting an “Evidence Forum” in support of the White House Year of Evidence for Action was a good idea. And were completely right.

The FAS-OMB Evidence Forum, a half-day session in Washington, DC on October 7, 2022, proved that events can indeed drive forward progress when they include three key components: compelling ideas, effective champions, and an open structure that enables participants to identify intersections with their own work. Let’s touch on each of these in the context of the Evidence Forum.

Team FAS @ the FAS-OMB Evidence Forum

Compelling ideas

The Evidence Forum showcased four novel proposals for enhancing evidence-based policy and practice across the federal government. Three of these ideas were developed through the “Evidence for Action Challenge” co-sponsored by FAS and the Pew Charitable Trusts Evidence Project. These ideas were:

Incorporating evidence on what the public values into policymaking. For instance, public willingness to embrace various forms of climate-adaptation strategies can and should be considered alongside expert assessments when designing climate action plans.
Using unmet desire surveys to facilitate productive collaboration among federal agency staff and external experts. Such surveys could, in particular, identify where federal staff could use help from others in advancing agency learning agendas.
Launching an intergovernmental research and evaluation consortium focused on economic mobility. By linking datasets and creating reusable evaluation templates, such a consortium could enable cheap, fast, and reliable assessments of various economic programs.

A fourth idea also shared at the Forum was incorporating Living Evidence into federal initiatives. While traditional approaches to knowledge synthesis are often static and can quickly go out of date in rapidly evolving fields, Living Evidence uses thoughtfully designed, dynamic methodologies to produce knowledge summaries that are always current. This is especially important for topics where new research evidence is emerging rapidly, current evidence is uncertain, and new research might change policy or practice.

Effective champions

The Forum demonstrated how much more powerful ideas on paper become when articulated by someone who can give those ideas dimensionality and life. In an armchair discussion, Drs. Julian Elliott (Monash University) and Arlene Bierman (U.S. Agency for Healthcare Research and Quality) emphasized that Living Evidence is not a theoretical construct—it is already informing dynamic COVID-19 clinical-practice guidelines and shedding light on plant-based treatments as an alternative pain-management approach. Similarly, Kathy Stack (Yale University) drew on her extensive past experience at OMB to illustrate why intergovernmental partnerships are crucial for improving evaluation and delivery of programs that serve overlapping populations.

The Forum elevated newer voices as well. Postdoctoral research associate Nich Weller (Arizona State University) argued that while federal evidence efforts acknowledge the importance of social, cultural and Indigenous knowledge, they do not draw adequate attention to the challenges of generating, operationalizing, and integrating such evidence in routine policy and decision making. Nich laid out a vision for a federal evidence enterprise that would incorporate the living and lived experiences, knowledge, and values of the public alongside scientific findings and expert analysis. Associate Professor Adam Levine (Johns Hopkins University) also emphasized that effective evidence-based policy depends on interpersonal connections. As Adam explained,

The issue isn’t simply that strangers do not know each other—it’s also that strangers do not always know how to talk to one another.

Intentionally addressing this reality is essential to cultivating productive working relationships.

Open structure

We’ve all been to events where you get talked at for a few hours, then leave and continue on with your day. The FAS-OMB Evidence Forum broke this mold by providing opportunities for interactive Q&A throughout the first portion of the agenda—and then dedicating the second portion to an open ideation session that encouraged attendees to brainstorm how Living Evidence could be applied in their home institutions.

Once a lighthearted icebreaker question (“What is your favorite pasta shape and why?”) got creative juices flowing, the ideas came fast and furious. Working collaboratively first in pairs, then in small groups, and finally all together, participants ultimately homed in on four pressing policy questions that Living Evidence could help answer:

What types of stakeholder engagement strategies are most effective?
How do various social determinants of health affect health outcomes?
When should those suffering from long COVID be deemed eligible for disability benefits?
How can government bridge local knowledge and academic research on climate adaptation strategies?

Participants also worked together to outline specific components that could be included in Living Evidence research agendas for each of these questions and defined criteria for success.

Looking ahead

A lot of great stuff happened during the Forum itself. But I’ve been even more excited to witness the follow-on in the weeks since. Motivated by the Forum, agencies are already taking concrete steps to scope intergovernmental research and evaluation consortia for other policy domains, update existing surveys with unmet desire questions, reframe public values as empirical evidence, and consider what support and guidance is needed to position Living Evidence as a standard component of the federal evaluation toolkit.

And I am confident that, in what remains of the White House Year of Evidence and thereafter, much more exciting work is to come. To echo the words of Sir Jeremy Farrar (Wellcome) during his Evidence Forum keynote address,

This is the moment for us to grasp—not in fear, not in uncertainty, not to be down beaten by the challenges the world faces at the moment. But to say that we actually can make the world a better place.

How could anyone be skeptical of that?

Missed the Forum? A full recording can be accessed here, using the passcode vm6?$rK%

See this whitepaper recapping the event in more detail for additional policy recommendations.

Public Value Evidence for Public Value Outcomes: Integrating Public Values into Federal Policymaking

Summary

The federal government––through efforts like the White House Year of Evidence for Action––has made a laudable push to ensure that policy decisions are grounded in empirical evidence. While these efforts acknowledge the importance of social, cultural and Indigenous knowledges, they do not draw adequate attention to the challenges of generating, operationalizing, and integrating such evidence in routine policy and decision making. In particular, these endeavors are generally poor at incorporating the living and lived experiences, knowledge, and values of the public. This evidence—which we call evidence about public values—provides important insights for decision making and contributes to better policy or program designs and outcomes.

The federal government should broaden institutional capacity to collect and integrate evidence on public values into policy and decision making. Specifically, we propose that the White House Office of Management and Budget (OMB) and the White House Office of Science and Technology Policy (OSTP):

Provide a directive on the importance of public value evidence.
Develop an implementation roadmap for integrating public value evidence into federal operations (e.g., describing best practices for integrating it into federal decision making and developing skill-building opportunities for federal employees).

Challenge and Opportunity

Evidence about public values informs and improves policies and programs

Evidence about public values is, to put it most simply, information about what people prioritize, care about, or think about with respect to a particular issue, which may differ from what experts prioritize. It includes data collected through focus groups, deliberations, citizen review panels, community-based research, and public opinion surveys. Some of these methods rely on one-way flows of information (e.g., surveys) while others prioritize mutual exchange of information among policy makers and participating publics (e.g., deliberations).

Agencies facing complex policymaking challenges can utilize evidence about public values––along with expert- and evaluation-based evidence––to ensure decisions truly serve the broader public good. If collected as part of the policy-making process, evidence about public values can inform policy goals and programs in real time, including when program goals are taking shape or as programs are deployed.

Evidence about public values within the federal government: three challenges to integration

To fully understand and use public values in policymaking, the U.S. government must first broadly address three challenges.

First, the federal government does not sufficiently value evidence about public values when it researches and designs policy solutions. Federal employees often lack any directive or guidance from leadership indicating that collecting evidence about public values is valuable or important to evidence-based decision making. Efforts like the White House Year of Evidence for Action seek to better integrate evidence into policy making. Yet for many contexts and topics, scientific or evaluation-based evidence is just one type of evidence. The public’s wisdom, hopes, and perspectives play an important mediating role in determining and achieving desired public outcomes. The following examples illustrate ways public value evidence can support federal decision making:

An effort to implement climate intervention technologies (e.g., solar geoengineering) might be well-grounded in evidence from the scientific community. However, that same strategy may not consider the diverse values Americans hold about (i) how such research might be governed, (ii) who ought to develop those technologies, and (iii) whether or not they should be used at all. Public values are imperative for such complex, socio-technical decisions if we are to make good on the Year of Evidence’s dual commitment to scientific integrity (including expanded concepts of expertise and evidence) and equity (better understanding of “what works, for whom, and under what circumstances”).
Debates over evidence about the impacts of rising sea levels on national park infrastructure and protected features have historically been tense. To acknowledge the social-environmental complexity in play, park leadership has strived to include both expert assessments and engagement with publics on their own risk tolerance for various mitigation measures. This has helped officials prioritize limited resources as they make tough decisions about which park features and artifacts to continue preserving, and how.

Second, the federal government lacks effective mechanisms for collecting evidence about public values. Presently, public comment periods favor credentialed participants (advocacy groups, consultants, business groups, etc.) who possess established avenues for sharing their opinions and positions with policy makers. As a result, these credentialed participants shape policy, while other experiences, voices, and inputs go unheard. While the general public can contribute to government programs through platforms like Challenge.gov, credentialed participants still tend to dominate these processes. Effective mechanisms for bringing public values into decision making or research are generally confined to university, local government, and community settings. These methods include participatory budgeting, methods from usable or co-produced science, and participatory technology assessment. Some of these methods have been developed and applied to complex science and technology policy issues in particular, including climate change and various emerging technologies. Their use in federal agencies is far more limited. Even when an agency seeks to collect public values, it may be impeded by regulatory hurdles, such as the Paperwork Reduction Act (PRA), which can limit the collection of public values, ideas, or other input due to potentially long approval timelines and the perceived data collection burden on the public. Cumulatively, these factors prevent agencies from accurately gauging, and adapting to, public responses.

Third, federal agencies face challenges integrating evidence about public values into policy making. These challenges can be rooted in the regulatory hurdles described above, difficulties integrating with existing processes, and unfamiliarity with the benefits of collecting evidence about public values. Fortunately, studies have found specific attributes present among policymakers and agencies that allowed for the implementation and use of mechanisms for capturing public values. These attributes included:

Leadership who prioritized public involvement and helped address administrative uncertainties.
An agency culture responsive to broader public needs, concerns, and wants.
Agency staff familiar with mechanisms to capture public values and integrate them in the policy- and decision-making process. Such staff can help address translation issues, navigate regulatory hurdles, and better communicate the benefits of collecting public values with regard to agency needs. Unfortunately, many agencies do not have such staff, and there are no existing roadmaps or professional development programs to help build this capacity across agencies.

Aligning public values with current government policies promotes scientific integrity and equity

The White House Year of Evidence for Action presents an opportunity to address the primary challenges––namely a lack of clear direction, collection protocols, and evidence integration strategies––currently impeding public values evidence’s widespread use in the federal government. Our proposal below is well aligned with the Year of Evidence’s central commitments, including:

A commitment to scientific integrity. Complex problems require expanded concepts of expertise and evidence to ensure that important details and public concerns are not lost or overlooked.
A commitment to equity. We have a better understanding of “what works, for whom, and under what circumstances” when we have ways of discerning and integrating public values into evidence-based decision making. Methods for integrating public values into decision making complement other emerging best practices––such as the co-creation of evaluation studies and including Indigenous knowledges and perspectives––in the policy making process.

Furthermore, this proposal aligns with the goals of the Year of Evidence for Action to “share leading practices to generate and use research-backed knowledge to advance better, more equitable outcomes for all America…” and to “…develop new strategies and structures to promote consistent evidence-based decision-making inside the Federal Government.”

Plan of Action

To integrate public values into federal policy making, the White House Office of Management and Budget (OMB) and the White House Office of Science and Technology Policy (OSTP) should:

Develop a high-level directive for agencies about the importance of collecting public values as a form of evidence to inform policy making.
Oversee the development of a roadmap for the integration of evidence about public values across government, including pathways for training federal employees.

Recommendation 1. OMB and OSTP should issue a high-level directive providing clear direction and strong backing for agencies to collect and integrate evidence on public values into their evidence-based decision-making procedures.

Given the potential utility of integrating public value evidence into science and technology policy as well as OSTP’s involvement in efforts to promote evidence-based policy, OSTP makes a natural partner in crafting this directive alongside OMB. This directive should clearly connect public value evidence to the current policy environment. As described above, efforts like the Foundations for Evidence-Based Policymaking Act (Evidence Act) and the White House Year of Evidence for Action provide a strong rationale for the collection and integration of evidence about public values. Longer-standing policies, including the Crowdsourcing and Citizen Science Act, provide further context and guidance for the importance of collecting input from broad publics.

Recommendation 2. As part of the directive, or as a follow up to it, OMB and OSTP should oversee the development of a roadmap for integrating evidence about public values across government.

The roadmap should be developed in consultation with various federal stakeholders, such as members of the Evaluation Officer Council, representatives from the Equitable Data Working Group, customer experience strategists, and relevant conceptual and methods experts from within and outside the government.

A comprehensive roadmap would include the following components:

Appropriate contexts and uses for gathering and integrating public values as evidence. Public values should be collected when the issue is one where scientific or expert evidence is necessary, but not sufficient to address the question at hand. This may be due to (i) uncertainty, (ii) high levels of value disagreement, (iii) cases where the societal implications of a policy or program could be wide ranging, or (iv) situations where policy or program outcomes have inequitable impacts on certain communities.
Specific approaches to collecting and integrating public values evidence, accompanied by illustrative case studies describing how the methods have been used. These could include practices for recruiting and convening diverse public participants; promoting exchanges among those participants; comparing public values against scientific or expert evidence; and ensuring that public values are translated into actionable policy solutions. While various approaches for measuring and applying public values evidence exist, a few additional conditions can help enable success: staff knowledgeable about social science methods and the importance of public value input; clarity of regulatory requirements; and buy-in from agency leadership.

Potential training program designs for federal employees. The goal of these training programs should be to develop a workforce that can integrate public value evidence into U.S. policymaking. Participants in these trainings should learn about the importance of integrating evidence about public values alongside other types of evidence, as well as strategies to collect and integrate that evidence into policy and programs. These training programs should promote active learning through applied pilot projects with the learner’s agency or unit.
A designated center tasked with improving methods and tools for integrating evidence about public values into federal decision making. This center could exist as a public-private partnership, a federally funded research and development center, or an innovation lab¹ within an agency. This center could conduct ongoing research, evaluation, and pilot programs of new evidence-gathering methods and tools. This would ensure that as agencies collect and apply evidence about public values, they do so with the latest expertise and techniques.

Conclusion

Collecting evidence about the living and lived experiences, knowledge, and aspirations of the public can help inform policies and programs across government. While methods for collecting evidence about public values have proven effective, they have not been integrated into evidence-based policy efforts within the federal government. The integration of evidence about public values into policy making can promote the provision of broader public goods, elevate the perspectives of historically marginalized communities, and reveal policy or program directions different from those prioritized by experts. The proposed directive and roadmap––while only a first step––would help ensure the federal government considers, respects, and responds to our diverse nation’s values.

Frequently Asked Questions

Which agencies or areas of government could use public value evidence?

Federal agencies can use public value evidence where additional information about what the public thinks, prioritizes, and cares about could improve programs and policies. For example, policy decisions characterized by high uncertainty, potential value disputes, and high stakes could benefit from a broader review of considerations by diverse members of the public to ensure that novel options and unintended consequences are considered in the decision making process. In the context of science and technology related decision making, these situations were called “post-normal science” by Silvio Funtowicz and Jerome Ravetz. They called for an extension of who counts as a subject matter expert in the face of such challenges, citing the potential for technical analyses to overlook important societal values and considerations.

Why should OSTP be engaged in furthering the use of public value evidence?

Many issues where science and technology meet societal needs and policy considerations warrant broad public value input. These issues include emerging technologies with societal implications and existing S&T challenges that have far-reaching impacts on society (e.g., climate change). Further, OSTP is already involved in Evidence for Action initiatives and can assist in bringing in external expertise on methods and approaches.

Why do we need this sort of evidence when public values are represented by elected officials?

While guidance from elected officials is an important mechanism for representing public values, evidence collected about public values through other means can be tailored to specific policy making contexts and can explore issue-specific challenges and opportunities.

Are there any examples of public value evidence being used in the government?

There are likely more current examples of government efforts to identify and integrate public value evidence than we can point to. The roadmap-building process should involve identifying those and finding common language to describe diverse public value evidence efforts across government. For specific known examples, see footnotes 1 and 2.

Is evidence about public values different from evidence collected about evaluations?

Evidence about public values might include evidence collected through program and policy evaluations but includes broader types of evidence. The evaluation of policies and programs generally focuses on assessing effectiveness or efficiency. Evidence about public values would be used in broader questions about the aims or goals of a program or policy.

Unlocking Federal Grant Data To Inform Evidence-Based Science Funding

Summary

Federal science-funding agencies spend tens of billions of dollars each year on extramural research. There is growing concern that this funding may be inefficiently awarded (e.g., by under-allocating grants to early-career researchers or to high-risk, high-reward projects). But because there is a dearth of empirical evidence on best practices for funding research, much of this concern is anecdotal or speculative at best.

The National Institutes of Health (NIH) and the National Science Foundation (NSF), as the two largest funders of basic science in the United States, should therefore develop a platform to provide researchers with structured access to historical federal data on grant review, scoring, and funding. This action would build on momentum from both the legislative and executive branches surrounding evidence-based policymaking, as well as on ample support from the research community. And though grantmaking data are often sensitive, there are numerous successful models from other sectors for sharing sensitive data responsibly. Applying these models to grantmaking data would strengthen the incorporation of evidence into grantmaking policy while also guiding future research (such as larger-scale randomized controlled trials) on efficient science funding.

Challenge and Opportunity

The NIH and NSF together disburse tens of billions of dollars each year in the form of competitive research grants. At a high level, the funding process typically works like this: researchers submit detailed proposals for scientific studies, often to particular program areas or topics that have designated funding. Then, expert panels assembled by the funding agency read and score the proposals. These scores are used to decide which proposals will or will not receive funding. (The FAQ provides more details on how the NIH and NSF review competitive research grants.)

A growing number of scholars have advocated for reforming this process to address perceived inefficiencies and biases. Citing evidence that the NIH has become increasingly incremental in its funding decisions, for instance, commentators have called on federal funding agencies to explicitly fund riskier science. These calls grew louder following the success of mRNA vaccines against COVID-19, a technology that struggled for years to receive federal funding due to its high-risk profile.

Others are concerned that the average NIH grant-winner has become too old, especially in light of research suggesting that some scientists do their best work before turning 40. Still others lament the “crippling demands” that grant applications exert on scientists’ time, and argue that a better approach could be to replace or supplement conventional peer-review evaluations with lottery-based mechanisms.

These hypotheses are all reasonable and thought-provoking. Yet there exists surprisingly little empirical evidence to support these theories. If we want to effectively reimagine—or even just tweak—the way the United States funds science, we need better data on how well various funding policies work.

Academics and policymakers interested in the science of science have rightly called for increased experimentation with grantmaking policies in order to build this evidence base. But, realistically, such experiments would likely need to be conducted hand-in-hand with the institutions that fund and support science, investigating how changes in policies and practices shape outcomes. Although such experimentation is moving closer to reality, the knowledge gap about how best to support science should ideally be filled sooner rather than later.

Fortunately, we need not wait that long for new insights. The NIH and NSF have a powerful resource at their disposal: decades of historical data on grant proposals, scores, funding status, and eventual research outcomes. These data hold immense value for those investigating the comparative benefits of various science-funding strategies. Indeed, these data have already supported excellent and policy-relevant research. Examples include Ginther et al. (2011), which studies how race and ethnicity affect the probability of receiving an NIH award, and Myers (2020), which studies whether scientists are willing to change the direction of their research in response to increased resources. And there is potential for more. While randomized controlled trials (RCTs) remain the gold standard for causal inference, economists have for decades been developing methods for drawing causal conclusions from observational data. Applying these methods to federal grantmaking data could quickly and cheaply yield evidence-based recommendations for optimizing federal science funding.

Opening up federal grantmaking data by providing a structured and streamlined access protocol would increase the supply of valuable studies such as those cited above. It would also build on growing governmental interest in evidence-based policymaking. Since its first week in office, the Biden-Harris administration has emphasized the importance of ensuring that “policy and program decisions are informed by the best-available facts, data and research-backed information.” Landmark guidance issued in August 2022 by the White House Office of Science and Technology Policy directs agencies to ensure that federally funded research—and underlying research data—are freely available to the public (i.e., not paywalled) at the time of publication.

On the legislative side, the 2018 Foundations for Evidence-Based Policymaking Act (popularly known as the Evidence Act) calls on federal agencies to develop a “systematic plan for identifying and addressing policy questions” relevant to their missions. The Evidence Act specifies that the general public and researchers should be included in developing these plans. The Evidence Act also calls on agencies to “engage the public in using public data assets [and] providing the public with the opportunity to request specific data assets to be prioritized for disclosure.” The recently proposed Secure Research Data Network Act calls for building exactly the type of infrastructure that would be necessary to share federal grantmaking data in a secure and structured way.

Plan of Action

There is clearly appetite to expand access to and use of federally held evidence assets. Below, we recommend four actions for unlocking the insights contained in NIH- and NSF-held grantmaking data—and applying those insights to improve how federal agencies fund science.

Recommendation 1. Review legal and regulatory frameworks applicable to federally held grantmaking data.

The White House Office of Management and Budget (OMB)’s Evidence Team, working with the NIH’s Office of Data Science Strategy and the NSF’s Evaluation and Assessment Capability, should review existing statutory and regulatory frameworks to see whether there are any legal obstacles to sharing federal grantmaking data. If the review team finds that the NIH and NSF face significant legal constraints when it comes to sharing these data, then the White House should work with Congress to amend prevailing law. Otherwise, OMB—in a possible joint capacity with the White House Office of Science and Technology Policy (OSTP)—should issue a memo clarifying that agencies are generally permitted to share federal grantmaking data in a secure, structured way, and stating any categorical exceptions.

Recommendation 2. Build the infrastructure to provide external stakeholders with secure, structured access to federally held grantmaking data for research.

Federal grantmaking data are inherently sensitive, containing information that could jeopardize personal privacy or compromise the integrity of review processes. But even sensitive data can be responsibly shared. The NIH has previously shared historical grantmaking data with some researchers, but the next step is for the NIH and NSF to develop a system that enables broader and easier researcher access. Other federal agencies have developed strategies for handling highly sensitive data in a systematic fashion, which can provide helpful precedent and lessons. Examples include:

The U.S. Census Bureau (USCB)’s Longitudinal Employer-Household Dynamics (LEHD) data. These data link individual workers to their respective firms, and provide information on salary, job characteristics, and worker and firm location. Approved researchers have relied on these data to better understand labor-market trends.
The Department of Transportation (DOT)’s Secure Data Commons. The Secure Data Commons allows third-party firms (such as Uber, Lyft, and Waze) to provide individual-level mobility data on trips taken. Approved researchers have used these data to understand mobility patterns in cities.

In both cases, the data in question are available to external researchers contingent on agency approval of a research request that clearly explains the purpose of a proposed study, why the requested data are needed, and how those data will be managed. Federal agencies managing access to sensitive data have also implemented additional security and privacy-preserving measures, such as:

Only allowing researchers to access data via a remote server, or in some cases, inside a Federal Statistical Research Data Center. In other words, the data are never copied onto a researcher’s personal computer.
Replacing any personal identifiers with random number identifiers once any data merges that require personal identifiers are complete.
Reviewing any tables or figures prior to circulating or publishing results, to ensure that all results are appropriately aggregated and that no individual-level information can be inferred.
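
To make the second and third measures concrete, the short Python sketch below shows one way they could work in principle. It is an illustration only, not a description of any agency's actual system: the field names (`applicant_id`, `panel`), the function names, and the minimum cell size of 10 are all assumptions made for this example.

```python
import secrets
from collections import Counter


def pseudonymize(records, id_field="applicant_id"):
    """Return copies of the records with the personal identifier replaced
    by a random number. The same person always receives the same number,
    so records can still be grouped after the real identifier is dropped.
    (Hypothetical helper; run only after all identifier-based merges.)"""
    mapping = {}   # real identifier -> random number
    used = set()   # random numbers already handed out
    cleaned_records = []
    for record in records:
        real_id = record[id_field]
        if real_id not in mapping:
            token = secrets.randbelow(10**12)
            while token in used:  # avoid giving two people the same number
                token = secrets.randbelow(10**12)
            used.add(token)
            mapping[real_id] = token
        cleaned = dict(record)  # copy, so the input is left unchanged
        cleaned[id_field] = mapping[real_id]
        cleaned_records.append(cleaned)
    return cleaned_records


def suppressed_counts(records, group_field, min_cell_size=10):
    """Count records per group, replacing any count below min_cell_size
    with None so that small groups cannot be singled out in a table.
    The threshold of 10 is an assumed value, not an agency standard."""
    counts = Counter(record[group_field] for record in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Toy, entirely made-up records for illustration.
toy_records = [
    {"applicant_id": "jane.doe", "panel": "Panel A"},
    {"applicant_id": "jane.doe", "panel": "Panel A"},
    {"applicant_id": "john.roe", "panel": "Panel B"},
]
safe_records = pseudonymize(toy_records)
table = suppressed_counts(safe_records, "panel", min_cell_size=2)
```

In a real repository the mapping from identifiers to random numbers would be discarded or held only by the data provider, and disclosure review would involve human judgment beyond a simple count threshold.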

Building on these precedents, the NIH and NSF should (ideally jointly) develop secure repositories to house grantmaking data. This action aligns closely with recommendations from the U.S. Commission on Evidence-Based Policymaking, as well as with the above-referenced Secure Research Data Network Act (SRDNA). Both the Commission recommendations and the SRDNA advocate for secure ways to share data between agencies. Creating one or more repositories for federal grantmaking data would be an action that is simultaneously narrower and broader in scope (narrower in terms of the types of data included, broader in terms of the parties eligible for access). As such, this action could be considered either a precursor to or an expansion of the SRDNA, and could be logically pursued alongside SRDNA passage.

Once a secure repository is created, the NIH and NSF should (again, ideally jointly) develop protocols for researchers seeking access. These protocols should clearly specify who is eligible to submit a data-access request, the types of requests that are likely to be granted, and technical capabilities that the requester will need in order to access and use the data. Data requests should be evaluated by a small committee at the NIH and/or NSF (depending on the precise data being requested). In reviewing the requests, the committee should consider questions such as:

How important and policy-relevant is the question that the researcher is seeking to answer? If policymakers knew the answer, what would they do with that information? Would it inform policy in a meaningful way?
How well can the researcher answer the question using the data they are requesting? Can they establish a clear causal relationship? Would we be comfortable relying on their conclusions to inform policy?

Finally, NIH and NSF should consider including right-to-review clauses in agreements governing sharing of grantmaking data. Such clauses are typical when using personally identifiable data, as they give the data provider (here, the NIH and NSF) the chance to ensure that all data presented in the final research product has been properly aggregated and no individuals are identifiable. The Census Bureau’s Disclosure Review Board can provide some helpful guidance for NIH and NSF to follow on this front.

Recommendation 3. Encourage researchers to utilize these newly available data, and draw on the resulting research to inform possible improvements to grant funding.

The NIH and NSF frequently face questions and trade-offs when deciding if and how to change existing grantmaking processes. Examples include:

How can we identify promising early-career researchers if they have less of a track record? What signals should we look for?
Should we cap the amount of federal funding that individual scientists can receive, or should we let star researchers take on more grants? In general, is it better to spread funding across more researchers or concentrate it among star researchers?
Is it better to let new grantmaking agencies operate independently, or to embed them within larger, existing agencies?

Typically, these agencies have very little academic or empirical evidence to draw on for answers. A large part of the problem has been the lack of access to data that researchers need to conduct relevant studies. Expanding access, per Recommendations 1 and 2 above, is necessary but not sufficient. Agencies must also invest in attracting researchers to use the data in a socially useful way.

Broadly advertising the new data will be critical. Announcing a new request for proposals (RFP) through the NIH and/or the NSF for projects explicitly using the data could also help. These RFPs could guide researchers toward the highest-impact and most policy-relevant questions, such as those above. The NSF’s “Science of Science: Discovery, Communication and Impact” program would be a natural fit to take the lead on encouraging researchers to use these data.

The goal is to create funding opportunities and programs that give academics clarity on the key issues and questions on which federal grantmaking agencies need guidance. In turn, the evidence that academics build should help inform grantmaking policy.

Conclusion

Basic science is a critical input into innovation, which in turn fuels economic growth, health, prosperity, and national security. The NIH and NSF were founded with these critical missions in mind. To fully realize their missions, the NIH and NSF must understand how to maximize scientific return on federal research spending. And to help, researchers need to be able to analyze federal grantmaking data. Thoughtfully expanding access to this key evidence resource is a straightforward, low-cost way to grow the efficiency—and hence impact—of our federally backed national scientific enterprise.

Frequently Asked Questions

How does the NIH currently select research proposals for funding?

For an excellent discussion of this question, see Li (2017). Briefly, the NIH is organized around 27 “Institutes or Centers” (ICs) which typically correspond to disease areas or body systems. ICs have budgets each year that are set by Congress. Research proposals are first evaluated by around 180 different “study sections”, which are committees organized by scientific areas or methods. After being evaluated by the study sections, proposals are returned to their respective ICs. The highest-scoring proposals in each IC are funded, up to budget limits.

How does the NSF currently select research proposals for funding?

Research proposals are typically submitted in response to announced funding opportunities, which are organized around different programs (topics). Each proposal is sent by the Program Officer to at least three independent reviewers who do not work at the NSF. These reviewers judge the proposal on its Intellectual Merit and Broader Impacts. The Program Officer then uses the independent reviews to make a funding recommendation to the Division Director, who makes the final award/decline decision. More details can be found on the NSF’s webpage.

What data on grant funding at the NIH and NSF is currently (publicly) available?

The NIH and NSF both provide data on approved proposals. These data can be found on the RePORTER site for the NIH and award search site for the NSF. However, these data do not provide any information on the rejected applications, nor do they provide information on the underlying scores of approved proposals.

Strengthening Policy by Bringing Evidence to Life

Summary

In a 2021 memorandum, President Biden instructed all federal executive departments and agencies to “make evidence-based decisions guided by the best available science and data.” This policy is sound in theory but increasingly difficult to implement in practice. With millions of new scientific papers published every year, parsing and acting on research insights presents a formidable challenge.

A solution, and one that has proven successful in helping clinicians effectively treat COVID-19, is to take a “living” approach to evidence synthesis. Conventional systematic reviews, meta-analyses, and associated guidelines and standards are published as static products and are updated infrequently (e.g., every four to five years), if at all. This approach is inefficient and produces evidence products that quickly go out of date. It also leads to research waste and poorly allocated research funding.

By contrast, emerging “Living Evidence” models treat knowledge synthesis as an ongoing endeavor. By combining (i) established, scientific methods of summarizing science with (ii) continuous workflows and technology-based solutions for information discovery and processing, Living Evidence approaches yield systematic reviews, and other evidence and guidance products, that are always current.

The recent launch of the White House Year of Evidence for Action provides a pivotal opportunity to harness the Living Evidence model to accelerate research translation and advance evidence-based policymaking. The federal government should consider a two-part strategy to embrace and promote Living Evidence. The first part of this strategy positions the U.S. government to lead by example by embedding Living Evidence within federal agencies. The second part focuses on supporting external actors in launching and maintaining Living Evidence resources for the public good.

Challenge and Opportunity

We live in a time of veritable “scientific overload”. The number of scientific papers in the world has surged exponentially over the past several decades (Figure 1), and millions of new scientific papers are published every year. Making sense of this deluge of documents presents a formidable challenge. For any given topic, experts have to (i) scour the scientific literature for studies on that topic, (ii) separate out low-quality (or even fraudulent) research, (iii) weigh and reconcile contradictory findings from different studies, and (iv) synthesize study results into a product that can usefully inform both societal decision-making and future scientific inquiry.

This process has evolved over several decades into a scientific method known as “systematic review” or “meta-analysis”. Systematic reviews and meta-analyses are detailed and credible, but often take over a year to produce and rapidly go out of date once published. Experts often compensate by drawing attention to the latest research in blog posts, op-eds, “narrative” reviews, informal memos, and the like. But while such “quick and dirty” scanning of the literature is timely, it lacks scientific rigor. Hence those relying on “the best available science” to make informed decisions must choose between summaries of science that are reliable or current…but not both.

The lack of trustworthy and up-to-date summaries of science constrains efforts, including efforts championed by the White House, to promote evidence-informed policymaking. It also leads to research waste when scientists conduct research that is duplicative and unnecessary, and degrades the efficiency of the scientific ecosystem when funders support research that does not address true knowledge gaps.

Figure 1

Total number of scientific papers published over time, according to the Microsoft Academic Graph (MAG) dataset. (Source: Herrmannova and Knoth, 2016)

The emerging Living Evidence paradigm solves these problems by treating knowledge synthesis as an ongoing rather than static endeavor. By combining (i) established, scientific methods of summarizing science with (ii) continuous workflows and technology-based solutions for information discovery and processing, Living Evidence approaches yield systematic reviews that are always up to date with the latest research. An opinion piece published in The New York Times called this approach “a quiet revolution to surface the best-available research and make it accessible for all.”

To take a Living Evidence approach, multidisciplinary teams of subject-matter experts and methods experts (e.g., information specialists and data scientists) first develop an evidence resource—such as a systematic review—using standard approaches. But the teams then commit to regular updates of the evidence resource at a frequency that makes sense for their end users (e.g., once a month). Using technologies such as natural-language processing and machine learning, the teams continually monitor online databases to identify new research. Any new research is rapidly incorporated into the evidence resource using established methods for high-quality evidence synthesis. Figure 2 illustrates how Living Evidence builds on and improves traditional approaches for evidence-informed development of guidelines, standards, and other policy instruments.
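
One update cycle of the workflow described above can be sketched in Python. This is only an illustration under stated assumptions: the record fields (`id`, `effect`, `se`) and function names are hypothetical, and fixed-effect inverse-variance pooling is used here simply as one standard meta-analysis technique, not as the method any particular Living Evidence team uses. The sketch also omits the screening and quality-appraisal steps that experts would perform before a study is included.

```python
import math


def find_new_records(known_ids, search_results):
    """Return records from the latest search whose IDs have not been seen before."""
    seen = set(known_ids)
    new = []
    for rec in search_results:
        if rec["id"] not in seen:
            seen.add(rec["id"])   # also drops duplicates within the same search
            new.append(rec)
    return new


def pooled_estimate(studies):
    """Fixed-effect inverse-variance pooled effect and its standard error."""
    weights = [1.0 / s["se"] ** 2 for s in studies]
    total = sum(weights)
    effect = sum(w * s["effect"] for w, s in zip(weights, studies)) / total
    return effect, math.sqrt(1.0 / total)


def living_update(included, search_results):
    """One update cycle: add newly found studies, then re-pool the evidence."""
    new = find_new_records([s["id"] for s in included], search_results)
    updated = included + new
    return updated, pooled_estimate(updated)
```

Running `living_update` on a schedule (for example, monthly) against fresh search results would keep the pooled estimate current, which is the core difference from a one-off review.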

Figure 2

Illustration of how a Living Evidence approach to development of evidence-informed policies (such as clinical guidelines) is more current and reliable than traditional approaches. (Source: Author-developed graphic)

Living Evidence products are more trusted by stakeholders, enjoy greater engagement (up to a 300% increase in access/use, based on internal data from the Australian Stroke Foundation), and support improved translation of research into practice and policy. Living Evidence holds particular value for domains in which research evidence is emerging rapidly, current evidence is uncertain, and new research might change policy or practice. For example, Nature has credited Living Evidence with “help[ing] chart a route out” of the worst stages of the COVID-19 pandemic. The World Health Organization (WHO) has since committed to using the Living Evidence approach as the organization’s “main platform” for knowledge synthesis and guideline development across all health issues.

Yet Living Evidence approaches remain underutilized in most domains. Many scientists are unaware of Living Evidence approaches. The minority who are familiar often lack the tools and incentives to carry out Living Evidence projects directly. The result is an “evidence to action” pipeline far leakier than it needs to be. Entities like government agencies need credible and up-to-date evidence to efficiently and effectively translate knowledge into impact.

It is time to change the status quo. The Foundations for Evidence-Based Policymaking Act of 2018 (“Evidence Act”), signed into law in January 2019, advances “a vision for a nation that relies on evidence and data to make decisions at all levels of government.” The Biden Administration’s “Year of Evidence” push has generated significant momentum around evidence-informed policymaking. Demonstrated successes of Living Evidence approaches with respect to COVID-19 have sparked interest in these approaches specifically. The time is ripe for the federal government to position Living Evidence as the “gold standard” of evidence products, and the United States as a leader in knowledge discovery and synthesis.

Plan of Action

The federal government should consider a two-part strategy to embrace and promote Living Evidence. The first part of this strategy positions the U.S. government to lead by example by embedding Living Evidence within federal agencies. The second part focuses on supporting external actors in launching and maintaining Living Evidence resources for the public good.

Part 1. Embedding Living Evidence within federal agencies

Federal science agencies are well positioned to carry out Living Evidence approaches directly. Living Evidence requires “a sustained commitment for the period that the review remains living.” Federal agencies can support the continuous workflows and multidisciplinary project teams needed for excellent Living Evidence products.

In addition, Living Evidence projects can be very powerful mechanisms for building effective, multi-stakeholder partnerships that last—a key objective for a federal government seeking to bolster the U.S. scientific enterprise. A recent example is Wellcome Trust’s decision to fund suites of living systematic reviews in mental health as a foundational investment in its new mental-health strategy, recognizing this as an important opportunity to build a global research community around a shared knowledge source.

Greater interagency coordination and external collaboration will facilitate implementation of Living Evidence across government. As such, President Biden should issue an Executive Order establishing a Living Evidence Interagency Policy Committee (LEIPC) modeled on the effective Interagency Arctic Research Policy Committee (IARPC). The LEIPC would be chartered as an Interagency Working Group of the National Science and Technology Council (NSTC) Committee on Science and Technology Enterprise, and chaired by the Director of the White House Office of Science and Technology Policy (OSTP; or their delegate). Membership would comprise representatives from federal science agencies, including agencies that currently create and maintain evidence clearinghouses, other agencies deeply invested in evidence-informed decision making, and non-governmental experts with deep experience in the practice of Living Evidence and/or associated capabilities (e.g., information science, machine learning).

The LEIPC would be tasked with (1) supporting federal implementation of Living Evidence, (2) identifying priority areas¹ and opportunities for federally managed Living Evidence projects, and (3) fostering greater collaboration between government and external stakeholders in the evidence community. More detail on each of these roles is provided below.

Supporting federal implementation of Living Evidence

Widely accepted guidance for living systematic reviews (LSRs), one type of Living Evidence product, has been published. The LEIPC, working closely with OSTP, the White House Office of Management and Budget (OMB), and the federal Evaluation Officer Council (EOC), should adapt this guidance for the U.S. federal context, resulting in an informational resource for federal agencies seeking to launch or fund Living Evidence projects. The guidance should also be used to update systematic-review processes used by federal agencies and organizations contributing to national evidence clearinghouses.²

Once the federally tailored guidance has been developed, the White House should direct federal agencies to consider and pursue opportunities to embed Living Evidence within their programs and operations. The policy directive could take the form of a Presidential Memorandum, a joint management memo from the heads of OSTP and OMB, or similar. This directive would (i) emphasize the national benefits that Living Evidence could deliver, and (ii) provide agencies with high-level justification for using discretionary funding on Living Evidence projects and for making decisions based on Living Evidence insights.

Identifying priority areas and opportunities for federally managed Living Evidence projects

The LEIPC—again working closely with OSTP, OMB, and the EOC—should survey the federal government for opportunities to deploy Living Evidence internally. Box 1 provides examples of opportunities that the LEIPC could consider.

Below are four illustrative examples of existing federal efforts that could be augmented with Living Evidence.

Example 1: National Primary Drinking Water Regulations. The U.S. Environmental Protection Agency (EPA) currently reviews and updates the National Primary Drinking Water Regulations every six years. But society now produces millions of new chemicals each year, including numerous contaminants of emerging concern (CEC) for drinking water. Taking a Living Evidence approach to drinking-water safety could yield drinking-water regulations that are updated continuously as information on new contaminants comes in, rather than periodically (and potentially after new contaminants have already begun to cause harm).

Example 2: Guidelines for entities participating in the National Wastewater Surveillance System. Australia has demonstrated how valuable Living Evidence can be for COVID-19 management and response. Meanwhile, declines in clinical testing and the continued emergence of new SARS-CoV-2 variants are positioning wastewater surveillance as an increasingly important public-health tool. But no agency or organization has yet taken a Living Evidence approach to the practice of testing wastewater for disease monitoring. Living Evidence could inform practitioners in real time on evolving best protocols and practices for wastewater sampling, concentration, testing, and data analysis.

Example 3: Department of Education Best Practices Clearinghouse. The Best Practices Clearinghouse was launched at President Biden’s direction to support a variety of stakeholders in reopening and operating post-pandemic. Applying Living Evidence analysis to the resources that the Clearinghouse has assembled would help ensure that instruction remains safe and effective in a dramatically transformed and evolving educational landscape.

Example 4: National Climate Assessment. The National Climate Assessment (NCA) is a Congressionally mandated review of climate science and impacts on the United States. The NCA is issued quadrennially, but climate change is presenting itself in new and worrying ways every year. Urgent climate action must be backed up by emergent climate knowledge. While a longer-term goal could be to transition the entire NCA into a frequently updated “living” mode, a near-term effort could focus on transitioning NCA subtopics where the need for new knowledge is especially pressing. For instance, the emergence and intensification of megafires in the West has upended our understanding of fire dynamics. A Living Evidence resource on fire science could give policymakers and program officials critical, up-to-date information on how best to mitigate, respond to, and recover from catastrophic megafires.

The product of this exercise should be a report that describes each of the opportunities identified and recommends priority projects to pursue. In developing its priority list, the LEIPC should account for both the likely impact of a potential Living Evidence project and the near-term feasibility of that project. While the report could outline visions for ambitious Living Evidence undertakings that would require a significant time investment to realize fully (e.g., transitioning the entire National Climate Assessment into a frequently updated “living” mode), it should also scope projects that could be completed within two years and serve as pilots/proofs of concept. Lessons learned from the pilots could ultimately inform a national strategy for incorporating Living Evidence into federal government more systematically. Successful pilots could continue and grow beyond the end of the two-year period, as appropriate.

Fostering greater collaboration between government and external stakeholders

The LEIPC should create an online “LEIPC Collaborations” platform that connects researchers, practitioners, and other stakeholders both inside and outside government. The platform would emulate IARPC Collaborations, which has built out a community of more than 3,000 members and dozens of communities of practice dedicated to the holistic advancement of Arctic science. As one stakeholder has explained:

“IARPC Collaborations members interact primarily in virtual spaces including both video conferencing and a social networking website. Open to anyone who wishes to join, the website serves not only as a venue for sharing information in-between meetings, but also lowers the barrier to meetings and to the IARPC Collaborations community in general, allows the video conferences as well as the IARPC Collaborations community to be open to all, not just an exclusive group of people who happen to be included in an email. Together, IARPC Collaborations members have realized an unprecedented degree of communication, coordination and collaboration, creating new knowledge and contributing to science-informed decision making. The IARPC community managers utilize the IARPC Collaborations website not only for project management, but also to support public engagement. The website contains user-generated-content sharing system where members log-in to share resources such as funding opportunities, publications and reports, events, and general announcements. The community managers also provide communication training for two types of members of IARPC Collaborations: team leaders in order to enhance leadership skill and team engagement, and early career scientists in order to enhance their careers through networking and building interdisciplinary collaborations.”

LEIPC Collaborations could deliver the same participatory opportunities and benefits for members of the evidence community, facilitating holistic advancement of Living Evidence.

Part 2. Make it easier for scientists and researchers to develop LSRs

Many government efforts could be supported by internal Living Evidence initiatives, but not every valuable Living Evidence effort should be conducted by government. Many useful Living Evidence programs will require deep domain knowledge and specialized skills that teams of scientists and researchers working outside of government are best positioned to deliver.

But experts interested in pursuing Living Evidence efforts face two major difficulties. The first is securing funding. Very little research funding is awarded for the sole purpose of conducting systematic reviews and other types of evidence syntheses. The funding that is available is typically not commensurate with the resource and personnel needs of a high-quality synthesis. Living Evidence demands efficient knowledge discovery and the involvement of multidisciplinary teams possessing overlapping skill sets. Yet federal research grants are often structured in a way that precludes principal investigators from hiring research software engineers or from founding co-led research groups.

The second is aligning with incentives. Systematic reviews and other types of evidence syntheses are often not recognized as “true” research outputs by funding agencies or university tenure committees. That is, they are often not given the same weight in research metrics, despite (i) utilizing well-established scientific methodologies involving detailed protocols and advanced data and statistical techniques, and (ii) resulting in new knowledge. The result is that talented experts are discouraged from investing their time in projects that can contribute significant new insights and could dramatically improve the efficiency and impact of our nation’s research enterprise.

To begin addressing these problems, the two biggest STEM-funding agencies—NIH and NSF—should consider the following actions:

Perform a landscape analysis of federal funding for evidence synthesis. Rigorously documenting the funding opportunities available (or lack thereof) for researchers wishing to pursue evidence synthesis will help NIH and NSF determine where to focus potential new opportunities. The landscape analysis should consider currently available funding opportunities for systematic, scoping, and rapid reviews, and could also include surveys and focus groups to assess the appetite in the research community for pursuing additional evidence-synthesis activities if supported.
Establish new grant opportunities designed to support Living Evidence projects. The goal of these grant opportunities would be to deliver definitive and always up-to-date summaries of research evidence and associated data in specified topics. The opportunities could align with particular research focuses (for instance, a living systematic review on tissue-electronic interfacing could facilitate progress on bionic limb development under NSF’s current “Enhancing Opportunities for Persons with Disabilities” Convergence Accelerator track). The opportunities could also be topic-agnostic, but require applicants to justify a proposed project by demonstrating that (i) the research evidence is emerging rapidly, (ii) current evidence is uncertain, and (iii) new research might materially change policy or practice.
Increase support for career research staff in academia. Although contributors to Living Evidence projects can cycle in and out (analogous to turnover in large research collaboratives), such projects benefit from longevity in a portion of the team. With this core team in place, Living Evidence projects are excellent avenues for grad students to build core research skills, including in research study design.
Leverage prestigious existing grant programs and awards to incentivize work on Living Evidence. For instance, NSF could encourage early-career faculty to propose LSRs in applications for CAREER grants.
Recognize evidence syntheses as research outputs. In all assessments of scientific track record (particularly research-funding schemes), systematic reviews and other types of rigorous evidence synthesis should be recognized as research outputs equivalent to “primary” research.

The grant opportunities should also:

Support collaborative, multidisciplinary research teams.
Include an explicit requirement to build significant stakeholder engagement, including with practitioners and relevant government agencies.
Include opportunities to apply for follow-on funding to support maintenance of high-value Living Evidence products.
Allow funds to be spent on non-traditional personnel resources; e.g., an information scientist to systematically survey for new research.

Conclusion

Policymaking can only be meaningfully informed by evidence if underpinning systems for evidence synthesis are robust. The Biden administration’s Year of Evidence for Action provides a pivotal opportunity to pursue concrete actions that strengthen use of science for the betterment of the American people. Federal investment in Living Evidence is one such action.

Living Evidence has emerged as a powerful mechanism for translating scientific discoveries into policy and practice. The Living Evidence approach is being rapidly embraced by international actors, and the United States has an opportunity to position itself as a leader. A federal initiative on Living Evidence will contribute additional energy and momentum to the Year of Evidence for Action, ensure that our nation does not fall behind on evidence-informed policymaking, and arm federal agencies with the most current and best-available scientific evidence as they pursue their statutory missions.

Frequently Asked Questions

Which sectors and scientific fields can use Living Evidence?

The Living Evidence model can be applied to any sector or scientific field. While the Living Evidence model has so far been most widely applied to the health sector, Living Evidence initiatives are also underway in other fields, such as education and climate sciences. Living Evidence is domain-agnostic: it is simply an approach that builds on existing, rigorous evidence-synthesis methods with a novel workflow of frequent and rapid updating.

What is needed to run a successful Living Evidence project?

It does not take long for teams to develop sufficient experience and expertise to apply the Living Evidence model. The key to a successful Living Evidence project is a team that possesses experience in conventional evidence synthesis, strong project-management skills, an orientation towards innovation and experimentation, and investment in building stakeholder engagement.

How much does Living Evidence cost?

As with evidence synthesis in general, cost depends on topic scope and the complexity of the evidence being appraised. Budgeting for Living Evidence projects should distinguish the higher cost of conducting an initial “baseline” systematic review from the lower cost of maintaining the project thereafter. Teams initiating a Living Evidence project for the first time should also budget for the inevitable experimentation and training required.

Do Living Evidence initiatives require recurrent funding?

No. Living Evidence initiatives are analogous to other significant scientific programs that may extend over many years, but receive funding in discrete, time-bound project periods with clear deliverables and the opportunity to apply for continuation funding.

Living Evidence projects do require funding for enough time to complete the initial “baseline” systematic review (typically 3–12 months, depending on scope and complexity), transition to maintenance (“living”) mode, and continue in living mode for sufficient time (usually about 6–12 months) for all involved to become familiar with maintaining and using the living resource. Hence Living Evidence projects work best when fully funded for a minimum of two years.

If there is support for funding beyond this minimum period, there are operational advantages to securing the follow-on funding before the previous funding period concludes. If follow-on funding is not immediately available, Living Evidence resources can simply revert to a conventional static form unless and until follow-on funding becomes available.
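
The two-year minimum stated above follows from adding the upper ends of the two phase ranges. A short Python sketch of that arithmetic, using only the month ranges given in the text (the function and variable names are illustrative):

```python
def funding_window_months(baseline_months, living_months):
    """Shortest and longest total duration, given (min, max) months for each phase."""
    shortest = baseline_months[0] + living_months[0]
    longest = baseline_months[1] + living_months[1]
    return shortest, longest


BASELINE = (3, 12)  # initial "baseline" systematic review, in months
LIVING = (6, 12)    # familiarization period in living mode, in months

shortest, longest = funding_window_months(BASELINE, LIVING)
print(shortest, longest)  # 9 24
```

A two-year (24-month) award therefore covers even the slowest combination of the two phases, while a shorter award would only suffice for narrower or simpler topics.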

Is Living Evidence sustainable?

Living Evidence is rapidly gaining momentum as organizations conclude that the conventional model of evidence synthesis is no longer sustainable because the volume of research that must be reviewed and synthesized for each update has grown beyond the capacity of typical project teams. Organizations that transition their evidence resources into “living” mode typically find the dynamic synthesis model to be more consistent, more feasible, easier to manage, and easier to plan for and resource. If the conventional model of intermittent synthesis is like climbing a series of mountains, the Living Evidence approach is like hiking up to and then walking across a plateau.

How can organizations that are already struggling to develop and update conventional evidence resources take on a Living Evidence project?

New initiatives usually need specific resourcing; Living Evidence is no different. The best approach is to identify a champion within the organization who has an innovation orientation and sufficient authority to effect change. The champion plays a key role in building organizational buy-in, particularly from senior leaders, key influencers within the main evidence program, and major partners, stakeholders, and end users. Ultimately, the champion (or their surrogate) should be empowered and resourced to establish 1–3 Living Evidence pilots running alongside the organization’s existing evidence activities. Risk can be reduced by starting small and building a “minimum viable product” Living Evidence resource (i.e., by finding a topic area that is relatively modest in scope, important to stakeholders, and characterized by evidence uncertainty as well as relatively rapid movement in the relevant research field). Funding should be structured to enable experimentation and iteration, and then to scale up quickly, increasing the scope of evidence moving into living mode as organizational and stakeholder experience and support build.

Living Evidence sounds never-ending. Wouldn’t that lead to burnout in the project team?

One of the advantages of the Living Evidence model is that the project team can gradually evolve over time (members can join and leave as their interests and circumstances change). This is analogous to the evolution of an ongoing scientific network or research collaborative. In contrast, the spikes in workload required for intermittent updates of conventional evidence products often lead to burnout and loss of institutional memory. Furthermore, teams working on Living Evidence are often motivated by participation in an innovative approach to evidence and pride in contributing to a definitive, high-quality, and highly impactful scientific initiative.

How is Living Evidence disseminated?

While dissemination of conventional evidence products involves sharing several dozen key messages in a once-in-several-years communications push, dissemination of Living Evidence amounts to a regular cycle of “what’s new” updates (typically one to two key insights). Living Evidence dissemination feeds become known and trusted by end users, inspiring confidence that end users can “keep up” with the implications of new research. Publication of Living Evidence can take many forms. Typically, the core evidence resource is housed in an organizational website that can be easily and frequently updated, sometimes with an ability for users to access previous versions of the resource. Living Evidence may also be published as articles in academic journals. These could be intermittent overviews of the evidence resource with links back to the main Living Evidence summaries or, more ambitiously, a series of frequently updated, logically linked versions of an article. Multiple academic journals are innovating to better support “living” publications.

If Living Evidence products are continually updated, doesn’t that confuse end users with constantly changing conclusions?

Living Evidence requires continual monitoring for new research, as well as frequent and rapid incorporation of new research into existing evidence products. The volume of research identified and incorporated can vary from dozens of studies each month to a few each year, depending on the topic scope and research activity.

Even across broad topics in fast-moving research fields, though, the overall findings and conclusions of Living Evidence products change infrequently since the threshold for changing a conclusion drawn from a whole body of evidence is high. The largest Living Evidence projects in existence only yield about one to two new major findings or recommendations each update. Furthermore, any good evidence-synthesis product will contextualize conclusions and recommendations with a statement of the confidence that can be placed in them.

What are the implications of Living Evidence for stakeholder engagement?

Living Evidence projects, due to their persistent nature, are great opportunities for building partnerships with stakeholders. Stakeholders tend to be energized and engaged in an innovative project that gives them, their staff, and their constituencies a tractable mechanism by which to engage with the “current state of science”. In addition, the ongoing nature of a Living Evidence project means that project partnerships are always active. Stakeholders are continually engaged in meaningful, collaborative discussions and activities around the current evidence. Finally, this ongoing, always-active nature of Living Evidence projects creates “accumulative” partnerships that gradually broaden and deepen over time.

What are the equity implications of taking a Living Evidence approach?

Living Evidence resources make the latest science available to all. Conventionally, the lack of high-quality summaries of science has meant the latest science is discovered and adopted by those closest to centers of excellence and expertise. Rapid incorporation of the latest science into Living Evidence resources—as well as the wide promotion and dissemination of that science—means that the immediate benefits of science can be shared much more broadly, contributing to equity of access to science and its benefits.

What are the implications of Living Evidence for knowledge translation?

The activities that use research outputs and evidence resources (such as Living Evidence) to change practice and policy are often referred to as “knowledge translation”. These activities are substantial and often multifaceted interventions that identify and address the complex structural, organizational, and cultural barriers that impede knowledge use.

Living Evidence has the potential to accelerate knowledge translation: not because of any changes to the knowledge-translation enterprise, but because Living Evidence identifies earlier the high-certainty evidence that underpins knowledge-translation activities.

Living Evidence may also enhance knowledge translation in two ways. First, Living Evidence is a better evidence product and has been shown to increase trust, engagement, and intention to use among stakeholders. Second, as mentioned above, Living Evidence creates opportunities for deep and effective partnerships. Together, these advantages could position Living Evidence to yield a more effective “enabling environment” for knowledge translation.

Does Living Evidence require use of technologies like machine learning?

Technologies such as natural language processing, machine learning and citizen science (crowdsourcing), as well as efforts to build common data structures (and create Findable, Accessible, Interoperable and Reusable (FAIR) data), are advancing alongside Living Evidence. These technologies are often described as “enablers” of Living Evidence. While such technologies are commonly used and developed in Living Evidence projects, they are not essential. Nevertheless, over the longer term, such technologies will likely be indispensable for creating sustainable systems that make sense of science.

Piloting and Evaluating NSF Science Lottery Grants: A Roadmap to Improving Research Funding Efficiencies and Proposal Diversity

This memo was jointly produced by the Federation of American Scientists & the Institute for Progress

Summary

The United States no longer leads the world in basic science. There is growing recognition of a gap in translational activities — the fruits of American research do not convert to economic benefits. As policymakers consider a slew of proposals that aim to restore American competitiveness with once-in-a-generation investments into the National Science Foundation (NSF), less discussion has been devoted to improving our research productivity — which has been declining for generations. Cross-agency data indicates that this is not the result of a decline in proposal merit, nor of a shift in proposer demographics, nor of an increase (beyond inflation) in the average requested funding per proposal, nor of an increase in the number of proposals per investigator in any one year. As the Senate’s U.S. Innovation and Competition Act (USICA) and House’s America COMPETES Act propose billions of dollars to the NSF for R&D activities, there is an opportunity to bolster research productivity but it will require exploring new, more efficient ways of funding research.

The NSF’s rigorous merit review process has long been regarded as the gold standard for vetting and funding research. However, since its inception in the 1950s, emergent circumstances (such as the significant growth in the overall population of principal investigators, or PIs) have introduced a slew of challenges and inefficiencies to the traditional peer-review grantmaking process. First, there is a tax on research productivity: PIs submit about 2.3 proposals for every award they receive and spend an average of 116 hours grant-writing per NSF proposal (i.e., “grantsmanship”), corresponding to a staggering loss of nearly 45% of researcher time. Second, grantsmanship is oriented towards incremental research with the highest likelihood of surviving highly competitive, consensus-driven, and points-based review (versus riskier, novel, or investigator-driven research). Third, there is rating bias against interdisciplinary research or previously unfunded researchers, as well as reviewer fatigue. The result of such inefficiencies is unsettling: as fewer applicants are funded as a percentage of the increasing pool, some economic analysis suggests that the value of the science that researchers forgo for grantsmanship may exceed the value of the science that the funding program supports.

Our nation’s methods of supporting new ideas should evolve alongside our knowledge base. Science lotteries, when deployed as a complement to the traditional peer review grant process, could improve the system’s overall efficiency-cost ratio by randomly selecting a small percentage of already-written, high-quality, yet unfunded grant proposals for funding, extracting value from work that has already been done. Tested with majority positive feedback from participants in New Zealand, Germany, and Switzerland, science lotteries would introduce an element of randomness that could unlock innovative, disruptive scholarship across underrepresented demographics and geographies.

This paper proposes an experimental NSF pilot of science lotteries and the Appendix provides illustrative draft legislation text. In particular, House and Senate Science Committees should consider the addition of tight language in the U.S. Innovation and Competition Act (Senate) and the America COMPETES Act (House) that authorizes the use of “grant lotteries” across all NSF directorates, including the Directorate of Technology and Innovation. This language should carry the spirit of expanding the geography of innovation and evidence-based reviews that test what works.

Challenge and Opportunity

A recent NSF report pegged the United States as behind China in key scientific metrics, including the overall number of papers published and patents awarded. The numbers are sobering but reflect the growing understanding that America must pick which frontiers of knowledge it seeks to lead. One of these fields should be the science of science: not just deciding which science and technology innovations we hope to pursue, but discovering new, more efficient ways to pursue them.

Since its inception in 1950, NSF has played a critical role in advancing the United States’ academic research enterprise, and strengthened our leadership in scientific research across the world. In particular, the NSF’s rigorous merit review process has been described as the gold standard for vetting and funding research. However, growing evidence indicates that, while praiseworthy, the peer review process has been stretched to its limits. In particular, the growing overall population of researchers has introduced a series of burdens on the system.

One NSF report rated nearly 70% of proposals as equally meritorious, while only one-third received funding. With a surplus of competitive proposals, reviewing committees often face tough close calls. In fact, empirical evidence has found that award decisions change nearly a quarter of the time when re-reviewed by a new set of peer experts. In response, PIs spend upwards of 116 hours on each NSF proposal to conform to grant expectations and must submit an average of 2.3 proposals to receive an award, a process known as “grantsmanship” that survey data suggests occupies nearly 45% of top researchers’ time. Even worse, this grantsmanship is oriented towards writing proposals on incremental research topics (versus riskier, novel, or investigator-driven research), which have a higher likelihood of surviving a consensus-driven, points-based review. On the reviewer side, data supports a clear rating bias against interdisciplinary research or previously unfunded PIs, while experts are increasingly declining invitations to review proposals in the interest of protecting their dwindling time (i.e., reviewer fatigue).

These tradeoffs in the current system appear quite troubling and merit further investigation of alternative and complementary funding models. At least one economic analysis suggests that as fewer applicants are funded as a percentage of the increasing pool, the value of the science that researchers forgo because of grantsmanship often exceeds the value of the science that the funding program supports. In fact, despite dramatic increases in research effort, America has for generations been facing dramatic declines in research productivity. And empirical analysis suggests this is not necessarily the result of a decline in proposal merit, nor of a shift in proposer demographics, nor of an increase (beyond inflation) in the average requested funding per proposal, nor of an increase in the number of proposals per investigator in any one year.

As the Senate’s U.S. Innovation and Competition Act (USICA) and House’s America COMPETES Act propose billions of dollars to the NSF for R&D activities, about 96% of which will be distributed via the peer review, meritocratic grant awards process, now is the time to apply the scientific method to ourselves in the experimentation of alternative and complementary mechanisms for funding scientific research.

Science lotteries, an effort tested in New Zealand, Switzerland, and Germany, represent one innovation particularly suited to reduce the overall taxes on research productivity while uncovering new, worthwhile initiatives for funding that might otherwise slip through the cracks. In particular, modified science lotteries, such as the one proposed here, select a small percentage of well-qualified grant applications at random for funding. By only selecting from a pool of high-value projects, the lottery supports additional, quality research with minimal comparative costs to the researchers or reviewers. In a lottery, the value to the investigator of being admitted to the lottery scales directly with the number of awards available.

These benefits translate to favorable survey data from PIs who have gone through science lottery processes. In New Zealand, for example, the majority of scientists supported a random allocation of 2% of total research expenditures. Sunny Collings, chief executive of New Zealand’s Health Research Council, recounted:

“Applications often have statistically indistinguishable scores, and there is a degree of randomness in peer review selection anyway. So why not formalize that and try to get the best of both approaches?”

By establishing conditions for entrance into the lottery — such as selecting for certain less funded or represented regions — NSF could also over-index for those applicants less prepared for “grantsmanship”.

What we propose, specifically, is a modified “second chance” lottery, whereby proposals that are deemed meritorious by the traditional peer-review process, yet are not selected for funding are entered into a lottery as a second stage in the funding process. This modified format ensures a high level of quality in the projects selected by the lottery to receive funding while still creating a randomized baseline to which the current system can be compared.
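The two-stage mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NSF’s procedure: the `Proposal` fields, the `run_second_chance_lottery` function, the merit threshold, and the seed are all hypothetical names and values chosen for the example. The ranked alternates mirror the “ranked alternative awardees” provision in the appendix bill text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    # All fields are hypothetical; NSF's actual proposal records differ.
    proposal_id: str
    merit_score: float  # overall peer-review score (scale is assumed)
    funded: bool        # True if already funded through merit review


def run_second_chance_lottery(proposals, min_score, n_awards,
                              n_alternates=0, seed=None):
    """Draw awardees uniformly at random from meritorious, unfunded proposals.

    Returns (awardees, alternates); alternates are in ranked order.
    """
    # Stage 1 has already happened: merit review produced scores and
    # funding decisions. Stage 2 keeps only meritorious, unfunded proposals.
    pool = [p for p in proposals
            if not p.funded and p.merit_score >= min_score]

    # A uniform shuffle gives every eligible proposal the same chance.
    rng = random.Random(seed)
    rng.shuffle(pool)

    awardees = pool[:n_awards]
    alternates = pool[n_awards:n_awards + n_alternates]
    return awardees, alternates
```

The fixed seed only makes the sketch reproducible. An actual program would likely want an auditable or cryptographically secure source of randomness, which is an implementation choice this memo does not specify.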

The use of science lotteries in the United States as a complement to the traditional peer-review process is likely to improve the overall system. However, it is possible that selecting among well-qualified grants at random could introduce unexpected outcomes. Unfortunately, direct, empirical comparisons between the NSF’s peer review process and partial lotteries do not exist. Through a pilot, the NSF has the opportunity to evaluate to what extent the mechanism could supplement the NSF’s traditional merit review process.

By formalizing a randomized selection process to use as a baseline for comparison, we may discover surprising things about the makeup of, and process that leads to, successful or high-leverage research with reduced costs to researchers and reviewers. For instance, it may be the case that younger scholars who come from non-traditional backgrounds end up having as much or more success in terms of research outcomes through the lottery program as the typical NSF grantee, but are selected at higher rates when compared to the traditional NSF grantmaking process. If this is the case, then there will be some evidence that something in the selection process is unfairly penalizing non-traditional candidates.

Alternatively, we may discover that the average grant selected through the lottery is mostly indistinguishable from the average grant selected through the traditional meritorious selection, which would provide some evidence that existing administrative burdens to select candidates are too stringent. Or perhaps we will discover that randomly selected winners, in fact, produce fewer noteworthy results than candidates selected through traditional means, which would be evidence that the existing process is providing tangible value in filtering funding proposals. By providing a baseline for comparison, a lottery would offer an evidence-based means of assessing the efficacy of the current peer-review system. Any pilot program should therefore make full use of a menu of selection criteria to toggle outcomes, while also undergoing evaluations from internal and external scientific communities.
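One hedged way to picture the baseline comparison is a simple permutation test on some outcome measure for lottery awardees versus traditional awardees. The sketch below is an illustration only: the function name `permutation_test` and its parameters are hypothetical, the choice of outcome metric is left open, and the memo does not prescribe this or any other statistical method.

```python
import random
from statistics import mean


def permutation_test(lottery_outcomes, traditional_outcomes,
                     n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean outcomes.

    Returns (observed_difference, p_value).
    """
    observed = mean(lottery_outcomes) - mean(traditional_outcomes)
    combined = list(lottery_outcomes) + list(traditional_outcomes)
    n_lottery = len(lottery_outcomes)

    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(combined)
        diff = mean(combined[:n_lottery]) - mean(combined[n_lottery:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # The add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)
```

Under this framing, a large p-value would be consistent with lottery and traditional awards being mostly indistinguishable, while a small p-value with a negative difference would suggest that the existing process adds value in filtering proposals.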

Plan of Action

Recommendation 1: Congress should direct the NSF to pilot experimental lotteries through America COMPETES and the U.S. Innovation and Competition Act, among other vehicles.

In reconciling the differing House America COMPETES Act and Senate USICA texts, Congress should add language that authorizes a pilot program for “lotteries.”

We recommend opting for signaling language and follow-on legislation that adds textual specificity. For example, in the latest text of the COMPETES Act, the responsibilities of the Assistant Director of the Directorate for Science and Engineering Solutions could be amended to include “lotteries”:

Sec. 1308(d)(4)(E). developing and testing diverse merit-review models and mechanisms, including lotteries, for selecting and providing awards for use-inspired and translational research and development at different scales, from individual investigator awards to large multi-institution collaborations;

Specifying language should then require the NSF to employ evidence-based evaluation criteria and grant it the flexibility to determine the timeline of the lottery intake and award mechanisms, with broader goals of timeliness and supporting equitable distribution among regional innovation contenders.

The appendix contains one example structure of a science lottery in bill text (incorporated into the new NSF Directorate established by the Senate-passed United States Innovation and Competition Act), which includes the following key policy choices that Congress should consider:

Limiting eligibility to meritorious proposals;

Ensuring that proposals are timely;

Limiting the grant proposal size to provide the maximum number of awards and create a large sample to fairly evaluate the success of a lottery program;

Rigorous stakeholder feedback mechanisms from the scientific research community;

Fast-tracking award distribution following a lottery; and

Regular reports to Congress in accordance with the NSF’s Open Science Policy to ensure transparency, accountability, and rigorous evaluation.

Recommendation 2: Create a “Translational Science of Science” Program within the new NSF Technology, Innovation and Partnerships Directorate that pilots the use of lotteries with evidence-based testing:

First, the NSF Office of Integrative Activities (OIA) should convene a workshop with relevant stakeholders, including representatives from each directorate, members of the research community (both NSF grant recipients and non-recipients), and subject-matter experts (SMEs) on programmatic implementation from New Zealand, Germany, and Switzerland, in order to temperature- and pressure-test key criteria for implementing piloted science lotteries across directorates.

The initial goal of the workshop should be to gather feedback and gauge interest from the PI community on this topic. To this end, it would be wise to explore varying elements in science lottery construction to appreciate which are most supported from the PI community. The community, for example, should be involved in the development of baseline parameters for proposal quality and a timely, equitable process, despite varying directorate application deadlines. This might include applicants’ consented sign-off before entrance into the lottery, upfront and consistent communications of timelines, and randomization and selection from a pool with scores of at least [excellent/very good/good] during the peer evaluation process described in the NSF’s “Proposal and Award Policies and Procedures Guide”.

Another goal of this workshop would be to scope the process of an OIA inter-directorate competition to submit applications in order to receive an award from the Division of Grants and Agreements to pursue a pilot science lottery. The workshop should therefore develop a clear sense of opportunities with respect to budget sizing for each directorate and could consider making recommendations about the placement of science lottery pilots across directorates based on willingness to devote experimental resources. To maximize the number of lottery recipients, the grant request in an eligible proposal should not exceed 200% of the median grant request to a given directorate.
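The 200% cap is simple arithmetic and can be sketched as follows. The function names and the sample dollar amounts are hypothetical and purely illustrative; how the median would actually be computed (which year, which proposals) is left to the bill text and the NSF.

```python
from statistics import median


def request_cap(directorate_requests, multiple=2.0):
    """Return the largest eligible grant request: multiple x the median."""
    return multiple * median(directorate_requests)


def within_cap(grant_request, directorate_requests, multiple=2.0):
    """True if a single grant request does not exceed the cap."""
    return grant_request <= request_cap(directorate_requests, multiple)


# Hypothetical requests (in dollars) to one directorate in one year.
sample_requests = [100_000, 200_000, 300_000, 400_000, 1_000_000]
cap = request_cap(sample_requests)  # median is 300,000, so the cap is 600,000
```

Tying the cap to the median rather than the mean means a handful of very large requests, like the last value in the sample, does not inflate the threshold.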

Finally, a third goal of the workshop should be to explore standards and timeframe for evidence-based evaluation mechanisms as described above and in the bill-text below, including stakeholder feedback mechanisms, regular reports to Congress, and transparency requirements. Additional mechanisms might include detailed reports on grants and awardees like demographic and geographic information of awardees, comparison of outcomes from traditional awardees and lottery awardees, and a statistical picture of the entire pool of grant proposals entered into the lottery. If the workshop is based on competitive directorate applications, the General Services Administration’s Office of Evaluation Sciences (OES) should be invited for later-stage workshop convenings to provide technical assistance in designing evaluation criteria. Some unifying criteria include meeting the requirements of the NSF’s Open Science Policy, Public Access Policy, and making grant information public as soon as feasible to facilitate rapid evaluation from external stakeholders — a potential metric to judge directorate applications.

Appendix: Bill Text

Note: Please view attached PDF for the formatted bill text

H. ______

To establish a pilot program for National Science Foundation grant lotteries.

In the House of Representatives of the United States

February 2, 2022

______________________________

A BILL

Title: To establish a pilot program for National Science Foundation grant lotteries.

Be it enacted by the Senate and House of Representatives of the United States of America in Congress assembled,

SEC. _____. Pilot Program to Establish National Science Foundation Grant Lotteries

Findings.— Congress makes the following findings:
- Over the past seven decades, the National Science Foundation has played a critical role in advancing the United States academic research enterprise by supporting fundamental research and education across all scientific disciplines;
- The National Science Foundation has made remarkable contributions to scientific advancement, economic growth, human health, and national security, and its peer review and merit review processes have identified and funded scientifically and societally relevant basic research;
- Every year, thousands of meritorious grant proposals do not receive National Science Foundation grants, threatening the United States’ leadership in science and technology and harming our efforts to lead translation and development of scientific advances in key technology areas; and
- While Congress reaffirms its belief that the National Science Foundation’s merit-review system is appropriate for evaluating grant proposals, Congress should establish efforts to explore alternative mechanisms for distributing grants and to evaluate, objectively, whether they can supplement the merit-review system by funding worthwhile projects that otherwise go unawarded.
Definitions.—In this section:
- Directorate.— The term “Directorate” refers to the Directorate for Technology and Innovation established in Sec. 2102 of this Act.
- Assistant Director.— The term “Assistant Director” refers to the Assistant Director for the Directorate described in Sec. 2102(d) of this Act.
- Foundation.—The term “Foundation” refers to the National Science Foundation.
- PAPPG.—The term “PAPPG” refers to the document entitled “OMB Control Number 3145-0058,” also known as the Proposal and Award Policies and Procedures Guide, published by the National Science Foundation, also published as NSF 22-1.
- Program.—The term “program” refers to the program established in subparagraph (d) of this section.
- Grant request.—The term “grant request” refers to the amount of funding requested in an individual grant proposal to the National Science Foundation.
- Lottery awardee.—The term “lottery awardee” refers to a grant proposal selected for award during a lottery established by this section.
- Lottery year.—The term “lottery year” refers to the calendar year of eligibility for proposals, as determined by the Assistant Director, for a lottery established under this program.
Purpose.—It is the purpose of this section to establish a pilot program for merit-based lotteries to award scientific research grants in order to:
- Provide grants to meritorious but unawarded grant proposals;
- Explore “second-look” mechanisms to distribute grants to meritorious but overlooked grant proposals; and
- Evaluate whether alternative mechanisms can supplement the Foundation’s merit-review system.

Establishment.— No later than 180 days after establishment of the Directorate, the Assistant Director shall establish a lottery program to provide second-look grants for meritorious grant proposals that were declined funding by the Foundation.
Requirements.
- Eligibility.—A grant proposal shall be eligible for a lottery if:
  - It did not receive funding from the Foundation;
  - The grant proposal received an overall evaluation score deemed meritorious during the peer review process;
    - Meritorious.—The Assistant Director shall determine a minimum score that a proposal must receive during the peer evaluation process described in Chapter III of the PAPPG to be deemed meritorious.
  - The grant request does not exceed 200 percent of the median grant request to a given directorate in the calendar year with the most recently available data;
  - The grant was proposed to one of the following directorates within the Foundation:
    - Biological Sciences;
    - Computer and Information Science and Engineering;
    - Engineering;
    - Geosciences;
    - Mathematical and Physical Sciences;
    - Social, Behavioral, and Economic Sciences;
    - Education and Human Resources;
    - Environmental Research and Education; or
    - International Science and Engineering;
  - The grant has been deemed timely by a Foundation Program Officer; and
  - Any other criteria deemed necessary by the Assistant Director.
- Exemptions.—If deemed necessary or worthwhile to further the mission and goals of the Directorate or the Foundation, the Assistant Director may:
  - Exempt grant proposals from the requirement in subparagraph (e)(1)(D); and
  - Determine an appropriate method to include such exempted proposals in a lottery.
- Stakeholder Feedback.—Prior to finalizing eligibility requirements, the Assistant Director shall, to the extent practicable, ensure that the requirements take into consideration advice and feedback from the scientific research community. The Federal Advisory Committee Act (5 U.S.C. App.) shall not apply whenever such advice or feedback is sought in accordance with this subsection.

Implementation.

Policies and Procedures.—The Assistant Director shall:
- Develop procedures and policies to ensure that each grant lottery:
  - Is randomized and affords equal opportunity to all participants; and
  - Is not susceptible to fraud;
- Ensure that grant amounts are distributed equitably among the directorates described in subparagraph (e)(1)(D);
- Ensure that relevant external parties have due notice of their obligations with respect to participation in a lottery;
- Ensure that relevant staff and officers of the Foundation are aware of their duties and responsibilities with respect to implementation of the program;
- Ensure that ranked alternative awardees are selected for each lottery in the event that:
  - a lottery awardee withdraws their application;
  - a lottery awardee receives Foundation funding following an appeals process; or
  - a lottery awardee is otherwise deemed ineligible for a Foundation grant.
- Grant Approval.—Once a proposal has been selected for an award:
  - It shall be submitted to the Division of Grants and Agreements for a review of business, financial, and policy implications and award finalization thereafter, as described in PAPPG Chapter III; and
  - It shall not be declined funding by the Division of Grants and Agreements unless granting the award would result in fraud, abuse, or other outcomes deemed egregious and antithetical to the mission of the Foundation.
- Lottery timeline.—For each directorate specified in subparagraph (e)(1)(D), the Assistant Director shall administer a lottery for each calendar year ending in the years [2022, 2023, and 2024].
- Stakeholder Feedback.—Prior to finalizing lottery implementation, and subsequent to conducting each lottery, the Assistant Director shall, to the extent practicable, ensure that lottery implementation takes into consideration advice and feedback from the scientific research community. The Federal Advisory Committee Act (5 U.S.C. App.) shall not apply whenever such advice or feedback is sought in accordance with this subsection.
Deadline of Submission of Grants to the Directorate.—No later than [90 days] following a given lottery year, Foundation Program Officers shall submit all grant proposals that meet the criteria described in subparagraphs (e)(1)(A)—(e)(1)(F) of this section.

Authorization of Appropriation.— There is authorized to be appropriated to the Foundation [$—,000,000] to carry out this Section.

Evaluation and Oversight and Public Access.
- Evaluation.—The Assistant Director shall:
  - Ensure that awards are evaluated using the same methods and procedures as other grant programs of the Foundation, including as set forth by the Foundation’s Evaluation and Assessment Capability and the Foundation’s values of learning, excellence, inclusion, collaboration, integrity, and transparency; and
  - Establish a rapid, empirically-based evaluation program to determine the effectiveness of the lottery program.
- Reports to Congress.—
  - Periodic.— No later than 180 days following completion of a lottery, the Assistant Director shall submit a summary report to Congress including:
    - A list of all grants awarded;
    - Demographic information of the grant awardees;
    - Geographic information of the grant awardees;
    - Information regarding the institutions receiving grants;
    - An assessment comparing lottery grant awardees with those awarded grants through the Foundation’s traditional review process;
    - Information and data describing the entire pool of grant proposals deemed eligible for the lottery; and
    - Any other information deemed necessary or valuable by the Assistant Director;
  - Yearly.—Not later than [two years] following the first lottery, the Assistant Director shall submit comprehensive reports on a yearly basis, for a period of five years after the report submission, evaluating awards using the Foundation’s Evaluation and Assessment Capability or other assessment methods used to evaluate grants awarded through the traditional grant process;
  - Final report.—Within [3 years] of completion of the final lottery, the Assistant Director shall submit a final report to Congress evaluating the success of the program and assessing whether Congress should make the program permanent.
- Public Access.—The Assistant Director shall:
  - Ensure that the program meets the requirements of the Foundation’s:
    - Open Science Policy;
    - Public Access Policy; and
    - General values of learning, transparency, and integrity.
  - Make grant information available to the public as soon as is feasible to facilitate rapid, empirically based evaluation by external stakeholders.

Duties, Conditions, Restrictions, and Prohibitions.—

Right to Review.—Nothing in this section shall affect an applicant’s right to review, appeal, or contest an award decision.

Playbook For Opening Federal Government Data — How Executive & Legislative Leadership Can Help

Summary

Enabling government data to be freely shared and accessed can expedite research and innovation in high-value disciplines, create opportunities for economic development, increase citizen participation in government, and inform decision-making in both public and private sectors. Each day government data remains inaccessible, the public, researchers, and policymakers lose an opportunity to leverage data as a strategic asset to improve social outcomes.

Though federal agencies and policymakers alike support the idea of safely opening their data both to other agencies and to the research community, a substantial fraction of the United States (U.S.) federal government’s safely shareable data is not being shared.

This playbook, based on interviews with current and former government officials, identifies the challenges federal agencies face in 2021 as they work to comply with open data statutes and guidance. More importantly, it offers actionable recommendations for Executive and Congressional leadership to enable federal agencies to prioritize open data.

Paramount among these solutions is the need for the Biden Administration to designate open government data as a 2021 Cross-Agency Priority (CAP) Goal in the President’s Management Agenda (PMA). This goal should revitalize the 2018 CAP Goal “Leveraging Data as a Strategic Asset” to improve upon the 2020 U.S. Federal Data Strategy (FDS) and emphasize that open data is a priority for the U.S. Government. The U.S. Chief Technology Officer (CTO) should direct a Deputy CTO to focus solely on fulfilling this 2021 CAP Goal. This Deputy CTO should be a joint appointment with the Office of Management and Budget (OMB).

Unless open data is elevated to a top priority in the President’s Management Agenda, the U.S. risks falling behind internationally. Many nations have surged ahead in building smart, prosperous AI-driven societies while the U.S. has failed to unlock its latent data. If the Biden Administration wants the U.S. to prevail as an international superpower and a global beacon of democracy, it must revitalize its waning open data efforts.

Why and How Faculty Should Participate in U.S. Policy Making

If the U.S. Congress is to produce sound policies that benefit the public good, science and technology faculty members must become active participants in the American policy-making process. One key element of that process is congressional hearings: public forums where members of Congress question witnesses, learn about pressing issues, develop policy initiatives and conduct oversight of both the executive branch and corporate practices.

Faculty in science and technology should contribute to congressional hearings because: 1) legislators should use data and scientifically derived knowledge to guide policy development, 2) deep expertise is needed to support effective oversight of complex issues like the spread of misinformation on internet platforms or pandemic response, and 3) members of Congress are decision makers on major issues that impact the science and technology community, such as research funding priorities or the role of foreign nationals in the research enterprise. A compelling moment during a hearing can have a profound impact on public policy, and faculty members can help make those moments happen.

Read the full article at Inside Higher Ed.

Re-envisioning Reporting of Scientific Methods

Summary

The information contained in the methods section of the overwhelming majority of research publications is insufficient to definitively evaluate research practices, let alone reproduce the work. Publication—and subsequent reuse—of detailed scientific methodologies can save researchers time and money, and can accelerate the pace of research overall. However, there is no existing mechanism for collective action to improve reporting of scientific methods. The Biden-Harris Administration should direct research-funding agencies to support development of new standards for reporting scientific methods. These standards would (1) address ongoing challenges in scientific reproducibility, and (2) benefit our nation’s scientific enterprise by improving research quality, reliability, and efficiency.

A Convergence Directorate at the National Science Foundation

Summary

Convergence is a compelling novel paradigm and a potent force for advancing scientific discovery via transdisciplinary collaboration. It is also a useful framework for multi-sector partnerships. The Biden-Harris Administration should form a Convergence Directorate at the National Science Foundation (NSF) to accelerate research and innovation and help ensure U.S. leadership in the industries of the future.

In forming the Directorate, NSF should:

- Commit resources that are commensurate with the importance of the Directorate’s mission.
- Provide the sustained focus needed to realize the tremendous potential of convergence.
- Ensure that the Directorate is organized to reflect the principles of convergence in its structure and operations.