Collaboration between federal agencies and academic researchers is an important tool for public policy. By facilitating the exchange of knowledge, ideas, and talent, these partnerships can help address pressing societal challenges. But because it is rarely in either party’s job description to conduct outreach and build relationships with the other, many important dynamics are often hidden from view. This primer provides an initial set of questions and topics for agencies to consider when exploring academic partnership.
Why should agencies consider working with academics?
- Accessing the frontier of knowledge: Academics are at the forefront of their fields, and their insights can provide fresh perspectives on agency work.
- Leveraging innovative methods: From data collection to analysis, academics may have access to the new technologies and approaches that can enhance governmental efforts.
- Enhancing credibility: By incorporating research and external expertise, policy decisions gain legitimacy and trust, and align with evidence-based policy guidelines.
- Generating new insights: Collaboration between agencies and outside researchers can lead to discoveries that advance both knowledge and practice.
- Developing human capital: Collaboration can enhance the skills of both public servants and academics, creating a more robust workforce and potentially leading to longer-term talent exchange.
What considerations may arise when working with academics?
- Designing collaborative relationships that are targeted to the incentives of both the agency and the academic partners;
- Navigating different rules and regulations that may impact academic-government collaboration, e.g. rules on external advisory groups, university guidelines, and data/information confidentiality;
- Understanding the different structures and mechanisms that enable academic-government collaboration, such as sabbaticals, fellowships, consultancies, grants, or contracts;
- Identifying and approaching the right academics for different projects and needs.
Academic faculty progress through different stages of professorship — typically assistant, associate, and full — that affect their research and teaching expectations and opportunities. Assistant professors are tenure-track faculty who need to secure funding, publish papers, and meet the standards for tenure. Associate professors have job security and academic freedom, but also more mentoring and leadership responsibilities; associate professors are typically tenured, though this is not always the case. Full professors are senior faculty who have a high reputation and recognition in their field, but also more demands for service and supervision. The nature of agency-academic collaboration may depend on the seniority of the academic. For example, junior faculty may be more available to work with agencies, but primarily in contexts that will lead to traditional academic outputs, while senior faculty may be more selective, but their academic freedom allows for less formal and more impact-oriented work.
Soft money positions are those that depend largely or entirely on external funding sources, typically research grants, to support the salary and expenses of the faculty. Hard money positions are those that are supported by the academic institution’s central funds, typically tied to more explicit (and more expansive) expectations for teaching and service than soft-money positions. Faculty in soft money positions may face more pressure to secure funding for research, while faculty in hard money positions may have more autonomy in their research agenda but more competing academic activities. Federal agencies should be aware of the funding situation of the academic faculty they collaborate with, as it may affect their incentives and expectations for agency engagement.
A sabbatical is a period of leave from regular academic duties, usually for one or two semesters, that allows faculty to pursue an intensive and unstructured scope of work — this can include research in their own field or others, as well as external engagements or tours of service with non-academic institutions. Faculty accrue sabbatical credits based on their length and type of service at the university, and may apply for a sabbatical once they have enough credits. The amount of salary received during a sabbatical depends on the number of credits and the duration of the leave. Federal agencies may benefit from collaborating with academic faculty who are on sabbatical, as they may have more time and interest to devote to impact-focused work.
Consulting limits & outside activity limits are policies that regulate the amount of time that academic faculty can spend on professional activities outside their university employment. These policies are intended to prevent conflicts of commitment or interest that may interfere with the faculty’s primary obligations to the university, such as teaching, research, and service, and the specific limits vary by university. Federal agencies may need to consider these limits when engaging academic faculty in ongoing or high-commitment collaborations.
Some academic faculty are paid on a 9-month basis, meaning that they receive their annual salary over nine months and have the option to supplement their income with external funding or other activities during the summer months. Other faculty are paid on a 12-month basis, meaning that they receive their annual salary over twelve months and have less flexibility to pursue outside opportunities. Federal agencies may need to consider the salary structure of the academic faculty they work with, as it may affect their availability to engage on projects and the optimal timing with which they can do so.
Advisory relationships consist of an academic providing occasional or periodic guidance to a federal agency on a specific topic or issue, without being formally contracted or compensated. This type of collaboration can be useful for agencies that need access to cutting-edge expertise or perspectives, but do not have a formal deliverable in mind.
- Career stage: Informal advising can be done by faculty at any level of seniority, as long as they have relevant knowledge and experience. However, junior faculty may be more cautious about engaging in informal advising, as it may not count towards their tenure or promotion criteria. Senior faculty, who have established expertise and secured tenure, may be more willing to engage in impact-focused advisory relationships.
- Incentives: Advisory relationships can offer some benefits for faculty regardless of career stage, such as expanding their network, increasing their visibility, and influencing policy or practice. Informal advising can also stimulate new research questions, and create opportunities for future access to data or resources. Some agencies may also acknowledge the contributions of academic advisors in their reports or publications, which may enhance researchers’ academic reputation.
- Conflicts of interest: Informal advising may pose potential conflicts of interest or commitment for faculty, especially if they have other sources of funding or collaboration related to the same topic or issue. Faculty may need to consult with their department chair or dean before engaging in formal conversations, and should also avoid any activities that may compromise their objectivity, integrity, or judgment in conducting or reporting their university research.
- Timing: Faculty on 9-month salaries might be more willing/able to engage during summer months, when they have minimal teaching requirements and are focused on research and impact outputs.
Regulatory & structural considerations
- Contracting: An advisory relationship may not require a formal agreement or contract between the agency and the academic. For some topics or agencies, however, it may require a non-disclosure agreement or consulting agreement if the agency wants to ensure the exclusivity or confidentiality of the conversation.
- Advisory committee rules: Depending on the scope and scale of the academic engagement, agencies should be sure to abide by Federal Advisory Committee Act regulations. With informal one-on-one conversations that are focused on education and knowledge exchange, this is unlikely to be an issue.
- University approval: An NDA or consulting agreement may require approval from the university’s office of sponsored programs or office of technology transfer before engaging in informal advising. These offices may review and approve the agreement between the agency and the academic institution, ensuring compliance with university policies and regulations.
- Compensation: Informal advising typically does not involve compensation for the academic, but it may involve reimbursement for travel or other expenses related to the advisory role. This work is unlikely to count towards the consulting limit for faculty, but it may count towards the outside professional activity limit, depending on the nature and frequency of the advising.
Federal agencies and academic institutions are subject to various laws and regulations that affect their research collaboration, and the ownership and use of the research outputs. Key legislation includes the Federal Advisory Committee Act (FACA), which governs advisory committees and ensures transparency and accountability; the Federal Acquisition Regulation (FAR), which controls the acquisition of supplies and services with appropriated funds; and the Federal Grant and Cooperative Agreement Act (FGCAA), which provides criteria for distinguishing between grants, cooperative agreements, and contracts. Agencies should ensure that collaborations are structured in accordance with these and other laws.
Federal agencies may use various contracting mechanisms to engage researchers from non-federal entities in collaborative roles. These mechanisms include the IPA Mobility Program, which allows the temporary assignment of personnel between federal and non-federal organizations; the Experts & Consultants authority, which allows the appointment of qualified experts and consultants to positions that require only intermittent and/or temporary employment; and Cooperative Research and Development Agreements (CRADAs), which allow agencies to enter into collaborative agreements with non-federal partners to conduct research and development projects of mutual interest.
Offices of Sponsored Programs are units within universities that provide administrative support and oversight for externally funded research projects. OSPs are responsible for reviewing and approving proposals, negotiating and accepting awards, ensuring compliance with sponsor and university policies and regulations, and managing post-award activities such as reporting, invoicing, and auditing. Federal agencies typically interact with OSPs as the authorized representative of the university in matters related to sponsored research.
When engaging with academics, federal agencies may use NDAs to safeguard sensitive information. Agencies each have their own rules and procedures for using and enforcing NDAs involving their grantees and contractors. These rules and procedures vary, but generally require researchers to sign an NDA outlining rights and obligations relating to classified information, data, and research findings shared during collaborations.
A study group is a type of collaboration where an academic participates in a group of experts convened by a federal agency to conduct analysis or education on a specific topic or issue. The study group may produce a report or hold meetings to present their findings to the agency or other stakeholders. This type of collaboration can be useful for agencies that need to gather evidence or insights from multiple sources and disciplines with expertise relevant to their work.
- Career stage: Faculty at any level of seniority can participate in a study group, but junior faculty may be more selective about joining, as they have limited time and resources to devote to activities that may not count towards their tenure or promotion criteria. Senior faculty may be more willing to join a study group, as they have more established expertise and reputation, and may seek to have more impact on policy or practice.
- Soft vs. hard money: Faculty in soft money positions, where their salary and research expenses depend largely on external funding sources, may be more interested in joining a study group if it provides funding or other resources that support their research. Faculty in hard money positions, where their salary and research expenses are supported by institutional funds, may be less motivated by funding, but more by the recognition and impact that comes from participating.
- Incentives: Study groups can offer some benefits for faculty, such as expanding their network, increasing their visibility, and influencing policy or practice. Study groups can also stimulate new research ideas or questions for faculty, and create opportunities for future access to data or resources. Some study groups may also result in publication of output or other forms of recognition (e.g., speaking engagements) that may enhance the academic reputation of the faculty.
- Conflicts of interest: Study groups may pose potential conflicts of interest or commitment for academics, especially if they have other sources of funding related to the same topic. Faculty may also be cautious about entering into more formal agreements if it may impact their ability to apply for & receive federal research funding in the future. Agencies should be aware of any such impacts of academic participation, and faculty should be encouraged to consult with their department chair or dean before joining a study group.
Regulatory & structural considerations
- Contracting and compensation: The optimal contracting mechanism for a study group will depend on the agency, the topic, and the planned output of the group. Some possible contracting mechanisms are extramural grants, service contracts, cooperative agreements, or memoranda of understanding. The mechanism will determine the amount and type of compensation that participants (or the organizing body) receive, and could include salary support, travel reimbursement, honoraria, or overhead costs.
- Advisory committee rules: When setting up study groups, agencies should work carefully to ensure that the structure abides by Federal Advisory Committee Act regulations. To ensure that study groups are distinct from Advisory Committees, these groups should be limited in size, and should be tasked with providing knowledge, research, and education — rather than specific programmatic guidance — to agency partners.
- University approval: Depending on the contracting mechanism and the compensation involved, academic participants may need to obtain approval from their university’s office of sponsored programs or office of technology transfer before joining a study group. These offices may review the terms and conditions of the agreement between the agency and the academic institution, such as the scope of work, the budget, and the reporting requirements.
In 2022, the National Science Foundation (NSF) awarded the National Bureau of Economic Research (NBER) a grant to create the EAGER: Place-Based Innovation Policy Study Group. This group, led by two economists with expertise in entrepreneurship, innovation, and regional development — Jorge Guzman from Columbia University and Scott Stern from MIT — aimed to provide “timely insight for the NSF Regional Innovation Engines program.” During Fall 2022, the group met regularly with NSF staff to i) provide an assessment of the “state of knowledge” of place-based innovation ecosystems, ii) identify the insights of this research to inform NSF staff on design of their policies, and iii) surface potential means by which to measure and evaluate place-based innovation ecosystems on a rigorous and ongoing basis. Several of the academic leads then completed a paper synthesizing the opportunities and design considerations of the regional innovation engine model, based on the collaborative exploration and insights developed throughout the year. In this case, the study group was structured as a grant, with funding provided to the organizing institution (NBER) for personnel and convening costs. Yet other approaches are possible; for example, NSF recently launched a broader study group with the Institute for Progress, which is structured as a no-cost Other Transaction Authority contract.
Active collaboration covers scenarios in which an academic engages in joint research with a federal agency, either as a co-investigator, a subrecipient, a contractor, or a consultant. This type of collaboration can be useful for agencies that need to leverage the expertise, facilities, data, or networks of academics to conduct research that advances their mission, goals, or priorities.
- Career stage: Collaborative research is likely to be attractive to junior faculty, who are seeking opportunities to access data that might not be otherwise available, and to foster new relationships with partners. This is particularly true if there is a commitment that findings or evaluations will be publishable, and if the collaboration does not interfere with teaching and service obligations. Collaborative projects are also likely to be of interest to senior faculty — if work aligns with their established research agenda — and publication of findings may be (slightly) less of a requirement.
- Soft vs. hard money: Researchers on hard money contracts, where their salary and research expenses are supported by institutional funds, may be more motivated by the opportunity to use and publish internal data from the agency. Researchers on soft money contracts, where their salary and research expenses depend largely on external funding sources, may be more motivated by the availability of grants and financial support from the agency.
- Timing: Depending on the scope of the collaboration, and the availability of funding for the researcher, efforts could be targeted for academics’ summer months or their sabbaticals. Alternatively, collaborative research could be integrated into the regular academic year, as part of the researcher’s ongoing research activities.
- Incentives: As mentioned above, collaborative research can offer some benefits for faculty, such as access to data and information, publication opportunities, funding sources, and partnership networks. Collaborative research can also provide faculty with more direct and immediate impact on policy or practice, as well as recognition from the agency and stakeholders (and, perhaps to a lesser extent, the academic community).
Regulatory & structural considerations
- Contracting: The contracting requirements for collaborative research will vary greatly depending on the structure and scope of the collaboration, the partnering agency, and the use of internal government data or resources. Readers are encouraged to explore agency-specific guidance when considering the ideal mechanism for a given project. Some possible contracting mechanisms are extramural grants, service contracts, or cooperative research and development agreements. Each mechanism has different terms and conditions regarding the scope of work, the budget, the intellectual property rights, the reporting requirements, and the oversight responsibilities.
- Regulatory compliance: Collaborative research involving both governmental and non-governmental partners will require compliance with various laws, regulations, and authorities. These include but are not limited to:
- Federal Acquisition Regulation (FAR), which establishes the policies and procedures for acquiring supplies and services with appropriated funds;
- Federal Grant and Cooperative Agreement Act (FGCAA), which provides criteria for determining whether to use a grant or a cooperative agreement to provide assistance to non-federal entities;
- Other Transaction Authority (OTA), a contracting mechanism that provides (most) agencies with the ability to enter into flexible research & development agreements that are not subject to the regulations on standard contracts, grants, or cooperative agreements;
- OMB’s Uniform Guidance, which sets forth the administrative requirements, cost principles, and audit requirements for federal awards;
- Bayh-Dole Act, which allows academic institutions to retain title to inventions made with federal funding, subject to certain conditions and obligations.
- Collaborative research may also require compliance with ethical standards and guidelines for human subjects research, such as the Belmont Report and the Common Rule.
External collaboration between academic researchers and government agencies has repeatedly proven fruitful for both parties. For example, in May 2020, the Rhode Island Department of Health partnered with researchers at Brown University’s Policy Lab to conduct a randomized controlled trial evaluating the effectiveness of different letter designs in encouraging COVID-19 testing. This study identified design principles that improved uptake of testing by 25–60% without increasing cost, and led to follow-on collaborations between the institutions. The North Carolina Office of Strategic Partnerships provides a prime example of how government agencies can take steps to facilitate these collaborations. The office recently launched the North Carolina Project Portal, which serves as a platform for the agency to share its research needs, and for external partners — including academics — to express interest in collaborating. Researchers are encouraged to contact the relevant project leads, who then assess interested parties on their expertise and capacity, extend an offer for a formal research partnership, and initiate the project.
Short-term placements allow for an academic researcher to work at a federal agency for a limited period of time (typically one year or less), either as a fellow, a scholar, a detailee, or a special government employee. This type of collaboration can be useful for agencies that need to fill temporary gaps in expertise, capacity, or leadership, or to foster cross-sector exchange and learning.
- Career stage: Short-term placements may be more appealing to senior faculty, who have more established and impact-focused research agendas, and who may seek to influence policy or practice at the highest levels. Junior faculty may be less interested in placements, particularly if they are still progressing towards tenure — unless the position offers opportunities for publication, funding, or recognition that are relevant to their tenure or promotion criteria.
- Soft vs. hard money: Faculty in soft money positions may face more challenges in arranging short-term placements if they have ongoing grants or labs to maintain; but placements where external resources are available (e.g., established fellowships) could be an attractive option when ongoing commitments are manageable. The impact of hard money will depend largely on the type of placement and the expectations for whether institutional support or external resources will cover a faculty member’s time away from the university.
- Timing: Sabbaticals are an ideal time for short-term placements, as they allow faculty to pursue intensive research or external engagement, without interfering with their regular academic duties. However, convincing faculty to use their sabbaticals for short-term placement may require a longer discovery and recruitment period, as well as a strong value proposition that highlights the benefits and incentives of the collaboration. Because most faculty are subject to the academic calendar, June and January tend to be ideal start dates for this type of engagement.
- Incentives: Short-term placements can offer benefits for academics, such as having an impact on policy or practice, gaining access to new data or research areas, and building relationships with agency officials and other stakeholders. However, short-term placements can also involve some costs and/or risks for participating faculty, including logistical complications, relocation, confidentiality constraints, and publication restrictions.
Regulatory & structural considerations
- Contracting: Short-term placements require a formal agreement or contract between the agency and the academic. There are several contracting & hiring mechanisms that can facilitate short-term placement, such as the Intergovernmental Personnel Act (IPA) Mobility Program, the Experts & Consultants authority, Schedule A(r), or the Special Government Employee (SGE) designation. Each mechanism has different eligibility criteria, terms and conditions, and administrative processes. Alternatively, many fellowship programs already exist within agencies or through outside organizations, which can streamline the process and handle logistics on behalf of both the academic institution and the agency.
- Compensation: The payment of salary support, travel, overhead, etc. will depend on the contracting mechanism and the agreement between the agency and the academic institution. Costs are generally covered by the organization that is expected to benefit most from the placement, which is often the agency itself; though some authorities for facilitating cross-sector exchange (e.g., the IPA program and Experts and Consultants authority) allow research institutions to cost-share or cover the expense of an expert’s compensation when appropriate. External fellowship programs also occasionally provide external resources to cover costs.
- Role and expectations: Placements, more than informal collaborations, require clear communication and understanding of the role and expectations. The academic should be prepared to adapt to the agency’s norms and processes, which will differ from those in academia, and to perform work that may not reflect their typical contribution. The academic should also be aware of their rights and obligations as a federal employee or contractor.
- Confidentiality: Placements may involve access to confidential or sensitive information from the agency, such as classified data or personal information. Academics will likely be required to sign a non-disclosure agreement (NDA) that defines the scope and terms of confidentiality, and will often be subject to security clearance or background check procedures before entering their role.
Various programs exist throughout government to facilitate short-term rotations of outside experts into federal agencies and offices. One of the most well-known examples is the American Association for the Advancement of Science (AAAS) Science & Technology Policy Fellowship (STPF) program, which places scientists and engineers from various disciplines and career stages in federal agencies for one year to apply their scientific knowledge and skills to inform policy making and implementation. The Schedule A(r) hiring authority tends to be well-suited for these kinds of fellowships; it is used, for example, by the Bureau of Economic Analysis to bring on early career fellows through the American Economic Association’s Summer Economics Fellows Program. In some circumstances, outside experts are brought into government “on loan” from their home institution to do a tour of service in a federal office or agency; in these cases, the IPA program can be a useful mechanism. IPAs are used by the National Science Foundation (NSF) in its Rotator Program, which brings outside scientists into the agency to serve as temporary Program Directors and bring cutting-edge knowledge to the agency’s grantmaking and priority-setting. IPA is also used for more ad-hoc talent needs; for example, the Office of Evaluation Sciences (OES) at GSA often uses it to bring in fellows and academic affiliates.
Long-term rotations allow an academic to work at a federal agency for an extended period of time (more than one year), either as a fellow, a scholar, a detailee, or a special government employee. This type of collaboration can be useful for agencies that need to recruit and retain expertise, capacity, or leadership in areas that are critical to their mission, goals, or priorities.
- Career stage: Long-term rotations may be more feasible for senior faculty, who have more experience in their discipline and are likely to have more flexibility and support from their institutions to take a leave of absence. Junior faculty may face more barriers and risks in pursuing long-term rotations, such as losing momentum in their research productivity, missing opportunities for tenure or promotion, or losing connection with their academic peers and mentors.
- Soft vs. hard money: Faculty in soft money positions may have more ability to seek longer-term rotations, as the provision of external support is more in line with their institutions’ expectations. Faculty in hard money positions may face difficulties seeking long-term rotations, as institutional provision of resources comes with expectations for teaching and service that administrations may be wary of pausing for extended periods of time.
- Timing: Long-term rotations require careful planning and coordination with the academic institution and the federal agency, as it may involve significant changes in the academic’s schedule, workload, and responsibilities. These rotations may be easier to arrange during sabbaticals or other periods of leave from the academic institution, but will often still require approval from the institution’s administration. Because most faculty are subject to the academic calendar, June and January tend to be ideal start dates for sabbatical or secondment engagements.
- Incentives: Long-term rotations offer an opportunity for faculty to gain valuable experience and insight into the impact frontier — both in terms of policy and practice — of their field or discipline. These experiences can yield new skills or competencies that enhance their academic performance or career advancement, can help academics build strong relationships and networks with agency officials and other stakeholders, and can provide a lasting impact on public good. However, long-term roles involve challenges for faculty, such as adjusting to a different organizational structure, balancing expectations from both the agency and the academy, and transitioning back into academic work and productivity following the rotation.
Regulatory & structural considerations
- Regulatory and structural considerations — including contracting, compensation, and expectations — are similar to those of short-term placements, and tend to involve the same mechanisms and processes.
- The desired length of a long-term rotation will affect how agencies select and apply the appropriate mechanism. For example, IPA assignments are initially made for up to two years, and can then be extended for another two years when relevant — yielding a maximum continuous term length of four years.
- Longer time frames typically require additional structural considerations. Specifically, extensions of mechanisms like the IPA may be required, or more formal governmental employment may be prioritized at the outset. Given that these types of placements are often bespoke, these considerations should be explored in depth for the agency’s specific needs and regulatory context.
One example of a long-term rotation that draws experts from academia into federal agency work is the Advanced Research Projects Agency (ARPA) Program Manager (PM) role. ARPA PMs — across DARPA, IARPA, ARPA-E, and now ARPA-H — are responsible for leading high-risk, high-reward research programs, and have considerable autonomy and authority in defining their research vision, selecting research performers, managing their research budget, and overseeing their research outcomes. PMs are typically recruited from academia, industry, or government for a term of three to five years, and are expected to return to their academic institutions or pursue other career opportunities after their term at the agency. PMs coming from academia or nonprofit organizations are often brought on through the IPA mobility program, and some entities also have unique term-limited hiring authorities for this purpose. PMs can also be hired as full government employees; this mechanism is primarily used for candidates coming from the private sector.
The typical science grantmaker seeks to maximize their (positive) impact with a limited amount of money. The decision-making process for how to allocate that funding requires them to consider the different dimensions of risk and uncertainty involved in science proposals, as described in foundational work by economists Chiara Franzoni and Paula Stephan. The von Neumann-Morgenstern utility theorem implies that there exists for the grantmaker — or the peer reviewer(s) assessing proposals on their behalf — a utility function whose expected value they will seek to maximize.
Common frameworks for evaluating proposals leave this utility function implicit, often evaluating aspects of risk, uncertainty, and potential value independently and qualitatively. Empirical work has suggested that such an approach may lead to biases, resulting in funding decisions that deviate from grantmakers’ ultimate goals. An expected utility approach to reviewing science proposals aims to make that implicit decision-making process explicit, and thus reduce biases, by asking reviewers to directly predict the probability and value of different potential outcomes occurring. Implementing this approach through forecasting brings the added benefits of providing (1) a resolution and scoring process that could help incentivize reviewers to make better, more accurate predictions over time and (2) empirical estimates of reviewers’ accuracy and tendency to over- or underestimate the value and probability of success of proposals.
At the Federation of American Scientists, we are currently piloting this approach on a series of proposals in the life sciences that we have collected for Focused Research Organizations (FROs), a new type of non-profit research organization designed to tackle challenges that neither academia nor industry is incentivized to work on. The pilot study was developed in collaboration with Metaculus, a forecasting platform and aggregator, and is hosted on their website. In this paper, we provide the detailed methodology for the approach that we have developed, which builds upon Franzoni and Stephan’s work, so that interested grantmakers may adapt it for their own purposes. The motivation for developing this approach, and how we believe it may help address biases against risk in traditional peer review processes, is discussed in our article “Risk and Reward in Peer Review”.
To illustrate how an expected utility forecasting approach could be applied to scientific proposal evaluation, let us first imagine a research project consisting of multiple possible outcomes or milestones. In the most straightforward case, the outcomes that could arise are mutually exclusive (i.e., only a single one will be observed). Indexing each outcome with the letter 𝑖, we can define the expected value of each as the product of its value (or utility; 𝓊𝑖) and the probability of it occurring, 𝑃(𝑚𝑖). Because the outcomes in this example are mutually exclusive, the total expected utility (TEU) of the proposed project is the sum of the expected value of each outcome:
𝑇𝐸𝑈 = 𝛴𝑖𝓊𝑖𝑃(𝑚𝑖).
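This calculation can be sketched in a few lines of Python; the function name and the numbers below are our own illustration rather than part of the pilot:

```python
# Total expected utility for mutually exclusive outcomes:
# TEU = sum_i u_i * P(m_i). Utilities and probabilities are hypothetical.

def teu_exclusive(utilities, probabilities):
    """TEU when outcomes are mutually exclusive (probabilities sum to at most 1)."""
    if sum(probabilities) > 1 + 1e-9:
        raise ValueError("mutually exclusive probabilities cannot sum above 1")
    return sum(u * p for u, p in zip(utilities, probabilities))

# Three mutually exclusive outcomes worth 10, 4, and 0,
# with probabilities 0.2, 0.5, and 0.3:
print(teu_exclusive([10, 4, 0], [0.2, 0.5, 0.3]))  # 10*0.2 + 4*0.5 + 0*0.3 = 4.0
```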
However, in most cases, it is easier and more accurate to define the range of outcomes of a research project as a set of primary and secondary outcomes or research milestones that are not mutually exclusive, and can instead occur in various combinations.
For instance, science proposals usually highlight the primary outcome(s) that they aim to achieve, but may also involve important secondary outcome(s) that can be achieved in addition to or instead of the primary goals. Secondary outcomes can be a research method, tool, or dataset produced for the purpose of achieving the primary outcome; a discovery made in the process of pursuing the primary outcome; or an outcome that researchers pivot to pursuing as they obtain new information from the research process. As such, primary and secondary outcomes are not necessarily mutually exclusive. In the simplest scenario with just two outcomes (either two primary or one primary and one secondary), the total expected utility becomes
𝑇𝐸𝑈 = 𝓊1𝑃(𝑚1⋂ not 𝑚2) + 𝓊2𝑃(𝑚2⋂ not 𝑚1) + (𝓊1 + 𝓊2)𝑃(𝑚1⋂𝑚2),
𝑇𝐸𝑈 = 𝓊1(𝑃(𝑚1) – 𝑃(𝑚1⋂𝑚2)) + 𝓊2(𝑃(𝑚2) – 𝑃(𝑚1⋂𝑚2)) + (𝓊1 + 𝓊2)𝑃(𝑚1⋂𝑚2),
𝑇𝐸𝑈 = 𝓊1𝑃(𝑚1) + 𝓊2𝑃(𝑚2).
Note that because the utility of achieving both outcomes here is simply the sum 𝓊1 + 𝓊2, the joint probability terms cancel in the final expression; when the value of achieving outcomes in combination differs from the sum of their individual values, the joint terms remain. As the number of outcomes increases, the number of joint probability terms in the intermediate expressions increases as well. Assuming the outcomes are independent, though, they can be reduced to the product of the probabilities of the individual outcomes. For example,
𝑃(𝑚1⋂𝑚2) = 𝑃(𝑚1) * 𝑃(𝑚2)
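As a concrete sketch with hypothetical numbers (the function and variable names are ours), the two-outcome case can be computed directly from the marginal and joint probabilities:

```python
# TEU for two non-mutually-exclusive outcomes, following
# TEU = u1*P(m1 and not m2) + u2*P(m2 and not m1) + (u1 + u2)*P(m1 and m2),
# rewritten in terms of the marginal and joint probabilities.

def teu_two_outcomes(u1, u2, p1, p2, p_joint):
    """p1 and p2 are marginal probabilities; p_joint is P(m1 and m2)."""
    return (u1 * (p1 - p_joint)
            + u2 * (p2 - p_joint)
            + (u1 + u2) * p_joint)

# A primary outcome worth 5 (p = 0.6) and a secondary outcome worth 2
# (p = 0.3); if the outcomes are independent, P(m1 and m2) = 0.6 * 0.3:
print(teu_two_outcomes(5, 2, 0.6, 0.3, 0.6 * 0.3))  # ~3.6
```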
On the other hand, milestones are typically designed to build upon one another, such that achieving later milestones necessitates the achievement of prior milestones. In these cases, the value of later milestones typically includes the value of prior milestones: for example, the value of demonstrating a complete pilot of a technology is inclusive of the value of demonstrating individual components of that technology. The total expected utility can thus be defined as the sum of the product of the marginal utility of each additional milestone and its probability of success:
𝑇𝐸𝑈 = 𝛴𝑖(𝓊𝑖 – 𝓊𝑖-1)𝑃(𝑚𝑖),
where 𝓊0 = 0.
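A minimal sketch of the milestone case (again with illustrative numbers), taking the marginal utility of milestone 𝑖 to be 𝓊𝑖 – 𝓊𝑖-1 as described above:

```python
# TEU for sequential milestones, where later milestones require earlier ones
# and their utilities are cumulative (inclusive of prior milestones' value).

def teu_milestones(cumulative_utilities, probabilities):
    """probabilities[i] = P(m_i); should be non-increasing, since reaching
    milestone i implies reaching every milestone before it."""
    teu, prev_u = 0.0, 0.0
    for u, p in zip(cumulative_utilities, probabilities):
        teu += (u - prev_u) * p  # marginal utility, weighted by P(m_i)
        prev_u = u
    return teu

# A component demo worth 3 with P = 0.8, then a full pilot worth 8
# (inclusive of the demo's value) with P = 0.4:
print(teu_milestones([3, 8], [0.8, 0.4]))  # 3*0.8 + (8 - 3)*0.4 = 4.4
```

As a sanity check, decomposing into the mutually exclusive events "demo only" (probability 0.8 – 0.4, value 3) and "full pilot" (probability 0.4, value 8) gives the same 4.4.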
Depending on the science proposal, either of these approaches — or a combination — may make the most sense for determining the set of outcomes to evaluate.
In our FRO Forecasting pilot, we worked with proposal authors to define two outcomes for each of their proposals. Depending on what made the most sense for each proposal, the two outcomes reflected either relatively independent primary and secondary goals, or sequential milestone outcomes that directly built upon one another (though for simplicity, we called all of the outcomes milestones).
Defining Probability of Success
Once the set of potential outcomes has been defined, the next step is to determine the probability of success, between 0% and 100%, for each outcome if the proposal is funded. A prediction of 50% would indicate the highest level of uncertainty about the outcome, whereas the closer the predicted probability of success is to 0% or 100%, the more certainty there is that the outcome will fail or succeed, respectively.
Furthermore, Franzoni and Stephan decompose probability of success into two components: the probability that the outcome can actually occur in nature or reality and the probability that the proposed methodology will succeed in obtaining the outcome (conditional on it being possible in nature). The total probability is then the product of these two components:
𝑃(𝑚𝑖) = 𝑃nature(𝑚𝑖) * 𝑃proposal(𝑚𝑖)
Depending on the nature of the proposal (e.g., more technology-driven, or more theoretical/discovery-driven), each component may be more or less relevant. For example, our forecasting pilot includes a proposal to perform knockout validation of renewable antibodies for 10,000 to 15,000 human proteins; for this project, 𝑃nature(𝑚𝑖) approaches 1 and 𝑃proposal(𝑚𝑖) drives the overall probability of success.
Similarly, the value of an outcome can be separated into its impact on the scientific field and its impact on society at large. Scientific impact aims to capture the extent to which a project advances the frontiers of knowledge, enables new discoveries or innovations, or enhances scientific capabilities or methods. Social impact aims to capture the extent to which a project contributes to solving important societal problems, improving well-being, or advancing social goals.
In both of these cases, determining the value of an outcome entails some subjective preferences, so there is no “correct” choice, at least mathematically speaking. However, proxy metrics may be helpful in considering impact. Though each is imperfect, one could consider citations of papers, patents on tools or methods, or users of methods, tools, and datasets as proxies of scientific impact. For social impact, some proxy metrics that one might consider are the value of lives saved, the cost of illness prevented, the number of job-years of employment generated, economic output in terms of GDP, or the social return on investment.
The approach outlined by Franzoni and Stephan asks reviewers to assess scientific and social impact on a linear scale (0-100), after which the values can be averaged to determine the overall impact of an outcome. However, we believe that an exponential scale better captures the tendency in science for a small number of research projects to have an outsized impact and provides more room at the top end of the scale for reviewers to increase the rating of the proposals that they believe will have an exceptional impact.
As such, for our FRO Forecasting pilot, we chose to use a framework in which a simple 1–10 score corresponds to real-world impact via a base 2 exponential scale. In this case, the overall impact score of an outcome can be calculated according to
𝓊𝑖 = log₂[2^(science impact of 𝑖) + 2^(social impact of 𝑖)] – 1.
For an exponential scale with a different base, one would substitute that base for two in the above equation. Depending on each funder’s specific understanding of impact and the type(s) of proposals they are evaluating, different relationships between scores and utility could be more appropriate.
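As an illustration of the base-2 version (the function name is ours), the combined score can be computed as:

```python
import math

# Combine 1-10 scientific- and social-impact scores on a base-2 exponential
# scale: u_i = log2(2**sci + 2**soc) - 1. Equal scores combine to that same
# score; very unequal scores land just above the larger score minus one.

def combined_impact(science_score, social_score):
    return math.log2(2 ** science_score + 2 ** social_score) - 1

print(combined_impact(7, 7))  # 7.0
print(combined_impact(9, 3))  # ~8.02, dominated by the larger score
```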
In order to capture reviewers’ assessment of uncertainty in their evaluations, we asked them to provide median, 25th, and 75th percentile predictions for impact instead of a single prediction. High uncertainty would be indicated by a wide interval between the 25th and 75th percentile predictions, while low uncertainty would be indicated by a narrow one.
Determining the “But For” Effect of Funding
The above approach aims to identify the highest impact proposals. However, a grantmaker may not want to simply fund the highest impact proposals; rather, they may be most interested in understanding where their funding would make the highest impact — i.e., their “but for” effect. In this case, the grantmaker would want to fund proposals with the maximum difference between the total expected utility of the research proposal if they chose to fund it versus if they chose not to:
“But For” Impact = 𝑇𝐸𝑈(funding) – 𝑇𝐸𝑈(no funding).
For TEU(funding), the probability of the outcome occurring with this specific grantmaker’s funding using the proposed approach would still be defined as above
𝑃(𝑚𝑖 | funding) = 𝑃nature(𝑚𝑖) * 𝑃proposal(𝑚𝑖),
but for 𝑇𝐸𝑈(no funding), reviewers would need to consider the likelihood of the outcome being achieved through other means. This could involve the outcome being realized by other sources of funding, other researchers, other approaches, etc. Here, the probability of success without this specific grantmaker’s funding could be described as
𝑃(𝑚𝑖 | no funding) = 𝑃nature(𝑚𝑖) * 𝑃other mechanism(𝑚𝑖).
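Putting these pieces together, a single-outcome "but for" calculation might look like the following sketch (function names and numbers are our own, hypothetical illustration):

```python
# "But for" impact of funding, for a single outcome, using the decomposition
# P(m) = P_nature(m) * P_channel(m) under each funding scenario.

def but_for_impact(utility, p_nature, p_proposal, p_other_mechanism):
    p_with_funding = p_nature * p_proposal            # P(m | funding)
    p_without_funding = p_nature * p_other_mechanism  # P(m | no funding)
    return utility * (p_with_funding - p_without_funding)

# An outcome worth 10 that is very likely possible in nature (0.9), with a
# 0.5 chance the proposed approach achieves it and a 0.2 chance that some
# other funder or research group achieves it anyway:
print(but_for_impact(10, 0.9, 0.5, 0.2))  # 10 * (0.45 - 0.18) = 2.7
```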
In our FRO Forecasting pilot, we assumed that 𝑃other mechanism(𝑚𝑖) ≈ 0. The theory of change for FROs is that there exists a set of research problems at the boundary of scientific research and engineering that are not adequately supported by traditional research and development models and are unlikely to be pursued by academia or industry. Thus, in these cases it is plausible to assume that,
𝑃(𝑚𝑖 | no funding) ≈ 0
𝑇𝐸𝑈(no funding) ≈ 0
“But For” Impact ≈ 𝑇𝐸𝑈(funding).
This assumption, while not generalizable to all contexts, can help reduce the number of questions that reviewers have to consider — a dynamic which we explore further in the next section.
Designing Forecasting Questions
Once one has determined the total expected utility equation(s) relevant for the proposal(s) that they are trying to evaluate, the parameters of the equation(s) must be translated into forecasting questions for reviewers to respond to. In general, for each outcome, reviewers will need to answer the following four questions:
- If this proposal is funded, what is the probability that this outcome will occur?
- If this proposal is not funded, what is the probability that this outcome will still occur?
- What will be the scientific impact of this outcome occurring?
- What will be the social impact of this outcome occurring?
For the probability questions, one could alternatively ask reviewers about the different probability components (𝑃nature(𝑚𝑖), 𝑃proposal(𝑚𝑖), 𝑃other mechanism(𝑚𝑖), etc.), but in most cases it will be sufficient — and simpler for the reviewer — to focus on the top-level probabilities that feed into the TEU calculation.
In order for the above questions to tap into the benefits of the forecasting framework, they must be resolvable. Resolving the forecasting questions means that at a set time in the future, reviewers’ predictions will be compared to a ground truth based on the actual events that have occurred (i.e., was the outcome actually achieved and, if so, what was its actual impact?). Consequently, reviewers will need to be provided with the resolution date and the resolution criteria for their forecasts.
Resolution of the probability-based questions hinges mostly on a careful and objective definition of the potential outcomes, and is otherwise straightforward — though note that only one of the probability questions will be resolved, since they are mutually exclusive. The optimal resolution of the scientific and social impact questions may depend on the context of the project and the chosen approach to defining utility. A widely applicable approach is to resolve the utility forecasts by having either program managers or subject matter experts evaluate the results of the completed project and score its impact at the resolution date.
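While we do not prescribe a particular scoring rule here, one standard choice for scoring resolved probability forecasts is the Brier score; a minimal sketch for illustration (other proper scoring rules would work as well):

```python
# Brier score: mean squared difference between predicted probabilities and
# the resolved 0/1 outcomes. Lower is better; always predicting 50%
# scores 0.25.

def brier_score(predictions, outcomes):
    """predictions: probabilities in [0, 1]; outcomes: 1 if achieved, else 0."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# A reviewer predicted 0.8 and 0.3 for two outcomes that resolved
# yes (1) and no (0):
print(brier_score([0.8, 0.3], [1, 0]))  # (0.2**2 + 0.3**2) / 2 = 0.065
```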
For our pilot, we asked forecasting questions only about the probability of success given funding (question 1 above) and the scientific and social impact of each outcome (questions 3 and 4); since we assumed that the probability of success without funding was zero, we did not ask question 2. Because outcomes for the FRO proposals were designed to be either independent or sequential, we did not have to ask additional questions on the joint probability of multiple outcomes being achieved. We chose to resolve our impact questions with a post-project panel of subject matter experts.
In general, there is a tradeoff in implementing this approach between simplicity and thoroughness, efficiency and accuracy. Here are some additional considerations on that tradeoff for those looking to use this approach:
- The responsibility of determining the range of potential outcomes for a proposal could be assigned to any of three parties: the proposal author, the proposal reviewers, or the program manager. First, grantmakers could ask proposal authors to comprehensively define within their proposal the potential primary and secondary outcomes and/or project milestones. Alternatively, reviewers could be allowed to individually — or collectively — determine what they see as the full range of potential outcomes. The third option would be for program managers to define the potential outcomes based on each proposal, with or without input from proposal authors. In our pilot, we chose to use the third approach with input from proposal authors, since it simplified the process for reviewers and allowed us to keep the number of outcomes under consideration manageable.
- In many cases, a “failed” or null outcome may still provide meaningful value by informing other scientists that the research method doesn’t work or that the hypothesis is unlikely to be true. Considering the replication crises in multiple fields, this could be an important and unaddressed aspect of peer review. Grantmakers could choose to ask reviewers to consider the value of these null outcomes alongside other outcomes to obtain a more complete picture of the project’s utility. We chose not to address this consideration in our pilot for the sake of limiting the evaluation burden on reviewers.
- If grant recipients are permitted greater flexibility in their research agendas, this expected value approach could become more difficult to implement, since reviewers would have to consider a wider and more uncertain range of potential outcomes. This was not the case for our FRO Forecasting pilot, since FROs are designed to have specific and well-defined research goals.
Other Similar Efforts
Currently, forecasting is an approach rarely used in grantmaking. Open Philanthropy is the only grantmaking organization we know of that has publicized their use of internal forecasts about grant-related outcomes, though their forecasts do not directly influence funding decisions and are not specifically of expected value. Franzoni and Stephan are also currently piloting their Subjective Expected Utility approach with Novo Nordisk.
Our goal in publishing this methodology is for interested grantmakers to freely adapt it to their own needs and iterate upon our approach. We hope that this paper will help start a conversation in the science research and funding communities that leads to further experimentation. A follow-up report will be published at the end of the FRO Forecasting pilot sharing the results and learnings from the project.
We’d like to thank Peter Mühlbacher, former research scientist at Metaculus, for his meticulous feedback as we developed this approach and for his guidance in designing resolvable forecasting questions. We’d also like to thank the rest of the Metaculus team for being open to our ideas and working with us on piloting this approach, the process of which has helped refine our ideas to their current state. Any mistakes here are of course our own.
What are the best approaches for structuring, funding, and conducting innovative scientific research? The importance of this question — long pondered by philosophers, historians, sociologists, and scientists themselves — is motivating the rapid growth of a new, interdisciplinary and empirically minded Science of Science that spans academia, industry, and government. At the 2nd annual International Conference on the Science of Science and Innovation, held June 26-29 at Northwestern University, experts from across this diverse community gathered to build new connections and showcase the cutting edge of the field.
At this year’s conference, the Federation of American Scientists aimed to further these goals by partnering with Northwestern’s Kellogg School of Management to host the first Metascience Hackathon. This event brought together participants from eight different countries — representing 20 universities, two federal agencies, and two non-profits — to stimulate cross-disciplinary collaboration and develop new approaches to impact. Diverging from the traditional hackathon model, we encouraged teams to advance the field along one of three distinct dimensions: Policy, Knowledge, and Tools.
Participants rose to the occasion, producing eight creative and impactful projects. In the Policy track, teams proposed transformative strategies to enhance scientific reproducibility, support immigrant STEM researchers, and foster impactful interdisciplinary research. The Knowledge track saw teams leveraging data and AI to explore the dynamics of peer review, scientific collaboration, and the growth of the science of science community. The Tools track introduced novel platforms for fostering global research collaboration and visualizing academic mobility.
These projects, developed in less than a day (!), underscore the potential of the science of science community to drive impactful change. They represent not only the innovative spirit of the participants but also the broader value of interdisciplinary collaboration in addressing complex challenges and shaping the future of science.
We are excited to showcase the teams’ work below.
A Funders’ Tithe for Reproducibility Centers
Project Team: Shambhobi Battacharya (Northwestern University), Elena Chechik (European University at St. Petersburg), Alexander Furnas (Northwestern University), & Greg Quigg (University of Massachusetts Amherst)
Background: The responsibility for ensuring scientific reproducibility is primarily on individual researchers and academic institutions. However, reproducibility efforts are often inadequate due to limited resources, publication bias, time constraints, and lack of incentives.
Solution: We propose a policy whereby large science funding bodies earmark a certain percentage of their allocated grants towards establishing and maintaining reproducibility centers. These specialized entities would employ dedicated teams of independent scientists to reproduce or replicate high-impact, high-leverage, or suspicious research. The existence of dedicated reproducibility centers with independent scientists conducting post-hoc, self-directed reproducibility and replication studies will alter the incentives for researchers throughout the scientific community, strengthening the body of scientific knowledge and increasing public trust in scientific findings.
Immigrant STEM Training: Crossing the Valley of Death
Project Team: Sujata Emani (Personal Capacity), Takahiro Miura (University of Tokyo), Mengyi Sun (Northwestern University), & Alice Wu (Federation of American Scientists)
Background: Immigrants significantly contribute to the U.S. economy, particularly in STEM entrepreneurship and innovation. However, they often encounter legal, financial, and interpersonal barriers that lead to high rates of mental health disorders and attrition from scientific research.
Solution: To mitigate these challenges, we propose that science funding agencies expand eligibility for major federal science fellowships (e.g., the NSF GRFP and NIH NRSA) to include international students, providing them with more stable funding sources. We also propose a broader shift in federal research funding towards research fellowships, reducing hierarchical power structures and improving the training environment. Implementing these recommendations can empower all graduate students, foster greater scientific progress, and benefit the American economy.
Increasing Interdisciplinary Research through a More Balanced Research Funding and Evaluation Process
Project Team: Jonathan Coopersmith (Texas A&M University), Jari Kuusisto (University of Vaasa), Ye Sun (University College London), & Hongyu Zhou (University of Antwerp)
Background: Solving local, national, and global challenges will increasingly require interdisciplinary research that spans diverse perspectives and insights. Despite the need for impactful interdisciplinary research, it has not reached its full potential due to the persistence of significant obstacles at many levels of the creation of knowledge. This lack of support makes it challenging to develop and utilize the full potential of interdisciplinarity.
Solution: We propose that national funding agencies should launch a Balanced Research Funding and Evaluation Initiative to create and implement national standards for interdisciplinary research development, management, promotion, funding, and evidence-based evaluation. We outline the specific mechanisms such a program could use to unlock more impactful research on global challenges.
Identifying Reviewer Disciplines and their Impact on Peer Review
Project Team: Chenyue Jiao (University of Illinois, Urbana Champaign), Erzhuo Shao (Northwestern University), Louis Shekhtman (Northeastern University), & Satyaki Sikdar (Indiana University, Bloomington)
Background: Given the rise in interdisciplinary and multidisciplinary research, there is an increasing need to obtain the perspectives of multiple peer reviewers with unique expertise. In this project, we explore whether reviewers from particular disciplines tend to be more critical of papers applying a different disciplinary approach.
Solution: Using a dataset of open reviews from Nature Communications, we assign concepts to papers and reviews using the OpenAlex concept tagger, and analyze review sentiment using OpenAI’s ChatGPT API. Our results identify network pairs of review and paper concepts; several pairs correspond to expectations, such as engineers’ negativity towards physicists’ work and economists’ criticisms of biology studies. Further study and collection of additional datasets could improve the utility of these results.
Team Formation: Expected or Unexpected
Project Team: Noly Higashide (University of Tokyo), Oh-Hyun Kwon (Pohang University of Science and Technology), Zeijan Lyu (University of Chicago), & Seokkyun Woo (Northwestern University)
Background: This year’s conference highlighted the importance of studying the interaction of scientists to better understand the scientific ecosystem. Here, we explore the dynamics of scientific collaboration and its influence on the success of resulting research.
Solution: Using SciSciNet data, we investigate how the likelihood of team formation affects the impact, disruption, and novelty of papers in the fields of biology, chemistry, psychology, and sociology. Our results suggest that the relationship between team structure and research impact varies across disciplines. Specifically, in chemistry and biology the relationship between proximity and citations has an inverse U-shape, such that papers with moderate proximity have the highest impact. These findings underline the need for further exploration of how collaboration patterns affect scientific discovery.
SciSciPeople: Identifying New Members of the Science of Science Community
Project Team: Sirag Erkol (Northwestern University), Yifan Qian (Northwestern University), & Henry Xu (Carnegie Mellon University)
Background: The growth and diversification of the science of science community is crucial for fostering innovation and broadening perspectives.
Solution: Our project introduces SciSciPeople, a new pipeline designed to identify potential new members for this community. Using data from the ICSSI website, SciSciNet, and Google Scholar, our pipeline identifies individuals who have shown interest in the science of science — either through citing well-known review papers, or noting the field as a research interest on Google Scholar — but are not yet part of the ICSSI community. Applying this pipeline successfully identified hundreds of relevant individuals. This tool not only enriches the science of science community but also has potential applications for various fields aiming to discover new individuals to expand their communities.
ScholarConnect: A Platform for Fostering Knowledge Exchange
Project Team: Sai Koneru (Pennsylvania State University), Xuelai Li (Imperial College London), Casey Meyer (OurResearch), Mark Tschopp (Army Research Laboratory)
Background: The rapid growth of the scientific community has made it hard to stay aware of the researchers working on similar projects to your own. As a result, there is a need for new ways to identify researchers doing relevant work in other institutions or fields.
Solution: We created “ScholarConnect”, an open-source tool designed to foster global collaboration among researchers. ScholarConnect recommends potential collaborators based on similarities in research expertise, determined by factors like publication records, concepts, institutions, and countries. The tool offers personalized recommendations and an interactive user interface, allowing users to explore and connect with like-minded researchers from diverse backgrounds and disciplines. We’ve ensured privacy and security by not storing user-entered information and basing recommendations on anonymized, aggregated profiles, and we invite contributions from the wider research community to improve ScholarConnect.
Scientist Map: A Tool for Visualizing Academic Mobility across Institutions
Project Team: Tianji Jiang (University of California Los Angeles), Jesse Tabak (Northwestern University), & Shibo Zhou (Georgia Institute of Technology)
Background: The migration of academic researchers provides a unique window to observe the mobility of knowledge and innovations today, and has been a valuable area of investigation for scholars across various disciplines.
Solution: To study the migration of academic individuals, we introduce a tool designed to allow users to search an academic’s history of affiliation and visualize their historical path on a map. This tool aims to help scientific producers and consumers better understand the migration of experts across institutions, and to support relevant science of science research by providing easy access to researchers’ migration history.