Unlocking the U.S. Bioeconomy with the Plant Genome Project
Summary
Plants are an important yet often overlooked national asset. We propose creating a Plant Genome Project (PGP), a robust Human Genome Project-style initiative to build a comprehensive dataset of genetic information on all plant species, starting with the 7,000 plant species that have historically been cultivated for food and prioritizing plants that are endangered by climate change and habitat loss. In parallel, we recommend expanding the National Plant Germplasm System (NPGS) to include genomic-standard repositories that connect plant genetic information to physical seed/plant material. The PGP will mobilize a whole-of-government approach to advance genomic science, lower costs, and increase access to plant genomic information. By creating a fully sequenced national germplasm repository and leveraging modern software and data science tools, we will unlock the U.S. bioeconomy, promote crop innovation, and help enable a diversified, localized, and climate-resilient food system.
Challenge and Opportunity
Plants provide our food, animal feed, medicinal compounds, and the fiber and fuel required for economic development. Plants contribute to biodiversity and are critical for the existence of all other living creatures. Plants also sequester atmospheric carbon, thereby combating climate change and sustaining the health of our planet.
However, as a result of climate change and human practices, we have been losing plants at an alarming rate. Nearly 40% of the world’s 435,000 unique land plant species are extremely rare and at risk of extinction due to climate change. More than 90% of crop varieties have disappeared from fields worldwide as farmers have abandoned diverse local crop varieties in favor of genetically uniform, commercial varieties.
We currently depend on just 15 plants to provide almost all of the world’s food, making our global food supply extremely vulnerable to climate change, new diseases, and geopolitical upheaval—problems that will be exacerbated as the world’s population rises to 10 billion by 2050.
We are in a race against time to stop the loss of plant biodiversity, and at the same time we desperately need to increase the diversity of our cultivated crops. To do this, we must catalog, decode, and preserve valuable data on all existing plants. Yet more than two decades since we sequenced the first plant genome, genome sequence information exists for only 798 plant species, roughly 0.2% of the approximately 435,000 known land plant species.
Although large agriculture companies have made substantial investments in plant genome sequencing, this genetic information is focused on a small number of crops and is not publicly available. What little information exists is siloed within large corporations and is not open to researchers, farmers, or policymakers. This is especially true for nations in the Global South, which are usually excluded from genome sequencing projects. Furthermore, current data in existing germplasm repositories, State Agricultural Experiment Stations, and land-grant universities is not easily accessible online, making it nearly impossible for researchers in both public and private settings to explore. These U.S. government collections and resources of germplasm and herbaria, documented by the Interagency Working Group on Scientific Collections, have untapped potential to catalyze the bioeconomy and mobilize investment in the next generation of plant genetic advancements and, as a result, food security and new economic opportunities.
In 1990, the United States launched the Human Genome Project (HGP), a shared knowledge-mapping initiative funded by the federal government. We continue to benefit from this initiative, which has identified the cause of many human diseases and enabled the development of new medicines and diagnostics. The HGP had a $5.4 billion price tag ($2.7 billion from U.S. contributions) but resulted in more than $14.5 billion in follow-on genomics investments that enabled the field to rapidly develop and deploy cutting-edge sequencing and other technologies, leading to a drop in genomic sequencing cost from $300 million per genome to less than $1,000.
Today, we need a Human Genome Project for plants—a unified Plant Genome Project that will create a global database of genetic information on all plants to increase food security and unlock plant innovation for generations to come. Collecting, sequencing, decoding, and cataloging the nation’s plant species will fill a key gap in our national natural capital accounting strategy. The PGP will complement existing conservation initiatives led by the Office of Science and Technology Policy (OSTP) and other agencies, by deepening our understanding of America’s unique biodiversity and its potential benefits to society. Such research and innovation investment would also benefit government initiatives like USAID’s Feed the Future (FTF) Initiative, particularly the Global Food Security Research Strategy, around climate-smart agriculture and genetic diversity of crops.
PGP-driven advancements in genomic technology and information about U.S. plant genetic diversity will create opportunities to grow the U.S. bioeconomy, create new jobs, and incentivize industry investment. The PGP will also create opportunities to make our food system more climate-resilient and improve national health and well-being. By extending this effort internationally, and ensuring that the Global South is empowered to contribute to and take advantage of these genetic advancements, we can help mitigate climate change, enhance global food security, and promote equitable plant science innovation.
Plan of Action
The Biden Administration should launch a Plant Genome Project to support and enable a whole-of-government approach to advancing plant genomics and the bioeconomy. The PGP will build a comprehensive, open-access dataset of genetic and biological information on all plant species, starting with the 7,000 plant species that have historically been cultivated for food and prioritizing plants that are endangered by climate change and habitat loss. The PGP will convene key stakeholders and technical talent in a novel coalition of partnerships across public and private sectors. We anticipate that the PGP, like the Human Genome Project, will jump-start new technologies that will further drive down the cost of sequencing and advance a new era for plant science innovation and the U.S. bioeconomy. Our plan envisions three Phases and seven Key Actions.
Phase 1: PGP Planning and Formation
Action 1: Create the Plant Genomics and U.S. Bioeconomy Interagency Working Group
The White House OSTP should convene a Plant Genomics and U.S. Bioeconomy Interagency Working Group to coordinate the creation of a Plant Genome Project and initiate efforts to consult with industry, academic, philanthropic, and social sector partners. The Working Group should include representatives from OSTP, the U.S. Department of Agriculture (USDA) and its Agricultural Research Service (ARS), the National Plant Germplasm System, the Department of Commerce, the Department of the Interior, the National Science Foundation (NSF), the National Institutes of Health (NIH), the Smithsonian Institution, the Environmental Protection Agency, the State Department’s Office of the Science and Technology Adviser, and USAID’s Feed the Future Initiative. The Working Group should:
- Identify experts and resources to enable the PGP and work with multi-sector entities, including within USDA, to identify sources of seeds/plants in the United States.
- Conduct a kickoff meeting with OSTP and identify a team that includes NPGS representatives to inventory existing resources, coordinate seed collection efforts, and create connectivity with the PGP.
- Provide recommendations on working with international institutes in the Global North (e.g., the Global Biodiversity Information Facility and the Earth BioGenome Project) and the Global South (e.g., the African BioGenome Project, the International Potato Center, and others). The Earth BioGenome Project’s work on green plants and initial genomic quality standards offers potential starting points for collaboration.
- Create recommendations for the nation’s first Plant Genome Research Institute to drive initial and future efforts in obtaining plant genome information and accelerating innovative research in plant genomics.
Action 2: Launch a White House Summit on Plant Genomics Innovation and Food Security
The Biden Administration should bring together multi-sector (agriculture industry, farmers, academics, and philanthropy) and agency partners with the expertise, resource access, and interest in increasing domestic food security and climate resilience. The Summit will secure commitments for the PGP’s initial activities and identify ways to harmonize existing data and advances in plant genomics. The Summit and follow-up activities should outline the steps that the Working Group will take to identify and combine existing plant genome data and to encourage its distribution and accessibility. Since public-private partnerships play a core enabling role in the strategy, the Summit should also determine opportunities for potential partners, novel financing through philanthropy, and international cooperation.
Action 3: Convene Potential International Collaborators and Partners
International cooperation should be explored from the start (beginning with the Working Group and the White House Summit) to ensure that sequencing is conducted not just at a handful of institutions in the Global North but that countries in the Global South are included and all information is made publicly available.
- During the annual UN General Assembly Summit, OSTP should convene a forum of key leaders across multiple countries, international NGOs, Fortune 1000 companies, and academia.
- This forum will drive international public-private commitments to action that support the launch of the PGP.
- This forum should produce a yearly report, the first of its kind, on progress at the intersection of technological and data-driven advances in plant and crop innovation, preserving plant biodiversity, ending hunger, achieving food security, and improving nutrition.
- This work could culminate in a flagship announcement of new commitments tied to the UN’s 2030 Agenda for Sustainable Development. Various champions and experts on the Nagoya Protocol, the Convention on Biological Diversity (CBD), and others working with the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) should be included.
We envision at least one comprehensive germplasm seed bank in each country or geographical region, similar to the Svalbard Global Seed Vault or the Royal Botanic Gardens, Kew, with sequencing contributions from multiple international organizations such as Beijing Genomics Institute and the Sanger Institute.
Phase 2: PGP Formalization and Launch
Action 4: Launch the Plant Genome Research Institute to centralize and coordinate plant genome sequencing
Congress should create a Plant Genome Research Institute (PGRI) that will drive plant genomics research and be the central owner of U.S. government activities. The PGRI would centralize funding and U.S. government ownership over the PGP. We anticipate the PGP would require $2.5 billion over 10 years, with investment frontloaded and funding raised through matched commitments from philanthropy, public, and private sources. The PGRI could be a virtual institute structured as a distributed collaboration between multiple universities and research centers with centralized project management. PGRI funding could also incorporate novel funding mechanisms akin to the BRAIN Initiative through U.S. philanthropy and private sector collaboration (e.g., Science Philanthropy Alliance). The PGRI would:
- Identify key strategic public and private partners to join the coalition that prioritizes and undertakes the sequencing projects.
- Coordinate sequencing that will be conducted at sequencing centers funded by the PGRI while ensuring that all current consortia and initiatives for plant genome sequencing are included and connected.
- Define metrics for gene sequencing (e.g., accuracy, capacity, and cost of finished sequence), genome assembly, and genetic/physical map creation.
- Engage with industry providers of novel sequencing technology to bring down costs.
- Develop the final operational plan with timelines and funders outside the U.S. government in philanthropy and the private sector.
- Promote the development of novel bioinformatic and computational tools to facilitate genome assembly in polyploid plants and to view reference genomes of various plant species and varieties.
- Based on recommendations from the Working Group, select the agency or offices best positioned to house and maintain the data.
- Implement FAIR standards for data storage and dissemination and ensure an open-access, user-friendly interface for the final software platform. This could be achieved through a current database such as GenBank.
- Run an open challenge to share existing genome sequence data that is currently not publicly available and ensure that receiving centers undertake appropriate validation and quality control of all imported data.
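As one illustration of the kind of metrics the PGRI might define, the sketch below computes two measures widely used in genomics: the Phred quality score for per-base accuracy and the N50 statistic for assembly contiguity. Neither metric is named in this proposal; the function names and the example values are hypothetical and chosen only for illustration.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Return the Phred score Q = -10 * log10(p) for a per-base error probability p."""
    if not 0 < error_probability <= 1:
        raise ValueError("error_probability must be in (0, 1]")
    return -10 * math.log10(error_probability)


def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not contig_lengths or any(length <= 0 for length in contig_lengths):
        raise ValueError("contig_lengths must be a non-empty list of positive lengths")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for positive lengths")
```

For example, a per-base error rate of 1 in 1,000 corresponds to a Phred score of 30, and an assembly with contigs of 100, 80, 60, 40, and 20 kilobases has an N50 of 80 kilobases. Which thresholds to require would be a decision for the PGRI.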
Action 5: Expand and Strengthen NPGS-Managed Seed Repositories
We recommend strengthening the distributed seed repository managed by the U.S. National Plant Germplasm System and building a comprehensive and open-source catalog of plant genetic information tied to physical samples. The NPGS already stores seed collections at state land-grant universities in a collaborative effort to safeguard the genetic diversity of agriculturally important plants and may need additional funding to expand its work and increase visibility and access.
- Bring in new partnerships, funding, and technical expertise from the private sector, a major user of the NPGS collections and the primary means by which new and improved plants are commercialized.
- Provide funding to create structured and highly annotated datasets of seed profiles with taxonomic data and other criteria such as phenotypic/physical attributes, local usage, commercial characteristics, and rarity.
- Automatically feed data from all new and existing germplasm repositories into the PGP, linking existing physical germplasm data to novel genetic data and connecting genomic and genetic data with taxonomic information.
- Invite computer scientists to develop novel data-driven algorithms and machine-learning models incorporating newly collected genomic data to identify plant varieties that might be especially climate resilient. (This potential innovation has been demonstrated in machine-vision research involving digitized herbarium specimens.)
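The structured seed profiles and the link between physical germplasm and genomic data described above could be represented along the following lines. This is a minimal sketch: the `SeedProfile` and `GenomeRecord` classes, their field names, and the `link_profiles` helper are all hypothetical and do not reflect any existing NPGS or PGP schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class GenomeRecord:
    """Genomic data for one accession (hypothetical fields)."""
    accession_id: str
    assembly_name: str


@dataclass
class SeedProfile:
    """Annotated profile of a physical seed sample (hypothetical fields)."""
    accession_id: str
    scientific_name: str
    family: str
    phenotypic_traits: dict[str, str] = field(default_factory=dict)
    local_uses: list[str] = field(default_factory=list)
    commercial_traits: list[str] = field(default_factory=list)
    rarity: str = "unknown"
    genome: Optional[GenomeRecord] = None


def link_profiles(profiles: list[SeedProfile], genomes: list[GenomeRecord]) -> list[str]:
    """Attach each genome to the profile with the same accession ID.

    Returns the accession IDs of profiles that still lack genomic data,
    which could serve as a sequencing backlog.
    """
    by_accession = {g.accession_id: g for g in genomes}
    unsequenced = []
    for profile in profiles:
        profile.genome = by_accession.get(profile.accession_id)
        if profile.genome is None:
            unsequenced.append(profile.accession_id)
    return unsequenced
```

Keying both records on a shared accession identifier is one simple way to keep genetic data tied to the physical sample; a production system would likely need persistent identifiers and versioned assemblies as well.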
Action 6: Create a Plant Innovation Fund within AgARDA
The Agriculture Advanced Research and Development Authority (AgARDA) is a USDA-based advanced research projects agency like DARPA but for agriculture research. The 2018 Farm Bill authorized AgARDA’s creation to tackle highly ambitious projects that are likely to have an outsize impact on agricultural and environmental challenges—such as the PGP. The existing AgARDA Roadmap could guide program setup.
- The Administration should launch BioDRIVe, a $100 million public natural assets fund targeted at plant innovation and biodiversity gains inspired by the DRIVe program focused on preventing future pandemics. If traditional federal funding mechanisms are inadequate, such data-driven investment vehicles/flexible funding tools could also include philanthropic funding mechanisms and be integrated within the PGRI. This would promote public-private partnerships and de-risk promising technologies that drive innovation for food security and biodiversity conservation.
- Through its RFP process, AgARDA could help drive key agricultural innovation that arises from the PGP, such as creating stronger, more climate-resilient, disease-resistant, or nutritionally superior plants and identifying plants that can break down pollutants or produce novel enzymes and products. AgARDA is ready to go as soon as it receives funding through the annual Congressional funding process.
Phase 3: Long-Term, Tandem Bioeconomy Investments
Action 7: Bioeconomy Workforce Development and Plant Science Education
Invest in plant science and technical workforce development to build a sustainable foundation for global plant innovation and enable long-term growth in the U.S. bioeconomy.
- Use the PGRI as a platform for a renewed focus on training world-class botanists, plant breeders, horticulturists, and agronomists. A networked effort to train the next generation of plant scientists, similar to the 100Kin10 initiative that successfully trained 100,000 new STEM teachers in 10 years, could be very useful. Funding could be targeted to scholarships in these areas at universities, community colleges, and scientific associations such as the American Society of Agronomy. We also recommend increasing emphasis on plant science in K-12 education.
- Build stronger ties between the plant science industry and the engineering workforce to support the growing data and technology needs for plant science research. This could include bringing in science and engineering fellows from existing fellowship programs to help build new software and data science tools for plant science.
- Launch a fellowship program in partnership with NSF, USDA, and scientific associations such as the American Association for the Advancement of Science or the American Society of Plant Biologists for talented plant biologists, agricultural researchers, and data and software engineers to serve a yearlong “tour of duty” in public service, where they would work internationally to collect, maintain, and expand the plant genome database. Existing and new repositories could benefit from this talent pool, and these cohorts of fellows would disseminate knowledge of the database throughout their careers, helping to achieve adoption at scale.
Conclusion
We are in a race against time to identify, decode, catalog, preserve, and cultivate the critical biodiversity of the world’s plant species before they are lost forever. By creating the world’s first comprehensive, open-access catalog of plant genetic information tied to physical samples, the Plant Genome Project will unlock plant innovation for food security, help preserve plant biodiversity in a changing climate, and advance the bioeconomy. The PGP’s whole-of-government approach will accelerate a global effort to secure our food systems and the health of the planet while catalyzing a new era of plant science, agricultural innovation, and cooperation.
We estimate that it would cost ~$2.5 billion to sequence the genomes of all plant species. (For reference, the Human Genome Project cost $5.4 billion in 2017 dollars to sequence just one species.)
We recommend active solicitation of existing sequence information from all entities. This data should be validated and checked for quality before being integrated into the PGP.
The newly created Plant Genome Research Institute (PGRI) will coordinate the PGP. The structure and operations of the PGRI will follow recommendations from the OSTP-commissioned Stakeholder Working Group. All work will be conducted in partnership with agencies like the U.S. Department of Agriculture, National Institutes of Health, National Science Foundation, private companies, and public academic institutions.
Existing sequencing efforts and seed banks will be included within the framework of the PGP.
The PGP will start as a national initiative, but to have the greatest impact it must be an international effort like the Human Genome Project. The White House Summit and Stakeholder Working Group will help influence scope and staging. The extinction crisis is a global problem, so the PGP should be a global effort in which the United States plays a strong leadership role.
In Phase 1, emphasis might be placed on native “lost crops” that can be grown in areas that are suffering from drought or are affected by climate change. Collection and selection would complement and incorporate active Biden Administration initiatives that center Indigenous science and environmental justice and equity.
In Phase 2, efforts could focus on sequencing all plants in regions or ecosystems within the U.S. that are vulnerable to adverse climate events in collaboration with existing state-level and university programs. An example is the California Conservation Genomics Project, which aims to sequence all the threatened, endangered and commercially exploited flora and fauna of California. Edible and endangered plants will be prioritized, followed by other plants in these ecosystems.
In Phase 3, all remaining plant species will be sequenced.
All collected seeds will be added to secure, distributed physical repositories, with priority given to collecting physical samples and genetic data from endangered species.
The PGP will work to address and even correct some long-standing inequalities, ensuring that the rights and interests of all nations and Indigenous people are respected in multiple areas from specimen collection to benefit sharing while ensuring open access to genomic information. The foundational work being done by the Earth BioGenome Project’s Ethical, Legal and Social Committee will be critically important.
Invitees could include but would not be limited to the following entities with corresponding initial commitments to support the PGP’s launch:
- Genome sequencing companies, such as Illumina, PacBio, Oxford Nanopore Technologies, and others, who would draft a white paper on the current landscape for sequencing technologies and innovation that would be needed to enable a PGP.
- Academic institutions with active sequencing core facilities such as the University of California, Davis and Washington University in St. Louis, among others, who would communicate existing capacity for PGP efforts and forecast additional capacity-building needs, summarize strengths of each entity and past contributions, and identify key thought leaders in the space.
- Large ag companies, such as Bayer Crop Science, Syngenta, Corteva, and others, who are willing to share proprietary sequence information, communicate industry perspectives, identify obstacles to data sharing and potential solutions, and actively participate in the PGP and potentially provide resources.
- Government agencies and public institutions, such as NIH/NCBI, NSF, USDA, the Foundation for Food and Agriculture Research, CGIAR, and the Missouri Botanical Garden, which would draft white papers communicating existing efforts and funding, identify funding gaps, and assess current and future collaborations.
- Current sequencing groups/consortiums, such as the Wheat Genome Sequencing Consortium, Earth BioGenome Project, Open Green Genomes Project, HudsonAlpha, and others, would draft white papers communicating existing efforts and funding needs, identify gaps, and plan for data connectivity.
- Tech companies, such as Google and Microsoft, could communicate existing efforts and technologies, assess the potential for new technologies and tools to accelerate PGP, curate data, and provide support such as talent in the fields of data science and software engineering.