Advance open science through robust data privacy measures

02.09.24 | 4 min read | Text by Alyssa Columbus

In an era of accelerating advancements in data collection and analysis, realizing the full potential of open science hinges on balancing data accessibility and privacy. As we move towards a more open scientific environment, the volume of sensitive data being shared is swiftly increasing. While open science presents an opportunity to fast-track scientific discovery, it also poses a risk to privacy if not managed correctly.

Building on existing data and privacy efforts, the White House and federal science agencies should collaborate to develop and implement clear standards for research data privacy across the data management and sharing life cycle.

Details

Federal agencies’ open data initiatives are a milestone in the move towards open science. They have the potential to foster greater collaboration, transparency, and innovation in the U.S. scientific ecosystem and lead to a new era of discovery. However, a shift towards open data also poses challenges for privacy, as sharing research data openly can expose personal or sensitive information when done without the appropriate care, methods, and tools. Addressing this challenge requires new policies and technologies that allow for open data sharing while also protecting individual privacy.

The U.S. government has shown a strong commitment to addressing data privacy challenges in various scientific and technological contexts. This commitment is underpinned by laws and regulations such as the Health Insurance Portability and Accountability Act and the regulations for human subjects research (e.g., Code of Federal Regulations Title 45, Part 46). These regulations provide a legal framework for protecting sensitive and identifiable information, which is crucial in the context of open science.

The White House Office of Science and Technology Policy (OSTP) has spearheaded the “National Strategy to Advance Privacy-Preserving Data Sharing and Analytics,” aiming to further the development of these technologies to maximize their benefits equitably, promote trust, and mitigate risks. The National Institutes of Health (NIH) operate an internal Privacy Program, responsible for protecting sensitive and identifiable information within NIH work. The National Science Foundation (NSF) complements these efforts with a multidisciplinary approach through programs like the Secure and Trustworthy Cyberspace program, aiming to develop new ways to design, build, and operate cyber systems, protect existing infrastructure, and motivate and educate individuals about cybersecurity.

Given the unique challenges within the open science context and the wide reach of open data initiatives across the scientific ecosystem, there remains a need for further development of clear policies and frameworks that protect privacy while also facilitating the efficient sharing of scientific data. Coordinated efforts across the federal government could ensure these policies are adaptable, comprehensive, and aligned with the rapidly evolving landscape of scientific research and data technologies.

Recommendations

To clarify standards and best practices for research data privacy:

The National Institute of Standards and Technology (NIST) should build on its existing Research Data Framework to develop a new framework that is specific to research data privacy and addresses the unique needs of open science communities and practices. This would provide researchers with a clear roadmap for implementing privacy-preserving data sharing in their work.
- This framework should incorporate the principles of Privacy by Design, ensuring that privacy is an integral part of the research life cycle, rather than an afterthought.
- The framework should be regularly updated to stay current with the changes in state, federal, and international data privacy laws, as well as new privacy-preserving methodologies. This will ensure that it remains relevant and effective in the evolving data privacy landscape.

To ensure best practices are used in federally funded research:

Funding agencies like the NIH and NSF should work with NIST to develop and implement training for Data Management and Sharing Plan applicants and reviewers. This training would equip both parties with knowledge of best practices in privacy-preserving data sharing in open science, thereby ensuring that data privacy measures are effectively integrated into research workflows.
- Agencies should additionally establish programs to foster privacy education, as recommended in the OSTP national strategy.
- Training on open data privacy could additionally be incorporated into agencies’ existing Responsible Conduct of Research requirements.

To catalyze continued improvements in data privacy technologies:

Science funding agencies should increase funding for domain-specific research and development of privacy-preserving methods for research data sharing. Such initiatives would spur innovation in fields like cryptography and secure computation, leading to the development of new technologies that can broaden the scope of open and secure data sharing.
- To further stimulate innovation, these agencies could also host privacy/security innovation competitions, encouraging researchers and developers to create and implement cutting-edge solutions.

To facilitate inter-agency coordination:

OSTP should launch a National Science and Technology Council subcommittee on research data privacy within the Committee on Science. This subcommittee should work closely with the Office of Management and Budget, leveraging its expertise in overseeing federal information resources and implementing data management policies. This collaboration would ensure a coordinated and consistent approach to addressing data privacy issues in open science across different federal agencies.

To learn more about the importance of opening science and to read the rest of the published memos, visit the Open Science Policy sprint landing page.

publications

See all publications

Emerging Technology

Report

Understanding the U.S. Bioeconomy: Agency Perspectives

To understand the range of governmental priorities for the bioeconomy, we spoke with key agencies represented on the National Bioeconomy Board to collect their perspectives.

07.16.24 | 15 min read

Emerging Technology

day one project

Policy Memo

GenAI in Education Research Accelerator (GenAiRA)

Congress should foster a more responsive and evidence-based ecosystem for GenAI-powered educational tools, ensuring that they are equitable, effective, and safe for all students.

06.28.24 | 10 min read

Emerging Technology

day one project

Policy Memo

A Safe Harbor for AI Researchers: Promoting Safety and Trustworthiness Through Good-Faith Research

Without independent research, we do not know if the AI systems that are being deployed today are safe or if they pose widespread risks that have yet to be discovered, including risks to U.S. national security.

06.28.24 | 6 min read

Emerging Technology

day one project

Policy Memo

Update COPPA 2.0 to Strengthen Children’s Online Voice Privacy in the AI Era

Companies that store children’s voice recordings and use them for profit-driven applications without parental consent pose serious privacy threats to children and families.

06.28.24 | 9 min read