Advance open science through robust data privacy measures
In an era of accelerating advancements in data collection and analysis, realizing the full potential of open science hinges on balancing data accessibility and privacy. As we move towards a more open scientific environment, the volume of sensitive data being shared is swiftly increasing. While open science presents an opportunity to fast-track scientific discovery, it also poses a risk to privacy if not managed correctly.
Building on existing data and privacy efforts, the White House and federal science agencies should collaborate to develop and implement clear standards for research data privacy across the data management and sharing life cycle.
Details
Federal agencies’ open data initiatives are a milestone in the move towards open science. They have the potential to foster greater collaboration, transparency, and innovation in the U.S. scientific ecosystem and lead to a new era of discovery. However, a shift towards open data also poses challenges for privacy, as sharing research data openly can expose personal or sensitive information when done without the appropriate care, methods, and tools. Addressing this challenge requires new policies and technologies that allow for open data sharing while also protecting individual privacy.
The U.S. government has shown a strong commitment to addressing data privacy challenges in various scientific and technological contexts. This commitment is underpinned by laws and regulations such as the Health Insurance Portability and Accountability Act and the regulations for human subjects research (e.g., Code of Federal Regulations Title 45, Part 46). These regulations provide a legal framework for protecting sensitive and identifiable information, which is crucial in the context of open science.
The White House Office of Science and Technology Policy (OSTP) has spearheaded the “National Strategy to Advance Privacy-Preserving Data Sharing and Analytics,” which aims to further the development of these technologies to maximize their benefits equitably, promote trust, and mitigate risks. The National Institutes of Health (NIH) operates an internal Privacy Program, which is responsible for protecting sensitive and identifiable information within NIH work. The National Science Foundation (NSF) complements these efforts with a multidisciplinary approach through programs like the Secure and Trustworthy Cyberspace program, which aims to develop new ways to design, build, and operate cyber systems, protect existing infrastructure, and motivate and educate individuals about cybersecurity.
Given the unique challenges within the open science context and the wide reach of open data initiatives across the scientific ecosystem, there remains a need for further development of clear policies and frameworks that protect privacy while also facilitating the efficient sharing of scientific data. Coordinated efforts across the federal government could ensure these policies are adaptable, comprehensive, and aligned with the rapidly evolving landscape of scientific research and data technologies.
Recommendations
To clarify standards and best practices for research data privacy:
- The National Institute of Standards and Technology (NIST) should build on its existing Research Data Framework to develop a new framework that is specific to research data privacy and addresses the unique needs of open science communities and practices. This would provide researchers with a clear roadmap for implementing privacy-preserving data sharing in their work.
- This framework should incorporate the principles of Privacy by Design, ensuring that privacy is an integral part of the research life cycle, rather than an afterthought.
- The framework should be regularly updated to stay current with the changes in state, federal, and international data privacy laws, as well as new privacy-preserving methodologies. This will ensure that it remains relevant and effective in the evolving data privacy landscape.
To ensure best practices are used in federally funded research:
- Funding agencies like the NIH and NSF should work with NIST to develop and implement training for Data Management and Sharing Plan applicants and reviewers. This training would equip both parties with knowledge of best practices in privacy-preserving data sharing in open science, thereby ensuring that data privacy measures are effectively integrated into research workflows.
- Agencies should also establish programs to foster privacy education, as recommended in the OSTP national strategy.
- Training on open data privacy could likewise be incorporated into agencies’ existing Responsible Conduct of Research requirements.
To catalyze continued improvements in data privacy technologies:
- Science funding agencies should increase funding for domain-specific research and development of privacy-preserving methods for research data sharing. Such initiatives would spur innovation in fields like cryptography and secure computation, leading to the development of new technologies that can broaden the scope of open and secure data sharing.
- To further stimulate innovation, these agencies could also host privacy/security innovation competitions, encouraging researchers and developers to create and implement cutting-edge solutions.
To facilitate inter-agency coordination:
- OSTP should launch a National Science and Technology Council subcommittee on research data privacy within the Committee on Science. This subcommittee should work closely with the Office of Management and Budget, leveraging its expertise in overseeing federal information resources and implementing data management policies. This collaboration would ensure a coordinated and consistent approach to addressing data privacy issues in open science across different federal agencies.