Privacy-Preserving Collaborative Genomic Research: A Real-Life Deployment and Vision

Abstract

The data revolution holds significant promise for the health sector. Vast amounts of data collected and measured from individuals will be transformed into knowledge, AI models, predictive systems, and digital best practices. One area of health that stands to benefit greatly is genomics. Advances in AI, machine learning, and data science have opened new opportunities for genomic research, promising breakthroughs in personalized medicine. However, growing awareness of privacy and cyber security calls for robust solutions that protect sensitive data in collaborative research. This paper presents a practical deployment of a privacy-preserving framework for genomic research, developed in collaboration with Lynx.MD, a platform designed for secure health data collaboration. The framework addresses critical cyber security and privacy challenges, enabling privacy-preserving sharing and analysis of genomic data while mitigating the risks associated with data breaches. By integrating advanced privacy-preserving algorithms, the solution protects individual privacy without compromising data utility. A unique feature of the system is its ability to balance the trade-off between data sharing and privacy, giving stakeholders tools to quantify privacy risks and make informed decisions. The implementation of the framework within Lynx.MD encodes genomic data into binary formats and adds noise through controlled perturbation techniques. This approach preserves essential statistical properties of the data, facilitating effective research and analysis. Additionally, the system incorporates real-time data monitoring and advanced visualization tools, enhancing user experience and decision-making capabilities. The paper highlights the need for privacy attacks and defenses tailored to genomic data, given its unique characteristics compared to other data types. By addressing these challenges, the proposed solution aims to foster global collaboration in genomic research, ultimately contributing to significant advancements in personalized medicine and public health.
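
The abstract does not specify the encoding or the perturbation mechanism, so the following is only a minimal sketch of how binary encoding followed by controlled perturbation could work in principle. Everything in it is an assumption made for illustration: genotypes are taken to be minor-allele counts (0, 1, 2), the two-bit encoding is hypothetical, and the noise is randomized response (independent bit flips), which may differ from what the deployed framework uses. The function names and the parameter epsilon are likewise illustrative, not part of Lynx.MD.

```python
import numpy as np


def encode_genotypes(genotypes):
    """Encode minor-allele counts (0, 1, 2) as two bits per SNP.

    Hypothetical encoding: bit 0 = "at least one minor allele",
    bit 1 = "two minor alleles". Output shape is input shape + (2,).
    """
    g = np.asarray(genotypes)
    return np.stack([g >= 1, g == 2], axis=-1).astype(np.uint8)


def flip_probability(epsilon):
    """Flip probability for randomized response with parameter epsilon > 0."""
    return 1.0 / (1.0 + np.exp(epsilon))


def perturb(bits, epsilon, rng):
    """Flip each bit independently with probability flip_probability(epsilon)."""
    p = flip_probability(epsilon)
    flips = (rng.random(bits.shape) < p).astype(np.uint8)
    return np.bitwise_xor(bits, flips)


def debiased_frequency(noisy_bits, epsilon):
    """Estimate the true per-bit frequency across individuals (axis 0).

    If f is the true fraction of ones, the expected noisy fraction is
    f * (1 - p) + (1 - f) * p, which is inverted here. Requires epsilon > 0.
    """
    p = flip_probability(epsilon)
    return (noisy_bits.mean(axis=0) - p) / (1.0 - 2.0 * p)
```

Under these assumptions, a caller would encode a matrix of shape (individuals, SNPs), pass the result through perturb with a generator such as np.random.default_rng(), and share only the noisy bits. Smaller epsilon means more flips and stronger per-bit protection, while debiased_frequency illustrates how an aggregate statistic can still be recovered approximately from the perturbed data when the cohort is large.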

Publication
ACM Workshop on Cybersecurity in Healthcare (HealthSec)