Please use this identifier to cite or link to this item: https://hdl.handle.net/11147/12166
Title: Efficient privacy-preserving whole-genome variant queries
Authors: Akgün, Mete
Pfeifer, Nico
Kohlbacher, Oliver
Affiliations: Izmir Institute of Technology
University of Tübingen
University of Tübingen
Issue Date: Apr-2022
Publisher: Oxford University Press
Abstract: Motivation: Diagnosis and treatment decisions based on genomic data have become widespread as the cost of genome sequencing gradually decreases. In this context, disease-gene association studies are of great importance. However, genomic data are highly sensitive compared to other data types and contain information about individuals and their relatives. Many studies have shown that this information can be obtained from query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-protected manner. The proposed solution privately outsources genomic data from arbitrarily many sources to two non-colluding proxies and allows genomic databases to be safely stored in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, allows queries to run efficiently on hundreds of thousands of genomes. Results: We measure the performance of our solution with parameters similar to real-world applications. It is possible to query a genomic database with 3 000 000 variants with five genomic query predicates in under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants can be achieved in approximately 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike previous studies, it is possible to query multiple databases with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data.
URI: https://doi.org/10.1093/bioinformatics/btac070
https://hdl.handle.net/11147/12166
Appears in Collections: Computer Engineering / Bilgisayar Mühendisliği
PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
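
The abstract mentions XOR-based sharing of genomic data between two non-colluding proxies. The following is a minimal sketch of that general idea, not the authors' protocol: the function names (`share`, `reconstruct`, `xor_local`) and the one-byte-per-variant toy encoding are assumptions made purely for illustration. It shows that each share on its own is random, that the two shares together recover the data, and that XOR of two shared vectors can be computed by each proxy locally.

```python
import secrets
from typing import Tuple


def share(secret: bytes) -> Tuple[bytes, bytes]:
    """Split `secret` into two XOR shares (hypothetical helper).

    The first share is a fresh random mask; the second is secret XOR mask.
    Either share alone reveals nothing about the secret.
    """
    mask = secrets.token_bytes(len(secret))
    masked = bytes(s ^ m for s, m in zip(secret, mask))
    return mask, masked


def reconstruct(share_0: bytes, share_1: bytes) -> bytes:
    """Recombine two XOR shares into the original value."""
    return bytes(a ^ b for a, b in zip(share_0, share_1))


def xor_local(x_share: bytes, y_share: bytes) -> bytes:
    """XOR two shares held by the same proxy; needs no communication."""
    return bytes(a ^ b for a, b in zip(x_share, y_share))


# Assumed toy encoding: one byte per variant position, 1 = variant present.
genome = bytes([1, 0, 0, 1, 1, 0, 1, 0])
proxy0_share, proxy1_share = share(genome)

# Only the combination of both proxies' shares recovers the genome.
assert reconstruct(proxy0_share, proxy1_share) == genome
```

In XOR-sharing schemes generally, XOR is the cheap operation because each party can apply it to its own shares; operations such as AND typically require interaction between the parties, which is where protocol design and performance engineering come in.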

Files in This Item:
File: btac070.pdf
Description: Article
Size: 1.55 MB
Format: Adobe PDF
Scopus citations: 4 (checked on Feb 16, 2024)
Web of Science citations: 3 (checked on Feb 10, 2024)
Page views: 1,540 (checked on Feb 26, 2024)
Downloads: 434 (checked on Feb 26, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.