Secure Comparisons of Single Nucleotide Polymorphisms Using Secure Multiparty Computation (Preprint)
Authors: Andrew Woods, Skyler T Kramer, Dong Xu, Wei Jiang
Journal: JMIR Bioinformatics and Biotechnology
Published: 2023-07-18
DOI: https://doi.org/10.2196/44700
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135223/pdf/
Secure Comparisons of Single Nucleotide Polymorphisms Using Secure Multiparty Computation: Method Development.
Background: While genomic variations can provide valuable information for health care and ancestry, the privacy of individual genomic data must be protected. Thus, a secure environment is desirable for a human DNA database such that the total data are queryable but not directly accessible to involved parties (eg, data hosts and hospitals) and that the query results are learned only by the user or authorized party.
Objective: In this study, we provide efficient and secure computations on panels of single nucleotide polymorphisms (SNPs) from genomic sequences as computed under the following set operations: union, intersection, set difference, and symmetric difference.
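As a plaintext point of reference for the four set operations named above, the following minimal sketch applies them to two small SNP panels. It performs no secure multiparty computation and is not the authors' protocol; the identifiers in `panel_a` and `panel_b` are hypothetical placeholders chosen only for illustration.

```python
# Plaintext illustration only: nothing here is secret-shared or encrypted.
# The SNP identifiers are hypothetical placeholders, not real data.
panel_a = {"snp_0001", "snp_0002", "snp_0003", "snp_0004"}
panel_b = {"snp_0003", "snp_0004", "snp_0005"}

# SNPs present in either panel.
union = panel_a | panel_b
# SNPs present in both panels.
intersection = panel_a & panel_b
# SNPs present in panel_a but not in panel_b.
difference = panel_a - panel_b
# SNPs present in exactly one of the two panels.
symmetric_difference = panel_a ^ panel_b

for name, result in [
    ("union", union),
    ("intersection", intersection),
    ("difference", difference),
    ("symmetric difference", symmetric_difference),
]:
    print(f"{name}: {sorted(result)}")
```

A secure protocol would aim to produce the same results without either party seeing the other's panel in the clear.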
Methods: Using these operations, we can compute similarity metrics, such as the Jaccard similarity, which could allow querying a DNA database to find the same person and genetic relatives securely. We analyzed various security paradigms and report metrics for the protocols under several security assumptions, such as semihonest, malicious with an honest majority, and malicious with a malicious majority.
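The Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below computes it in plaintext, assuming each SNP panel is encoded as a 0/1 vector over a shared, ordered list of SNP positions. That encoding, the function name `jaccard_from_bits`, and the convention of returning 1 for two empty panels are assumptions made for illustration, not details taken from the paper.

```python
from fractions import Fraction


def jaccard_from_bits(bits_a, bits_b):
    """Plaintext Jaccard similarity of two SNP panels given as 0/1 vectors.

    Assumption: position i refers to the same SNP in both vectors.
    """
    if len(bits_a) != len(bits_b):
        raise ValueError("both panels must cover the same SNP positions")

    # |A ∩ B|: positions where both panels carry the SNP.
    intersection_size = sum(a & b for a, b in zip(bits_a, bits_b))
    # |A ∪ B| = |A| + |B| - |A ∩ B| (inclusion-exclusion).
    union_size = sum(bits_a) + sum(bits_b) - intersection_size

    if union_size == 0:
        # Assumed convention: two empty panels are treated as identical.
        return Fraction(1)
    return Fraction(intersection_size, union_size)


# Hypothetical panels over five SNP positions.
panel_a = [1, 1, 1, 1, 0]
panel_b = [0, 0, 1, 1, 1]
print(jaccard_from_bits(panel_a, panel_b))  # prints 2/5
```

Writing the union size in terms of the intersection size means only one quantity depends on both inputs jointly, which is the part a secure protocol would need to compute without revealing either panel.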
Results: We show that our methods can be used practically on realistically sized data. Specifically, we can compute the Jaccard similarity of two genomes when considering sets of SNPs, each with 400,000 SNPs, in 2.16 seconds with the assumption of a malicious adversary in an honest majority and 0.36 seconds under a semihonest model.
Conclusions: Our methods may help adopt trusted environments for hosting individual genomic data with end-to-end data security.