Fillipe D M de Souza, Hubert de Lassus, Ro Cammarota
BMC Medical Genomics, volume 17, issue 1, article 273. Published 2024-11-19. DOI: 10.1186/s12920-024-02037-9 (https://doi.org/10.1186/s12920-024-02037-9)
Private detection of relatives in forensic genomics using homomorphic encryption.
Background: Forensic analysis heavily relies on DNA analysis techniques, notably autosomal Single Nucleotide Polymorphisms (SNPs), to expedite the identification of unknown suspects through genomic database searches. However, the uniqueness of an individual's genome sequence designates it as Personally Identifiable Information (PII), subjecting it to stringent privacy regulations that can impede data access and analysis, as well as restrict the parties allowed to handle the data. Homomorphic Encryption (HE) emerges as a promising solution, enabling the execution of complex functions on encrypted data without the need for decryption. HE not only permits the processing of PII as soon as it is collected and encrypted, such as at a crime scene, but also expands the potential for data processing by multiple entities and artificial intelligence services.
Methods: This study introduces HE-based privacy-preserving methods for SNP DNA analysis, offering a means to compute kinship scores for a set of genome queries while meticulously preserving data privacy. We present three distinct approaches, including one unsupervised and two supervised methods, all of which demonstrated exceptional performance in the iDASH 2023 Track 1 competition.
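The abstract does not say how the kinship scores are computed, so the following is only an illustrative plaintext sketch, not the authors' method. It assumes genotypes are encoded as allele counts (0, 1, or 2 per SNP) and uses a hypothetical score: the sum of squared inner products between the centered query and each centered database entry. That score needs only additions and multiplications, which are the operations HE schemes evaluate on ciphertexts. The names `column_means` and `kinship_score` are made up for this example.

```python
from typing import List

Genotype = List[int]  # assumed encoding: allele count per SNP (0, 1, or 2)


def column_means(db: List[Genotype]) -> List[float]:
    """Per-SNP mean allele count over the database."""
    n = len(db)
    return [sum(row[i] for row in db) / n for i in range(len(db[0]))]


def kinship_score(query: Genotype, db: List[Genotype]) -> float:
    """Hypothetical score: sum over entries of (centered query . centered entry)^2.

    Only additions and multiplications are used, so the same arithmetic
    could in principle be evaluated on encrypted vectors.
    """
    mu = column_means(db)
    centered_query = [q - m for q, m in zip(query, mu)]
    score = 0.0
    for row in db:
        dot = sum(c * (d - m) for c, d, m in zip(centered_query, row, mu))
        score += dot * dot
    return score


if __name__ == "__main__":
    database = [[0, 2, 0, 2], [2, 0, 2, 0], [1, 1, 1, 1]]
    print(kinship_score([0, 2, 0, 2], database))  # query matching an entry: 32.0
    print(kinship_score([1, 1, 1, 1], database))  # query equal to the mean: 0.0
```

A query that shares many alleles with some database entry produces one large squared term, so its score stands out from queries with no close match. A real HE pipeline would also have to handle encoding, packing, and noise budgets, none of which is shown here.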
Results: Our HE-based methods can rapidly predict 400 kinship scores from an encrypted database containing 2000 entries within seconds, capitalizing on advanced technologies like Intel AVX vector extensions, Intel HEXL, and Microsoft SEAL HE libraries. Crucially, all three methods achieve remarkable accuracy levels (ranging from 96% to 100%), as evaluated by the auROC score metric, while maintaining robust 128-bit security. These findings underscore the transformative potential of HE in both safeguarding genomic data privacy and streamlining precise DNA analysis.
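For readers unfamiliar with the auROC metric mentioned above, here is a minimal sketch of how it can be computed from decrypted scores and ground-truth labels. It uses the standard pairwise definition: the fraction of (positive, negative) pairs in which the positive example scores higher, with ties counted as one half. The function name `auroc` and the sample data are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = query has a relative in the database, 0 = it does not.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("auROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Perfect separation: every positive outranks every negative.
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))  # 1.0
    # One positive is ranked below one negative.
    print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
```

An auROC of 1.0 means the scores rank every true relative above every non-relative; 0.5 corresponds to random ranking.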
Conclusions: Results demonstrate that HE-based solutions can be computationally practical to protect genomic privacy during screening of candidate matches for further genealogy analysis in Forensic Genetic Genealogy (FGG).
About the journal:
BMC Medical Genomics is an open access journal publishing original peer-reviewed research articles in all aspects of functional genomics, genome structure, genome-scale population genetics, epigenomics, proteomics, systems analysis, and pharmacogenomics in relation to human health and disease.