Efficient privacy-preserving variable-length substring match for genome sequence
Authors: Yoshiki Nakagawa, Satsuya Ohata, Kana Shimizu
DOI: https://doi.org/10.1186/s13015-022-00211-1
Publication date: 2022-04-26
Publication type: Journal Article
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040336/pdf/
Citations: 0
Abstract
Developing privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches for a variable-length substring match between a query and a database sequence. Our concept hinges on a technique that efficiently applies the FM-index within a secret-sharing scheme. More precisely, we developed an algorithm that achieves a secure table lookup in which [Formula: see text] is computed for a given depth of recursion, where [Formula: see text] is an initial position and V is a vector. We applied the secure table lookup to vectors created from the FM-index. The notable feature of the secure table lookup is that, after the query input, its time, communication, and round complexities do not depend on the table length N. Therefore, a substring match that refers to the FM-index-based table can also be conducted independently of the database length, and the total search time is dramatically improved compared with previous approaches. We conducted an experiment using a human genome sequence of length 10 million as the database and a query of length 100, and found that the query response time of our protocol was at least three orders of magnitude faster than that of a non-indexed database search protocol under a realistic computation/network environment.
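To make the role of the FM-index-based table more concrete, the following is a minimal plaintext sketch of FM-index backward search written as chained table lookups. It is not the authors' secure protocol: there is no secret sharing here, and the function names (build_fm_tables, longest_suffix_match), the per-character layout V[c][i] = C[c] + Occ(c, i), and the toy database "ACGTACGT" are assumptions made purely for illustration. The point it shows is that each search step is a lookup whose index is the result of the previous lookup, so the number of lookups depends on the query length and not on the table length.

```python
def build_fm_tables(text):
    """Build lookup vectors V[c][i] = C[c] + Occ(c, i) for text + '$'.

    Illustrative layout only (an assumption, not the paper's exact tables).
    """
    s = text + "$"  # '$' is a sentinel that sorts before A, C, G, T
    n = len(s)
    # Naive suffix array construction; acceptable for a toy example only.
    sa = sorted(range(n), key=lambda i: s[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = [s[i - 1] for i in sa]
    alphabet = sorted(set(s))
    # C[c]: number of characters in s that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += s.count(c)
    # V[c][i] = C[c] + (number of occurrences of c in bwt[:i]), for i = 0..n.
    V = {}
    for c in alphabet:
        column, occ = [C[c]], 0
        for ch in bwt:
            occ += ch == c
            column.append(C[c] + occ)
        V[c] = column
    return V, n


def longest_suffix_match(V, n, query):
    """Return (length, count) for the longest suffix of query found in the text.

    The interval [lo, hi) over the sorted suffixes is updated with one table
    lookup per bound and per query character: lo = V[c][lo], hi = V[c][hi].
    """
    lo, hi = 0, n
    matched = 0
    for c in reversed(query):
        if c not in V:
            break
        new_lo, new_hi = V[c][lo], V[c][hi]
        if new_lo >= new_hi:  # empty interval: the longer suffix does not occur
            break
        lo, hi = new_lo, new_hi
        matched += 1
    return matched, hi - lo


V, n = build_fm_tables("ACGTACGT")
print(longest_suffix_match(V, n, "ACG"))    # (3, 2): "ACG" occurs twice
print(longest_suffix_match(V, n, "TTACG"))  # (4, 1): only "TACG" matches
```

In this plaintext version the lookups read V directly, so both the index and the table are visible to whoever runs the search. A secure variant would have to hide the query, the intermediate positions, and the final result, which is the part this sketch deliberately leaves out.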