Application of consensus string matching in the diagnosis of allelic heterogeneity involving transposition mutation

International Journal of Data Mining and Bioinformatics, vol. 13, no. 4, pp. 360-77 · Pub Date: 2015-10-01 · DOI: 10.1504/IJDMB.2015.072756 · IF 0.4 · Zone 4 (Biology) · JCR Q4 (Mathematical & Computational Biology)

F. Zohora, Mohammad Sohel Rahman

Citations: 1

Abstract

This paper proposes an algorithm that, given two input DNA sequences, detects whether a common ancestor gene sequence exists under the non-overlapping transposition metric. We consider two cases: fixed-length transposition and all-length transposition. In the fixed-length case, the algorithm runs in O(n^3) time, where n is the length of the input sequences. In the all-length case, the theoretical worst-case time complexity is proven to be O(n^4); in practice, however, the worst-case and average-case running times are found to be O(n^3) and O(n^2), respectively. The work is motivated by the diagnosis of unknown genetic diseases that show allelic heterogeneity, a situation in which a normal gene mutates in different orders, producing two different gene sequences that cause two different genetic diseases. The algorithm may also be useful in the study of breed-related heredity, to determine the genetic spread of a defective gene in a population.
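
To make the problem statement concrete, the following is a minimal brute-force sketch in Python, not the authors' algorithm. It rests on assumptions the abstract does not spell out: a transposition is taken to be a swap of two adjacent blocks of equal length k, "non-overlapping" means the swapped windows share no position, and a common ancestor is any string from which both inputs can be derived by one such set of swaps. Because an equal-length block swap undoes itself, the candidate ancestors of a sequence are exactly the strings reachable from it by the same kind of swaps, so the sketch intersects the two reachable sets. The names `variants` and `common_ancestors` are hypothetical, and the enumeration is exponential in the worst case, unlike the polynomial bounds reported in the abstract.

```python
from functools import lru_cache


def variants(seq, lengths):
    """Return every string reachable from `seq` by a set of non-overlapping
    swaps of two adjacent equal-length blocks (assumed transposition model).
    Block lengths are drawn from `lengths`."""
    lengths = tuple(lengths)

    @lru_cache(maxsize=None)
    def build(start):
        # All possible rewritings of the suffix seq[start:].
        if start >= len(seq):
            return frozenset({""})
        # Option 1: leave the character at `start` untouched.
        out = {seq[start] + rest for rest in build(start + 1)}
        # Option 2: swap the two adjacent blocks of length k that begin here,
        # then continue after the swapped window so windows never overlap.
        for k in lengths:
            end = start + 2 * k
            if end <= len(seq):
                swapped = seq[start + k:end] + seq[start:start + k]
                out.update(swapped + rest for rest in build(end))
        return frozenset(out)

    return build(0)


def common_ancestors(x, y, lengths=None):
    """Return the candidate ancestors shared by `x` and `y`.

    lengths=[k] models the fixed-length case; lengths=None allows every
    block length (the all-length case, under the assumptions above)."""
    if len(x) != len(y):
        return set()  # equal-length block swaps preserve sequence length
    if lengths is None:
        lengths = range(1, len(x) // 2 + 1)
    lengths = tuple(lengths)
    return set(variants(x, lengths) & variants(y, lengths))
```

For example, `common_ancestors("CAGT", "ACTG", lengths=[1])` contains `"ACGT"`: swapping the first two bases of `ACGT` gives `CAGT`, and swapping the last two gives `ACTG`. An empty result means no common ancestor exists under this simplified model.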

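Source journal

International Journal of Data Mining and Bioinformatics (Biology: Mathematical & Computational Biology)

CiteScore: 1.00

Self-citation rate: 0.00%

Articles published: (not listed)

Review time: >12 weeks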

Journal description: Mining bioinformatics data is an emerging area at the intersection of bioinformatics and data mining. The objective of IJDMB is to facilitate collaboration between data mining researchers and bioinformaticians by presenting cutting-edge research topics and methodologies in the area of data mining for bioinformatics. This perspective acknowledges the interdisciplinary nature of research in data mining and bioinformatics and provides a unified forum for researchers, practitioners, students, and policy makers to share the latest research and developments in this fast-growing multidisciplinary research area.