M. Uddin, Mohammad Khairul Islam, Md. Rakib Hassan, Aysha Siddika Ratna, Farah Jahan
2022 25th International Conference on Computer and Information Technology (ICCIT). Published 2022-12-17. DOI: 10.1109/ICCIT57492.2022.10055778 (https://doi.org/10.1109/ICCIT57492.2022.10055778)
A novel part-wise template matching technique for DNA sequence similarity identification
The amount of DNA data is growing exponentially, driven by applications such as gene therapy, development of new varieties, and tracking of evolutionary history. Alignment-free (AF) algorithms based on chaos, k-mer counts, histograms, and deep learning are now widely used for DNA sequence analysis. However, these methods suffer from high time complexity, high memory consumption, or a low precision rate, so a more efficient and accurate solution is needed. In this research, a novel similarity feature vector is extracted using part-wise template matching, and a phylogenetic tree is generated from this vector. The method is tested on two benchmark and four standard datasets and compared with recent studies. It achieves 100% accuracy, consumes 10 to 70 times less memory than the compared studies, and achieves top-rank benchmark results. Moreover, its running time is very close to that of the best existing methods. The method can therefore be used in real-time industrial scenarios with a high level of reliability.
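For readers unfamiliar with alignment-free comparison, the following is a minimal sketch of the k-mer count family that the abstract names as an existing approach. It is not the part-wise template matching method of this paper, whose details the abstract does not give. The function names, the choice of k, the Euclidean distance, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Hypothetical helper: k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(seqs, k=3):
    """Build a pairwise distance matrix from a {name: sequence} mapping."""
    names = list(seqs)
    profiles = {name: kmer_profile(seqs[name], k) for name in names}
    matrix = [[euclidean(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = {
        "seq_a": "ACGTACGTAC",
        "seq_b": "ACGTACGTAA",
        "seq_c": "TTTTGGGGCC",
    }
    names, matrix = distance_matrix(toy, k=2)
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

A distance matrix of this kind is the usual input to a tree-building step such as UPGMA or neighbor joining. Which tree construction the authors use is not stated in the abstract.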