A Novel Measurement of Sequence Dissimilarity and Its Application to Phylogeny
Xiao-hui Niu, Nana Li, Feng Shi, Xue-yan Li
2008 Fourth International Conference on Natural Computation, pp. 231-234. Published 2008-10-18.
DOI: 10.1109/ICNC.2008.299 (https://doi.org/10.1109/ICNC.2008.299)
Citation count: 0
Abstract
We present a new computational approach to measuring the distance between two biological sequences. A biological sequence is modeled as a Markov chain with 20 states, one per amino acid. Its stochastic state-transition matrix is computed and serves as the quantitative index of the sequence. The Kullback-Leibler discrimination information is used as a diversity indicator to measure the dissimilarity between each pair of corresponding rows in the two state-transition matrices. The distance between the two sequences is defined as the average of these row-wise values, weighted by the occurrence probability of each amino acid. We illustrate the method by reconstructing a phylogeny of the Eutherian orders from concatenated H-stranded amino acid sequences. The resulting phylogeny is consistent with the commonly accepted one for the Eutherians.
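The pipeline described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how zero transition counts are handled, whether the Kullback-Leibler term is symmetrized, or whose residue frequencies supply the weights. The additive pseudocount, the symmetrized divergence, and the averaged weights below are therefore assumptions, as are all function names.

```python
import math
from collections import Counter

# The 20 standard amino acids, one Markov state each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def transition_matrix(seq, pseudocount=1.0):
    """Row-stochastic 20x20 matrix of first-order transition probabilities.

    The pseudocount is an assumption (not from the abstract); it keeps every
    entry positive so the KL divergence below is always finite.
    """
    n = len(AMINO_ACIDS)
    counts = [[pseudocount] * n for _ in range(n)]
    for a, b in zip(seq, seq[1:]):
        if a in INDEX and b in INDEX:  # skip pairs with non-standard symbols
            counts[INDEX[a]][INDEX[b]] += 1
    return [[c / sum(row) for c in row] for row in counts]


def residue_frequencies(seq):
    """Occurrence probability of each amino acid in the sequence."""
    counts = Counter(ch for ch in seq if ch in INDEX)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def markov_kl_distance(seq_a, seq_b):
    """Frequency-weighted average of row-wise KL terms between two sequences.

    Assumptions: the KL term is symmetrized, and the weight for each amino
    acid is the mean of its frequencies in the two sequences.
    """
    P, Q = transition_matrix(seq_a), transition_matrix(seq_b)
    freq_a, freq_b = residue_frequencies(seq_a), residue_frequencies(seq_b)
    total = 0.0
    for i in range(len(AMINO_ACIDS)):
        row_term = kl_divergence(P[i], Q[i]) + kl_divergence(Q[i], P[i])
        weight = 0.5 * (freq_a[i] + freq_b[i])
        total += weight * row_term
    return total
```

Calling `markov_kl_distance` on every pair of sequences yields a distance matrix that a distance-based tree-building method could consume; the abstract does not state which tree-building method the authors used.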