Mohamed El-Dirany, Forrest Wang, J. Furst, J. Rogers, D. Raicu
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), published 2016-12-01. DOI: https://doi.org/10.1109/BIBM.2016.7822676
Compression-based distance methods as an alternative to statistical methods for constructing phylogenetic trees
Distance-based methods for constructing phylogenetic trees have long been considered inconsistent and inferior to the dominant statistical methods. However, compression methods designed specifically for DNA could make distance-based methods more effective. To demonstrate the validity of distance-based methods that use current DNA compression algorithms such as MFCompress, we applied such a method to datasets of closely related fish species from the suborder Labroidei and to strains of Ebola. In both cases, we produced trees that are very similar or identical to published trees built with statistical methods. This suggests that distance-based methods can perform comparably to statistical methods while requiring less pre-processing of the original DNA sequences and fewer system resources. The results also underscore the importance of calculating species distance accurately: one DNA-specific compression algorithm, MFCompress, consistently and convincingly outperformed other popular general-purpose compression algorithms.
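The abstract does not state how the compression-based distance is computed. As a rough illustration only, the sketch below uses the normalized compression distance (NCD), a widely used formulation, which may or may not be the one the authors chose. It also assumes zlib as the compressor purely so the example runs with the Python standard library; MFCompress is a separate command-line tool and is not invoked here, and the abstract reports that general-purpose compressors performed worse than MFCompress. The function names and the toy sequences are hypothetical.

```python
import random
import zlib
from typing import Dict, List, Tuple


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression (stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(seqs: Dict[str, bytes]) -> Tuple[List[str], List[List[float]]]:
    """Pairwise NCD matrix over named sequences.

    NCD is only approximately symmetric, so each pair is computed once and
    mirrored; the diagonal is set to 0.0 by convention (an assumption here).
    """
    names = list(seqs)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = ncd(seqs[names[i]], seqs[names[j]])
            matrix[i][j] = matrix[j][i] = d
    return names, matrix


def toy_sequences(seed: int = 0, length: int = 2000) -> Dict[str, bytes]:
    """Hypothetical toy data: a random sequence, a lightly mutated copy,
    and an unrelated random sequence. Not real genomic data."""
    rng = random.Random(seed)
    base = [rng.choice("ACGT") for _ in range(length)]
    near = base.copy()
    for pos in rng.sample(range(length), 20):
        near[pos] = rng.choice("ACGT")
    far = [rng.choice("ACGT") for _ in range(length)]
    return {
        "base": "".join(base).encode(),
        "near": "".join(near).encode(),
        "far": "".join(far).encode(),
    }


if __name__ == "__main__":
    names, matrix = distance_matrix(toy_sequences())
    for name, row in zip(names, matrix):
        print(name, [round(d, 3) for d in row])
```

A matrix like this would then be passed to a distance-based tree-building step such as neighbor joining or UPGMA; which one the paper uses is not stated in the abstract. Swapping in a DNA-specific compressor would only require replacing `compressed_size`.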