P Xing, C Kulikowski, I Muchnik, I Dubchak, D M Wolf, S Spengler, M Zorn
Analysis of ribosomal RNA sequences by combinatorial clustering.
Proceedings. International Conference on Intelligent Systems for Molecular Biology. Journal Article. Publication date: 1999-01-01.
Abstract
We present an analysis of multi-aligned eukaryotic and prokaryotic small subunit rRNA sequences using a novel segmentation and clustering procedure capable of extracting subsets of sequences that share common sequence features. This procedure consists of: i) segmentation of aligned sequences using a dynamic programming procedure, and subsequent identification of likely conserved segments; ii) for each putative conserved segment, extraction of a locally homogeneous cluster using a novel polynomial procedure; and iii) intersection of the clusters associated with each conserved segment. Aside from their utility in processing large gap-filled multi-alignments, these algorithms can be applied to a broad spectrum of rRNA analysis functions such as subalignment, phylogenetic subtree extraction and construction, and organism tree-placement, and can serve as a framework to organize sequence data in an efficient and easily searchable manner. The sequence classification we obtained using the method presented here is remarkably consistent with the independently constructed eukaryotic phylogenetic tree.
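The three steps can be illustrated with a minimal Python sketch. The abstract does not give the conservation score, the dynamic programming objective, or the polynomial clustering procedure, so everything below is an assumption made for illustration: conservation is taken as the majority-symbol fraction per column, segmentation is a least-squares split of that score into k contiguous segments, and cluster extraction is a simple greedy peeling heuristic standing in for the authors' procedure. All function and parameter names are hypothetical.

```python
from collections import Counter
from functools import reduce


def column_conservation(alignment):
    """Fraction of sequences carrying the most common non-gap symbol, per column."""
    n = len(alignment)
    scores = []
    for col in zip(*alignment):
        counts = Counter(c for c in col if c != "-")
        scores.append(max(counts.values()) / n if counts else 0.0)
    return scores


def segment(scores, k):
    """Step i (assumed objective): DP split into k contiguous segments
    minimising the within-segment squared error of the conservation score."""
    m = len(scores)
    pre, pre2 = [0.0], [0.0]
    for s in scores:
        pre.append(pre[-1] + s)
        pre2.append(pre2[-1] + s * s)

    def cost(i, j):  # squared error of scores[i:j] around its mean
        total = pre[j] - pre[i]
        return (pre2[j] - pre2[i]) - total * total / (j - i)

    inf = float("inf")
    best = [[inf] * (m + 1) for _ in range(k + 1)]
    back = [[0] * (m + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for parts in range(1, k + 1):
        for j in range(parts, m + 1):
            for i in range(parts - 1, j):
                c = best[parts - 1][i] + cost(i, j)
                if c < best[parts][j]:
                    best[parts][j], back[parts][j] = c, i
    bounds, j = [], m
    for parts in range(k, 0, -1):  # walk the back-pointers from the end
        i = back[parts][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


def identity(a, b):
    """Fraction of aligned positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def homogeneous_cluster(alignment, start, end, min_identity=0.8):
    """Step ii (stand-in heuristic, not the authors' procedure): drop the least
    typical sequence until every member's mean identity to the others is high."""
    sub = {idx: seq[start:end] for idx, seq in enumerate(alignment)}
    members = set(sub)
    while len(members) > 1:
        mean_id = {
            i: sum(identity(sub[i], sub[j]) for j in members if j != i)
            / (len(members) - 1)
            for i in members
        }
        worst = min(mean_id, key=lambda i: (mean_id[i], i))
        if mean_id[worst] >= min_identity:
            break
        members.remove(worst)
    return members


def shared_cluster(alignment, k=3, conserved_cutoff=0.7, min_identity=0.8):
    """Step iii: intersect the clusters found on each conserved segment."""
    scores = column_conservation(alignment)
    clusters = [
        homogeneous_cluster(alignment, start, end, min_identity)
        for start, end in segment(scores, k)
        if sum(scores[start:end]) / (end - start) >= conserved_cutoff
    ]
    return reduce(set.intersection, clusters) if clusters else set()


# Toy alignment (invented data): two conserved blocks around a variable block.
toy = [
    "AAAAACGTGGGG",
    "AAAACGTAGGGG",
    "AAAAGTACGGGG",
    "AAAATACGTTTT",
    "CCCC----GGGG",
]
print(sorted(shared_cluster(toy)))  # [0, 1, 2]
```

In the toy alignment, sequence 4 is the outlier in the first conserved block and sequence 3 in the last, so the intersection keeps only sequences 0 to 2. The number of segments `k` and both thresholds are free parameters of this sketch; the abstract does not say how the original method chooses them.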