A comparison between hierarchical clustering and community detection method in the collection of gene targets for molecular identification of pathogenic fungi
Authors: I. Thapa, S. Bhowmick, D. Bastola
Published in: 2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2012-10-04
DOI: 10.1109/BIBMW.2012.6470234 (https://doi.org/10.1109/BIBMW.2012.6470234)
Abstract
Ribosomal RNA sequence is a popular primary molecular target in the diagnosis of many fungal and bacterial infections. More recently, a number of other molecular targets, such as 'cytochrome b', 'rpoB', and 'actin', have become available in public databases such as GenBank. These sequences could be better alternatives to the popular ribosomal RNA as molecular targets. However, existing computational approaches do not provide a convenient method to collect these sequences and make them available for the development of new alternative sequence-based diagnostics, which are critical for early detection of infectious agents like fungi. The long-term goal of this study is to develop a computational tool for the rapid identification of infectious agents in biological samples. In the present study, we focus on pre-processing of sequence data in public databases and compare a number of clustering approaches to classify currently available DNA sequences into different target genes. We evaluate the correctness of these methods based on the target classification of seven different species of Zygomycetes. Use of a clustering comparison metric has shown that community detection and hierarchical clustering methods perform on par, both with high accuracy.
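The abstract does not name the clustering comparison metric it used. As an illustration only, the sketch below assumes the adjusted Rand index (ARI), one common way to score how well a clustering agrees with a reference partition. The sequence labels and both cluster assignments are made-up toy values, not data or results from the study, and the names `adjusted_rand_index`, `reference`, `hierarchical`, and `community` are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items.

    1.0 means the two partitions are identical (up to label renaming);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Count item pairs that fall together in both, in A only, and in B only.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (pairs_both - expected) / (maximum - expected)


# Toy example (assumed values): eight sequences with known target genes.
reference = ["rRNA", "rRNA", "rRNA", "cytb", "cytb", "actin", "actin", "actin"]
# Hypothetical output of a hierarchical clustering cut into three clusters.
hierarchical = [0, 0, 0, 1, 1, 2, 2, 2]
# Hypothetical output of a community detection run with one sequence misplaced.
community = [0, 0, 1, 1, 1, 2, 2, 2]

ari_hierarchical = adjusted_rand_index(reference, hierarchical)
ari_community = adjusted_rand_index(reference, community)
print(f"hierarchical vs reference: {ari_hierarchical:.3f}")
print(f"community vs reference:    {ari_community:.3f}")
```

Scoring each method against the same reference labels puts both on a common scale, which is the kind of comparison the abstract describes. Other metrics, such as normalized mutual information, could be substituted without changing the overall workflow.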