Yeast Gene Function Prediction from Different Data Sources: An Empirical Comparison
Y. Liu
Open Bioinformatics Journal, vol. 5, no. 1, pp. 69-76. Published 2011-06-09. DOI: 10.2174/1875036201105010069 (https://doi.org/10.2174/1875036201105010069)
Citations: 1
Abstract
Different data sources have been used to learn gene function. Whereas combining heterogeneous data sets to infer gene function has been widely studied, there is no empirical comparison to determine the relative effectiveness or usefulness of different types of data in terms of gene function prediction. In this paper, we report a comparative study of yeast gene function prediction using different data sources, namely microarray data, phylogenetic data, literature text data, and a combination of these three data sources. Our results showed that text data outperformed microarray data and phylogenetic data in gene function prediction (p < 0.05). The combined data led to decreased prediction performance relative to text data. In addition, we showed that feature selection did not improve the prediction performance of support vector machines.
About the journal:
The Open Bioinformatics Journal is an Open Access online journal that publishes research articles, reviews and mini-reviews, letters, clinical trial studies, and guest-edited single-topic issues in all areas of bioinformatics and computational biology. Its coverage includes biomedicine, with a focus on large-scale data acquisition, analysis, and curation; computational and statistical methods for modeling and analyzing biological data; and descriptions of new algorithms and databases. The Open Bioinformatics Journal is a peer-reviewed journal and aims to be a reliable source of current information on developments in the field. The emphasis is on publishing quality articles rapidly and making them freely available worldwide.