Predicting gene functions from multiple biological sources using novel ensemble methods.

Impact factor: 0.4 | Category: Biology, Zone 4 | JCR: Q4, Mathematical & Computational Biology | Journal: International Journal of Data Mining and Bioinformatics | Pub Date: 2015-01-01 | DOI: 10.1504/ijdmb.2015.069418

Chandan K Reddy, Mohammad S Aziz

Volume 12, issue 2, pages 184-206. Publication type: Journal Article.

Citations: 2

Abstract

The functional classification of genes plays a vital role in molecular biology. Detecting previously unknown roles of genes and their products in physiological and pathological processes is an important and challenging problem. In this work, information from several biological sources, such as comparative genome sequences, gene expression, and protein interactions, is combined to obtain robust gene function predictions. The information in such heterogeneous sources is often incomplete, so making maximum use of all the available information is itself a challenge. We propose an algorithm that improves the prediction performance of different models built on individual sources. We also develop a heterogeneous boosting framework that uses all the available information even if some sources do not provide any information about some of the genes. We demonstrate the superior performance of the proposed methods in terms of accuracy and F-measure compared to several imputation and integration schemes.
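
To make the missing-source setting concrete, here is a minimal sketch of one simple way to combine per-source predictions when some sources are silent about a gene. It is not the authors' algorithm or their boosting framework; it only illustrates the general idea of using whatever sources are available instead of imputing the gaps. The source names, weights, scores, and the function `combine_available` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical per-source scores: estimated probability that a gene has a
# given function, or None when that source has no data for the gene.
SourceScores = Dict[str, Optional[float]]


def combine_available(scores: SourceScores,
                      weights: Dict[str, float]) -> Optional[float]:
    """Weighted average over only the sources that report a score."""
    total = 0.0
    weight_sum = 0.0
    for source, score in scores.items():
        if score is None:
            continue  # source is silent about this gene; skip, do not impute
        w = weights.get(source, 1.0)  # unlisted sources default to weight 1
        total += w * score
        weight_sum += w
    if weight_sum == 0.0:
        return None  # no source covers this gene
    return total / weight_sum


# Illustrative (made-up) weights and scores.
weights = {"sequence": 1.0, "expression": 2.0, "interaction": 1.0}
gene_a = {"sequence": 0.9, "expression": 0.6, "interaction": None}
gene_b = {"sequence": None, "expression": None, "interaction": None}

score_a = combine_available(gene_a, weights)  # uses two of the three sources
score_b = combine_available(gene_b, weights)  # no coverage at all
```

In this sketch a gene covered by only some sources still gets a prediction, and a gene covered by none yields `None` instead of a guessed value. A boosting approach would learn the per-source weights from data; here they are fixed by hand.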


Source journal

International Journal of Data Mining and Bioinformatics (Biology: Mathematical & Computational Biology)

CiteScore: 1.00

Self-citation rate: 0.00%

Articles published:

Review time: >12 weeks

About the journal: Mining bioinformatics data is an emerging area at the intersection of bioinformatics and data mining. The objective of IJDMB is to facilitate collaboration between data mining researchers and bioinformaticians by presenting cutting-edge research topics and methodologies in the area of data mining for bioinformatics. This perspective acknowledges the interdisciplinary nature of research in data mining and bioinformatics and provides a unified forum for researchers, practitioners, students, and policy makers to share the latest research and developments in this fast-growing multidisciplinary research area.