MFMS: maximal frequent module set mining from multiple human gene expression data sets

Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference) Pub Date : 2013-08-11 DOI:10.1145/2500863.2500869

Saeed Salem, C. Ozcaglar

Citations: 11

Abstract

Advances in genomic technologies have allowed vast amounts of gene expression data to be collected. Protein functional annotation and biological module discovery based on a single gene expression data set suffer from spurious coexpression. Recent work has focused on integrating multiple independent gene expression data sets. In this paper, we propose a two-step approach for mining maximal frequent collections of highly connected modules from coexpression graphs. We first mine maximal frequent edge-sets and then extract highly connected subgraphs from the edge-induced subgraphs. Experimental results on the collection of modules mined from 52 human gene expression data sets show that coexpression links that occur together in a significant number of experiments have a modular topological structure. Moreover, Gene Ontology (GO) enrichment analysis shows that the proposed approach discovers biologically significant frequent collections of modules.
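The two-step idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `maximal_frequent_edge_sets` and `dense_components`, the brute-force enumeration, and the density-based filter (used here as a simple stand-in for "highly connected subgraph" extraction) are all assumptions made for illustration, and the gene labels are invented.

```python
from itertools import combinations


def maximal_frequent_edge_sets(graphs, min_support):
    """Step 1 (toy version): brute-force maximal frequent edge-sets.

    Exponential in the number of frequent edges; only for tiny inputs.
    """
    # Normalise undirected edges so (a, b) and (b, a) compare equal.
    graphs = [{tuple(sorted(e)) for e in g} for g in graphs]
    # An edge-set can only be frequent if each of its edges is frequent.
    candidates = sorted(
        e for e in set().union(*graphs)
        if sum(e in g for g in graphs) >= min_support
    )
    maximal = []
    # Visit larger sets first, so any frequent set that is not contained
    # in an already-found set must itself be maximal.
    for k in range(len(candidates), 0, -1):
        for combo in combinations(candidates, k):
            s = frozenset(combo)
            support = sum(s <= g for g in graphs)
            if support >= min_support and not any(s < m for m in maximal):
                maximal.append(s)
    return maximal


def dense_components(edges, min_nodes=3, min_density=0.6):
    """Step 2 (assumed stand-in): keep connected components of the
    edge-induced subgraph whose edge density meets a threshold.
    Assumes no self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, modules = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        if len(comp) < max(min_nodes, 2):
            continue
        n_edges = sum(len(adj[n]) for n in comp) // 2
        density = 2 * n_edges / (len(comp) * (len(comp) - 1))
        if density >= min_density:
            modules.append(frozenset(comp))
    return modules


# Three invented coexpression graphs over hypothetical genes A..F.
graphs = [
    {("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")},
    {("A", "B"), ("B", "C"), ("A", "C"), ("E", "F")},
    {("A", "B"), ("B", "C"), ("A", "C"), ("D", "E")},
]
edge_sets = maximal_frequent_edge_sets(graphs, min_support=2)
modules = [dense_components(es) for es in edge_sets]
```

In this toy input, the triangle A-B-C and the edge D-E co-occur in two of the three graphs, so they form one maximal frequent edge-set; the second step then keeps only the triangle as a module, since the D-E component is too small. A real analysis over dozens of data sets would need a proper maximal frequent itemset miner rather than exhaustive enumeration.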


