An Efficient Weighted Biclustering Algorithm for Gene Expression Data
Authors: Y. Jia, Yidong Li, Weihua Liu, Hai-rong Dong
Venue: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Published: 2016-12-01
DOI: 10.1109/PDCAT.2016.078 (https://doi.org/10.1109/PDCAT.2016.078)
Citations: 2
Abstract
Microarrays are one of the latest breakthroughs in experimental molecular biology and already provide huge amounts of valuable gene expression data. Biclustering algorithms were introduced to capture the coherence of a subset of genes across a subset of conditions. In this paper, we present MIWB, an algorithm for finding biclusters in gene expression data. MIWB uses weighted mutual information as its similarity measure, which can detect both complex linear and nonlinear relationships between genes. The algorithm first uses the weighted mutual information to construct the seed gene set of each bicluster. It then calculates the probability that each gene belongs to each bicluster and completes the initial partition of the gene set using a given threshold. Next, by optimizing the objective function, it updates the weights and selects the condition set. Finally, by repartitioning the entire dataset and further optimizing the biclusters, it obtains the final biclusters. We evaluated the algorithm on a yeast gene expression dataset, and the experimental results show that MIWB can generate large-capacity biclusters with lower mean squared residue.
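The abstract reports mean squared residue as its quality measure but does not define it. The sketch below assumes the standard definition commonly used in the biclustering literature (the average of the squared residues a_ij - a_iJ - a_Ij + a_IJ over the bicluster, where a_iJ, a_Ij and a_IJ are the row, column and overall means). The function name `mean_squared_residue` and the toy matrix are illustrative and do not come from the paper, and this is not the MIWB algorithm itself.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster formed by `rows` x `cols`.

    Assumes the standard definition: the mean over the submatrix of
    (a_ij - row_mean_i - col_mean_j + overall_mean) ** 2.
    """
    n_rows, n_cols = len(rows), len(cols)
    # Extract the submatrix selected by the gene (row) and condition (column) sets.
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix (hypothetical values): genes are rows, conditions are columns.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [5.0, 1.0, 8.0, 2.0],
]

# The first three genes differ only by a constant shift across the first three conditions.
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
# The full matrix has no such additive pattern.
whole = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2, 3])
print(coherent, whole)
```

A bicluster whose rows differ only by an additive shift scores (up to floating-point rounding) zero, so a lower value indicates a more coherent bicluster. This is why lower mean squared residue at a comparable bicluster size is read as a better result.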