Community detection in biological networks is crucial for grouping genes with similar functions and for discovering biomarkers. However, the communities found by many existing methods are often loosely connected, both topologically and functionally, so community detection remains an open problem. In this paper, a multi-objective multi-stage genetic algorithm, named MOMSGA, is proposed to extract communities in biological networks. First, pre-reduction and boundary correction strategies are introduced to improve the scalability of MOMSGA on large-scale networks. Second, a genetic algorithm is adapted to guide the search: two improved objective functions are designed to optimize topological and functional connections simultaneously, so that information about relevant biological processes is extracted accurately, and the population initialization strategy and mutation operator are also tailored. Third, a multi-stage strategy is proposed that divides the evolutionary process into distinct stages based on the characteristics of the population at each stage, and applies different selection and update strategies in each stage to improve diversity. The two notable innovations of MOMSGA are its multi-objective and multi-stage strategies. Experiments on 11 synthetic networks and 5 real-world networks show that MOMSGA outperforms four advanced methods. Furthermore, MOMSGA is applied to four gene expression datasets for biomarker identification. The results consistently show that MOMSGA outperforms other methods in classification performance across six indicators, particularly on the pheochromocytoma dataset, where the AUC reached 0.86, 2.9% to 10.3% higher than other methods. Moreover, GO and KEGG enrichment analysis shows that the identified communities are associated with the corresponding diseases.
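To illustrate what optimizing two objectives simultaneously means in practice, the following is a minimal sketch of Pareto dominance, the generic comparison used by multi-objective genetic algorithms. It is not the MOMSGA implementation: the abstract does not define the two objective functions, so the score pairs, the names `dominates` and `pareto_front`, and the assumption that both objectives are maximized are all hypothetical.

```python
from typing import List, Tuple

# Assumption: each candidate partition is summarized by two scores,
# (topological score, functional score), and both are to be maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(scores: List[Score]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores for five candidate partitions (illustrative values only).
candidates = [(0.40, 0.70), (0.55, 0.60), (0.50, 0.50), (0.70, 0.30), (0.30, 0.65)]
front = pareto_front(candidates)  # indices of the non-dominated candidates
```

A candidate stays on the front only if improving one of its scores would require giving up some of the other, which is why a multi-objective search returns a set of trade-off solutions rather than a single best one.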
