Establishing a GRU-GCN coordination-based prediction model for miRNA-disease associations.

BMC genomic data (IF 1.9, Q3, Genetics & Heredity). Pub Date: 2025-01-14. DOI: 10.1186/s12863-024-01293-z

Kai-Cheng Chuang, Ping-Sung Cheng, Yu-Hung Tsai, Meng-Hsiun Tsai

BMC genomic data 26(1): 4. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11734345/pdf/


Abstract

Background: MicroRNAs (miRNAs) are endogenous RNAs, 18 to 24 nucleotides long, that play critical roles in gene regulation and disease progression. Traditional wet-lab experiments provide direct evidence for miRNA-disease associations, but they are often time-consuming, and their results are complicated to analyze with current bioinformatics tools. In recent years, machine learning (ML) and deep learning (DL) techniques have become powerful tools for analyzing large-scale biological data. Hence, a model that predicts, identifies, and ranks connections between miRNAs and diseases can significantly improve the precision and efficiency of research into these relationships.

Results: In this study, we used experimentally obtained miRNA-disease association data from HMDD (the Human miRNA Disease Database) to develop a DL model for predicting miRNA-disease associations. To improve prediction accuracy, we introduced two labeling strategies, weight-based and majority-based definitions, and categorized the dataset under each. After preprocessing, a novel model combining gated recurrent units (GRU) and a graph convolutional network (GCN) was trained on the data to predict the level of miRNA-disease association. Using regression analysis and multiclass classification, we classified the associations into three groups: "upregulated", "downregulated", and "nonspecific". The coordinated GRU-GCN model achieved an area under the curve (AUC) of 0.8 on all datasets, demonstrating its efficacy in predicting potential miRNA-disease relationships.
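The abstract names the two labeling strategies but does not define them. The sketch below shows one plausible reading, in which several experimental reports for the same miRNA-disease pair are collapsed into a single label. Everything in it is an assumption made for illustration: the record layout, the example pairs, the signed weights in WEIGHTS, and the 0.5 threshold are hypothetical and are not taken from HMDD or from the paper.

```python
from collections import Counter

# Hypothetical evidence records: (miRNA, disease, reported direction).
# Names and values are illustrative only, not taken from HMDD.
RECORDS = [
    ("miR-21", "disease A", "up"),
    ("miR-21", "disease A", "up"),
    ("miR-21", "disease A", "down"),
    ("miR-155", "disease B", "up"),
    ("miR-155", "disease B", "down"),
    ("miR-34a", "disease C", "down"),
]

# Assumed signed weight per direction for the weight-based reading.
WEIGHTS = {"up": 1.0, "down": -1.0}


def group_by_pair(records):
    """Collect the reported directions for each (miRNA, disease) pair."""
    grouped = {}
    for mirna, disease, direction in records:
        grouped.setdefault((mirna, disease), []).append(direction)
    return grouped


def majority_label(directions):
    """Return the more frequent direction, or 'nonspecific' on a tie."""
    counts = Counter(directions)
    up, down = counts["up"], counts["down"]
    if up > down:
        return "upregulated"
    if down > up:
        return "downregulated"
    return "nonspecific"


def weight_score(directions):
    """Return the mean signed weight, a regression target in [-1, 1]."""
    return sum(WEIGHTS[d] for d in directions) / len(directions)


def weight_label(directions, threshold=0.5):
    """Discretize the weight score; the threshold is an assumption."""
    score = weight_score(directions)
    if score >= threshold:
        return "upregulated"
    if score <= -threshold:
        return "downregulated"
    return "nonspecific"


grouped = group_by_pair(RECORDS)
for pair, directions in grouped.items():
    print(pair, majority_label(directions),
          round(weight_score(directions), 2), weight_label(directions))
```

Under these assumptions the two strategies can disagree: a pair with two "up" reports and one "down" report is "upregulated" by majority, but its mean weight of about 0.33 falls below the assumed threshold, so the weight-based reading calls it "nonspecific". The continuous score is the kind of target a regression model could fit, while the discrete labels suit multiclass classification.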

Conclusions: By introducing innovative label-preprocessing methods, this study addressed the relationships between miRNAs and diseases and reduced the ambiguity among results from different experiments. Based on these refined label definitions, we developed a DL-based model to refine and predict miRNA-disease associations. The model offers a valuable tool for complementing traditional experimental methods and enhancing our understanding of miRNA-related disease mechanisms.


