{"title":"MDFGNN-SMMA:基于多源数据融合和图神经网络的潜在小分子- mirna关联预测。","authors":"Jianwei Li, Xukun Zhang, Bing Li, Ziyu Li, Zhenzhen Chen","doi":"10.1186/s12859-025-06040-4","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>MicroRNAs (miRNAs) are pivotal in the initiation and progression of complex human diseases and have been identified as targets for small molecule (SM) drugs. However, the expensive and time-intensive characteristics of conventional experimental techniques for identifying SM-miRNA associations highlight the necessity for efficient computational methodologies in this field.</p><p><strong>Results: </strong>In this study, we proposed a deep learning method called Multi-source Data Fusion and Graph Neural Networks for Small Molecule-MiRNA Association (MDFGNN-SMMA) to predict potential SM-miRNA associations. Firstly, MDFGNN-SMMA extracted features of Atom Pairs fingerprints and Molecular ACCess System fingerprints to derive fusion feature vectors for small molecules (SMs). The K-mer features were employed to generate the initial feature vectors for miRNAs. Secondly, cosine similarity measures were computed to construct the adjacency matrices for SMs and miRNAs, respectively. Thirdly, these feature vectors and adjacency matrices were input into a model comprising GAT and GraphSAGE, which were utilized to generate the final feature vectors for SMs and miRNAs. Finally, the averaged final feature vectors were utilized as input for a multilayer perceptron to predict the associations between SMs and miRNAs.</p><p><strong>Conclusions: </strong>The performance of MDFGNN-SMMA was assessed using 10-fold cross-validation, demonstrating superior compared to the four state-of-the-art models in terms of both AUC and AUPR. Moreover, the experimental results of an independent test set confirmed the model's generalization capability. Additionally, the efficacy of MDFGNN-SMMA was substantiated through three case studies. The findings indicated that among the top 50 predicted miRNAs associated with Cisplatin, 5-Fluorouracil, and Doxorubicin, 42, 36, and 36 miRNAs, respectively, were corroborated by existing literature and the RNAInter database.</p>","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"13"},"PeriodicalIF":2.9000,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11730471/pdf/","citationCount":"0","resultStr":"{\"title\":\"MDFGNN-SMMA: prediction of potential small molecule-miRNA associations based on multi-source data fusion and graph neural networks.\",\"authors\":\"Jianwei Li, Xukun Zhang, Bing Li, Ziyu Li, Zhenzhen Chen\",\"doi\":\"10.1186/s12859-025-06040-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>MicroRNAs (miRNAs) are pivotal in the initiation and progression of complex human diseases and have been identified as targets for small molecule (SM) drugs. However, the expensive and time-intensive characteristics of conventional experimental techniques for identifying SM-miRNA associations highlight the necessity for efficient computational methodologies in this field.</p><p><strong>Results: </strong>In this study, we proposed a deep learning method called Multi-source Data Fusion and Graph Neural Networks for Small Molecule-MiRNA Association (MDFGNN-SMMA) to predict potential SM-miRNA associations. Firstly, MDFGNN-SMMA extracted features of Atom Pairs fingerprints and Molecular ACCess System fingerprints to derive fusion feature vectors for small molecules (SMs). The K-mer features were employed to generate the initial feature vectors for miRNAs. Secondly, cosine similarity measures were computed to construct the adjacency matrices for SMs and miRNAs, respectively. Thirdly, these feature vectors and adjacency matrices were input into a model comprising GAT and GraphSAGE, which were utilized to generate the final feature vectors for SMs and miRNAs. Finally, the averaged final feature vectors were utilized as input for a multilayer perceptron to predict the associations between SMs and miRNAs.</p><p><strong>Conclusions: </strong>The performance of MDFGNN-SMMA was assessed using 10-fold cross-validation, demonstrating superior compared to the four state-of-the-art models in terms of both AUC and AUPR. Moreover, the experimental results of an independent test set confirmed the model's generalization capability. Additionally, the efficacy of MDFGNN-SMMA was substantiated through three case studies. The findings indicated that among the top 50 predicted miRNAs associated with Cisplatin, 5-Fluorouracil, and Doxorubicin, 42, 36, and 36 miRNAs, respectively, were corroborated by existing literature and the RNAInter database.</p>\",\"PeriodicalId\":8958,\"journal\":{\"name\":\"BMC Bioinformatics\",\"volume\":\"26 1\",\"pages\":\"13\"},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2025-01-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11730471/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMC Bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s12859-025-06040-4\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06040-4","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
MDFGNN-SMMA: prediction of potential small molecule-miRNA associations based on multi-source data fusion and graph neural networks.
Background: MicroRNAs (miRNAs) are pivotal in the initiation and progression of complex human diseases and have been identified as targets for small molecule (SM) drugs. However, the expensive and time-intensive characteristics of conventional experimental techniques for identifying SM-miRNA associations highlight the necessity for efficient computational methodologies in this field.
Results: In this study, we proposed a deep learning method called Multi-source Data Fusion and Graph Neural Networks for Small Molecule-MiRNA Association (MDFGNN-SMMA) to predict potential SM-miRNA associations. Firstly, MDFGNN-SMMA extracted features of Atom Pairs fingerprints and Molecular ACCess System fingerprints to derive fusion feature vectors for small molecules (SMs). The K-mer features were employed to generate the initial feature vectors for miRNAs. Secondly, cosine similarity measures were computed to construct the adjacency matrices for SMs and miRNAs, respectively. Thirdly, these feature vectors and adjacency matrices were input into a model comprising GAT and GraphSAGE, which were utilized to generate the final feature vectors for SMs and miRNAs. Finally, the averaged final feature vectors were utilized as input for a multilayer perceptron to predict the associations between SMs and miRNAs.
Conclusions: The performance of MDFGNN-SMMA was assessed using 10-fold cross-validation, demonstrating superior compared to the four state-of-the-art models in terms of both AUC and AUPR. Moreover, the experimental results of an independent test set confirmed the model's generalization capability. Additionally, the efficacy of MDFGNN-SMMA was substantiated through three case studies. The findings indicated that among the top 50 predicted miRNAs associated with Cisplatin, 5-Fluorouracil, and Doxorubicin, 42, 36, and 36 miRNAs, respectively, were corroborated by existing literature and the RNAInter database.
期刊介绍:
BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology.
BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.