GBDTSVM: Combined Support Vector Machine and Gradient Boosting Decision Tree Framework for efficient snoRNA-disease association prediction

IF 6.3 | Zone 2 (Medicine) | JCR Q1 (Biology) | Computers in Biology and Medicine | Pub Date: 2025-06-01 | Epub Date: 2025-04-26 | DOI: 10.1016/j.compbiomed.2025.110219
Ummay Maria Muna, Fahim Hafiz, Shanta Biswas, Riasat Azim
Computers in Biology and Medicine, volume 192, Article 110219. Publication type: Journal Article. Publisher page: https://www.sciencedirect.com/science/article/pii/S0010482525005700
Citations: 0

Abstract

Small nucleolar RNAs (snoRNAs) are increasingly recognized for their critical role in the pathogenesis and characterization of various human diseases. Consequently, precisely identifying snoRNA-disease associations (SDAs) is essential for understanding disease progression and advancing treatment strategies. However, conventional biological experiments are costly, time-consuming, and resource-intensive, so machine learning-based computational methods offer a promising way to mitigate these limitations. This paper proposes ‘GBDTSVM’, a novel and efficient machine learning approach that predicts snoRNA-disease associations by combining a Gradient Boosting Decision Tree (GBDT) with a Support Vector Machine (SVM). GBDTSVM uses the GBDT to extract integrated snoRNA-disease feature representations and then uses the SVM to classify candidate pairs and identify potential associations. The method further improves prediction accuracy by incorporating Gaussian integrated profile kernel similarity for both snoRNAs and diseases. In experimental evaluation, GBDTSVM outperforms state-of-the-art methods in the field, achieving an AUROC of 0.96 and an AUPRC of 0.95 on the ‘MDRF’ dataset. The model also shows superior performance on two further datasets, ‘LSGT’ and ‘PsnoD’. Additionally, a case study on the predicted snoRNA-disease associations verified the top-ranked snoRNAs across twelve prevalent diseases, further validating the efficacy of the GBDTSVM approach. These results underscore the model’s potential as a robust tool for advancing snoRNA-related disease research. Source code and datasets for the proposed framework are available at: https://github.com/mariamuna04/gbdtsvm.
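The kernel similarity mentioned in the abstract can be illustrated with a small sketch. It assumes that the abstract refers to the widely used Gaussian interaction profile (GIP) kernel, in which each entity's row of a binary association matrix is its profile and the bandwidth is normalized by the mean squared profile norm. The abstract does not give the formula, so the function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative assumptions rather than the authors' implementation.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix.

    Assumed standard form: K[i][j] = exp(-gamma * ||p_i - p_j||^2), with
    gamma = gamma_prime / mean_i(||p_i||^2). Not taken from the paper itself.
    """
    n = len(profiles)
    # Squared Euclidean norm of each interaction profile (row).
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all profiles are empty; the bandwidth is undefined")
    gamma = gamma_prime / mean_sq

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            # The kernel is symmetric, so fill both halves at once.
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * dist_sq)
    return kernel


def transpose(matrix):
    """Swap rows and columns so that diseases become the profiled entities."""
    return [list(col) for col in zip(*matrix)]


# Made-up toy data: 3 snoRNAs (rows) x 4 diseases (columns), 1 = known association.
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]

snorna_sim = gip_kernel(associations)              # 3 x 3 snoRNA similarity
disease_sim = gip_kernel(transpose(associations))  # 4 x 4 disease similarity
```

In a pipeline of the kind the abstract outlines, similarity rows like these would plausibly be concatenated into a feature vector for each snoRNA-disease pair before the GBDT and SVM stages; that wiring is also an assumption here, and the linked repository is the place to check the actual design.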

Source journal
Computers in biology and medicine (Engineering & Technology: Biomedical Engineering)
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
Journal description: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.