An intelligent model for prediction of abiotic stress-responsive microRNAs in plants using statistical moments based features and ensemble approaches

IF 4.2 | Zone 3, Biology | Q1 Biochemical Research Methods | Methods | Pub Date: 2024-05-18 | DOI: 10.1016/j.ymeth.2024.05.008
Ansar Naseem, Yaser Daanial Khan
Methods, Volume 228, Pages 65-79. Publication type: Journal Article. https://www.sciencedirect.com/science/article/pii/S1046202324001245
Citations: 0

Abstract

This study proposes an intelligent model for predicting abiotic stress-responsive microRNAs in plants. MicroRNAs (miRNAs) are short RNA molecules that regulate stress responses in genes. Experimental methods are costly and time-consuming compared with in-silico prediction. Addressing this gap, the study seeks to develop an efficient computational model for predicting plant stress response. Two benchmark datasets, one for miRNA and one for pre-miRNA, were acquired for this study. Four ensemble approaches were employed: bagging, boosting, stacking, and blending. The classifiers used were Random Forest (RF), Extra Trees (ET), AdaBoost (ADB), Light Gradient Boosting Machine (LGBM), and Support Vector Machine (SVM). Stacking and blending used all of these classifiers as base learners and Logistic Regression (LR) as the meta-classifier. Four types of testing were used: independent set, self-consistency, cross-validation with 5 and 10 folds, and jackknife. The evaluation metrics were accuracy, specificity, sensitivity, Matthews correlation coefficient (MCC), and AUC. On independent set testing, the proposed methodology outperformed the existing state-of-the-art study on both datasets. The SVM-based approach achieved an accuracy of 0.659 on the miRNA dataset, which is better than the previous study. On the pre-miRNA dataset, the ET classifier achieved an accuracy of 0.67, surpassing the existing benchmark study. The proposed method can be used in future research to predict abiotic stress responses in plants.
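As an illustration of the threshold-based evaluation metrics named in the abstract, the following minimal sketch computes accuracy, sensitivity, specificity, and MCC from binary predictions using their standard definitions. It is not the authors' code: the function names, the label encoding (1 = stress-responsive, 0 = not), and the example labels are assumptions made for illustration only. AUC is omitted because it requires classifier scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and MCC as a dict.

    Any ratio with a zero denominator is reported as 0.0.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical labels, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

MCC is worth reporting alongside accuracy because it uses all four confusion-matrix cells, so it stays informative when the positive and negative classes are unbalanced.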

Source journal: Methods
Category: Biology, Biochemical Research Methods
CiteScore: 9.80
Self-citation rate: 2.10%
Articles published: 222
Review time: 11.3 weeks
Journal description: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches, with emphasis on clear, detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.
Latest articles in this journal:
A roadmap to cysteine specific labeling of membrane proteins for single-molecule photobleaching studies.
In silico identification of Histone Deacetylase inhibitors using Streamlined Masked Transformer-based Pretrained features.
Robust feature learning using contractive autoencoders for multi-omics clustering in cancer subtyping
Optimizing Retinal Imaging: Evaluation of ultrasmall TiO2 nanoparticle-fluorescein conjugates for improved Fundus Fluorescein Angiography
Ab-Amy 2.0: Predicting light chain amyloidogenic risk of therapeutic antibodies based on antibody language model