pmiRScan: a LightGBM based method for prediction of animal pre-miRNAs

IF 3.9 | Zone 4 (Biology) | JCR Q1, GENETICS & HEREDITY | Functional & Integrative Genomics, vol. 25, no. 1 | Pub Date: 2025-01-09 | DOI: 10.1007/s10142-025-01527-y
Amrit Venkatesan, Jolly Basak, Ranjit Prasad Bahadur
Citations: 0

Abstract

MicroRNAs (miRNAs) are short endogenous non-coding RNAs that play a significant role in post-transcriptional gene regulation. Identifying new animal precursor miRNAs (pre-miRNAs) and miRNAs is crucial to understanding the role of miRNAs in various biological processes, including the development of diseases. The present study develops a Light Gradient Boost (LGB) based method for classifying animal pre-miRNAs using sequence and secondary structural features. Distinct k-mer repeat signatures with a length of three nucleotides are identified in various pre-miRNA families. Of the nine classifiers trained and tested in the present study, LGB performs best overall, with an AUROC of 0.959. Compared with existing methods, our method ‘pmiRScan’ performs better overall, with an accuracy of 0.93, sensitivity of 0.86, specificity of 0.95 and F-score of 0.82. Moreover, pmiRScan effectively classifies pre-miRNAs from four distinct taxonomic groups: mammals, nematodes, molluscs and arthropods. We used the classifier to predict pre-miRNAs genome-wide in human and found a total of 313 pre-miRNA candidates. From these predicted pre-miRNAs, 180 potential mature miRNAs belonging to 60 distinct miRNA families were extracted, of which 128 are novel and not reported in miRBase. These discoveries may enhance our current understanding of miRNAs and their targets in human. pmiRScan is freely available at http://www.csb.iitkgp.ac.in/applications/pmiRScan/index.php.
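
The abstract mentions sequence features based on k-mers of length three and reports accuracy, sensitivity, specificity and F-score. The following is a minimal Python sketch of both ideas, not the authors' implementation: the function names `kmer_frequencies` and `classification_metrics` are hypothetical, plain 3-mer frequencies are assumed as a stand-in for the paper's sequence features (its actual feature set and repeat-signature analysis are not described here), and the F-score is assumed to be the F1 score.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA input is converted below (assumption)


def kmer_frequencies(seq, k=3):
    """Return the relative frequency of every possible k-mer in `seq`.

    Windows containing characters outside ACGU (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    # Normalise so that the features sum to 1 for any non-trivial sequence.
    return {kmer: (n / total if total else 0.0) for kmer, n in counts.items()}


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on true pre-miRNAs
    specificity = ratio(tn, tn + fp)   # recall on negatives
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Toy sequence and toy counts, purely illustrative (not data from the paper).
    features = kmer_frequencies("AUGAUG")
    print(features["AUG"], features["UGA"])
    print(classification_metrics(tp=8, fp=2, tn=18, fn=2))
```

A feature dictionary like this (64 values for k = 3) could be combined with secondary structural features and passed to any gradient-boosting classifier. Note that the F-score depends on precision, so it can be lower than both sensitivity and specificity when negatives greatly outnumber positives.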

Source journal
CiteScore: 3.50
Self-citation rate: 3.40%
Articles published: 92
Review time: 2 months
Journal description: Functional & Integrative Genomics is devoted to large-scale studies of genomes and their functions, including systems analyses of biological processes. The journal provides the research community with an integrated platform where researchers can share, review and discuss their findings on important biological questions that will ultimately enable us to answer the fundamental question: How do genomes work?