GenoM7GNet: An Efficient N7-Methylguanosine Site Prediction Approach Based on a Nucleotide Language Model.

IF 3.6 3区 生物学 Q2 BIOCHEMICAL RESEARCH METHODS IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-09-20 DOI:10.1109/TCBB.2024.3459870
Chuang Li, Heshi Wang, Yanhua Wen, Rui Yin, Xiangxiang Zeng, Keqin Li
{"title":"GenoM7GNet: An Efficient N<sup>7</sup>-Methylguanosine Site Prediction Approach Based on a Nucleotide Language Model.","authors":"Chuang Li, Heshi Wang, Yanhua Wen, Rui Yin, Xiangxiang Zeng, Keqin Li","doi":"10.1109/TCBB.2024.3459870","DOIUrl":null,"url":null,"abstract":"<p><p>N<sup>7</sup> -methylguanosine (m7G), one of the mainstream post-transcriptional RNA modifications, occupies an exceedingly significant place in medical treatments. However, classic approaches for identifying m7G sites are costly both in time and equipment. Meanwhile, the existing machine learning methods extract limited hidden information from RNA sequences, thus making it difficult to improve the accuracy. Therefore, we put forward to a deep learning network, called \"GenoM7GNet,\" for m7G site identification. This model utilizes a Bidirectional Encoder Representation from Transformers (BERT) and is pretrained on nucleotide sequences data to capture hidden patterns from RNA sequences for m7G site prediction. Moreover, through detailed comparative experiments with various deep learning models, we discovered that the one-dimensional convolutional neural network (CNN) exhibits outstanding performance in sequence feature learning and classification. The proposed GenoM7GNet model achieved 0.953in accuracy, 0.932in sensitivity, 0.976in specificity, 0.907in Matthews Correlation Coefficient and 0.984in Area Under the receiver operating characteristic Curve on performance evaluation. Extensive experimental results further prove that our GenoM7GNet model markedly surpasses other state-of-the-art models in predicting m7G sites, exhibiting high computing performance.</p>","PeriodicalId":13344,"journal":{"name":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":3.6000,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE/ACM Transactions on Computational Biology and Bioinformatics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/TCBB.2024.3459870","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

N7 -methylguanosine (m7G), one of the mainstream post-transcriptional RNA modifications, occupies an exceedingly significant place in medical treatments. However, classic approaches for identifying m7G sites are costly both in time and equipment. Meanwhile, the existing machine learning methods extract limited hidden information from RNA sequences, thus making it difficult to improve the accuracy. Therefore, we put forward to a deep learning network, called "GenoM7GNet," for m7G site identification. This model utilizes a Bidirectional Encoder Representation from Transformers (BERT) and is pretrained on nucleotide sequences data to capture hidden patterns from RNA sequences for m7G site prediction. Moreover, through detailed comparative experiments with various deep learning models, we discovered that the one-dimensional convolutional neural network (CNN) exhibits outstanding performance in sequence feature learning and classification. The proposed GenoM7GNet model achieved 0.953in accuracy, 0.932in sensitivity, 0.976in specificity, 0.907in Matthews Correlation Coefficient and 0.984in Area Under the receiver operating characteristic Curve on performance evaluation. Extensive experimental results further prove that our GenoM7GNet model markedly surpasses other state-of-the-art models in predicting m7G sites, exhibiting high computing performance.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
GenoM7GNet:基于核苷酸语言模型的高效 N7-甲基鸟苷位点预测方法
N7 -甲基鸟苷(m7G)是转录后 RNA 修饰的主流之一,在医学治疗中占有极其重要的地位。然而,识别 m7G 位点的传统方法在时间和设备上都很昂贵。同时,现有的机器学习方法从 RNA 序列中提取的隐藏信息有限,因此很难提高准确率。因此,我们提出了一种用于识别 m7G 位点的深度学习网络,称为 "GenoM7GNet"。该模型利用双向变换器编码器表征(BERT),并在核苷酸序列数据上进行预训练,以捕捉 RNA 序列中的隐藏模式,用于 m7G 位点预测。此外,通过与各种深度学习模型的详细对比实验,我们发现一维卷积神经网络(CNN)在序列特征学习和分类方面表现出色。所提出的 GenoM7GNet 模型在性能评估上取得了 0.953 的准确率、0.932 的灵敏度、0.976 的特异性、0.907 的马修斯相关系数和 0.984 的接收者工作特征曲线下面积。广泛的实验结果进一步证明,我们的 GenoM7GNet 模型在预测 m7G 位点方面明显超越了其他最先进的模型,表现出很高的计算性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
7.50
自引率
6.70%
发文量
479
审稿时长
3 months
期刊介绍: IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; and important biological results that are obtained from the use of these methods, programs and databases; the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system
期刊最新文献
DeepLigType: Predicting Ligand Types of Protein-Ligand Binding Sites Using a Deep Learning Model. Performance Comparison between Deep Neural Network and Machine Learning based Classifiers for Huntington Disease Prediction from Human DNA Sequence. AI-based Computational Methods in Early Drug Discovery and Post Market Drug Assessment: A Survey. Enhancing Single-Cell RNA-seq Data Completeness with a Graph Learning Framework. circ2DGNN: circRNA-disease Association Prediction via Transformer-based Graph Neural Network.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1