Learning Sequential and Structural Dependencies Between Nucleotides for RNA N6-Methyladenosine Site Identification

IEEE/CAA Journal of Automatica Sinica | IF 15.3 | Zone 1 (Computer Science) | JCR Q1, Automation & Control Systems | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2024.124233
Guodong Li; Bowei Zhao; Xiaorui Su; Dongxu Li; Yue Yang; Zhi Zeng; Lun Hu
Volume 11, Issue 10, pp. 2123-2134. Publication type: Journal Article. Open access: no.
Full text: https://ieeexplore.ieee.org/document/10664519/
Citations: 0

Abstract

Learning Sequential and Structural Dependencies Between Nucleotides for RNA N6-Methyladenosine Site Identification
N6-methyladenosine (m6A) is an important RNA methylation modification involved in regulating diverse biological processes across multiple species. Hence, the identification of m6A modification sites provides valuable insight into the biological mechanisms of complex diseases at the post-transcriptional level. Although a variety of identification algorithms have been proposed recently, most of them capture the features of m6A modification sites by focusing on the sequential dependencies of nucleotides at different positions in RNA sequences, while ignoring the structural dependencies of nucleotides in their three-dimensional structures. To overcome this issue, we propose a cross-species end-to-end deep learning model, namely CR-NSSD, which conducts a cross-domain representation learning process that integrates nucleotide structural and sequential dependencies for RNA m6A site identification. Specifically, CR-NSSD first obtains pre-coded representations of RNA sequences by incorporating position information into single-nucleotide states using chaos game representation theory. It then constructs a cross-domain reconstruction encoder to learn the sequential and structural dependencies between nucleotides. CR-NSSD is trained for m6A site identification by minimizing the reconstruction and binary cross-entropy losses. Extensive experiments comparing CR-NSSD with several state-of-the-art m6A identification algorithms demonstrate its promising performance. Moreover, the results of cross-species prediction indicate that integrating sequential and structural dependencies allows CR-NSSD to capture general features of m6A modification sites across species, thus improving the accuracy of cross-species identification.
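The two mechanisms named in the abstract, chaos game representation and a training objective that combines reconstruction and binary cross-entropy terms, can be illustrated with a minimal sketch. This is not the CR-NSSD implementation. The corner layout, the mean-squared-error reconstruction term, the `weight` parameter, and the function names `cgr_encode` and `joint_loss` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from typing import List, Sequence, Tuple

# Assumed corner layout (the conventional CGR square); not taken from the paper.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def cgr_encode(seq: str) -> List[Tuple[float, float]]:
    """Return one (x, y) point per nucleotide.

    Each point is the midpoint between the previous point and the corner of
    the current nucleotide, so it encodes both the nucleotide and its prefix.
    """
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper().replace("T", "U"):
        cx, cy = CORNERS[base]  # raises KeyError for ambiguous bases such as N
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def joint_loss(
    x: Sequence[float],
    x_hat: Sequence[float],
    label: int,
    prob: float,
    weight: float = 1.0,
) -> float:
    """Binary cross-entropy plus a weighted reconstruction term.

    The reconstruction term is mean squared error here, which is an
    assumption; `weight` is a hypothetical balancing hyperparameter.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)  # clamp to keep log() finite
    bce = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return bce + weight * recon
```

For example, `cgr_encode("GA")` returns `[(0.75, 0.75), (0.375, 0.375)]`: the second point differs from what a lone `A` would give, which is how the encoding carries position information into the single-nucleotide state.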
Source journal
IEEE/CAA Journal of Automatica Sinica (Engineering: Control and Systems Engineering)
CiteScore: 23.50
Self-citation rate: 11.00%
Articles published: 880
About the journal: The IEEE/CAA Journal of Automatica Sinica publishes English-language papers on original theoretical and experimental research and development in the field of automation. Its scope includes automatic control, artificial intelligence and intelligent control, systems theory and engineering, pattern recognition and intelligent systems, automation engineering and applications, information processing and information systems, network-based automation, robotics, sensing and measurement, and navigation, guidance, and control. The journal is abstracted and indexed in SCIE (Science Citation Index Expanded), EI (Engineering Index), Inspec, Scopus, SCImago, DBLP, CNKI (China National Knowledge Infrastructure), CSCD (Chinese Science Citation Database), and IEEE Xplore.