A Representation Learning Approach for Predicting circRNA Back-Splicing Event via Sequence-Interaction-Aware Dual Encoder
Authors: Chengxin He; Lei Duan; Huiru Zheng; Xinye Wang; Lili Guan; Jiaxuan Xu
Journal: IEEE Transactions on NanoBioscience, vol. 23, no. 4, pp. 603-611
Publication date: 2024-09-03
Publication type: Journal Article
DOI: 10.1109/TNB.2024.3454079
URL: https://ieeexplore.ieee.org/document/10663753/
Impact factor: 3.7
JCR: Q1 (Biochemical Research Methods)
Citations: 0
Abstract
Circular RNAs (circRNAs) play a crucial role in gene regulation and disease association because of their unique closed continuous loop structure, which is more stable and conserved than that of ordinary linear RNAs. As fundamental work to clarify their functions, a large number of computational approaches for identifying circRNA formation have been proposed. However, these methods fail to fully utilize the important characteristics of back-splicing events, i.e., the positional information of the splice sites and the interaction features of their flanking sequences, for predicting circRNAs. To this end, we propose a novel approach called SIDE for predicting circRNA back-splicing events using only raw RNA sequences. Technically, SIDE employs a dual encoder to capture global and interactive features of the RNA sequence, and then a decoder designed with contrastive learning to fuse them into discriminative features that improve the prediction of circRNA formation. Empirical results on three real-world datasets show the effectiveness of SIDE. Further analysis also confirms the effectiveness of SIDE.
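The abstract says that SIDE starts from raw RNA sequences and relies on the positions of the splice sites and on their flanking sequences, but it does not describe how the input is encoded. The following is only a minimal sketch of what such preprocessing could look like, assuming a one-hot encoding and a fixed-width flanking window; the names `one_hot`, `flank`, and `encode_back_splice`, the padding symbol, and the window width are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical preprocessing sketch; names and sizes are NOT from the SIDE paper.
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode an RNA string as 4-element 0/1 vectors (unknown bases become all zeros)."""
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in seq.upper()]


def flank(seq, site, width):
    """Return `width` bases on each side of `site`, padding with 'N' past the sequence ends."""
    padded = "N" * width + seq + "N" * width
    centre = site + width  # index of `site` inside the padded string
    return padded[centre - width:centre + width]


def encode_back_splice(seq, acceptor, donor, width=3):
    """Encode the flanking windows of the two splice sites of a candidate junction."""
    return {
        "acceptor_flank": one_hot(flank(seq, acceptor, width)),
        "donor_flank": one_hot(flank(seq, donor, width)),
    }


if __name__ == "__main__":
    example = encode_back_splice("AUGGCUAAGC", acceptor=2, donor=8)
    print(len(example["acceptor_flank"]), len(example["donor_flank"]))
```

Each window has length `2 * width`, so both flanks can be passed to downstream encoders as fixed-size inputs regardless of where the splice sites fall in the sequence.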
About the journal:
The IEEE Transactions on NanoBioscience reports on original, innovative and interdisciplinary work on all aspects of molecular systems, cellular systems, and tissues (including molecular electronics). The journal covers a broad spectrum of topics, both foundational and applied. Specifically, it covers methods and techniques, experimental aspects, design and implementation, instrumentation and laboratory equipment, clinical aspects, hardware and software data acquisition and analysis, and computer-based modelling (based on traditional or high-performance computing, such as parallel computers or computer networks).