DeepRetention: A Deep Learning Approach for Intron Retention Detection

Impact Factor: 6.2; Zone 1 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence); Journal: Big Data Mining and Analytics; Publication date: 2023-01-25; DOI: 10.26599/BDMA.2022.9020023

Zhenpeng Wu, Jiantao Zheng, Jiashu Liu, Cuixiang Lin, Hong-Dong Li

Volume 6, Issue 2, pages 115-126. Full text: https://ieeexplore.ieee.org/document/10026289/

Citations: 2

Abstract

Intron Retention (IR), the least understood mode of alternative splicing, is attracting increasing attention in gene regulation and disease studies. Existing methods detect IR exclusively from one or a few predefined metrics that describe local or summarized characteristics of retained introns. These metrics cannot describe the pattern of sequencing depth of intronic reads, which is an intuitive and informative characteristic of retained introns. We hypothesize that incorporating the distribution pattern of intronic reads will improve the accuracy of IR detection. Here we present DeepRetention, a novel approach that detects IR by modeling the pattern of sequencing depth of introns. Because no gold-standard IR dataset exists, we first compare DeepRetention with two state-of-the-art methods, iREAD and IRFinder, on simulated RNA-seq datasets with retained introns. The results show that DeepRetention outperforms both methods. Next, DeepRetention performs well on third-generation long-read RNA-seq data, to which IRFinder and iREAD are not applicable. Further, we show on an RNA-seq dataset from Alzheimer's Disease (AD) samples that the IRs predicted by DeepRetention are biologically meaningful: the differential IRs are significantly associated with AD based on statistical evaluation of an AD-specific functional gene network, and their parent genes are enriched in AD-related functions. In summary, DeepRetention detects IR from a new perspective, providing a valuable tool for IR analysis.
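The abstract's central argument, that a summarized metric can hide the shape of intronic coverage, can be illustrated with a small Python sketch. This is not the DeepRetention model or its input encoding, which the abstract does not specify. The function names, the 10-bin resolution, and the two synthetic 100-bp depth vectors are assumptions chosen purely for illustration.

```python
from statistics import mean


def binned_profile(depth, n_bins=10):
    """Resample a per-base depth vector into n_bins mean values (hypothetical helper)."""
    n = len(depth)
    if n < n_bins:
        raise ValueError("intron is shorter than the number of bins")
    profile = []
    for b in range(n_bins):
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        profile.append(mean(depth[start:end]))
    return profile


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean; 0.0 if the mean is 0."""
    m = mean(values)
    if m == 0:
        return 0.0
    variance = mean([(v - m) ** 2 for v in values])
    return variance ** 0.5 / m


# Two synthetic 100-bp introns with the same mean depth (5 reads per base).
uniform = [5] * 100             # reads spread evenly across the intron
spiky = [0] * 90 + [50] * 10    # all reads piled up in the last 10 bases

# A summarized metric (mean depth) cannot tell the two introns apart.
print(mean(uniform), mean(spiky))

# The binned depth profile keeps the shape of the coverage.
uniform_profile = binned_profile(uniform)
spiky_profile = binned_profile(spiky)
print(uniform_profile)
print(spiky_profile)

# One simple way to quantify the difference in shape.
print(coefficient_of_variation(uniform_profile))
print(coefficient_of_variation(spiky_profile))
```

In this toy case the two introns have the same mean depth, yet their binned profiles differ sharply. A pattern-based detector would have access to that kind of information, while a single summary value would not.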


Source journal

Big Data Mining and Analytics (Computer Science: Computer Science Applications)

CiteScore: 20.90

Self-citation rate: 2.20%

About the journal: Big Data Mining and Analytics, published by Tsinghua University Press, presents research on big data and its applications. The journal covers the exploration and analysis of large volumes of data from diverse sources to uncover hidden patterns, correlations, insights, and knowledge. It features the latest developments, research issues, and solutions in data mining techniques, data analytics, and their practical applications. The journal is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, and others.

Latest articles from this journal

Contents

Front Cover

Incremental Data Stream Classification with Adaptive Multi-Task Multi-View Learning

Attention-Based CNN Fusion Model for Emotion Recognition During Walking Using Discrete Wavelet Transform on EEG and Inertial Signals

Gender-Based Analysis of User Reactions to Facebook Posts