Contrasting and combining transcriptome complexity captured by short and long RNA sequencing reads

Genome Research | IF 6.2 | Zone 2 (Biology) | JCR Q1 (Biochemistry & Molecular Biology) | Pub Date: 2024-09-25 | DOI: 10.1101/gr.278659.123
Seong W Han, San Jewell, Andrei Thomas-Tikhonenko, Yoseph Barash
Citations: 0

Abstract

Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads can capture entire isoforms and span repetitive regions, while short reads still provide better coverage and lower error rates. Yet how to compare the technologies quantitatively, whether they can be combined, and what the benefit of such a combined view may be remain open questions. We tackle these questions by first creating a pipeline to assess matched long- and short-read data using a variety of transcriptome statistics. We find that across datasets, algorithms, and technologies, matched short-read data detect roughly 30% more splice junctions, such that 10-30% of the splice junctions included at 20% or more by short reads are missed by long reads. In contrast, long reads detect many more intron retention events and can detect full isoforms, pointing to the benefit of combining the technologies. We introduce MAJIQ-L, an extension of the MAJIQ software that enables a unified view of transcriptome variations from both technologies, and demonstrate its benefits. Our software can be used to assess any future long-read technology or algorithm and to combine it with short-read data for improved transcriptome analysis.
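The junction-level comparison described in the abstract can be illustrated with a minimal sketch. This is not the MAJIQ-L implementation or its API: the junction key, the `missed_fraction` function, and the toy data are all hypothetical, and the sketch assumes that each short-read junction already carries an inclusion estimate between 0 and 1.

```python
from typing import Dict, Set, Tuple

# Hypothetical junction key: (chromosome, strand, donor position, acceptor position).
Junction = Tuple[str, str, int, int]


def missed_fraction(short_psi: Dict[Junction, float],
                    long_junctions: Set[Junction],
                    min_psi: float = 0.2) -> float:
    """Fraction of short-read junctions with inclusion >= min_psi
    that do not appear among the long-read junctions."""
    # Keep only junctions that short reads support at or above the threshold.
    confident = [j for j, psi in short_psi.items() if psi >= min_psi]
    if not confident:
        return 0.0
    # Count how many of those are absent from the long-read set.
    missed = sum(1 for j in confident if j not in long_junctions)
    return missed / len(confident)


# Toy data (invented for illustration only, not results from the paper).
short_psi = {
    ("chr1", "+", 100, 200): 0.9,
    ("chr1", "+", 100, 300): 0.25,
    ("chr1", "+", 400, 500): 0.05,  # below threshold, ignored
    ("chr2", "-", 50, 80): 0.6,
    ("chr2", "-", 50, 120): 0.4,
}
long_junctions = {
    ("chr1", "+", 100, 200),
    ("chr2", "-", 50, 80),
    ("chr2", "-", 50, 120),
    ("chr3", "+", 10, 20),  # seen only by long reads
}

print(missed_fraction(short_psi, long_junctions))
```

In this toy example, four junctions pass the 20% threshold and one of them is absent from the long-read set. A real comparison would also need to decide how to match junctions with slightly shifted coordinates, which this sketch does not attempt.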
Source journal
Genome Research (Biology: Biochemistry & Molecular Biology)
CiteScore: 12.40
Self-citation rate: 1.40%
Articles published: 140
Review time: 6 months
Journal description: Launched in 1995, Genome Research is an international, continuously published, peer-reviewed journal that focuses on research that provides novel insights into the genome biology of all organisms, including advances in genomic medicine. Among the topics considered by the journal are genome structure and function, comparative genomics, molecular evolution, genome-scale quantitative and population genetics, proteomics, epigenomics, and systems biology. The journal also features exciting gene discoveries and reports of cutting-edge computational biology and high-throughput methodologies. New data in these areas are published as research papers, or methods and resource reports that provide novel information on technologies or tools that will be of interest to a broad readership. Complete data sets are presented electronically on the journal's web site where appropriate. The journal also provides Reviews, Perspectives, and Insight/Outlook articles, which present commentary on the latest advances published both here and elsewhere, placing such progress in its broader biological context.