JULiP: An efficient model for accurate intron selection from multiple RNA-seq samples

Guangyu Yang, L. Florea
DOI: 10.1109/ICCABS.2016.7802790 (https://doi.org/10.1109/ICCABS.2016.7802790)
Published in: 2016 IEEE 6th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)
Publication date: 2016-10-01
Citations: 1

Abstract

Accurate detection of alternative splicing and reconstruction of transcripts are essential for characterizing gene regulation and function and for understanding development and disease. However, current methods for extracting splicing variation from RNA-seq data analyze signals from only a single sample, which limits transcript reconstruction and fails to detect the complete set of alternative splicing events. We developed JULiP, a novel feature selection method that analyzes information across multiple samples to identify alternative splicing variation in the form of splice junctions (introns). It formulates the selection problem as a regularized program, using the latent information from multiple RNA-seq samples to construct an accurate and comprehensive intron set. JULiP is highly accurate: it could detect thousands more introns in any one sample, >30% more than the most sensitive single-sample method, and 10% more introns than in the cumulative set of samples, at higher or comparable precision (>98%). The methods compared were the assemblers Cufflinks, CLASS2, StringTie and FlipFlop, and the multi-sample assembler ISP. JULiP is multi-threaded and parallelized, taking only one minute to analyze up to 100 data sets on a multi-computer cluster, and can easily scale up to analyses of hundreds to thousands of RNA-seq samples.
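The abstract does not spell out the regularized program, so the following is only a minimal sketch of the general idea of selecting junctions jointly across samples, not JULiP's actual model. It assumes a group-lasso style penalty, whose proximal step (block soft-thresholding) keeps a junction only when the combined support across all samples exceeds a threshold. The function names, the parameter `lam`, and the toy read counts are all hypothetical.

```python
import math


def group_soft_threshold(counts, lam):
    """Block soft-thresholding: the minimizer over w of
    0.5 * ||counts - w||^2 + lam * ||w||_2.
    The whole vector is zeroed when its Euclidean norm is at most lam."""
    norm = math.sqrt(sum(c * c for c in counts))
    if norm <= lam:
        return [0.0] * len(counts)
    scale = 1.0 - lam / norm
    return [scale * c for c in counts]


def select_introns(junction_counts, lam):
    """Return the junctions whose shrunken count vector is non-zero."""
    selected = []
    for junction, counts in junction_counts.items():
        if any(w > 0 for w in group_soft_threshold(counts, lam)):
            selected.append(junction)
    return sorted(selected)


# Hypothetical toy data: junction -> spliced-read counts in four samples.
junction_counts = {
    ("chr1", 100, 200): [2, 2, 2, 2],   # weak in each sample, but consistent
    ("chr1", 300, 400): [3, 0, 0, 0],   # seen in one sample only
    ("chr1", 500, 600): [9, 8, 10, 7],  # strongly supported everywhere
}

# Joint selection over all four samples.
multi_sample = select_introns(junction_counts, lam=3.5)

# The same rule applied to the first sample alone, for comparison.
single_sample = select_introns(
    {j: c[:1] for j, c in junction_counts.items()}, lam=3.5
)

print(multi_sample)
print(single_sample)
```

In this toy setting the consistently but weakly supported junction is retained only when the samples are considered together, which is the kind of latent cross-sample signal the abstract refers to. A real formulation would also need to account for sequencing depth and alignment noise.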