ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets.

Source Code for Biology and Medicine (JCR Q2, Decision Sciences) | Pub Date: 2014-01-24 | DOI: 10.1186/1751-0473-9-3
Bernard J Pope, Tú Nguyen-Dumont, Fleur Hammet, Daniel J Park
Citation: Source Code for Biology and Medicine 9(1):3 (2014). https://doi.org/10.1186/1751-0473-9-3
Citations: 13

Abstract

Background: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS). Hi-Plex allows library size to be defined uniformly, so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can therefore rely on identifying variants that appear in both reads of a read-pair, which permits stringent filtering of sequencing chemistry-induced errors. These principles underlie the ROVER software (the name is derived from Read Overlap PCR-MPS variant caller), which we recently used in a reported screen for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage.

Results: ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The software's open-source availability and tailorable thresholds make it broadly accessible to a range of PCR-MPS users.

Methods: ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment, given relative to a specified genome build. It also accepts aligned sequence files produced by mapping to the same genome build. Using the mapping coordinates and the primer coordinates, ROVER identifies the amplicon that a given read-pair represents and removes the primer sequences. It then considers overlapping read-pairs with respect to the primer-intervening sequence. A variant contributes to the tally of read-pairs containing or not containing that variant only when it is observed in both reads of a read-pair. User-defined thresholds set the minimum number and the minimum proportion of read-pairs in which a variant must be observed for a 'call' to be made. ROVER also reports the depth of coverage across amplicons, which helps identify any regions that may require further screening.
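The both-reads rule and the two thresholds described in the Methods can be sketched in a few lines of Python. This is a minimal illustration and not ROVER's actual code: the `Amplicon` layout, the `(position, ref, alt)` variant tuples, the function names, the parameter names `min_pairs` and `min_fraction`, and all coordinates are assumptions made for the example. It also assumes each read has already been reduced to the set of variants it supports.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Set, Tuple

# Hypothetical variant representation: (position, reference allele, alternate allele).
Variant = Tuple[int, str, str]
ReadPair = Tuple[Set[Variant], Set[Variant]]


class Amplicon(NamedTuple):
    """Primer-intervening interval (assumed 1-based, inclusive coordinates)."""
    chrom: str
    start: int  # first base after the forward primer
    end: int    # last base before the reverse primer


def pair_variants(read1: Set[Variant], read2: Set[Variant], amp: Amplicon) -> Set[Variant]:
    """Keep variants seen in BOTH reads that lie between the primers."""
    return {v for v in read1 & read2 if amp.start <= v[0] <= amp.end}


def call_variants(pairs: List[ReadPair], amp: Amplicon,
                  min_pairs: int, min_fraction: float) -> Dict[Variant, Tuple[int, int]]:
    """Tally concordant variants and apply the count and proportion thresholds.

    Returns {variant: (supporting_pairs, total_pairs)} for called variants.
    """
    total = len(pairs)  # read-pair depth for this amplicon
    if total == 0:
        return {}
    tally: Counter = Counter()
    for read1, read2 in pairs:
        tally.update(pair_variants(read1, read2, amp))
    return {v: (n, total) for v, n in tally.items()
            if n >= min_pairs and n / total >= min_fraction}


# Made-up data: five read-pairs assigned to one amplicon.
amp = Amplicon("chr1", 101, 200)
pairs = [
    ({(150, "A", "G")}, {(150, "A", "G")}),                   # concordant
    ({(150, "A", "G")}, {(150, "A", "G")}),                   # concordant
    ({(150, "A", "G"), (160, "C", "T")}, {(150, "A", "G")}),  # 160 seen in one read only
    (set(), set()),                                           # reference-only pair
    ({(95, "T", "C")}, {(95, "T", "C")}),                     # falls inside a primer
]
calls = call_variants(pairs, amp, min_pairs=2, min_fraction=0.5)
print(calls)  # {(150, 'A', 'G'): (3, 5)}
```

In this sketch the variant at position 160 is discarded because only one read of the pair supports it, and the variant at position 95 is discarded because it lies outside the primer-intervening interval. The `total` value doubles as a crude per-amplicon depth of coverage.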

Conclusions: ROVER can facilitate rapid and accurate genetic variant calling for a broad range of PCR-MPS users.

About the journal: Source Code for Biology and Medicine is a peer-reviewed, open access, online journal that publishes articles on source code employed over a wide range of applications in biology and medicine. The journal's aim is to publish source code for distribution and use in the public domain in order to advance biological and medical research. Such dissemination may shorten the time required to solve computational problems for which source code or other resources are scarce.