nf-core/pacvar: a pipeline for analyzing long-read PacBio whole genome and repeat expansion sequencing data.

Tanya Jain, Claire Clelland
Bioinformatics (Oxford, England). Published 2025-03-29. DOI: 10.1093/bioinformatics/btaf116. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11964484/pdf/

Abstract

Motivation: Pacific Biosciences (PacBio) single-molecule, long-read sequencing enables whole-genome annotation and, through the PureTarget panel, the characterization of 20 complex repeat regions that are especially relevant to neurodegenerative diseases. Long-read whole-genome sequencing (WGS) also allows the detection of structural variants that would be difficult to detect with traditional short-read sequencing. However, the raw unaligned Binary Alignment Map (BAM) data must be processed before analysis. An intuitive, comprehensive bioinformatic pipeline is needed to analyze these data.

Results: We present nf-core/pacvar, a comprehensive pipeline for analyzing both PacBio single-molecule PureTarget and WGS data that demultiplexes and parallelizes pre-processing, variant calling, and repeat characterization. nf-core/pacvar requires little configuration and has few dependencies. The pipeline enables rapid, end-to-end, parallel processing of PacBio single-molecule whole-genome and targeted repeat expansion sequencing data.

Availability and implementation: nf-core/pacvar is available on the nf-core website (https://nf-co.re/pacvar/) and on GitHub (https://github.com/nf-core/pacvar) under the MIT License (DOI: 10.5281/zenodo.14813048).
