nf-core/pacvar: a pipeline for analyzing long-read PacBio whole genome and repeat expansion sequencing data
Authors: Tanya Jain, Claire Clelland
Journal: Bioinformatics (Oxford, England)
Published: 2025-03-17
DOI: 10.1093/bioinformatics/btaf116 (https://doi.org/10.1093/bioinformatics/btaf116)
Abstract
Motivation: Pacific Biosciences (PacBio) single-molecule long-read sequencing enables whole-genome annotation and, through the PureTarget panel, the characterization of 20 complex repeat regions that are especially relevant to neurodegenerative diseases. Long-read whole-genome sequencing (WGS) also allows the detection of structural variants that are difficult to detect with traditional short-read sequencing. However, the raw, unaligned Binary Alignment Map (BAM) data must be processed before analysis. There is a need for an intuitive, comprehensive bioinformatic pipeline that can analyze these data.
Results: We present nf-core/pacvar, a comprehensive pipeline for analyzing both PacBio single-molecule PureTarget and WGS data that demultiplexes the data and parallelizes pre-processing, variant calling, and repeat characterization. nf-core/pacvar requires little configuration and has few dependencies. The pipeline enables rapid, end-to-end, parallel processing of PacBio single-molecule whole-genome and targeted repeat expansion sequencing data.
Availability: nf-core/pacvar is available on the nf-core website (https://nf-co.re/pacvar/) and on GitHub (https://github.com/nf-core/pacvar) under the MIT License (DOI 10.5281/zenodo.14813048).