Yash Patel, Chenghao Zhu, Takafumi N Yamaguchi, Nicholas K Wang, Nicholas Wiltsie, Nicole Zeltser, Alfredo E Gonzalez, Helena K Winata, Yu Pan, Mohammed Faizal Eeman Mootor, Timothy Sanders, Sorel T Fitz-Gibbon, Cyriac Kandoth, Julie Livingstone, Lydia Y Liu, Benjamin Carlin, Aaron Holmes, Jieun Oh, John Sahrmann, Shu Tao, Stefan Eng, Rupert Hugh-White, Kiarod Pashminehazar, Andrew Park, Arpi Beshlikyan, Madison Jordan, Selina Wu, Mao Tian, Jaron Arbet, Beth Neilsen, Roni Haas, Yuan Zhe Bugh, Gina Kim, Joseph Salmingo, Wenshu Zhang, Aakarsh Anand, Edward Hwang, Anna Neiman-Golden, Philippa Steinberg, Wenyan Zhao, Prateek Anand, Raag Agrawal, Brandon L Tsai, Paul C Boutros
bioRxiv: the preprint server for biology. Published 2025-03-04. DOI: 10.1101/2024.09.04.611267 (https://doi.org/10.1101/2024.09.04.611267). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398472/pdf/
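Abstract
With the development of high-throughput technologies, DNA sequencing has become steadily cheaper and faster. Greater data availability has spurred the development of novel algorithms that elucidate previously obscure features, and has led to growing reliance on complex workflows that integrate these tools into analysis pipelines. To facilitate the analysis of DNA sequencing data, we created metapipeline-DNA, a highly configurable and extensible pipeline. It covers a broad range of processing steps, including alignment and recalibration of raw sequencing reads, variant calling, quality control and subclonal reconstruction. Metapipeline-DNA also provides configuration options for selecting and tuning analyses while remaining robust to failures. This standardizes and simplifies the analysis of large DNA sequencing datasets in both clinical and research settings. Metapipeline-DNA is an open-source Nextflow pipeline, licensed under GPLv2, and freely available at https://github.com/uclahs-cds/metapipeline-DNA.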
Metapipeline-DNA: A Comprehensive Germline & Somatic Genomics Nextflow Pipeline.
The price, quality and throughput of DNA sequencing continue to improve. Algorithmic innovations have allowed inference of a growing range of features from DNA sequencing data, quantifying nuclear, mitochondrial and evolutionary aspects of both germline and somatic genomes. To automate analyses of the full range of genomic characteristics, we created an extensible Nextflow meta-pipeline called metapipeline-DNA. Metapipeline-DNA analyzes targeted and whole-genome sequencing data from raw reads through pre-processing, feature detection by multiple algorithms, quality control and data visualization. Each step can be run independently and is supported by robust software engineering, including automated failure recovery, robust testing and consistent verification of inputs, outputs and parameters. Metapipeline-DNA is cloud-compatible and highly configurable, with options to subset and optimize each analysis. Metapipeline-DNA facilitates high-scale, comprehensive analysis of DNA sequencing data.
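The idea of independently runnable steps with input and output verification and automated failure recovery can be illustrated with a small Python sketch. This is not the metapipeline-DNA implementation, which is written in Nextflow; the `Step` class, `run_step`, `run_selected`, the retry count and the dictionary keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One independently runnable stage (hypothetical structure, not the pipeline's own)."""
    name: str
    run: Callable[[Dict[str, str]], Dict[str, str]]
    required_inputs: List[str]
    expected_outputs: List[str]
    max_retries: int = 2  # assumed default; retries allowed after the first attempt


def run_step(step: Step, data: Dict[str, str]) -> Dict[str, str]:
    """Verify inputs, run the step with retries, then verify outputs."""
    missing = [key for key in step.required_inputs if key not in data]
    if missing:
        raise ValueError(f"{step.name}: missing inputs {missing}")

    last_error = None
    for _ in range(step.max_retries + 1):
        try:
            result = step.run(data)
            break
        except RuntimeError as error:  # RuntimeError stands in for a transient failure
            last_error = error
    else:
        # Every attempt failed, so give up and report the last error.
        raise RuntimeError(
            f"{step.name}: failed after {step.max_retries + 1} attempts"
        ) from last_error

    absent = [key for key in step.expected_outputs if key not in result]
    if absent:
        raise ValueError(f"{step.name}: missing outputs {absent}")
    return {**data, **result}


def run_selected(steps: List[Step], data: Dict[str, str], selected: List[str]) -> Dict[str, str]:
    """Run only the steps named in `selected`, in the order they are defined."""
    for step in steps:
        if step.name in selected:
            data = run_step(step, data)
    return data
```

In this sketch, selecting a subset of steps mirrors the option to subset an analysis: a downstream step can be run on its own as long as its required inputs are already present, and it fails early with a clear message when they are not.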