An easy-to-use pipeline to analyze amplicon-based Next Generation Sequencing results of human mitochondrial DNA from degraded samples
Daniel R Cuesta-Aguirre, Assumpció Malgosa, Cristina Santos
PLoS ONE 19(11): e0311115. Published 2024-11-21. DOI: 10.1371/journal.pone.0311115
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11581256/pdf/
Abstract
Genome and transcriptome studies have become more common thanks to Next-Generation Sequencing (NGS), which significantly increases throughput and depth of coverage while reducing cost and time. Mitochondrial DNA (mtDNA) is often the marker of choice for degraded samples from archaeological and forensic contexts, as its higher copy number can improve the success of the experiment. Among other sequencing strategies, amplicon-based NGS techniques are currently used to obtain enough data for analysis. Some pipelines are designed for the analysis of ancient mtDNA samples and others for the analysis of amplicon data. However, these pipelines pose a challenge for non-expert users and often cannot address the particularities of ancient and forensic DNA and of amplicon-based sequencing at the same time. To overcome these challenges, a user-friendly bioinformatic tool was developed to analyze the non-coding region of human mtDNA from degraded samples recovered in archaeological and forensic contexts. The tool can be easily modified to fit the specifications of other amplicon-based NGS experiments. Two duplicate-removal tools were compared: MarkDuplicates from Picard and the dedup parameter of fastp. Additionally, various thresholds were tested in PMDtools, a specialized tool designed for extracting reads affected by post-mortem damage. Finally, the depth of coverage of each amplicon was correlated with its level of damage. The results indicated that, for removing duplicates, dedup is the better option because it retains more non-repeated reads, which MarkDuplicates removes. In PMDtools, a threshold of PMDS = 1 allowed better differentiation between present-day and ancient samples in terms of damage without losing too many reads in the process. These two bioinformatic tools were added to a pipeline designed to obtain both the haplotype and the haplogroup of mtDNA. Furthermore, the pipeline presented in this study generates information about the quality and possible contamination of the sample. The pipeline is designed to automate mtDNA analysis; however, particularly for ancient samples, some manual analysis may be required to fully validate the results, since the amplicons that tended to be more easily recovered were the ones that had fewer reads with damage, indicating that special care must be taken with poorly recovered samples.
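The last analysis step mentioned in the abstract, correlating each amplicon's depth of coverage with its level of damage, can be illustrated with a minimal sketch. The abstract does not state which correlation statistic was used, so Pearson's r is an assumption here, and the amplicon names and values below are invented for illustration only; they are not results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-amplicon summary: (mean depth of coverage,
# fraction of reads classified as damaged). Illustrative values only.
amplicons = {
    "amplicon_A": (1200, 0.02),
    "amplicon_B": (800, 0.05),
    "amplicon_C": (450, 0.09),
    "amplicon_D": (150, 0.20),
}

depths = [depth for depth, _ in amplicons.values()]
damage = [frac for _, frac in amplicons.values()]

r = pearson(depths, damage)
print(f"Pearson r between depth and damaged-read fraction: {r:.3f}")
```

A negative coefficient in such a table would match the pattern the abstract describes, in which the more easily recovered amplicons carry fewer damaged reads. A rank-based statistic such as Spearman's could be substituted if the relationship is not expected to be linear.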
About the journal:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage