An essential first step in analysing ancient DNA is to verify that the sequencing reads genuinely derive from ancient DNA. This is done by assessing whether the reads exhibit the typical characteristics of post-mortem damage (PMD), including cytosine deamination and nicks. We present a novel statistical method, implemented in a fast multithreaded programme, ngsBriggs, that enables rapid quantification of PMD by estimating the parameters of the Briggs ancient damage model (Briggs parameters). Using a multinomial model fitted by maximum likelihood, ngsBriggs accurately estimates the parameters of the Briggs model, quantifying the PMD signal from single- and double-stranded DNA regions. We extend the original Briggs model to capture PMD signals for contemporary sequencing platforms and show that ngsBriggs accurately estimates the Briggs parameters across a variety of contamination levels. Classification of reads as ancient or modern, for the purpose of decontamination, is significantly more accurate with ngsBriggs than with other available methods. Furthermore, ngsBriggs is substantially faster than other state-of-the-art methods. ngsBriggs offers a practical and accurate method for researchers seeking to authenticate ancient DNA and improve the quality of their data.
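The idea of fitting damage parameters by maximum likelihood can be illustrated with a small sketch. This is not the ngsBriggs implementation: it assumes a simplified Briggs-style model in which a cytosine at distance `pos` from the 5' end lies in a single-stranded overhang with probability `(1 - lam) ** pos`, ignores nicks, uses a binomial rather than a multinomial likelihood, and searches a coarse grid. The function names, the parameter names (`lam`, `delta_s`, `delta_d`) and the count data are all hypothetical.

```python
import math


def ct_rate(pos, lam, delta_s, delta_d):
    """Expected C->T frequency at 1-based distance `pos` from the 5' end.

    Assumed toy form: the position is single-stranded with probability
    (1 - lam) ** pos and then deaminates at rate delta_s; otherwise it
    deaminates at the double-stranded rate delta_d.
    """
    p_overhang = (1.0 - lam) ** pos
    return delta_d + (delta_s - delta_d) * p_overhang


def log_likelihood(counts, lam, delta_s, delta_d):
    """Binomial log-likelihood of (C->T count, total C count) per position."""
    total = 0.0
    for pos, (k, n) in enumerate(counts, start=1):
        p = ct_rate(pos, lam, delta_s, delta_d)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # keep log() finite
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def fit_by_grid(counts, steps=20):
    """Return the (lam, delta_s, delta_d) grid point with the highest likelihood."""
    grid = [(i + 0.5) / steps for i in range(steps)]
    best_ll, best_params = -math.inf, None
    for lam in grid:
        for delta_s in grid:
            for delta_d in grid:
                if delta_d > delta_s:  # assume overhangs are damaged at least as much
                    continue
                ll = log_likelihood(counts, lam, delta_s, delta_d)
                if ll > best_ll:
                    best_ll, best_params = ll, (lam, delta_s, delta_d)
    return best_params


# Hypothetical data: expected counts for 15 positions under known parameters.
TRUE_PARAMS = (0.325, 0.575, 0.025)  # lam, delta_s, delta_d (invented values)
CYTOSINES_PER_POSITION = 10_000
counts = [
    (round(CYTOSINES_PER_POSITION * ct_rate(pos, *TRUE_PARAMS)), CYTOSINES_PER_POSITION)
    for pos in range(1, 16)
]

estimate = fit_by_grid(counts)
print("estimated (lam, delta_s, delta_d):", estimate)
```

A grid search keeps the sketch dependency-free; a real tool would use a numerical optimiser and a likelihood that covers all mismatch types at both read ends.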
{"title":"Revisiting the Briggs Ancient DNA Damage Model: A Fast Maximum Likelihood Method to Estimate Post-Mortem Damage","authors":"Lei Zhao, Rasmus Amund Henriksen, Abigail Ramsøe, Rasmus Nielsen, Thorfinn Sand Korneliussen","doi":"10.1111/1755-0998.14029","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14029","platform":"Semanticscholar","paperid":"142454254","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri
Chironomidae, the so-called non-biting midges, are considered key bioindicators of aquatic ecosystem variability. Data derived from morphologically identifying their chitinous remains in sediments document chironomid larval assemblages, which are studied to reconstruct ecosystem changes over time. Recent developments in sedimentary DNA (sedDNA) research have demonstrated that molecular techniques are suitable for determining past and present occurrences of organisms. Nevertheless, sedDNA records documenting alterations in chironomid assemblages remain largely unexplored. To close this gap, we examined the applicability of sedDNA metabarcoding to identify Chironomidae assemblages in lake sediments by sampling and processing three 21–35 cm long sediment cores from Lake Sempach in Switzerland. With a focus on developing analytical approaches, we compared an invertebrate-universal (FWH) and a newly designed Chironomidae-specific metabarcoding primer set (CH) to assess their performance in detecting Chironomidae DNA. We isolated and identified chitinous larval remains and compared the morphotype assemblages with the data derived from sedDNA metabarcoding. Results showed good overall agreement in morphotype assemblage-specific clustering between the chitinous remains and the metabarcoding datasets. Both methods indicated higher chironomid assemblage similarity between the two littoral cores than between either littoral core and the deep lake core. Moreover, we observed a pronounced primer bias effect, resulting in more Chironomidae detections with the CH primer combination than with the FWH combination. Overall, we conclude that sedDNA metabarcoding can supplement the traditional identification of remains and potentially provide independent reconstructions of past chironomid assemblage changes. Furthermore, it has the potential to offer more efficient workflows, better sample standardisation and species-level resolution datasets.
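The comparison of assemblage similarity between cores can be sketched as a pairwise computation on presence/absence data. The abstract does not state which similarity measure the authors used, so the Jaccard index below is an assumption, and the detections are invented purely for illustration (the genus names are real chironomid genera, but their assignment to cores is hypothetical).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two taxon sets: shared taxa / all taxa observed."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty assemblages are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical presence data per sediment core (not taken from the study).
detections = {
    "littoral_1": {"Chironomus", "Tanytarsus", "Cladotanytarsus", "Procladius"},
    "littoral_2": {"Chironomus", "Tanytarsus", "Cladotanytarsus", "Cricotopus"},
    "deep": {"Chironomus", "Procladius", "Sergentia"},
}

# Pairwise similarity between all cores.
similarity = {
    (x, y): jaccard(detections[x], detections[y])
    for x, y in combinations(detections, 2)
}

for (x, y), value in similarity.items():
    print(f"{x} vs {y}: {value:.2f}")
```

The same function can be applied to the taxon lists produced by each method (chitinous remains, FWH primers, CH primers) to see where the methods agree.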
{"title":"Sediment Core DNA-Metabarcoding and Chitinous Remain Identification: Integrating Complementary Methods to Characterise Chironomidae Biodiversity in Lake Sediment Archives","authors":"Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri","doi":"10.1111/1755-0998.14035","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14035","platform":"Semanticscholar","paperid":"142454255","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
de La Harpe, M., J. Hess, O. Loiseau, N. Salamin, C. Lexer, and M. Paris. 2019. A Dedicated Target Capture Approach Reveals Variable Genetic Markers Across Micro-and Macroevolutionary Time Scales in Palms. Molecular Ecology Resources, 19(1): 221–234. https://doi.org/10.1111/1755-0998.12945
For the sake of completeness, the authors wish to provide more detailed information in the acknowledgements section about the sample collection, with a more exhaustive list of the researchers and students involved in the field sampling. In addition, we correct an inaccuracy in the funding information, replacing ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Zurich; Illumina; National Science Foundation’ with ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Fribourg’.
The updated Acknowledgement section is detailed below:
{"title":"Correction to “A dedicated target capture approach reveals variable genetic markers across micro-and macro-evolutionary time scales in palms”","authors":"","doi":"10.1111/1755-0998.14032","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14032","platform":"Semanticscholar","paperid":"142454252","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun
Modelling population connectivity is central to biodiversity conservation and often relies on resistance surfaces reflecting multi-generational gene flow. ResistanceGA (RGA) is a common optimization framework for parameterizing these surfaces by maximizing the fit between genetic distances and cost distances using maximum likelihood population effect models. As the reliability of this framework has rarely been studied, we investigated the conditions maximizing its accuracy for both prediction and interpretation of landscape features' permeability. We ran demo-genetic simulations in contrasting landscapes for species with distinct dispersal capacities and specialization levels, using corresponding reference cost scenarios. We then optimized resistance surfaces from the simulated genetic distances using RGA. First, we evaluated whether RGA identified the drivers of the genetic patterns, that is, whether it distinguished Isolation-by-Resistance (IBR) patterns from either Isolation-by-Distance or patterns unrelated to ecological distances. We then assessed RGA's predictive performance using a cross-validation method, and its ability to recover the reference cost scenarios shaping genetic structure in simulations. IBR patterns were well detected and genetic distances were predicted with great accuracy. This performance depended on the strength of the genetic structuring, sampling design and landscape structure. Matching the scale of the genetic pattern by focusing on population pairs connected through gene flow and limiting overfitting through cross-validation further enhanced inference reliability. Yet, the optimized cost values often departed from the reference values, making their interpretation and extrapolation potentially dubious. While demonstrating the value of RGA for predictive modelling, we call for caution and provide additional guidance for its optimal use.
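For readers unfamiliar with cost distances, the quantity being fitted to genetic distances can be sketched as a least-cost path over a resistance grid. The example below is a plain Dijkstra search in Python written for illustration only; it does not use or reproduce ResistanceGA, and the grid values, the 4-neighbour movement rule and the step cost (mean resistance of the two cells) are assumptions.

```python
import heapq


def least_cost_distance(resistance, start, goal):
    """Least accumulated cost between two cells of a resistance grid.

    Moving between 4-neighbour cells costs the mean resistance of the two
    cells (an assumed convention for this sketch).
    """
    rows, cols = len(resistance), len(resistance[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (resistance[r][c] + resistance[nr][nc]) / 2.0
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")  # goal unreachable


# Hypothetical landscapes: a uniform one and one with a high-resistance strip.
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
barrier = [[1, 9, 1], [1, 9, 1], [1, 1, 1]]

d_uniform = least_cost_distance(uniform, (0, 0), (0, 2))
d_barrier = least_cost_distance(barrier, (0, 0), (0, 2))
print(d_uniform, d_barrier)
```

With the barrier in place the cheapest route detours around the high-resistance cells, so the cost distance between the same two populations grows even though their Euclidean distance is unchanged. That contrast is what separates Isolation-by-Resistance from Isolation-by-Distance.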
{"title":"What can optimized cost distances based on genetic distances offer? A simulation study on the use and misuse of ResistanceGA","authors":"Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun","doi":"10.1111/1755-0998.14024","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14024","platform":"Semanticscholar","paperid":"142454258","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
J. Antonio Baeza, Jeremiah J. Minish, Todd P. Michael
Complete mitochondrial genomes have become markers of choice for exploring phylogenetic relationships at multiple taxonomic levels, and they are often assembled using whole-genome short-read sequencing. Herein, using three species of sea chubs as an example, we explored the accuracy of mitochondrial chromosomes assembled using Oxford Nanopore Technologies (ONT) Kit 14 R10.4.1 long reads at different sequencing depths (high, low and very low, or genome skimming) by comparing them to ‘gold’ standard reference mitochondrial genomes assembled using Illumina NovaSeq short reads. In two species of sea chubs, Girella nigricans and Kyphosus azureus, ONT long-read assembled mitochondrial genomes at high sequencing depths (> 25× whole [nuclear] genome) were identical to their respective short-read assembled mitochondrial genomes. Not a single ‘homopolymer insertion’, ‘homopolymer deletion’, ‘simple substitution’, ‘single insertion’, ‘short insertion’, ‘single deletion’ or ‘short deletion’ was detected in the long-read assembled mitochondrial genomes after aligning each of them to its short-read counterpart. In turn, in a third species, Medialuna californiensis, a long-read mitochondrial genome assembled at 25× sequencing depth was 14 nucleotides longer than its short-read counterpart. The difference in total length between the latter two assemblies was due to a short 14 bp motif that was repeated (twice) in the long-read but not in the short-read assembly. Read subsampling at a sequencing depth of 1× resulted in the assembly of partial or complete mitochondrial genomes with numerous errors, including, among others, simple indels and indels at homopolymer regions. At 3× and 5× subsampling, genomes were identical (perfect) or almost identical (quasiperfect, 99.5% over 16,500 bp) to their respective Illumina assemblies. The newly assembled mitochondrial genomes exhibit identical gene composition and organisation compared with cofamilial species, and a phylomitogenomic analysis based on translated protein-coding genes suggested that the family Kyphosidae is not monophyletic. The same analysis detected possible cases of misidentification of mitochondrial genomes deposited in GenBank. This study demonstrates that perfect (complete and fully accurate) or quasiperfect (complete but with a single or a very few errors) mitochondrial genomes can be assembled at high (> 25×) and low (3–5×) but not very low (1×, genome skimming) sequencing depths using ONT long reads and the latest ONT chemistries (Kit 14 and R10.4.1 flowcells with SUP basecalling). The newly assembled and annotated mitochondrial genomes can be used as a reference in environmental DNA studies focusing on bioprospecting and biomonitoring of these and other coastal species experiencing environmental insult. Given the small size of the sequencing device and low cost, we argue that ONT technology has the potential to improve access to high-throughput sequencing technologies in low- and middle-income countries.
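The comparison of a long-read assembly against its short-read counterpart comes down to counting differences in a pairwise alignment. The sketch below is a minimal illustration, assuming the two sequences are already aligned to equal length with `-` marking gaps; the function name and the toy sequences are hypothetical, and it does not separate homopolymer indels from other indels as the study does.

```python
def alignment_summary(reference, query):
    """Count matches, substitutions and gap columns in a pairwise alignment.

    Both strings must already be aligned (same length, '-' for gaps).
    Identity is reported over all alignment columns.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have the same length")
    if not reference:
        raise ValueError("alignment is empty")
    matches = substitutions = gap_columns = 0
    for ref_base, qry_base in zip(reference, query):
        if ref_base == "-" or qry_base == "-":
            gap_columns += 1  # insertion or deletion relative to the other sequence
        elif ref_base == qry_base:
            matches += 1
        else:
            substitutions += 1
    return {
        "matches": matches,
        "substitutions": substitutions,
        "gap_columns": gap_columns,
        "identity": matches / len(reference),
    }


# Toy alignment (invented): one deletion and one substitution in the query.
short_read_assembly = "ACGTACGTAA"
long_read_assembly = "ACGTAC-TAT"

summary = alignment_summary(short_read_assembly, long_read_assembly)
print(summary)
```

Under these definitions, a 'perfect' assembly in the abstract's sense is one with zero substitutions and zero gap columns.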
{"title":"Assembly of Mitochondrial Genomes Using Nanopore Long-Read Technology in Three Sea Chubs (Teleostei: Kyphosidae)","authors":"J. Antonio Baeza, Jeremiah J. Minish, Todd P. Michael","doi":"10.1111/1755-0998.14034","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14034","platform":"Semanticscholar","paperid":"142454250","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}
Paul D. N. Hebert, Robin Floyd, Saeideh Jafarpour, Sean W. J. Prosser
It is a global priority to better manage the biosphere, but action must be informed by comprehensive data on the abundance and distribution of species. The acquisition of such information is currently constrained by high costs. DNA barcoding can speed the registration of unknown species of animals, the most diverse kingdom of eukaryotes, as the BIN system automates their recognition. However, inexpensive sequencing protocols are critical, as a census of all animal species is likely to require the analysis of a billion or more specimens. Barcoding involves DNA extraction followed by PCR and sequencing, with the last step dominating costs until 2017. By enabling the sequencing of highly multiplexed samples, the Sequel platforms from Pacific Biosciences slashed costs by 90%, but these instruments are only deployed in core facilities because of their expense. Sequencers from Oxford Nanopore Technologies provide an escape from high capital and service costs, but their low sequence fidelity has, until recently, constrained adoption. However, the improved performance of the company's latest flow cells (R10.4.1) erases this barrier. This study demonstrates that a MinION flow cell can characterise an amplicon pool derived from 100,000 specimens while a Flongle flow cell can process one derived from several thousand. At $0.01 per specimen, DNA sequencing is now the least expensive step in the barcode workflow.
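The scale implied by these figures can be worked through with simple arithmetic. The sketch below takes the $0.01 per specimen, 100,000 specimens per MinION flow cell and one billion specimens from the abstract; treating the per-specimen cost as applying to fully loaded MinION runs is an assumption, and the derived totals are illustrative rather than figures reported by the authors.

```python
# Figures quoted in the abstract.
COST_PER_SPECIMEN_CENTS = 1          # $0.01 per specimen
SPECIMENS_PER_MINION_RUN = 100_000   # amplicon pool characterised by one MinION flow cell
CENSUS_SPECIMENS = 1_000_000_000     # "a billion or more specimens"


def sequencing_cost_dollars(n_specimens):
    """Sequencing cost in dollars, computed in whole cents to avoid rounding drift."""
    return n_specimens * COST_PER_SPECIMEN_CENTS / 100


def runs_needed(n_specimens, specimens_per_run):
    """Number of flow-cell runs required (ceiling division)."""
    return -(-n_specimens // specimens_per_run)


cost_per_run = sequencing_cost_dollars(SPECIMENS_PER_MINION_RUN)
census_cost = sequencing_cost_dollars(CENSUS_SPECIMENS)
census_runs = runs_needed(CENSUS_SPECIMENS, SPECIMENS_PER_MINION_RUN)

print(f"cost per full MinION run: ${cost_per_run:,.0f}")
print(f"sequencing cost for the census: ${census_cost:,.0f} over {census_runs:,} runs")
```

Under these assumptions the sequencing step of a billion-specimen census would cost on the order of ten million dollars, which is why the authors describe it as the least expensive step in the workflow.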
{"title":"Barcode 100K Specimens: In a Single Nanopore Run","authors":"Paul D. N. Hebert, Robin Floyd, Saeideh Jafarpour, Sean W. J. Prosser","doi":"10.1111/1755-0998.14028","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"25 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14028","platform":"Semanticscholar","paperid":"142454251","RegionNum":1,"RegionCategory":"Biology","PMCID":"OA"}