Molecular Ecology Resources: Latest Articles

Revisiting the Briggs Ancient DNA Damage Model: A Fast Maximum Likelihood Method to Estimate Post-Mortem Damage
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-21; DOI: 10.1111/1755-0998.14029
Lei Zhao, Rasmus Amund Henriksen, Abigail Ramsøe, Rasmus Nielsen, Thorfinn Sand Korneliussen

One essential initial step in the analysis of ancient DNA is to authenticate that the DNA sequencing reads are actually from ancient DNA. This is done by assessing whether the reads exhibit typical characteristics of post-mortem damage (PMD), including cytosine deamination and nicks. We present a novel statistical method, implemented in a fast multithreaded programme, ngsBriggs, that enables rapid quantification of PMD by estimating the Briggs ancient damage model parameters (Briggs parameters). Using a multinomial model with maximum likelihood fit, ngsBriggs accurately estimates the parameters of the Briggs model, quantifying the PMD signal from single- and double-stranded DNA regions. We extend the original Briggs model to capture PMD signals for contemporary sequencing platforms and show that ngsBriggs accurately estimates the Briggs parameters across a variety of contamination levels. Classification of reads into ancient or modern reads, for the purpose of decontamination, is significantly more accurate using ngsBriggs than using other available methods. Furthermore, ngsBriggs is substantially faster than other state-of-the-art methods. ngsBriggs offers a practical and accurate method for researchers seeking to authenticate ancient DNA and improve the quality of their data.
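
The PMD signal described above can be made concrete with a small sketch. A common raw summary of cytosine deamination is the fraction of reference cytosines that are read as thymine at each position from the 5′ end of a read. The Python below computes that summary for a toy set of gap-free alignments. It is not the ngsBriggs multinomial likelihood, only an illustration of the kind of per-position counts such a model could be fitted to. The function name, the input format, and the toy alignments are assumptions made for this example.

```python
from collections import Counter


def c_to_t_frequency(pairs, max_pos=5):
    """Fraction of reference C bases read as T, per position from the 5' end.

    pairs: iterable of (reference, read) strings of equal length,
           aligned without gaps and oriented 5'->3' (assumed input format).
    Returns a list of length max_pos; None where no reference C was seen.
    """
    ref_c = Counter()   # reference cytosines observed at each position
    c_to_t = Counter()  # of those, how many were read as thymine
    for ref, read in pairs:
        for pos, (r, q) in enumerate(zip(ref[:max_pos], read[:max_pos])):
            if r == "C":
                ref_c[pos] += 1
                if q == "T":
                    c_to_t[pos] += 1
    return [c_to_t[p] / ref_c[p] if ref_c[p] else None for p in range(max_pos)]


# Hypothetical toy alignments: damage concentrated at the first base.
toy_alignments = [
    ("CCGAC", "TCGAC"),
    ("CAGCC", "TAGCC"),
    ("CCTAG", "CCTAG"),
    ("CGCAT", "TGCAT"),
]
freqs = c_to_t_frequency(toy_alignments, max_pos=3)
print(freqs)
```

An elevated C→T fraction at the read ends that decays towards the interior is the pattern usually taken as evidence of ancient origin; a model-based method replaces this descriptive summary with estimated parameters.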

Citations: 0
Sediment Core DNA-Metabarcoding and Chitinous Remain Identification: Integrating Complementary Methods to Characterise Chironomidae Biodiversity in Lake Sediment Archives
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-21; DOI: 10.1111/1755-0998.14035
Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri

Chironomidae, so-called non-biting midges, are considered key bioindicators of aquatic ecosystem variability. Data derived from morphologically identifying their chitinous remains in sediments document chironomid larvae assemblages, which are studied to reconstruct ecosystem changes over time. Recent developments in sedimentary DNA (sedDNA) research have demonstrated that molecular techniques are suitable for determining past and present occurrences of organisms. Nevertheless, sedDNA records documenting alterations in chironomid assemblages remain largely unexplored. To close this gap, we examined the applicability of sedDNA metabarcoding to identify Chironomidae assemblages in lake sediments by sampling and processing three 21–35 cm long sediment cores from Lake Sempach in Switzerland. With a focus on developing analytical approaches, we compared an invertebrate-universal (FWH) and a newly designed Chironomidae-specific metabarcoding primer set (CH) to assess their performance in detecting Chironomidae DNA. We isolated and identified chitinous larval remains and compared the morphotype assemblages with the data derived from sedDNA metabarcoding. Results showed a good overall agreement of the morphotype assemblage-specific clustering among the chitinous remains and the metabarcoding datasets. Both methods indicated higher chironomid assemblage similarity between the two littoral cores in contrast to the deep lake core. Moreover, we observed a pronounced primer bias effect resulting in more Chironomidae detections with the CH primer combination compared to the FWH combination. Overall, we conclude that sedDNA metabarcoding can supplement traditional remain identifications and potentially provide independent reconstructions of past chironomid assemblage changes. Furthermore, it has the potential of more efficient workflows, better sample standardisation and species-level resolution datasets.

Citations: 0
Correction to “A dedicated target capture approach reveals variable genetic markers across micro-and macro-evolutionary time scales in palms”
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-18; DOI: 10.1111/1755-0998.14032

de La Harpe, M., J. Hess, O. Loiseau, N. Salamin, C. Lexer, and M. Paris. 2019. A Dedicated Target Capture Approach Reveals Variable Genetic Markers Across Micro-and Macroevolutionary Time Scales in Palms. Molecular Ecology Resources, 19(1): 221–234. https://doi.org/10.1111/1755-0998.12945

For the sake of completeness, the authors wish to provide more detailed information in the acknowledgements section about sample collection, with a more exhaustive list of the researchers and students involved in the field sampling. In addition, we correct an inaccuracy in the funding information, replacing ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Zurich; Illumina; National Science Foundation’ with ‘Swiss National Science Foundation (SNSF), Grant/Award Number: CRSII3_147630; University of Fribourg’.

The updated Acknowledgement section is detailed below:

Citations: 0
What can optimized cost distances based on genetic distances offer? A simulation study on the use and misuse of ResistanceGA
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-17; DOI: 10.1111/1755-0998.14024
Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun

Modelling population connectivity is central to biodiversity conservation and often relies on resistance surfaces reflecting multi-generational gene flow. ResistanceGA (RGA) is a common optimization framework for parameterizing these surfaces by maximizing the fit between genetic distances and cost distances using maximum likelihood population effect models. As the reliability of this framework has rarely been studied, we investigated the conditions maximizing its accuracy for both prediction and interpretation of landscape features' permeability. We ran demo-genetic simulations in contrasted landscapes for species with distinct dispersal capacities and specialization levels, using corresponding reference cost scenarios. We then optimized resistance surfaces from the simulated genetic distances using RGA. First, we evaluated whether RGA identified the drivers of the genetic patterns, that is, distinguished Isolation-by-Resistance (IBR) patterns from either Isolation-by-Distance or patterns unrelated to ecological distances. We then assessed RGA predictive performance using a cross-validation method, and its ability to recover the reference cost scenarios shaping genetic structure in simulations. IBR patterns were well detected and genetic distances were predicted with great accuracy. This performance depended on the strength of the genetic structuring, sampling design and landscape structure. Matching the scale of the genetic pattern by focusing on population pairs connected through gene flow and limiting overfitting through cross-validation further enhanced inference reliability. Yet, the optimized cost values often departed from the reference values, making their interpretation and extrapolation potentially dubious. While demonstrating the value of RGA for predictive modelling, we call for caution and provide additional guidance for its optimal use.

Citations: 0
Assembly of Mitochondrial Genomes Using Nanopore Long-Read Technology in Three Sea Chubs (Teleostei: Kyphosidae)
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-15; DOI: 10.1111/1755-0998.14034
J. Antonio Baeza, Jeremiah J. Minish, Todd P. Michael

Complete mitochondrial genomes have become markers of choice to explore phylogenetic relationships at multiple taxonomic levels and they are often assembled using whole genome short-read sequencing. Herein, using three species of sea chubs as an example, we explored the accuracy of mitochondrial chromosomes assembled using Oxford Nanopore Technology (ONT) Kit 14 R10.4.1 long reads at different sequencing depths (high, low and very low or genome skimming) by comparing them to ‘gold’ standard reference mitochondrial genomes assembled using Illumina NovaSeq short reads. In two species of sea chubs, Girella nigricans and Kyphosus azureus, ONT long-read assembled mitochondrial genomes at high sequencing depths (> 25× whole [nuclear] genome) were identical to their respective short-read assembled mitochondrial genomes. Not a single ‘homopolymer insertion’, ‘homopolymer deletion’, ‘simple substitution’, ‘single insertion’, ‘short insertion’, ‘single deletion’ or ‘short deletion’ was detected in the long-read assembled mitochondrial genomes after aligning each one of them to their short-read counterparts. In turn, in a third species, Medialuna californiensis, a 25× sequencing depth long-read assembled mitochondrial genome was 14 nucleotides longer than its short-read counterpart. The difference in total length between the latter two assemblies was due to the presence of a short motif 14 bp long that was repeated (twice) in the long-read but not in the short-read assembly. Read subsampling at a sequencing depth of 1× resulted in the assembly of partial or complete mitochondrial genomes with numerous errors, including, among others, simple indels, and indels at homopolymer regions. At 3× and 5× subsampling, genomes were identical (perfect) or almost identical (quasiperfect, 99.5% over 16,500 bp) to their respective Illumina assemblies. The newly assembled mitochondrial genomes exhibit identical gene composition and organisation compared with cofamilial species and a phylomitogenomic analysis based on translated protein-coding genes suggested that the family Kyphosidae is not monophyletic. The same analysis detected possible cases of misidentification of mitochondrial genomes deposited in GenBank. This study demonstrates that perfect (complete and fully accurate) or quasiperfect (complete but with a single or a very few errors) mitochondrial genomes can be assembled at high (> 25×) and low (3–5×) but not very low (1×, genome skimming) sequencing depths using ONT long reads and the latest ONT chemistries (Kit 14 and R10.4.1 flowcells with SUP basecalling). The newly assembled and annotated mitochondrial genomes can be used as a reference in environmental DNA studies focusing on bioprospecting and biomonitoring of these and other coastal species experiencing environmental insult. Given the small size of the sequencing device and low cost, we argue that ONT technology has the potential to improve access to high-throughput sequencing technologies in low- and middle-income countries.

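The comparison between long-read and short-read assemblies reported above rests on pairwise identity. As a minimal sketch, assuming the two assemblies have already been aligned to equal length with `-` marking gap columns, percent identity can be computed as below. The function name and the toy sequences are illustrative assumptions, not part of the study's pipeline.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs must be pre-aligned strings of equal length; a gap ('-')
    in either sequence counts as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        a == b and a != "-" for a, b in zip(aligned_a, aligned_b)
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: one substitution in ten columns.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(toy_identity)

# Arithmetic on the figures quoted in the abstract: 99.5% identity over
# 16,500 bp leaves about 82.5 differing columns.
differing_columns = (1 - 0.995) * 16_500
print(differing_columns)
```

Counting gap columns as differences means that indels, including those in homopolymer runs, lower the identity score in the same way as substitutions.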
Citations: 0
Barcode 100K Specimens: In a Single Nanopore Run
IF 5.5; Zone 1 (Biology); Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY; Pub Date: 2024-10-10; DOI: 10.1111/1755-0998.14028
Paul D. N. Hebert, Robin Floyd, Saeideh Jafarpour, Sean W. J. Prosser

It is a global priority to better manage the biosphere, but action must be informed by comprehensive data on the abundance and distribution of species. The acquisition of such information is currently constrained by high costs. DNA barcoding can speed the registration of unknown animal species, the most diverse kingdom of eukaryotes, as the BIN system automates their recognition. However, inexpensive sequencing protocols are critical as the census of all animal species is likely to require the analysis of a billion or more specimens. Barcoding involves DNA extraction followed by PCR and sequencing with the last step dominating costs until 2017. By enabling the sequencing of highly multiplexed samples, the Sequel platforms from Pacific BioSciences slashed costs by 90%, but these instruments are only deployed in core facilities because of their expense. Sequencers from Oxford Nanopore Technologies provide an escape from high capital and service costs, but their low sequence fidelity has, until recently, constrained adoption. However, the improved performance of its latest flow cells (R10.4.1) erases this barrier. This study demonstrates that a MinION flow cell can characterise an amplicon pool derived from 100,000 specimens while a Flongle flow cell can process one derived from several thousand. At $0.01 per specimen, DNA sequencing is now the least expensive step in the barcode workflow.
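
The cost figures in this abstract can be combined in a short worked calculation. The sketch below takes the stated $0.01 per specimen, the 100,000 specimens per MinION flow cell, and the lower bound of one billion specimens for a census of animal species. The per-run and census totals it derives are implied by those figures rather than stated in the abstract, and they cover sequencing only, not DNA extraction or PCR.

```python
# Figures taken from the abstract; amounts are held in cents to keep the
# arithmetic exact.
COST_PER_SPECIMEN_CENTS = 1            # $0.01 per specimen
SPECIMENS_PER_MINION_RUN = 100_000     # one MinION flow cell
CENSUS_SPECIMENS = 1_000_000_000       # "a billion or more specimens"

# Implied sequencing cost of one fully loaded MinION run, in dollars.
cost_per_run_usd = COST_PER_SPECIMEN_CENTS * SPECIMENS_PER_MINION_RUN / 100

# Implied number of runs and total sequencing cost for the census.
runs_needed = CENSUS_SPECIMENS // SPECIMENS_PER_MINION_RUN
census_cost_usd = COST_PER_SPECIMEN_CENTS * CENSUS_SPECIMENS / 100

print(cost_per_run_usd, runs_needed, census_cost_usd)
```

On these assumptions, sequencing a billion specimens would take on the order of ten thousand MinION runs and about ten million dollars.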

Citations: 0
EVE-X: Software to Identify Novel Viral Insertions in Wild-Caught Arthropod Hosts From Next-Generation Short Read Data
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-09 DOI: 10.1111/1755-0998.14026
Jessen Havill, Olivia Strasburg, Tessy Udoh, Jacob E. Crawford, Andrea Gloria-Soria

Eukaryotic genomes harbour sequences derived from non-retroviral RNA viruses, known as endogenous viral elements (EVEs) or non-retroviral integrated RNA virus sequences (NIRVS). These sequences represent a record of past infections and have been implicated in host anti-viral response. We have created a program to identify viral sequences integrated in a host genome. It begins with a specimen BAM file and outputs candidate NIRVS, along with putative host insertion sites and overlapping genomic features of the host genome, in XML and visual formats, with minimal intermediary intervention. We ran short-read data derived from the genomes of 222 wild-caught Aedes aegypti mosquitoes, from a dozen geographical regions, through this software and located putative NIRVS from seven virus families. This program is as accurate as currently available software for NIRVS detection and represents a significant improvement in adaptability and user-friendliness. Furthermore, the flexibility of this pipeline allows the user to search for sequence integrations across the genome of any organism, as long as a query sequence database and a reference genome are provided. Potential extended applications include identification of integrated transgenic sequences used for research or vector control strategies.

Citations: 0
Comparative analysis of amphibian genomes: An emerging resource for basic and applied research
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-04 DOI: 10.1111/1755-0998.14025
Tiffany A. Kosch, Andrew J. Crawford, Rachel Lockridge Mueller, Katharina C. Wollenberg Valero, Megan L. Power, Ariel Rodríguez, Lauren A. O'Connell, Neil D. Young, Lee F. Skerratt

Amphibians are the most threatened group of vertebrates and are in dire need of conservation intervention to ensure their continued survival. They exhibit unique features including a high diversity of reproductive strategies, permeable and specialized skin capable of producing toxins and antimicrobial compounds, multiple genetic mechanisms of sex determination and, in some lineages, the ability to regenerate limbs and organs. Although genomic approaches would shed light on these unique traits and aid conservation, sequencing and assembly of amphibian genomes have lagged behind other taxa due to their comparatively large genome sizes. Fortunately, the development of long-read sequencing technologies and initiatives has led to a recent burst of new amphibian genome assemblies. Although growing, the field of amphibian genomics suffers from a lack of annotation resources, of tools for working with challenging genomes and of high-quality assemblies in multiple clades of amphibians. Here, we analyse 51 publicly available amphibian genomes to evaluate their usefulness for functional genomics research. We report considerable variation in genome assembly quality and completeness, as well as some of the highest transposable element and repeat contents of any vertebrate. Additionally, we detected an association between transposable element content and climatic variables. Our analysis provides evidence of conserved genome synteny despite the long divergence times of this group, but we also highlight inconsistencies in chromosome naming and orientation across genome assemblies. We discuss sequencing gaps in the phylogeny and suggest key targets for future sequencing endeavours. Finally, we propose increased investment in amphibian genomics research to promote their conservation.

Citations: 0
MhGeneS: An Analytical Pipeline to Allow for Robust Microhaplotype Genotyping
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-04 DOI: 10.1111/1755-0998.14027
Julia C. Geue, Peng Liu, Sonesinh Keobouasone, Paul Wilson, Micheline Manseau

Microhaplotypes are small linked genomic regions comprising two or more single-nucleotide polymorphisms (SNPs) that are being applied in forensics and are emerging in wildlife monitoring studies and genomic epidemiology. Although microhaplotypes are typically targeted in non-coding regions, those in exonic regions can be designed with larger amplicons to capture functional non-synonymous sites and minimise insertion/deletion (indel) polymorphisms. Quality control is an important first step for high-confidence genotyping to counteract false-positive variants. Because microhaplotypes are genetic markers with higher polymorphism than biallelic SNPs, it is critical to ensure that sequencing errors across the microhaplotype amplicon are filtered out to avoid introducing false haplotypes. We developed the MhGeneS pipeline, which works in tandem with Seq2Sat to help validate microhaplotype genotyping of the coding region of genes, with broader applicability to any microhaplotype profiling. We genotyped microhaplotype regions of the Zfx (≈ 160 bp) and Zfy (≈ 140 bp) genes, as well as an exon of the prion protein (Prnp) gene (≈ 370 bp), in caribou (Rangifer tarandus) using paired-end Illumina technology. We identified two quality metrics that significantly affect microhaplotype calling: the sequencing error rate profile, which depends on whether paired-end reads overlap, and the read depth. In the case of Prnp, we achieved confident microhaplotype calling through MhGeneS by removing small sections of the 5′ and 3′ amplicons and using a minimum read depth of 20. Read depth and sequence trimming may be locus-specific, and validation of these parameters is recommended before the high-throughput profiling of samples.
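The two quality controls named in this abstract, trimming the amplicon ends and requiring a minimum read depth of 20, can be illustrated with a minimal sketch. This is not the MhGeneS implementation: the function name `call_microhaplotypes`, the trim lengths and the choice to apply the depth threshold per haplotype are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter

MIN_DEPTH = 20  # minimum read depth quoted in the abstract for Prnp


def call_microhaplotypes(reads, trim_5p, trim_3p, min_depth=MIN_DEPTH):
    """Trim both amplicon ends, then keep haplotypes seen in >= min_depth reads.

    Hypothetical helper: `reads` is a list of amplicon sequences (strings),
    `trim_5p` / `trim_3p` are the numbers of bases removed from each end.
    """
    trimmed = []
    for read in reads:
        end = len(read) - trim_3p
        if end > trim_5p:  # skip reads too short to survive trimming
            trimmed.append(read[trim_5p:end])
    counts = Counter(trimmed)
    # Haplotypes below the depth threshold are treated as likely sequencing errors.
    return {hap: n for hap, n in counts.items() if n >= min_depth}


if __name__ == "__main__":
    reads = ["AACGTTT"] * 25 + ["AACCTTT"] * 3
    print(call_microhaplotypes(reads, trim_5p=2, trim_3p=3))
```

In this toy input the variant supported by only three reads is discarded, while the one supported by 25 reads is retained. As the abstract notes, suitable trim lengths and depth thresholds may be locus-specific and would need validation.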

Citations: 0
Upscaling biodiversity monitoring: Metabarcoding estimates 31,846 insect species from Malaise traps across Germany
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-04 DOI: 10.1111/1755-0998.14023
Dominik Buchner, James S. Sinclair, Manfred Ayasse, Arne J. Beermann, Jörn Buse, Frank Dziock, Julian Enss, Mark Frenzel, Thomas Hörren, Yuanheng Li, Michael T. Monaghan, Carsten Morkel, Jörg Müller, Steffen U. Pauls, Ronny Richter, Tobias Scharnweber, Martin Sorg, Stefan Stoll, Sönke Twietmeyer, Wolfgang W. Weisser, Benedikt Wiggering, Martin Wilmking, Gerhard Zotz, Mark O. Gessner, Peter Haase, Florian Leese

Mitigating ongoing losses of insects and their key functions (e.g. pollination) requires tracking large-scale and long-term community changes. However, doing so has been hindered by the high diversity of insect species, which requires prohibitively high investments of time, funding and taxonomic expertise when addressed with conventional tools. Here, we show that these concerns can be addressed through a comprehensive, scalable and cost-efficient DNA metabarcoding workflow. We use 1815 samples from 75 Malaise traps across Germany from 2019 and 2020 to demonstrate how metabarcoding can be incorporated into large-scale insect monitoring networks for less than 50 € per sample, including supplies, labour and maintenance. We validated the detected species using two publicly available databases (GBOL and GBIF) and the judgement of taxonomic experts. With an average of 1.4 M sequence reads per sample, we uncovered 10,803 validated insect species, of which 83.9% were represented by a single Operational Taxonomic Unit (OTU). We estimated another 21,043 plausible species, which we argue either lack a reference barcode or are undescribed. The total of 31,846 species is similar to the number of insect species known for Germany (~35,500). Because Malaise traps capture only a subset of insects, our approach identified many species likely unknown from Germany or new to science. Our reproducible workflow (~80% OTU-similarity among years) provides a blueprint for large-scale biodiversity monitoring of insects and other biodiversity components in near real time.
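The headline figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract. The derived quantities (share of known species, samples per trap, cost ceiling, total reads) are back-of-the-envelope values computed here, not results reported by the authors, and the constant and function names are illustrative.

```python
VALIDATED_SPECIES = 10_803
PLAUSIBLE_SPECIES = 21_043
KNOWN_GERMAN_SPECIES = 35_500      # approximate figure ("~35,500")
SAMPLES = 1_815
TRAPS = 75
MAX_COST_PER_SAMPLE_EUR = 50       # upper bound ("less than 50 €")
MEAN_READS_PER_SAMPLE = 1_400_000  # "1.4 M sequence reads per sample"


def summarise():
    """Derive summary quantities from the figures quoted in the abstract."""
    total = VALIDATED_SPECIES + PLAUSIBLE_SPECIES
    return {
        "total_species": total,
        "share_of_known": total / KNOWN_GERMAN_SPECIES,
        "samples_per_trap": SAMPLES / TRAPS,
        "cost_ceiling_eur": SAMPLES * MAX_COST_PER_SAMPLE_EUR,
        "approx_total_reads": SAMPLES * MEAN_READS_PER_SAMPLE,
    }


if __name__ == "__main__":
    for key, value in summarise().items():
        print(f"{key}: {value}")
```

The sum of validated and plausible species reproduces the 31,846 in the title, which is roughly 90% of the approximately 35,500 insect species known for Germany. The cost figure is only a ceiling, because the abstract gives "less than 50 €" rather than an exact per-sample cost.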

Citations: 0