Christina Karmisholt Overgaard, Mahwash Jamy, Simona Radutoiu, Fabien Burki, Morten Kam Dahl Dueholm
{"title":"从环境微真核细胞中获取ASV解析rRNA操作子的长线程测序策略基准。","authors":"Christina Karmisholt Overgaard, Mahwash Jamy, Simona Radutoiu, Fabien Burki, Morten Kam Dahl Dueholm","doi":"10.1111/1755-0998.13991","DOIUrl":null,"url":null,"abstract":"<p>The use of short-read metabarcoding for classifying microeukaryotes is challenged by the lack of comprehensive 18S rRNA reference databases. While recent advances in high-throughput long-read sequencing provide the potential to greatly increase the phylogenetic coverage of these databases, the performance of different sequencing technologies and subsequent bioinformatics processing remain to be evaluated, primarily because of the absence of well-defined eukaryotic mock communities. To address this challenge, we created a eukaryotic rRNA operon clone-library and turned it into a precisely defined synthetic eukaryotic mock community. This mock community was then used to evaluate the performance of three long-read sequencing strategies (PacBio circular consensus sequencing and two Nanopore approaches using unique molecular identifiers) and three tools for resolving amplicons sequence variants (ASVs) (USEARCH, VSEARCH, and DADA2). We investigated the sensitivity of the sequencing techniques based on the number of detected mock taxa, and the accuracy of the different ASV-calling tools with a specific focus on the presence of chimera among the final rRNA operon ASVs. Based on our findings, we provide recommendations and best practice protocols for how to cost-effectively obtain essentially error-free rRNA operons in high-throughput. An agricultural soil sample was used to demonstrate that the sequencing and bioinformatic results from the mock community also translates to highly diverse natural samples, which enables us to identify previously undescribed microeukaryotic lineages.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 7","pages":""},"PeriodicalIF":5.5000,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13991","citationCount":"0","resultStr":"{\"title\":\"Benchmarking long-read sequencing strategies for obtaining ASV-resolved rRNA operons from environmental microeukaryotes\",\"authors\":\"Christina Karmisholt Overgaard, Mahwash Jamy, Simona Radutoiu, Fabien Burki, Morten Kam Dahl Dueholm\",\"doi\":\"10.1111/1755-0998.13991\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>The use of short-read metabarcoding for classifying microeukaryotes is challenged by the lack of comprehensive 18S rRNA reference databases. While recent advances in high-throughput long-read sequencing provide the potential to greatly increase the phylogenetic coverage of these databases, the performance of different sequencing technologies and subsequent bioinformatics processing remain to be evaluated, primarily because of the absence of well-defined eukaryotic mock communities. To address this challenge, we created a eukaryotic rRNA operon clone-library and turned it into a precisely defined synthetic eukaryotic mock community. This mock community was then used to evaluate the performance of three long-read sequencing strategies (PacBio circular consensus sequencing and two Nanopore approaches using unique molecular identifiers) and three tools for resolving amplicons sequence variants (ASVs) (USEARCH, VSEARCH, and DADA2). We investigated the sensitivity of the sequencing techniques based on the number of detected mock taxa, and the accuracy of the different ASV-calling tools with a specific focus on the presence of chimera among the final rRNA operon ASVs. Based on our findings, we provide recommendations and best practice protocols for how to cost-effectively obtain essentially error-free rRNA operons in high-throughput. An agricultural soil sample was used to demonstrate that the sequencing and bioinformatic results from the mock community also translates to highly diverse natural samples, which enables us to identify previously undescribed microeukaryotic lineages.</p>\",\"PeriodicalId\":211,\"journal\":{\"name\":\"Molecular Ecology Resources\",\"volume\":\"24 7\",\"pages\":\"\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2024-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13991\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Molecular Ecology Resources\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.13991\",\"RegionNum\":1,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"BIOCHEMISTRY & MOLECULAR BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Molecular Ecology Resources","FirstCategoryId":"99","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.13991","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0
摘要
由于缺乏全面的 18S rRNA 参考数据库,使用短线程元条码对微核生物进行分类面临挑战。虽然高通量长读数测序技术的最新进展有可能大大增加这些数据库的系统发生学覆盖范围,但不同测序技术的性能以及后续的生物信息学处理仍有待评估,这主要是因为缺乏定义明确的真核模拟群落。为了应对这一挑战,我们创建了真核生物 rRNA 操作子克隆库,并将其转化为精确定义的合成真核生物模拟群落。这个模拟群落随后被用来评估三种长读数测序策略(PacBio 循环共识测序和两种使用独特分子标识符的 Nanopore 方法)和三种解决扩增子序列变异(ASV)的工具(USEARCH、VSEARCH 和 DADA2)的性能。我们根据检测到的模拟类群数量调查了测序技术的灵敏度,以及不同 ASV 调用工具的准确性,特别关注最终 rRNA 操作子 ASV 中是否存在嵌合体。基于我们的研究结果,我们为如何以高通量、经济有效的方式获得基本无误的 rRNA 操作子提供了建议和最佳实践方案。我们使用了一个农业土壤样本来证明,模拟群落的测序和生物信息学结果同样适用于高度多样化的自然样本,这使我们能够确定以前未曾描述过的微真核细胞系。
Benchmarking long-read sequencing strategies for obtaining ASV-resolved rRNA operons from environmental microeukaryotes
The use of short-read metabarcoding for classifying microeukaryotes is challenged by the lack of comprehensive 18S rRNA reference databases. While recent advances in high-throughput long-read sequencing provide the potential to greatly increase the phylogenetic coverage of these databases, the performance of different sequencing technologies and subsequent bioinformatics processing remain to be evaluated, primarily because of the absence of well-defined eukaryotic mock communities. To address this challenge, we created a eukaryotic rRNA operon clone-library and turned it into a precisely defined synthetic eukaryotic mock community. This mock community was then used to evaluate the performance of three long-read sequencing strategies (PacBio circular consensus sequencing and two Nanopore approaches using unique molecular identifiers) and three tools for resolving amplicons sequence variants (ASVs) (USEARCH, VSEARCH, and DADA2). We investigated the sensitivity of the sequencing techniques based on the number of detected mock taxa, and the accuracy of the different ASV-calling tools with a specific focus on the presence of chimera among the final rRNA operon ASVs. Based on our findings, we provide recommendations and best practice protocols for how to cost-effectively obtain essentially error-free rRNA operons in high-throughput. An agricultural soil sample was used to demonstrate that the sequencing and bioinformatic results from the mock community also translates to highly diverse natural samples, which enables us to identify previously undescribed microeukaryotic lineages.
期刊介绍:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.