Enhancing DNA barcode reference libraries by harvesting terrestrial arthropods at the Smithsonian's National Museum of Natural History

Bernardo F Santos, Meredith E Miller, Margarita Miklasevskaja, Jaclyn T A McKeown, Niamh E Redmond, Jonathan A Coddington, Jessica Bird, Scott E Miller, Ashton Smith, Seán G Brady, Matthew L Buffington, M Lourdes Chamorro, Torsten Dikow, Michael W Gates, Paul Goldstein, Alexander Konstantinov, Robert Kula, Nicholas D Silverson, M Alma Solis, Stephanie L deWaard, Suresh Naik, Nadya Nikolova, Mikko Pentinsaari, Sean W J Prosser, Jayme E Sones, Evgeny V Zakharov, Jeremy R deWaard

DOI: 10.3897/BDJ.11.e100904 (https://doi.org/10.3897/BDJ.11.e100904)
Published: 2023-04-24
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10848724/pdf/
Abstract
The use of DNA barcoding has revolutionised biodiversity science, but its application depends on the existence of comprehensive and reliable reference libraries. For many poorly known taxa, such reference sequences are missing even at higher-level taxonomic scales. We harvested the collections of the Smithsonian's National Museum of Natural History (USNM) to generate DNA barcoding sequences for genera of terrestrial arthropods previously not recorded in one or more major public sequence databases. Our workflow used a mix of Sanger and Next-Generation Sequencing (NGS) approaches to maximise sequence recovery while ensuring affordable cost. In total, COI sequences were obtained for 5,686 specimens belonging to 3,737 determined species in 3,886 genera and 205 families distributed in 137 countries. Success rates varied widely according to collection data and focal taxon. NGS helped recover sequences of specimens that failed a previous run of Sanger sequencing. Success rates and the optimal balance between Sanger and NGS are the most important drivers to maximise output and minimise cost in future projects. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, the Global Genome Biodiversity Network Data Portal and the NMNH data portal.
About the journal:
One of the oldest journals in geology, The Journal of Geology has since 1893 promoted the systematic philosophical and fundamental study of geology.
The Journal publishes original research across a broad range of subfields in geology, including geophysics, geochemistry, sedimentology, geomorphology, petrology, plate tectonics, volcanology, structural geology, mineralogy, and planetary sciences. Many of its articles have wide appeal for geologists, present research of topical relevance, and offer new geological insights through the application of innovative approaches and methods.