Scott L. Travers, Carl R. Hutter, Christopher C. Austin, Stephen C. Donnellan, Matthew D. Buehler, Christopher E. Ellison, Sara Ruane
Snake venoms are complex mixtures of toxic proteins that hold significant medical, pharmacological and evolutionary interest. To better understand the genetic diversity underlying snake venoms, we developed VenomCap, a novel exon-capture probe set targeting toxin-coding genes from a wide range of elapid snakes, with a particular focus on the ecologically diverse and medically important subfamily Hydrophiinae. We tested the capture success of VenomCap across 24 species, representing all major elapid lineages. We included snake phylogenomic probes in the VenomCap capture set, allowing us to compare capture performance between venom and phylogenomic loci and to infer elapid phylogenetic relationships. We demonstrated VenomCap's ability to recover exons from ~1500 target markers, representing a total of 24 known venom gene families, which includes the dominant gene families found in elapid venoms. We find that VenomCap's capture results are robust across all elapids sampled, and especially among hydrophiines, with respect to measures of target capture success (target loci matched, sensitivity, specificity and missing data). As a cost-effective and efficient alternative to full genome sequencing, VenomCap can dramatically accelerate the sequencing and analysis of venom gene families. Overall, our tool offers a model for genomic studies on snake venom gene diversity and evolution that can be expanded for comprehensive comparisons across the other families of venomous snakes.
VenomCap: An exon-capture probe set for the targeted sequencing of snake venom genes. Molecular Ecology Resources 24(8). Published 2024-09-19. DOI: 10.1111/1755-0998.14020. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14020
Romana Salis, Johanna Sunde, Nikolaj Gubonin, Markus Franzén, Anders Forsman
For two decades, DNA barcoding and, more recently, DNA metabarcoding have been used for molecular species identification and estimating biodiversity. Despite their growing use, few studies have systematically evaluated these methods. This study aims to evaluate the efficacy of barcoding methods in identifying species and estimating biodiversity, by assessing their consistency with traditional morphological identification and evaluating how assignment consistency is influenced by taxonomic group, sequence similarity thresholds and geographic distance. We first analysed 951 insect specimens across three taxonomic groups: butterflies, bumblebees and parasitic wasps, using both morphological taxonomy and single-specimen COI DNA barcoding. An additional 25,047 butterfly specimens were identified by COI DNA metabarcoding. Finally, we performed a systematic review of 99 studies to assess average consistency between insect species identity assigned via morphology and COI barcoding and to examine the distribution of research effort. Species assignment consistency was influenced by taxonomic group, sequence similarity thresholds and geographic distance. An average assignment consistency of 49% was found across taxonomic groups, with parasitic wasps displaying lower consistency due to taxonomic impediment. The number of missing matches doubled with a 100% sequence similarity threshold and COI intraspecific variation increased with geographic distance. Metabarcoding results aligned well with morphological biodiversity estimates and a strong positive correlation between sequence reads and species abundance was found. The systematic review revealed an 89% average consistency and also indicated taxonomic and geographic biases in research effort. Together, our findings demonstrate that while problems persist, barcoding approaches offer robust alternatives to traditional taxonomy for biodiversity assessment.
Performance of DNA metabarcoding, standard barcoding and morphological approaches in the identification of insect biodiversity. Molecular Ecology Resources 24(8). Published 2024-09-16. DOI: 10.1111/1755-0998.14018. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14018
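The effect of the sequence similarity threshold on assignment consistency, as described in the abstract above, can be illustrated with a minimal sketch. The record fields (`morph`, `barcode`, `identity`), the toy data and the exact definition of consistency (agreeing assignments divided by all specimens) are assumptions made for illustration and are not taken from the study.

```python
def assignment_consistency(records, threshold):
    """Return (consistency, missing_matches) at a given % identity threshold.

    A specimen counts as a missing match when its best barcode hit falls
    below the threshold; consistency is the share of all specimens whose
    barcode assignment agrees with the morphological identification.
    """
    matched = [r for r in records if r["identity"] >= threshold]
    missing = len(records) - len(matched)
    agree = sum(r["morph"] == r["barcode"] for r in matched)
    consistency = agree / len(records) if records else 0.0
    return consistency, missing


# Hypothetical specimens: morphological ID, best barcode hit, % identity.
specimens = [
    {"morph": "Bombus terrestris", "barcode": "Bombus terrestris", "identity": 100.0},
    {"morph": "Bombus lucorum", "barcode": "Bombus lucorum", "identity": 99.2},
    {"morph": "Pieris napi", "barcode": "Pieris napi", "identity": 98.5},
    {"morph": "Pieris rapae", "barcode": "Pieris napi", "identity": 97.0},
]

for threshold in (97.0, 100.0):
    print(threshold, assignment_consistency(specimens, threshold))
```

In this toy data set, raising the threshold to 100% turns three of the four specimens into missing matches, which mirrors the direction of the effect reported in the abstract.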
Ryan J. Daniels, Britta S. Meyer, Marco Giulio, Silvia G. Signorini, Nicoletta Riccardi, Camilla Della Torre, Alexandra A.-T. Weber
DNA methylation (DNAm) is a mechanism for rapid acclimation to environmental conditions. In natural systems, small effect sizes relative to noise necessitate large sampling efforts to detect differences. Large numbers of individually sequenced libraries are costly. Pooling DNA prior to library preparation may be an efficient way to reduce costs and increase sample size, yet there are to date no recommendations in ecological epigenetics research. We test whether pooled and individual libraries yield comparable DNAm signals in a natural system exposed to different pollution levels by generating whole-epigenome data from two invasive molluscs (Corbicula fluminea, Dreissena polymorpha) collected from polluted and unpolluted localities (Italy). DNA of the same individuals was used for pooled and individual epigenomic libraries and sequenced with equivalent resources per individual. We found that pooling effectively captures genome-wide and global methylation signals similar to those of individual libraries, highlighting that pooled libraries are representative of the global population signal. However, pooled libraries yielded orders of magnitude more data than individual libraries, which was a consequence of higher coverage. We would therefore recommend aiming for a high initial coverage of individual libraries (15×) in future studies. Consequently, we detected many more differentially methylated regions (DMRs) with the pooled libraries and a significantly lower statistical power for regions from individual libraries. Computationally pooled data from the individual libraries produced fewer DMRs and the overlap with wet-lab pooled DMRs was relatively low. We discuss possible causes for discrepancies, list benefits and drawbacks of pooling, and provide recommendations for future epigenomic studies.
Benchmarking sample pooling for epigenomics of natural populations. Molecular Ecology Resources 24(8). Published 2024-09-16. DOI: 10.1111/1755-0998.14021. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14021
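One simple way to picture computational pooling of individual libraries is to sum methylated and total read counts across individuals at a site before taking the ratio. The sketch below assumes that definition and uses invented counts; the abstract does not state how the authors pooled their data, so this is an illustration only.

```python
def pooled_methylation(counts):
    """Coverage-weighted methylation level from (methylated, total) read counts."""
    methylated = sum(m for m, _ in counts)
    total = sum(t for _, t in counts)
    if total == 0:
        raise ValueError("no reads cover this site")
    return methylated / total


def mean_individual_methylation(counts):
    """Unweighted mean of per-individual methylation levels at the same site."""
    levels = [m / t for m, t in counts if t > 0]
    if not levels:
        raise ValueError("no reads cover this site")
    return sum(levels) / len(levels)


# Hypothetical read counts for one CpG site in three individuals.
site_counts = [(8, 10), (1, 10), (30, 40)]

print(pooled_methylation(site_counts))           # 39 methylated of 60 reads
print(mean_individual_methylation(site_counts))  # each individual weighted equally
```

The two summaries differ whenever coverage is uneven across individuals, which is one plausible reason pooled and individual estimates may not match exactly.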
Oliver W. White, Sarah Walkington, Hugh Carter, Lauren Hughes, Melody Clark, Thomas Mock, Geraint A. Tarling, Matthew D. Clark
Antarctic krill (Euphausia superba Dana) is a keystone species in the Southern Ocean ecosystem, with ecological and commercial significance. However, its vulnerability to climate change requires an urgent investigation of its adaptive potential to future environmental conditions. Historical museum collections of krill from the early 20th century represent an ideal opportunity to investigate how krill have changed over time due to predation, fishing and climate change. However, there is currently no cost-effective method for implementing population-scale collection genomics for krill given its genome size (48 Gbp). Here, we assessed the utility of two inexpensive methods for population genetics using historical krill samples, specifically low-coverage shotgun sequencing (i.e. ‘genome-skimming’) and exome capture. Two full-length transcriptomes were generated and used to identify 166 putative gene targets for exome capture bait design. A total of 20 historical krill samples were sequenced using shotgun and exome capture. Mitochondrial and nuclear ribosomal sequences were assembled from both low-coverage shotgun data and off-target exome capture data, demonstrating that endogenous DNA sequences could be assembled from historical collections. Although mitochondrial and ribosomal sequences are variable across individuals from different populations, phylogenetic analysis does not identify any population structure. We find exome capture provides approximately 4500-fold enrichment of targeted genes, suggesting this approach can generate the sequencing depth required to call a significant number of variants. Unlocking historical collections for genomic analyses using exome capture will provide valuable insights into past and present biodiversity, resilience and adaptability of krill populations to climate change.
Exome capture of Antarctic krill (Euphausia superba) for cost effective genotyping and population genetics with historical collections. Molecular Ecology Resources 24(8). Published 2024-09-13. DOI: 10.1111/1755-0998.14022. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14022
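Fold enrichment in target capture is commonly expressed as the on-target fraction of sequenced bases divided by the fraction of the genome that the targets occupy. The sketch below assumes that definition; the 48 Gbp genome size comes from the abstract, while the 1 Mbp target size and the read counts are hypothetical values chosen only to show the arithmetic, and the authors' own calculation may differ.

```python
def fold_enrichment(on_target_bases, total_bases, target_bp, genome_bp):
    """Observed on-target fraction relative to the fraction expected by chance."""
    observed = on_target_bases / total_bases
    expected = target_bp / genome_bp
    return observed / expected


GENOME_BP = 48_000_000_000  # krill genome size stated in the abstract (48 Gbp)
TARGET_BP = 1_000_000       # hypothetical total length of the capture targets

# Hypothetical run in which 9.375% of sequenced bases fall on target.
enrichment = fold_enrichment(9_375, 100_000, TARGET_BP, GENOME_BP)
print(f"{enrichment:.0f}-fold enrichment")
```

Under these assumed numbers, random shotgun sequencing would place roughly one base in 48,000 on target, so an on-target fraction of a few percent already corresponds to an enrichment in the thousands.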
The origin of sociality represents one of the most important evolutionary transitions. Insect sociality evolved in some hemipteran aphids, which can produce soldiers and normal nymphs with distinct morphology and behaviour through parthenogenesis. The lack of genomic data resources has hindered investigations into the molecular mechanisms underlying their social evolution. Herein, we generated the first chromosomal-level genome of a social hemipteran (Pseudoregma bambucicola) with highly specialized soldiers and performed comparative genomic and transcriptomic analyses to elucidate the molecular signatures and regulatory mechanisms of caste differentiation. P. bambucicola has one of the larger known aphid genomes, at 582.2 Mb with an N50 length of 11.24 Mb, and about 99.6% of the assembly was anchored to six chromosomes with a scaffold N50 of 98.27 Mb. A total of 14,027 protein-coding genes were predicted and 37.33% of the assembly was identified as repeat sequences. The social evolution is accompanied by a variety of changes in genome organization, including expansion of gene families related to transcription factors and transposable elements, as well as species-specific expansions of certain sugar transporters and UGPases involved in carbohydrate metabolism. We also characterized large candidate gene sets linked to caste differentiation and found evidence of expression regulation and positive selection acting on energy metabolism and muscle structure, explaining the soldier-specific traits including morphological and behavioural specialization, developmental arrest and infertility. Overall, this study offers new insights into the molecular basis of social aphids and the evolution of insect sociality and also provides valuable data resources for further comparative and functional studies.
Genomic and transcriptomic analyses of a social hemipteran provide new insights into insect sociality. Hui Zhang, Qian Liu, Jianjun Lu, Liying Wu, Zhentao Cheng, Gexia Qiao, Xiaolei Huang. Molecular Ecology Resources 24(8). Published 2024-09-12. DOI: 10.1111/1755-0998.14019
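The N50 values quoted for the aphid assembly follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation is given below; the toy contig lengths are hypothetical and unrelated to the actual assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum past half is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


toy_contigs = [80, 70, 50, 40, 30, 20]  # hypothetical contig lengths
print(n50(toy_contigs))
```

The same function applies to contigs and scaffolds alike; only the input list of lengths changes.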
Jilda Alicia Caccavo, Larissa S. Arantes, Enrique Celemín, Susan Mbedi, Sarah Sparmann, Camila J. Mazzoni
Fish ear bones, known as otoliths, are often collected in fisheries to assist in management, and are a common sample type in museum and national archives. Beyond their utility for ageing, morphological and trace element analysis, otoliths are a repository of valuable genomic information. Previous work has shown that DNA can be extracted from the trace quantities of tissue remaining on the surface of otoliths, despite the fact that they are often stored dry at room temperature. However, much of this work has used reduced representation sequencing methods in clean lab conditions, to achieve adequate yields of DNA, libraries and ultimately single-nucleotide polymorphisms (SNPs). Here, we pioneer the use of small-scale (spike-in) sequencing to screen contemporary otolith samples prepared in regular molecular biology (in contrast to clean) laboratories for contamination and quality levels, submitting for whole-genome resequencing only samples above a defined endogenous DNA threshold. Despite the typically low quality and quantity of DNA extracted from otoliths, we are able to produce whole-genome libraries and ultimately sets of filtered, unlinked and even putatively adaptive SNPs of ample numbers for downstream uses in population, climate and conservation genomics. By comparing with a set of tissue samples from the same species, we are able to highlight the quality and efficacy of otolith samples from DNA extraction and library preparation, to bioinformatic preprocessing and SNP calling. We provide detailed schematics, protocols and scripts of our approach, such that it can be adopted widely by the community, improving the use of otoliths as a source of valuable genomic data.
Whole-genome resequencing improves the utility of otoliths as a critical source of DNA for fish stock research and monitoring. Molecular Ecology Resources 24(8). Published 2024-09-05. DOI: 10.1111/1755-0998.14013. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14013
Leveraging past allele frequencies has proven to be key for identifying the impact of natural selection across time. However, this approach suffers from imprecise estimations of the intensity (s) and timing (T) of selection, particularly when ancient samples are scarce in specific epochs. Here, we aimed to bypass the computation of allele frequencies across arbitrarily defined past epochs and refine the estimations of selection parameters by implementing convolutional neural network (CNN) algorithms that directly use ancient genotypes sampled across time. Using computer simulations, we first show that genotype-based CNNs consistently outperform an approximate Bayesian computation (ABC) approach based on past allele frequency trajectories, regardless of the selection model assumed and the number of available ancient genotypes. When applying this method to empirical data from modern and ancient Europeans, we replicated the reported increased number of selection events in post-Neolithic Europe, independently of the continental subregion studied. Furthermore, we substantially refined the ABC-based estimations of s and T for a set of positively and negatively selected variants, including iconic cases of positive selection and experimentally validated disease-risk variants. Our CNN predictions support a history of recent positive and negative selection targeting variants associated with host defence against pathogens, aligning with previous work that highlights the significant impact of infectious diseases, such as tuberculosis, in Europe. These findings collectively demonstrate that detecting the footprints of natural selection on ancient genomes is crucial for unravelling the history of severe human diseases.
Deep estimation of the intensity and timing of natural selection from ancient genomes. Guillaume Laval, Etienne Patin, Lluis Quintana-Murci, Gaspard Kerner. Molecular Ecology Resources 24(8). Published 2024-08-31. DOI: 10.1111/1755-0998.14015
Rapid environmental change poses unprecedented challenges to species persistence. To understand the extent of the impact that continued change could have, genomic offset methods have been used to forecast maladaptation of natural populations to future environmental change. However, while their use has become increasingly common, little is known regarding their predictive performance across a wide array of realistic and challenging scenarios. Here, we evaluate the performance of currently available offset methods (gradientForest, the Risk-Of-Non-Adaptedness, redundancy analysis with and without structure correction and LFMM2) using an extensive set of simulated data sets that vary demography, adaptive architecture and the number and spatial patterns of adaptive environments. For each data set, we train models using either all, adaptive or neutral marker sets and evaluate performance using in silico common gardens by correlating known fitness with projected offset. Using over 4,849,600 such evaluations, we find that (1) method performance is largely due to the degree of local adaptation across the metapopulation (LA), (2) adaptive marker sets provide minimal performance advantages, (3) performance within the species range is variable across gardens and declines when offset models are trained using additional non-adaptive environments and (4) despite (1), performance declines more rapidly in globally novel climates (i.e. a climate without an analogue within the species range) for metapopulations with greater LA than for those with lesser LA. We discuss the implications of these results for management, assisted gene flow and assisted migration.
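The evaluation step in the abstract above (correlating known fitness with projected offset in a common garden) can be illustrated with a minimal sketch. The abstract does not name the correlation statistic, so the Spearman rank correlation used here is an assumption, as are the function names and the fitness and offset values, which are invented for illustration and are not taken from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def offset_performance(fitness, offset):
    """Spearman correlation between known fitness and projected offset."""
    return pearson(ranks(fitness), ranks(offset))


# Hypothetical common garden: one value per source population.
fitness = [0.95, 0.80, 0.70, 0.40, 0.20]  # known (simulated) fitness in the garden
offset = [0.05, 0.10, 0.30, 0.45, 0.90]   # offset projected by a trained model

score = offset_performance(fitness, offset)
print(f"rank correlation between fitness and offset: {score:.2f}")
```

Under this convention a strongly negative value means that populations with a larger projected offset have lower fitness in the garden, which is the relationship a well-performing offset model is expected to show.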
{"title":"The accuracy of predicting maladaptation to new environments with genomic data.","authors":"Brandon M Lind, Katie E Lotterhos","doi":"10.1111/1755-0998.14008","DOIUrl":"https://doi.org/10.1111/1755-0998.14008","url":null,"abstract":"<p><p>Rapid environmental change poses unprecedented challenges to species persistence. To understand the extent that continued change could have, genomic offset methods have been used to forecast maladaptation of natural populations to future environmental change. However, while their use has become increasingly common, little is known regarding their predictive performance across a wide array of realistic and challenging scenarios. Here, we evaluate the performance of currently available offset methods (gradientForest, the Risk-Of-Non-Adaptedness, redundancy analysis with and without structure correction and LFMM2) using an extensive set of simulated data sets that vary demography, adaptive architecture and the number and spatial patterns of adaptive environments. For each data set, we train models using either all, adaptive or neutral marker sets and evaluate performance using in silico common gardens by correlating known fitness with projected offset. Using over 4,849,600 of such evaluations, we find that (1) method performance is largely due to the degree of local adaptation across the metapopulation (LA), (2) adaptive marker sets provide minimal performance advantages, (3) performance within the species range is variable across gardens and declines when offset models are trained using additional non-adaptive environments and (4) despite (1) performance declines more rapidly in globally novel climates (i.e. a climate without an analogue within the species range) for metapopulations with greater LA than lesser LA. 
We discuss the implications of these results for management, assisted gene flow and assisted migration.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14008"},"PeriodicalIF":5.5,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142102711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ngoc-Loi Nguyen, Joanna Pawłowska, Marek Zajaczkowski, Agnes K. M. Weiner, Tristan Cordier, Danielle M. Grant, Stijn De Schepper, Jan Pawłowski
Environmental DNA (eDNA) preserved in marine sediments is increasingly being used to study past ecosystems. However, little is known about how accurately marine biodiversity is recorded in sediment eDNA archives, especially for planktonic taxa. Here, we address this question by comparing eukaryotic diversity in 273 eDNA samples from three water depths and the surface sediments of 24 stations in the Nordic Seas. Analysis of 18S-V9 metabarcoding data reveals distinct eukaryotic assemblages in water and sediment eDNA. Only 40% of Amplicon Sequence Variants (ASVs) detected in water were also found in sediment eDNA. Remarkably, the ASVs shared between water and sediment accounted for 80% of total sequence reads, suggesting that a large amount of plankton DNA is transported to the seafloor, predominantly from abundant phytoplankton taxa. However, not all plankton taxa were equally archived on the seafloor. The plankton DNA deposited in the sediments was dominated by diatoms and showed an underrepresentation of certain nano- and picoplankton taxa (Picozoa or Prymnesiophyceae). Our study offers the first insights into the patterns of plankton diversity recorded in sediment in relation to seasonality and spatial variability of environmental conditions in the Nordic Seas. Our results suggest that the genetic composition and structure of the plankton community vary considerably throughout the water column and differ from what accumulates in the sediment. Hence, the interpretation of sedimentary eDNA archives should take into account potential taxonomic and abundance biases when reconstructing past changes in marine biodiversity.
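The contrast reported in the abstract above, where a minority of water-column ASVs also occur in sediment yet the shared ASVs carry most of the reads, arises when the shared ASVs are the abundant ones. The sketch below shows how the two percentages could be computed from per-compartment read-count tables. The function name, the table layout and the toy counts are assumptions made for illustration (the counts are chosen to mirror the reported pattern and are not the study's data), and "total sequence reads" is taken here to mean reads summed over both compartments.

```python
def sharing_summary(water, sediment):
    """Summarise ASV overlap between two read-count tables (ASV id -> reads).

    Returns (fraction of water ASVs also found in sediment,
             fraction of all reads that belong to shared ASVs).
    ASVs with zero reads are assumed to be absent from the tables.
    """
    shared = set(water) & set(sediment)
    frac_water_asvs_shared = len(shared) / len(water)
    total_reads = sum(water.values()) + sum(sediment.values())
    shared_reads = sum(water[asv] + sediment[asv] for asv in shared)
    return frac_water_asvs_shared, shared_reads / total_reads


# Toy read counts: two abundant ASVs are shared, the rare ones are not.
water = {"asv1": 650, "asv2": 50, "asv3": 60, "asv4": 20, "asv5": 20}
sediment = {"asv1": 80, "asv2": 20, "asv6": 100}

asv_frac, read_frac = sharing_summary(water, sediment)
print(f"{asv_frac:.0%} of water ASVs also occur in sediment")
print(f"shared ASVs carry {read_frac:.0%} of all reads")
```

Counting ASVs weights every variant equally, whereas counting reads weights variants by abundance, so the two summaries can diverge sharply when a few abundant taxa dominate the export to the seafloor.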
{"title":"Taxonomic and abundance biases affect the record of marine eukaryotic plankton communities in sediment DNA archives","authors":"Ngoc-Loi Nguyen, Joanna Pawłowska, Marek Zajaczkowski, Agnes K. M. Weiner, Tristan Cordier, Danielle M. Grant, Stijn De Schepper, Jan Pawłowski","doi":"10.1111/1755-0998.14014","DOIUrl":"10.1111/1755-0998.14014","url":null,"abstract":"<p>Environmental DNA (<i>e</i>DNA) preserved in marine sediments is increasingly being used to study past ecosystems. However, little is known about how accurately marine biodiversity is recorded in sediment <i>e</i>DNA archives, especially planktonic taxa. Here, we address this question by comparing eukaryotic diversity in 273 <i>e</i>DNA samples from three water depths and the surface sediments of 24 stations in the Nordic Seas. Analysis of 18S-V9 metabarcoding data reveals distinct eukaryotic assemblages between water and sediment <i>e</i>DNA. Only 40% of Amplicon Sequence Variants (ASVs) detected in water were also found in sediment <i>e</i>DNA. Remarkably, the ASVs shared between water and sediment accounted for 80% of total sequence reads suggesting that a large amount of plankton DNA is transported to the seafloor, predominantly from abundant phytoplankton taxa. However, not all plankton taxa were equally archived on the seafloor. The plankton DNA deposited in the sediments was dominated by diatoms and showed an underrepresentation of certain nano- and picoplankton taxa (Picozoa or Prymnesiophyceae). Our study offers the first insights into the patterns of plankton diversity recorded in sediment in relation to seasonality and spatial variability of environmental conditions in the Nordic Seas. Our results suggest that the genetic composition and structure of the plankton community vary considerably throughout the water column and differ from what accumulates in the sediment. 
Hence, the interpretation of sedimentary <i>e</i>DNA archives should take into account potential taxonomic and abundance biases when reconstructing past changes in marine biodiversity.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 8","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142071609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marine mammals play a fundamental role in the functioning of healthy marine ecosystems and are important indicator species. Studying their biology, distributions, behaviour and health is still technically and logistically demanding for researchers. However, the efforts and commitment have not been in vain, since we are witnessing constant and exponential advancement in the study of these animals, thanks to technological progress in numerous fields. These include the miniaturization and performance of biologger tags, which are equipped with sensors for measuring physiological parameters, hydrophones, accelerometers, time-depth records and spatial locations; the use of high-throughput 'Next Generation' Sequencing to gain genetic information about communities and individual species from nucleic acids in environmental samples at minuscule concentrations; and the possibility of monitoring species with autonomous aerial and underwater vehicles. In parallel, advances in computing and statistical modelling frameworks support the analysis of increasingly large and complex data sets. In this issue, O'Mahony et al. (2024) draw on at least two of these innovations: (a) the collection of biological material retrieved from large whales' blows using a modified drone and (b) the use of the samples to infer a wide spectrum of genetic information (both nuclear and mitochondrial) about the target animal/population. The methodology is not completely novel, but the study shows an impressive advancement in the amount of data obtained compared to preceding studies using the same approach. In the wake of these promising results, future perspectives are evaluated in relation to alternative sampling methodologies currently in use. It is possible to speculate that, in the next few years, the combination of non-invasive molecular profiling and enhanced drone technology (e.g. assembling increasingly smaller components, thus expanding capacity for autonomous operation) will open up perspectives that were unimaginable at the beginning of this millennium.
{"title":"The answer, my friend, is blowin’ in the wind: Blow sampling provides a new dimension to whale population monitoring","authors":"Elena Valsecchi","doi":"10.1111/1755-0998.14012","DOIUrl":"10.1111/1755-0998.14012","url":null,"abstract":"<p>Marine mammals play a fundamental role in the functioning of healthy marine ecosystems and are important indicator species. Studying their biology, distributions, behaviour and health are still technically and logistically demanding for researchers. However, the efforts and commitment have not been in vain, since we are witnessing constant and exponential advancement in the study of these animals, thanks to technological progress in numerous fields. These include miniaturization and performance of biologger tags, which are equipped with sensors for measuring physiological parameters, hydrophones, accelerometers, time-depth records and spatial locations; the use of high throughput ‘Next Generation’ Sequencing to gain genetic information about communities and individual species from nucleic acids in environmental samples at miniscule concentrations; through, to the possibility of monitoring species with autonomous aerial and underwater vehicles. In parallel advances in computing and statistical modelling frameworks support the analysis of increasingly large and complex data sets. In this issue, O'Mahony et al. (2024) draw from at least two of these innovations: (a) the collection of biological material retrieved from large whales' blows using a modified drone and (b) the use of the samples to infer a wide spectrum of genetic information (both nuclear and mitochondrial) about the target animal/population. The methodology is not completely novel, but the study shows an impressive advancement in the amount of data obtained compared to preceding studies using the same approach. In the wake of these promising results, future perspectives are evaluated in relation to alternative sampling methodologies currently in use. 
It is possible to speculate that, in the next few years, the combination of non-invasive molecular profiling and enhanced drone technology (e.g. assembling increasingly smaller components, thus expanding capacity for autonomous operation) will open up perspectives that were unimaginable at the beginning of this millennium.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 8","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.14012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142071610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}