Despite their critical roles in genetic sex determination, sex chromosomes remain unknown in many non-model organisms, especially those having recently evolved sex-linked regions (SLRs). These evolutionarily young and labile sex chromosomes are important for understanding early sex chromosome evolution but are difficult to identify due to the lack of Y/W degeneration and SLRs limited to small genomic regions. Here, we present SLRfinder, a method to identify candidate SLRs using linkage disequilibrium (LD) clustering, heterozygosity and genetic divergence. SLRfinder does not rely on specific sequencing methods or a specific type of reference genome (e.g., from the homomorphic sex). In addition, the input of SLRfinder does not require phenotypic sexes, which may be unknown from population sampling, but sex information can be incorporated and is necessary to validate candidate SLRs. We tested SLRfinder using various published datasets and compared it to the local principal component analysis (PCA) method and the depth-based method Sex Assignment Through Coverage (SATC). As expected, the local PCA method could not be used to identify unknown SLRs. SATC works better on conserved sex chromosomes, whereas SLRfinder outperforms SATC in analysing labile sex chromosomes, especially when SLRs harbour inversions. Power analyses showed that SLRfinder worked better when sampling more populations that share the same SLR. If analysing one population, a relatively larger sample size (around 50) is needed for sufficient statistical power to detect significant SLR candidates, although true SLRs are likely always top-ranked. SLRfinder provides a novel and complementary approach for identifying SLRs and uncovering additional sex chromosome diversity in nature.
SLRfinder: A method to detect candidate sex-linked regions with linkage disequilibrium clustering. Xueling Yi, Petri Kemppainen, Juha Merilä. Molecular Ecology Resources, published 2024-06-08. DOI: 10.1111/1755-0998.13985
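The SLRfinder abstract names two of its signals, LD clustering and heterozygosity, without showing how they combine. The following is a minimal Python sketch of that idea on toy data, not SLRfinder's actual implementation: the function names, the `min_r2` threshold and the single-linkage grouping are all assumptions made for illustration. Genotypes are coded as 0/1/2 alternate-allele dosages, one list per SNP.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (vx * vy)


def ld_clusters(genotypes, min_r2=0.8):
    """Group SNPs into connected components linked by r^2 >= min_r2.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    n = len(genotypes)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if r_squared(genotypes[i], genotypes[j]) >= min_r2:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


def heterozygosity(genotypes, snps):
    """Per-individual proportion of heterozygous calls across the given SNPs."""
    n_ind = len(genotypes[0])
    return [sum(genotypes[s][k] == 1 for s in snps) / len(snps)
            for k in range(n_ind)]


# Toy data: 5 SNPs x 6 individuals. SNPs 0-2 mimic an XY-type region in
# which the first three individuals are heterozygous and the rest homozygous.
genotypes = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 2, 1, 1, 0, 2],
    [2, 0, 1, 0, 2, 1],
]

largest = ld_clusters(genotypes)[0]
print(largest)
print(heterozygosity(genotypes, largest))
```

In this toy case the largest LD cluster splits individuals into a fully heterozygous group and a fully homozygous group, which is the pattern a candidate SLR would be expected to show. As the abstract notes, known phenotypic sexes would still be needed to validate such a candidate.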
Wen Chen, Nathaniel Newlands, Sarah Hambleton, André Laroche, Seyyed Mohammadreza Davoodi, Guus Bakkeren
In the face of evolving agricultural practices and climate change, we optimized tools for an integrated biovigilance platform to combat crop diseases: spore sampling, DNA diagnostics and predictive trajectory modelling. These tools revealed microbial dynamics and were validated by monitoring cereal rust fungal pathogens affecting wheat, oats, barley and rye across four growing seasons (2015–2018) in British Columbia and during the 2018 season in southern Alberta. ITS2 metabarcoding revealed disparity in aeromycobiota diversity and compositional structure across the Canadian Rocky Mountains, suggesting a barrier effect on air flow and pathogen dispersal. A novel bioinformatics classifier and curated cereal rust fungal ITS2 database, corroborated by real-time PCR, enhanced the precision of cereal rust fungal species identification. Random Forest modelling identified crop and land-use diversification as well as atmospheric pressure and moisture as key factors in rust distribution. As a valuable addition to explain observed differences and patterns in rust fungus distribution, trajectory HYSPLIT modelling tracked rust fungal urediniospores' northeastward dispersal from the Pacific Northwest towards southern British Columbia and Alberta, indicating multiple potential origins. Our Canadian case study exemplifies the power of an advanced biovigilance toolbox towards developing an early-warning system for farmers to detect and mitigate impending disease outbreaks.
Optimizing an integrated biovigilance toolbox to study the spatial distribution and dynamic changes of airborne mycobiota, with a focus on cereal rust fungi in western Canada. Wen Chen, Nathaniel Newlands, Sarah Hambleton, André Laroche, Seyyed Mohammadreza Davoodi, Guus Bakkeren. Molecular Ecology Resources, published 2024-06-06. DOI: 10.1111/1755-0998.13983
Metabarcoding-based methods for identification of host-associated eukaryotes have the potential to revolutionize parasitology and microbial ecology, yet significant technical challenges remain. In particular, highly abundant host reads can mask the presence of less-abundant target organisms, especially for sample types rich in host DNA (e.g., blood and tissues). Here, we present a new CRISPR-Cas9-mediated approach designed to reduce host signal by selective amplicon digestion, thus enriching clinical samples for eukaryotic endosymbiont sequences during metabarcoding. Our method achieves a nearly 76% increased efficiency in host signal reduction compared with no treatment and a nearly 60% increased efficiency in host signal reduction compared with the most commonly used published method. Furthermore, the application of our method to clinical samples allows for the detection of parasite infections that would otherwise have been missed.
CRISPR-Cas9-mediated host signal reduction for 18S metabarcoding of host-associated eukaryotes. Leah A. Owens, Mary I. Thurber, Tony L. Goldberg. Molecular Ecology Resources, published 2024-05-28. DOI: 10.1111/1755-0998.13980
Giovanni Madrigal, Bushra Fazal Minhas, Julian Catchen
The improvement and decreasing costs of third-generation sequencing technologies have widened the scope of biological questions researchers can address with de novo genome assemblies. With the increasing number of reference genomes, validating their integrity with minimal overhead is vital for establishing confident results in their applications. Here, we present Klumpy, a tool for detecting and visualizing both misassembled regions in a genome assembly and genetic elements (e.g. genes) of interest in a set of sequences. By leveraging the initial raw reads in combination with their respective genome assembly, we illustrate Klumpy's utility by investigating antifreeze glycoprotein (afgp) loci across two icefishes, by searching for a reportedly absent gene in the northern snakehead fish, and by scanning the reference genomes of a mudskipper and bumblebee for misassembled regions. In the two former cases, we were able to provide support for the noncanonical placement of an afgp locus in the icefishes and locate the missing snakehead gene. Furthermore, our genome scans were able to identify an unmappable locus in the mudskipper reference genome and a putative repetitive element shared among several species of bees.
Klumpy: A tool to evaluate the integrity of long-read genome assemblies and illusive sequence motifs. Giovanni Madrigal, Bushra Fazal Minhas, Julian Catchen. Molecular Ecology Resources, published 2024-05-27. DOI: 10.1111/1755-0998.13982
Juan Li, Gabriela M. Ulloa, Pedro Mayor, Meddly L. Santolalla Robles, Alex D. Greenwood
Collecting and preserving biological samples in the field, particularly in remote areas in tropical forests, prior to laboratory analysis is challenging. Blood samples in many cases are used for nucleic acid-based species determination, genomics or pathogen research. In most cases, maintaining a cold chain is impossible and samples remain at ambient temperature for extended periods of time before controlled storage conditions become available. Dried blood spot (DBS) storage, blood stored on cellulose-based paper, has been widely applied to facilitate sample collection and preservation in the field for decades. However, it is unclear how long-term storage on this substrate affects nucleic acid concentration and integrity. We analysed nucleic acid quality from DBS stored on Whatman filter paper no. 3 and FTA cards for up to 15 years in comparison to cold-chain stored samples using four nucleic acid extraction methods. We examined the ability to identify viral sequences from samples of 12 free-ranging primates in the Amazon forest, using targeted hybridization capture, and determined if mitochondrial genomes could be retrieved. The results suggest that even after extended periods of storage, DBS will be suitable for some genomic applications but may be of limited use for viral pathogen research, particularly RNA viruses.
Nucleic acid degradation after long-term dried blood spot storage. Juan Li, Gabriela M. Ulloa, Pedro Mayor, Meddly L. Santolalla Robles, Alex D. Greenwood. Molecular Ecology Resources, published 2024-05-23. DOI: 10.1111/1755-0998.13979
Admixture is a common biological phenomenon among populations of the same or different species. Identifying admixed tracts within individual genomes can provide valuable information to date admixture events, reconstruct ancestry-specific demographic histories, or detect adaptive introgression, genetic incompatibilities, as well as regions of the genomes affected by (associative-) overdominance. Although many local ancestry inference (LAI) methods have been developed in the last decade, their performance was assessed using large reference panels, which are rarely available for non-model organisms or ancient samples. Moreover, the demographic conditions for which LAI becomes unreliable have not been explicitly outlined. Here, we identify the demographic conditions for which local ancestries can be best estimated using very small reference panels. Furthermore, we compare the performance of two LAI methods (RFMix and MOSAIC) with the performance of a newly developed approach (simpLAI) that can be used even when reference populations consist of single individuals. Based on simulations of various demographic models, we also determine the limits of these LAI tools and propose post-painting filtering steps to reduce false-positive rates and improve the precision and accuracy of the inferred admixed tracts. Besides providing a guide for using LAI, our work shows that reasonable inferences can be obtained from a single diploid genome per reference under demographic conditions that are not uncommon among past human groups and non-model organisms.
Assessing the limits of local ancestry inference from small reference panels. Sandra Oliveira, Nina Marchi, Laurent Excoffier. Molecular Ecology Resources, published 2024-05-22. DOI: 10.1111/1755-0998.13981
Alex J. Veglia, Ramón E. Rivera-Vicéns, Carsten G. B. Grupstra, Lauren I. Howe-Kerr, Adrienne M. S. Correa
Amplicon sequencing is an effective and increasingly applied method for studying viral communities in the environment. Here, we present vAMPirus, a user-friendly, comprehensive, and versatile DNA and RNA virus amplicon sequence analysis program, designed to support investigators in exploring virus amplicon sequencing data and running informed, reproducible analyses. vAMPirus takes in raw virus amplicon libraries and, by default, performs nucleotide- and amino acid-based analyses to produce results such as sequence abundance information, taxonomic classifications, phylogenies and community diversity metrics. The vAMPirus analytical framework leverages 16 different open-source tools and provides optional approaches that can increase the ratio of biological signal-to-noise and thereby reveal patterns that would have otherwise been masked. Here, we validate the vAMPirus analytical framework and illustrate its implementation as a general virus amplicon sequencing workflow by recapitulating findings from two previously published double-stranded DNA virus datasets. As a case study, we also apply the program to explore the diversity and distribution of a coral reef-associated RNA virus. vAMPirus is streamlined within Nextflow, offering straightforward scalability, standardization and communication of virus lineage-specific analyses. The vAMPirus framework is designed to be adaptable; community-driven analytical standards will continue to be incorporated as the field advances. vAMPirus supports researchers in revealing patterns of virus diversity and population dynamics in nature, while promoting study reproducibility and comparability.
vAMPirus: A versatile amplicon processing and analysis program for studying viruses. Alex J. Veglia, Ramón E. Rivera-Vicéns, Carsten G. B. Grupstra, Lauren I. Howe-Kerr, Adrienne M. S. Correa. Molecular Ecology Resources, published 2024-05-22. DOI: 10.1111/1755-0998.13978
Lewis G. Spurgin, Mirte Bosse, Frank Adriaensen, Tamer Albayrak, Christos Barboutis, Eduardo Belda, Andrey Bushuev, Jacopo G. Cecere, Anne Charmantier, Mariusz Cichon, Niels J. Dingemanse, Blandine Doligez, Tapio Eeva, Kjell Einar Erikstad, Vyacheslav Fedorov, Matteo Griggio, Dieter Heylen, Sabine Hille, Camilla A. Hinde, Elena Ivankina, Bart Kempenaers, Anvar Kerimov, Milos Krist, Laura Kvist, Veronika N. Laine, Raivo Mänd, Erik Matthysen, Ruedi Nager, Boris P. Nikolov, Ana Claudia Norte, Markku Orell, Jenny Ouyang, Gergana Petrova-Dinkova, Heinz Richner, Diego Rubolini, Tore Slagsvold, Vallo Tilgar, János Török, Barbara Tschirren, Csongor I. Vágási, Teru Yuta, Martien A. M. Groenen, Marcel E. Visser, Kees van Oers, Ben C. Sheldon, Jon Slate
A major aim of evolutionary biology is to understand why patterns of genomic diversity vary within taxa and space. Large-scale genomic studies of widespread species are useful for studying how environment and demography shape patterns of genomic divergence. Here, we describe one of the most geographically comprehensive surveys of genomic variation in a wild vertebrate to date: the great tit (Parus major) HapMap project. We screened ca. 500,000 SNP markers across 647 individuals from 29 populations, spanning ~30 degrees of latitude and 40 degrees of longitude – almost the entire geographical range of the European subspecies. Genome-wide variation was consistent with a recent colonisation across Europe from a South-East European refugium, with bottlenecks and reduced genetic diversity in island populations. Differentiation across the genome was highly heterogeneous, with clear ‘islands of differentiation’, even among populations with very low levels of genome-wide differentiation. Low local recombination rates were a strong predictor of high local genomic differentiation (FST), especially in island and peripheral mainland populations, suggesting that the interplay between genetic drift and recombination causes highly heterogeneous differentiation landscapes. We also detected genomic outlier regions that were confined to one or more peripheral great tit populations, probably as a result of recent directional selection at the species' range edges. Haplotype-based measures of selection were related to recombination rate, albeit less strongly, and highlighted population-specific sweeps that likely resulted from positive selection. Our study highlights how comprehensive screens of genomic variation in wild organisms can provide unique insights into spatio-temporal evolutionary dynamics.
The great tit HapMap project: A continental-scale analysis of genomic variation in a songbird. Lewis G. Spurgin, Mirte Bosse, Frank Adriaensen, Tamer Albayrak, Christos Barboutis, Eduardo Belda, Andrey Bushuev, Jacopo G. Cecere, Anne Charmantier, Mariusz Cichon, Niels J. Dingemanse, Blandine Doligez, Tapio Eeva, Kjell Einar Erikstad, Vyacheslav Fedorov, Matteo Griggio, Dieter Heylen, Sabine Hille, Camilla A. Hinde, Elena Ivankina, Bart Kempenaers, Anvar Kerimov, Milos Krist, Laura Kvist, Veronika N. Laine, Raivo Mänd, Erik Matthysen, Ruedi Nager, Boris P. Nikolov, Ana Claudia Norte, Markku Orell, Jenny Ouyang, Gergana Petrova-Dinkova, Heinz Richner, Diego Rubolini, Tore Slagsvold, Vallo Tilgar, János Török, Barbara Tschirren, Csongor I. Vágási, Teru Yuta, Martien A. M. Groenen, Marcel E. Visser, Kees van Oers, Ben C. Sheldon, Jon Slate. Molecular Ecology Resources, published 2024-05-15. DOI: 10.1111/1755-0998.13969
Eléonore Charrier, Rebecca Chen, Noelle Thundathil, John S. Gilleard
The ITS-2-rRNA has been particularly useful for nematode metabarcoding but does not resolve all phylogenetic relationships, and reference sequences are not available for many nematode species. This is a particular issue when metabarcoding complex communities such as wildlife parasites or terrestrial and aquatic free-living nematode communities. We have used markerDB to produce four databases of distinct regions of the rRNA cistron: the 18S rRNA gene, the 28S rRNA gene, the ITS-1 intergenic spacer and the region spanning ITS-1_5.8S_ITS-2. These databases comprise 2645, 254, 13,461 and 10,107 unique full-length sequences representing 1391, 204, 1837 and 1322 nematode species, respectively. A comparative analysis illustrates the complementary value of these databases but also reveals a better representation of Clades III, IV and V than of Clades I and II in each case. Although the ITS-1 database includes the largest number of unique full-length sequences, the 18S rRNA database provides the widest taxonomic coverage. We also developed PrimerTC, a tool to assess primer sequence conservation across any reference sequence database, and have applied it to evaluate a large number of previously published rRNA cistron primers. We identified sets of primers that currently provide the broadest taxonomic coverage for each rRNA marker across the nematode phylum. These new resources will facilitate more comprehensive metabarcoding of nematode communities using either short-read or long-read sequencing platforms. Further, PrimerTC is available as a simple WebApp to guide or assess PCR primer design for any genetic marker and/or taxonomic group beyond the nematode phylum.
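As an illustration of what assessing primer conservation across a reference database can involve, here is a minimal sketch that scores a primer by the fraction of reference sequences containing a site within a mismatch tolerance. It is not PrimerTC's actual algorithm, which the abstract does not describe. The function names, the `max_mismatches` parameter and the forward-strand-only search are assumptions made for this example.

```python
from typing import Dict, Optional

# IUPAC nucleotide codes: each primer symbol maps to the bases it accepts.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def min_mismatches(primer: str, seq: str) -> Optional[int]:
    """Fewest mismatches between the primer and any same-length window of seq.

    Returns None when the sequence is shorter than the primer.
    Only the given strand is searched (hypothetical simplification).
    """
    primer, seq = primer.upper(), seq.upper()
    k = len(primer)
    if len(seq) < k:
        return None
    best = k
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mm = sum(1 for p, s in zip(primer, window) if s not in IUPAC.get(p, ""))
        best = min(best, mm)
        if best == 0:
            break  # an exact site cannot be improved on
    return best


def coverage(primer: str, references: Dict[str, str], max_mismatches: int = 1) -> float:
    """Fraction of reference sequences with a site within the mismatch tolerance."""
    if not references:
        return 0.0
    hits = 0
    for seq in references.values():
        mm = min_mismatches(primer, seq)
        if mm is not None and mm <= max_mismatches:
            hits += 1
    return hits / len(references)
```

For a reverse primer, one would pass its reverse complement (or search the reverse-complemented references). Grouping `references` by taxon before calling `coverage` would give per-clade coverage rather than a single database-wide figure.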
{"title":"A set of nematode rRNA cistron databases and a primer assessment tool to enable more flexible and comprehensive metabarcoding","authors":"Eléonore Charrier, Rebecca Chen, Noelle Thundathil, John S. Gilleard","doi":"10.1111/1755-0998.13965","journal":{"name":"Molecular Ecology Resources"},"PeriodicalIF":7.7,"publicationDate":"2024-05-11","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13965"}