Cassandra Elphinstone, Rob Elphinstone, Marco Todesco, Loren H Rieseberg
Tandem repeats play an important role in centromere structure, subtelomeric regions, DNA methylation, recombination and the regulation of gene activity. Analysis of their distribution in genomes offers a potential means for predicting putative centromere locations, which continues to be a challenge for genome annotation. Here we present RepeatOBserver (https://github.com/celphin/RepeatOBserverV1), a new tool for visualising repeat patterns and identifying putative centromere locations, using a Fourier transform of DNA walks. RepeatOBserver can identify and visualise a broad range of perfect and imperfect repeats (3-5000 bp long) in genome assemblies without any a priori knowledge of repeat sequences or the need for optimising parameters. RepeatOBserver heatmaps can distinguish between tandem and retrotransposon repeats. We analysed 159 chromosomes with experimentally-verified centromere positions from 12 plant and animal species. We find that 93% of experimentally-verified tandem repeat centromeres occur in regions of low sequence diversity and 97% of retrotransposon centromeres occur in regions with a high abundance of repeat lengths. Depending on the centromere type predicted by the heatmaps, putative centromere locations can be predicted using either a genomic Shannon diversity index or a repeat abundance sum. RepeatOBserver can also locate other regions of interest including potential neocentromeres and gene copy variation. Split and inverted tandem repeats at inversion boundaries suggest that chromosomal inversions or mis-assemblies can also be located. RepeatOBserver is a flexible tool for comprehensive characterisation of repeat patterns that can be used to visualise and identify a variety of regions of interest in genome assemblies.
RepeatOBserver: Tandem Repeat Visualisation and Putative Centromere Detection. Molecular Ecology Resources, e14084. Published 2025-03-04. doi:10.1111/1755-0998.14084
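The idea of taking a Fourier transform of a DNA walk, as mentioned in the RepeatOBserver abstract, can be illustrated with a small sketch. This is not the RepeatOBserver implementation: the step rule (purines +1, pyrimidines -1), the function names and the naive discrete Fourier transform are all assumptions chosen for illustration. It shows how a perfect tandem repeat produces a spectral peak at the repeat period.

```python
import cmath

# Assumed step rule (not taken from the paper): purines step up, pyrimidines step down.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(seq):
    """Cumulative sum of +1/-1 steps along the sequence."""
    walk, total = [], 0
    for base in seq.upper():
        total += STEP.get(base, 0)  # unknown bases (e.g. N) contribute no step
        walk.append(total)
    return walk


def power_spectrum(signal):
    """Naive O(n^2) discrete Fourier transform; returns power for k = 0..n//2."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the constant offset
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(centred)
        )
        powers.append(abs(coeff) ** 2)
    return powers


def dominant_period(seq):
    """Period (in bp) of the strongest non-zero frequency in the walk."""
    powers = power_spectrum(dna_walk(seq))
    k = max(range(1, len(powers)), key=powers.__getitem__)
    return len(seq) / k


# A hypothetical 4 bp unit repeated 16 times gives a walk that oscillates with period 4.
print(dominant_period("AACC" * 16))
```

The example unit has no net drift, so the walk is exactly periodic. A real sequence with unequal purine and pyrimidine counts would need the linear trend removed before the transform, and a fast Fourier transform would replace the naive loop for anything beyond toy input.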
Fiifi Agyabeng-Dadzie, Megan S Beaudry, Alex Deyanov, Haley Slanis, Minh Q Duong, Randi Turner, Asis Khan, Cesar A Arias, Jessica C Kissinger, Travis C Glenn, Rodrigo de Paula Baptista
Multiple displacement amplification (MDA) outperforms conventional PCR in long fragment and whole-genome amplification, making it attractive to couple MDA with long-read sequencing of samples with limited quantities of DNA to obtain improved genome assemblies. Here, we explore the efficacy and limits of MDA for efficient low-cost genome sequence assembly using Oxford Nanopore Technologies (ONT) rapid library preparations and MinION sequencing. We successfully generated almost complete genome sequences for all organisms examined, including Gram-positive (Staphylococcus aureus, Enterococcus faecium) and Gram-negative (Escherichia coli) prokaryotes and one challenging eukaryotic pathogen (Cryptosporidium spp.), representing a broad spectrum of critical infectious disease pathogens. High-quality data from those samples were generated starting with only 0.025 ng of total DNA. Controlled sheared DNA samples exhibited a distinct pattern of size increase after MDA, which may be associated with the amplification of long, low-abundance fragments present in the assay, as well as with the generation of concatemeric sequences during amplification. To address concatemers, we developed a computational pipeline (CADECT: Concatemer Detection Tool) to identify and remove putative concatemeric sequences. This study highlights the efficacy of MDA in generating high-quality genome assemblies from limited amounts of input DNA. In addition, the CADECT pipeline effectively mitigated the impact of concatemeric sequences, enabling the assembly of contiguous sequences even in cases where the input genomic DNA was degraded. These results have significant implications for the study of organisms that are challenging to culture in vitro, such as Cryptosporidium, and for expediting critical results in clinical settings with limited quantities of available genomic DNA.
Evaluating the Benefits and Limits of Multiple Displacement Amplification With Whole-Genome Oxford Nanopore Sequencing. Molecular Ecology Resources, e14094. Published 2025-02-28. doi:10.1111/1755-0998.14094
Jennifer M Standley, Jose Marcelino, Fahong Yu, James D Ellis
Royal jelly (RJ) is a glandular secretion fed to developing honey bee larvae by adult worker bees. It is also a potential source of disease transmission in and between honey bee colonies. We endeavored to characterize the microbiome, virome, and other biota present in RJ via an integrated meta-omics approach. Using a magnetic bead-based extraction protocol, we identified eDNA and eRNA fragments from organisms of interest in RJ using high-throughput metagenomics (DNA-seq), metatranscriptomics (total RNA-seq), and parallel sequencing. This allowed us to enhance the detection of Operational Taxonomic Units (OTUs) undetectable by standard 'omics or amplicon protocols. Using this integrated approach, we detected OTUs present in RJ from honey bee pests and pathogens, including Melissococcus plutonius, Paenibacillus larvae, Varroa destructor, V. jacobsoni, Aethina tumida, Galleria mellonella, Vairimorpha ceranae, Apis mellifera filamentous virus, Black queen cell virus, Acute bee paralysis virus, Sacbrood virus, Deformed wing virus, Israeli acute bee paralysis virus, Kashmir bee virus, and Slow bee paralysis virus, as well as multiple beneficial gut bacteria from the genera Lactobacillus, Actinobacteria, and Gluconobacter. The presence of DNA and RNA from these organisms does not conclusively indicate the presence of live organisms in the RJ, but it does suggest some exposure of the RJ to these organisms. The results present a comprehensive eDNA and eRNA microbial profile of RJ, demonstrating that our novel method is an effective and sensitive molecular tool for high-resolution metagenomic and metatranscriptomic profiling, and is of value for detection of pathogens of concern for the beekeeping industry.
A Meta-Omics Approach Using eDNA and eRNA for the Assessment of Biotic Communities Associated With Royal Jelly Produced by the Western Honey Bee (Apis mellifera L.). Molecular Ecology Resources, e14090. Published 2025-02-27. doi:10.1111/1755-0998.14090
Thilina S Nimalrathna, Huan Fan, Ahimsa Campos-Arceiz, Akihiro Nakamura
Fungi play crucial ecological and economic roles, yet their diversity and distribution remain poorly known and challenging to assess. Using recent advances in invertebrate-derived DNA (iDNA) for biodiversity monitoring, we investigated the potential of dung beetle iDNA for fungal sampling and monitoring. We sampled two habitats (rainforest vs. rubber plantation) and two seasons (dry vs. rainy) in tropical Xishuangbanna, southwest China. We extracted, amplified and identified 9259 unique fungal Amplicon Sequence Variants (ASVs) from the guts of three species of dung beetles (Paragymnopleurus sp., telecoprids; Onthophagus diabolicus, paracoprids; and Onthophagus cf. gracilipes, endocoprids). Fungal community composition differed across habitats and seasons, with the highest diversity found in the rainforest during the rainy season. Our results were consistent with previous eDNA studies of soil samples in the detection of habitat differences (both approaches were able to detect low diversity of Basidiomycota in rubber plantations). However, our approach outperformed soil-based eDNA studies in being able to detect fungal occurrences associated with seasonal precipitation patterns. Our findings highlight the utility of dung beetle iDNA to uncover spatiotemporal dynamics of fungal communities across different habitats. The use of iDNA broadens fungal biodiversity research, strengthens fungal monitoring to assess anthropogenic impacts and presents opportunities to conserve fungal diversity.
Dung Beetle iDNA Provides an Effective Way to Detect Diverse Mycological Communities. Molecular Ecology Resources, e14091. Published 2025-02-26. doi:10.1111/1755-0998.14091
The StairwayPlot approach provides an elegant, flexible and powerful method to estimate complex demographic histories of single populations from site frequency spectrum data. It uses expected coalescent times to compute the expected site frequency spectrum within a multinomial likelihood function. Population sizes are allowed to vary freely between coalescent events but are constant within each interval. Here, we implement the StairwayPlot approach in the Bayesian software package RevBayes. We use approaches developed for Bayesian Skyline Plots, which include independent and identically distributed (i.i.d.) population sizes, Gaussian Markov random fields and Horseshoe Markov random fields as prior distributions on population sizes. Furthermore, we implement a recently developed approach for computing the leave-one-out cross-validation probability for efficient model selection. We compare inference from our Bayesian implementation to the original Maximum Likelihood implementation, StairwayPlot2. Our results show that our Bayesian implementation in RevBayes performs comparably to StairwayPlot2 in terms of parameter accuracy, which is expected given that both use the same underlying likelihood function. From our set of prior models, the Gaussian Markov random field prior performed best for smoothly varying demographic histories, while the Horseshoe Markov random field prior performed best for abruptly changing demographic histories. We conclude the study by exploring several choices often faced in empirical studies, including the estimate of the total sequence length, the assumed mutation rate, and biases introduced by mis-calling ancestral alleles. We show, using our empirical example, that as few as 10 diploid individuals are sufficient to infer complex demographic histories, but at least 500,000 single nucleotide polymorphisms (SNPs) are required.
Bayesian StairwayPlot for Inferring Single Population Demographic Histories From Site Frequency Spectra. Sebastian Höhna, Ana Catalán. Molecular Ecology Resources, e14087. Published 2025-02-26. doi:10.1111/1755-0998.14087
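The multinomial likelihood over the site frequency spectrum (SFS) described in the StairwayPlot abstract can be sketched for the simplest case. The code below is an illustration only and is not the RevBayes or StairwayPlot2 implementation. It assumes a constant population size, for which the expected unfolded SFS is proportional to 1/i for derived-allele count i, rather than the general model in which population size changes between coalescent events. The function names and the toy counts are hypothetical.

```python
import math


def expected_sfs_constant_size(n_chromosomes):
    """Expected unfolded SFS proportions for a constant-size population.

    Frequency class i (i = 1 .. n-1) has expected weight proportional to 1/i.
    """
    weights = [1.0 / i for i in range(1, n_chromosomes)]
    total = sum(weights)
    return [w / total for w in weights]


def multinomial_log_likelihood(observed, expected):
    """Multinomial log-likelihood, omitting the model-independent constant."""
    return sum(
        count * math.log(p)
        for count, p in zip(observed, expected)
        if count > 0
    )


# Toy data for 4 sampled chromosomes: counts of sites with 1, 2 and 3 derived copies.
observed = [6, 3, 2]
constant = expected_sfs_constant_size(4)
flat = [1 / 3] * 3  # an arbitrary alternative with equal class probabilities

print(multinomial_log_likelihood(observed, constant))
print(multinomial_log_likelihood(observed, flat))
```

In the full approach, the expected proportions would instead be computed from expected coalescent times under a piecewise-constant population size history, and the same multinomial form would be used to compare candidate histories.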
Both environmental DNA (eDNA) and environmental RNA (eRNA) have been widely adopted for biodiversity assessment. While eDNA often persists longer in environments, eRNA offers a more current view of biological activities. In eRNA metabarcoding, extracted eRNA is reverse transcribed into complementary DNA (cDNA) for metabarcoding. However, the efficacy of various reverse transcription strategies has not been evaluated. Here we compared the biodiversity recovery efficiency of three strategies: random priming with hexamers, oligo(dT) priming and taxa-specific priming using MiFish-U for fish in both high- and low-biodiversity regions. Our results demonstrate that reverse transcription strategies significantly impact biodiversity recovery. Random priming consistently detected the highest number of taxa in both low- and high-biodiversity regions. In low-biodiversity areas, oligo(dT) performed comparably to random hexamers; however, in high-biodiversity regions, random hexamers outperformed oligo(dT), particularly in recovering rare taxa. While taxa-specific priming was comparable to the other strategies for high-abundance taxa, it was less effective for rare taxa, thus limiting its utility for comprehensive biodiversity assessment. These differences are largely due to the multiple binding sites available to random hexamers, compared with the fewer or absent sites for oligo(dT) and taxa-specific primers under high eRNA degradation. Combining random hexamers and oligo(dT) significantly improved taxa recovery, especially for low-abundance species, supporting this combination as best practice for eukaryotes. For prokaryotes or genes lacking polyadenylation, random priming is favoured over taxa- or gene-specific priming. Collectively, these findings underscore the critical importance of selecting appropriate reverse transcription strategies in eRNA metabarcoding, with significant implications for effective biodiversity monitoring and conservation efforts.
Selecting Competent Reverse Transcription Strategies to Maximise Biodiversity Recovery With eRNA Metabarcoding. Fuwen Wang, Wei Xiong, Xuena Huang, Aibin Zhan. Molecular Ecology Resources, e14092. Published 2025-02-22. doi:10.1111/1755-0998.14092
Yves Bawin, Beyene Zewdie, Biruk Ayalew, Isabel Roldán-Ruiz, Steven B Janssens, Ayco J M Tack, Sileshi Nemomissa, Kassahun Tesfaye, Kristoffer Hylander, Olivier Honnay, Tom Ruttink
Cultivation of crops close to their wild relatives may jeopardise the integrity of wild genetic resources. Detecting cultivars among wild plants is necessary to characterise crop-wild gene flow, but can be challenging if cultivars and wild plants are phenotypically highly similar. Genomics tools can be used instead, but the selection of diagnostic loci for cultivar identification can be difficult if the wild and cultivated genepools are closely related. In Ethiopia, Arabica coffee cultivars resistant to coffee berry disease (CBD) occur near wild Coffea arabica plants and local landraces. However, the abundance and distribution of these cultivars across coffee sites remain unclear. Here, we present a new module of the SMAP package called SMAP relatedness pairwise to characterise pairwise genetic relationships between individuals based on haplotype calls and to identify diagnostic loci that distinguish (sets of) individuals from each other. Next, we estimate the relative abundance of CBD-resistant cultivars across 60 Ethiopian Arabica coffee sites using a genome-wide fingerprinting approach. We confirm the presence of these cultivars in around 75% of the coffee sites, with high agreement between a field survey and our DNA fingerprinting approach. At least 20 out of 60 sites with supposedly wild C. arabica individuals contain signatures of the cultivated genepool. Overall, we conclude that CBD-resistant cultivars are widespread in Ethiopian coffee sites. The development of SMAP relatedness pairwise opens opportunities to assess the distribution of coffee cultivars in other regions in Ethiopia and to apply similar screenings near wild relatives of other crops.
A Molecular Survey of the Occurrence of Coffee Berry Disease Resistant Coffee Cultivars Near the Wild Gene Pool of Arabica Coffee in Its Region of Origin in Southwest Ethiopia. Molecular Ecology Resources, e14085. Published 2025-02-21. doi:10.1111/1755-0998.14085
Luke E Holman, Giulia Zampirolo, Richard Gyllencreutz, James Scourse, Tobias Frøslev, Christian Carøe, Shyam Gopalakrishnan, Mikkel Winther Pedersen, Kristine Bohmann
The condition of ancient marine ecosystems provides context for contemporary biodiversity changes in human-impacted oceans. Sequencing sedimentary ancient DNA (sedaDNA) is an emerging method for generating high-resolution biodiversity time-series data, offering insights into past ecosystems. However, few studies directly compare the two predominant sedaDNA sequencing approaches: metabarcoding and shotgun-metagenomics, and it remains unclear if these methodological differences affect diversity metrics. We compared these methods using sedaDNA from an archived marine sediment record sampled in the Skagerrak, North Sea, spanning almost 8000 years. We performed metabarcoding of a eukaryotic 18S rRNA region (V9) and sequenced 153-229 million metagenomic reads per sample. Our results show limited overlap between metabarcoding and metagenomics, with only three metazoan genera detected by both methods. For overlapping taxa, metabarcoding detections became inconsistent for samples older than 2000 years, while metagenomics detected taxa throughout the time series. We observed divergent patterns of alpha diversity, with metagenomics indicating decreased richness towards the present and metabarcoding showing an increase. However, beta diversity patterns were similar between methods, with discrepancies only in metazoan data comparisons. Our findings demonstrate that the choice of sequencing method significantly impacts detected biodiversity in an ancient marine sediment record. While we stress that studies with limited variation in DNA degradation among samples may not be strongly affected, researchers should rule out methodological explanations for observed biodiversity changes in marine sediment cores, particularly when considering alpha diversity, before making ecological interpretations.
{"title":"Navigating Past Oceans: Comparing Metabarcoding and Metagenomics of Marine Ancient Sediment Environmental DNA.","authors":"Luke E Holman, Giulia Zampirolo, Richard Gyllencreutz, James Scourse, Tobias Frøslev, Christian Carøe, Shyam Gopalakrishnan, Mikkel Winther Pedersen, Kristine Bohmann","doi":"10.1111/1755-0998.14086","DOIUrl":"https://doi.org/10.1111/1755-0998.14086","url":null,"abstract":"<p><p>The condition of ancient marine ecosystems provides context for contemporary biodiversity changes in human-impacted oceans. Sequencing sedimentary ancient DNA (sedaDNA) is an emerging method for generating high-resolution biodiversity time-series data, offering insights into past ecosystems. However, few studies directly compare the two predominant sedaDNA sequencing approaches: metabarcoding and shotgun-metagenomics, and it remains unclear if these methodological differences affect diversity metrics. We compared these methods using sedaDNA from an archived marine sediment record sampled in the Skagerrak, North Sea, spanning almost 8000 years. We performed metabarcoding of a eukaryotic 18S rRNA region (V9) and sequenced 153-229 million metagenomic reads per sample. Our results show limited overlap between metabarcoding and metagenomics, with only three metazoan genera detected by both methods. For overlapping taxa, metabarcoding detections became inconsistent for samples older than 2000 years, while metagenomics detected taxa throughout the time series. We observed divergent patterns of alpha diversity, with metagenomics indicating decreased richness towards the present and metabarcoding showing an increase. However, beta diversity patterns were similar between methods, with discrepancies only in metazoan data comparisons. Our findings demonstrate that the choice of sequencing method significantly impacts detected biodiversity in an ancient marine sediment record. 
While we stress that studies with limited variation in DNA degradation among samples may not be strongly affected, researchers should rule out methodological explanations for observed biodiversity changes in marine sediment cores, particularly when considering alpha diversity, before making ecological interpretations.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14086"},"PeriodicalIF":5.5,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143466409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raquel Lima-Cordón, Jason Travis Mohabir, Mohini Sooklall, Aina Martinez Zurita, Meg Shieh, Cheyenne Knox, Sabrina Gobran, Zachary Johnson, Margaret Laws, Ruchit Panchal, Reza Niles-Robin, Horace Cox, Maria Eugenia Grillet, Jorge E Moreno, Socrates Herrera, Martha Quinones, Angela M Early, Jacob A Tennessen, Daniel E Neafsey
Vector control remains an important strategy worldwide to prevent human infection with pathogens transmitted by arthropods. Vector control strategies rely on accurate identification of vector taxa along with vector-specific biological indicators such as feeding ecology, infection prevalence and insecticide resistance. Multiple 'DNA barcoding' protocols have been published over the past several decades to support these applications, generally relying on informal manual approaches such as BLAST to assign taxonomic identity to the resulting sequences. We present a standardised informatic pipeline for analysis of DNA barcoding data from dipteran vectors, VecTreeID, that uses short-read amplicon sequencing (AmpSeq) coupled with sequence similarity assessment (BLAST) and an evolutionary placement algorithm (EPA-ng) to achieve vector taxonomic identification, capture bionomic features (blood and plant meal sources), determine Plasmodium infection status (for anopheline mosquitoes) and detect target-site insecticide resistance mutations. The VecTreeID pipeline provides uncertainty in assignment through identifications at varying levels of taxonomic rank, a feature missing from many approaches to DNA barcoding, but important given gaps and labelling problems in public sequence databases. We validated an Illumina-based implementation of VecTreeID on laboratory and field samples, and found that the blood meal amplicons can detect vertebrate DNA sequences up to 36 h post-feeding, and that short-read sequencing data are capable of sensitively detecting minor sequences in DNA mixtures representing multi-species blood or nectar meals. This high-throughput VecTreeID approach empowers researchers and public health professionals to survey and control arthropod disease vectors consistently and effectively.
{"title":"A Short-Read Amplicon Sequencing Protocol and Bioinformatic Pipeline for Ecological Surveillance of Dipteran Disease Vectors.","authors":"Raquel Lima-Cordón, Jason Travis Mohabir, Mohini Sooklall, Aina Martinez Zurita, Meg Shieh, Cheyenne Knox, Sabrina Gobran, Zachary Johnson, Margaret Laws, Ruchit Panchal, Reza Niles-Robin, Horace Cox, Maria Eugenia Grillet, Jorge E Moreno, Socrates Herrera, Martha Quinones, Angela M Early, Jacob A Tennessen, Daniel E Neafsey","doi":"10.1111/1755-0998.14088","DOIUrl":"10.1111/1755-0998.14088","url":null,"abstract":"<p><p>Vector control remains an important strategy worldwide to prevent human infection with pathogens transmitted by arthropods. Vector control strategies rely on accurate identification of vector taxa along with vector-specific biological indicators such as feeding ecology, infection prevalence and insecticide resistance. Multiple 'DNA barcoding' protocols have been published over the past several decades to support these applications, generally relying on informal manual approaches such as BLAST to assign taxonomic identity to the resulting sequences. We present a standardised informatic pipeline for analysis of DNA barcoding data from dipteran vectors, VecTreeID, that uses short-read amplicon sequencing (AmpSeq) coupled with sequence similarity assessment (BLAST) and an evolutionary placement algorithm (EPA-ng) to achieve vector taxonomic identification, capture bionomic features (blood and plant meal sources), determine Plasmodium infection status (for anopheline mosquitoes) and detect target-site insecticide resistance mutations. The VecTreeID pipeline provides uncertainty in assignment through identifications at varying levels of taxonomic rank, a feature missing from many approaches to DNA barcoding, but important given gaps and labelling problems in public sequence databases. 
We validated an Illumina-based implementation of VecTreeID on laboratory and field samples, and found that the blood meal amplicons can detect vertebrate DNA sequences up to 36 h post-feeding, and that short-read sequencing data are capable of sensitively detecting minor sequences in DNA mixtures representing multi-species blood or nectar meals. This high-throughput VecTreeID approach empowers researchers and public health professionals to survey and control arthropod disease vectors consistently and effectively.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":" ","pages":"e14088"},"PeriodicalIF":5.5,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143439439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}