Asserting which allele is ancestral or derived, known as polarisation, is a prerequisite of many population and quantitative genetic methods. One important application is the inference of the unfolded site-frequency spectrum (uSFS). The most widely used approaches are based on outgroup data. However, for studies on species with only distantly related outgroups, large divergence between the ingroup and outgroup can result in alignment difficulties and substantial missing data, causing many sites of interest to be lost. Here, we present PolarBEAR (Polarisation By Estimation of the Ancestral Recombination graph), a method that uses the local genealogies from the ancestral recombination graph (ARG) to infer ancestral states. We show that PolarBEAR reaches high accuracy in polarisation and uSFS estimation using simulations under several scenarios. This accuracy, however, heavily depends on the ARG reconstruction method employed. We also applied our method to human population data and compared it with the outgroup-based method est-sfs. Although PolarBEAR could not infer the ancestral state with high confidence at certain positions, it obtained results for positions that est-sfs could not polarise due to missing outgroup data. The polarisation results of the two methods were highly consistent at positions inferred by both methods. The two methods inferred similar uSFS, with PolarBEAR estimating slightly fewer high-frequency derived alleles. Furthermore, we demonstrate that PolarBEAR is robust across different mutation models in our simulations, while est-sfs exhibits a bias in the presence of heterogeneous base composition. PolarBEAR can complement outgroup-based methods, or replace them when no appropriate outgroup sequence is available.
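To make the link between polarisation and the uSFS concrete, here is a minimal sketch of how an unfolded spectrum is tallied once an ancestral allele has been assigned at each biallelic site. It does not reproduce PolarBEAR or est-sfs; the function names and the toy data are hypothetical, and the ancestral states are simply taken as given.

```python
def derived_count(count_a, count_b, allele_a, allele_b, ancestral):
    """Return the number of derived copies at a biallelic site."""
    if ancestral == allele_a:
        return count_b
    if ancestral == allele_b:
        return count_a
    raise ValueError("ancestral allele must be one of the two observed alleles")


def unfolded_sfs(derived_counts, n):
    """sfs[i] = number of sites carrying i derived copies among n sampled sequences."""
    sfs = [0] * (n + 1)
    for c in derived_counts:
        if not 0 <= c <= n:
            raise ValueError("derived count outside 0..n")
        sfs[c] += 1
    return sfs


def fold(sfs):
    """Collapse an unfolded SFS into the folded (minor-allele) SFS."""
    n = len(sfs) - 1
    folded = [0] * (n // 2 + 1)
    for i, count in enumerate(sfs):
        folded[min(i, n - i)] += count
    return folded


# Hypothetical toy data: (copies of allele A, copies of allele G, assumed ancestral allele)
sites = [(3, 1, "A"), (1, 3, "A"), (2, 2, "G"), (3, 1, "A")]
counts = [derived_count(a, g, "A", "G", anc) for a, g, anc in sites]
usfs = unfolded_sfs(counts, n=4)
print(usfs)
print(fold(usfs))
```

Folding discards the ancestral/derived distinction, which is why a polarisation error moves a site between frequency classes `i` and `n - i` in the unfolded spectrum but leaves the folded one unchanged.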
{"title":"Polarising SNPs Without Outgroup.","authors":"Jinyang Liang, Julien Y Dutheil","doi":"10.1111/1755-0998.70105","DOIUrl":"10.1111/1755-0998.70105","url":null,"abstract":"<p><p>Asserting which allele is ancestral or derived, known as polarisation, is a prerequisite of many population and quantitative genetic methods. One important application is the inference of the unfolded site-frequency spectrum (uSFS). The most widely used approaches are based on outgroup data. However, for studies on species with only distantly related outgroups, large divergence between the ingroup and outgroup can result in alignment difficulties and substantial missing data, causing many sites of interest to be lost. Here, we present PolarBEAR (Polarisation By Estimation of the Ancestral Recombination graph), a method that uses the local genealogies from the ancestral recombination graph (ARG) to infer ancestral states. We show that PolarBEAR reaches high accuracy in polarisation and uSFS estimation using simulations under several scenarios. This accuracy, however, heavily depends on the ARG reconstruction method employed. We also applied our method to human population data and compared it with the outgroup-based method est-sfs. Although PolarBEAR could not infer the ancestral state with high confidence at certain positions, it obtained results for positions that est-sfs could not polarise due to missing outgroup data. The polarisation results of the two methods were highly consistent at positions inferred by both methods. The two methods inferred similar uSFS, with PolarBEAR estimating slightly fewer high-frequency derived alleles. Furthermore, we demonstrate that PolarBEAR is robust across different mutation models in our simulations, while est-sfs exhibits a bias in the presence of heterogeneous base composition. 
PolarBEAR can complement outgroup-based methods, or replace them when no appropriate outgroup sequence is available.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":"e70105"},"PeriodicalIF":5.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12878804/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146123046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to \"Advancing Yeast Identification Using High-Throughput DNA Barcode Data From a Curated Culture Collection\".","authors":"","doi":"10.1111/1755-0998.70106","DOIUrl":"10.1111/1755-0998.70106","url":null,"abstract":"","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":"e70106"},"PeriodicalIF":5.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12862242/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146099660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environmental DNA (eDNA) constitutes a valuable tool for monitoring terrestrial animal diversity, but outcomes are affected by multiple factors. Among these factors, the choice of sampling substrate and method is especially important and must be aligned with research objectives. We reviewed 245 published studies that utilise eDNA for terrestrial animal monitoring and compiled an overview of the most frequently used environmental substrates. Based on the reviewed literature, we provide a key description of each substrate, as well as its particular properties and limitations related to the detection of animal species across different spatial and temporal scales. We categorise these substrates into three groups: abiotic substrates (soil, water, air, sediment), biotic substrates (invertebrate samples, plant tissues, spiderwebs) and direct-evidence substrates (scat, footprints, shelters, feeding sites). In addition, we identify several key challenges with the interpretation of eDNA-based biodiversity monitoring, including false negatives and false positives, as well as the dynamics of spatial and temporal deviations. The latter concepts, which we propose and define in this review, describe the temporal and spatial discrepancies between the DNA source and its detection in a given sample. We reflect on how these temporal and spatial deviations are expected to affect eDNA data extracted from the different types of substrates and how knowledge of these dynamics can inform effective and accurate biomonitoring. In summary, this review provides a decision basis for designing terrestrial eDNA monitoring studies by summarising the properties and limitations of different substrates and contextualising the interpretation of results in light of substrate-specific challenges.
{"title":"Properties and Limitations of eDNA Substrates for Terrestrial Animal Monitoring","authors":"Beilun Zhao, Tobias Andermann","doi":"10.1111/1755-0998.70096","DOIUrl":"10.1111/1755-0998.70096","url":null,"abstract":"<p>Environmental DNA (eDNA) constitutes a valuable tool for monitoring terrestrial animal diversity, but outcomes are affected by multiple factors. Among these factors, the choice of sampling substrate and method is especially important and must be aligned with research objectives. We reviewed 245 published studies that utilise eDNA for terrestrial animal monitoring and compiled an overview of the most frequently used environmental substrates. Based on the reviewed literature, we provide a key description of each substrate, as well as its particular properties and limitations related to the detection of animal species across different spatial and temporal scales. We categorise these substrates into three groups: abiotic substrates (soil, water, air, sediment), biotic substrates (invertebrate samples, plant tissues, spiderwebs) and direct-evidence substrates (scat, footprints, shelters, feeding sites). In addition, we identify several key challenges with the interpretation of eDNA-based biodiversity monitoring, including false negatives and false positives, as well as the dynamics of spatial and temporal deviations. The latter concepts, which we propose and define in this review, describe the temporal and spatial discrepancies between the DNA source and its detection in a given sample. We reflect on how these temporal and spatial deviations are expected to affect eDNA data extracted from the different types of substrates and how knowledge of these dynamics can inform effective and accurate biomonitoring. 
In summary, this review provides a decision basis for designing terrestrial eDNA monitoring studies by summarising the properties and limitations of different substrates and contextualising the interpretation of results in light of substrate-specific challenges.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12831013/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146040187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katja Reichel, Jaakko Pohjoismäki, Jonas J. Astrin, Astrid Böhne, Chiara Bortoluzzi, Elena Bužan, Javier del Campo, Claudio Ciofi, Camilla B. Di-Nizo, Pradeep K. Divakar, Carola Greve, Vladimír Hampl, Leon Hilgers, Veronika N. Laine, Jennifer A. Leonard, Jesus Lozano-Fernandez, Lada Lukić Bilela, Camila J. Mazzoni, Ann M. McCartney, José Melo-Ferreira, Rita Monteiro, Rebekah A. Oomen, Martina Pavlek, João Pimenta, Michal Rindos, Ole Seehausen, Andrii Tarieiev, Salvatore Tomasello, Olga Vinnere Pettersson, Robert M. Waterhouse, Alexandra A.-T. Weber, Oleksandr Zinenko, Christian de Guttry
High-quality reference genome assemblies have become essential for deepening our understanding of biodiversity, yet obtaining them for many species remains surprisingly challenging. Drawing on experiences from the European Reference Genome Atlas (ERGA) community, we focus on permit and sample-handling procedures leading up to nucleic acid sequencing, covering tasks such as ensuring ethical and legal compliance, verifying accurate species identification, maintaining sample integrity during transport, and isolating high-quality DNA or nuclei. While many of the challenges and solutions we discuss are broadly relevant, our regulatory and logistical examples are primarily from Europe. By synthesising practical guidance, we highlight the crucial importance of taxonomic expertise, proper vouchering and biobanking, rigorous cold-chain management or alternative preservation methods, and emphasise adherence to packaging and shipping requirements for biological materials. We showcase examples spanning diverse regions, taxa and source materials, which underscore the importance of context-specific strategies and internationally harmonised protocols, particularly for metadata reporting. Our recommendations aim to support both small-scale projects and large initiatives, directing collective efforts to facilitate efficient sampling, vouchering and sample processing for future genomic studies.
{"title":"From Permits to Samples: Addressing Key Challenges for High-Quality Reference Genome Generation in Europe","authors":"Katja Reichel, Jaakko Pohjoismäki, Jonas J. Astrin, Astrid Böhne, Chiara Bortoluzzi, Elena Bužan, Javier del Campo, Claudio Ciofi, Camilla B. Di-Nizo, Pradeep K. Divakar, Carola Greve, Vladimír Hampl, Leon Hilgers, Veronika N. Laine, Jennifer A. Leonard, Jesus Lozano-Fernandez, Lada Lukić Bilela, Camila J. Mazzoni, Ann M. McCartney, José Melo-Ferreira, Rita Monteiro, Rebekah A. Oomen, Martina Pavlek, João Pimenta, Michal Rindos, Ole Seehausen, Andrii Tarieiev, Salvatore Tomasello, Olga Vinnere Pettersson, Robert M. Waterhouse, Alexandra A.-T. Weber, Oleksandr Zinenko, Christian de Guttry","doi":"10.1111/1755-0998.70100","DOIUrl":"10.1111/1755-0998.70100","url":null,"abstract":"<p>High-quality reference genome assemblies have become essential for deepening our understanding of biodiversity, yet obtaining them for many species remains surprisingly challenging. Drawing on experiences from the European Reference Genome Atlas (ERGA) community, we focus on permit and sample-handling procedures leading up to nucleic acid sequencing, covering tasks such as ensuring ethical and legal compliance, verifying accurate species identification, maintaining sample integrity during transport, and isolating high-quality DNA or nuclei. While many of the challenges and solutions we discuss are broadly relevant, our regulatory and logistical examples are primarily from Europe. By synthesising practical guidance, we highlight the crucial importance of taxonomic expertise, proper vouchering and biobanking, rigorous cold-chain management or alternative preservation methods, and emphasise adherence to packaging and shipping requirements for biological materials. 
We showcase examples spanning diverse regions, taxa and source materials, which underscore the importance of context-specific strategies and internationally harmonised protocols, particularly for metadata reporting. Our recommendations aim to support both small-scale projects and large initiatives, directing collective efforts to facilitate efficient sampling, vouchering and sample processing for future genomic studies.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12820446/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146007975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environmental RNA (eRNA) metabarcoding has rapidly emerged as a powerful tool for assessing contemporary biodiversity patterns across diverse ecosystems. However, the potential for false positive detections caused by co-extracted environmental DNA (eDNA) remains unquantified. Distinguishing true signals from false positives caused by residual eDNA is a technical challenge in eRNA-based metabarcoding. To address this issue, we employed a freshwater river receiving treated effluent from a wastewater treatment plant as a model system. In such settings, eDNA in the treated effluent can lead to the detection of non-local species (e.g., marine taxa). Treated effluent typically contains minimal or no eRNA, making it well-suited for evaluating the influence of eDNA carryover. By comparing DNase-treated and untreated eRNA samples, we assessed the impact of residual eDNA on fish species richness and community composition. Our results showed that omitting DNase treatment significantly inflated taxonomic richness, with untreated samples detecting a conservative estimate of over 25% more taxa per site. Fold-change analysis revealed that residual eDNA inflated taxon abundances in both high- and low-abundance taxa, with some showing over 10-fold increases. Community composition analyses revealed clear clustering between treated and untreated samples, highlighting substantial shifts driven by residual eDNA. These findings demonstrate that co-extracted eDNA can severely distort eRNA-based biodiversity estimates, leading to false positives and misrepresented contemporary community profiles. We recommend further evaluation of DNase treatment parameters, including enzyme concentration, incubation time and treatment times, and the adoption of optimised protocols to standardise and improve the accuracy of eRNA-based biodiversity monitoring.
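The two summary quantities mentioned above, per-site richness inflation and per-taxon fold change, can be sketched as follows. This is an illustration only: the function names, the pseudocount and the example taxa are assumptions and are not taken from the study.

```python
def richness_inflation(treated_taxa, untreated_taxa):
    """Percentage of additional taxa detected without DNase treatment at one site."""
    treated = len(set(treated_taxa))
    untreated = len(set(untreated_taxa))
    if treated == 0:
        raise ValueError("no taxa detected in the DNase-treated sample")
    return 100.0 * (untreated - treated) / treated


def fold_change(untreated_reads, treated_reads, pseudocount=1.0):
    """Ratio of read counts; the pseudocount (an assumption) avoids division by zero."""
    return (untreated_reads + pseudocount) / (treated_reads + pseudocount)


# Hypothetical detections at a single site
treated = {"taxon_a", "taxon_b", "taxon_c", "taxon_d"}
untreated = treated | {"taxon_e"}

print(richness_inflation(treated, untreated))  # percent more taxa without DNase
print(sorted(untreated - treated))             # candidates for eDNA carryover
print(fold_change(109, 9))                     # abundance inflation for one taxon
```

Taxa that appear only in the untreated sample are candidates for eDNA-derived false positives, although this set difference alone cannot confirm their origin.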
{"title":"Residual eDNA in eRNA Extracts Skews eRNA-Based Biodiversity Assessment: Call for Optimised DNase Treatment","authors":"Fuwen Wang, Wei Xiong, Xuena Huang, Shiguo Li, Aibin Zhan","doi":"10.1111/1755-0998.70102","DOIUrl":"10.1111/1755-0998.70102","url":null,"abstract":"<p>Environmental RNA (eRNA) metabarcoding has rapidly emerged as a powerful tool for assessing contemporary biodiversity patterns across diverse ecosystems. However, the potential for false positive detections caused by co-extracted environmental DNA (eDNA) remains unquantified. Distinguishing true signals from false positives caused by residual eDNA is a technical challenge in eRNA-based metabarcoding. To address this issue, we employed a freshwater river receiving treated effluent from a wastewater treatment plant as a model system. In such settings, eDNA in the treated effluent can lead to the detection of non-local species (e.g., marine taxa). Treated effluent typically contains minimal or no eRNA, making it well-suited for evaluating the influence of eDNA carryover. By comparing DNase-treated and untreated eRNA samples, we assessed the impact of residual eDNA on fish species richness and community composition. Our results showed that omitting DNase treatment significantly inflated taxonomic richness, with untreated samples detecting a conservative estimate of over 25% more taxa per site. Fold-change analysis revealed that residual eDNA inflated taxon abundances in both high- and low-abundance taxa, with some showing over 10-fold increases. Community composition analyses revealed clear clustering between treated and untreated samples, highlighting substantial shifts driven by residual eDNA. These findings demonstrate that co-extracted eDNA can severely distort eRNA-based biodiversity estimates, leading to false positives and misrepresented contemporary community profiles. 
We recommend further evaluation of DNase treatment parameters, including enzyme concentration, incubation time and treatment times, and the adoption of optimised protocols to standardise and improve the accuracy of eRNA-based biodiversity monitoring.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.70102","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145996732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karen K. Martien, Robin W. Baird, Kelly M. Robertson, Michaela A. Kratofil, Sabre D. Mahaffy, Kristi L. West, Susan J. Chivers, Frederick I. Archer
Epigenetic aging models hold great promise for enhancing many aspects of wildlife research and management. However, their utility is limited by the need to train models using known-aged animals, which are rare among wildlife species. We present a novel approach to developing methylation-based age prediction models that enables us to train models using samples from individuals whose chronological age is estimated with uncertainty based on photo-identification catalogue data. Our approach incorporates this uncertainty into model training by representing the age of each individual with a probability distribution rather than a point estimate. We similarly represent the methylation profiles of individuals as binomial distributions and produce a distribution of predicted age for each sample that reflects the uncertainty in both its age and methylation profile. We compared age models trained using a wide range of parameterisations, training data sets and analytical methods to determine how well they predicted the catalogue-based age estimates. The resulting model has a median absolute error of 1.70 years, outperforming many published clocks trained with known-age samples. This approach significantly expands the range of species for which accurate methylation-based age models can be developed, particularly those of conservation concern where known-age samples are limited. By producing distributions of predicted age, it also enables researchers to accurately communicate the uncertainty in their age estimates to subsequent data users.
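As a rough illustration of two ideas in this abstract, the sketch below draws a methylation proportion from a binomial distribution defined by read counts and computes a median absolute error against reference ages. It is not the authors' model; the function names, the read counts and the ages are hypothetical.

```python
import random
import statistics


def sample_methylation(methylated_reads, total_reads, rng):
    """Draw one plausible methylation proportion for a site from Binomial(total, p)."""
    p = methylated_reads / total_reads
    successes = sum(rng.random() < p for _ in range(total_reads))
    return successes / total_reads


def median_absolute_error(predicted, reference):
    """Median of |predicted - reference| across samples."""
    return statistics.median(abs(p - r) for p, r in zip(predicted, reference))


rng = random.Random(1)
# Hypothetical site with 30 methylated reads out of 50: repeated draws express
# the sampling uncertainty in the observed methylation level.
draws = [sample_methylation(30, 50, rng) for _ in range(5)]
print(draws)

# Hypothetical predicted ages versus catalogue-based age estimates (years)
print(median_absolute_error([10.0, 12.5, 20.0], [11.0, 12.0, 23.0]))
```

Repeating the draw for every site, and drawing each individual's age from its own distribution, would propagate both sources of uncertainty into a distribution of predicted ages rather than a single value.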
{"title":"Epigenetic Age Estimation for Hawaiian False Killer Whales (Pseudorca crassidens) in the Absence of ‘Known-Age’ Individuals","authors":"Karen K. Martien, Robin W. Baird, Kelly M. Robertson, Michaela A. Kratofil, Sabre D. Mahaffy, Kristi L. West, Susan J. Chivers, Frederick I. Archer","doi":"10.1111/1755-0998.70099","DOIUrl":"10.1111/1755-0998.70099","url":null,"abstract":"<p>Epigenetic aging models hold great promise for enhancing many aspects of wildlife research and management. However, their utility is limited by the need to train models using known-aged animals, which are rare among wildlife species. We present a novel approach to developing methylation-based age prediction models that enables us to train models using samples from individuals whose chronological age is estimated with uncertainty based on photo-identification catalogue data. Our approach incorporates this uncertainty into model training by representing the age of each individual with a probability distribution rather than a point estimate. We similarly represent the methylation profiles of individuals as binomial distributions and produce a distribution of predicted age for each sample that reflects the uncertainty in both its age and methylation profile. We compared age models trained using a wide range of parameterisations, training data sets and analytical methods to determine how well they predicted the catalogue-based age estimates. The resulting model has a median absolute error of 1.70 years, outperforming many published clocks trained with known-age samples. This approach significantly expands the range of species for which accurate methylation-based age models can be developed, particularly those of conservation concern where known-age samples are limited. 
By producing distributions of predicted age, it also enables researchers to accurately communicate the uncertainty in their age estimates to subsequent data users.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811820/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145987611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environmental DNA (eDNA) analysis has increasingly been used for aquatic biomonitoring, although interpretation of results needs to consider how environmental factors influence the degradation process of eDNA. This review focuses on pH, which has long been considered a key factor in eDNA degradation although its apparent effect on eDNA degradation has varied across studies. Here I present a synthesis of existing research that summarises what is so far known about the pH dependence of eDNA degradation, and a meta-analysis that demonstrates a nonlinear, upward convex relationship between eDNA decay rates and pH, with modelled eDNA decay rates peaking around pH 8. These results suggest that under slightly alkaline conditions, which are often considered suitable for DNA preservation, eDNA degradation may be accelerated by the promotion of microbial and enzymatic activity. On the other hand, there were substantial inter-study discrepancies in the dataset, suggesting that the present meta-analysis could not fully address the complexity of pH-dependent eDNA degradation. I end this review with an extensive discussion of some possible mechanisms that should be further investigated in order to achieve a more comprehensive understanding of the pH dependence of eDNA degradation. These efforts will help with more accurate predictions of the persistence time and decay rates of eDNA under various environmental conditions, thereby potentially improving our interpretations of eDNA-based biodiversity surveys.
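The shape described above, a decay rate that rises towards a maximum near pH 8 and falls on either side, can be sketched with a simple quadratic. The quadratic form, the first-order decay model and every parameter value below are assumptions made for illustration; they are not the fitted values from the meta-analysis.

```python
import math


def decay_rate(ph, k_max=0.10, ph_opt=8.0, curvature=0.01):
    """Hypothetical decay rate (per hour) that peaks at ph_opt and never goes negative."""
    k = k_max - curvature * (ph - ph_opt) ** 2
    return max(k, 0.0)


def remaining_fraction(hours, k):
    """Fraction of eDNA left after `hours`, assuming first-order decay C(t) = C0 * exp(-k t)."""
    return math.exp(-k * hours)


for ph in (6.0, 7.0, 8.0, 9.0, 10.0):
    k = decay_rate(ph)
    print(ph, round(k, 3), round(remaining_fraction(24, k), 3))
```

Under these assumed values, a sample held at pH 8 retains less eDNA after 24 hours than one held at pH 6 or pH 10, which mirrors the direction of the reported relationship without claiming its magnitude.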
{"title":"pH-Dependent Degradation of Macrobial Environmental DNA in Water","authors":"Toshiaki S. Jo","doi":"10.1111/1755-0998.70101","DOIUrl":"10.1111/1755-0998.70101","url":null,"abstract":"<p>Environmental DNA (eDNA) analysis has increasingly been used for aquatic biomonitoring, although interpretation of results needs to consider how environmental factors influence the degradation process of eDNA. This review focuses on pH, which has long been considered a key factor in eDNA degradation although its apparent effect on eDNA degradation has varied across studies. Here I present a synthesis of existing research that summarises what is so far known about the pH dependence of eDNA degradation, and a meta-analysis that demonstrates a nonlinear, upward convex relationship between eDNA decay rates and pH, with modelled eDNA decay rates peaking around pH 8. These results suggest that under slightly alkaline conditions, which are often considered suitable for DNA preservation, eDNA degradation may be accelerated by the promotion of microbial and enzymatic activity. On the other hand, there were substantial inter-study discrepancies in the dataset, suggesting that the present meta-analysis could not fully address the complexity of pH-dependent eDNA degradation. I end this review with an extensive discussion of some possible mechanisms that should be further investigated in order to achieve a more comprehensive understanding of the pH dependence of eDNA degradation. 
These efforts will help with more accurate predictions of the persistence time and decay rates of eDNA under various environmental conditions, thereby potentially improving our interpretations of eDNA-based biodiversity surveys.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12809876/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145987563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The rapid growth of genome sequencing has outpaced the development of efficient annotation tools, especially for species lacking transcriptome data. To address this challenge, we present QuickProt, a fast, accurate and user-friendly homology-based protein annotation tool. QuickProt constructs a non-redundant gene model by aligning homologous proteins from closely related species, offering an accurate and cost-effective solution suitable for large-scale comparative genomic studies. Benchmarking against BRAKER2 and GALBA across reference genomes demonstrated that QuickProt offers high specificity and dramatically improved runtime, while maintaining competitive annotation accuracy. To demonstrate its utility, we applied QuickProt to diverse genomes, including a non-model teleost (Epinephelus bruneus), two tetraploid Xenopus species and 11 Rutaceae plants. Across these datasets, QuickProt supported robust phylogenetic reconstruction, identification of conserved orthologs and detection of biologically functional genes, pathways, and chromosomal evolution mechanisms, regardless of genome ploidy. Notably, it revealed a potential horizontal gene transfer event between groupers and Vibrio, and uncovered conserved modules involved in volatile oil biosynthesis and oil gland development in citrus. With its scalability and minimal computational demands, QuickProt provides a powerful platform for genome annotation and evolutionary inference. As the number of sequenced genomes continues to expand, QuickProt is a useful tool for accelerating comparative genomics and functional exploration across the tree of life.
{"title":"QuickProt: A Fast and Accurate Homology-Based Protein Annotation Tool for Non-Model Organisms to Advance Comparative Genomics","authors":"Guisen Chen, Hehe Du, Zhenjie Cao, Ying Wu, Chen Zhang, Yongcan Zhou, Jingqun Ao, Yun Sun, Zihao Yuan","doi":"10.1111/1755-0998.70097","DOIUrl":"10.1111/1755-0998.70097","url":null,"abstract":"<p>The rapid growth of genome sequencing has outpaced the development of efficient annotation tools, especially for species lacking transcriptome data. To address this challenge, we present QuickProt, a fast, accurate and user-friendly homology-based protein annotation tool. QuickProt constructs a non-redundant gene model by aligning homologous proteins from closely related species, offering an accurate and cost-effective solution suitable for large-scale comparative genomic studies. Benchmarking against BRAKER2 and GALBA across reference genomes demonstrated that QuickProt offers high specificity and dramatically improved runtime, while maintaining competitive annotation accuracy. To demonstrate its utility, we applied QuickProt to diverse genomes, including a non-model teleost (<i>Epinephelus bruneus</i>), two tetraploid <i>Xenopus</i> species and 11 Rutaceae plants. Across these datasets, QuickProt supported robust phylogenetic reconstruction, identification of conserved orthologs and detection of biologically functional genes, pathways, and chromosomal evolution mechanisms, regardless of genome ploidy. Notably, it revealed a potential horizontal gene transfer event between groupers and <i>Vibrio</i>, and uncovered conserved modules involved in volatile oil biosynthesis and oil gland development in citrus. With its scalability and minimal computational demands, QuickProt provides a powerful platform for genome annotation and evolutionary inference. 
As the number of sequenced genomes continues to expand, QuickProt is a useful tool for accelerating comparative genomics and functional exploration across the tree of life.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12794128/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145950845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah Christin Gronefeld, Heriberto López, Robin Schmidt, Axel Hochkirch
Key Biodiversity Areas (KBAs) are sites that contribute significantly to the global persistence of biodiversity. Distinct genetic diversity has been introduced as one of the metrics to estimate whether a site holds a threshold proportion of a species' global genetic diversity during the KBA identification process. However, genetic data has so far not been applied in KBA identification due to the lack of thoroughly tested methods and guidance. We tested the suitability of six analytical methods for identification of KBAs based upon genetic data: allelic overlap, Analyses of Molecular Variance (AMOVA), average taxonomic distinctness (AvTD, Δ+), effective population size (Ne), the genetic differentiation index (Dest), and the diversity index Simpson's λ. We conclude that Δ+, a measure that was developed to measure taxonomic distinctness of biotic communities, performs best in the context of KBA identification as it reflects the unique nature of a species' genetic diversity, is based on simple allele frequencies, and can be easily applied and calculated. AMOVA, Ne, allelic overlap, and our modified version of λ were difficult to apply, interpret, or both. Dest is easily applied for measuring genetic distinctiveness but not genetic diversity. For this reason, it may not be suitable for prioritising areas for the long-term protection of the species.
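For readers unfamiliar with two of the indices named above, the sketch below computes Simpson's λ in its classic form (the sum of squared frequencies) and an average-distinctness value as the mean pairwise distance between the items present at a site, following the usual definition of Δ+. The authors used a modified λ and the abstract does not say how distances between alleles were defined, so the distance table and all names here are hypothetical.

```python
from itertools import combinations


def simpson_lambda(counts):
    """Classic Simpson's index: sum of squared relative frequencies."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


def average_distinctness(items, distance):
    """Mean pairwise distance over all pairs of items present (Delta+ style)."""
    pairs = list(combinations(items, 2))
    if not pairs:
        raise ValueError("at least two items are required")
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical pairwise distances between three alleles found at one site
table = {
    frozenset(("a1", "a2")): 1.0,
    frozenset(("a1", "a3")): 2.0,
    frozenset(("a2", "a3")): 3.0,
}

print(simpson_lambda([5, 5]))
print(average_distinctness(["a1", "a2", "a3"], lambda a, b: table[frozenset((a, b))]))
```

Because the average is taken over pairs of items present rather than over their abundances, this kind of measure depends on which alleles occur at a site and how different they are, not on how common each one is.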
{"title":"Identifying Key Biodiversity Areas Based on Distinct Genetic Diversity","authors":"Sarah Christin Gronefeld, Heriberto López, Robin Schmidt, Axel Hochkirch","doi":"10.1111/1755-0998.70094","DOIUrl":"10.1111/1755-0998.70094","url":null,"abstract":"<p>Key Biodiversity Areas (KBAs) are sites that contribute significantly to the global persistence of biodiversity. Distinct genetic diversity has been introduced as one of the metrics to estimate whether a site holds a threshold proportion of a species' global genetic diversity during the KBA identification process. However, genetic data have so far not been applied in KBA identification due to the lack of thoroughly tested methods and guidance. We tested the suitability of six analytical methods for identification of KBAs based upon genetic data: allelic overlap, Analysis of Molecular Variance (AMOVA), average taxonomic distinctness (AvTD, Δ<sup>+</sup>), effective population size (<i>N</i><sub>e</sub>), the genetic differentiation index (<i>D</i><sub>est</sub>), and the diversity index Simpson's λ. We conclude that Δ<sup>+</sup>, a metric originally developed to quantify the taxonomic distinctness of biotic communities, performs best in the context of KBA identification as it reflects the unique nature of a species' genetic diversity, is based on simple allele frequencies, and can be easily applied and calculated. AMOVA, <i>N</i><sub>e</sub>, allelic overlap, and our modified version of λ were difficult to apply, interpret, or both. <i>D</i><sub>est</sub> is easily applied for measuring genetic distinctiveness but not genetic diversity. 
For this reason, it may not be suitable for prioritising areas for the long-term protection of the species.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 2","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12789961/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145941857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kai Chen, Shuai Luo, Chuanqi Jiang, Siyu Gu, Fangdian Yang, Xuehua Liu, Su Wang, Xiao Qu, Qi Zhang, Peng Zhang, Yingchun Gong, Honghui Zeng, Dongru Qiu, Wei Miao, Jie Xiong
Aquatic ecosystems host diverse organisms across all six kingdoms of life, yet their complex interactions remain poorly understood, primarily due to limitations in transkingdom species detection methods. To address this limitation, we developed HiMBar (https://github.com/Xchenkai2019/HIFI_barcoding), a high-fidelity (HiFi) metagenomic barcoding approach that uses long, highly accurate reads to extract multiple full-length marker genes (such as rRNA genes, COI, rbcL) directly from environmental DNA sequencing reads. These genes are subsequently clustered into operational taxonomic units (OTUs) for species identification, eliminating the need for PCR amplification or sequence assembly. HiMBar outperforms existing DNA-based methods in accuracy, recall and consistency. Applying HiMBar, we identified a stable interaction network among Cyanobacteria, Planctomycetota, Verrucomicrobiota and Fungi. Further analysis revealed that glucose metabolism plays a key role in maintaining these interactions. Our study offers a powerful tool for transkingdom species monitoring and provides a case study for exploring transkingdom interactions and their molecular mechanisms.
{"title":"HiMBar: A High-Fidelity Metagenomic Barcoding Approach for Transkingdom Species Detection and Interaction Analysis in Aquatic Ecosystems","authors":"Kai Chen, Shuai Luo, Chuanqi Jiang, Siyu Gu, Fangdian Yang, Xuehua Liu, Su Wang, Xiao Qu, Qi Zhang, Peng Zhang, Yingchun Gong, Honghui Zeng, Dongru Qiu, Wei Miao, Jie Xiong","doi":"10.1111/1755-0998.70092","DOIUrl":"10.1111/1755-0998.70092","url":null,"abstract":"<p>Aquatic ecosystems host diverse organisms across all six life kingdoms, yet their complex interactions remain poorly understood, primarily due to limitations in transkingdom species detection methods. To address this limitation, we developed <i>HiMBar</i> (https://github.com/Xchenkai2019/HIFI_barcoding), a high-fidelity (HiFi) metagenomic barcoding approach that utilises long, highly accurate reads to extract multiple full-length marker genes (such as rRNA genes, COI, rbcL) directly from environmental DNA sequencing reads. These genes are subsequently clustered into operational taxonomic units (OTUs) for species identification, eliminating the need for PCR amplification or sequence assembly. <i>HiMBar</i> outperforms existing DNA-based methods in accuracy, recall and consistency. Applying <i>HiMBar</i>, we identified a stable interaction network among Cyanobacteria, Planctomycetota, Verrucomicrobiota and Fungi. Further analysis revealed that glucose metabolism plays a key role in maintaining these interactions. 
Our study offers a powerful tool for transkingdom species monitoring and provides a case study for exploring transkingdom interactions and their molecular mechanisms.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"26 1","pages":""},"PeriodicalIF":5.5,"publicationDate":"2026-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12779096/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145916236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}