Şevval Aktürk, Igor Mapelli, Merve N. Güler, Kanat Gürün, Büşra Katırcıoğlu, Kıvılcım Başak Vural, Ekin Sağlıcan, Mehmet Çetin, Reyhan Yaka, Elif Sürer, Gözde Atağ, Sevim Seda Çokoğlu, Arda Sevkar, N. Ezgi Altınışık, Dilek Koptekin, Mehmet Somel
There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage palaeogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1 to 50 K, with minor allele frequency ≥0.01) of shared SNPs are available. The performance of all four tools was comparable using ≥20 K SNPs. We found that first-degree related pairs can be accurately classified even with 1 K SNPs, with 85% F1 scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (F1 > 90%) with 5 K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69 and 79% respectively). Meanwhile, noise in population allele frequencies and inbreeding (first-cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low-coverage genomes.
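The F1 scores quoted above summarize, for each relatedness class, how well a tool's calls match the simulated truth. As a minimal sketch (not the authors' evaluation code), per-class F1 can be computed from paired true and predicted labels as below; the function name `f1_per_class` and the toy labels are hypothetical and not taken from the study.

```python
from collections import Counter


def f1_per_class(true_labels, predicted_labels):
    """Return {class: F1} for a multi-class classification.

    F1 = 2*TP / (2*TP + FP + FN), computed separately for each class.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Hypothetical toy data for illustration only.
truth = ["first", "first", "second", "third", "unrelated", "unrelated"]
calls = ["first", "first", "third", "third", "unrelated", "third"]
scores = f1_per_class(truth, calls)
for label in sorted(scores):
    print(label, round(scores[label], 3))
```

In this toy example, the third-degree class is over-called, which lowers its F1 even though every true third-degree pair was recovered.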
{"title":"Benchmarking kinship estimation tools for ancient genomes using pedigree simulations","authors":"Şevval Aktürk, Igor Mapelli, Merve N. Güler, Kanat Gürün, Büşra Katırcıoğlu, Kıvılcım Başak Vural, Ekin Sağlıcan, Mehmet Çetin, Reyhan Yaka, Elif Sürer, Gözde Atağ, Sevim Seda Çokoğlu, Arda Sevkar, N. Ezgi Altınışık, Dilek Koptekin, Mehmet Somel","doi":"10.1111/1755-0998.13960","DOIUrl":"10.1111/1755-0998.13960","url":null,"abstract":"<p>There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage palaeogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1 to 50 K, with minor allele frequency ≥0.01) of shared SNPs are available. The performance of all four tools was comparable using ≥20 K SNPs. We found that first-degree related pairs can be accurately classified even with 1 K SNPs, with 85% <i>F</i><sub>1</sub> scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (<i>F</i><sub>1</sub> > 90%) with 5 K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69 and 79% respectively). Meanwhile, noise in population allele frequencies and inbreeding (first-cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. 
We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low-coverage genomes.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13960","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140805597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Phylogenetic studies now routinely require manipulating and summarizing thousands of data files. For most of these tasks, currently available software requires considerable computing resources and substantial knowledge of command-line applications. We develop SEGUL, an ultrafast and memory-efficient software tool that performs common phylogenomic dataset manipulations and calculates statistics summarizing essential data features. Our software is available as standalone command-line interface (CLI) and graphical user interface (GUI) applications, and as a library for Rust, R and Python, with possible support for other languages. The CLI and library versions run natively on Windows, Linux and macOS, including Apple ARM Macs. The GUI version extends support to the mobile operating systems iOS, iPadOS and Android. SEGUL leverages the high performance of the Rust programming language to offer fast execution times and low memory footprints regardless of dataset size and platform choice. The inclusion of a GUI minimizes bioinformatics barriers to phylogenomics, while SEGUL's efficiency reduces economic barriers by allowing analysis on inexpensive hardware. Our support for mobile operating systems further enables teaching phylogenomics where access to computing power is limited.
{"title":"SEGUL: Ultrafast, memory-efficient and mobile-friendly software for manipulating and summarizing phylogenomic datasets","authors":"Heru Handika, Jacob A. Esselstyn","doi":"10.1111/1755-0998.13964","DOIUrl":"10.1111/1755-0998.13964","url":null,"abstract":"<p>Phylogenetic studies now routinely require manipulating and summarizing thousands of data files. For most of these tasks, currently available software requires considerable computing resources and substantial knowledge of command-line applications. We develop an ultrafast and memory-efficient software, SEGUL, that performs common phylogenomic dataset manipulations and calculates statistics summarizing essential data features. Our software is available as standalone command-line interface (CLI) and graphical user interface (GUI) applications, and as a library for Rust, R and Python, with possible support of other languages. The CLI and library versions run native on Windows, Linux and macOS, including Apple ARM Macs. The GUI version extends support to include mobile iOS, iPadOS and Android operating systems. SEGUL leverages the high performance of the Rust programming language to offer fast execution times and low memory footprints regardless of dataset size and platform choice. The inclusion of a GUI minimizes bioinformatics barriers to phylogenomics while SEGUL's efficiency reduces economic barriers by allowing analysis on inexpensive hardware. 
Our support for mobile operating systems further enables teaching phylogenomics where access to computing power is limited.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 7","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140652800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Preparation of DNA polymorphism datasets for analysis is an important step in evolutionary genetic and molecular ecology studies. Ever-growing dataset sizes make this step time-consuming, but few convenient software tools are available to facilitate processing of large-scale datasets including thousands of sequence alignments. Here I report “processor of sequences v4” (proSeq4), a user-friendly multiplatform program for preparation and evolutionary genetic analyses of genome- or transcriptome-scale sequence polymorphism datasets. The program has an easy-to-use graphical user interface and is designed to process and analyse many thousands of datasets. It supports over two dozen file formats and includes a flexible sequence editor and various tools for data visualization, quality control and the most commonly used evolutionary genetic analyses, such as NJ-phylogeny reconstruction, DNA polymorphism analyses and coalescent simulations. Command-line tools (e.g. vcf2fasta) are also provided for easier integration into bioinformatic pipelines. Apart from molecular ecology and evolution research, proSeq4 may be useful for teaching, e.g. for visual illustration of the different shapes of phylogenies generated with coalescent simulations in different scenarios. ProSeq4 source code and binaries for Windows, MacOS and Ubuntu are available from https://sourceforge.net/projects/proseq/.
{"title":"ProSeq4: A user-friendly multiplatform program for preparation and analysis of large-scale DNA polymorphism datasets","authors":"Dmitry A. Filatov","doi":"10.1111/1755-0998.13962","DOIUrl":"10.1111/1755-0998.13962","url":null,"abstract":"<p>Preparation of DNA polymorphism datasets for analysis is an important step in evolutionary genetic and molecular ecology studies. Ever-growing dataset sizes make this step time consuming, but few convenient software tools are available to facilitate processing of large-scale datasets including thousands of sequence alignments. Here I report “processor of sequences v4” (proSeq4)—a user-friendly multiplatform software for preparation and evolutionary genetic analyses of genome- or transcriptome-scale sequence polymorphism datasets. The program has an easy-to-use graphic user interface and is designed to process and analyse many thousands of datasets. It supports over two dozen file formats, includes a flexible sequence editor and various tools for data visualization, quality control and most commonly used evolutionary genetic analyses, such as NJ-phylogeny reconstruction, DNA polymorphism analyses and coalescent simulations. Command line tools (e.g. vcf2fasta) are also provided for easier integration into bioinformatic pipelines. Apart of molecular ecology and evolution research, proSeq4 may be useful for teaching, e.g. for visual illustration of different shapes of phylogenies generated with coalescent simulations in different scenarios. 
ProSeq4 source code and binaries for Windows, MacOS and Ubuntu are available from https://sourceforge.net/projects/proseq/.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13962","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140672187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bruno H. Saranholi, Filipe M. França, Alfried P. Vogler, Jos Barlow, Fernando Z. Vaz de Mello, Maria E. Maldaner, Edrielly Carvalho, Carla C. Gestich, Benjamin Howes, Cristina Banks-Leite, Pedro M. Galetti Jr
Over the past few years, insects have been used as samplers of vertebrate diversity by assessing the ingested-derived DNA (iDNA), and dung beetles have been shown to be good mammal samplers given their broad feeding preferences, wide distribution and easy sampling. Here, we tested and optimized the use of iDNA from dung beetles to assess the mammal community by evaluating whether certain biological and methodological factors affect the use of dung beetles as mammal species samplers. We collected 403 dung beetles from 60 pitfall traps. iDNA from each dung beetle was sequenced by metabarcoding using two mini-barcodes (12SrRNA and 16SrRNA). We assessed whether dung beetles with different traits related to feeding, nesting and body size differed in the number of mammal species found in their iDNA. We also tested differences among four killing solutions in preserving the iDNA and compared the effectiveness of each mini-barcode in recovering mammals. We identified a total of 50 mammal OTUs (operational taxonomic units), including terrestrial and arboreal species from 10 different orders. At least one mammal-matching sequence was obtained from 70% of the dung beetle specimens. The number of mammal OTUs obtained did not vary with dung beetle traits or among the killing solutions. The 16SrRNA mini-barcode recovered a higher number of mammal OTUs than 12SrRNA, although the two sets were partly non-overlapping. Thus, the complete mammal diversity may not be recovered by using only one of them. This study refines the methodology for routine assessment of tropical mammal communities via dung beetle ‘samplers’ and indicates that it is applicable regardless of the species traits of local beetle communities.
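The point that the two mini-barcodes recover partly non-overlapping OTU sets can be made concrete with set operations. The sketch below is illustrative only: the function name `marker_overlap` and the genus names are hypothetical placeholders, not results from the study.

```python
def marker_overlap(otus_a, otus_b):
    """Summarize how two markers' OTU lists overlap."""
    a, b = set(otus_a), set(otus_b)
    return {
        "shared": a & b,    # detected by both markers
        "only_a": a - b,    # detected by the first marker only
        "only_b": b - a,    # detected by the second marker only
        "combined": a | b,  # total diversity when both markers are used
    }


# Hypothetical detections for illustration only.
otus_12s = ["Tapirus", "Dasyprocta", "Alouatta"]
otus_16s = ["Tapirus", "Alouatta", "Pecari", "Cuniculus"]
summary = marker_overlap(otus_12s, otus_16s)
for key, value in summary.items():
    print(key, len(value), sorted(value))
```

Whenever both `only_a` and `only_b` are non-empty, neither marker alone recovers the combined set, which is the situation the abstract describes.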
{"title":"Testing and optimizing metabarcoding of iDNA from dung beetles to sample mammals in the hyperdiverse Neotropics","authors":"Bruno H. Saranholi, Filipe M. França, Alfried P. Vogler, Jos Barlow, Fernando Z. Vaz de Mello, Maria E. Maldaner, Edrielly Carvalho, Carla C. Gestich, Benjamin Howes, Cristina Banks-Leite, Pedro M. Galetti Jr","doi":"10.1111/1755-0998.13961","DOIUrl":"10.1111/1755-0998.13961","url":null,"abstract":"<p>Over the past few years, insects have been used as samplers of vertebrate diversity by assessing the ingested-derived DNA (iDNA), and dung beetles have been shown to be a good mammal sampler given their broad feeding preference, wide distribution and easy sampling. Here, we tested and optimized the use of iDNA from dung beetles to assess the mammal community by evaluating if some biological and methodological aspects affect the use of dung beetles as mammal species samplers. We collected 403 dung beetles from 60 pitfall traps. iDNA from each dung beetle was sequenced by metabarcoding using two mini-barcodes (12SrRNA and 16SrRNA). We assessed whether dung beetles with different traits related to feeding, nesting and body size differed in the number of mammal species found in their iDNA. We also tested differences among four killing solutions in preserving the iDNA and compared the effectiveness of each mini barcode to recover mammals. We identified a total of 50 mammal OTUs (operational taxonomic unit), including terrestrial and arboreal species from 10 different orders. We found that at least one mammal-matching sequence was obtained from 70% of the dung beetle specimens. The number of mammal OTUs obtained did not vary with dung beetle traits as well as between the killing solutions. The 16SrRNA mini-barcode recovered a higher number of mammal OTUs than 12SrRNA, although both sets were partly non-overlapping. Thus, the complete mammal diversity may not be achieved by using only one of them. 
This study refines the methodology for routine assessment of tropical mammal communities via dung beetle ‘samplers’ and its universal applicability independently of the species traits of local beetle communities.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140673023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Éadin N. O'Mahony, Angela L. Sremba, Eric M. Keen, Nicole Robinson, Archie Dundas, Debbie Steel, Janie Wray, C. Scott Baker, Oscar E. Gaggiotti
In coastal British Columbia, Canada, marine megafauna such as humpback whales (Megaptera novaeangliae) and fin whales (Balaenoptera physalus velifera) have been subject to a history of exploitation and near extirpation. While their populations have been in recovery, significant threats are posed to these vulnerable species by proposed natural resource ventures in this region, in addition to the compounding effects of anthropogenic climate change. Genetic tools play a vital role in informing conservation efforts, but the associated collection of tissue biopsy samples can be challenging for the investigators and disruptive to the ongoing behaviour of the targeted whales. Here, we evaluate a minimally intrusive approach based on collecting exhaled breath condensate, or respiratory ‘blow’ samples, from baleen whales using an unoccupied aerial system (UAS), within Gitga'at First Nation territory for conservation genetics. Minimal behavioural responses to the sampling technique were observed, with no response detected 87% of the time (of 112 UAS deployments). DNA from whale blow (n = 88 samples) was extracted, and DNA profiles consisting of 10 nuclear microsatellite loci, sex identification and mitochondrial (mt) DNA haplotypes were constructed. An average of 7.5 microsatellite loci per individual were successfully genotyped. The success rates for mtDNA and sex assignment were 80% and 89% respectively. Thus, this minimally intrusive sampling method can be used to describe genetic diversity and generate genetic profiles for individual identification. The results of this research demonstrate the potential of UAS-collected whale blow for conservation genetics from a remote location.
{"title":"Collecting baleen whale blow samples by drone: A minimally intrusive tool for conservation genetics","authors":"Éadin N. O'Mahony, Angela L. Sremba, Eric M. Keen, Nicole Robinson, Archie Dundas, Debbie Steel, Janie Wray, C. Scott Baker, Oscar E. Gaggiotti","doi":"10.1111/1755-0998.13957","DOIUrl":"10.1111/1755-0998.13957","url":null,"abstract":"<p>In coastal British Columbia, Canada, marine megafauna such as humpback whales (<i>Megaptera novaeangliae</i>) and fin whales (<i>Balaenoptera physalus velifera</i>) have been subject to a history of exploitation and near extirpation. While their populations have been in recovery, significant threats are posed to these vulnerable species by proposed natural resource ventures in this region, in addition to the compounding effects of anthropogenic climate change. Genetic tools play a vital role in informing conservation efforts, but the associated collection of tissue biopsy samples can be challenging for the investigators and disruptive to the ongoing behaviour of the targeted whales. Here, we evaluate a minimally intrusive approach based on collecting exhaled breath condensate, or respiratory ‘blow’ samples, from baleen whales using an unoccupied aerial system (UAS), within Gitga'at First Nation territory for conservation genetics. Minimal behavioural responses to the sampling technique were observed, with no response detected 87% of the time (of 112 UAS deployments). DNA from whale blow (<i>n</i> = 88 samples) was extracted, and DNA profiles consisting of 10 nuclear microsatellite loci, sex identification and mitochondrial (mt) DNA haplotypes were constructed. An average of 7.5 microsatellite loci per individual were successfully genotyped. The success rates for mtDNA and sex assignment were 80% and 89% respectively. Thus, this minimally intrusive sampling method can be used to describe genetic diversity and generate genetic profiles for individual identification. 
The results of this research demonstrate the potential of UAS-collected whale blow for conservation genetics from a remote location.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 8","pages":""},"PeriodicalIF":5.5,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13957","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140569086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lignin, an abundant form of organic carbon, plays a vital role in the global carbon cycle. However, our understanding of the global lignin-degrading microbiome remains limited. The greatest barrier has been the absence of a comprehensive and accurate functional gene database. Here, we developed a curated functional gene database (LCdb) for metagenomic profiling of lignin-degrading microbial consortia. Using the LCdb, we describe the global biogeography of communities with lignin-degrading potential. These communities exhibit clear niche differentiation at the levels of taxonomy and functional traits. The terrestrial microbiomes showed the highest diversity, yet the lowest correlations. In particular, there were few correlations between genes involved in aerobic and anaerobic degradation pathways, indicating clear functional redundancy. In contrast, enhanced correlations, especially closer inter-connections between anaerobic and aerobic groups, were observed in aquatic consortia in response to the lower diversity. Specifically, dypB and dypA are widespread on Earth, indicating their essential roles in lignin depolymerization. Estuarine and marine consortia featured the laccase and mnsod genes, respectively. Notably, the roles of archaea in lignin degradation were revealed in marine ecosystems. Environmental factors strongly influenced functional traits, but weakly shaped taxonomic groups. Null model analysis further verified that the composition of functional traits was deterministic, while taxonomic composition was highly stochastic, demonstrating that the environment selects functional genes rather than taxonomic groups. Our study not only develops a useful tool to study lignin-degrading microbial communities via metagenome sequencing but also advances our understanding of the ecological traits of these global microbiomes.
{"title":"Metagenomic-based discovery and comparison of the lignin degrading potential of microbiomes in aquatic and terrestrial ecosystems via the LCdb database","authors":"Jiyu Chen, Lu Lin, Qichao Tu, Qiannan Peng, Xiaopeng Wang, Congying Liang, Jiayin Zhou, Xiaoli Yu","doi":"10.1111/1755-0998.13950","DOIUrl":"10.1111/1755-0998.13950","url":null,"abstract":"<p>Lignin, as an abundant organic carbon, plays a vital role in the global carbon cycle. However, our understanding of the global lignin-degrading microbiome remains elusive. The greatest barrier has been absence of a comprehensive and accurate functional gene database. Here, we first developed a curated functional gene database (LCdb) for metagenomic profiling of lignin degrading microbial consortia. Via the LCdb, we draw a clear picture describing the global biogeography of communities with lignin-degrading potential. They exhibit clear niche differentiation at the levels of taxonomy and functional traits. The terrestrial microbiomes showed the highest diversity, yet the lowest correlations. In particular, there were few correlations between genes involved in aerobic and anaerobic degradation pathways, showing a clear functional redundancy property. In contrast, enhanced correlations, especially closer inter-connections between anaerobic and aerobic groups, were observed in aquatic consortia in response to the lower diversity. Specifically, <i>dypB</i> and <i>dypA</i>, are widespread on Earth, indicating their essential roles in lignin depolymerization. Estuarine and marine consortia featured the <i>laccase</i> and <i>mnsod</i> genes, respectively. Notably, the roles of archaea in lignin degradation were revealed in marine ecosystems. Environmental factors strongly influenced functional traits, but weakly shaped taxonomic groups. 
Null mode analysis further verified that composition of functional traits was deterministic, while taxonomic composition was highly stochastic, demonstrating that the environment selects functional genes rather than taxonomic groups. Our study not only develops a useful tool to study lignin degrading microbial communities via metagenome sequencing but also advances our understanding of ecological traits of these global microbiomes.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2024-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tianmin Zhang, Haohao Jing, Jinhong Wang, Le Zhao, Yang Liu, Stephen J. Rossiter, Huimeng Lu, Gang Li
The origin of flight and laryngeal echolocation in bats is likely to have been accompanied by evolutionary changes in other aspects of their sensory biology. Of all sensory modalities in bats, olfaction is perhaps the least well understood. Olfactory receptors (ORs) function in recognizing odour molecules, with crucial roles in evaluating food, as well as in processing social information. Here we compare OR repertoire sizes across taxa and apply a new pipeline that integrates comparative genome data with protein structure modelling. We then employ molecular docking techniques with small molecules to analyse OR functionality based on binding energies. Our results suggest a sharp contraction in odorant recognition of the functional OR repertoire during the origin of bats, consistent with a reduced dependence on olfaction. We also compared bat lineages with contrasting ecological characteristics and found evidence of differences in OR gene expansion and contraction, and in the composition of ORs with different tuning breadths. The strongest binding energies of ORs in non-echolocating fruit-eating bats were seen to correspond to ester odorants, although we did not detect a quantitative advantage of functional OR repertoires in these bats compared with echolocating insectivorous species. Overall, our findings based on molecular modelling and computational docking suggest that bats have undergone olfactory evolution linked to dietary adaptation. Our results from extant and ancestral bats help to lay the groundwork for targeted experimental functional tests in the future.
{"title":"Evolution of olfactory receptor superfamily in bats based on high throughput molecular modelling","authors":"Tianmin Zhang, Haohao Jing, Jinhong Wang, Le Zhao, Yang Liu, Stephen J. Rossiter, Huimeng Lu, Gang Li","doi":"10.1111/1755-0998.13958","DOIUrl":"10.1111/1755-0998.13958","url":null,"abstract":"<p>The origin of flight and laryngeal echolocation in bats is likely to have been accompanied by evolutionary changes in other aspects of their sensory biology. Of all sensory modalities in bats, olfaction is perhaps the least well understood. Olfactory receptors (ORs) function in recognizing odour molecules, with crucial roles in evaluating food, as well as in processing social information. Here we compare OR repertoire sizes across taxa and apply a new pipeline that integrates comparative genome data with protein structure modelling and then we employ molecular docking techniques with small molecules to analyse OR functionality based on binding energies. Our results suggest a sharp contraction in odorant recognition of the functional OR repertoire during the origin of bats, consistent with a reduced dependence on olfaction. We also compared bat lineages with contrasting different ecological characteristics and found evidence of differences in OR gene expansion and contraction, and in the composition of ORs with different tuning breadths. The strongest binding energies of ORs in non-echolocating fruit-eating bats were seen to correspond to ester odorants, although we did not detect a quantitative advantage of functional OR repertoires in these bats compared with echolocating insectivorous species. Overall, our findings based on molecular modelling and computational docking suggest that bats have undergone olfactory evolution linked to dietary adaptation. 
Our results from extant and ancestral bats help to lay the groundwork for targeted experimental functional tests in the future.</p>","PeriodicalId":211,"journal":{"name":"Molecular Ecology Resources","volume":"24 5","pages":""},"PeriodicalIF":7.7,"publicationDate":"2024-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140568995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Natalie Czajka, Joseph M. Northrup, Meaghan J. Jones, Aaron B. A. Shafer
The development of epigenetic clocks, or the DNA methylation-based inference of age, is an emerging tool for estimating age in free-ranging populations. In this study, we developed epigenetic clocks for three species of large mammals that are the focus of extensive management throughout their range in North America: white-tailed deer, black bear and mountain goat. We quantified differential DNA methylation patterns at over 30,000 cytosine-guanine sites (CpGs) from tissue samples of all three species (black bear n = 49; white-tailed deer n = 47; mountain goat n = 45). We used a penalized regression model (elastic net) to build explanatory (black bear r = .95; white-tailed deer r = .99; mountain goat r = .97) and robust (black bear Median Absolute Error or MAE = 1.33; white-tailed deer MAE = 0.29; mountain goat MAE = 0.61) models of age, or clocks. We also characterized individual CpG sites within each species that demonstrated clear differences in methylation levels between age classes and between sexes, which can be used to develop a suite of accessible diagnostic markers. This tool has the potential to contribute to wildlife monitoring by providing easily obtainable representations of age structure in managed populations.
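The two summary statistics reported for each clock, the correlation r between known and predicted age and the median absolute error (MAE), can be sketched as follows. This is a minimal illustration of the metrics only, not of the elastic net model itself; the function names and the toy ages are hypothetical and do not come from the study.

```python
from math import sqrt
from statistics import median


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_absolute_error(known, predicted):
    """Median of |known - predicted| across individuals."""
    return median(abs(k - p) for k, p in zip(known, predicted))


# Hypothetical known and clock-predicted ages for illustration only.
known_age = [1.0, 3.0, 5.0, 8.0, 12.0]
predicted_age = [1.5, 2.5, 5.0, 9.0, 11.0]

r = pearson_r(known_age, predicted_age)
mae = median_absolute_error(known_age, predicted_age)
print(f"r = {r:.2f}, MAE = {mae:.2f}")
```

Using the median rather than the mean of the absolute errors makes the error estimate less sensitive to a few badly predicted individuals.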
Natalie Czajka, Joseph M. Northrup, Meaghan J. Jones, Aaron B. A. Shafer. "Epigenetic clocks, sex markers and age-class diagnostics in three harvested large mammals." Molecular Ecology Resources 24(5). Published 2024-03-30. doi:10.1111/1755-0998.13956. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13956
Recent declines in insect abundances, especially populations of wild pollinators, pose a threat to many natural and agricultural ecosystems. Traditional species monitoring relies on morphological character identification and is inadequate for efficient and standardized surveys. DNA barcoding has become a standard approach for molecular identification of organisms, aiming to overcome the shortcomings of traditional biodiversity monitoring. However, its efficacy depends on the completeness of reference databases. Large DNA barcoding efforts are (almost entirely) lacking in many European countries and such patchy data limit Europe-wide analyses of precisely how to apply DNA barcoding in wild bee identification. Here, we advance towards an effective molecular identification of European wild bees. We conducted a high-effort survey of wild bees at the junction of central and southern Europe and DNA barcoded all collected morphospecies. We then complemented our DNA barcode dataset with all relevant European species, conducted global analyses of species delimitation and of general and genus-specific barcoding gaps, and examined the error rate in DNA data repositories. We found that (i) a sixth of all specimens from Slovenia could not be reliably identified, (ii) species delimitation methods show numerous systematic discrepancies, (iii) there is no general barcoding gap across all bees and (iv) the barcoding gap is genus-specific, but only after curating for errors in DNA data repositories. Intense sampling and barcoding efforts in underrepresented regions and strict curation of DNA barcode repositories are needed to enhance the use of DNA barcoding for the identification of wild bees.
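The abstract relies on the notion of a barcoding gap without defining it. Under the usual definition, a species shows a gap when its smallest distance to any other species exceeds its largest within-species distance. The sketch below computes that quantity for a toy alignment. The sequences, the species labels, the function names and the use of uncorrected p-distance are all assumptions made for illustration and do not reproduce the authors' pipeline.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gaps(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns {species: (max_intraspecific, min_interspecific, gap)}, where
    gap = min_interspecific - max_intraspecific; a positive gap means the
    species is separated from its nearest neighbour.
    """
    max_intra = {species: 0.0 for species, _ in records}
    min_inter = {species: float("inf") for species, _ in records}
    for (sp_a, seq_a), (sp_b, seq_b) in combinations(records, 2):
        d = p_distance(seq_a, seq_b)
        if sp_a == sp_b:
            max_intra[sp_a] = max(max_intra[sp_a], d)
        else:
            min_inter[sp_a] = min(min_inter[sp_a], d)
            min_inter[sp_b] = min(min_inter[sp_b], d)
    return {sp: (max_intra[sp], min_inter[sp], min_inter[sp] - max_intra[sp])
            for sp in max_intra}


# Hypothetical 10-bp "barcodes"; real COI barcodes are several hundred bp long.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGGAC"),
    ("species_B", "TCGAACGGAT"),
    ("species_C", "ACGTACGTCC"),
    ("species_C", "ACGTTCGTCC"),
]

gaps = barcoding_gaps(records)
for species, (intra, inter, gap) in sorted(gaps.items()):
    status = "gap" if gap > 0 else "no gap"
    print(f"{species}: max intra = {intra:.2f}, min inter = {inter:.2f} ({status})")
```

In this toy set, species_B is well separated, while species_A and species_C sit as close to each other as to their own conspecifics, so neither of those two shows a gap. Mislabelled records in a reference database can produce the same pattern, which is one reason curation affects whether a gap is detected.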
Šet Janko, Šturm Rok, Koderman Blaž, Bevk Danilo, Gogala Andrej, Kutnjak Denis, Čandek Klemen, Gregorič Matjaž. "DNA barcoding insufficiently identifies European wild bees (Hymenoptera, Anthophila) due to undefined species diversity, genus-specific barcoding gaps and database errors." Molecular Ecology Resources 24(5). Published 2024-03-25. doi:10.1111/1755-0998.13953. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/1755-0998.13953
Xinyi Zhang, Haimei Chen, Yang Ni, Bin Wu, Jingling Li, Artur Burzyński, Chang Liu
Tools for visualizing genomes are essential for investigating genomic features and their interactions. Currently, tools designed originally for animal mitogenomes and plant plastomes are used to visualize the mitogenomes of plants but cannot accurately display features specific to plant mitogenomes, such as nonlinear exon arrangement for genes, the prevalence of functional noncoding features and complex chromosomal architecture. To address these problems, a software package, plant mitochondrial genome map (PMGmap), was developed using the Python programming language. PMGmap can draw genes at exon levels; draw cis- and trans-splicing gene maps, noncoding features and repetitive sequences; and scale genic regions by using the scaling of the genic regions on the mitogenome (SAGM) algorithm. It can also draw multiple chromosomes simultaneously. Compared with other state-of-the-art tools, PMGmap showed better performance in visualizing 405 plant mitogenomes, showing potential as an invaluable tool for plant mitogenome research. The web and container versions and the source code of PMGmap can be accessed through the following link: http://www.1kmpg.cn/pmgmap.
Xinyi Zhang, Haimei Chen, Yang Ni, Bin Wu, Jingling Li, Artur Burzyński, Chang Liu. "Plant mitochondrial genome map (PMGmap): A software tool for the comprehensive visualization of coding, noncoding and genome features of plant mitochondrial genomes." Molecular Ecology Resources 24(5). Published 2024-03-24. doi:10.1111/1755-0998.13952