
Molecular Ecology Resources: Latest Articles

OGU: A Toolbox for Better Utilising Organelle Genomic Data.
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-11-11 DOI: 10.1111/1755-0998.14044
Ping Wu, Ningning Xue, Jie Yang, Qiang Zhang, Yuzhe Sun, Wen Zhang

Organelle genomes serve as crucial datasets for investigating the genetics and evolution of plants and animals, genome diversity, and species identification. To enhance the collection, analysis, and visualisation of such data, we have developed a novel open-source software tool named Organelle Genome Utilities (OGU). The software encompasses three modules designed to streamline the handling of organelle genome data. The data collection module is dedicated to retrieving, validating and organising sequence information. The evaluation module assesses sequence variance using a range of methods, including novel metrics termed stem and terminal phylogenetic diversity. The primer module designs universal primers for downstream applications. Finally, a visualisation pipeline has been developed to present comprehensive insights into organelle genomes across different lineages rather than focusing solely on individual species. The performance, compatibility and stability of OGU have been rigorously evaluated through benchmarking with four datasets, including one million mixed GenBank records, plastid genomic data from the Lamiaceae family, mitochondrial data from rodents, and 308 plastid genomes sourced from various angiosperm families. Based on software capabilities, we identified 30 plastid intergenic spacers. These spacers exhibit a moderate evolutionary rate and offer practical utility comparable to coding regions, highlighting the potential applications of intergenic spacers in organelle genomes. We anticipate that OGU will substantially enhance the efficient utilisation of organelle genomic data and broaden the prospects for related research endeavours.

Citations: 0
Correction to “Characterisation of Putative Circular Plasmids in Sponge-Associated Bacterial Communities Using a Selective Multiply-Primed Rolling Circle Amplification”
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-11-04 DOI: 10.1111/1755-0998.14043

Oliveira, V., A. R. M. Polónia, D. F. R. Cleary, et al. 2021. “Characterization of putative circular plasmids in sponge-associated bacterial communities using a selective multiply-primed rolling circle amplification.” Molecular Ecology Resources 21, no. 1: 110–121. https://doi.org/10.1111/1755-0998.13248.

The authors of the above article noticed an error in the DNA concentration which is detailed in the ‘Methods’ section, section 2.3 (‘Selective multiply-primed rolling circle amplification’), paragraph 2. The correct text should read as ‘1 μL template DNA (ca. 200 ng)’.

The authors apologise for this error and any inconvenience it may have caused.

Citations: 0
The Chromosome-Scale Genome of Magnolia sieboldii K. Koch Provides Insight Into the Evolutionary Position of Magnoliids and Seed Germination
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-30 DOI: 10.1111/1755-0998.14030
Xiujun Lu, Mei Mei, Lin Liu, Xin Xu, Wanfeng Ai

Magnolia sieboldii K. Koch (M. sieboldii) stands as an elegant tree species within the Magnoliaceae family, esteemed for its exquisite beauty, cultural significance and economic advantages. The species faces challenges in seed germination under natural conditions, primarily attributed to morphological dormancy. Despite its significance, the molecular mechanisms governing M. sieboldii seed germination remain elusive, compounded by the absence of genomic resources specific to this species. In this study, we present the first chromosome-scale genome assembly of M. sieboldii, with a total genome size of 2.01 Gb, including 1096 scaffolds assigned to 19 chromosomes (N50 = 102.4 Mb). Phylogenetic analyses, incorporating 13 plant species, illuminate the evolutionary independence of Magnoliids from monocots and eudicots, positioning them as a sister clade. Through RNA-seq analysis, we identify pivotal genes and pathways contributing to seed dormancy and germination. In addition, our investigation delves into the far-red-impaired response (FAR1) transcription factor gene family, revealing their enrichment throughout evolution and their involvement in the intricate process of seed germination. This comprehensive genome sequencing initiative offers invaluable insights into the biological attributes of M. sieboldii, with a specific emphasis on unravelling the complexities of seed dormancy and germination.
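
The N50 quoted in the abstract is a standard contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the M. sieboldii assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical scaffold lengths in Mb (not real data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 300, so the answer is 70
```

Because N50 depends only on the sorted lengths, it says nothing about whether the scaffolds are correctly joined, so it is usually reported alongside other quality measures.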

Citations: 0
A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-28 DOI: 10.1111/1755-0998.14036
Oliver W. White, Andie Hall, Ben W. Price, Suzanne T. Williams, Matthew D. Clark

Low coverage ‘genome-skims’ are often used to assemble organelle genomes and ribosomal gene sequences for cost-effective phylogenetic and barcoding studies. Natural history collections hold invaluable biological information, yet poor preservation resulting in degraded DNA often hinders polymerase chain reaction-based analyses. However, it is possible to generate libraries and sequence the short fragments typical of degraded DNA to generate genome-skims from museum collections. Here we introduce a Snakemake toolkit comprising three pipelines, skim2mito, skim2rrna and gene2phylo, designed to unlock the genomic potential of historical museum specimens using genome skimming. Specifically, skim2mito and skim2rrna perform the batch assembly, annotation and phylogenetic analysis of mitochondrial genomes and nuclear ribosomal genes, respectively, from low-coverage genome skims. The third pipeline, gene2phylo, takes a set of gene alignments and performs phylogenetic analysis of individual genes, partitioned analysis of concatenated alignments and a phylogenetic analysis based on gene trees. We benchmark our pipelines with simulated data, followed by testing with a novel genome skimming dataset from both recent and historical solariellid gastropod samples. We show that the toolkit can recover mitochondrial and ribosomal genes from poorly preserved museum specimens of the gastropod family Solariellidae, and the phylogenetic analysis is consistent with our current understanding of taxonomic relationships. The generation of bioinformatic pipelines that facilitate processing large quantities of sequence data from the vast repository of specimens held in natural history museum collections will greatly aid species discovery and exploration of biodiversity over time, ultimately aiding conservation efforts in the face of a changing planet.
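
A partitioned analysis of concatenated alignments needs two inputs: the joined sequences and the column range each gene occupies. The sketch below illustrates that bookkeeping in plain Python. It is not the gene2phylo implementation; the function name, the dictionary layout and the toy gene and sample names are assumptions made for illustration.

```python
def concatenate_alignments(gene_alignments):
    """Concatenate per-gene alignments and record partition boundaries.

    gene_alignments maps gene name -> {sample name: aligned sequence}.
    Samples absent from a gene are padded with gap characters.
    Returns (concatenated sequences, list of (gene, start, end)) with
    1-based inclusive column coordinates.
    """
    samples = sorted({s for aln in gene_alignments.values() for s in aln})
    pieces = {s: [] for s in samples}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError(f"{gene}: sequences differ in length")
        width = widths.pop()
        for s in samples:
            pieces[s].append(aln.get(s, "-" * width))
        partitions.append((gene, start, start + width - 1))
        start += width
    return {s: "".join(parts) for s, parts in pieces.items()}, partitions


# Hypothetical toy alignments (not real data).
genes = {
    "cox1": {"sampleA": "ATGC", "sampleB": "ATGA"},
    "nad1": {"sampleA": "GG-T", "sampleC": "GGCT"},
}
matrix, parts = concatenate_alignments(genes)
print(matrix)
print(parts)
```

Padding missing genes with gaps keeps every sample in the matrix, which matters for museum specimens where only some loci are recovered.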

Citations: 0
That's Not a Hybrid: How to Distinguish Patterns of Admixture and Isolation By Distance.
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-28 DOI: 10.1111/1755-0998.14039
Ben J Wiens, Jocelyn P Colella

Describing naturally occurring genetic variation is a fundamental goal of molecular phylogeography and population genetics. Popular methods for this task include STRUCTURE, a model-based algorithm that assigns individuals to genetic clusters, and principal component analysis (PCA), a parameter-free method. The ability of STRUCTURE to infer mixed ancestry makes it popular for documenting natural hybridisation, which is of considerable interest to evolutionary biologists, given that such systems provide a window into the speciation process. Yet, STRUCTURE can produce misleading results when its underlying assumptions are violated, like when genetic variation is distributed continuously across geographic space. To test the ability of STRUCTURE and PCA to accurately distinguish admixture from continuous variation, we use forward-time simulations to generate population genetic data under three demographic scenarios: two involving admixture and one with isolation by distance (IBD). STRUCTURE and PCA alone cannot distinguish admixture from IBD, but complementing these analyses with triangle plots, which visualise hybrid index against interclass heterozygosity, provides more accurate inference of demographic history, especially in cases of recent admixture. We demonstrate that triangle plots are robust to missing data, while STRUCTURE and PCA are not, and show that setting a low allele frequency difference threshold for ancestry-informative marker (AIM) identification can accurately characterise the relationship between hybrid index and interclass heterozygosity across demographic histories of admixture and range expansion. While STRUCTURE and PCA provide useful summaries of genetic variation, results should be paired with triangle plots before admixture is inferred.
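
The two axes of a triangle plot can be computed per individual once ancestry-informative markers have been chosen. The sketch below is an illustration rather than the authors' code: it assumes each genotype is coded as the number of copies (0, 1 or 2) of the allele assigned to parental group 2, with `None` for missing calls, and the function names and the 0.75 threshold are hypothetical.

```python
def aim_loci(freq_p1, freq_p2, threshold):
    """Indices of loci whose allele-frequency difference between the
    two parental groups is at least `threshold`."""
    return [
        i
        for i, (a, b) in enumerate(zip(freq_p1, freq_p2))
        if abs(a - b) >= threshold
    ]


def triangle_coordinates(genotype):
    """Return (hybrid index, interclass heterozygosity) for one individual.

    genotype: per-marker counts (0, 1, 2) of the group-2 allele,
    with None marking a missing call.
    """
    called = [g for g in genotype if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Hybrid index: proportion of allele copies assigned to group 2.
    hybrid_index = sum(called) / (2 * len(called))
    # Interclass heterozygosity: proportion of markers carrying
    # one allele from each parental group.
    interclass_het = sum(1 for g in called if g == 1) / len(called)
    return hybrid_index, interclass_het


print(aim_loci([0.0, 0.5, 0.9], [1.0, 0.6, 0.1], 0.75))  # [0, 2]
print(triangle_coordinates([1, 1, 1, 1]))  # F1-like: (0.5, 1.0)
print(triangle_coordinates([0, 0, 0, 0]))  # parental-like: (0.0, 0.0)
```

Both coordinates are proportions over the markers that were actually called, so an individual with missing data is still placed on the plot using its remaining markers.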

Citations: 0
Correcting for Replicated Genotypes May Introduce More Problems Than it Solves.
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-28 DOI: 10.1111/1755-0998.14041
Patrick G Meirmans

Across the tree of life, many organisms are able to reproduce clonally, via vegetative spread, budding or parthenogenesis. In population genetic analyses of clonally reproducing organisms, it is common practice to retain only a single representative per multilocus genotype. Though this practice of clone correction is widespread, the theoretical justification behind it has been very little studied. Here, I use individual-based simulations to study the effect of clone correction on the estimation of the genetic summary statistics H_O, H_S, F_IS, F_ST, F''_ST and D_est. The simulations follow the standard finite island model, consisting of a set of populations connected by gene flow, but with a variable rate of sexual versus asexual reproduction. The results of the simulations show that by itself, the inclusion of replicated genotypes does not lead to a deviation in the values of the summary statistics, except when the rate of sexual reproduction is less than about one in a thousand. However, clone correction can introduce a strong deviation in the values of most of the statistics, when compared to a scenario of full sexual reproduction. For H_S and F_IS, this deviation can be informative about the process of asexual reproduction, but for F_ST, F''_ST and D_est, clone correction can lead to incorrect conclusions. I therefore argue that clone correction is not strictly necessary, but can in some cases be insightful. However, when clone correction is applied, it is imperative that results for both the corrected and uncorrected data are presented.
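
Clone correction itself is a simple filtering step, and reporting a statistic before and after it, as the abstract recommends, takes only a few lines. The sketch below is illustrative and is not the simulation code used in the study; it assumes correction is applied within populations, and the data layout, function names and toy genotypes are hypothetical.

```python
def clone_correct(samples):
    """Keep one individual per distinct multilocus genotype per population.

    samples: list of (population, genotype), where genotype is a tuple of
    loci and each locus is a tuple of two alleles.
    """
    seen = set()
    kept = []
    for population, genotype in samples:
        # Sort alleles within each locus so ("A", "a") equals ("a", "A").
        normalised = tuple(tuple(sorted(locus)) for locus in genotype)
        key = (population, normalised)
        if key not in seen:
            seen.add(key)
            kept.append((population, genotype))
    return kept


def observed_heterozygosity(samples):
    """Proportion of heterozygous single-locus genotypes over all
    individuals and loci."""
    het = total = 0
    for _, genotype in samples:
        for allele_1, allele_2 in genotype:
            total += 1
            het += allele_1 != allele_2
    return het / total


# Hypothetical data: one clone sampled three times plus one other genotype.
clone = (("A", "a"), ("B", "b"))
other = (("A", "A"), ("B", "B"))
data = [("pop1", clone), ("pop1", clone), ("pop1", clone), ("pop1", other)]

print(observed_heterozygosity(data))                 # uncorrected: 0.75
print(observed_heterozygosity(clone_correct(data)))  # corrected: 0.5
```

In this toy case the heterozygous clone dominates the uncorrected sample, so the two values differ markedly, which is why presenting both is informative.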

Citations: 0
Detecting Assembly Errors With Klumpy: Building Confidence in Your Daily Genomic Analysis
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-23 DOI: 10.1111/1755-0998.14037
Isheng Jason Tsai

In the realm of genome assembly, even minor errors can send researchers down rabbit holes of unintended misinterpretation. Enter Klumpy, a tool designed to help detect these elusive mistakes before they cause significant problems. By providing detailed, region-specific assessments and an intuitive visualisation platform, Klumpy (Madrigal, et al. 2024) empowers researchers to pinpoint and resolve potential issues with precision, paving the way for more reliable downstream analyses and discoveries.

Citations: 0
Genotyping Error Detection and Customised Filtration for SNP Datasets
IF 5.5 Zone 1 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-22 DOI: 10.1111/1755-0998.14033
Noa Yaffa Kan-Lingwood, Liran Sagi, Shahar Mazie, Naama Shahar, Lilith Zecherle Bitton, Alan Templeton, Daniel Rubenstein, Amos Bouskila, Shirli Bar-David

A major challenge in analysing single-nucleotide polymorphism (SNP) genotype datasets is detecting and filtering errors that bias analyses and lead to misinterpretation of ecological and evolutionary processes. Here, we present a comprehensive method to estimate and minimise genotyping error rates (deviations from the ‘true’ genotype) in any SNP dataset using triplicates (three repeats of the same sample) in a four-step filtration pipeline. The approach involves: (1) SNP filtering by missing data; (2) SNP filtering by error rates; (3) sample filtering by missing data and (4) detection of recaptured individuals by using estimated SNP error rates. The modular pipeline is provided in an R script that allows customised adjustments. We demonstrate the applicability of the method using non-invasive sampling from the Asiatic wild ass (Equus hemionus) population in Israel. We genotyped 756 samples using 625 SNPs, of which 255 were triplicates of 85 samples. The average SNP error rate, calculated based on the number of mismatching genotypes across triplicates before filtration, was 0.0034 and was reduced to 0.00174 following filtration. Evaluating genetic distance (GD) and relatedness (r) between triplicates before and after filtration (expected to be at the minimum and maximum respectively) showed a significant reduction in the average GD, from 58.1 to 25.3 (p = 0.0002) and a significant increase in relatedness, from r = 0.98 to r = 0.991 (p = 0.00587). We demonstrate how error rate estimation enhances recapture detection and improves genotype quality.
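
The idea of estimating per-SNP error from triplicates can be sketched as follows. The authors provide their pipeline as an R script; this Python fragment is not that script, and its exact definition of the error rate (mismatching replicate pairs divided by comparable pairs) is an assumption made for illustration, as are the function name, the toy genotypes and the 0.1 threshold.

```python
from itertools import combinations


def snp_error_rates(triplicates):
    """Estimate a per-SNP mismatch rate from replicated samples.

    triplicates: list of replicate sets; each set is a list of replicate
    genotype lists (one genotype per SNP, None for a missing call).
    Returns, per SNP, mismatching replicate pairs / comparable pairs,
    or None when no pair could be compared.
    """
    n_snps = len(triplicates[0][0])
    mismatches = [0] * n_snps
    comparisons = [0] * n_snps
    for replicates in triplicates:
        for rep_a, rep_b in combinations(replicates, 2):
            for i in range(n_snps):
                a, b = rep_a[i], rep_b[i]
                if a is None or b is None:
                    continue  # missing calls are not counted as errors
                comparisons[i] += 1
                if a != b:
                    mismatches[i] += 1
    return [m / c if c else None for m, c in zip(mismatches, comparisons)]


# Hypothetical genotypes for two triplicated samples at three SNPs.
sample_1 = [["AA", "AG", "GG"], ["AA", "AG", "GG"], ["AA", "AA", None]]
sample_2 = [["AG", "GG", "CC"], ["AG", "GG", "CC"], ["AG", "GG", "CC"]]

rates = snp_error_rates([sample_1, sample_2])
# Keep SNPs whose estimated rate does not exceed an arbitrary threshold.
kept = [i for i, r in enumerate(rates) if r is not None and r <= 0.1]
print(rates)
print(kept)  # [0, 2]
```

Treating missing calls separately from mismatches mirrors the abstract's split between filtering by missing data and filtering by error rate.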

Citations: 0
Three Novel Spider Genomes Unveil Spidroin Diversification and Hox Cluster Architecture: Ryuthela nishihirai (Liphistiidae), Uloborus plumipes (Uloboridae) and Cheiracanthium punctorium (Cheiracanthiidae)
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-22 DOI: 10.1111/1755-0998.14038
Yannis Schöneberg, Tracy Lynn Audisio, Alexander Ben Hamadou, Martin Forman, Jiří Král, Tereza Kořínková, Eva Líznarová, Christoph Mayer, Lenka Prokopcová, Henrik Krehenwinkel, Stefan Prost, Susan Kennedy

Spiders are a hyperdiverse taxon and among the most abundant predators in nearly all terrestrial habitats. Their success is often attributed to key developments in their evolution such as silk and venom production and major apomorphies such as a whole-genome duplication. Resolving deep relationships within the spider tree of life has been historically challenging, making it difficult to measure the relative importance of these novelties for spider evolution. Whole-genome data offer an essential resource for these efforts, as well as for functional genomic studies. Here, we present de novo assemblies for three spider species: Ryuthela nishihirai (Liphistiidae), a representative of the ancient Mesothelae, the suborder that is sister to all other extant spiders; Uloborus plumipes (Uloboridae), a cribellate orbweaver whose phylogenetic placement is especially challenging; and Cheiracanthium punctorium (Cheiracanthiidae), which represents only the second family to be sequenced in the hyperdiverse Dionycha clade. These genomes fill critical gaps in the spider tree of life. Using these novel genomes along with 25 previously published ones, we examine the evolutionary history of spidroin gene diversity and structural Hox cluster diversity. Our assemblies provide critical genomic resources to facilitate deeper investigations into spider evolution. The near chromosome-level genome of the ‘living fossil’ R. nishihirai represents an especially important step forward, offering new insights into the origins of spider traits.

Citations: 0
The Ribosomal Operon Database: A Full-Length rDNA Operon Database Derived From Genome Assemblies
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-21 DOI: 10.1111/1755-0998.14031
Anders K. Krabberød, Embla Stokke, Ella Thoen, Inger Skrede, Håvard Kauserud

Current rDNA reference sequence databases are tailored towards shorter DNA markers, such as parts of the 16/18S marker or the internal transcribed spacer (ITS) region. However, due to advances in long-read DNA sequencing technologies, longer stretches of the rDNA operon are increasingly used in environmental sequencing studies to increase the phylogenetic resolution. There is, therefore, a growing need for longer rDNA reference sequences. Here, we present the ribosomal operon database (ROD), which includes eukaryotic full-length rDNA operons extracted from publicly available genome assemblies. Full-length operons were detected in 34.1% of the 34,701 examined eukaryotic genome assemblies from NCBI. In most cases (53.1%), more than one operon variant was detected, which can be due to intragenomic operon copy variability, allelic variation in non-haploid genomes, or technical errors from the sequencing and assembly process. The highest copy number found was 5947 in Zea mays. In total, 453,697 unique operons were detected, with 69,480 operon variant clusters remaining after intragenomic clustering at 99% sequence identity. The operon length varied extensively across eukaryotes, ranging from 4136 to 16,463 bp, which will lead to considerable polymerase chain reaction (PCR) bias during amplification of the entire operon. Clustering the full-length operons revealed that the different parts (i.e., 18S, 28S, and the hypervariable regions V4 and V9 of 18S) provide divergent taxonomic resolution, with 18S and its V4 and V9 regions being the most conserved. The ROD will be updated regularly to provide an increasing number of full-length rDNA operons to the scientific community.

Citations: 0