Molecular Ecology Resources: Latest Articles

Genotyping Error Detection and Customised Filtration for SNP Datasets.
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-22 | DOI: 10.1111/1755-0998.14033
Noa Yaffa Kan-Lingwood, Liran Sagi, Shahar Mazie, Naama Shahar, Lilith Zecherle Bitton, Alan Templeton, Daniel Rubenstein, Amos Bouskila, Shirli Bar-David

A major challenge in analysing single-nucleotide polymorphism (SNP) genotype datasets is detecting and filtering errors that bias analyses and lead to misinterpretation of ecological and evolutionary processes. Here, we present a comprehensive method to estimate and minimise genotyping error rates (deviations from the 'true' genotype) in any SNP dataset using triplicates (three repeats of the same sample) in a four-step filtration pipeline. The approach involves: (1) SNP filtering by missing data; (2) SNP filtering by error rates; (3) sample filtering by missing data; and (4) detection of recaptured individuals using the estimated SNP error rates. The modular pipeline is provided in an R script that allows customised adjustments. We demonstrate the applicability of the method using non-invasive sampling from the Asiatic wild ass (Equus hemionus) population in Israel. We genotyped 756 samples using 625 SNPs; 255 of these samples were triplicates of 85 samples. The average SNP error rate, calculated based on the number of mismatching genotypes across triplicates before filtration, was 0.0034 and was reduced to 0.00174 following filtration. Evaluating genetic distance (GD) and relatedness (r) between triplicates before and after filtration (expected to be at the minimum and maximum respectively) showed a significant reduction in the average GD, from 58.1 to 25.3 (p = 0.0002), and a significant increase in relatedness, from r = 0.98 to r = 0.991 (p = 0.00587). We demonstrate how error rate estimation enhances recapture detection and improves genotype quality.
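
As an illustration of how a per-SNP error rate can be derived from triplicates, the following minimal Python sketch counts mismatching genotypes between replicate pairs. It is not the authors' R script: the function names `snp_error_rates` and `filter_snps`, the pairwise-mismatch definition of the rate, and the threshold value are all assumptions made for this example, and the published pipeline may define the error rate differently.

```python
from itertools import combinations


def snp_error_rates(triplicates):
    """Per-SNP mismatch rate across replicated samples.

    triplicates: list of samples; each sample is a list of replicate
    genotype lists (one genotype per SNP, None = missing call).
    Assumed definition: mismatching replicate pairs / comparable pairs.
    """
    n_snps = len(triplicates[0][0])
    mismatches = [0] * n_snps
    comparisons = [0] * n_snps
    for replicates in triplicates:
        for rep_a, rep_b in combinations(replicates, 2):
            for snp, (a, b) in enumerate(zip(rep_a, rep_b)):
                if a is None or b is None:
                    continue  # missing calls are not counted as errors
                comparisons[snp] += 1
                if a != b:
                    mismatches[snp] += 1
    # None marks SNPs that could not be compared in any replicate pair
    return [m / c if c else None for m, c in zip(mismatches, comparisons)]


def filter_snps(rates, max_error):
    """Indices of SNPs whose estimated error rate is at most max_error."""
    return [i for i, r in enumerate(rates) if r is not None and r <= max_error]


if __name__ == "__main__":
    # Two hypothetical samples, three replicates each, three SNPs
    data = [
        [["AA", "AG", "GG"], ["AA", "AG", "GG"], ["AA", "AA", "GG"]],
        [["AG", "GG", None], ["AG", "GG", "GG"], ["AG", "GG", "GG"]],
    ]
    rates = snp_error_rates(data)
    print(rates)
    print(filter_snps(rates, max_error=0.1))
```

Skipping missing calls rather than scoring them as mismatches keeps this step separate from the missing-data filters that the abstract lists as steps (1) and (3).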

Citations: 0
Three Novel Spider Genomes Unveil Spidroin Diversification and Hox Cluster Architecture: Ryuthela nishihirai (Liphistiidae), Uloborus plumipes (Uloboridae) and Cheiracanthium punctorium (Cheiracanthiidae).
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-22 | DOI: 10.1111/1755-0998.14038
Yannis Schöneberg, Tracy Lynn Audisio, Alexander Ben Hamadou, Martin Forman, Jiří Král, Tereza Kořínková, Eva Líznarová, Christoph Mayer, Lenka Prokopcová, Henrik Krehenwinkel, Stefan Prost, Susan Kennedy

Spiders are a hyperdiverse taxon and among the most abundant predators in nearly all terrestrial habitats. Their success is often attributed to key developments in their evolution such as silk and venom production and major apomorphies such as a whole-genome duplication. Resolving deep relationships within the spider tree of life has been historically challenging, making it difficult to measure the relative importance of these novelties for spider evolution. Whole-genome data offer an essential resource in these efforts, but also for functional genomic studies. Here, we present de novo assemblies for three spider species: Ryuthela nishihirai (Liphistiidae), a representative of the ancient Mesothelae, the suborder that is sister to all other extant spiders; Uloborus plumipes (Uloboridae), a cribellate orbweaver whose phylogenetic placement is especially challenging; and Cheiracanthium punctorium (Cheiracanthiidae), which represents only the second family to be sequenced in the hyperdiverse Dionycha clade. These genomes fill critical gaps in the spider tree of life. Using these novel genomes along with 25 previously published ones, we examine the evolutionary history of spidroin gene and structural hox cluster diversity. Our assemblies provide critical genomic resources to facilitate deeper investigations into spider evolution. The near chromosome-level genome of the 'living fossil' R. nishihirai represents an especially important step forward, offering new insights into the origins of spider traits.

Citations: 0
The Ribosomal Operon Database: A Full-Length rDNA Operon Database Derived From Genome Assemblies.
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-21 | DOI: 10.1111/1755-0998.14031
Anders K Krabberød, Embla Stokke, Ella Thoen, Inger Skrede, Håvard Kauserud

Current rDNA reference sequence databases are tailored towards shorter DNA markers, such as parts of the 16S/18S marker or the internal transcribed spacer (ITS) region. However, due to advances in long-read DNA sequencing technologies, longer stretches of the rDNA operon are increasingly used in environmental sequencing studies to increase the phylogenetic resolution. There is, therefore, a growing need for longer rDNA reference sequences. Here, we present the ribosomal operon database (ROD), which includes eukaryotic full-length rDNA operons fished from publicly available genome assemblies. Full-length operons were detected in 34.1% of the 34,701 examined eukaryotic genome assemblies from NCBI. In most cases (53.1%), more than one operon variant was detected, which can be due to intragenomic operon copy variability, allelic variation in non-haploid genomes, or technical errors from the sequencing and assembly process. The highest copy number found was 5947 in Zea mays. In total, 453,697 unique operons were detected, with 69,480 operon variant clusters remaining after intragenomic clustering at 99% sequence identity. The operon length varied extensively across eukaryotes, ranging from 4136 to 16,463 bp, which will lead to considerable polymerase chain reaction (PCR) bias during amplification of the entire operon. Clustering the full-length operons revealed that the different parts (i.e., 18S, 28S, and the hypervariable regions V4 and V9 of 18S) provide divergent taxonomic resolution, with 18S and its V4 and V9 regions being the most conserved. The ROD will be updated regularly to provide an increasing number of full-length rDNA operons to the scientific community.
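
To make the idea of clustering operon variants at 99% sequence identity concrete, here is a toy Python sketch of greedy centroid clustering. The abstract does not name the clustering tool or identity definition used for ROD, so everything here is an assumption for illustration: `identity` simply compares pre-aligned, equal-length strings position by position, and `greedy_cluster` is a hypothetical helper, not part of any real pipeline.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned sequences.

    Assumption: both sequences already have the same length (no gaps handled).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold=0.99):
    """Greedy centroid clustering.

    Each sequence joins the first cluster whose centroid it matches at
    >= threshold identity; otherwise it becomes a new centroid.
    The first member of each returned cluster is its centroid.
    """
    clusters = []
    for seq in seqs:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])  # no centroid was close enough
    return clusters


if __name__ == "__main__":
    base = "ACGT" * 50                 # 200 bp toy "operon"
    one_change = "T" + base[1:]        # 1 mismatch  -> 99.5% identity
    three_changes = "TTTT" + base[4:]  # 3 mismatches -> 98.5% identity
    for cluster in greedy_cluster([base, one_change, three_changes]):
        print(len(cluster), "member(s)")
```

Because each sequence is compared only against centroids, the result depends on input order; real tools typically sort sequences (for example by length or abundance) before clustering.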

Citations: 0
Revisiting the Briggs Ancient DNA Damage Model: A Fast Maximum Likelihood Method to Estimate Post-Mortem Damage.
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-21 | DOI: 10.1111/1755-0998.14029
Lei Zhao, Rasmus Amund Henriksen, Abigail Ramsøe, Rasmus Nielsen, Thorfinn Sand Korneliussen

One essential initial step in the analysis of ancient DNA is to authenticate that the DNA sequencing reads are actually from ancient DNA. This is done by assessing if the reads exhibit typical characteristics of post-mortem damage (PMD), including cytosine deamination and nicks. We present a novel statistical method, implemented in a fast multithreaded programme, ngsBriggs, that enables rapid quantification of PMD by estimation of the Briggs ancient damage model parameters (Briggs parameters). Using a multinomial model with maximum likelihood fit, ngsBriggs accurately estimates the parameters of the Briggs model, quantifying the PMD signal from single and double-stranded DNA regions. We extend the original Briggs model to capture PMD signals for contemporary sequencing platforms and show that ngsBriggs accurately estimates the Briggs parameters across a variety of contamination levels. Classification of reads into ancient or modern reads, for the purpose of decontamination, is significantly more accurate using ngsBriggs than using other available methods. Furthermore, ngsBriggs is substantially faster than other state-of-the-art methods. ngsBriggs offers a practical and accurate method for researchers seeking to authenticate ancient DNA and improve the quality of their data.
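
The PMD signal mentioned above is usually visible as an excess of C-to-T mismatches near the 5' ends of reads. The Python sketch below tabulates that descriptive profile from toy data. It is not ngsBriggs and does not fit the Briggs model by maximum likelihood; the function name `c_to_t_by_position` and the gap-free (reference, read) input format are assumptions chosen to keep the example self-contained.

```python
def c_to_t_by_position(pairs, n_positions=5):
    """Fraction of reference C's observed as T at each 5'-end position.

    pairs: iterable of (reference, read) strings, both written 5'->3'
    and assumed to be aligned without gaps.
    Returns a list of length n_positions; None where no reference C
    was observed at that position.
    """
    c_total = [0] * n_positions
    c_to_t = [0] * n_positions
    for ref, read in pairs:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1  # candidate deamination event
    return [t / c if c else None for t, c in zip(c_to_t, c_total)]


if __name__ == "__main__":
    # Hypothetical alignments: damage concentrated at the first base
    toy = [("CCAC", "TCAC"), ("CGCC", "TGCC"), ("CACA", "CACA")]
    print(c_to_t_by_position(toy, n_positions=4))
```

A profile that decays with distance from the read end is the kind of pattern a damage model is then fitted to; a flat profile is more typical of modern DNA.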

Citations: 0
Sediment Core DNA-Metabarcoding and Chitinous Remain Identification: Integrating Complementary Methods to Characterise Chironomidae Biodiversity in Lake Sediment Archives.
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-21 | DOI: 10.1111/1755-0998.14035
Lucas André Blattner, Pierre Lapellegerie, Colin Courtney-Mustaphi, Oliver Heiri

Chironomidae, so-called non-biting midges, are considered key bioindicators of aquatic ecosystem variability. Data derived from morphologically identifying their chitinous remains in sediments document chironomid larvae assemblages, which are studied to reconstruct ecosystem changes over time. Recent developments in sedimentary DNA (sedDNA) research have demonstrated that molecular techniques are suitable for determining past and present occurrences of organisms. Nevertheless, sedDNA records documenting alterations in chironomid assemblages remain largely unexplored. To close this gap, we examined the applicability of sedDNA metabarcoding to identify Chironomidae assemblages in lake sediments by sampling and processing three 21-35 cm long sediment cores from Lake Sempach in Switzerland. With a focus on developing analytical approaches, we compared an invertebrate-universal (FWH) and a newly designed Chironomidae-specific metabarcoding primer set (CH) to assess their performance in detecting Chironomidae DNA. We isolated and identified chitinous larval remains and compared the morphotype assemblages with the data derived from sedDNA metabarcoding. Results showed a good overall agreement of the morphotype assemblage-specific clustering among the chitinous remains and the metabarcoding datasets. Both methods indicated higher chironomid assemblage similarity between the two littoral cores in contrast to the deep lake core. Moreover, we observed a pronounced primer bias effect resulting in more Chironomidae detections with the CH primer combination compared to the FWH combination. Overall, we conclude that sedDNA metabarcoding can supplement traditional remain identifications and potentially provide independent reconstructions of past chironomid assemblage changes. Furthermore, it has the potential of more efficient workflows, better sample standardisation and species-level resolution datasets.
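
One simple way to express "assemblage similarity" between cores is the Jaccard index on presence/absence data. The abstract does not state which similarity measure or clustering method the authors used, so the Python sketch below is only an assumed stand-in; the core names and taxon labels are hypothetical placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two taxon sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    # Convention chosen here: two empty assemblages count as identical
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(assemblages):
    """Jaccard similarity for every pair of named assemblages."""
    return {
        (x, y): jaccard(assemblages[x], assemblages[y])
        for x, y in combinations(sorted(assemblages), 2)
    }


if __name__ == "__main__":
    # Hypothetical detections per sediment core
    cores = {
        "littoral_1": {"taxon_a", "taxon_b", "taxon_c"},
        "littoral_2": {"taxon_a", "taxon_b", "taxon_d"},
        "deep": {"taxon_d", "taxon_e"},
    }
    for pair, score in pairwise_similarity(cores).items():
        print(pair, round(score, 2))
```

The same function can be applied separately to the morphotype data and to each primer set's detections, which allows the resulting similarity patterns to be compared across methods.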

Citations: 0
Correction to "A dedicated target capture approach reveals variable genetic markers across micro- and macro-evolutionary time scales in palms".
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-18 | DOI: 10.1111/1755-0998.14032
Citations: 0
What can optimized cost distances based on genetic distances offer? A simulation study on the use and misuse of ResistanceGA.
IF 5.5 | Zone 1 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2024-10-17 | DOI: 10.1111/1755-0998.14024
Alexandrine Daniel, Paul Savary, Jean-Christophe Foltête, Gilles Vuidel, Bruno Faivre, Stéphane Garnier, Aurélie Khimoun

Modelling population connectivity is central to biodiversity conservation and often relies on resistance surfaces reflecting multi-generational gene flow. ResistanceGA (RGA) is a common optimization framework for parameterizing these surfaces by maximizing the fit between genetic distances and cost distances using maximum likelihood population effect models. As the reliability of this framework has rarely been studied, we investigated the conditions maximizing its accuracy for both prediction and interpretation of landscape features' permeability. We ran demo-genetic simulations in contrasted landscapes for species with distinct dispersal capacities and specialization levels, using corresponding reference cost scenarios. We then optimized resistance surfaces from the simulated genetic distances using RGA. First, we evaluated whether RGA identified the drivers of the genetic patterns, that is, distinguished Isolation-by-Resistance (IBR) patterns from either Isolation-by-Distance or patterns unrelated to ecological distances. We then assessed RGA predictive performance using a cross-validation method, and its ability to recover the reference cost scenarios shaping genetic structure in simulations. IBR patterns were well detected and genetic distances were predicted with great accuracy. This performance depended on the strength of the genetic structuring, sampling design and landscape structure. Matching the scale of the genetic pattern by focusing on population pairs connected through gene flow and limiting overfitting through cross-validation further enhanced inference reliability. Yet, the optimized cost values often departed from the reference values, making their interpretation and extrapolation potentially dubious. While demonstrating the value of RGA for predictive modelling, we call for caution and provide additional guidance for its optimal use.
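
For readers unfamiliar with cost distances, the sketch below computes a least-cost distance between two cells of a small resistance grid using Dijkstra's algorithm. It is written in Python for illustration and is not how ResistanceGA itself computes distances; the 4-neighbour movement rule, the step cost (mean resistance of the two cells), and the function name `least_cost_distance` are assumptions of this example.

```python
import heapq


def least_cost_distance(resistance, start, goal):
    """Least accumulated cost between two cells of a resistance grid.

    resistance: list of rows of positive numbers.
    start, goal: (row, col) tuples.
    Assumed movement rule: 4-neighbour steps, each costing the mean
    resistance of the two cells involved.
    """
    rows, cols = len(resistance), len(resistance[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (resistance[r][c] + resistance[nr][nc]) / 2
                new_cost = cost + step
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")  # goal not reachable


if __name__ == "__main__":
    # A high-resistance barrier (10) with a low-cost detour around it
    grid = [
        [1, 10, 1],
        [1, 10, 1],
        [1, 1, 1],
    ]
    print(least_cost_distance(grid, (0, 0), (0, 2)))
```

Optimization frameworks of the kind evaluated in the abstract repeatedly adjust the resistance values, recompute such distances between sampled populations, and score the fit to observed genetic distances, which is why the recovered cost values can fit well yet still differ from the true ones.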

Citations: 0
Assembly of Mitochondrial Genomes Using Nanopore Long-Read Technology in Three Sea Chubs (Teleostei: Kyphosidae).
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-15 DOI: 10.1111/1755-0998.14034
J Antonio Baeza, Jeremiah J Minish, Todd P Michael
Complete mitochondrial genomes have become markers of choice to explore phylogenetic relationships at multiple taxonomic levels and they are often assembled using whole genome short-read sequencing. Herein, using three species of sea chubs as an example, we explored the accuracy of mitochondrial chromosomes assembled using Oxford Nanopore Technology (ONT) Kit 14 R10.4.1 long reads at different sequencing depths (high, low and very low or genome skimming) by comparing them to 'gold' standard reference mitochondrial genomes assembled using Illumina NovaSeq short reads. In two species of sea chubs, Girella nigricans and Kyphosus azureus, ONT long-read assembled mitochondrial genomes at high sequencing depths (> 25× whole [nuclear] genome) were identical to their respective short-read assembled mitochondrial genomes. Not a single 'homopolymer insertion', 'homopolymer deletion', 'simple substitution', 'single insertion', 'short insertion', 'single deletion' or 'short deletion' were detected in the long-read assembled mitochondrial genomes after aligning each one of them to their short-read counterparts. In turn, in a third species, Medialuna californiensis, a 25× sequencing depth long-read assembled mitochondrial genome was 14 nucleotides longer than its short-read counterpart. The difference in total length between the latter two assemblies was due to the presence of a short motif 14 bp long that was repeated (twice) in the long read but not in the short-read assembly. Read subsampling at a sequencing depth of 1× resulted in the assembly of partial or complete mitochondrial genomes with numerous errors, including, among others, simple indels, and indels at homopolymer regions. At 3× and 5× subsampling, genomes were identical (perfect) or almost identical (quasiperfect, 99.5% over 16,500 bp) to their respective Illumina assemblies. The newly assembled mitochondrial genomes exhibit identical gene composition and organisation compared with cofamilial species and a phylomitogenomic analysis based on translated protein-coding genes suggested that the family Kyphosidae is not monophyletic. The same analysis detected possible cases of misidentification of mitochondrial genomes deposited in GenBank. This study demonstrates that perfect (complete and fully accurate) or quasiperfect (complete but with a single or a very few errors) mitochondrial genomes can be assembled at high (> 25×) and low (3-5×) but not very low (1×, genome skimming) sequencing depths using ONT long reads and the latest ONT chemistries (Kit 14 and R10.4.1 flowcells with SUP basecalling). The newly assembled and annotated mitochondrial genomes can be used as a reference in environmental DNA studies focusing on bioprospecting and biomonitoring of these and other coastal species experiencing environmental insult. Given the small size of the sequencing device and low cost, we argue that ONT technology has the potential to improve access to high-throughput sequencing technologies in low- and middle-income countries.
Citations: 0
Barcode 100K Specimens: In a Single Nanopore Run.
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-10 DOI: 10.1111/1755-0998.14028
Paul D N Hebert, Robin Floyd, Saeideh Jafarpour, Sean W J Prosser

It is a global priority to better manage the biosphere, but action must be informed by comprehensive data on the abundance and distribution of species. The acquisition of such information is currently constrained by high costs. DNA barcoding can speed the registration of unknown animal species, the most diverse kingdom of eukaryotes, as the BIN system automates their recognition. However, inexpensive sequencing protocols are critical as the census of all animal species is likely to require the analysis of a billion or more specimens. Barcoding involves DNA extraction followed by PCR and sequencing with the last step dominating costs until 2017. By enabling the sequencing of highly multiplexed samples, the Sequel platforms from Pacific BioSciences slashed costs by 90%, but these instruments are only deployed in core facilities because of their expense. Sequencers from Oxford Nanopore Technologies provide an escape from high capital and service costs, but their low sequence fidelity has, until recently, constrained adoption. However, the improved performance of its latest flow cells (R10.4.1) erases this barrier. This study demonstrates that a MinION flow cell can characterise an amplicon pool derived from 100,000 specimens while a Flongle flow cell can process one derived from several thousand. At $0.01 per specimen, DNA sequencing is now the least expensive step in the barcode workflow.

Citations: 0
EVE-X: Software to Identify Novel Viral Insertions in Wild-Caught Arthropod Hosts From Next-Generation Short Read Data.
IF 5.5 Zone 1 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-10-09 DOI: 10.1111/1755-0998.14026
Jessen Havill, Olivia Strasburg, Tessy Udoh, Jacob E Crawford, Andrea Gloria-Soria

Eukaryotic genomes harbour sequences derived from non-retroviral RNA viruses, known as endogenous viral elements (EVEs) or non-retroviral integrated RNA virus sequences (NIRVS). These sequences represent a record of past infections and have been implicated in host anti-viral response. We have created a program to identify viral sequences integrated in a host genome. It begins with a specimen BAM file and outputs candidate NIRVS, along with putative host insertion sites and overlapping genomic features of the host genome in XML and visual formats, with minimal intermediary intervention. We ran through this software short-read data derived from the genomes of 222 wild-caught A. aegypti mosquitoes, from a dozen geographical regions, and located putative NIRVS from seven virus families. This program is as accurate as currently available software for NIRVS detection, and represents a significant improvement in adaptability and user-friendliness. Furthermore, the flexibility of this pipeline allows the user to search for sequence integrations across the genome of any organism, as long as a query sequence database and a reference genome is provided. Potential extended applications include identification of integrated transgenic sequences used for research or vector control strategies.

Citations: 0