Genome Research: latest articles

Modeling gene interactions in polygenic prediction via geometric deep learning
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-19, DOI: 10.1101/gr.279694.124
Han Li, Jianyang Zeng, Michael P Snyder, Sai Zhang
The polygenic risk score (PRS) is a widely used approach for predicting individuals' genetic risk of complex diseases, playing a pivotal role in advancing precision medicine. Traditional PRS methods, predominantly following a linear structure, often fall short in capturing the intricate relationships between genotype and phenotype. In this study, we present PRS-Net, an interpretable geometric deep learning-based framework that effectively models the nonlinearity of biological systems for enhanced disease prediction and biological discovery. PRS-Net begins by deconvoluting the genome-wide PRS at the single-gene resolution, and then explicitly encapsulates gene-gene interactions leveraging a graph neural network (GNN) for genetic risk prediction, enabling a systematic characterization of molecular interplay underpinning diseases. An attentive readout module is introduced to facilitate model interpretation. Extensive tests across multiple complex traits and diseases demonstrate the superior prediction performance of PRS-Net compared to conventional PRS methods. The interpretability of PRS-Net further enhances the identification of disease-relevant genes and gene programs. PRS-Net provides a potent tool for concurrent genetic risk prediction and biological discovery for complex diseases.
Citations: 0
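The PRS-Net abstract above describes deconvoluting a genome-wide PRS at single-gene resolution. As a minimal sketch of one simple reading of that idea, and not the PRS-Net implementation, the code below sums effect-size-weighted allele dosages per gene for one individual. The SNP-to-gene mapping, the SNP identifiers, and the gene names are hypothetical toy values; when every SNP maps to exactly one gene, the per-gene scores add up to the conventional linear PRS.

```python
from collections import defaultdict


def gene_level_prs(dosages, betas, snp_to_gene):
    """Sum effect-size-weighted allele dosages per gene for one individual.

    dosages:     SNP id -> allele dosage (0, 1, or 2)
    betas:       SNP id -> per-allele effect size (e.g. from GWAS summary statistics)
    snp_to_gene: SNP id -> gene label (assumed one gene per SNP in this sketch)
    """
    scores = defaultdict(float)
    for snp, dosage in dosages.items():
        gene = snp_to_gene.get(snp)
        if gene is None or snp not in betas:
            continue  # skip SNPs without a gene assignment or an effect size
        scores[gene] += betas[snp] * dosage
    return dict(scores)


# Hypothetical toy data, chosen only for illustration.
dosages = {"rs1": 2, "rs2": 1, "rs3": 0}
betas = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B"}

per_gene = gene_level_prs(dosages, betas, snp_to_gene)
genome_wide = sum(per_gene.values())  # equals the usual linear PRS here
```

In a graph-based model, per-gene scores of this kind could serve as node features on a gene-gene interaction graph; how PRS-Net actually builds its features and graph is not specified in the abstract.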
ISWI1 complex proteins facilitate developmental genome editing in Paramecium
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-14, DOI: 10.1101/gr.278402.123
Aditi Singh, Lilia Häußermann, Christiane Emmerich, Emily Nischwitz, Brandon KB Seah, Falk Butter, Mariusz Nowacki, Estienne C. Swart
One of the most extensive forms of natural genome editing occurs in ciliates, a group of microbial eukaryotes. Ciliate germline and somatic genomes are contained in distinct nuclei within the same cell. During the massive reorganization process of somatic genome development, ciliates eliminate tens of thousands of DNA sequences from a germline genome copy. Recently, we showed that the chromatin remodeler ISWI1 is required for somatic genome development in the ciliate Paramecium tetraurelia. Here, we describe two highly similar paralogous proteins, ICOPa and ICOPb, that are essential for this genome editing. ICOPa and ICOPb are highly divergent from known proteins; the only domain detected showed distant homology to the WSD (WHIM2+WHIM3) motif. We show that both ICOPa and ICOPb interact with the chromatin remodeler ISWI1. Upon ICOP knockdown, changes in alternative DNA excision boundaries and nucleosome densities are similar to those observed for ISWI1 knockdown. We thus propose that a complex comprising ISWI1 and either or both of ICOPa and ICOPb is needed for Paramecium's precise genome editing.
Citations: 0
High-quality sika deer omics data and integrative analysis reveal genic and cellular regulation of antler regeneration
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-14, DOI: 10.1101/gr.279448.124
Zihe Li, Ziyu Xu, Lei Zhu, Tao Qin, Jinrui Ma, Zhanying Feng, Huishan Yue, Qing Guan, Botong Zhou, Ge Han, Guokun Zhang, Chunyi Li, Shuaijun Jia, Qiang Qiu, Dingjun Hao, Yong Wang, Wen Wang
The antler is the only mammalian organ that can fully regenerate annually. However, the regulatory pattern and mechanism of gene expression and cell differentiation during this process remain largely unknown. Here, we obtain a comprehensive assembly and gene annotation of the sika deer (Cervus nippon) genome. Together with large-scale chromatin accessibility and gene expression data, we construct gene regulatory networks involved in antler regeneration, identifying four transcription factors, MYC, KLF4, NFE2L2, and JDP2, with high regulatory activity across the whole regeneration process. Comparative studies and a luciferase reporter assay suggest that MYC expression driven by a cervid-specific regulatory element might be important for antler regenerative ability. We further develop a model called cTOP, which integrates single-cell data with bulk regulatory networks, and find PRDM1, FOSL1, BACH1, and NFATC1 as potential pivotal factors in antler stem cell activation and osteogenic differentiation. Additionally, we uncover interactions within and between cell programs and pathways during the regeneration process. These findings provide insights into the gene and cell regulatory mechanisms of antler regeneration, particularly in stem cell activation and differentiation.
Citations: 0
Haplotype-resolved genome and population genomics of the threatened garden dormouse in Europe.
IF 6.2, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-14, DOI: 10.1101/gr.279066.124
Paige A Byerly, Alina von Thaden, Evgeny Leushkin, Leon Hilgers, Shenglin Liu, Sven Winter, Tilman Schell, Charlotte Gerheim, Alexander Ben Hamadou, Carola Greve, Christian Betz, Hanno J Bolz, Sven Büchner, Johannes Lang, Holger Meinig, Evax Marie Famira-Parcsetich, Sarah P Stubbe, Alice Mouton, Sandro Bertolino, Goedele Verbeylen, Thomas Briner, Lídia Freixas, Lorenzo Vinciguerra, Sarah A Mueller, Carsten Nowak, Michael Hiller

Genomic resources are important for evaluating genetic diversity and supporting conservation efforts. The garden dormouse (Eliomys quercinus) is a small rodent that has experienced one of the most severe modern population declines in Europe. We present a high-quality haplotype-resolved reference genome for the garden dormouse, and combine comprehensive short and long-read transcriptomics data sets with homology-based methods to generate a highly complete gene annotation. Demographic history analysis of the genome reveals a sharp population decline since the last interglacial, indicating an association between colder climates and population declines before anthropogenic influence. Using our genome and genetic data from 100 individuals, largely sampled in a citizen-science project across the contemporary range, we conduct the first population genomic analysis for this species. We find clear evidence for population structure across the species' core Central European range. Notably, our data show that the Alpine population, characterized by strong differentiation and reduced genetic diversity, is reproductively isolated from other regions and likely represents a differentiated evolutionarily significant unit (ESU). The predominantly declining Eastern European populations also show signs of recent isolation, a pattern consistent with a range expansion from Western to Eastern Europe during the Holocene, leaving relict populations now facing local extinction. Overall, our findings suggest that garden dormouse conservation may be enhanced in Europe through the designation of ESUs.

Citations: 0
Multiple paralogues and recombination mechanisms contribute to the high incidence of 22q11.2 Deletion Syndrome
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-13, DOI: 10.1101/gr.279331.124
Lisanne Vervoort, Nicolas Dierckxsens, Marta Sousa Santos, Senne Meynants, Erika Souche, Ruben Cools, Tracy Heung, Koen Devriendt, Hilde Peeters, Donna McDonald-McGinn, Ann Swillen, Jeroen Breckpot, Beverly S. Emanuel, Hilde Van Esch, Anne S. Bassett, Joris R. Vermeesch
The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder. Why the incidence of 22q11.2DS is much greater than that of other genomic disorders remains unknown. Short read sequencing cannot resolve the complex segmental duplicon structure to provide direct confirmation of the hypothesis that the rearrangements are caused by nonallelic homologous recombination between the low copy repeats on Chromosome 22 (LCR22s). To enable haplotype-specific assembly and rearrangement mapping in LCR22 regions, we combined fiber-FISH optical mapping with whole genome (ultra-)long read sequencing or rearrangement-specific long-range PCR on 24 duos (22q11.2DS patient and parent-of-origin) comprising several different LCR22-mediated rearrangements. Unexpectedly, we demonstrate that not only different paralogous segmental duplicons but also palindromic AT-rich repeats (PATRRs) are driving 22q11.2 rearrangements. In addition, we show the existence of two different inversion polymorphisms preceding rearrangement, and somatic mosaicism. The existence of different recombination sites and mechanisms in paralogues and PATRRs that are expanding in copy number in the human population is a likely explanation for the high 22q11.2DS incidence.
Citations: 0
Multisample motif discovery and visualization for tandem repeats
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-13, DOI: 10.1101/gr.279278.124
Yaran Zhang, Marc Hulsman, Alex Salazar, Niccoló Tesi, Lydian Knoop, Sven van der Lee, Sanduni Wijesekera, Jana Krizova, Erik-Jan Kamsteeg, Henne Holstege
Tandem repeats (TRs) occupy a significant portion of the human genome and are a source of polymorphism due to variations in size and motif composition. Some of these variations have been associated with various neuropathological disorders, highlighting the clinical importance of assessing the motif structure of TRs. Moreover, assessing TR motif variation can offer valuable insights into evolutionary dynamics and population structure. Previously, characterizations of TRs have been limited by short-read sequencing technology, which lacks the ability to accurately capture the full TR sequences. As long-read sequencing becomes more accessible and can capture the full complexity of TRs, there is now also a need for tools to characterize and analyze TRs using long-read data across multiple samples. In this study, we present MotifScope, a novel algorithm for characterization and visualization of TRs based on a de novo k-mer approach for motif discovery. Comparative analysis against established tools reveals that MotifScope can identify a greater number of motifs and more accurately represent the underlying repeat sequence. Moreover, MotifScope has been specifically designed to enable motif composition comparisons across assemblies of different individuals, as well as across long-read sequencing reads within an individual, through combined motif discovery and sequence alignment. We showcase potential applications of MotifScope in diverse fields, including population genetics, clinical settings, and forensic analyses.
Citations: 0
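The MotifScope abstract above mentions a de novo k-mer approach for describing the motif composition of tandem repeats. The toy sketch below illustrates the general idea only and is not the MotifScope algorithm: it picks the most frequent k-mer in a repeat sequence as a candidate motif, then greedily rewrites the sequence as run-length-encoded motif copies and interruptions. The function names and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def most_common_kmer(seq, k):
    """Return (k-mer, count) for the most frequent overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts.most_common(1)[0]


def decompose(seq, motif):
    """Greedily split seq into motif copies and single interrupting bases,
    then run-length encode consecutive identical units."""
    units, i, k = [], 0, len(motif)
    while i < len(seq):
        if seq.startswith(motif, i):
            units.append(motif)
            i += k
        else:
            units.append(seq[i])  # base that does not start a motif copy
            i += 1
    return [(unit, len(list(group))) for unit, group in groupby(units)]


# Hypothetical CAG repeat with a short interruption.
seq = "CAGCAGCAGCAGCAACAG"
motif, count = most_common_kmer(seq, 3)
units = decompose(seq, motif)
```

Here `units` describes the sequence as four CAG copies, an interruption (C, A, A), and one further CAG copy. A real tool would also need to handle motifs of unknown length, rotations of the same motif, and comparison across samples.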
Construction and evaluation of a new rat reference genome assembly, GRCr8, from long reads and long-range scaffolding
IF 7, Zone 2 (Biology), Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY, Pub Date: 2024-11-08, DOI: 10.1101/gr.279292.124
Kai Li, Melissa L. Smith, J. Chris Blazier, Kelli J. Kochan, Jonathan M.D. Wood, Kerstin Howe, Anne E. Kwitek, Melinda R. Dwinell, Hao Chen, Julia L. Ciosek, Patrick Masterson, Terence D. Murphy, Theodore S. Kalbfleisch, Peter A. Doris
We report the construction and analysis of a new reference genome assembly for Rattus norvegicus, the laboratory rat, a widely used experimental animal model organism. The assembly has been adopted as the rat reference assembly by the Genome Reference Consortium and is named GRCr8. The assembly has employed 40× Pacific Biosciences (PacBio) HiFi sequencing coverage and scaffolding using optical mapping and Hi-C. We used genomic DNA from a male BN/NHsdMcwi (BN) rat of the same strain and from the same colony as the prior reference assembly, mRatBN7.2. The assembly is at chromosome level with 98.7% of the sequence assigned to chromosomes. All chromosomes have increased in size compared with the prior assembly and k-mer analysis indicates that the subject animal is fully inbred and that the genome is represented as a single haploid assembly. Notable increases are observed in Chromosomes 3, 11, and 12 in the prospective rDNA regions. In addition, Chr Y has increased threefold in size and is more consistent with the rat karyotype than previous assemblies. Several other chromosomes have grown by the incorporation of sizable discrete new blocks. These contain highly repetitive sequences and encode numerous previously unannotated genes. In addition, centromeric sequences are incorporated in most chromosomes. Genome annotation has been performed by NCBI RefSeq, which confirms improvement in assembly quality and adds more than 1100 new protein coding genes. PacBio Iso-Seq data have been acquired from multiple tissues of the subject animal and are released concurrently with the new assembly to aid further analyses.
引用次数: 0
Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications.
IF 6.2 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-11-07 DOI: 10.1101/gr.279012.124
Xudong Liu, Ying Ni, Lianwei Ye, Zhihao Guo, Lu Tan, Jun Li, Mengsu Yang, Sheng Chen, Runsheng Li

DNA modifications in bacteria are diverse in type and distribution and play crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modification induces errors in nanopore reads, and that these errors occur on only one strand, showing a strand-specific bias. Leveraging this discovery, we developed Hammerhead, a pioneering pipeline designed for de novo methylation discovery that circumvents the need for raw signal inference and a methylation-free control. The majority (14 out of 16) of the identified motifs can be validated by raw signal comparison methods or by identifying corresponding methyltransferases in bacteria. Additionally, we included a novel polishing strategy employing duplex reads to correct modification-induced errors in bacterial genome assemblies, achieving a reduction of over 85% in such errors. In summary, Hammerhead enables users to effectively locate bacterial DNA methylation sites from nanopore FASTQ/FASTA reads, and thus holds promise as a routine pipeline for a wide range of nanopore sequencing applications, such as genome assembly, metagenomic binning, decontaminating eukaryotic genome assemblies, and functional analysis of DNA modifications.
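The strand-specific bias described in this abstract can be illustrated with a small sketch. This is not Hammerhead's implementation: the input layout (per-position mismatch and depth counts for each strand), the function name `strand_biased_sites`, and the thresholds `min_depth` and `min_diff` are all assumptions chosen for illustration. The idea is only that a candidate site is one where the mismatch rate on one strand differs sharply from the rate on the other.

```python
from typing import Dict, List, Tuple

# Hypothetical input: position -> (fwd_mismatches, fwd_depth, rev_mismatches, rev_depth)
StrandCounts = Dict[int, Tuple[int, int, int, int]]


def strand_biased_sites(
    counts: StrandCounts,
    min_depth: int = 10,    # assumed minimum coverage per strand
    min_diff: float = 0.3,  # assumed minimum gap between strand mismatch rates
) -> List[Tuple[int, float]]:
    """Return (position, rate difference) for sites whose mismatch rate
    differs between the forward and reverse strand by at least min_diff."""
    flagged = []
    for pos in sorted(counts):
        fwd_mis, fwd_depth, rev_mis, rev_depth = counts[pos]
        # Skip positions where either strand is too shallow to estimate a rate.
        if fwd_depth < min_depth or rev_depth < min_depth:
            continue
        diff = abs(fwd_mis / fwd_depth - rev_mis / rev_depth)
        if diff >= min_diff:
            flagged.append((pos, diff))
    return flagged


example = {
    100: (40, 50, 2, 50),  # errors concentrated on the forward strand
    200: (3, 50, 2, 50),   # low, balanced error on both strands
    300: (5, 5, 0, 50),    # forward strand too shallow to judge
}
sites = strand_biased_sites(example)
```

In this toy input only position 100 is flagged. A real pipeline would also need read alignment, a statistical test rather than a fixed threshold, and motif discovery around the flagged sites, none of which is shown here.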

Gapless assembly of complete human and plant chromosomes using only nanopore sequencing.
IF 6.2 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-11-06 DOI: 10.1101/gr.279334.124
Sergey Koren, Zhigui Bao, Andrea Guarracino, Shujun Ou, Sara Goodwin, Katharine M Jenike, Julian Lucas, Brandy McNulty, Jimin Park, Mikko Rautiainen, Arang Rhie, Dick Roelofs, Harrie Schneiders, Ilse Vrijenhoek, Koen Nijbroek, Olle Nordesjo, Sergey Nurk, Mike Vella, Katherine R Lawrence, Doreen Ware, Michael C Schatz, Erik Garrison, Sanwen Huang, William Richard McCombie, Karen H Miga, Alexander H J Wittenberg, Adam M Phillippy

The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to that of HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes.
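The pairing of "99.999%" with "Q50" in this abstract follows from the standard Phred scale, Q = -10 · log10(error rate). The short sketch below converts in both directions; the function names are illustrative and not taken from the paper.

```python
import math


def phred_from_accuracy(accuracy: float) -> float:
    """Convert per-base accuracy (e.g. 0.99999) to a Phred-scaled quality."""
    error = 1.0 - accuracy
    if not 0.0 < error <= 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(error)


def accuracy_from_phred(q: float) -> float:
    """Convert a Phred-scaled quality back to per-base accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


q_reported = phred_from_accuracy(0.99999)  # approximately 50
acc_q50 = accuracy_from_phred(50)          # approximately 0.99999
```

An error rate of 1 in 100,000 bases therefore corresponds to Q50, and each additional 10 points on the Q scale means a tenfold lower error rate.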

Genomic epidemiology of carbapenem-resistant Enterobacterales at a New York City hospital over a 10-year period reveals complex plasmid-clone dynamics and evidence for frequent horizontal transfer of blaKPC.
IF 6.2 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-11-05 DOI: 10.1101/gr.279355.124
Angela Gomez-Simmonds, Medini K Annavajhala, Dwayne Seeram, Todd W Hokunson, Heekuk Park, Anne-Catrin Uhlemann

Transmission of carbapenem-resistant Enterobacterales (CRE) in hospitals has been shown to occur through complex, multifarious networks driven by both clonal spread and horizontal transfer mediated by plasmids and other mobile genetic elements. We performed nanopore long-read sequencing on CRE isolates from a large urban hospital system to determine the overall contribution of plasmids to CRE transmission and identify specific plasmids implicated in the spread of blaKPC (the Klebsiella pneumoniae carbapenemase [KPC] gene). Six hundred and five CRE isolates collected between 2009 and 2018 first underwent Illumina sequencing for genome-wide genotyping; 435 blaKPC-positive isolates were then successfully nanopore sequenced to generate hybrid assemblies including circularized blaKPC-harboring plasmids. Phylogenetic analysis and Mash clustering were used to define putative clonal and plasmid transmission clusters, respectively. Overall, CRE isolates belonged to 96 multilocus sequence types (STs) encoding blaKPC on 447 plasmids which formed 54 plasmid clusters. We found evidence for clonal transmission in 66% of CRE isolates, over half of which belonged to four clades comprising K. pneumoniae ST258. Plasmid-mediated acquisition of blaKPC occurred in 23%-27% of isolates. While most plasmid clusters were small, several plasmids were identified in multiple different species and STs, including a highly promiscuous IncN plasmid and an IncF plasmid putatively spreading blaKPC from ST258 to other clones. Overall, this points to both the continued dominance of K. pneumoniae ST258 and the dissemination of blaKPC across clones and species by diverse plasmid backbones. These findings support integrating long-read sequencing into genomic surveillance approaches to detect the hitherto silent spread of carbapenem resistance driven by mobile plasmids.
