Georgi K. Marinov, Benjamin Doughty, Anshul Kundaje, William J Greenleaf
Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. To date, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a Bacteriovorax strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, Bacteriovorax chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in Bacteriovorax positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, Bacteriovorax promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the Bacteriovorax genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.
The chromatin landscape of the histone-possessing Bacteriovorax bacteria. Genome Research, published 2024-11-21. DOI: 10.1101/gr.279418.124
Xiaoli Zhang, Yirui Huang, Yajing Yang, Qi-En Wang, Lang Li
Single-cell lineage tracing (scLT) has emerged as a powerful tool, providing unparalleled resolution to investigate cellular dynamics, fate determination, and the underlying molecular mechanisms. This review thoroughly examines the latest prospective lineage DNA barcode tracing technologies. It further highlights pivotal studies that leverage single-cell lentiviral integration barcoding technology to unravel the dynamic nature of cell lineages in both developmental biology and cancer research. Additionally, the review navigates through critical considerations for successful experimental design in lineage tracing and addresses challenges inherent in this field, including technical limitations, complexities in data analysis, and the imperative for standardization. It also outlines current gaps in knowledge and suggests future research directions, contributing to the ongoing advancement of scLT studies.
Advancements in prospective single-cell lineage barcoding and their applications in research. Genome Research, published 2024-11-21. DOI: 10.1101/gr.278944.124
Kathleen Zeglinski, Christian Montellese, Matthew E Ritchie, Monther Alhamdoosh, Cédric Vonarburg, Rory Bowden, Monika Jordi, Quentin Gouil, Florian Aeschimann, Arthur Hsu
Despite recent advances made toward improving the efficacy of lentiviral gene therapies, a sizeable proportion of produced vector contains an incomplete and thus potentially nonfunctional RNA genome. This can undermine gene delivery by the lentivirus as well as increase manufacturing costs and must be improved to facilitate the widespread clinical implementation of lentiviral gene therapies. Here, we compare three long-read sequencing technologies for their ability to detect issues in vector design and determine nanopore direct RNA sequencing to be the most powerful. We show how this approach identifies and quantifies incomplete RNA caused by cryptic splicing and polyadenylation sites, including a potential cryptic polyadenylation site in the widely used Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE). Using artificial polyadenylation of the lentiviral RNA, we also identify multiple hairpin-associated truncations in the analyzed lentiviral vectors (LVs), which account for most of the detected RNA fragments. Finally, we show that these insights can be used for the optimization of LV design. In summary, nanopore direct RNA sequencing is a powerful tool for the quality control and optimization of LVs, which may help to improve lentivirus manufacturing and thus the development of higher quality lentiviral gene therapies.
An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing. Genome Research, pp. 1966-1975, published 2024-11-20. DOI: 10.1101/gr.279405.124. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610601/pdf/
Xudong Liu, Ying Ni, Lianwei Ye, Zhihao Guo, Lu Tan, Jun Li, Mengsu Yang, Sheng Chen, Runsheng Li
DNA modifications in bacteria present diverse types and distributions, playing crucial functional roles. Current methods for detecting bacterial DNA modifications via nanopore sequencing typically involve comparing raw current signals to a methylation-free control. In this study, we found that bacterial DNA modifications induce errors in nanopore reads, and that these errors occur on only one strand, showing a strand-specific bias. Leveraging this discovery, we developed Hammerhead, a pioneering pipeline designed for de novo methylation discovery that circumvents the necessity of raw signal inference and a methylation-free control. The majority (14 out of 16) of the identified motifs can be validated by raw signal comparison methods or by identifying corresponding methyltransferases in bacteria. Additionally, we included a novel polishing strategy employing duplex reads to correct modification-induced errors in bacterial genome assemblies, achieving a reduction of over 85% in such errors. In summary, Hammerhead enables users to effectively locate bacterial DNA methylation sites from nanopore FASTQ/FASTA reads, and thus holds promise as a routine pipeline for a wide range of nanopore sequencing applications, such as genome assembly, metagenomic binning, decontaminating eukaryotic genome assemblies, and functional analysis for DNA modifications.
Nanopore strand-specific mismatch enables de novo detection of bacterial DNA modifications. Genome Research, pp. 2025-2038, published 2024-11-20. DOI: 10.1101/gr.279012.124. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610603/pdf/
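The strand-specific bias described in this abstract can be illustrated with a small sketch. This is not Hammerhead's actual algorithm: the input layout (per-position match and mismatch counts for each strand), the function names, and the thresholds `min_diff` and `min_depth` are all assumptions chosen for illustration. The idea is only that a position whose mismatch rate is high on one strand and low on the other is a candidate modification site.

```python
from typing import Dict, List, Tuple

# (matching reads, mismatching reads) observed at one reference position
Counts = Tuple[int, int]


def mismatch_rate(counts: Counts) -> float:
    """Fraction of reads at a position that disagree with the reference."""
    matches, mismatches = counts
    depth = matches + mismatches
    return mismatches / depth if depth else 0.0


def strand_biased_sites(
    forward: Dict[int, Counts],
    reverse: Dict[int, Counts],
    min_diff: float = 0.35,  # hypothetical threshold, not taken from the paper
    min_depth: int = 10,     # hypothetical minimum coverage per strand
) -> List[int]:
    """Return positions whose mismatch rate differs strongly between strands."""
    flagged = []
    for pos in sorted(set(forward) & set(reverse)):
        fwd, rev = forward[pos], reverse[pos]
        # Skip positions without enough reads on both strands.
        if sum(fwd) < min_depth or sum(rev) < min_depth:
            continue
        if abs(mismatch_rate(fwd) - mismatch_rate(rev)) >= min_diff:
            flagged.append(pos)
    return flagged
```

Under these assumptions, a position with 40 mismatches in 50 forward reads but 1 mismatch in 50 reverse reads would be flagged, while a position with similar low error rates on both strands would not.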
Shruti V Iyer, Sara Goodwin, William Richard McCombie
Long-read sequencing technologies have improved the contiguity and, as a result, the quality of genome assemblies by generating reads long enough to span and resolve complex or repetitive regions of the genome. Several groups have shown the power of long reads in detecting thousands of genomic and epigenomic features that were previously missed by short-read sequencing approaches. While these studies demonstrate how long reads can help resolve repetitive and complex regions of the genome, they also highlight the throughput and coverage requirements needed to accurately resolve variant alleles across large populations using these platforms. At the time of this review, whole-genome long-read sequencing is more expensive than short-read sequencing on the highest throughput short-read instruments; thus, achieving sufficient coverage to detect low-frequency variants (such as somatic variation) in heterogeneous samples remains challenging. Targeted sequencing, on the other hand, provides the depth necessary to detect these low-frequency variants in heterogeneous populations. Here, we review currently used and recently developed targeted sequencing strategies that leverage existing long-read technologies to increase the resolution with which we can look at nucleic acids in a variety of biological contexts.
Leveraging the power of long reads for targeted sequencing. Genome Research 34(11), pp. 1701-1718, published 2024-11-20. DOI: 10.1101/gr.279168.124. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610587/pdf/
Jonas A Gustafson, Sophia B Gibson, Nikhita Damaraju, Miranda P G Zalusky, Kendra Hoekzema, David Twesigomwe, Lei Yang, Anthony A Snead, Phillip A Richmond, Wouter De Coster, Nathan D Olson, Andrea Guarracino, Qiuhui Li, Angela L Miller, Joy Goffena, Zachary B Anderson, Sophie H R Storz, Sydney A Ward, Maisha Sinha, Claudia Gonzaga-Jauregui, Wayne E Clarke, Anna O Basile, André Corvelo, Catherine Reeves, Adrienne Helland, Rajeeva Lochan Musunuri, Mahler Revsine, Karynne E Patterson, Cate R Paschal, Christina Zakarian, Sara Goodwin, Tanner D Jensen, Esther Robb, William Richard McCombie, Fritz J Sedlazeck, Justin M Zook, Stephen B Montgomery, Erik Garrison, Mikhail Kolmogorov, Michael C Schatz, Richard N McLaughlin, Harriet Dashnow, Michael C Zody, Matt Loose, Miten Jain, Evan E Eichler, Danny E Miller
Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation. Genome Research, pp. 2061-2073, published 2024-11-20. DOI: 10.1101/gr.279273.124. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610458/pdf/
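The abstract above reports a sequence read N50 of 54 kbp. For readers unfamiliar with the statistic, the following minimal sketch computes N50 under its usual definition: the length L such that reads of length L or longer contain at least half of all sequenced bases. The function name `n50` is illustrative and not taken from the paper or any specific tool.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one read length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first read where the cumulative sum reaches half the total.
        if running * 2 >= total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept for completeness
```

Because the statistic weights reads by their length, a handful of very long reads can raise N50 well above the median read length.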
Angela Gomez-Simmonds, Medini K Annavajhala, Dwayne Seeram, Todd W Hokunson, Heekuk Park, Anne-Catrin Uhlemann
Transmission of carbapenem-resistant Enterobacterales (CRE) in hospitals has been shown to occur through complex, multifarious networks driven by both clonal spread and horizontal transfer mediated by plasmids and other mobile genetic elements. We performed nanopore long-read sequencing on CRE isolates from a large urban hospital system to determine the overall contribution of plasmids to CRE transmission and identify specific plasmids implicated in the spread of blaKPC (the Klebsiella pneumoniae carbapenemase [KPC] gene). Six hundred and five CRE isolates collected between 2009 and 2018 first underwent Illumina sequencing for genome-wide genotyping; 435 blaKPC-positive isolates were then successfully nanopore sequenced to generate hybrid assemblies including circularized blaKPC-harboring plasmids. Phylogenetic analysis and Mash clustering were used to define putative clonal and plasmid transmission clusters, respectively. Overall, CRE isolates belonged to 96 multilocus sequence types (STs) encoding blaKPC on 447 plasmids which formed 54 plasmid clusters. We found evidence for clonal transmission in 66% of CRE isolates, over half of which belonged to four clades comprising K. pneumoniae ST258. Plasmid-mediated acquisition of blaKPC occurred in 23%-27% of isolates. While most plasmid clusters were small, several plasmids were identified in multiple different species and STs, including a highly promiscuous IncN plasmid and an IncF plasmid putatively spreading blaKPC from ST258 to other clones. Overall, this points to both the continued dominance of K. pneumoniae ST258 and the dissemination of blaKPC across clones and species by diverse plasmid backbones. These findings support integrating long-read sequencing into genomic surveillance approaches to detect the hitherto silent spread of carbapenem resistance driven by mobile plasmids.
Genomic epidemiology of carbapenem-resistant Enterobacterales at a New York City hospital over a 10-year period reveals complex plasmid-clone dynamics and evidence for frequent horizontal transfer of blaKPC. Genome Research, pp. 1895-1907, published 2024-11-20. DOI: 10.1101/gr.279355.124. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610580/pdf/
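The plasmid clusters in this study were defined with Mash. As a rough illustration of the distance that Mash-style clustering relies on, the sketch below converts a k-mer Jaccard index j into the Mash distance D = -(1/k) ln(2j / (1 + j)). Two simplifications are assumptions of this sketch: the Jaccard index is computed exactly from full k-mer sets rather than estimated from MinHash sketches as Mash does, and the default k of 5 is chosen only so that the toy sequences share k-mers. The clustering thresholds used by the authors are not given in the abstract and are not reproduced here.

```python
import math
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_distance(a: str, b: str, k: int = 5) -> float:
    """Mash-style distance from an exact k-mer Jaccard index (no MinHash)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        raise ValueError("both sequences are shorter than k")
    jaccard = len(ka & kb) / len(union)
    if jaccard == 0:
        return 1.0  # no shared k-mers: report the maximum distance
    # max() avoids returning -0.0 for identical sequences.
    return max(0.0, -math.log(2 * jaccard / (1 + jaccard)) / k)
```

Identical sequences give a distance of 0 and sequences with no shared k-mers give 1; plasmids could then be grouped by applying a distance cutoff, which would be a separate design choice.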
Sergey Koren, Zhigui Bao, Andrea Guarracino, Shujun Ou, Sara Goodwin, Katharine M Jenike, Julian Lucas, Brandy McNulty, Jimin Park, Mikko Rautiainen, Arang Rhie, Dick Roelofs, Harrie Schneiders, Ilse Vrijenhoek, Koen Nijbroek, Olle Nordesjo, Sergey Nurk, Mike Vella, Katherine R Lawrence, Doreen Ware, Michael C Schatz, Erik Garrison, Sanwen Huang, William Richard McCombie, Karen H Miga, Alexander H J Wittenberg, Adam M Phillippy
The combination of ultra-long (UL) Oxford Nanopore Technologies (ONT) sequencing reads with long, accurate Pacific Biosciences (PacBio) High Fidelity (HiFi) reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the UL reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and provides a multirun single-instrument solution for the reconstruction of complete genomes.
{"title":"Gapless assembly of complete human and plant chromosomes using only nanopore sequencing.","authors":"Sergey Koren, Zhigui Bao, Andrea Guarracino, Shujun Ou, Sara Goodwin, Katharine M Jenike, Julian Lucas, Brandy McNulty, Jimin Park, Mikko Rautiainen, Arang Rhie, Dick Roelofs, Harrie Schneiders, Ilse Vrijenhoek, Koen Nijbroek, Olle Nordesjo, Sergey Nurk, Mike Vella, Katherine R Lawrence, Doreen Ware, Michael C Schatz, Erik Garrison, Sanwen Huang, William Richard McCombie, Karen H Miga, Alexander H J Wittenberg, Adam M Phillippy","doi":"10.1101/gr.279334.124","DOIUrl":"10.1101/gr.279334.124","url":null,"PeriodicalId":12678,"journal":{"name":"Genome research","volume":" ","pages":"1919-1930"},"PeriodicalIF":6.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610574/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142589915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander J Ritter, Jolene M Draper, Christopher Vollmers, Jeremy R Sanford
Alternative splicing (AS) alters the cis-regulatory landscape of mRNA isoforms, leading to transcripts with distinct localization, stability, and translational efficiency. To rigorously investigate mRNA isoform-specific ribosome association, we generated subcellular fractionation and sequencing (Frac-seq) libraries using both conventional short reads and long reads from human embryonic stem cells (ESCs) and neural progenitor cells (NPCs) derived from the same ESCs. We performed de novo transcriptome assembly from high-confidence long reads from cytosolic, monosomal, light, and heavy polyribosomal fractions and quantified their abundance using short reads from their respective subcellular fractions. Thousands of transcripts in each cell type exhibited association with particular subcellular fractions relative to the cytosol. Of the multi-isoform genes, 27% and 19% exhibited significant differential isoform sedimentation in ESCs and NPCs, respectively. Alternative promoter usage and internal exon skipping accounted for the majority of differences between isoforms from the same gene. Random forest classifiers implicated coding sequence (CDS) and untranslated region (UTR) lengths as important determinants of isoform-specific sedimentation profiles, and motif analyses revealed potential cell type-specific and subcellular fraction-associated RNA-binding protein signatures. Taken together, our data demonstrate that alternative mRNA processing within the CDS and UTRs impacts the translational control of mRNA isoforms during stem cell differentiation, and highlight the utility of using a novel long-read sequencing-based method to study translational control.
{"title":"Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation.","authors":"Alexander J Ritter, Jolene M Draper, Christopher Vollmers, Jeremy R Sanford","doi":"10.1101/gr.279170.124","DOIUrl":"10.1101/gr.279170.124","url":null,"PeriodicalId":12678,"journal":{"name":"Genome research","volume":" ","pages":"2000-2011"},"PeriodicalIF":6.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610577/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141261622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paige A Byerly, Alina von Thaden, Evgeny Leushkin, Leon Hilgers, Shenglin Liu, Sven Winter, Tilman Schell, Charlotte Gerheim, Alexander Ben Hamadou, Carola Greve, Christian Betz, Hanno J Bolz, Sven Büchner, Johannes Lang, Holger Meinig, Evax Marie Famira-Parcsetich, Sarah P Stubbe, Alice Mouton, Sandro Bertolino, Goedele Verbeylen, Thomas Briner, Lídia Freixas, Lorenzo Vinciguerra, Sarah A Mueller, Carsten Nowak, Michael Hiller
Genomic resources are important for evaluating genetic diversity and supporting conservation efforts. The garden dormouse (Eliomys quercinus) is a small rodent that has experienced one of the most severe modern population declines in Europe. We present a high-quality haplotype-resolved reference genome for the garden dormouse, and combine comprehensive short- and long-read transcriptomics data sets with homology-based methods to generate a highly complete gene annotation. Demographic history analysis of the genome reveals a sharp population decline since the last interglacial, indicating an association between colder climates and population declines before anthropogenic influence. Using our genome and genetic data from 100 individuals, largely sampled in a citizen-science project across the contemporary range, we conduct the first population genomic analysis for this species. We find clear evidence for population structure across the species' core Central European range. Notably, our data show that the Alpine population, characterized by strong differentiation and reduced genetic diversity, is reproductively isolated from other regions and likely represents a differentiated evolutionarily significant unit (ESU). The predominantly declining Eastern European populations also show signs of recent isolation, a pattern consistent with a range expansion from Western to Eastern Europe during the Holocene, leaving relict populations now facing local extinction. Overall, our findings suggest that garden dormouse conservation may be enhanced in Europe through the designation of ESUs.
{"title":"Haplotype-resolved genome and population genomics of the threatened garden dormouse in Europe.","authors":"Paige A Byerly, Alina von Thaden, Evgeny Leushkin, Leon Hilgers, Shenglin Liu, Sven Winter, Tilman Schell, Charlotte Gerheim, Alexander Ben Hamadou, Carola Greve, Christian Betz, Hanno J Bolz, Sven Büchner, Johannes Lang, Holger Meinig, Evax Marie Famira-Parcsetich, Sarah P Stubbe, Alice Mouton, Sandro Bertolino, Goedele Verbeylen, Thomas Briner, Lídia Freixas, Lorenzo Vinciguerra, Sarah A Mueller, Carsten Nowak, Michael Hiller","doi":"10.1101/gr.279066.124","DOIUrl":"10.1101/gr.279066.124","url":null,"PeriodicalId":12678,"journal":{"name":"Genome research","volume":" ","pages":"2094-2107"},"PeriodicalIF":6.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610594/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142618653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}