Thiago L. Knittel, Brooke E. Montgomery, Alex J. Tate, Ennis W. Deihl, Anastasia S. Nawrocki, Frederic J. Hoerndli, Taiowa A. Montgomery
Canonical small interfering RNAs (siRNAs) are processed from double-stranded RNA (dsRNA) by Dicer and associate with Argonautes to direct RNA silencing. In Caenorhabditis elegans, 22G-RNAs and 26G-RNAs are often referred to as siRNAs but display distinct characteristics. For example, 22G-RNAs do not originate from dsRNA and do not depend on Dicer, whereas 26G-RNAs require Dicer but derive from an atypical RNA duplex and are produced exclusively antisense to their messenger RNA (mRNA) templates. To identify canonical siRNAs in C. elegans, we first characterized the siRNAs produced via the exogenous RNA interference (RNAi) pathway. During RNAi, dsRNA is processed into ∼23 nt duplexes with ∼2 nt, 3′-overhangs, ultimately yielding siRNAs devoid of 5′G-containing sequences that bind with high affinity to the Argonaute RDE-1, but also to the microRNA (miRNA) pathway Argonaute, ALG-1. Using these characteristics, we searched for their endogenous counterparts and identified thousands of endogenous loci representing dozens of unique elements that give rise to mostly low to moderate levels of siRNAs, called 23H-RNAs. These loci include repetitive elements, putative coding genes, pseudogenes, noncoding RNAs, and unannotated features, many of which adopt hairpin (hp) structures reminiscent of the hpRNA/RNAi pathway in flies and mice. RDE-1 competes with other Argonautes for binding to 23H-RNAs. When RDE-1 is depleted, these siRNAs are enriched in ALG-1 and ALG-2 complexes. Our results expand the known repertoire of C. elegans small RNAs and their Argonaute interactors, and demonstrate that key features of the endogenous siRNA pathway are relatively unchanged in animals.
{"title":"A low-abundance class of Dicer-dependent siRNAs produced from a variety of features in C. elegans","authors":"Thiago L. Knittel, Brooke E. Montgomery, Alex J. Tate, Ennis W. Deihl, Anastasia S. Nawrocki, Frederic J. Hoerndli, Taiowa A. Montgomery","doi":"10.1101/gr.279083.124","DOIUrl":"https://doi.org/10.1101/gr.279083.124","url":null,"abstract":"Canonical small interfering RNAs (siRNAs) are processed from double-stranded RNA (dsRNA) by Dicer and associate with Argonautes to direct RNA silencing. In <em>Caenorhabditis elegans</em>, 22G-RNAs and 26G-RNAs are often referred to as siRNAs but display distinct characteristics. For example, 22G-RNAs do not originate from dsRNA and do not depend on Dicer, whereas 26G-RNAs require Dicer but derive from an atypical RNA duplex and are produced exclusively antisense to their messenger RNA (mRNA) templates. To identify canonical siRNAs in <em>C. elegans</em>, we first characterized the siRNAs produced via the exogenous RNA interference (RNAi) pathway. During RNAi, dsRNA is processed into ∼23 nt duplexes with ∼2 nt, 3′-overhangs, ultimately yielding siRNAs devoid of 5′G-containing sequences that bind with high affinity to the Argonaute RDE-1, but also to the microRNA (miRNA) pathway Argonaute, ALG-1. Using these characteristics, we searched for their endogenous counterparts and identified thousands of endogenous loci representing dozens of unique elements that give rise to mostly low to moderate levels of siRNAs, called 23H-RNAs. These loci include repetitive elements, putative coding genes, pseudogenes, noncoding RNAs, and unannotated features, many of which adopt hairpin (hp) structures reminiscent of the hpRNA/RNAi pathway in flies and mice. RDE-1 competes with other Argonautes for binding to 23H-RNAs. When RDE-1 is depleted, these siRNAs are enriched in ALG-1 and ALG-2 complexes. Our results expand the known repertoire of <em>C. 
elegans</em> small RNAs and their Argonaute interactors, and demonstrate that key features of the endogenous siRNA pathway are relatively unchanged in animals.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"45 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142760361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
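The small-RNA class names in this abstract encode read length and 5′ nucleotide: 22G-RNAs and 26G-RNAs begin with G, while "23H" denotes a ∼23 nt read whose first base is not G (H is the IUPAC code for A, C, or U/T). The following minimal sketch shows how reads could be binned by that naming convention alone. The function name and the exact length cutoffs are illustrative assumptions; the authors' actual annotation pipeline is not described in the abstract and would also need Dicer dependence and Argonaute association, which sequence alone cannot establish.

```python
def classify_small_rna(read):
    """Bin a small-RNA read by length and 5' nucleotide (naming convention only).

    Hypothetical helper: exact lengths of 22, 23 and 26 nt are a simplification
    of the approximate sizes reported in the abstract.
    """
    seq = read.upper().replace("T", "U")  # accept DNA-alphabet reads as well
    if not seq:
        return "other"
    first, length = seq[0], len(seq)
    if length == 22 and first == "G":
        return "22G"
    if length == 26 and first == "G":
        return "26G"
    if length == 23 and first in "ACU":  # IUPAC H = not G
        return "23H"
    return "other"


if __name__ == "__main__":
    for read in ["A" * 23, "G" + "A" * 21, "G" + "A" * 22]:
        print(len(read), read[0], classify_small_rna(read))
```

A 23 nt read that starts with G falls into "other" here, which mirrors the abstract's observation that canonical siRNAs are depleted of 5′G-containing sequences.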
Howard Womersley, Daniel Muliaditan, Ramanuj DasGupta, Lih Feng Cheow
Interrogating regulatory epigenetic alterations during tumor progression at the resolution of single cells has remained an understudied area of research. Here we developed a highly sensitive single-nucleus CUT&RUN (snCUT&RUN) assay to profile histone modifications in isogenic primary, metastatic, and cisplatin-resistant head and neck squamous cell carcinoma (HNSCC) patient-derived tumor cell lines. We find that the epigenome contributes to HNSCC progression in several distinct ways. First, we demonstrate that gene expression changes during HNSCC progression can be comodulated by alterations in both copy number and chromatin activity, driving epigenetic rewiring of cell states. Furthermore, intratumor epigenetic heterogeneity (ITeH) may predispose subclonal populations within the primary tumor to adapt to selective pressures and foster the acquisition of malignant characteristics. In conclusion, snCUT&RUN serves as a valuable addition to the existing toolkit of single-cell epigenomic assays and can be used to dissect the functionality of the epigenome during cancer progression.
{"title":"Single-nucleus CUT&RUN elucidates the function of intrinsic and genomics-driven epigenetic heterogeneity in head and neck cancer progression","authors":"Howard Womersley, Daniel Muliaditan, Ramanuj DasGupta, Lih Feng Cheow","doi":"10.1101/gr.279105.124","DOIUrl":"https://doi.org/10.1101/gr.279105.124","url":null,"abstract":"Interrogating regulatory epigenetic alterations during tumor progression at the resolution of single cells has remained an understudied area of research. Here we developed a highly sensitive single-nucleus CUT&RUN (snCUT&RUN) assay to profile histone modifications in isogenic primary, metastatic, and cisplatin-resistant head and neck squamous cell carcinoma (HNSCC) patient-derived tumor cell lines. We find that the epigenome can be involved in diverse modes to contribute towards HNSCC progression. First, we demonstrate that gene expression changes during HNSCC progression can be comodulated by alterations in both copy number and chromatin activity, driving epigenetic rewiring of cell states. Furthermore, intratumour epigenetic heterogeneity (ITeH) may predispose subclonal populations within the primary tumour to adapt to selective pressures and foster the acquisition of malignant characteristics. 
In conclusion, snCUT&RUN serves as a valuable addition to the existing toolkit of single-cell epigenomic assays and can be used to dissect the functionality of the epigenome during cancer progression.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"13 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142760655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amy R Vandiver, Allen Herbst, Paul Stothard, Jonathan Wanagat
While it is well understood that mitochondrial DNA (mtDNA) deletion mutations cause incurable diseases and contribute to aging, little is known about the transcriptional products that arise from these DNA structural variants. We hypothesized that mitochondrial genomes containing deletion mutations express chimeric mitochondrial RNAs. To test this, we analyzed human and rat RNA sequencing data to identify, quantitate, and characterize chimeric mitochondrial RNAs. We observed increased chimeric mitochondrial RNA frequency in samples from patients with mitochondrial genetic diseases and in samples from aged humans. The spectrum of chimeric mitochondrial transcripts reflected the known pattern of mtDNA deletion mutations. To test the hypothesis that mtDNA deletions induce chimeric RNA transcripts, we treated 18 mo and 34 mo rats with guanidinopropionic acid to induce high levels of skeletal muscle mtDNA deletion mutations. With mtDNA deletion induction, we demonstrate that the chimeric mitochondrial transcript frequency also increased and correlated strongly with an orthogonal DNA-based mutation assay performed on identical samples. Further, we show that the frequency of chimeric mitochondrial transcripts predicts expression of both nuclear and mitochondrial genes central to mitochondrial function, demonstrating the utility of these events as metrics of age-induced metabolic change. Mapping and quantitation of chimeric mitochondrial RNAs provides an accessible, orthogonal approach to DNA-based mutation assays, offers a potential method for identifying mitochondrial pathology in widely accessible datasets, and opens a new area of study in mitochondrial genetics and transcriptomics.
{"title":"Chimeric mitochondrial RNA transcripts predict mitochondrial genome deletion mutations in mitochondrial genetic diseases and aging","authors":"Amy R Vandiver, Allen Herbst, Paul Stothard, Jonathan Wanagat","doi":"10.1101/gr.279072.124","DOIUrl":"https://doi.org/10.1101/gr.279072.124","url":null,"abstract":"While it is well understood that mitochondrial DNA (mtDNA) deletion mutations cause incurable diseases and contribute to aging, little is known about the transcriptional products that arise from these DNA structural variants. We hypothesized that mitochondrial genomes containing deletion mutations express chimeric mitochondrial RNAs. To test this, we analyzed human and rat RNA sequencing data to identify, quantitate, and characterize chimeric mitochondrial RNAs. We observed increased chimeric mitochondrial RNA frequency in samples from patients with mitochondrial genetic diseases and in samples from aged humans. The spectrum of chimeric mitochondrial transcripts reflected the known pattern of mtDNA deletion mutations. To test the hypothesis that mtDNA deletions induce chimeric RNA transcripts, we treated 18 mo and 34 mo rats with guanidinopropionic acid to induce high levels of skeletal muscle mtDNA deletion mutations. With mtDNA deletion induction, we demonstrate that the chimeric mitochondrial transcript frequency also increased and correlated strongly with an orthogonal DNA-based mutation assay performed on identical samples. Further, we show that the frequency of chimeric mitochondrial transcripts predicts expression of both nuclear and mitochondrial genes central to mitochondrial function, demonstrating the utility of these events as metrics of age-induced metabolic change. 
Mapping and quantitation of chimeric mitochondrial RNAs provides an accessible, orthogonal approach to DNA-based mutation assays, offers a potential method for identifying mitochondrial pathology in widely accessible datasets, and opens a new area of study in mitochondrial genetics and transcriptomics.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"25 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142718243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuai Guo, Xiaoqian Liu, Xuesen Cheng, Yujie Jiang, Shuangxi Ji, Qingnan Liang, Andrew Koval, Yumei Li, Leah A. Owen, Ivana K. Kim, Ana Aparicio, Sanghoon Lee, Anil K. Sood, Scott Kopetz, John Paul Shen, John N. Weinstein, Margaret M. DeAngelis, Rui Chen, Wenyi Wang
Bulk deconvolution with single-cell/nucleus RNA-seq data is critical for understanding heterogeneity in complex biological samples, yet the technological discrepancy across sequencing platforms limits deconvolution accuracy. To address this, we utilize an experimental design to match inter-platform biological signals, hence revealing the technological discrepancy, and then develop a deconvolution framework called DeMixSC using this well-matched, i.e., benchmark, data. Built upon a novel weighted nonnegative least-squares framework, DeMixSC identifies and adjusts genes with high technological discrepancy and aligns the benchmark data with large patient cohorts of matched-tissue-type for large-scale deconvolution. Our results using two benchmark datasets of healthy retinas and ovarian cancer tissues suggest much-improved deconvolution accuracy. Leveraging tissue-specific benchmark datasets, we applied DeMixSC to a large cohort of 453 age-related macular degeneration patients and a cohort of 30 ovarian cancer patients with various responses to neoadjuvant chemotherapy. Only DeMixSC successfully unveiled biologically meaningful differences across patient groups, demonstrating its broad applicability in diverse real-world clinical scenarios. Our findings reveal the impact of technological discrepancy on deconvolution performance and underscore the importance of a well-matched dataset to resolve this challenge. The developed DeMixSC framework is generally applicable for accurately deconvolving large cohorts of disease tissues, including cancers, when a well-matched benchmark dataset is available.
{"title":"A deconvolution framework that uses single-cell sequencing plus a small benchmark dataset for accurate analysis of cell type ratios in complex tissue samples","authors":"Shuai Guo, Xiaoqian Liu, Xuesen Cheng, Yujie Jiang, Shuangxi Ji, Qingnan Liang, Andrew Koval, Yumei Li, Leah A. Owen, Ivana K. Kim, Ana Aparicio, Sanghoon Lee, Anil K. Sood, Scott Kopetz, John Paul Shen, John N. Weinstein, Margaret M. DeAngelis, Rui Chen, Wenyi Wang","doi":"10.1101/gr.278822.123","DOIUrl":"https://doi.org/10.1101/gr.278822.123","url":null,"abstract":"Bulk deconvolution with single-cell/nucleus RNA-seq data is critical for understanding heterogeneity in complex biological samples, yet the technological discrepancy across sequencing platforms limits deconvolution accuracy. To address this, we utilize an experimental design to match inter-platform biological signals, hence revealing the technological discrepancy, and then develop a deconvolution framework called DeMixSC using this well-matched, i.e., benchmark, data. Built upon a novel weighted nonnegative least-squares framework, DeMixSC identifies and adjusts genes with high technological discrepancy and aligns the benchmark data with large patient cohorts of matched-tissue-type for large-scale deconvolution. Our results using two benchmark datasets of healthy retinas and ovarian cancer tissues suggest much-improved deconvolution accuracy. Leveraging tissue-specific benchmark datasets, we applied DeMixSC to a large cohort of 453 age-related macular degeneration patients and a cohort of 30 ovarian cancer patients with various responses to neoadjuvant chemotherapy. Only DeMixSC successfully unveiled biologically meaningful differences across patient groups, demonstrating its broad applicability in diverse real-world clinical scenarios. Our findings reveal the impact of technological discrepancy on deconvolution performance and underscore the importance of a well-matched dataset to resolve this challenge. 
The developed DeMixSC framework is generally applicable for accurately deconvolving large cohorts of disease tissues, including cancers, when a well-matched benchmark dataset is available.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"35 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142712790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
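The abstract describes DeMixSC as built on a weighted nonnegative least-squares (NNLS) framework that adjusts genes with high technological discrepancy. The sketch below illustrates only the generic weighted NNLS idea: estimate nonnegative cell-type proportions p that minimise a per-gene weighted squared error between a bulk profile y and a reference signature matrix S. It is not the DeMixSC implementation. The projected-gradient solver, the function name, the toy numbers, and the hand-set weights are all assumptions made for illustration; how DeMixSC actually derives its weights is not stated in the abstract.

```python
def weighted_nnls(S, y, w, n_iter=2000):
    """Minimise sum_g w[g] * (y[g] - sum_k S[g][k] * p[k])**2 subject to p >= 0.

    S: genes x cell-types reference matrix, y: bulk expression per gene,
    w: nonnegative per-gene weights. Solved by projected gradient descent
    (an illustrative choice of solver, not the one used by DeMixSC).
    """
    n_genes, n_types = len(S), len(S[0])
    # Weighted normal-equation terms: A = S^T W S and b = S^T W y.
    A = [[sum(w[g] * S[g][i] * S[g][j] for g in range(n_genes))
          for j in range(n_types)] for i in range(n_types)]
    b = [sum(w[g] * S[g][i] * y[g] for g in range(n_genes))
         for i in range(n_types)]
    trace = sum(A[i][i] for i in range(n_types))
    if trace == 0:
        return [0.0] * n_types
    step = 1.0 / trace  # trace(A) bounds the largest eigenvalue, so this step is safe
    p = [0.0] * n_types
    for _ in range(n_iter):
        grad = [sum(A[i][j] * p[j] for j in range(n_types)) - b[i]
                for i in range(n_types)]
        # Gradient step followed by projection onto the nonnegative orthant.
        p = [max(0.0, p[i] - step * grad[i]) for i in range(n_types)]
    return p


# Toy data (made up): 4 genes, 2 cell types, true proportions 0.3 and 0.7.
S = [[10.0, 1.0], [2.0, 8.0], [5.0, 5.0], [1.0, 9.0]]
y = [3.7, 6.2, 9.0, 6.6]  # third gene is inflated, mimicking platform discrepancy

if __name__ == "__main__":
    print("unweighted:", weighted_nnls(S, y, [1.0, 1.0, 1.0, 1.0]))
    print("discrepant gene down-weighted:", weighted_nnls(S, y, [1.0, 1.0, 0.0, 1.0]))
```

In this toy case, giving the inflated gene zero weight recovers the true proportions, whereas equal weights pull the estimate away from them. This is the intuition behind down-weighting genes whose bulk and single-cell measurements disagree for technical reasons.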
Bertille Montibus, James A. Cain, Rocio T. Martinez-Nunez, Rebecca J. Oakey
Nucleotide sequences along a gene provide instructions to transcriptional and cotranscriptional machinery allowing genome expansion into the transcriptome. Nucleotide sequence can often be shared between two genes and in some occurrences, a gene is located completely within a different gene; these are known as host/nested gene pairs. In these instances, if both genes are transcribed, overlap can result in a transcriptional crosstalk where genes regulate each other. Despite this, a comprehensive annotation of where such genes are located and their expression patterns is lacking. To address this, we provide an up-to-date catalog of host/nested gene pairs in mouse and human, showing that over a tenth of all genes contain a nested gene. We discovered that transcriptional co-occurrence is often tissue specific. This coexpression was especially prevalent within the transcriptionally permissive tissue, testis. We use this developmental system and scRNA-seq analysis to demonstrate that the coexpression of pairs can occur in single cells and transcription in the same place at the same time can enhance the transcript diversity of the host gene. In agreement, host genes are more transcript-diverse than the rest of the transcriptome. Host/nested gene configurations are common in both human and mouse, suggesting that interplay between gene pairs is a feature of the mammalian genome. This highlights the relevance of transcriptional crosstalk between genes which share nucleic acid sequence. The results and analysis are available on an Rshiny application (https://hngeneviewer.sites.er.kcl.ac.uk/hn_viewer/).
{"title":"Global identification of mammalian host and nested gene pairs reveal tissue-specific transcriptional interplay","authors":"Bertille Montibus, James A. Cain, Rocio T. Martinez-Nunez, Rebecca J. Oakey","doi":"10.1101/gr.279430.124","DOIUrl":"https://doi.org/10.1101/gr.279430.124","url":null,"abstract":"Nucleotide sequences along a gene provide instructions to transcriptional and cotranscriptional machinery allowing genome expansion into the transcriptome. Nucleotide sequence can often be shared between two genes and in some occurrences, a gene is located completely within a different gene; these are known as host/nested gene pairs. In these instances, if both genes are transcribed, overlap can result in a transcriptional crosstalk where genes regulate each other. Despite this, a comprehensive annotation of where such genes are located and their expression patterns is lacking. To address this, we provide an up-to-date catalog of host/nested gene pairs in mouse and human, showing that over a tenth of all genes contain a nested gene. We discovered that transcriptional co-occurrence is often tissue specific. This coexpression was especially prevalent within the transcriptionally permissive tissue, testis. We use this developmental system and scRNA-seq analysis to demonstrate that the coexpression of pairs can occur in single cells and transcription in the same place at the same time can enhance the transcript diversity of the host gene. In agreement, host genes are more transcript-diverse than the rest of the transcriptome. Host/nested gene configurations are common in both human and mouse, suggesting that interplay between gene pairs is a feature of the mammalian genome. This highlights the relevance of transcriptional crosstalk between genes which share nucleic acid sequence. 
The results and analysis are available on an Rshiny application (https://hngeneviewer.sites.er.kcl.ac.uk/hn_viewer/).","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"34 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142690668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
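The abstract defines a host/nested gene pair as one gene located completely within another. A minimal sketch of that containment test on genomic intervals follows. The `Gene` record, the function name, and the toy coordinates are hypothetical, and the quadratic scan is chosen for clarity only; the authors' catalog was presumably built from full genome annotations with criteria (for example strand or biotype handling) that the abstract does not specify.

```python
from collections import namedtuple

# Hypothetical minimal gene record; coordinates are inclusive.
Gene = namedtuple("Gene", "name chrom start end")


def host_nested_pairs(genes):
    """Return (host, nested) name pairs where nested lies entirely within host."""
    pairs = []
    for host in genes:
        for nested in genes:
            if host is nested or host.chrom != nested.chrom:
                continue
            contained = host.start <= nested.start and nested.end <= host.end
            same_span = host.start == nested.start and host.end == nested.end
            if contained and not same_span:  # identical spans are not treated as nesting
                pairs.append((host.name, nested.name))
    return pairs


if __name__ == "__main__":
    toy = [
        Gene("A", "chr1", 100, 1000),
        Gene("B", "chr1", 200, 300),   # fully inside A
        Gene("C", "chr1", 900, 1100),  # overlaps A but extends past its end
        Gene("D", "chr2", 150, 250),   # different chromosome
    ]
    print(host_nested_pairs(toy))
```

Partial overlaps such as gene C are deliberately excluded, since the definition in the abstract requires complete containment.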
Matthew D. Pollard, Wynn K. Meyer, Emily E. Puckett
Mammalia comprises a great diversity of diet types and associated adaptations. An understanding of the genomic mechanisms underlying these adaptations may offer insights for improving human health. Comparative genomic studies of diet that employ taxonomically restricted analyses or simplified diet classifications may suffer reduced power to detect molecular convergence associated with diet evolution. Here, we use a quantitative carnivory score—indicative of the amount of animal protein in the diet—for 80 mammalian species to detect significant correlations between the relative evolutionary rates of genes and changes in diet. We have identified six genes—ACADSB, CLDN16, CPB1, PNLIP, SLC13A2, and SLC14A2—that experienced significant changes in evolutionary constraint alongside changes in carnivory score, becoming less constrained in lineages evolving more herbivorous diets. We further consider the biological functions associated with diet evolution and observe that pathways related to amino acid and lipid metabolism, biological oxidation, and small molecule transport experienced reduced purifying selection as lineages became more herbivorous. Liver and kidney functions show similar patterns of constraint with dietary change. Our results indicate that these functions are important for the consumption of animal matter and become less important with the evolution of increasing herbivory. So, genes expressed in these tissues experience a relaxation of evolutionary constraint in more herbivorous lineages.
{"title":"Convergent relaxation of molecular constraint in herbivores reveals the changing role of liver and kidney functions across mammalian diets","authors":"Matthew D. Pollard, Wynn K. Meyer, Emily E. Puckett","doi":"10.1101/gr.278930.124","DOIUrl":"https://doi.org/10.1101/gr.278930.124","url":null,"abstract":"Mammalia comprises a great diversity of diet types and associated adaptations. An understanding of the genomic mechanisms underlying these adaptations may offer insights for improving human health. Comparative genomic studies of diet that employ taxonomically restricted analyses or simplified diet classifications may suffer reduced power to detect molecular convergence associated with diet evolution. Here, we use a quantitative carnivory score—indicative of the amount of animal protein in the diet—for 80 mammalian species to detect significant correlations between the relative evolutionary rates of genes and changes in diet. We have identified six genes—<em>ACADSB</em>, <em>CLDN16</em>, <em>CPB1</em>, <em>PNLIP</em>, <em>SLC13A2</em>, and <em>SLC14A2</em>—that experienced significant changes in evolutionary constraint alongside changes in carnivory score, becoming less constrained in lineages evolving more herbivorous diets. We further consider the biological functions associated with diet evolution and observe that pathways related to amino acid and lipid metabolism, biological oxidation, and small molecule transport experienced reduced purifying selection as lineages became more herbivorous. Liver and kidney functions show similar patterns of constraint with dietary change. Our results indicate that these functions are important for the consumption of animal matter and become less important with the evolution of increasing herbivory. 
So, genes expressed in these tissues experience a relaxation of evolutionary constraint in more herbivorous lineages.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"5 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142690669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samuel H Kim, Georgi K. Marinov, William Greenleaf
Gene regulation in most eukaryotes involves two fundamental physical processes -- alterations in the packaging of the genome by nucleosomes, with active cis-regulatory elements (CREs) generally characterized by an open-chromatin configuration, and the activation of transcription. Mapping these physical properties and biochemical activities genome-wide -- through profiling chromatin accessibility and active transcription -- is a key approach to understanding the logic and mechanisms of transcription and its regulation. However, the relationship between these two states has until now not been accessible to simultaneous measurement. To address this, we developed KAS-ATAC, a combination of the KAS-seq (Kethoxal-Assisted SsDNA sequencing) and ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) methods for mapping single-stranded DNA (and thus active transcription) and chromatin accessibility, respectively, enabling the genome-wide identification of DNA fragments that are simultaneously accessible and contain ssDNA. We use KAS-ATAC to evaluate levels of active transcription over different classes of regulatory elements in the human genome, to estimate the absolute levels of transcribed accessible DNA over CREs, to map the nucleosomal configurations associated with RNA polymerase activities, and to assess transcription factor association with transcribed DNA through transcription factor binding site (TFBS) footprinting. We observe lower levels of transcription over distal enhancers compared to promoters and distinct nucleosomal configurations around transcription initiation sites associated with active transcription. Most TFs associate equally with transcribed and nontranscribed DNA, but a few factors specifically do not exhibit footprints over ssDNA-containing fragments. We anticipate that KAS-ATAC will continue to yield useful insights into chromatin organization and transcriptional regulation in other contexts in the future.
{"title":"KAS-ATAC reveals the genome-wide single-stranded accessible chromatin landscape of the human genome","authors":"Samuel H Kim, Georgi K. Marinov, William Greenleaf","doi":"10.1101/gr.279621.124","DOIUrl":"https://doi.org/10.1101/gr.279621.124","url":null,"abstract":"Gene regulation in most eukaryotes involves two fundamental physical processes -- alterations in the packaging of the genome by nucleosomes, with active <em>cis</em>-regulatory elements (CREs) generally characterized by an open-chromatin configuration, and the activation of transcription. Mapping these physical properties and biochemical activities genome-wide -- through profiling chromatin accessibility and active transcription -- are key tools used to understand the logic and mechanisms of transcription and its regulation. However, the relationship between these two states has until now not been accessible to simultaneous measurement. To address this, we developed KAS-ATAC, a combination of the KAS-seq (Kethoxal-Assisted SsDNA sequencing and ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) methods for mapping single-stranded DNA (and thus active transcription) and chromatin accessibility, respectively, enabling the genome-wide identification of DNA fragments that are simultaneously accessible and contain ssDNA. We use KAS-ATAC to evaluate levels of active transcription over different classes of regulatory elements in the human genome, to estimate the absolute levels of transcribed accessible DNA over CREs, to map the nucleosomal configurations associated with RNA polymerase activities, and to assess transcription factor association with transcribed DNA through transcription factor binding site (TFBS) footprinting. We observe lower levels of transcription over distal enhancers compared to promoters and distinct nucleosomal configurations around transcription initiation sites associated with active transcription. 
Most TFs associate equally with transcribed and nontranscribed DNA but a few factors specifically do not exhibit footprints over ssDNA-containing fragments. We anticipate KAS-ATAC to continue to derive useful insights into chromatin organization and transcriptional regulation in other contexts in the future.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"27 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142684311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Georgi K. Marinov, Benjamin Doughty, Anshul Kundaje, William J Greenleaf
Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. However, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a Bacteriovorax strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, Bacteriovorax chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in Bacteriovorax positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, Bacteriovorax promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the Bacteriovorax genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.
{"title":"The chromatin landscape of the histone-possessing Bacteriovorax bacteria","authors":"Georgi K. Marinov, Benjamin Doughty, Anshul Kundaje, William J Greenleaf","doi":"10.1101/gr.279418.124","DOIUrl":"https://doi.org/10.1101/gr.279418.124","url":null,"abstract":"Histone proteins have traditionally been thought to be restricted to eukaryotes and most archaea, with eukaryotic nucleosomal histones deriving from their archaeal ancestors. In contrast, bacteria lack histones as a rule. However, histone proteins have recently been identified in a few bacterial clades, most notably the phylum Bdellovibrionota, and these histones have been proposed to exhibit a range of divergent features compared to histones in archaea and eukaryotes. However, no functional genomic studies of the properties of Bdellovibrionota chromatin have been carried out. In this work, we map the landscape of chromatin accessibility, active transcription and three-dimensional genome organization in a member of Bdellovibrionota (a <em>Bacteriovorax</em> strain). We find that, similar to what is observed in some archaea and in eukaryotes with compact genomes such as yeast, <em>Bacteriovorax</em> chromatin is characterized by preferential accessibility around promoter regions. Similar to eukaryotes, chromatin accessibility in <em>Bacteriovorax</em> positively correlates with gene expression. Mapping active transcription through single-strand DNA (ssDNA) profiling revealed that unlike in yeast, but similar to the state of mammalian and fly promoters, <em>Bacteriovorax</em> promoters exhibit very strong polymerase pausing. Finally, similar to that of other bacteria without histones, the <em>Bacteriovorax</em> genome exists in a three-dimensional (3D) configuration organized by the parABS system along the axis defined by replication origin and termination regions. 
These results provide a foundation for understanding the chromatin biology of the unique Bdellovibrionota bacteria and the functional diversity in chromatin organization across the tree of life.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"61 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142684319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoli Zhang, Yirui Huang, Yajing Yang, Qi-En Wang, Lang Li
Single-cell lineage tracing (scLT) has emerged as a powerful tool, providing unparalleled resolution to investigate cellular dynamics, fate determination, and the underlying molecular mechanisms. This review thoroughly examines the latest prospective lineage DNA barcode tracing technologies. It further highlights pivotal studies that leverage single-cell lentiviral integration barcoding technology to unravel the dynamic nature of cell lineages in both developmental biology and cancer research. Additionally, the review navigates through critical considerations for successful experimental design in lineage tracing and addresses challenges inherent in this field, including technical limitations, complexities in data analysis, and the imperative for standardization. It also outlines current gaps in knowledge and suggests future research directions, contributing to the ongoing advancement of scLT studies.
{"title":"Advancements in prospective single-cell lineage barcoding and their applications in research","authors":"Xiaoli Zhang, Yirui Huang, Yajing Yang, Qi-En Wang, Lang Li","doi":"10.1101/gr.278944.124","DOIUrl":"https://doi.org/10.1101/gr.278944.124","url":null,"abstract":"Single-cell lineage tracing (scLT) has emerged as a powerful tool, providing unparalleled resolution to investigate cellular dynamics, fate determination, and the underlying molecular mechanisms. This review thoroughly examines the latest prospective lineage DNA barcode tracing technologies. It further highlights pivotal studies that leverage single-cell lentiviral integration barcoding technology to unravel the dynamic nature of cell lineages in both developmental biology and cancer research. Additionally, the review navigates through critical considerations for successful experimental design in lineage tracing and addresses challenges inherent in this field, including technical limitations, complexities in data analysis, and the imperative for standardization. It also outlines current gaps in knowledge and suggests future research directions, contributing to the ongoing advancement of scLT studies.","PeriodicalId":12678,"journal":{"name":"Genome research","volume":"16 1","pages":""},"PeriodicalIF":7.0,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142684317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kathleen Zeglinski, Christian Montellese, Matthew E Ritchie, Monther Alhamdoosh, Cédric Vonarburg, Rory Bowden, Monika Jordi, Quentin Gouil, Florian Aeschimann, Arthur Hsu
Despite recent advances made toward improving the efficacy of lentiviral gene therapies, a sizeable proportion of produced vector contains an incomplete and thus potentially nonfunctional RNA genome. This can undermine gene delivery by the lentivirus as well as increase manufacturing costs and must be improved to facilitate the widespread clinical implementation of lentiviral gene therapies. Here, we compare three long-read sequencing technologies for their ability to detect issues in vector design and determine nanopore direct RNA sequencing to be the most powerful. We show how this approach identifies and quantifies incomplete RNA caused by cryptic splicing and polyadenylation sites, including a potential cryptic polyadenylation site in the widely used Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE). Using artificial polyadenylation of the lentiviral RNA, we also identify multiple hairpin-associated truncations in the analyzed lentiviral vectors (LVs), which account for most of the detected RNA fragments. Finally, we show that these insights can be used for the optimization of LV design. In summary, nanopore direct RNA sequencing is a powerful tool for the quality control and optimization of LVs, which may help to improve lentivirus manufacturing and thus the development of higher quality lentiviral gene therapies.
{"title":"An optimized protocol for quality control of gene therapy vectors using nanopore direct RNA sequencing.","authors":"Kathleen Zeglinski, Christian Montellese, Matthew E Ritchie, Monther Alhamdoosh, Cédric Vonarburg, Rory Bowden, Monika Jordi, Quentin Gouil, Florian Aeschimann, Arthur Hsu","doi":"10.1101/gr.279405.124","DOIUrl":"10.1101/gr.279405.124","url":null,"abstract":"<p><p>Despite recent advances made toward improving the efficacy of lentiviral gene therapies, a sizeable proportion of produced vector contains an incomplete and thus potentially nonfunctional RNA genome. This can undermine gene delivery by the lentivirus as well as increase manufacturing costs and must be improved to facilitate the widespread clinical implementation of lentiviral gene therapies. Here, we compare three long-read sequencing technologies for their ability to detect issues in vector design and determine nanopore direct RNA sequencing to be the most powerful. We show how this approach identifies and quantifies incomplete RNA caused by cryptic splicing and polyadenylation sites, including a potential cryptic polyadenylation site in the widely used Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE). Using artificial polyadenylation of the lentiviral RNA, we also identify multiple hairpin-associated truncations in the analyzed lentiviral vectors (LVs), which account for most of the detected RNA fragments. Finally, we show that these insights can be used for the optimization of LV design. 
In summary, nanopore direct RNA sequencing is a powerful tool for the quality control and optimization of LVs, which may help to improve lentivirus manufacturing and thus the development of higher quality lentiviral gene therapies.</p>","PeriodicalId":12678,"journal":{"name":"Genome research","volume":" ","pages":"1966-1975"},"PeriodicalIF":6.2,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11610601/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}