
Genome Research: latest articles

Mapping multitissue regulatory variants reveals a liver-centric coexpression network associated with duck egg-laying performance
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-09-10 | DOI: 10.1101/gr.280345.124
Yang Xi, Jingjing Qi, Zhao Yang, Yutian Zeng, Huicong Zhang, Qiuyu Tao, Mengru Xu, Anqi Huang, Shenqiang Hu, Chunchun Han, Lili Bai, Jiwei Hu, Jiwen Wang, Liang Li, Lingzhao Fang, Hehe Liu
Poultry egg production is shaped by the intertwined action of multiple physiological systems, greatly magnifying the complexity of its underlying genetic regulation. Although multitissue mapping of regulatory variants offers a powerful route to untangle this complexity, comprehensive data sets in ducks remain scarce. Meanwhile, the contributions of peripheral systems beyond neuroendocrine regulation to poultry egg production are still largely unexplored. Here, we generate 979 RNA-seq samples from the liver, ovary, oviduct shell gland, and spleen, along with matched whole-genome sequencing data from 307 egg-laying ducks. We map cis-regulatory variants associated with gene expression (eQTL), alternative splicing (sQTL), and 3′ alternative polyadenylation (apaQTL), yielding 14,074, 6267, and 4994 genes with at least one significant eQTL, sQTL, and apaQTL, respectively. By integrating this resource and GWAS results, we confirm that ABCG2 expression in the shell gland specifically regulates eggshell color, with additional involvement of ENOPH1’s 3′APA sites in both the shell gland and liver. In addition, expression of LOC101800576 and LOC101790890 in the shell gland, of LOC119713219 in the ovary, and of GLP2R in the spleen is causally linked to declining egg production at peak laying. Last, we delineate a cross-tissue regulatory landscape underlying duck egg production and identify liver-derived modules, particularly Liver_ME1, which is mainly involved in cell cycle regulation, as central hubs coordinating with peripheral tissues affecting duck egg production. This work delivers a key resource and fresh perspectives for dissecting the genetic mechanisms of duck egg production and for future studies on cross-tissue regulation of reproduction.
Citations: 0
ScisTree2 enables large-scale inference of cell lineage trees and genotype calling using efficient local search
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-09-03 | DOI: 10.1101/gr.280542.125
Haotian Zhang, Yiming Zhang, Teng Gao, Yufeng Wu
In a multicellular organism, cell lineages share a common evolutionary history. Knowing this history can facilitate the study of development, aging, and cancer. Cell lineage trees represent the evolutionary history of cells sampled from an organism. Recent developments in single-cell sequencing have greatly facilitated the inference of cell lineage trees. However, single-cell data are sparse and noisy, and the size of single-cell data is increasing rapidly. Accurate inference of a cell lineage tree from large single-cell data is computationally challenging. In this paper, we present ScisTree2, a fast and accurate cell lineage tree inference and genotype calling approach based on the infinite-sites model. ScisTree2 relies on an efficient local search approach to find optimal trees. ScisTree2 also calls single-cell genotypes based on the inferred cell lineage tree. Experiments on simulated and real biological data show that ScisTree2 achieves better overall accuracy while being significantly more efficient than existing methods. To the best of our knowledge, ScisTree2 is the first model-based cell lineage tree inference and genotype calling approach that is capable of handling datasets from tens of thousands of cells or more.
Citations: 0
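The ScisTree2 abstract above rests on the infinite-sites model. As a minimal sketch of what that model implies, and not of ScisTree2's own algorithm, the following checks a noise-free binary genotype matrix with the classical pairwise test: with 0 as the ancestral state, a tree consistent with infinite sites exists only if no pair of sites shows all three patterns (1,0), (0,1), and (1,1) across cells. The function name and the input layout (rows are cells, columns are sites) are assumptions made for illustration.

```python
from itertools import combinations


def violates_infinite_sites(genotypes):
    """Return the first conflicting pair of site indices, or None.

    `genotypes` is a list of rows (one per cell) of 0/1 values,
    where 0 is the ancestral state and 1 the mutated state.
    Assumes no missing entries and no genotyping noise.
    """
    n_sites = len(genotypes[0]) if genotypes else 0
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) patterns seen across cells.
        seen = {(row[i], row[j]) for row in genotypes}
        # All three patterns together cannot arise on one tree
        # if each site mutates only once.
        if {(1, 0), (0, 1), (1, 1)} <= seen:
            return (i, j)
    return None


clean = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
noisy = [[1, 0], [0, 1], [1, 1]]
print(violates_infinite_sites(clean))
print(violates_infinite_sites(noisy))
```

Real single-cell data are sparse and noisy, as the abstract notes, so a check like this would rarely pass on raw calls; that gap is what motivates probabilistic tree search rather than an exact test.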
APOBEC3A drives deaminase mutagenesis in human gastric epithelium
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-08-26 | DOI: 10.1101/gr.280338.124
Yohan An, Ji-Hyun Lee, Joonoh Lim, Jeonghwan Youk, Seongyeol Park, Ji-Hyung Park, Kijong Yi, Taewoo Kim, Chang Hyun Nam, Won Hee Lee, Soo A Oh, Yoo Jin Bae, Thomas M. Klompstra, Haeun Lee, Jinju Han, Junehwak Lee, Jung Woo Park, Jie-Hyun Kim, Hyunki Kim, Hugo Snippert, Bon-Kyoung Koo, Young Seok Ju
Cancer genomes frequently carry APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like)-associated DNA mutations, suggesting APOBEC enzymes as innate mutagens during cancer initiation and evolution. However, the pure mutagenic impacts of the specific enzymes among this family remain unclear in normal human cell lineages. Here, we investigated the comparative mutagenic activities of APOBEC3A and APOBEC3B through whole-genome sequencing of normal human gastric organoid lines carrying doxycycline-inducible APOBEC expression cassettes. Our findings demonstrated that transcriptional upregulation of APOBEC3A led to the acquisition of a massive number of genomic mutations in just a few cell cycles. By contrast, despite clear deaminase activity and DNA damage, APOBEC3B upregulation did not generate a significant increase in mutations in the gastric epithelium. APOBEC3B-associated mutagenesis remained minimal even in the context of TP53 inactivation. Further analysis of the mutational landscape following APOBEC3A upregulation revealed a detailed spectrum of APOBEC3A-associated mutations, including indels, primarily 1 bp deletions, clustered mutations, and evidence of selective pressures acting on cells carrying the mutations. Our observations provide a clear foundation for understanding the mutational impact of APOBEC enzymes in human cells.
Citations: 0
Molecular and genetic landscapes of retina and brain microglia in neurodegenerative diseases
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-08-26 | DOI: 10.1101/gr.280554.125
Khang Ma, Rinki Ratnapriya
Microglia-driven dysregulation has emerged as a significant underlying mechanism in many neurodegenerative diseases, such as Age-related Macular Degeneration (AMD) and Alzheimer's disease (AD). While both brain and retinal microglia originate from the yolk sac, it is uncertain whether they share molecular similarities or genetic and molecular foundations related to neurodegenerative diseases. In this study, we examine the transcriptomic and epigenetic profiles of retina and brain microglia through integrative analyses of single-nucleus RNA sequencing (snRNA-seq) and single-nucleus ATAC sequencing (snATAC-seq) from 97 independent human samples across eleven different studies. Our findings reveal that retina and brain microglia share similar expression and regulatory profiles when compared to other cell types in retina and brain. By integrating genome-wide association studies (GWAS) data with gene expression profiles, we demonstrate that genetic variants associated with AMD and AD are linked to microglia-specific gene signatures. Furthermore, integrating regulatory annotations with GWAS data shows that susceptibility loci for both AMD and AD are notably enriched in the open chromatin regions of microglia from brain and retina, emphasizing their relevance to these neurodegenerative conditions. Finally, a comparison with microglia annotations from other tissues highlights the specific enrichment of microglia in relation to neurodegenerative diseases. These findings contribute to the understanding of the role of microglia in AMD and AD pathogenesis and offer an opportunity to utilize resources from both retinal and brain microglia to deepen our understanding of their contributions to genetic variations in neurodegenerative diseases.
Citations: 0
Widespread specific intron-retention events in nuclear RNA complexes identified by sedimentation analysis of pluripotent cellular extracts
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-08-26 | DOI: 10.1101/gr.280431.125
Isabela T Pereira, Izabela Mamede, Paulo de Paiva Amaral, Gloria Regina Franco, John L Rinn
Many essential cellular processes require RNA to interact with protein(s) to form ribonucleoprotein complexes (RNPs). For example, all cellular proteins are produced by the ribosome, a large and stable RNP; gene splicing requires a choreography of numerous small and large RNPs; and even the replication of telomeric DNA requires an RNP. All these examples are stable RNPs that exhibit specific sedimentation rates (e.g., in a sucrose gradient) based on the composition of RNA and protein. In this study, we aimed to identify RNA components of discrete RNPs on a transcriptome-wide scale. Using sucrose-gradient sedimentation followed by sequencing, we identified 1,057 RNA transcripts, both coding and noncoding, that are likely to be components of cellular RNPs. We named these transcripts Gradient Enriched Transcripts (GETs). GETs were predominantly nuclear and metabolically stable, and they were not the major splice isoforms; instead, each contained a specific retained intron. Collectively, our study reveals a widespread phenomenon of a specific intron being retained in stable nuclear RNPs.
Citations: 0
Quadrupia provides a comprehensive catalog of G-quadruplexes across genomes from the tree of life
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-08-26 | DOI: 10.1101/gr.279790.124
Nikol Chantzi, Akshatha Nayak, Fotis A. Baltoumas, Eleni Aplakidou, Shiau Wei Liew, Jesslyn Elvaretta Galuh, Michail Patsakis, Austin Montgomery, Camille Moeckel, Ioannis Mouratidis, Saiful Arefeen Sazed, Wilfried Guiblet, Panagiotis Karmiris-Obratański, Guliang Wang, Apostolos Zaravinos, Karen M. Vasquez, Chun Kit Kwok, Georgios A. Pavlopoulos, Ilias Georgakopoulos-Soares
G-quadruplex DNA structures exhibit a profound influence on essential biological processes, including transcription, replication, telomere maintenance, and genomic stability. These structures have demonstrably shaped organismal evolution. However, a comprehensive, organism-wide G-quadruplex map encompassing the diversity of life has remained elusive. Here, we introduce Quadrupia, the most extensive and well-characterized G-quadruplex database to date, facilitating the exploration of G-quadruplex structures across the evolutionary spectrum. Quadrupia has identified G-quadruplex sequences in 108,449 reference genomes, with a total of 140,181,277 G-quadruplexes. The database also hosts a collection of 319,784 G-quadruplex clusters of 20 or more members, annotated by taxonomic distributions, multiple sequence alignments, profile hidden Markov models, and cross-references to G-quadruplex 3D structures. Examination of G-quadruplexes across functional genomic elements in different taxa indicates preferential orientation and positioning, with significant differences between individual taxonomic groups. For example, we find that G-quadruplexes in bacteria with a single replication origin display a profound preference for the leading orientation. Finally, we experimentally validate the most frequently observed G-quadruplexes using CD spectroscopy, UV melting, and fluorescence-based approaches.
Citations: 0
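The Quadrupia abstract above does not say how candidate G-quadruplexes are detected. As an illustration only, the sketch below scans a sequence with a widely used consensus motif for putative G-quadruplexes: four runs of at least three guanines separated by loops of 1 to 7 bases. The motif choice, the loop limits, and the function name are assumptions for this example and may differ from what Quadrupia actually uses.

```python
import re

# Four G-runs (>= 3 Gs) separated by loops of 1-7 bases; the C-rich
# pattern captures the same motif on the reverse strand.
G4_PLUS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
G4_MINUS = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")


def find_g4_motifs(sequence):
    """Return sorted (start, end, strand) tuples of putative G4 motifs.

    Coordinates are 0-based, end-exclusive, on the given sequence.
    Only non-overlapping matches per strand are reported, because
    `finditer` resumes scanning after each match.
    """
    seq = sequence.upper()
    hits = []
    for strand, pattern in (("+", G4_PLUS), ("-", G4_MINUS)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Human telomeric repeat (TTAGGG)n, a classic G-quadruplex former.
print(find_g4_motifs("aagggttagggttagggttagggaa"))
```

Recording the strand alongside each hit is what makes orientation analyses of the kind mentioned in the abstract (for example, leading versus lagging strand) possible downstream.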
Robust 16S rRNA classification based on a compressed LCA index
IF 7 | Zone 2, Biology | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Pub Date: 2025-08-25 | DOI: 10.1101/gr.279846.124
Omar Y. Ahmed, Christina Boucher, Ben Langmead
Taxonomic sequence classification is a computational problem central to the study of metagenomics and evolution. Advances in compressed indexing with the r-index enable full-text pattern matching against large sequence collections. But the data structures that link pattern sequences to their clades of origin still do not scale well to large collections. Previous work proposed the document array profiles, which use O(rd) words of space, where r is the number of maximal equal-letter runs in the Burrows-Wheeler transform and d is the number of distinct genomes. The linear dependence on d is limiting, since real taxonomies can easily contain tens of thousands of leaves or more. We propose a method called cliff compression that reduces this size by a large factor, over 250× when indexing the SILVA 16S rRNA gene database. This method uses Θ(r log d) words of space in expectation under a random model we propose here. We implemented these ideas in an open source tool called Cliffy that performs efficient taxonomic classification of sequencing reads with respect to a compressed taxonomic index. When applied to simulated 16S rRNA reads, Cliffy's read-level accuracy is higher than Kraken2's by 11-18%. Clade abundances are also more accurately predicted by Cliffy compared to Kraken2 and Bracken. Overall, Cliffy is a fast and space-economical extension to compressed full-text indexes, enabling them to perform fast and accurate taxonomic classification queries. Cliffy's accuracy underscores the advantages of full-text indexes, which offer a more precise solution compared to k-mer indexes designed for a specific k value.
Citations: 0
Tree-based differential testing using inferential uncertainty for RNA-seq
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-08-21 DOI: 10.1101/gr.279981.124
Noor P Singh, Euphy Wu, Jason Fan, Michael I Love, Rob Patro
Identifying differentially expressed transcripts poses a crucial yet challenging problem in transcriptomics. Substantial uncertainty is associated with the abundance estimates of certain transcripts which, if ignored, can lead to the exaggeration of false positives and, if included, may lead to reduced power. Given a set of RNA-seq samples, TreeTerminus arranges transcripts in a hierarchical tree structure that encodes different layers of resolution for interpretation of the abundance of transcriptional groups, with uncertainty generally decreasing as one ascends the tree from the leaves. We introduce mehenDi, which utilizes the tree structure from TreeTerminus for differential testing. The nodes output by mehenDi, called the selected nodes, are determined in a data-driven manner to maximize the signal that can be extracted from the data while controlling for the uncertainty associated with estimating the transcript abundances. The selected nodes can include transcripts and inner nodes, with no two nodes having an ancestor/descendant relationship. We evaluated our method on both simulated and experimental datasets, comparing its performance with other tree-based differential methods as well as with uncertainty-aware differential transcript/gene expression methods. Our method detects inner nodes that show a strong signal for differential expression, which would have been overlooked when analyzing the transcripts alone.
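The constraint that no two selected nodes share an ancestor/descendant relationship can be checked mechanically. The following Python is a minimal sketch of that check only, not mehenDi's selection procedure; the tree is assumed to be given as a child-to-parent dictionary, and the node names and the functions `ancestors` and `is_valid_selection` are hypothetical.

```python
def ancestors(node, parent):
    """Yield every proper ancestor of node by following parent links."""
    # Assumes `parent` describes a tree: the root has no entry, no cycles.
    while node in parent:
        node = parent[node]
        yield node


def is_valid_selection(selected, parent):
    """True if no selected node is an ancestor of another selected node."""
    chosen = set(selected)
    return not any(a in chosen for n in chosen for a in ancestors(n, parent))


if __name__ == "__main__":
    # Hypothetical toy tree: leaves t1..t4 are transcripts, g1/g2 inner nodes.
    parent = {"t1": "g1", "t2": "g1", "t3": "g2", "t4": "g2",
              "g1": "root", "g2": "root"}
    print(is_valid_selection(["g1", "t3"], parent))  # inner node plus unrelated leaf
    print(is_valid_selection(["g1", "t1"], parent))  # g1 is the parent of t1
```

A selection that passes this check reports each transcript at most once, either on its own or as part of a single aggregated inner node.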
Citations: 0
Estimating the size of long tandem repeat expansions from short reads with ScatTR
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-08-21 DOI: 10.1101/gr.280563.125
Rashid Al-Abri, Gamze Gursoy
Tandem repeats (TRs) are sequences of DNA where two or more base pairs are repeated back-to-back at specific locations in the genome. TR expansions, where the number of repeat units exceeds the normal range, have been implicated in over 50 conditions. However, accurately measuring the copy number of TRs is challenging, especially when their expansions are larger than the fragment sizes used in standard short-read genome sequencing. Here, we introduce ScatTR, a novel computational method that leverages a maximum likelihood framework to estimate the copy number of large TR expansions from short-read sequencing data. ScatTR calculates the likelihood of different alignments between sequencing reads and reference sequences that represent various TR lengths and employs a Monte Carlo technique to find the best match. In simulated data, ScatTR outperforms state-of-the-art methods, particularly for TRs with longer motifs and those with lengths that greatly exceed typical sequencing fragment sizes. When applied to data from the 1000 Genomes Project, ScatTR detects potential large TR expansions that other methods missed, highlighting its ability to better characterize genome-wide TR variation.
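To make the notion of TR copy number concrete, the Python below counts the longest stretch of back-to-back copies of a motif in a fully known sequence. This is a minimal sketch of the definition only and is not ScatTR's maximum likelihood method, which exists precisely because short reads do not span large expansions; the function name `max_copy_number` and the example sequence are assumptions for illustration.

```python
def max_copy_number(sequence: str, motif: str) -> int:
    """Largest number of back-to-back copies of motif anywhere in sequence."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        copies = 0
        pos = start
        # Extend the run while the motif repeats immediately after itself.
        while sequence.startswith(motif, pos):
            copies += 1
            pos += step
        best = max(best, copies)
    return best


if __name__ == "__main__":
    # Hypothetical sequence with three consecutive CAG units.
    print(max_copy_number("ACGCAGCAGCAGTT", "CAG"))
```

Direct counting like this needs the whole repeat in one contiguous sequence, so it stops working once an expansion is longer than the sequenced fragments.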
Citations: 0
Deciphering context-specific gene programs from single-cell and spatial transcriptomics data with DeCEP
IF 7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-08-21 DOI: 10.1101/gr.279689.124
Lin Li, Xianbin Su, Ze-Guang Han
Functional gene programs play a wide range of roles in health and disease by orchestrating transcriptional coregulation to govern cell identity. Understanding these intricate gene programs is essential for unraveling the complexities of biological systems; however, deciphering them remains a significant challenge. Recent advancements in single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) technologies have empowered the comprehensive characterization of gene programs at both single-cell and spatial resolutions. Here, we present DeCEP, a computational framework designed to characterize context-specific gene programs using scRNA-seq and ST data. DeCEP leverages functional gene lists and directed graphs to construct functional networks underlying distinct cellular or spatial contexts. It then identifies context-dependent hub genes associated with specific gene programs based on network topology and assigns gene program activity to individual cells or spatial locations. Through evaluation on both simulated and real biological datasets, DeCEP demonstrates complementary strengths over existing methods by enabling more fine-grained characterization of gene programs within specific contexts, particularly those characterized by pronounced transcriptional heterogeneity. Furthermore, we showcase the ability of DeCEP to elucidate biological insights through case studies on normal liver tissue, Alzheimer's disease, and cancer.
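The abstract does not say which topological measure DeCEP uses to pick hub genes. As one simple possibility, assumed here purely for illustration, the Python below ranks the nodes of a directed graph by total degree (in-degree plus out-degree); the function `rank_hubs`, the edge list, and the gene names are all hypothetical.

```python
from collections import Counter


def rank_hubs(edges, top=3):
    """Rank nodes of a directed graph by in-degree plus out-degree."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # outgoing edge
        degree[target] += 1  # incoming edge
    # Highest degree first; ties broken by name so the order is deterministic.
    return sorted(degree, key=lambda gene: (-degree[gene], gene))[:top]


if __name__ == "__main__":
    # Hypothetical regulatory edges (regulator, target) within one context.
    edges = [("geneA", "geneB"), ("geneA", "geneC"),
             ("geneA", "geneD"), ("geneB", "geneC")]
    print(rank_hubs(edges, top=2))
```

Building a separate edge list per cellular or spatial context and ranking each one would give context-dependent hubs in this simplified picture.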
Citations: 0