
Genome Research: Latest Articles

Genome-wide maps of CPD deamination in yeast reveal the impact of DNA sequence context and nucleosome architecture on cytosine deamination rates.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280384.124
Marian F Laughery, Bastian Stark, Benjamin Morledge-Hampton, Steven A Roberts, John J Wyrick

UV light induces cyclobutane pyrimidine dimers (CPDs) and other mutagenic lesions in cellular DNA. Cytosine-containing CPDs can subsequently undergo rapid deamination to uracil, a process that has been linked to UV mutagenesis. However, the impact of genomic context and chromatin architecture on CPD deamination rates in cells remains poorly understood. Here, we develop a method known as dCPD-seq to map deaminated CPDs (dCPDs) across the genome of repair-deficient yeast cells at single-nucleotide resolution. Our dCPD-seq data reveal that sequence context significantly modulates CPD deamination rates in UV-irradiated yeast cells, with CPDs in TCG contexts showing particularly rapid deamination rates. Our analysis indicates that rapid CPD deamination can explain why UV-induced mutations are specifically enriched at TCG sequences, both in UV-irradiated yeast cells and in human skin cancers. CPD deamination is suppressed near the transcription start and end sites of yeast genes, which may in part be mediated by DNA-bound transcription factors. Finally, we show that the wrapping of DNA in nucleosomes modulates CPD deamination in yeast cells. Our data indicate that CPD deamination is elevated at minor-in rotational positions where the DNA minor groove faces the histone octamer, likely owing to increased solvent accessibility of the C4 position of the cytosine base. We also observe strand-specific enrichment of CPD deamination at rotational positions where the DNA backbone faces out toward the solvent. Taken together, these findings reveal how DNA sequence context and chromatin architecture modulate CPD deamination rates across a eukaryotic genome.
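
As a rough illustration of what a per-context comparison involves, the sketch below tallies, for each trinucleotide context centred on the cytosine, the fraction of mapped CPD positions that are also called deaminated. It is not the dCPD-seq analysis pipeline: the function names, the 0-based position convention, and the idea of using a simple fraction as a stand-in for a rate are all assumptions made for this example.

```python
from collections import Counter


def trinucleotide_context(seq, pos):
    """Return the 3-mer centred on 0-based position pos, or None at the sequence edges."""
    if pos < 1 or pos > len(seq) - 2:
        return None
    return seq[pos - 1:pos + 2].upper()


def context_rates(seq, cpd_positions, dcpd_positions):
    """Fraction of CPD positions in each trinucleotide context that are also deaminated.

    cpd_positions and dcpd_positions are hypothetical lists of 0-based
    cytosine positions on the same strand of seq.
    """
    deaminated_sites = set(dcpd_positions)
    total = Counter()
    deaminated = Counter()
    for pos in cpd_positions:
        context = trinucleotide_context(seq, pos)
        if context is None:
            continue  # skip sites without a full 3-mer around them
        total[context] += 1
        if pos in deaminated_sites:
            deaminated[context] += 1
    return {context: deaminated[context] / total[context] for context in total}
```

A real analysis would also need to normalise for how often each context occurs in the genome and for the initial CPD yield at each site, which this toy version ignores.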

Genome Research, pp. 183-196. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12887450/pdf/
Citations: 0
Automated chromatin profiling with spa-ChIP-seq uncovers the impacts of condition variations.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.281320.125
Yuwei Cao, Lauren Patel, Lauren Alcoser, Eric Mendenhall, Christopher Benner, Sven Heinz, Alon Goren

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to study the genomic localization of DNA-associated proteins. However, conventional protocols include multiple manual steps that can introduce inconsistency and limit scalability, thereby restricting the inclusion of appropriate replicates and controls. Although the introduction of liquid handling platforms has improved reproducibility, most existing efforts have automated only a subset of the workflow, and extending automation to efficiently map nonhistone proteins, such as chromatin regulators, remains challenging. Here, we present a fully automated implementation of our previously developed single-pot ChIP-seq protocol, named spa-ChIP-seq, which enables scalable processing of 8 to 96 ChIP-seq samples from cross-linked cells to a sequencing-ready library in approximately 3 days with an estimated cost of $70 per sample. Benchmarking spa-ChIP-seq against manual ChIP-seq performed in parallel demonstrates a comparable signal-to-noise ratio between the two workflows. Using spa-ChIP-seq, we systematically evaluate multiple parameters including shearing and cross-linking conditions, buffer compositions, and the ratio of antibody to cell number. We find, for the first time to our knowledge, that weaker genomic localization signals are sensitive to changing the antibody-to-cell-number ratio, whereas the stronger signals remain unaffected. This finding underscores the importance of maintaining a consistent antibody-to-cell-number ratio for comparative studies, such as treatment responses or chromatin-QTL mapping. The spa-ChIP-seq protocol is publicly available, including deck setups, operational parameters, and scripts. We envision that this robust, cost-efficient protocol will facilitate high-throughput, reproducible ChIP-seq analyses, supporting large-scale studies of antibody validation, compound screening, population genomics, and diagnostic frameworks.
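
The point about holding the antibody-to-cell-number ratio fixed can be made concrete with a small calculation. The helper below scales an antibody amount from a reference condition to a different cell input. The function name, the choice of micrograms as the unit, and the numbers in the usage note are illustrative assumptions, not values from the spa-ChIP-seq protocol.

```python
def antibody_amount(cell_count, ref_antibody_ug, ref_cell_count):
    """Antibody (in micrograms) that keeps the antibody-to-cell ratio of a reference condition.

    All three arguments are hypothetical inputs: the cell number of the new
    sample, and the antibody amount and cell number of the reference sample.
    """
    if cell_count <= 0 or ref_cell_count <= 0 or ref_antibody_ug <= 0:
        raise ValueError("cell counts and antibody amount must be positive")
    # Same ratio in both conditions: antibody / cells == ref_antibody / ref_cells
    return ref_antibody_ug * cell_count / ref_cell_count
```

For instance, if a reference sample used 2.0 µg of antibody for 1,000,000 cells, `antibody_amount(500_000, 2.0, 1_000_000)` gives the amount for a sample with half as many cells.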

Genome Research, pp. 129-141. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758389/pdf/
Citations: 0
Genome-wide nucleosome and transcription factor responses to genetic perturbations reveal chromatin-mediated mechanisms of transcriptional regulation.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.279637.124
Kevin Moyung, Yulong Li, Heather K MacAlpine, Alexander J Hartemink, David M MacAlpine

Epigenetic mechanisms contribute to gene regulation by altering chromatin accessibility through changes in transcription factor (TF) and nucleosome occupancy across the genome. Despite numerous studies focusing on changes in gene expression, the intricate chromatin-mediated regulatory code remains largely uncharted on a comprehensive scale. We address this by employing a factor-agnostic, reverse-genetics approach that uses MNase-seq to capture genome-wide TF and nucleosome occupancies in response to the individual deletion of 201 transcriptional regulators in Saccharomyces cerevisiae, thereby assaying nearly 1 million mutant-gene interactions. We develop a principled new approach to identify and quantify chromatin changes genome-wide, allowing us to observe differences in TF and nucleosome occupancy that recapitulate well-established pathways identified by gene expression data. We also discover distinct chromatin signatures associated with the up- and downregulation of genes and use these signatures to reveal regulatory mechanisms previously unexplored in expression-based studies. Finally, we demonstrate that chromatin features are predictive of transcriptional activity, and we leverage these features to reconstruct chromatin-based transcriptional regulatory networks. Overall, these results illustrate the power of an approach combining genetic perturbation with high-resolution epigenomic profiling; the latter enables a close examination of the interplay between TFs and nucleosomes genome-wide, providing a deeper, more mechanistic understanding of the complex relationship between chromatin organization and transcription.

Genome Research, pp. 115-128. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758391/pdf/
Citations: 0
A versatile type VI CRISPR-based approach for targeted m6A demethylation in mRNAs.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280476.125
Panagiotis G Adamopoulos, Konstantina Athanasopoulou, Andreas Scorilas

Epitranscriptomics, a rapidly evolving field mainly driven by massively parallel sequencing technologies, explores post-transcriptional RNA modifications. N6-methyladenosine (m6A) has emerged as the most prominent and dynamically regulated modification in human mRNAs, being implicated in the regulation of diverse biological processes, including spermatogenesis, heat shock response, ultraviolet-induced DNA damage response, and maternal mRNA clearance. Despite the recognized significance of m6A in mRNA regulation, few studies have focused on the targeted and efficient manipulation of this modification in mRNAs. Here, we present Dem6A-Vec, an "all-in-one" plasmid vector designed for site-specific m6A demethylation in human mRNAs. Dem6A-Vec integrates the expression of a catalytically inactive RfxCas13d fused to the m6A demethylase ALKBH5 and a U6-driven customizable guide RNA in a single construct, simplifying experimental workflows and enhancing targeting efficiency. Using nanopore direct RNA sequencing, we identify high-confidence m6A sites in HeLa cells, which serve as targets for Dem6A-Vec. We validate the targeted demethylation of m6A sites in the EEF2 and RRAGA genes using the established SELECT-qPCR method, confirming the impacts on mRNA stability and highlighting the tool's precision and versatility. The presented approach is applied to multiple mRNA sites with diverse methylation stoichiometries, underscoring its adaptability to various transcriptomic contexts. This study provides a robust and scalable method for investigating the functional roles of m6A modifications, offering a transformative platform for advancing epitranscriptomic research and potential therapeutic applications.

Genome Research, pp. 169-182. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758396/pdf/
Citations: 0
Cell-type- and chromosome-specific chromatin landscapes and DNA replication programs of Drosophila testis tumor stem cell-like cells.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280809.125
Jennifer A Urban, Daniel Ringwalt, John M Urban, Wingel Xue, Ryan Gleason, Keji Zhao, Xin Chen

Stem cells have the unique ability to self-renew and differentiate into specialized cell types. Epigenetic mechanisms, including histones and their post-translational modifications, play a crucial role in regulating programs integral to a cell's identity, like gene expression and DNA replication. However, the transcriptional, chromatin, and replication timing profiles of adult stem cells in vivo remain poorly understood. Containing germline stem cells (GSCs) and somatic cyst stem cells (CySCs), the Drosophila testis provides an excellent in vivo model for studying adult stem cells. However, the small number of stem cells and the cellular heterogeneity of this tissue have limited comprehensive genomic studies. In this study, we develop cell-type-specific genomic techniques to analyze the transcriptome, histone modification patterns, and replication timing of germline stem cell (GSC)-like and somatic cyst stem cell (CySC)-like cells. Single-cell RNA sequencing validates previous findings on GSC-CySC intercellular communication and reveals a high expression of chromatin regulators in GSC-like cells. To characterize chromatin landscapes, we develop a cell-type-specific chromatin profiling assay to map H3K4me3-, H3K27me3-, and H3K9me3-enriched regions, corresponding to the euchromatic, facultative heterochromatic, and constitutive heterochromatic domains, respectively. Finally, we determine cell-type-specific replication timing profiles, integrating our in vivo data sets with published data using cultured cell lines. Our results reveal that GSC-like cells display a distinct replication program, compared with somatic lineages, that aligns with chromatin state differences. Collectively, our integrated transcriptomic, chromatin, and replication data sets provide a comprehensive framework for understanding genome regulation differences between these in vivo stem-cell populations, demonstrating the power of multiomics in uncovering cell-type-specific regulatory features.

Genome Research, pp. 83-101. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758400/pdf/
Citations: 0
The PanOryza pangene catalog of Asian cultivated rice.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280790.125
Bruno Contreras-Moreira, Eshan Sharma, Shradha Saraf, Guy Naamati, Parul Gupta, Justin Elser, Dmytro Chebotarov, Kapeel Chougule, Zhenyuan Lu, Sharon Wei, Andrew Olson, Ian Tsang, Disha Lodha, Yong Zhou, Zhichao Yu, Wen Zhao, Jianwei Zhang, Sandeep Amberkar, Kawinnat Sue-Ob, Zhi Sun, Maria Martin, Kenneth L McNally, Doreen Ware, Eric W Deutsch, Dario Copetti, Rod A Wing, Pankaj Jaiswal, Sarah Dyer, Andrew R Jones

The rice genome underpins fundamental research and breeding, but the Nipponbare (japonica) reference does not fully encompass the genetic diversity of Asian rice. To address this gap, the Rice Population Reference Panel (RPRP) was developed, comprising high-quality assemblies of 16 rice cultivars to represent the japonica, indica, aus, and aromatic varietal groups. The RPRP has been consistently annotated and supported by extensive experimental data, and here, we report the computational assignment, characterization, and dissemination of stably identified pangenes, collectively called the PanOryza data set. We identify 25,178 core pangenes shared across all cultivars, alongside cultivar-specific and family-enriched genes. Core genes exhibit higher gene expression and proteomic evidence, higher-confidence protein domains, and AlphaFold structures, whereas cultivar-specific genes are enriched for domains under selective breeding pressure, such as for disease resistance. We identify more than 5000 genes that are absent from the IRGSP rice reference genome but present in at least two other Oryza cultivars. We demonstrate the utility of this resource through various examples of pangenes and their protein domains. This resource, integrated into public databases, enables researchers to explore genetic and functional diversity via a population-aware "reference guide" across rice genomes, advancing both basic and applied research.
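
The distinction between core and cultivar-specific pangenes can be sketched from a presence/absence table. The example below is only an illustration under stated assumptions: the input layout (pangene ID mapped to the cultivars that carry it), the function name, and the "dispensable" label for everything in between are hypothetical and are not taken from the PanOryza resource.

```python
def classify_pangenes(occupancy, n_cultivars):
    """Label each pangene by how many cultivars carry it.

    occupancy maps a pangene ID to an iterable of cultivar names (hypothetical
    layout). A pangene is "core" if present in all n_cultivars,
    "cultivar-specific" if present in exactly one, and "dispensable" otherwise.
    """
    classes = {}
    for pangene, cultivars in occupancy.items():
        n_present = len(set(cultivars))  # ignore duplicate cultivar entries
        if n_present == 0:
            raise ValueError(f"{pangene} is not present in any cultivar")
        if n_present == n_cultivars:
            classes[pangene] = "core"
        elif n_present == 1:
            classes[pangene] = "cultivar-specific"
        else:
            classes[pangene] = "dispensable"
    return classes
```

With a 16-cultivar panel such as the RPRP, `n_cultivars` would be 16. Real pangene assignment also depends on how orthologous gene models are clustered across assemblies, which this sketch takes as given.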

Genome Research, pp. 226-238. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12758395/pdf/
Citations: 0
Epigenetic and evolutionary features of ape subterminal heterochromatin.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.280987.125
DongAhn Yoo, Katherine M Munson, Evan E Eichler

Many African great ape chromosomes possess large subterminal heterochromatic caps at their telomeres that are conspicuously absent from the human lineage. Leveraging the complete sequences of great ape genomes, we characterize the organization of subterminal caps and reconstruct the evolutionary history of these regions in chimpanzees and gorillas. Detailed analyses of the composition of the associated terminal 32 bp satellite array from chimpanzee (termed pCht) and intervening segmental duplication (SD) spacers confirm two independent origins in the Pan and gorilla lineages. In chimpanzee and bonobo, we estimate these structures emerged ∼7.7 million years ago (MYA) in contrast to gorilla, in which they expanded more recently, ∼5.0 MYA, and now make up 8.5% of the total gorilla genome. In both lineages, the SD spacers punctuating the pCht heterochromatic satellite arrays correspond to pockets of decreased methylation, although in gorilla such regions are significantly less methylated (P < 2.2 × 10⁻¹⁶) than in chimpanzee or bonobo. Allelic pairs of subterminal caps show a higher degree of sequence divergence than euchromatic sequences, with bonobo showing less divergent haplotypes and less differentially methylated spacers. In contrast, we identify virtually identical subterminal caps mapping to nonhomologous chromosomes within a species, suggesting ectopic recombination potentially mediated by SD spacers. We find that the transition regions from heterochromatic subterminal caps to euchromatin are enriched for structural variant insertions and lineage-specific duplicated genes. Our findings suggest independent evolution of subterminal caps converging on a common genetic and epigenetic structure that promoted ectopic exchange as well as the emergence of novel genes at transition regions between euchromatin and heterochromatin.

Citations: 0
The superpowers of imprinting control regions.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-05 DOI: 10.1101/gr.281215.125
Bertille Montibus, Franck Court, Philippe Arnaud

Genomic imprinting is a specialized mechanism of transcriptional regulation whereby approximately 200 mammalian genes are expressed monoallelically according to their parental origin. This crucial developmental process is primarily controlled by discrete cis-regulatory elements known as imprinting control regions (ICRs), which play essential roles in directing allele-specific gene expression across large imprinted domains. In this review, we highlight the features that define ICRs as a distinct class of cis-regulatory regions, from their ability to maintain germline-inherited DNA methylation to their multifunctional roles in transcriptional control. For each imprinted domain, we examine the diverse mechanisms by which individual ICRs integrate multiple regulatory functions to coordinate both proximal and distal imprinted gene expression. By uncovering the multifaceted roles of ICRs, this review provides a compelling framework for understanding, more broadly, the molecular basis of finely controlled gene expression.

Citations: 0
Interpretable phenotype decoding from multicondition sequencing data with ALPINE.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-12-03 DOI: 10.1101/gr.280566.125
Wei-Hao Lee, Lechuan Li, Ruth Dannenfelser, Vicky Yao

As sequencing techniques advance in precision, affordability, and diversity, an abundance of heterogeneous sequencing data has become available, encompassing a wide range of phenotypic features and biological perturbations. Unfortunately, increased resolution comes with the cost of increased complexity of the biological search space, even at the individual study level, as perturbations are now often examined across many dimensions simultaneously, including different donor phenotypes, anatomical regions and cell types, and time points. Furthermore, broad integration across studies promises a unique opportunity to explore the molecular underpinnings of distinct healthy and disease states, larger than the original scope of the individual study. To fully realize the promise of both individual higher resolution studies and large cross-study integrations, we need a robust methodology that can disentangle the influence of technical and nonrelevant phenotypic factors, isolating relevant condition-specific signals from shared biological information while also providing interpretable insights into the genetic effects of these conditions. Current methods typically excel in only one of these areas. To address this gap, we have developed ALPINE, a supervised nonnegative matrix factorization (NMF) framework that effectively separates both technical and nontechnical factors while simultaneously offering direct interpretability of condition-associated genes. Through simulations across four different scenarios, we demonstrate that ALPINE outperforms existing methods in both isolating the effect of different phenotypic conditions and prioritizing condition-associated genes. Furthermore, ALPINE has favorable performance in batch effect removal compared with state-of-the-art integration methods. When applied to real-world case studies, we showcase how ALPINE can be used to extract insights into the biological mechanisms that underlie differences between phenotypic conditions.
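For readers unfamiliar with the factorization that ALPINE builds on, the following is a minimal sketch of plain, unsupervised NMF using standard multiplicative updates. It is not ALPINE's algorithm: the supervision terms described in the abstract are omitted, and the function name `nmf`, the toy matrix, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix X (features x samples) as W @ H.

    Uses multiplicative updates for the squared Frobenius error.
    Illustrative only; no supervision or covariates are modeled.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, rank)) + eps   # feature loadings, >= 0
    H = rng.random((rank, n_samples)) + eps    # sample embeddings, >= 0
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative and does not
        # increase the reconstruction error.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: 20 "genes" x 12 "samples" with exact rank 3.
toy_rng = np.random.default_rng(1)
X = toy_rng.random((20, 3)) @ toy_rng.random((3, 12))

W, H = nmf(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because every entry of W is nonnegative, the largest loadings in a column can be read directly as the features that drive that component, which is the kind of interpretability the abstract refers to.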

Citations: 0
BayesRVAT enhances rare-variant association testing through Bayesian aggregation of functional annotations.
IF 5.5 Zone 2 (Biology) Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2025-12-03 DOI: 10.1101/gr.280689.125
Antonio Nappi, Liubov Shilova, Theofanis Karaletsos, Na Cai, Francesco Paolo Casale

Gene-level rare variant association tests (RVATs) are essential for uncovering disease mechanisms and identifying therapeutic targets. Advances in sequence-based machine learning have generated diverse variant pathogenicity scores, creating opportunities to improve RVATs. However, existing methods often rely on rigid models or single annotations, limiting their ability to leverage these advances. Here, we introduce BayesRVAT, a Bayesian rare variant association test that jointly models multiple annotations. By specifying priors on annotation effects and estimating gene- and trait-specific posterior burden scores, BayesRVAT flexibly captures diverse rare-variant architectures. In simulations, BayesRVAT improves power while maintaining calibration. In UK Biobank analyses, it detects 10.2% more blood-trait associations and reveals novel gene-disease links, including PRPH2 with retinal disease. Integrating BayesRVAT within omnibus frameworks further increases discoveries, demonstrating that flexible annotation modeling captures complementary signals beyond existing burden and variance-component tests.
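As a rough illustration of what an annotation-weighted burden score is, the sketch below collapses each individual's rare-variant genotypes for one gene into a single number, using fixed annotation weights. This is a generic burden construction, not BayesRVAT itself: per the abstract, BayesRVAT places priors on the annotation effects and estimates posterior burden scores, which this example does not attempt. The function name `burden_scores` and all toy values are hypothetical.

```python
import numpy as np

def burden_scores(genotypes, annotations, weights):
    """Annotation-weighted burden score per individual for one gene.

    genotypes:   (n_individuals, n_variants) minor-allele counts
    annotations: (n_variants, n_annotations) per-variant annotation values
    weights:     (n_annotations,) fixed annotation effects (assumed known
                 here; a Bayesian method would infer them instead)
    """
    variant_weights = annotations @ weights   # one weight per variant
    return genotypes @ variant_weights        # one score per individual

# Hypothetical toy gene: 3 individuals, 3 rare variants, 2 annotations.
G = np.array([[1, 0, 0],
              [0, 2, 0],
              [0, 0, 1]])
A = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.0, 0.0]])
w = np.array([2.0, 1.0])

scores = burden_scores(G, A, w)
print(scores)
```

The resulting per-individual scores would then typically be tested for association with a trait, for example by regression; that step is left out to keep the sketch short.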

Citations: 0