
Cell Genomics: latest articles

High-quality mouse reference genomes reveal the structural complexity of the murine protein-coding landscape.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-12-01 DOI: 10.1016/j.xgen.2025.101074
Mohab Helmy, Jin U Li, Xinyu F Yan, Rachel K Meade, Elizabeth Anderson, Patrick B Chen, Anne M Czechanski, Tomás Di Domenico, Jonathan Flint, Erik Garrison, Marco T P Gontijo, Andrea Guarracino, Leanne Haggerty, Edith Heard, Kerstin Howe, Narendra Meena, Fergal J Martin, Eric A Miska, Isabell Rall, Navin B Ramakrishna, Alexandra Sapetschnig, Swati Sinha, Diandian Sun, Francesca F Tricomi, Runjia Qu, Jonathan M D Wood, Tianzhen Wu, Dian J Zhou, Laura Reinholdt, David J Adams, Clare M Smith, Jingtao Lilue, Thomas M Keane

We present a collection of 17 high-quality long-read inbred mouse strain genomes with complete annotation (contig N50s of 0.8-33.9 Mbp). This collection includes 12 widely used classical laboratory strains and 5 wild-derived strains. We have resolved previously incomplete genomic regions, including the major histocompatibility complex (MHC), defensin cluster, T cell receptor, and Ly49 complexes. Hundreds of non-reference genes from previous publications not found in GRCm39, such as Defa1, Raet1a, and Klra20 (Ly49T), were localized in the new reference genomes. We conducted a genome-wide scan of variable number tandem repeats (VNTRs) within the coding regions, identifying over 400 genes with VNTR polymorphisms with up to 600 repeat copies and repeat units reaching 990 nucleotides. Our strain-specific annotations enhance RNA sequencing (RNA-seq) analyses, as demonstrated in PWK/PhJ, where we observed a 5.1% improvement in read mapping and expression-level differences in 2.1% of coding genes compared to using GRCm39.
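The abstract above summarizes assembly contiguity as contig N50. As a reminder of what that statistic measures, here is a minimal sketch in Python. The function name `contig_n50` and the toy lengths are illustrative assumptions and do not come from the paper.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (made-up lengths, not data from the study).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(contig_n50(toy_lengths))
```

A higher N50 means that fewer, longer contigs account for half of the assembly, which is why it is commonly reported as a contiguity measure.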

Citations: 0
Cystic fibrosis risk variants confer protection against inflammatory bowel disease.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-12-01 DOI: 10.1016/j.xgen.2025.101071
Mingrui Yu, Qian Zhang, Kai Yuan, Aleksejs Sazonovs, Christine R Stevens, Laura Fachal, Christopher A Lamb, Carl A Anderson, Mark J Daly, Hailiang Huang

Genetic mutations that yield a defective cystic fibrosis (CF) transmembrane conductance regulator (CFTR) protein cause CF, a life-limiting autosomal-recessive Mendelian disorder. A protective role of CFTR loss-of-function mutations in inflammatory bowel disease (IBD) has been suggested, but the evidence has been inconclusive and contradictory. Here, leveraging a large IBD exome sequencing dataset comprising 38,558 cases and 66,945 controls of European ancestry in the discovery stage and a combined total of 42,475 cases and 192,050 controls across diverse ancestry groups in the replication stage, we established a protective role of CF-risk variants against IBD based on the association test of CFTR deltaF508 (p = 8.96E-11) and the gene-based burden test of CF-risk variants (p = 3.9E-07). Furthermore, we assessed variant prioritization methods, including AlphaMissense, using clinically annotated CF-risk variants as the gold standard. Our findings highlight the critical and unmet need for effective variant prioritization in gene-based burden tests.

Citations: 0
Polygenic scores capture genetic modification of the adiposity-cardiometabolic risk factor relationship.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-25 DOI: 10.1016/j.xgen.2025.101075
Kenneth E Westerman, Julie E Gervis, Luke J O'Connor, Miriam S Udler, Alisa K Manning

Polygenic scores (PGSs) that can predict response to interventions can facilitate precision medicine and are detectable in observational datasets as PGS-by-exposure (PGS×E) interactions. PGSs based on interactions (iPGSs) or variance effects (vPGSs) may be more powerful than standard PGSs for detecting PGS×E, but these have yet to be systematically compared. We describe a generalized pipeline for developing and comparing these PGS types and apply it to detect genetic modification of the relationship between adiposity (measured by BMI) and a broad set of cardiometabolic risk factors. Our applied analysis in the UK Biobank identified significant PGS×BMI for 16/20 risk factors, most consistently for the iPGS approach. Many interactions replicated in All of Us (AoU); for example, we observed a 72% larger BMI-alanine aminotransferase association in the top iPGS decile in AoU. Our study provides a framework for the comparison of PGS×E strategies and informs efforts toward clinically useful response-focused PGSs.
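To make the idea of a PGS-by-exposure interaction concrete, the sketch below compares the slope of an outcome on an exposure (for example BMI) in the top PGS decile against the slope among everyone else. This is only an illustration under stated assumptions: the comparison group, the function names, and the toy data are hypothetical, and the study's actual pipeline is not reproduced here.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def relative_slope_increase(pgs, exposure, outcome, top_fraction=0.1):
    """Relative difference in exposure-outcome slope, top PGS group vs. the rest.

    Returns 0.72 when the slope in the top group is 72% larger.
    Assumption: the reference group is everyone outside the top fraction.
    """
    order = sorted(range(len(pgs)), key=lambda i: pgs[i])
    n_top = max(2, round(len(pgs) * top_fraction))  # need >= 2 points for a slope
    top, rest = order[-n_top:], order[:-n_top]
    slope_top = ols_slope([exposure[i] for i in top], [outcome[i] for i in top])
    slope_rest = ols_slope([exposure[i] for i in rest], [outcome[i] for i in rest])
    return slope_top / slope_rest - 1.0


# Hypothetical toy data: the two highest-PGS individuals have a steeper slope.
toy_pgs = list(range(20))
toy_bmi = [20 + 2 * (i % 5) for i in range(20)]
toy_outcome = [b if i < 18 else 1.72 * b for i, b in enumerate(toy_bmi)]
print(relative_slope_increase(toy_pgs, toy_bmi, toy_outcome))
```

A stratified comparison like this is easy to read but statistically crude; a regression with an explicit PGS-by-exposure product term and covariates would be the more usual way to test the interaction.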

Citations: 0
Integrative profiling of condensation-prone RNAs during early development.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-19 DOI: 10.1016/j.xgen.2025.101065
Tajda Klobučar, Jona Novljan, Ira A Iosub, Boštjan Kokot, Iztok Urbančič, D Marc Jones, Anob M Chakrabarti, Nicholas M Luscombe, Jernej Ule, Miha Modic

Complex RNA-protein networks play a pivotal role in the formation of many types of biomolecular condensates. How RNA features contribute to condensate formation, however, remains incompletely understood. Here, we integrate tailored transcriptomics assays to identify a distinct class of developmental condensation-prone RNAs termed "smOOPs" (semi-extractable, orthogonal-organic-phase-separation-enriched RNAs). These transcripts localize to larger intracellular foci, form denser RNA subnetworks than expected, and are heavily bound by RNA-binding proteins (RBPs). Using an explainable deep learning framework, we reveal that smOOPs harbor characteristic sequence composition, with lower sequence complexity, increased intramolecular folding, and specific RBP-binding patterns. Intriguingly, these RNAs encode proteins bearing extensive intrinsically disordered regions and are highly predicted to be involved in biomolecular condensates, indicating an interplay between RNA- and protein-based features in phase separation. This work advances our understanding of condensation-prone RNAs and provides a versatile resource to further investigate RNA-driven condensation principles.

Citations: 0
Genetic architecture of the murine red blood cell proteome reveals central role of hemoglobin beta cysteine 93 in maintaining redox balance.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-19 DOI: 10.1016/j.xgen.2025.101069
Gregory R Keele, Monika Dzieciatkowska, Ariel M Hay, Matthew Vincent, Callan O'Connor, Daniel Stephenson, Julie A Reisz, Travis Nemkov, Kirk C Hansen, Grier P Page, James C Zimring, Gary A Churchill, Angelo D'Alessandro

Red blood cells (RBCs) transport oxygen but accumulate oxidative damage over time, reducing function in vivo and during storage, critical for transfusions. To explore the genetics of RBC resilience, we profiled proteins, metabolites, and lipids from fresh and stored RBCs from 350 genetically diverse mice. Our analysis identified over 6,000 quantitative trait loci (QTLs). Compared to other tissues, the prevalence of trans genetic effects over cis ones reflects the absence of de novo protein synthesis in anucleated RBCs. QTL hotspots at Hbb, Hba, Mon1a, and (storage-specific) Steap3 linked ferroptosis to hemolysis. Proteasome QTLs clustered at multiple loci, underscoring the importance of degrading oxidized proteins. Post-translational modification (PTM) QTLs mapped predominantly to hemoglobins, including cysteine residues. The loss of reactive C93 in humanized mice (hemoglobin beta [HBB] C93A) disrupted redox balance, glutathione pools, glutathionylation, and redox PTMs. These findings highlight genetic regulation of RBC oxidation, with implications for transfusion biology and oxidative-stress-dependent hemolytic disorders.

Citations: 0
Leveraging protein language models to identify complex trait associations with previously inaccessible classes of functional rare variants.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-19 DOI: 10.1016/j.xgen.2025.101068
Seon-Kyeong Jang, Zitian Wang, Richard Border, Dinh Tuan, Angela Wei, Ulzee An, Sriram Sankararaman, Vasilis Ntranos, Jonathan Flint, Noah Zaitlen

Protein language models (PLMs) improve variant effect predictions, but their role in gene discovery for complex traits remains unclear. We introduce an allelic series-based regression test that uses PLM-derived variant effect predictions as proxies for effect sizes, identifying ∼46% more associations than standard burden tests. Extending this to isoform-level analysis, we find 26 gene-trait pairs with stronger associations in non-canonical versus canonical transcripts, highlighting isoform-specific effects. Finally, we identify evolutionary plausible variants (EPVs), missense variants assigned higher likelihoods than the wild-type alleles by PLMs, representing 0.45% of missense variants. EPVs show higher allele frequencies than synonymous variants, consistent with differential selection pressures, and are linked to nine traits, including protective associations with low-density lipoprotein (LDL) and bone mineral density. Together, our results demonstrate how PLMs can enhance rare-variant interpretation and gene-trait association discovery in exome data.
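The contrast between a standard burden score and a score that weights each variant by a predicted effect can be sketched in a few lines of Python. This is a simplified illustration, not the authors' allelic series test: the function names, the genotype coding (0, 1, or 2 copies of the rare allele), and the weights are all hypothetical.

```python
def burden_score(genotypes):
    """Unweighted burden: total count of rare alleles carried in one gene."""
    return sum(genotypes)


def weighted_score(genotypes, predicted_effects):
    """Weighted burden: each rare allele contributes its predicted effect.

    `predicted_effects` stands in for per-variant scores from a variant
    effect predictor (an assumption for illustration only).
    """
    if len(genotypes) != len(predicted_effects):
        raise ValueError("one predicted effect is required per variant")
    return sum(g * w for g, w in zip(genotypes, predicted_effects))


# Hypothetical individual carrying three rare alleles across three variants.
toy_genotypes = [1, 0, 2]
toy_effects = [0.5, 2.0, 0.25]  # made-up weights, not model output
print(burden_score(toy_genotypes))
print(weighted_score(toy_genotypes, toy_effects))
```

In either case the per-individual score would then be used as a predictor of the trait in a regression; the weighted version lets variants with larger predicted effects contribute more than variants predicted to be nearly neutral.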

Citations: 0
In silico generation of synthetic cancer genomes using generative AI.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-12 Epub Date: 2025-08-12 DOI: 10.1016/j.xgen.2025.100969
Ander Díaz-Navarro, Xindi Zhang, Wei Jiao, Bo Wang, Lincoln Stein

Understanding how genomic alterations drive cancer is key to advancing precision oncology. To detect these alterations, accurate algorithms are used; however, due to privacy concerns, few deeply sequenced cancer genomes can be shared, limiting benchmarking and representing a major obstacle to the improvement of analytic tools. To address this, we developed OncoGAN, a generative AI model combining adversarial networks and variational autoencoders to create realistic synthetic cancer genomes. Trained on large-scale genomic datasets, OncoGAN accurately reproduces somatic mutations, copy number alterations, and structural variants across cancer types while preserving donors' privacy. The synthetic genomes reflect tumor-specific mutational signatures and positional mutation patterns. Using DeepTumour, we validated the synthetic data's fidelity, showing high concordance between generated and predicted tumors. Moreover, augmenting the training data with synthetic genomes improved DeepTumour's accuracy, underscoring OncoGAN's potential to generate shareable datasets with known ground truths for benchmarking and enhancement of cancer genome analysis tools.

Citations: 0
Analog epigenetic memory revealed by targeted chromatin editing.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-12 Epub Date: 2025-09-09 DOI: 10.1016/j.xgen.2025.100985
Sebastian Palacios, Simone Bruno, Ron Weiss, Elia Salibi, Isabella Goodchild-Michelman, Andrew Kane, Katherine Ilia, Domitilla Del Vecchio

Cells store information by means of chromatin modifications that persist through cell divisions and can hold gene expression silenced over generations. However, how these modifications may maintain other gene expression states has remained unclear. This study shows that chromatin modifications can maintain a wide range of gene expression levels over time, thus uncovering analog epigenetic memory. By engineering a genomic reporter and epigenetic effectors, we tracked the gene expression dynamics following targeted perturbations to the chromatin state. We found that distinct grades of DNA methylation led to corresponding, persistent gene expression levels. Altering the DNA methylation grade, in turn, resulted in permanent loss of gene expression memory. Consistent with experiments, our chromatin modification model indicates that analog memory arises when the positive feedback between DNA methylation and repressive histone modifications is lacking. This discovery will lead to a deeper understanding of epigenetic memory and to new tools for synthetic biology.

Citations: 0
The gut's hidden arsenal: A genomics-guided atlas of class II bacteriocins.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-12 DOI: 10.1016/j.xgen.2025.101064
Tianang Leng, Cesar de la Fuente-Nunez

Unmodified class II bacteriocins promise precision antimicrobials that spare bystander microbes. Zhang and colleagues introduce IIBacFinder, a genomics-guided pipeline that detects precursor and context genes with a curated pHMM library, infers leader-peptide cleavage, and triages candidates by meta-omics signals. The authors apply it across bacterial genomes, including an atlas of ∼280,000 human-gut genomes, and recover a vast reservoir of narrow-spectrum peptides and prioritize gut-resident candidates for synthesis. Of the 26 synthesized, 16 display activity in vitro, largely via membrane perturbation and with additive effects alongside vancomycin, while ex vivo assays show minimal compositional disruption of fecal communities compared with antibiotic controls. These results position unmodified class II bacteriocins as tractable, microbiome-sparing agents and illustrate how genome-scale mining coupled to meta-omics can bridge sequence to function in complex ecosystems.

Citations: 0
Phenotypic pleiotropy of missense variants in human B cell confinement receptor P2RY8.
IF 11.1 Q1 CELL BIOLOGY Pub Date : 2025-11-12 Epub Date: 2025-09-09 DOI: 10.1016/j.xgen.2025.100981
Taylor N LaFlam, Christian B Billesbølle, Tuan Dinh, Finn D Wolfreys, Erick Lu, Tomas Matteson, Jinping An, Ying Xu, Arushi Singhal, Nadav Brandes, Vasilis Ntranos, Aashish Manglik, Jason G Cyster, Chun Jimmie Ye

Missense variants can have pleiotropic effects on protein function, and predicting these effects can be difficult. We performed near-saturation deep mutational scanning of P2RY8, a G protein-coupled receptor that promotes germinal center B cell confinement. We assayed the effect of each variant on surface expression, migration, and proliferation. We delineated variants that affected both expression and function, affected function independently of expression, and discrepantly affected migration and proliferation. We also used cryo-electron microscopy to determine the structure of activated, ligand-bound P2RY8, providing structural insights into the effects of variants on ligand binding and signal transmission. We applied the deep mutational scanning results both to improve computational variant effect predictions and to characterize the phenotype of germline variants and lymphoma-associated variants. Together, our results demonstrate the power of integrating deep mutational scanning, structure determination, and in silico prediction to advance the understanding of a receptor important in human health.

Citations: 0