Pub Date: 2024-05-08 | Epub Date: 2024-04-24 | DOI: 10.1016/j.xgen.2024.100542
Matthew Ginley-Hidinger, Hosiana Abewe, Kyle Osborne, Alexandra Richey, Noel Kitchen, Katelyn L Mortenson, Erin M Wissink, John Lis, Xiaoyang Zhang, Jason Gertz
Cis-regulatory elements control transcription levels, temporal dynamics, and cell-cell variation or transcriptional noise. However, the combination of regulatory features that control these different attributes is not fully understood. Here, we used single-cell RNA-seq during an estrogen treatment time course and machine learning to identify predictors of expression timing and noise. We found that genes with multiple active enhancers exhibit faster temporal responses. We verified this finding by showing that manipulation of enhancer activity changes the temporal response of estrogen target genes. Analysis of transcriptional noise uncovered a relationship between promoter and enhancer activity, with active promoters associated with low noise and active enhancers linked to high noise. Finally, we observed that co-expression across single cells is an emergent property associated with chromatin looping, timing, and noise. Overall, our results indicate a fundamental tradeoff between a gene's ability to quickly respond to incoming signals and maintain low variation across cells.
Title: Cis-regulatory control of transcriptional timing and noise in response to estrogen. | Journal: Cell Genomics, article 100542 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099348/pdf/
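The abstract above does not say how transcriptional noise was quantified. As a rough illustration only, the sketch below computes two common cell-to-cell variability measures, the squared coefficient of variation (CV²) and the Fano factor, from per-cell counts. The function name `noise_metrics` and the count vectors are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def noise_metrics(counts):
    """Return (mean, CV^2, Fano factor) for one gene's per-cell counts."""
    mu = mean(counts)
    if mu == 0:
        raise ValueError("gene is not expressed in any cell")
    var = pvariance(counts, mu)  # population variance across cells
    return mu, var / mu**2, var / mu


# Hypothetical per-cell counts for two genes with the same mean expression.
steady_gene = [4, 5, 5, 6, 5, 5, 4, 6]
bursty_gene = [0, 0, 12, 0, 15, 0, 13, 0]

for name, counts in [("steady", steady_gene), ("bursty", bursty_gene)]:
    mu, cv2, fano = noise_metrics(counts)
    print(f"{name}: mean={mu:.2f} CV^2={cv2:.3f} Fano={fano:.3f}")
```

Both toy genes have the same mean, so any difference in CV² or Fano factor reflects variation across cells rather than expression level. That separation between level and noise is what the abstract draws on when it contrasts promoter and enhancer activity.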
Pub Date: 2024-05-08 | Epub Date: 2024-04-30 | DOI: 10.1016/j.xgen.2024.100544
Robert F Hillary, Hong Kiat Ng, Daniel L McCartney, Hannah R Elliott, Rosie M Walker, Archie Campbell, Felicia Huang, Kenan Direk, Paul Welsh, Naveed Sattar, Janie Corley, Caroline Hayward, Andrew M McIntosh, Cathie Sudlow, Kathryn L Evans, Simon R Cox, John C Chambers, Marie Loh, Caroline L Relton, Riccardo E Marioni, Paul D Yousefi, Matthew Suderman
Chronic inflammation is a hallmark of age-related disease states. The effectiveness of inflammatory proteins including C-reactive protein (CRP) in assessing long-term inflammation is hindered by their phasic nature. DNA methylation (DNAm) signatures of CRP may act as more reliable markers of chronic inflammation. We show that inter-individual differences in DNAm capture 50% of the variance in circulating CRP (N = 17,936, Generation Scotland). We develop a series of DNAm predictors of CRP using state-of-the-art algorithms. An elastic-net-regression-based predictor outperformed competing methods and explained 18% of phenotypic variance in the Lothian Birth Cohort of 1936 (LBC1936), double the variance explained by existing DNAm predictors. DNAm predictors performed comparably in four additional test cohorts (Avon Longitudinal Study of Parents and Children, Health for Life in Singapore, Southall and Brent Revisited, and LBC1921), including for individuals of diverse genetic ancestry and different age groups. The best-performing predictor surpassed assay-measured CRP and a genetic score in its associations with 26 health outcomes. Our findings forge new avenues for assessing chronic low-grade inflammation in diverse populations.
Title: Blood-based epigenome-wide analyses of chronic low-grade inflammation across diverse population cohorts. | Journal: Cell Genomics, article 100544 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099341/pdf/
Pub Date: 2024-05-08 | DOI: 10.1016/j.xgen.2024.100558
Yin Wang, Ying Wai Chan
In this issue of Cell Genomics, Wang, Liu, Zuo, Wang, et al.¹ investigate rare variants in hepatocellular carcinoma (HCC) by performing the first rare-variant association study (RVAS) in a Chinese population cohort. The study uncovers BRCAness phenotypes associated with the NRDE2-p.N377I variant, suggesting PARP inhibitors as a promising therapeutic approach for certain HCC patients.
Title: Rare-variant association study unveils the Achilles' heel for HCC. | Journal: Cell Genomics 4(5), article 100558 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099380/pdf/
The complex pathobiology of late-onset Alzheimer's disease (AD) poses significant challenges to therapeutic and preventative interventions. Despite these difficulties, genomics and related disciplines are allowing fundamental mechanistic insights to emerge with clarity, particularly with the introduction of high-resolution sequencing technologies. After all, the disrupted processes at the interface between DNA and gene expression, which we call the broken AD genome, offer detailed quantitative evidence unrestrained by preconceived notions about the disease. In addition to highlighting biological pathways beyond the classical pathology hallmarks, these advances have revitalized drug discovery efforts and are driving improvements in clinical tools. We review genetic, epigenomic, and gene expression findings related to AD pathogenesis and explore how their integration enables a better understanding of the multicellular imbalances contributing to this heterogeneous condition. The frontiers opening on the back of these research milestones promise a future of AD care that is both more personalized and predictive.
Title: The broken Alzheimer's disease genome. | Authors: Cláudio Gouveia Roque, Hemali Phatnani, Ulrich Hengst | Pub Date: 2024-05-08 | DOI: 10.1016/j.xgen.2024.100555 | Journal: Cell Genomics, article 100555 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099344/pdf/
To identify novel susceptibility genes for hepatocellular carcinoma (HCC), we performed a rare-variant association study in Chinese populations consisting of 2,750 cases and 4,153 controls. We identified four HCC-associated genes, including NRDE2, RANBP17, RTEL1, and STEAP3. Using NRDE2 (index rs199890497 [p.N377I], p = 1.19 × 10⁻⁹) as an exemplary candidate, we demonstrated that it promotes homologous recombination (HR) repair and suppresses HCC. Mechanistically, NRDE2 binds to the subunits of casein kinase 2 (CK2) and facilitates the assembly and activity of the CK2 holoenzyme. This NRDE2-mediated enhancement of CK2 activity increases the phosphorylation of MDC1 and then facilitates the HR repair. These functions are eliminated almost completely by the NRDE2-p.N377I variant, which sensitizes the HCC cells to poly(ADP-ribose) polymerase (PARP) inhibitors, especially when combined with chemotherapy. Collectively, our findings highlight the relevance of the rare variants to genetic susceptibility to HCC, which would be helpful for the precise treatment of this malignancy.
Title: NRDE2 deficiency impairs homologous recombination repair and sensitizes hepatocellular carcinoma to PARP inhibitors. | Authors: Yahui Wang, Xinyi Liu, Xianbo Zuo, Cuiling Wang, Zheng Zhang, Haitao Zhang, Tao Zeng, Shunqi Chen, Mengyu Liu, Hongxia Chen, Qingfeng Song, Qi Li, Chenning Yang, Yi Le, Jinliang Xing, Hongxin Zhang, Jiaze An, Weihua Jia, Longli Kang, Hongxing Zhang, Hui Xie, Jiazhou Ye, Tianzhun Wu, Fuchu He, Xuejun Zhang, Yuanfeng Li, Gangqiao Zhou | Pub Date: 2024-05-08 | DOI: 10.1016/j.xgen.2024.100550 | Journal: Cell Genomics, article 100550 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099347/pdf/
Pub Date: 2024-05-08 | Epub Date: 2024-05-01 | DOI: 10.1016/j.xgen.2024.100554
Roshni A Patel, Rachel A Ungar, Alanna L Pyke, Alvina Adimoelja, Meenakshi Chakraborty, Daniel J Cotter, Malika Freund, Pagé Goddard, Justin Gomez-Stafford, Emily Greenwald, Emily Higgs, Naiomi Hunter, Tim M G MacKenzie, Anjali Narain, Tamara Gjorgjieva, Daphne O Martschenko
Despite the profound impacts of scientific research, few scientists have received the necessary training to productively discuss the ethical and societal implications of their work. To address this critical gap, we, a group of predominantly human genetics trainees, developed a course on genetics, ethics, and society. We intend for this course to serve as a template for other institutions and scientific disciplines. Our curriculum positions human genetics within its historical and societal context and encourages students to evaluate how societal norms and structures impact the conduct of scientific research. We demonstrate the utility of this course via surveys of enrolled students and provide resources and strategies for others hoping to teach a similar course. We conclude by arguing that if we are to work toward rectifying the inequities and injustices produced by our field, we must first learn to view our own research as impacting and being impacted by society.
Title: Increasing equity in science requires better ethics training: A course by trainees, for trainees. | Journal: Cell Genomics, article 100554 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099339/pdf/
Pub Date: 2024-05-08 | Epub Date: 2024-05-01 | DOI: 10.1016/j.xgen.2024.100556
Sheridan H Littleton, Khanh B Trang, Christina M Volpe, Kieona Cook, Nicole DeBruyne, Jean Ann Maguire, Mary Ann Weidekamp, Kenyaita M Hodge, Keith Boehm, Sumei Lu, Alessandra Chesi, Jonathan P Bradfield, James A Pippin, Stewart A Anderson, Andrew D Wells, Matthew C Pahl, Struan F A Grant
The chr12q13 locus is among the most significant childhood obesity loci identified in genome-wide association studies. This locus resides in a non-coding region within FAIM2; thus, the underlying causal variant(s) presumably influence disease susceptibility via cis-regulation. We implicated rs7132908 as a putative causal variant by leveraging our in-house 3D genomic data and public domain datasets. Using a luciferase reporter assay, we observed allele-specific cis-regulatory activity of the immediate region harboring rs7132908. We generated isogenic human embryonic stem cell lines homozygous for either rs7132908 allele to assess changes in gene expression and chromatin accessibility throughout differentiation to hypothalamic neurons, a key cell type known to regulate feeding behavior. The rs7132908 obesity risk allele influenced expression of FAIM2 and other genes and decreased the proportion of neurons produced by differentiation. We have functionally validated rs7132908 as a causal obesity variant that temporally regulates nearby effector genes and influences neurodevelopment and survival.
Title: Variant-to-function analysis of the childhood obesity chr12q13 locus implicates rs7132908 as a causal variant within the 3' UTR of FAIM2. | Journal: Cell Genomics, article 100556 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099382/pdf/
Pub Date: 2024-05-08 | Epub Date: 2024-04-29 | DOI: 10.1016/j.xgen.2024.100553
Yicheng Gao, Kejing Dong, Yuli Gao, Xuan Jin, Jingya Yang, Gang Yan, Qi Liu
Single-cell RNA sequencing (scRNA-seq) and T cell receptor sequencing (TCR-seq) are pivotal for investigating T cell heterogeneity. Integrating these modalities is expected to uncover insights in immunology that might go unnoticed with a single modality, but it faces computational challenges due to the low-resource characteristics of the multimodal data. Herein, we present UniTCR, a novel low-resource-aware multimodal representation learning framework designed for unified cross-modality integration, enabling comprehensive T cell analysis. By designing a dual-modality contrastive learning module and a single-modality preservation module to effectively embed each modality into a common latent space, UniTCR demonstrates versatility in connecting TCR sequences with T cell transcriptomes across various tasks, including single-modality analysis, modality gap analysis, epitope-TCR binding prediction, and TCR profile cross-modality generation, in a low-resource-aware way. Extensive evaluations conducted on multiple scRNA-seq/TCR-seq paired datasets showed the superior performance of UniTCR, demonstrating its ability to explore the complexity of the immune system.
Title: Unified cross-modality integration and analysis of T cell receptors and T cell transcriptomes by low-resource-aware representation learning. | Journal: Cell Genomics, article 100553 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11099349/pdf/
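The abstract above mentions a dual-modality contrastive learning module but does not give its loss function. The sketch below is a generic, CLIP-style symmetric InfoNCE objective over paired embeddings, offered only to illustrate what it means to embed two modalities into a common latent space. It is not UniTCR's implementation, and the function names, temperature, and toy embeddings are all assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean cross-entropy of picking the matching positive for each anchor."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(a)


def symmetric_contrastive_loss(rna_emb, tcr_emb, temperature=0.1):
    """Average the RNA-to-TCR and TCR-to-RNA directions."""
    return 0.5 * (info_nce(rna_emb, tcr_emb, temperature)
                  + info_nce(tcr_emb, rna_emb, temperature))


# Hypothetical 2-D embeddings for three cells; row i of each list is the same cell.
rna = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
tcr_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
tcr_shuffled = [tcr_aligned[1], tcr_aligned[2], tcr_aligned[0]]

print("aligned pairs :", symmetric_contrastive_loss(rna, tcr_aligned))
print("shuffled pairs:", symmetric_contrastive_loss(rna, tcr_shuffled))
```

The loss is low when each cell's transcriptome embedding sits closest to its own TCR embedding and high when the pairing is scrambled. Minimizing it is therefore one way to pull matched modalities together in a shared space.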
Pub Date: 2024-04-10 | DOI: 10.1016/j.xgen.2024.100539
Jin Jin, Jianan Zhan, Jingning Zhang, Ruzhang Zhao, Jared O'Connell, Yunxuan Jiang, Steven Buyske, Christopher Gignoux, Christopher Haiman, Eimear E Kenny, Charles Kooperberg, Kari North, Bertram L Koelsch, Genevieve Wojcik, Haoyu Zhang, Nilanjan Chatterjee
Polygenic risk scores (PRSs) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in summary statistics from genome-wide association studies (GWASs) across multiple ancestry groups via Bayesian hierarchical modeling and ensemble learning. In our simulation studies and data analyses across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. For example, MUSSEL has an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared to PRS-CSx and CT-SLEB, respectively, in the African ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, trait architecture, and linkage disequilibrium reference samples; thus, ultimately a combination of methods may be needed to generate the most robust PRSs across diverse populations.
Title: MUSSEL: Enhanced Bayesian polygenic risk prediction leveraging information across multiple ancestry groups. | Journal: Cell Genomics 4(4), article 100539 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11019365/pdf/
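As background for the figures quoted above, a polygenic risk score is conventionally a weighted sum of allele dosages, and a gain in prediction R² can be read as a relative improvement over a baseline method. The sketch below shows both calculations. It is not MUSSEL itself, the weights, dosages, and R² values are hypothetical, and treating the reported percentages as relative gains is an assumption, since the abstract does not define how they were computed.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of each effect allele)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def r_squared(y_true, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


def relative_gain_percent(r2_new, r2_baseline):
    """Relative improvement of r2_new over r2_baseline, in percent."""
    return (r2_new - r2_baseline) / r2_baseline * 100.0


# Hypothetical per-variant effect sizes and genotype dosages for four individuals.
weights = [0.12, -0.05, 0.30]
genotypes = [[2, 1, 0], [0, 2, 1], [1, 0, 2], [2, 2, 2]]
scores = [polygenic_score(g, weights) for g in genotypes]
print("scores:", [round(s, 2) for s in scores])

# Hypothetical R^2 values chosen so the relative gain is about 40.2%.
print("relative gain (%):", round(relative_gain_percent(0.0701, 0.05), 1))
```

Under this reading, a 40.2% gain means the new R² is about 1.4 times the baseline R², not that R² rose by 40.2 percentage points.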
Pub Date: 2024-04-10 | Epub Date: 2024-03-19 | DOI: 10.1016/j.xgen.2024.100523
Buu Truong, Leland E Hull, Yunfeng Ruan, Qin Qin Huang, Whitney Hornsby, Hilary Martin, David A van Heel, Ying Wang, Alicia R Martin, S Hong Lee, Pradeep Natarajan
Polygenic risk scores (PRSs) are an emerging tool to predict the clinical phenotypes and outcomes of individuals. We propose PRSmix, a framework that leverages the PRS corpus of a target trait to improve prediction accuracy, and PRSmix+, which incorporates genetically correlated traits to better capture the human genetic architecture for 47 and 32 diseases/traits in European and South Asian ancestries, respectively. PRSmix demonstrated a mean prediction accuracy improvement of 1.20-fold (95% confidence interval [CI], [1.10; 1.3]; p = 9.17 × 10⁻⁵) and 1.19-fold (95% CI, [1.11; 1.27]; p = 1.92 × 10⁻⁶), and PRSmix+ improved the prediction accuracy by 1.72-fold (95% CI, [1.40; 2.04]; p = 7.58 × 10⁻⁶) and 1.42-fold (95% CI, [1.25; 1.59]; p = 8.01 × 10⁻⁷) in European and South Asian ancestries, respectively. Compared to previous cross-trait-combination methods with scores from pre-defined correlated traits, we demonstrated that our method improved prediction accuracy for coronary artery disease up to 3.27-fold (95% CI, [2.1; 4.44]; p value after false discovery rate (FDR) correction = 2.6 × 10⁻⁴). Our method provides a comprehensive framework to benchmark and leverage the combined power of PRS for maximal performance in a desired target population.
Title: Integrative polygenic risk score improves the prediction accuracy of complex traits and diseases. | Journal: Cell Genomics, article 100523 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11019356/pdf/