
Human Genetics: Latest Articles

Bayesian network-based Mendelian randomization for variant prioritization and phenotypic causal inference.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-10-01 Epub Date: 2024-02-21 DOI: 10.1007/s00439-024-02640-x
Jianle Sun, Jie Zhou, Yuqiao Gong, Chongchen Pang, Yanran Ma, Jian Zhao, Zhangsheng Yu, Yue Zhang

Mendelian randomization is a powerful method for inferring causal relationships. However, obtaining suitable genetic instrumental variables is often challenging due to gene interaction, linkage, and pleiotropy. We propose Bayesian network-based Mendelian randomization (BNMR), a Bayesian causal learning and inference framework using individual-level data. BNMR employs the random graph forest, an ensemble Bayesian network structural learning process, to prioritize candidate genetic variants and select appropriate instrumental variables, and then obtains a pleiotropy-robust estimate by incorporating a shrinkage prior in the Bayesian framework. Simulations demonstrate BNMR can efficiently reduce the false-positive discoveries in variant selection, and outperforms existing MR methods in terms of accuracy and statistical power in effect estimation. With application to the UK Biobank, BNMR exhibits its capacity in handling modern genomic data, and reveals the causal relationships from hematological traits to blood pressures and psychiatric disorders. Its effectiveness in handling complex genetic structures and modern genomic data highlights the potential to facilitate real-world evidence studies, making it a promising tool for advancing our understanding of causal mechanisms.
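To make concrete what a Mendelian randomization estimate is, and why pleiotropy is a problem, here is a minimal sketch of the standard inverse-variance weighted (IVW) estimator on summary statistics. This is not BNMR, which works on individual-level data with Bayesian network structure learning and a shrinkage prior. The function name and all numbers are hypothetical toy values.

```python
def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate: weighted average of per-variant Wald ratios
    (beta_outcome / beta_exposure) with weights beta_exposure**2 / se_outcome**2."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den


# Hypothetical toy instruments: every variant acts on the outcome only
# through the exposure, with a true causal effect of 0.5.
bx = [0.10, 0.20, 0.40]   # variant -> exposure effects
by = [0.05, 0.10, 0.20]   # variant -> outcome effects
se = [0.01, 0.02, 0.02]   # standard errors of the outcome effects
valid = ivw_estimate(bx, by, se)

# Same instruments, but the third variant also has a direct (pleiotropic)
# effect of 0.10 on the outcome, which biases the estimate upwards.
by_pleiotropic = [0.05, 0.10, 0.30]
biased = ivw_estimate(bx, by_pleiotropic, se)

print(f"valid instruments: {valid:.3f}, with one pleiotropic variant: {biased:.3f}")
```

A single invalid instrument moves the toy estimate from 0.5 to about 0.667, which is the kind of bias that instrument selection and pleiotropy-robust estimation aim to limit.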

Human Genetics, pp. 1081-1094.
Citations: 0
Genetic evidence for T-wave area from 12-lead electrocardiograms to monitor cardiovascular diseases in patients taking diabetes medications.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-10-01 Epub Date: 2024-03-20 DOI: 10.1007/s00439-024-02661-6
Mengling Qi, Haoyang Zhang, Xuehao Xiu, Dan He, David N Cooper, Yuanhao Yang, Huiying Zhao

Aims: Many studies have indicated that the use of diabetes medications can influence the electrocardiogram (ECG), which remains the simplest and fastest tool for assessing cardiac function. However, few studies have explored the role of genetic factors in determining the relationship between the use of diabetes medications and ECG trace characteristics (ETCs). Methods: Genome-wide association studies (GWAS) were performed for 168 ETCs extracted from the 12-lead ECGs of 42,340 Europeans in the UK Biobank. The genetic correlations, causal relationships, and phenotypic relationships of these ETCs with medication usage, as well as with the risk of cardiovascular diseases (CVDs), were estimated by linkage disequilibrium score regression (LDSC), Mendelian randomization (MR), and regression models, respectively. Results: The GWAS identified 124 independent single nucleotide polymorphisms (SNPs) that were study-wise and genome-wide significantly associated with at least one ETC. Regression models and LDSC identified significant phenotypic and genetic correlations of T-wave area in lead aVR (aVR_T-area) with usage of diabetes medications (ATC code: A10 drugs, and metformin), and with the risks of ischemic heart disease (IHD) and coronary atherosclerosis (CA). MR analyses support a putative causal effect of the use of diabetes medications on decreasing aVR_T-area, and on increasing risk of IHD and CA. Conclusion: Patients taking diabetes medications are prone to have decreased aVR_T-area and an increased risk of IHD and CA. The aVR_T-area is therefore a potential ECG marker for pre-clinical prediction of IHD and CA in patients taking diabetes medications.

Human Genetics, pp. 1095-1108.
Citations: 0
The crucial prognostic signaling pathways of pancreatic ductal adenocarcinoma were identified by single-cell and bulk RNA sequencing data.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-10-01 Epub Date: 2024-03-25 DOI: 10.1007/s00439-024-02663-4
Wenwen Wang, Guo Chen, Wenli Zhang, Xihua Zhang, Manli Huang, Chen Li, Ling Wang, Zifan Lu, Jielai Xia

Pancreatic ductal adenocarcinoma (PDAC) is a malignant tumor with poor prognosis and high mortality. Although many studies have explored potential prognostic markers using traditional RNA sequencing (RNA-Seq) data, they have not achieved good predictive performance. To explore the signaling pathways that may underlie differences in prognosis, we identified differentially expressed genes from one scRNA-seq cohort and four GEO cohorts. Cox and Lasso regression analyses then showed that 12 genes were independent prognostic factors for PDAC. AUC and calibration curve analysis showed that the prognostic model had good discrimination and calibration. The high-risk group had a higher proportion of gene mutations than the low-risk group. Immune infiltration analysis revealed differences in macrophages and monocytes between the two groups. Prognosis-related genes were mainly distributed in fibroblasts, macrophages and type 2 ductal cells. Cell communication analysis showed strong communication between cancer-associated fibroblasts (CAFs) and type 2 ductal cells, with collagen formation as the main interaction pathway.

Human Genetics, pp. 1109-1129. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11485037/pdf/
Citations: 0
Identification of TACSTD2 as novel therapeutic targets for cisplatin-induced acute kidney injury by multi-omics data integration.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-10-01 Epub Date: 2024-02-18 DOI: 10.1007/s00439-024-02641-w
Zebin Deng, Zheng Dong, Yinhuai Wang, Yingbo Dai, Jiachen Liu, Fei Deng

Cisplatin-induced acute kidney injury (CP-AKI) is a common complication in cancer patients. Although ferroptosis is believed to contribute to the progression of CP-AKI, its mechanisms remain incompletely understood. In this study, after initially processing individual omics datasets, we integrated multi-omics data to construct a ferroptosis network in the kidney, resulting in the identification of the key driver TACSTD2. In vitro and in vivo results showed that TACSTD2 was notably upregulated in cisplatin-treated kidneys and BUMPT cells. Overexpression of TACSTD2 accelerated ferroptosis, while its gene disruption decelerated ferroptosis, likely mediated by its potential downstream targets HMGB1, IRF6, and LCN2. Drug prediction and molecular docking further suggested that drugs targeting TACSTD2, such as parthenolide, progesterone, premarin, estradiol and rosiglitazone, may have therapeutic potential in CP-AKI. Our findings suggest a significant association between ferroptosis and the development of CP-AKI, with TACSTD2 playing a crucial role in modulating ferroptosis, which provides novel perspectives on the pathogenesis and treatment of CP-AKI.

Human Genetics, pp. 1061-1080.
Citations: 0
Fine mapping of candidate effector genes for heart rate.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-10-01 Epub Date: 2024-07-06 DOI: 10.1007/s00439-024-02684-z
Julia Ramírez, Stefan van Duijvenboden, William J Young, Yutang Chen, Tania Usman, Michele Orini, Pier D Lambiase, Andrew Tinker, Christopher G Bell, Andrew P Morris, Patricia B Munroe

An elevated resting heart rate (RHR) is associated with increased cardiovascular mortality. Genome-wide association studies (GWAS) have identified > 350 loci. Uniquely, in this study we applied genetic fine-mapping leveraging tissue-specific chromatin segmentation and colocalization analyses to identify causal variants and candidate effector genes for RHR. We used RHR GWAS summary statistics from 388,237 individuals of European ancestry from UK Biobank and performed fine mapping using publicly available genomic annotation datasets. High-confidence causal variants (accounting for > 75% posterior probability) were identified, and we collated candidate effector genes using a multi-omics approach that combined evidence from colocalisation with molecular quantitative trait loci (QTLs) and long-range chromatin interaction analyses. Finally, we performed druggability analyses to investigate drug repurposing opportunities. The fine mapping pipeline indicated 442 distinct RHR signals. For 90 signals, a single variant was identified as a high-confidence causal variant, of which 22 were annotated as missense. In trait-relevant tissues, 39 signals colocalised with cis-expression QTLs (eQTLs), 3 with cis-protein QTLs (pQTLs), and 75 had promoter interactions via Hi-C. In total, 262 candidate genes were highlighted (79% had promoter interactions, 15% had a colocalised eQTL, 8% had a missense variant and 1% had a colocalised pQTL), and, for the first time, enrichment in nervous system pathways was observed. Druggability analyses highlighted ACHE, CALCRL, MYT1 and TDP1 as potential targets. Our genetic fine-mapping pipeline prioritised 262 candidate genes for RHR that warrant further investigation in functional studies, and we provide potential therapeutic targets to reduce RHR and cardiovascular mortality.
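The "> 75% posterior probability" criterion can be read as a simple filter over per-signal posterior probabilities. The sketch below shows one such filter, assuming the probabilities have already been produced by a fine-mapping tool. The signal names, variant names, and values are hypothetical, and the exact rule the authors applied may differ.

```python
POSTERIOR_THRESHOLD = 0.75  # cut-off quoted in the abstract


def high_confidence_variants(signals, threshold=POSTERIOR_THRESHOLD):
    """For each signal, return its top variant if that single variant
    carries more than `threshold` of the posterior probability."""
    selected = {}
    for signal, posteriors in signals.items():
        top_variant, top_prob = max(posteriors.items(), key=lambda kv: kv[1])
        if top_prob > threshold:
            selected[signal] = top_variant
    return selected


# Hypothetical posterior probabilities for two fine-mapped signals.
toy_signals = {
    "signal_1": {"var_a": 0.92, "var_b": 0.05, "var_c": 0.03},
    "signal_2": {"var_d": 0.40, "var_e": 0.35, "var_f": 0.25},
}
selected = high_confidence_variants(toy_signals)
print(selected)
```

In this toy input only `signal_1` resolves to a single variant. `signal_2` spreads its probability across three variants and is left unresolved.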

Human Genetics, pp. 1207-1221. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11485034/pdf/
Citations: 0
A methodology for gene level omics-WAS integration identifies genes influencing traits associated with cardiovascular risks: the Long Life Family Study
IF 5.3 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-09-14 DOI: 10.1007/s00439-024-02701-1
Sandeep Acharya, Shu Liao, Wooseok J. Jung, Yu S. Kang, Vaha Akbary Moghaddam, Mary F. Feitosa, Mary K. Wojczynski, Shiow Lin, Jason A. Anema, Karen Schwander, Jeff O. Connell, Michael A. Province, Michael R. Brent

The Long Life Family Study (LLFS) enrolled 4953 participants in 539 pedigrees displaying exceptional longevity. To identify genetic mechanisms that affect cardiovascular risks in the LLFS population, we developed a multi-omics integration pipeline and applied it to 11 traits associated with cardiovascular risks. Using our pipeline, we aggregated gene-level statistics from rare-variant analysis, GWAS, and gene expression-trait association by Correlated Meta-Analysis (CMA). Across all traits, CMA identified 64 significant genes after Bonferroni correction (p ≤ 2.8 × 10⁻⁷), 29 of which replicated in the Framingham Heart Study (FHS) cohort. Notably, 20 of the 29 replicated genes do not have a previously known trait-associated variant in the GWAS Catalog within 50 kb. Thirteen modules in Protein–Protein Interaction (PPI) networks are significantly enriched in genes with low meta-analysis p-values for at least one trait, three of which are replicated in the FHS cohort. The functional annotation of genes in these modules showed a significant over-representation of trait-related biological processes including sterol transport, protein-lipid complex remodeling, and immune response regulation. Among major findings, our results suggest a role of triglyceride-associated and mast-cell functional genes FCER1A, MS4A2, GATA2, HDC, and HRH4 in atherosclerosis risks. Our findings also suggest that lower expression of ATG2A, a gene we found to be associated with BMI, may be both a cause and consequence of obesity. Finally, our results suggest that ENPP3 may play an intermediary role in triglyceride-induced inflammation. Our pipeline is freely available and implemented in the Nextflow workflow language, making it easily runnable on any compute platform (https://nf-co.re/omicsgenetraitassociation).
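A Bonferroni threshold is the family-wise error rate divided by the number of tests. The abstract reports only the resulting threshold (2.8 × 10⁻⁷), so the sketch below assumes a family-wise rate of 0.05 and back-calculates the implied number of gene-trait tests. That rate, the gene names, and the p-values are assumptions for illustration, not figures from the study.

```python
ALPHA = 0.05                  # ASSUMPTION: family-wise error rate, not stated in the abstract
REPORTED_THRESHOLD = 2.8e-7   # threshold quoted in the abstract


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def significant_genes(p_values, alpha, n_tests):
    """Keep the genes whose p-value is at or below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return {gene: p for gene, p in p_values.items() if p <= threshold}


# Number of tests implied by the reported threshold, under the assumed ALPHA.
implied_tests = round(ALPHA / REPORTED_THRESHOLD)

# Hypothetical gene-level meta-analysis p-values.
toy_p_values = {"GENE_A": 1e-8, "GENE_B": 5e-7, "GENE_C": 0.03}
hits = significant_genes(toy_p_values, ALPHA, implied_tests)

print(implied_tests, hits)
```

Under these assumptions the threshold corresponds to roughly 178,571 tests, and only the hypothetical `GENE_A` clears it.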

Citations: 0
Structure-informed protein language models are robust predictors for variant effects.
IF 3.8 Zone 2 (Biology) Q2 GENETICS & HEREDITY Pub Date: 2024-08-08 DOI: 10.1007/s00439-024-02695-w
Yuanfei Sun, Yang Shen

As emerging variant effect predictors, protein language models (pLMs) learn the evolutionary distribution of functional sequences to capture the fitness landscape. Considering that variant effects are manifested through biological contexts beyond sequence (such as structure), we first assess how much structural context is learned by sequence-only pLMs and how it affects variant effect prediction. We then establish a need to inject protein structural context into pLMs purposefully and controllably. We thus introduce a framework of structure-informed pLMs (SI-pLMs), extending masked sequence denoising to cross-modality denoising for both sequence and structure. Numerical results on deep mutagenesis scanning benchmarks show that our SI-pLMs, even when using smaller models and less data, are robustly top performers against competing methods including other pLMs, which shows that introducing biological context can be more effective at capturing the fitness landscape than simply using larger models or more data. Case studies reveal that, compared to sequence-only pLMs, SI-pLMs can be better at capturing the fitness landscape because (a) learned embeddings of low/high-fitness sequences can be more separable and (b) learned amino-acid distributions of functionally and evolutionarily conserved residues can be of much lower entropy, thus much more conserved, than those of other residues. Our SI-pLMs are applicable to revising any sequence-only pLMs through model architecture and training objectives. They do not require structure data as model inputs for variant effect prediction and only use structures as context provider and model regularizer during training.

作为新兴的变异效应预测工具,蛋白质语言模型(pLMs)通过学习功能序列的进化分布来捕捉适应性景观。考虑到变异效应是通过序列之外的生物背景(如结构)表现出来的,我们首先评估了纯序列 pLMs 学习到的结构背景对变异效应预测的影响程度。我们认为有必要有目的、可控地将蛋白质结构背景注入 pLM。因此,我们引入了结构信息 pLMs(SI-pLMs)框架,将屏蔽序列去噪扩展到序列和结构的跨模态去噪。对深度诱变扫描基准的数值结果表明,即使使用较小的模型和较少的数据,我们的SI-pLMs在与包括其他pLMs在内的竞争方法的竞争中也能稳健地名列前茅。案例研究表明,与纯序列 pLMs 相比,SI-pLMs 可以更好地捕捉适配性景观,这是因为:(a)低/高适配性序列的学习嵌入更容易分离;(b)功能和进化保守残基的学习氨基酸分布的熵值可能比其他残基低得多,因此保守性也更高。通过模型结构和训练目标,我们的 SI-pLMs 适用于修正任何纯序列 pLMs。它们不需要结构数据作为变异效应预测的模型输入,在训练过程中只使用结构作为上下文提供者和模型规整器。
Citations: 0
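The SI-pLM abstract above compares the entropy of learned amino-acid distributions at conserved and non-conserved residues. As a rough illustration of that quantity only (not the authors' code), the sketch below computes the Shannon entropy of a per-position distribution over the 20 amino acids. The function name and the two toy distributions are assumptions made up for this example.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution (illustrative helper).

    The input is renormalised to sum to 1; zero-probability entries
    contribute nothing to the sum.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:
            entropy -= q * math.log(q, base)
    return entropy


# Hypothetical per-position distributions over the 20 amino acids.
# A variable position: every residue is equally likely.
variable_position = [1 / 20] * 20
# A conserved position: one residue dominates.
conserved_position = [0.96] + [0.04 / 19] * 19

print(shannon_entropy(variable_position))   # maximum possible, log2(20) bits
print(shannon_entropy(conserved_position))  # much lower entropy
```

A lower value means the model concentrates probability on fewer residues at that position, which is the sense in which the abstract equates low entropy with conservation.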
Assessing predictions on fitness effects of missense variants in HMBS in CAGI6.
IF 3.8 Zone 2 Biology Q2 GENETICS & HEREDITY Pub Date: 2024-08-07 DOI: 10.1007/s00439-024-02680-3
Jing Zhang, Lisa Kinch, Panagiotis Katsonis, Olivier Lichtarge, Milind Jagota, Yun S Song, Yuanfei Sun, Yang Shen, Nurdan Kuru, Onur Dereli, Ogun Adebali, Muttaqi Ahmad Alladin, Debnath Pal, Emidio Capriotti, Maria Paola Turina, Castrense Savojardo, Pier Luigi Martelli, Giulia Babbi, Rita Casadio, Fabrizio Pucci, Marianne Rooman, Gabriel Cia, Matsvei Tsishyn, Alexey Strokach, Zhiqiang Hu, Warren van Loggerenberg, Frederick P Roth, Predrag Radivojac, Steven E Brenner, Qian Cong, Nick V Grishin

This paper presents an evaluation of predictions submitted for the "HMBS" challenge, a component of the sixth round of the Critical Assessment of Genome Interpretation held in 2021. The challenge required participants to predict the effects of missense variants of the human HMBS gene on yeast growth. The HMBS enzyme, critical for the biosynthesis of heme in eukaryotic cells, is highly conserved among eukaryotes. Despite the application of a variety of algorithms and methods, the performance of predictors was relatively similar, with Kendall's tau correlation coefficients between predictions and experimental scores around 0.3 for a majority of submissions. Notably, the median correlation (≥ 0.34) observed among these predictors, especially the top predictions from different groups, was greater than the correlation observed between their predictions and the actual experimental results. Most predictors were moderately successful in distinguishing between deleterious and benign variants, as evidenced by an area under the receiver operating characteristic (ROC) curve (AUC) of approximately 0.7. Compared with the two most recent rounds of CAGI competitions, we noticed that more predictors outperformed the baseline predictor, which is based solely on amino acid frequencies. Nevertheless, the overall accuracy of predictions is still far short of the positive control, which is derived from experimental scores, indicating the necessity for considerable improvements in the field. The most inaccurately predicted variants in this round were associated with the insertion loop, which is absent in many orthologs, suggesting that the predictors still rely heavily on information from multiple sequence alignments.

Citations: 0
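The CAGI6 HMBS abstract above reports two metrics: Kendall's tau between predicted and experimental scores, and the area under the ROC curve for separating deleterious from benign variants. The sketch below shows how both can be computed from scratch. It is not the assessors' code: the abstract does not say which tau variant was used, so this assumes tau-a (tied pairs count as neither concordant nor discordant), and the scores, labels, and the cut-off implied by them are invented toy values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: predicted vs. measured fitness for five variants.
predicted = [0.1, 0.4, 0.35, 0.8, 0.7]
measured = [0.2, 0.3, 0.5, 0.9, 0.6]
# Hypothetical labels: True where the measured fitness counts as functional.
is_functional = [False, False, False, True, True]

print(kendall_tau_a(predicted, measured))  # 0.8 (9 concordant, 1 discordant pair)
print(roc_auc(predicted, is_functional))   # 1.0 (every functional variant ranks higher)
```

The pairwise loops are quadratic in the number of variants, which is fine for illustration; for real benchmark sizes, library routines such as `scipy.stats.kendalltau` and `sklearn.metrics.roc_auc_score` are the usual choice.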
Scalable approaches for generating, validating and incorporating data from high-throughput functional assays to improve clinical variant classification.
IF 3.8 Zone 2 Biology Q2 GENETICS & HEREDITY Pub Date: 2024-08-01 DOI: 10.1007/s00439-024-02691-0
Samskruthi Reddy Padigepati, David A Stafford, Christopher A Tan, Melanie R Silvis, Kirsty Jamieson, Andrew Keyser, Paola Alejandra Correa Nunez, John M Nicoludis, Toby Manders, Laure Fresard, Yuya Kobayashi, Carlos L Araya, Swaroop Aradhya, Britt Johnson, Keith Nykamp, Jason A Reuter

As the adoption and scope of genetic testing continue to expand, interpreting the clinical significance of DNA sequence variants at scale remains a formidable challenge, with a high proportion classified as variants of uncertain significance (VUSs). Genetic testing laboratories have historically relied, in part, on functional data from academic literature to support variant classification. High-throughput functional assays or multiplex assays of variant effect (MAVEs), designed to assess the effects of DNA variants on protein stability and function, represent an important and increasingly available source of evidence for variant classification, but their potential is just beginning to be realized in clinical lab settings. Here, we describe a framework for generating, validating and incorporating data from MAVEs into a semi-quantitative variant classification method applied to clinical genetic testing. Using single-cell gene expression measurements, cellular evidence models were built to assess the effects of DNA variation in 44 genes of clinical interest. This framework was also applied to models for an additional 22 genes with previously published MAVE datasets. In total, modeling data was incorporated from 24 genes into our variant classification method. These data contributed evidence for classifying 4043 observed variants in over 57,000 individuals. Genetic testing laboratories are uniquely positioned to generate, analyze, validate, and incorporate evidence from high-throughput functional data and ultimately enable the use of these data to provide definitive clinical variant classifications for more patients.

Pages: 995-1004. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11303574/pdf/
Citations: 0
The missing link: ARID1B non-truncating variants causing Coffin-Siris syndrome due to protein aggregation.
IF 3.8 Zone 2 Biology Q2 GENETICS & HEREDITY Pub Date: 2024-08-01 Epub Date: 2024-07-19 DOI: 10.1007/s00439-024-02688-9
Elisabeth Bosch, Esther Güse, Philipp Kirchner, Andreas Winterpacht, Mona Walther, Marielle Alders, Jennifer Kerkhof, Arif B Ekici, Heinrich Sticht, Bekim Sadikovic, André Reis, Georgia Vasileiou

ARID1B is the most frequently mutated gene in Coffin-Siris syndrome (CSS). To date, the vast majority of causative variants reported in ARID1B are truncating, leading to nonsense-mediated mRNA decay. In the absence of experimental data, only few ARID1B amino acid substitutions have been classified as pathogenic, mainly based on clinical data and their de novo occurrence, while most others are currently interpreted as variants of unknown significance. The present study substantiates the pathogenesis of ARID1B non-truncating/NMD-escaping variants located in the SMARCA4-interacting EHD2 and DNA-binding ARID domains. Overexpression assays in cell lines revealed that the majority of EHD2 variants lead to protein misfolding and formation of cytoplasmic aggresomes surrounded by vimentin cage-like structures and co-localizing with the microtubule organisation center. ARID domain variants exhibited not only aggresomes, but also nuclear aggregates, demonstrating robust pathological effects. Protein levels were not compromised, as shown by quantitative western blot analysis. In silico structural analysis predicted the exposure of amylogenic segments in both domains due to the nearby variants, likely causing this aggregation. Genome-wide transcriptome and methylation analysis in affected individuals revealed expression and methylome patterns consistent with those of the pathogenic haploinsufficiency ARID1B alterations in CSS cases. These results further support pathogenicity and indicate two approaches for disambiguation of such variants in everyday practice. The few affected individuals harbouring EHD2 non-truncating variants described to date exhibit mild CSS clinical traits. In summary, this study paves the way for the re-evaluation of previously unclear ARID1B non-truncating variants and opens a new era in CSS genetic diagnosis.

Pages: 965-978. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11303441/pdf/
Citations: 0