HGG Advances: Latest Articles

Epigenome-wide association study meta-analysis of BMI in African Americans.
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-12-04; DOI: 10.1016/j.xhgg.2025.100552
Kendra Ferrier, Mariaelisa Graff, Iain R Konigsberg, Maggie Stanislawski, Heather M Highland, Laura M Raffield, April P Carson, Eric Boerwinkle, Jill M Norris, Chris R Gignoux, Audrey E Hendricks, Sridharan Raghavan, Kari E North, Kristin L Young, Anne E Justice, Matthew A Allison, Mathew J Budoff, Silva Kasela, François Aguet, Joshua J Joseph, Charles Kooperberg, Stephen S Rich, Jerome I Rotter, Ethan M Lange, Leslie A Lange

Despite considerable advances in identifying risk factors for obesity, gaps remain in our understanding of its etiology. Genetic variants explain only a small portion of variation in obesity-related traits such as body mass index (BMI). Epigenetic regulation, which controls gene expression and is influenced by environmental and genetic factors, may account for additional variability in BMI. Epigenetic studies of BMI have largely been conducted in European ancestry populations, despite the disproportionate burden of obesity in African Americans (AAs). We conducted a sex-stratified BMI epigenome-wide association study meta-analysis in AA participants from the Jackson Heart Study (n = 1,604) and the Multi-Ethnic Study of Atherosclerosis (n = 179) with Illumina EPIC (850,000) array data. Linear regression models with methylation as the outcome and continuous BMI as the predictor were stratified by study and sex and meta-analyzed. We identified 208 methylation sites (CpGs, p < 8.72 × 10⁻⁸) significantly associated with BMI; 151 had not been previously reported in the literature. Replication was performed in a separate sample of AA participants with 450,000 array data, which lacks many CpGs present in the 850,000 array. Replication testing was possible for only 29 of the 151 CpGs; 19 were statistically significant (p < 1.72 × 10⁻³). Sex-specific results showed 4 female-only and 3 male-only BMI-CpGs not identified in the sex-combined results. Differentially methylated region (DMR) analysis identified 66 DMRs, including several regions near genes previously implicated in obesity (e.g., SOCS3, TGFB1). Further analyses showed enrichment of genes and traits related to the immune system and inflammation-related pathways (e.g., the IL-6/JAK/STAT pathway).
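The abstract above describes per-stratum regressions that are then meta-analyzed and judged against multiple-testing thresholds. The sketch below is only an illustration of that kind of workflow: it assumes a fixed-effect inverse-variance-weighted meta-analysis and a Bonferroni correction at alpha = 0.05, neither of which the abstract states explicitly. The function names and the example numbers are hypothetical. One observation that may help readers: 0.05 / 29 is approximately 1.72 × 10⁻³, which matches the reported replication threshold for the 29 testable CpGs.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one CpG.

    betas/ses: per-stratum effect estimates and standard errors
    (e.g., one entry per study-by-sex stratum). Assumed method, not
    confirmed by the abstract.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical per-stratum results for a single CpG (illustrative only).
beta, se, p = inverse_variance_meta(
    betas=[0.0012, 0.0010, 0.0015, 0.0009],
    ses=[0.0003, 0.0004, 0.0009, 0.0011],
)
replication_cutoff = bonferroni_threshold(0.05, 29)
print(f"pooled beta={beta:.5f}, se={se:.5f}, p={p:.3g}")
print(f"replication cutoff={replication_cutoff:.2e}")
```

Inverse-variance weighting gives larger strata (smaller standard errors) more influence on the pooled estimate, which is one common reason it is chosen when strata differ greatly in size, as the two cohorts here do.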

Two commonly reported incidental variants in OTC are associated with late-onset disease.
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-10-16; DOI: 10.1016/j.xhgg.2025.100531
Steven H Lang, Russell S Lo, Gareth A Cromie, Aimée M Dudley, Nicholas Ah Mew, Kara Simpson, Vernon Reid Sutton, Sandra Darilek, Saima Ali, Matthew T Snyder, Brendan Lee, Ronit Marom, Sandesh C S Nagamani, Lindsay C Burrage

Asymptomatic individuals with pathogenic variants in OTC, the gene encoding ornithine transcarbamylase, are increasingly being identified through cascade testing, carrier screening, or as secondary findings from genome-wide sequencing tests. However, guidance for counseling and management of such individuals is currently lacking. We selected two common OTC variants for phenotypic and functional characterization: NM_000531.6:c.118C>T p.(Arg40Cys) and NM_000531.6:c.1061T>G p.(Phe354Cys). The former is the most frequently reported pathogenic/likely pathogenic missense variant present in gnomAD, and the latter has been frequently encountered in our clinical practice. We performed a retrospective chart review at our center, queried the database of the Urea Cycle Disorders Consortium, and performed a literature review to create cohorts of individuals with these variants. Functional studies were pursued using a validated yeast-based assay. We identified 14 individuals (6 females, 8 males) with the p.(Arg40Cys) variant and 14 individuals (5 females, 9 males) with the p.(Phe354Cys) variant. There were no reported episodes of neonatal hyperammonemia in males and no hyperammonemic events reported in females with either variant. In our functional assay, both variants reduced yeast growth to the hypomorphic range. Our findings support the classification of both p.(Arg40Cys) and p.(Phe354Cys) variants in OTC as hypomorphic variants that are typically associated with late-onset OTC deficiency (OTCD) in males.

Multi-trait genome-wide analysis identified risk loci and candidate drugs for heart failure.
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-10-29; DOI: 10.1016/j.xhgg.2025.100540
Zhengyang Yu, Maohuan Lin, Zhanyu Liang, Bozhen Ren, Ying Yang, Xiaoling Lin, Huiling Liu, Yangxin Chen, Kaida Ning, Li C Xia

Heart failure (HF) is a common cardiovascular syndrome that poses significant morbidity and mortality risks. While genome-wide association studies reporting on HF abound, its genetic etiology remains poorly elucidated, primarily due to its inherent polygenic nature. Furthermore, these genetic insights have not been fully leveraged to develop effective primary treatment strategies for HF. In this study, we conducted a large-scale integrated multi-trait analysis using European ancestry genome-wide association study summary statistics of coronary artery disease and HF, involving nearly 2 million samples, to identify risk loci associated with HF. Seventy-two loci were identified for HF using MTAG, of which 58 were supported in the replication phase. Transcriptome association analysis revealed 215 HF risk genes, including EDNRA and FURIN. Pathway enrichment analysis of risk genes revealed their enrichment in pathways closely related to HF, such as response to endogenous stimulus (adjusted p = 8.83 × 10⁻³), phosphate-containing compound metabolic process (adjusted p = 1.91 × 10⁻²), myofibroblast differentiation (adjusted p = 4.26 × 10⁻²), and regulation of muscle adaptation (adjusted p = 4.96 × 10⁻²). Single-cell analysis indicated significant enrichment of these genes in smooth muscle cells, fibroblasts of cardiac tissue, and cardiac endothelial cells. Additionally, our analysis of HF risk genes identified 81 potential drugs for further pharmacological evaluation. These findings provide insights into the genetic determinants of HF, highlighting MTAG-identified genetic loci as potential interventional targets for HF treatment, with significant implications for public health and clinical practice.
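The pathway enrichment step mentioned above can be illustrated with a one-sided hypergeometric (over-representation) test. This is a generic sketch, not the authors' pipeline: the abstract does not say which enrichment test or p-value adjustment was used, and the gene counts in the example are invented for illustration.

```python
from math import comb


def hypergeometric_enrichment_p(n_background, n_in_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random
    from a background of n_background genes, n_in_pathway of which
    belong to the pathway of interest."""
    denominator = comb(n_background, n_selected)
    upper = min(n_in_pathway, n_selected)
    tail = 0
    for i in range(n_overlap, upper + 1):
        # math.comb returns 0 when the second argument exceeds the first,
        # so impossible configurations contribute nothing.
        tail += comb(n_in_pathway, i) * comb(n_background - n_in_pathway, n_selected - i)
    return tail / denominator


# Hypothetical counts: 215 risk genes out of a 20,000-gene background,
# 12 of which fall in a 300-gene pathway.
p_value = hypergeometric_enrichment_p(20000, 300, 215, 12)
print(f"unadjusted enrichment p = {p_value:.3g}")
```

In practice the resulting p-values across many pathways would still need a multiple-testing adjustment before being compared with the adjusted values quoted in the abstract.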

Lack of association between G6PD variants and Parkinson disease.
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-12-09; DOI: 10.1016/j.xhgg.2025.100555
Leah V Chifamba, Sitki Cem Parlar, Lang Liu, Leonard L Sokol, Eric Yu, Farnaz Asayesh, Jamil Ahmad, Jennifer A Ruskey, Dan Spiegelman, Cheryl Waters, Oury Monchi, Yves Dauvilliers, Nicolas Dupré, Alla Timofeeva, Anton Emelyanov, Sofya Pchelina, Irina Miliukhina, Lior Greenbaum, Sharon Hassin-Baer, Roy N Alcalay, Alberto J Espay, Ziv Gan-Or, Konstantin Senkevich

Oxidative stress has been implicated in Parkinson disease (PD). Genes involved in PD, such as PRKN, PINK1, and PARK7, contribute to oxidative stress in dopaminergic neurons. The X-linked G6PD gene encodes glucose 6-phosphate dehydrogenase, an important regulator of oxidative stress. Recent studies suggested that alpha-synuclein aggregates may impair G6PD activity and contribute to dopaminergic neuron loss, and that G6PD mutations may independently increase the risk of PD. In this study, we aimed to examine the role of common and rare G6PD variants in PD across six cohorts, including 8,905 PD cases, 16,770 proxy cases, and 394,098 controls. These cohorts were analyzed after stratification by sex and then combined to account for the X-linked location of G6PD. Using logistic regression, we did not identify significant associations for common variants in any of the cohorts. The optimized sequence kernel association test (SKAT-O) was performed to assess the effect of rare variants (minor allele frequency < 0.01) across the six cohorts, followed by a meta-analysis using metaSKAT, which also demonstrated a lack of association. In conclusion, we did not find evidence for a role for G6PD in PD.
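Because G6PD is X-linked, allele counting differs by sex: females carry two copies and males one. The sketch below shows one way a minor allele frequency could be computed for an X-linked variant and compared with the 0.01 rare-variant cutoff from the abstract. The genotype coding (0/1/2 for females, 0/1 for males) and the function names are assumptions made for illustration, not details from the study.

```python
def x_linked_maf(female_genotypes, male_genotypes):
    """Minor allele frequency for an X-linked variant.

    female_genotypes: alternate-allele counts coded 0, 1, or 2.
    male_genotypes: alternate-allele counts coded 0 or 1 (hemizygous).
    """
    alt_alleles = sum(female_genotypes) + sum(male_genotypes)
    total_alleles = 2 * len(female_genotypes) + len(male_genotypes)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_freq = alt_alleles / total_alleles
    return min(alt_freq, 1.0 - alt_freq)


def is_rare(maf, cutoff=0.01):
    """Apply the rare-variant cutoff quoted in the abstract (MAF < 0.01)."""
    return maf < cutoff


# Toy example: 4 females and 2 males (illustrative genotypes only).
maf = x_linked_maf([0, 1, 2, 0], [0, 1])
print(maf, is_rare(maf))
```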

A systematic assessment of large language models' knowledge of rare diseases: How much do large language models know about rare disease?
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-12-11; DOI: 10.1016/j.xhgg.2025.100558
Tudor Groza, Allison J Marcello, Tristan Carlisle, Weng Khong Lim, Melissa Haendel, Neerja Karnani, Peter N Robinson, Holm Graessner, Jessica X Chong, Gareth Baynam, Saumya Shekhar Jamuar

Large language models (LLMs) perform well on general medical benchmarks, but their ability to reason about rare diseases (RDs) remains unclear. Rather than challenge LLMs to diagnose a limited number of cases that are unlikely to represent all RDs or RD-associated genes, we instead sought to comprehensively probe LLM understanding of RD-associated genes and phenotypes. We systematically evaluated six leading general-domain LLMs (GPT-4, Claude 3.7, Llama-3.3 70B, Gemma-2 27B, Llama-3.2, and Phi-4) for their ability to generate core phenotypic features and causal genes required to support reasoning for 10,892 Orphanet diseases. Outputs were mapped to Human Phenotype Ontology (HPO) terms and HGNC gene symbols and compared with curated references using set overlap, semantic similarity, and disease ranking via the likelihood ratio interpretation of clinical abnormality (LIRICAL) framework applied to 8,000 patient Phenopackets. LLM recall of curated RD knowledge was generally low, with gene associations retrieved more accurately than phenotypes. Commercial models, particularly GPT-4 and Claude, achieved over 60% recall for gene associations but struggled with precise phenotype recovery. Despite low exact overlaps, moderate semantic similarity scores indicated partial alignment with curated data. When used in LIRICAL, LLM-derived phenotypic profiles yielded ranking performance close to that of gold standard profiles, although direct diagnostic accuracy remained limited. Interestingly, convergent non-curated terms across models suggest potential for hypothesis generation. Current generalist LLMs lack the precision to replace curated RD knowledge bases but offer complementary, semantically relevant information. Our results support hybrid approaches that combine expert curation with selectively integrated LLM outputs to enhance and scale ontology-driven RD diagnostics.
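The set-overlap comparison described above (LLM-generated terms versus curated references) can be sketched with ordinary set arithmetic. The metric choice (precision, recall, Jaccard index) and the placeholder term identifiers are assumptions for illustration; the abstract reports recall but does not spell out the exact formulas used.

```python
def overlap_metrics(predicted_terms, curated_terms):
    """Compare a model-generated term set with a curated reference set."""
    predicted = set(predicted_terms)
    curated = set(curated_terms)
    shared = predicted & curated
    union = predicted | curated
    return {
        # Fraction of generated terms that appear in the curated set.
        "precision": len(shared) / len(predicted) if predicted else 0.0,
        # Fraction of curated terms that the model recovered.
        "recall": len(shared) / len(curated) if curated else 0.0,
        # Size of the intersection relative to the union.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers standing in for ontology terms or gene symbols.
metrics = overlap_metrics({"T1", "T2", "T3", "T4"}, {"T1", "T2", "T5"})
print(metrics)
```

Exact-match overlap of this kind penalizes near misses (for example, a parent term instead of a child term), which is one reason the abstract also reports semantic similarity.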

GrafAnc: Reliable and reproducible inference of continental and regional population structure.
IF 3.6; Q2 (Genetics & Heredity); Pub Date: 2026-01-15; Epub Date: 2025-10-13; DOI: 10.1016/j.xhgg.2025.100530
Yumi Jin, Hui Wang, Adam C Naj, Li-San Wang, Wan-Ping Lee

Accurate inference of genetic ancestry is a fundamental step in population genetics, disease association studies, and understanding human history. However, most existing tools, whether model-based or model-free, are limited by dataset-specific characteristics, which restrict reproducibility and hinder cross-study comparisons. Additionally, these tools often struggle to resolve fine-scale population structure, requiring multiple processing steps, such as sample subsetting and repeated program execution. These practices introduce bias and reduce replicability, particularly in evolutionary and migration studies. We present GrafAnc, a robust tool for inferring ancestry at both continental and subcontinental levels without requiring dataset partitioning, iterative processing, or manual sample curation. Building upon and extending GRAF-pop, GrafAnc infers an individual's ancestry background by comparing genotypes with allele frequencies from 26 reference populations compiled from publicly available databases. The current version of GrafAnc generates 18 ancestry scores per individual and classifies individuals into 8 continental and 38 subcontinental ancestry groups, including Middle East and North Africa. These scores are invariant to the specific composition of the study dataset and can be used directly as continuous covariates or for ancestry group assignments. GrafAnc enables seamless integration of population structure across studies and datasets, facilitating consistent interpretation in large-scale genomics. We benchmark GrafAnc using the 1000 Genomes Project, UK Biobank, and Human Genome Diversity Project datasets, demonstrating its accuracy and robustness across diverse ancestries and genotyping platforms. GrafAnc is implemented in C++ with multithreading support and is freely available at https://github.com/jimmy-penn/grafanc.
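The general idea of comparing an individual's genotypes with reference allele frequencies can be sketched as follows. This is a generic textbook-style illustration, not GrafAnc's actual scoring method: it assumes Hardy-Weinberg genotype probabilities and independent SNPs, and the population labels and frequencies are made up.

```python
import math


def genotype_log_likelihood(genotypes, allele_freqs, eps=1e-6):
    """Log-likelihood of genotypes (0/1/2 alternate-allele counts) given
    per-SNP alternate-allele frequencies, assuming Hardy-Weinberg
    proportions and independence between SNPs (illustrative assumptions)."""
    total = 0.0
    for g, p in zip(genotypes, allele_freqs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from zero
        if g == 0:
            prob = (1.0 - p) ** 2
        elif g == 1:
            prob = 2.0 * p * (1.0 - p)
        else:
            prob = p ** 2
        total += math.log(prob)
    return total


def best_matching_population(genotypes, reference_freqs):
    """Return the reference population label with the highest log-likelihood."""
    return max(
        reference_freqs,
        key=lambda pop: genotype_log_likelihood(genotypes, reference_freqs[pop]),
    )


# Hypothetical reference panel with three SNPs and two populations.
reference = {"pop_A": [0.9, 0.1, 0.8], "pop_B": [0.1, 0.9, 0.2]}
print(best_matching_population([2, 0, 2], reference))
```

Because the score depends only on the individual's genotypes and fixed reference frequencies, it does not change when other samples are added to or removed from the study dataset, which is the property the abstract emphasizes.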
{"title":"GrafAnc: Reliable and reproducible inference of continental and regional population structure.","authors":"Yumi Jin, Hui Wang, Adam C Naj, Li-San Wang, Wan-Ping Lee","doi":"10.1016/j.xhgg.2025.100530","DOIUrl":"10.1016/j.xhgg.2025.100530","url":null,"abstract":"<p><p>Accurate inference of genetic ancestry is a fundamental step in population genetics, disease association studies, and understanding human history. However, most existing tools, whether model-based or model-free, are limited by dataset-specific characteristics, which restrict reproducibility and hinder cross-study comparisons. Additionally, these tools often struggle to resolve fine-scale population structure, requiring multiple processing steps, such as sample subsetting and repeated program execution. These practices introduce bias and reduce replicability, particularly in evolutionary and migration studies. We present GrafAnc, a robust tool for inferring ancestry at both continental and subcontinental levels without requiring dataset partitioning, iterative processing, or manual sample curation. Building upon and extending GRAF-pop, GrafAnc infers an individual's ancestry background by comparing genotypes with allele frequencies from 26 reference populations compiled from publicly available databases. The current version of GrafAnc generates 18 ancestry scores per individual and classifies individuals into 8 continental and 38 subcontinental ancestry groups, including Middle East and North Africa. These scores are invariant to the specific composition of the study dataset and can be used directly as continuous covariates or for ancestry group assignments. GrafAnc enables seamless integration of population structure across studies and datasets, facilitating consistent interpretation in large-scale genomics. 
We benchmark GrafAnc using the 1000 Genomes Project, UK Biobank, and Human Genome Diversity Project datasets, demonstrating its accuracy and robustness across diverse ancestries and genotyping platforms. GrafAnc is implemented in C++ with multithreading support and is freely available.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":" ","pages":"100530"},"PeriodicalIF":3.6,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145293957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fine mapping regulatory variants by characterizing native CpG methylation with nanopore long-read sequencing.
IF 3.6 Q2 GENETICS & HEREDITY Pub Date : 2026-01-15 Epub Date: 2025-10-17 DOI: 10.1016/j.xhgg.2025.100532
Yijun Tian, Shannon K McDonnell, Lang Wu, Nicholas B Larson, Liang Wang

5-Methylcytosine (5mC) is the most common DNA modification in the human genome. Bisulfite conversion combined with short-read sequencing captures this modification at single-nucleotide resolution but introduces PCR duplication bias and limits co-methylation analysis between distant cytosines. To address these limitations, we used nanopore long-read sequencing to profile human methylation and performed long-range co-methylation analysis with native DNA modification information. We analyzed nanopore demo data from adaptive sampling sequencing targeting CpG islands and applied the linkage disequilibrium (LD) R² statistic to identify methylation haplotype blocks (MHBs). We found that the cancer genome exhibited significantly smaller MHBs, higher CpG density, and lower methylation LD R² values than normal cells. Additionally, we demonstrated the superiority of long-read sequencing over short-read sequencing in capturing large MHBs. By profiling the methylation changes near JASPAR motifs and actual chromatin immunoprecipitation sequencing (ChIP-seq) peaks, we also studied the epigenetic changes related to protein binding. Using adaptive sampling, we conducted nanopore sequencing targeting regions with methylation quantitative trait loci (mQTLs) and genome-wide association study (GWAS) risk variants in the 22Rv1 cell line. We then inspected the closest haplotype-specific methylated region near each variant and identified allele-specific methylated regions with allele-specific accessibility signals in the ATAC-seq data. This study demonstrates the feasibility of nanopore sequencing for methylome profiling while preserving haplotype information, offering an innovative approach to elucidate the epigenetic changes driven by noncoding variants in the human genome.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12642126/pdf/
Citations: 0
A national biobank framework for rare diseases: Standardized infrastructure and cross-institutional collaboration accelerating translational innovation in China.
IF 3.6 Q2 GENETICS & HEREDITY Pub Date : 2026-01-15 Epub Date: 2025-11-11 DOI: 10.1016/j.xhgg.2025.100544
Weida Liu, Ye Jin, Yaran Zhang, Anqi Wang, Yiying Chen, Fangyuan Li, Kun Zhao, Ruirui He, Dan Guo, Shuyang Zhang

Rare diseases (RDs) collectively affect more than 400 million people worldwide, but fragmented infrastructure and biospecimen scarcity impede progress. China's heterogeneous healthcare landscape magnifies these challenges. Since 2016, the Peking Union Medical College Hospital (PUMCH) RD Biobank has pioneered a scalable model integrating 446 institutions via national networks. In 2019 it set up a quality management system based on ISO 20387:2018, which received accreditation in 2022. Our secure digital platform standardizes biospecimen protocols (acquisition/processing/storage) and enables ethical data/specimen sharing through auditable Material Transfer Agreements/Data Use Agreements. Among 49,759 enrolled patients, 73.13% were diagnosed and 26.87% were undiagnosed, with pediatric cases (50.75%) and males (52.39%) predominating. Phenotypic analysis showed 78.51% single-system versus 21.20% multisystem involvement. Top diagnoses included congenital scoliosis and progressive muscular dystrophy. Specimen diversity revealed system-specific patterns: musculoskeletal/nervous/sensory systems were linked to multiple specimen types; immune/genitourinary to fluids; cardiovascular/neoplastic to derivatives; and endocrine uniquely to tissues. Nucleic acids (93.4%) and blood specimens (21.4%) formed core resources, while induced pluripotent stem cells/organoids, prioritized for cardiovascular/neoplastic RDs, enable functional validation. This framework transcends biospecimen fragmentation by uniting clinical, molecular, and institutional dimensions. It demonstrates how centralized governance and interoperable systems can accelerate RD research globally. By transforming isolated data into collaborative discovery engines, we provide a blueprint for converting RD challenges into precision diagnostics and therapies, which are urgently needed for the millions of individuals worldwide who remain undiagnosed.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12686901/pdf/
Citations: 0
A non-coding ABO regulatory variant associated with VWF levels, thrombosis risk, and COVID-19 severity is topologically linked to ADAMTS13 in endothelial cells.
IF 3.6 Q2 GENETICS & HEREDITY Pub Date : 2026-01-15 Epub Date: 2025-11-27 DOI: 10.1016/j.xhgg.2025.100550
Douglas Victorino Esposito, Hellen Ferreira de Souza Sobrinho, Marcelo Rocha Marques

Venous thromboembolism (VTE) is a major cause of mortality, influenced by genetic and environmental factors. von Willebrand factor (VWF) mediates hemostasis by promoting platelet adhesion, and its plasma levels are associated with thrombotic risk. Although many non-coding variants in ABO are associated with VWF levels, VTE risk, and COVID-19 severity, the mechanisms underlying these associations remain unclear. In this study, we identified the ABO locus as the genomic region with the highest concentration of variants associated with VWF levels. Chromatin conformation analyses in endothelial cells revealed that non-coding ABO variants (rs657152, rs9411377, rs660340, and rs505922) associated with VWF levels, VTE risk, and COVID-19 severity are located in spatial proximity to ADAMTS13. ADAMTS13 is a key regulator of VWF activity, and both ADAMTS13 and VWF play crucial roles in coagulation and thrombosis. Chromatin activation (CRISPRa) of the region near the non-coding ABO variant rs657152 increased ADAMTS13 transcription in endothelial cells, suggesting that this variant resides in a regulatory region with the potential to modulate long-range transcriptional control of ADAMTS13. A luciferase assay revealed reduced transcriptional activity driven by the rs505922-C allele in endothelial cells. These findings provide insights into the spatial organization of the ABO locus and its potential role in ADAMTS13 regulation.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12765440/pdf/
Citations: 0
PATJ deficiency leads to cystic kidney disease and related ciliopathies.
IF 3.6 Q2 GENETICS & HEREDITY Pub Date : 2026-01-15 Epub Date: 2025-09-09 DOI: 10.1016/j.xhgg.2025.100514
Daniel Epting, Daniela A Braun, Eva Decker, Elisabeth Ott, Tobias Eisenberger, Nadine Bachmann, Pavel Nedvetsky, Michael P Krahn, Friedhelm Hildebrandt, Carsten Bergmann

Cystic kidney disease and related ciliopathies are caused by pathogenic variants in genes that commonly result in ciliary dysfunction. For a substantial number of individuals affected by these cilia-related diseases, the causative gene remains unknown. Using massively parallel sequencing, we identified a pathogenic bi-allelic variant in the gene encoding PALS1-associated tight junction protein (PATJ, also known as inactivation-no-afterpotential D-like, INADL) in an individual with ciliopathy. The affected fetus carried the homozygous truncating PATJ nonsense variant c.830delC (p.Pro277fsX) and presented with a syndromic phenotype mainly characterized by polycystic kidney disease and hydrocephalus. Using zebrafish (Danio rerio) as a vertebrate in vivo model organism, we validated our patient findings and demonstrated a ciliopathy phenotype. In addition, we were able to address a previously undescribed role of Patj in cilia formation and function. Taken together, with the Crumbs cell polarity complex member PATJ, we add a new member to the large family of ciliopathy-related human disease proteins, one that differs from the classical ciliopathy protein classes and may offer new perspectives for drug development.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12512994/pdf/
Citations: 0