
Genetic Epidemiology: Latest Articles

Adjustment for Genotype Imputation Uncertainty Corrects for Inflated Type I Error in Family-Based Association Testing
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-10-30 DOI: 10.1002/gepi.70021
Tyler R. C. Day, Joshua C. Bis, Nicola Chapman, Alejandro Q. Nato Jr., Andrea R. V. R. Horimoto, Harkirat Sohi, Rafael Nafikov, Elizabeth E. Blue, Mohamad Saad, Ellen M. Wijsman

Genotype imputation is a widely-used data augmentation approach that is applied to samples of related and/or unrelated individuals. Association testing may then be carried out on the complete data with commonly-used methods. This approach has typically not accounted for the mix of observed and imputed data, although recent work has noted the potential for introduction of confounding in case-control studies. In the Alzheimer's Disease Sequencing Project family sample, we found severe inflation of the test statistics in logistic regression analysis following genotype imputation, even after standard covariate adjustments. Here we dissect sources of this inflation, which is driven by three factors: frequency-dependent bias in imputation-induced allele frequencies, differential measurement error, and differential genotyping rates in cases versus controls that introduce confounding. To address the problem, we propose a statistic, imputation deviance ($\mathscr{D}$), which can be easily computed from the observed and imputed genotype probabilities. We show that $\mathscr{D}$, as an additional fixed-effect covariate, controls the genome-wide inflation in analysis of this family-based sample, and we speculate that use of imputation deviance may also provide a practical approach to correct for genotype imputation effects in other settings, particularly when a data set is unbalanced and includes related individuals.

Genetic Epidemiology, Volume 49, Issue 8
Citations: 0
Exploring Random Forest in Genetic Risk Score Construction
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-10-25 DOI: 10.1002/gepi.70022
Vaishnavi Venkat, Kaylyn Clark, X. Jessie Jeng, Tsung-Chieh Yao, Hui-Ju Tsai, Tzu-Pin Lu, Tzu-Hung Hsiao, Ching-Heng Lin, Shannon Holloway, Cathrine Hoyo, Shin-Yi Chou, Hui Wang, Wan-Ping Lee, Li-San Wang, Jung-Ying Tzeng

Genetic risk scores (GRS) are crucial tools for estimating an individual's genetic liability to various traits and diseases, computed as a weighted sum of trait-associated allele counts. Traditionally, GRS models assume additive, linear effects of risk variants. However, complex traits often involve nonadditive interactions, such as epistasis, which are not captured by these conventional methods. In this study, we investigate the use of random forest (RF) models as a model-free approach for constructing GRS, leveraging RF's capacity to capture complex, nonlinear interactions among genetic variants. Specifically, we introduce two new RF-based GRS strategies to boost RF performance and to incorporate base data information if available, including (1) ctRF, which optimizes linkage disequilibrium (LD) clumping and p-value thresholds within RF; and (2) wRF, which adjusts the chance of SNP inclusion in tree nodes based on their association strength. Through simulation studies and real data applications of Alzheimer's disease, body mass index, and atopy, we find that ctRF consistently outperforms other RF-based methods and classical additive models when traits exhibit complex genetic architectures. Additionally, incorporating informative base data into RF-GRS construction can enhance predictive accuracy. Our findings suggest that RF-based GRS can effectively capture intricate genetic interactions, and offer a robust alternative to traditional GRS methods, especially for complex traits with nonlinear genetic effects.
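The classical additive score described above, a weighted sum of trait-associated allele counts, can be sketched in a few lines. This is only an illustration of the conventional baseline, not of the ctRF or wRF methods; the function name, weights, and genotypes are hypothetical.

```python
from typing import Sequence


def genetic_risk_score(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Additive GRS: weighted sum of trait-associated allele counts (0, 1 or 2 per SNP)."""
    if len(allele_counts) != len(weights):
        raise ValueError("one weight is required per SNP")
    if any(g not in (0, 1, 2) for g in allele_counts):
        raise ValueError("allele counts must be 0, 1 or 2")
    return sum(w * g for w, g in zip(weights, allele_counts))


# Hypothetical per-SNP effect sizes, e.g. log odds ratios taken from base data.
weights = [0.12, -0.05, 0.30]
# Hypothetical genotypes of one individual at the same three SNPs.
genotypes = [2, 1, 0]
score = genetic_risk_score(genotypes, weights)
```

Because each SNP contributes independently and linearly, a score of this form cannot represent epistasis, which is the limitation the RF-based strategies are meant to address.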

Genetic Epidemiology, Volume 49, Issue 8
Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70022
Citations: 0
Associations of Herpes Simplex Virus Type 1/2 IgG Seropositivity and Arthritis Subtypes: Integrating Cross-Sectional Epidemiology and Genetic Association Analyses
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-10-23 DOI: 10.1002/gepi.70020
Haining Li, Zhen Shen, Changzhou Feng

Herpes simplex viruses (HSV) have been detected within the synovial joint cavity, a secluded area of inflammation that may harbor etiological agents. However, the role of HSV-1/2 infection in arthritis pathogenesis remains ambiguous. In this study, we integrate cross-sectional epidemiology and genetic associations to elucidate their relationships and uncover causal mechanisms. We analyzed cross-sectional data from 18,292 NHANES participants (1999–2016) using multivariable-adjusted logistic regression to assess associations between anti-HSV-1/2 IgG seropositivity and arthritis-related risks. Complementary analyses included linkage disequilibrium score regression (LDSC) and bidirectional Mendelian randomization (MR) using genetic instruments for anti-HSV IgG levels to explore genetic correlations and infer causality. Initial observational findings demonstrated significant positive associations between HSV-1/2 IgG seropositivity and arthritis risk (all p < 0.001); however, these associations lost significance after multivariable adjustment. Notably, after multivariable adjustment, subtype analyses revealed that HSV-2 IgG seropositivity was linked to increased risks of rheumatoid arthritis (RA) (OR: 1.40, 95% CI: 1.04–1.88) and osteoarthritis (OA) (OR: 1.39, 95% CI: 1.07–1.81), while HSV-1 IgG seropositivity correlated with an unknown-arthritis subtype (OR: 1.38, 95% CI: 1.08–1.75). Moreover, MR analyses uncovered divergent causal effects: anti-HSV-1 IgG levels were protective against OA (OR: 0.90, 95% CI: 0.82–0.98), whereas anti-HSV-2 IgG levels modestly increased OA risk (OR: 1.05, 95% CI: 1.01–1.09). No reverse causation or genetic correlation was observed. This study's innovative integration of epidemiological and genetic methodologies not only clarifies the distinct roles of HSV sub-types in arthritis but also identifies HSV-2 as a potential causal factor in OA, thereby opening new avenues for therapeutic targeting.
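To gauge how strong a reported odds ratio is, one can back out the standard error of the log odds ratio from its confidence interval. The sketch below assumes a Wald-type 95% interval that is symmetric on the log scale, which the abstract does not state; the function name is hypothetical, and the result is an approximation rather than the authors' own test statistic.

```python
import math


def wald_from_ci(odds_ratio: float, ci_low: float, ci_high: float, z_crit: float = 1.959964):
    """Approximate SE of log(OR), z statistic and two-sided p from an OR and its CI."""
    log_or = math.log(odds_ratio)
    # A symmetric Wald interval on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# HSV-2 IgG seropositivity and rheumatoid arthritis, as reported in the abstract.
se, z, p = wald_from_ci(1.40, 1.04, 1.88)
```

Under these assumptions the interval implies a z statistic a little above 2, so the association is nominally significant but would not survive a stringent multiple-testing threshold.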

Genetic Epidemiology, Volume 49, Issue 8
Citations: 0
The Complex Relationship of Genetic Ancestry With Self-Reported Race/Ethnicity
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-10-16 DOI: 10.1002/gepi.70019
Yambazi Banda, Neil Risch

Race and ethnicity are demographic constructs used to characterize individuals in biomedical research, and in particular to assess health disparities. Their use in medicine and research has been discussed and challenged, as well as the degree to which they represent strictly social constructs, or ones also with biological meaning. The relationship of race and ethnicity with genetic ancestry has also been described, and how genetic ancestry reflects historical continental isolation, migration, and mating structure. Race and ethnicity are currently most often assessed by self-report in epidemiology and biomedical applications. Here we further interrogate the relationship between how people self-report their race and ethnicity and their genetic ancestry by examining self-report patterns of 97,671 individuals who are participants in the Kaiser Permanente Northern California Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. Genetic ancestry was determined from a set of 43,988 SNPs from genome-wide genotyping arrays. We observed that rates of self-identification as African American, East Asian and Latino(a) rise dramatically with a modest amount of African, East Asian and Native American genetic ancestry, respectively. By contrast, the rate of self-identification as White rises only when the European/West Asian genetic ancestry is substantial. This indicates that the majority of people who are genetically admixed, even those with primarily European/West Asian genetic ancestry, self-identify with the minority race/ethnicity group. By contrast, self-report as Native American did not increase with Native American genetic ancestry; instead, it was positively correlated with European genetic ancestry, with only a small minority of individuals self-reporting Native American race/ethnicity having Native American genetic ancestry. These results differ dramatically from the other minority race/ethnicity groups. 
These findings have important implications for how the different self-reported race/ethnicity groups are considered in epidemiologic and biomedical research.

Genetic Epidemiology, Volume 49, Issue 8
Citations: 0
The Importance of Sensitivity Analyses for the MR Steiger Approach
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-09-04 DOI: 10.1002/gepi.70018
Sharon M. Lutz, Kirsten Voorhies, John E. Hokanson, Stijn Vansteelandt, Christoph Lange

An extension to Mendelian randomization (MR), MR Steiger uses single nucleotide polymorphisms (SNPs) in an instrumental variables framework to infer the causal direction between two phenotypes (Hemani et al. 2017). In 2021 and 2022, we explored the role of unmeasured confounding, pleiotropy, and measurement error on the performance of the MR Steiger approach (Lutz et al. 2021) as well as selection bias (Lutz et al. 2022a). In 2022, we used simulation studies to further examine the role of unmeasured confounding on the general performance of the MR Steiger approach to show that unmeasured confounding can increase the variance of phenotype 1 as compared to phenotype 2 such that the wrong causal direction between the two phenotypes will be inferred by the approach. We moreover created an R package UCRMS to reproduce these simulation studies (Lutz et al. 2022b). However, in a 2023 paper by Hemani et al., the authors incorrectly stated that “Lutz et al. (2022) propose an R package (UCRMS) for performing sensitivity analysis of the MR Steiger method” (Hemani et al. 2023), where a sensitivity analysis examines how different values of an independent variable affect a dependent variable under a given set of assumptions. The purpose of our R package (UCRMS) was to examine the general performance of the MR Steiger approach in the presence of unmeasured confounding, not as a package for sensitivity analyses. In the 2023 paper, Hemani et al. state that “If [Lutz et al.] were presenting a simulation of the general performance of MR Steiger under unmeasured confounding then it would not matter that the simulated parameters are not tied to those observed in a particular empirical analysis” (Hemani et al. 2023), illustrating the correct original purpose of our R package as a simulation to assess the performance of the MR Steiger approach and not as a sensitivity analysis.

Here, $\beta_{OLS}$ is the “observed effect” of phenotype X on phenotype Y, which may differ from the true effect $\beta_{xy}$ as a result of confounding by U.

As stated by Hemani et al., estimates of $\beta_{xy}$ and $\beta_{OLS}$ are required because both the effect of X on Y given the unmeasured confounder U (i.e., $\beta_{xy}$) and the observed effect of X on Y not accounting for the unmeasured confounder (i.e., $\beta_{OLS}$) are unknown. As described in the supplement of Hemani et al. (2023), if the difference between $\beta_{xy}$ and $\beta_{OLS}$ is large, the wrong causal direction may be inferred. This is especially true if the variance of phenotype X is larger than the variance of phenotype Y.

Therefore, it is very important that the estimate of $\beta_{xy}$ be close to the true value of $\beta_{xy}$. Estimating $\beta_{xy}$ using MR, as in the data analysis example in the supplement of the Hemani et al. paper, implicitly requires that all assumptions of that MR approach are met. If these assumptions are not met, or if there is large sampling variability such that the estimate of $\beta_{xy}$ differs substantially from the true value of $\beta_{xy}$, then the sensitivity analysis may underestimate or overestimate the role of unmeasured confounding, which may change the probability of inferring the correct direction. Given this, we have concerns about the choice of parameters in the sensitivity analysis of the data analysis in the supplement (Hemani et al. 2023), which examined the role of unmeasured confounding on the MR Steiger approach for inferring the direction of effect between body mass index (BMI) and systolic blood pressure (SBP). In the analysis by Hemani et al., the estimated effects of the SNPs on BMI were obtained from participants of European ancestry in the UK Biobank. The true effect of BMI on SBP (i.e., $\beta_{xy}$) was estimated using MR IVW among participants of European ancestry in the UK Biobank. The variances of phenotype X (i.e., BMI), phenotype Y (i.e., SBP), the SNPs (i.e., G), and the unmeasured confounder U were set to 1. However, Hemani et al. took the observed effect of BMI on SBP (i.e., $\beta_{OLS}$) from a study that examined the relationship between BMI and blood pressure in 1.7 million Chinese adults (Linderman et al. 2018). Although Hemani et al. showed that the sensitivity analysis inferred the correct direction 98% of the time, if the observed effect of BMI on SBP in the Chinese population (i.e., $\beta_{OLS}$) and the true causal effect of BMI on SBP in the Chinese population (i.e., $\beta_{xy}$) [...]. It is therefore unclear whether the true effect of BMI on SBP in the Chinese population (i.e., $\beta_{xy}$) can be assumed to equal the estimated effect of BMI on SBP in the UK Biobank, given substantial differences between the two populations, such as diet, lifestyle factors, and environmental exposures.

In addition, the estimate of $\beta_{xy}$ obtained using MR IVW in the UK Biobank may differ from the true value because the IV assumptions are violated (i.e., as a result of ignoring a potentially more complex underlying longitudinal structure in which feedback relationships may exist) or because of large sampling variability. Furthermore, the sensitivity analysis proposed by Hemani et al. does not account for known confounders of phenotypes X and Y. While most MR approaches are robust to confounding between the two phenotypes, the MR Steiger approach is not. It is therefore unclear how the sensitivity analysis accounts for confounders of phenotype X and phenotype Y when examining the role of a single unmeasured confounder. For example, while the data analysis by Hemani et al. focused on the effect of BMI on SBP in the overall sample, several studies have examined the effect of BMI on SBP stratified by sex (Adler et al. 2015; Cox et al. 1997; Chen et al. 2018; Li et al. 2015; Dua et al. 2014). Note also that smoking rates differ considerably by sex. In the UK in 2022, 12.9% of the population were classified as current smokers (14.6% of men and 11.2% of women) (Office for National Statistics 2023). In 2018, a study in China reported that 2% of women and 50% of men smoked (Chan et al. 2023). Since both smoking and sex affect BMI and SBP, it is unclear how the sensitivity analysis for this data analysis accounts for the effects of sex and smoking when examining the role of an unmeasured confounder. It would benefit analysts if the sensitivity analysis proposed by Hemani et al. allowed the user to specify the effects of known confounders while examining the role of unmeasured confounding in the MR Steiger approach.

Research reported in this publication was supported by the National Institute of Mental Health under award number R01MH129337. This study was supported by NHLBI R01MH129337. The authors declare no conflicts of interest.
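The point that unmeasured confounding can flip the inferred direction can be illustrated with a small simulation. The sketch below is not the UCRMS package and omits the formal Steiger test: it reduces the idea to comparing the variance in X and in Y explained by a single SNP. All function names, parameter values, and the allele frequency of 0.3 are hypothetical choices, and the true direction in both scenarios is X to Y.

```python
import math
import random


def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def steiger_direction(g, x, y):
    """Simplified rule: the phenotype whose variance the SNP explains more is called the cause."""
    r2_gx = pearson_r(g, x) ** 2
    r2_gy = pearson_r(g, y) ** 2
    return ("X->Y" if r2_gx > r2_gy else "Y->X"), r2_gx, r2_gy


def simulate(n, beta_gx, beta_xy, conf_x, conf_y, sd_ey, seed):
    """Simulate G -> X -> Y with an unmeasured confounder U acting on both X and Y."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = sum(rng.random() < 0.3 for _ in range(2))  # allele count at frequency 0.3
        ui = rng.gauss(0.0, 1.0)                        # unmeasured confounder
        xi = beta_gx * gi + conf_x * ui + rng.gauss(0.0, 1.0)
        yi = beta_xy * xi + conf_y * ui + rng.gauss(0.0, sd_ey)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


# Modest confounding: the variance of Y stays comparable to that of X.
direction_weak, r2_gx_weak, r2_gy_weak = steiger_direction(
    *simulate(5000, 0.5, 0.4, 0.5, 0.5, 1.0, seed=1))

# Strong confounding with opposite signs on X and Y: U inflates Var(X) but largely
# cancels out of Y, so the SNP explains more of Y than of X.
direction_strong, r2_gx_strong, r2_gy_strong = steiger_direction(
    *simulate(5000, 0.5, 1.0, 2.0, -2.0, 0.3, seed=1))
```

In the second scenario the confounder inflates the variance of X relative to Y, the situation described above in which the variance of phenotype X exceeds that of phenotype Y, and the simplified rule returns the wrong direction even though X causes Y in the simulated data.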
Genetic Epidemiology, Volume 49, Issue 7
Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70018
Citations: 0
Correcting for Genomic Inflation Leads to Loss of Power in Large-Scale Genome-Wide Association Study Meta-Analysis
IF 3.8 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-08-06 DOI: 10.1002/gepi.70016
Archit Singh, Lorraine Southam, Konstantinos Hatzikotoulas, Nigel W. Rayner, Ken Suzuki, Henry J. Taylor, Xianyong Yin, Ravi Mandla, Alicia Huerta-Chagoya, Andrew P. Morris, Eleftheria Zeggini, Ozvan Bocher

Inflation in genome-wide association studies (GWAS) summary statistics represents a major challenge, for which correction methods have been developed. These include the genomic control (GC) method, which uses the λ-value to correct summary statistics, and the linkage disequilibrium score regression (LDSR) method, which uses the LDSR intercept. By using type 2 diabetes (T2D) as an exemplar, we explore factors influencing λ-values and the impact of these corrections on association signals. We find that larger sample sizes increase λ-values due to increased captured polygenicity, while including lower frequency variants decreases λ-values due to reduced power. Comparing T2D genetic associations described in overlapping GWAS meta-analyses of increasing sample size, we find that GC correction reduces the false positive rate and leads to the loss of robust associations. In one of the largest meta-analyses, GC correction results in a 39.7% loss of independent loci, substantially reducing the number of detected associations. In comparison, the LDSR intercept correction leads to a loss of up to 25.2% of the independent loci, being therefore less conservative than the GC correction. We conclude that in large, well-powered GWAS meta-analyses of polygenic traits, both GC and LDSR intercept corrections lead to power loss, highlighting the need for improved genomic inflation correction methods.
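The GC correction discussed above can be sketched as follows: λ is the median observed 1-df chi-square statistic divided by the median expected under the null, and every statistic is then divided by λ. The function names and simulated z-scores are hypothetical, and flooring λ at 1 before correcting is a common convention assumed here rather than something stated in the abstract.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(z_scores):
    """Lambda: median observed chi-square divided by the median expected under the null."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def gc_correct(z_scores):
    """Divide every chi-square statistic by lambda (assumption: never deflate if lambda < 1)."""
    lam = genomic_inflation_factor(z_scores)
    lam_used = max(lam, 1.0)
    return [z * z / lam_used for z in z_scores], lam


# Hypothetical summary statistics: z-scores with standard deviation 1.1 instead of 1,
# so the expected lambda is about 1.1 ** 2 = 1.21.
rng = random.Random(7)
z_scores = [rng.gauss(0.0, 1.1) for _ in range(20000)]
corrected_chi2, lam = gc_correct(z_scores)
lam_after = statistics.median(corrected_chi2) / CHI2_1DF_MEDIAN
```

The correction rescales all statistics by the same factor regardless of why the median is elevated, so when the elevation reflects polygenic signal rather than confounding, true associations are shrunk along with everything else.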

Genetic Epidemiology, Volume 49, Issue 6. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70016
Citations: 0
Identity-By-Descent Mapping Using Multi-Individual IBD With Genome-Wide Multiple Testing Adjustment
IF 1.7 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-07-28 DOI: 10.1002/gepi.70015
Ruoyi Cai, Sharon R. Browning

We present an identity-by-descent (IBD) mapping approach to test the association between genome-wide loci and complex traits. Our method evaluates whether levels of genetic similarity at specific genomic locations, captured by local relatedness matrices derived from multi-individual IBD sharing, are associated with phenotypic variation in complex traits. In addition, we propose an approach to adjust for multiple testing in genome-wide IBD mapping scans based on the correlation structure between test statistics across the genome. Through simulation studies, we demonstrate that our test has a well-controlled genome-wide type I error rate and superior power to detect rare and untyped variants compared to standard single-variant tests. We applied our method to systolic blood pressure data from White British individuals in the UK Biobank.

Genetic Epidemiology, Volume 49, Issue 6.
Citations: 0
Exploring Similarities and Differences Between Methods That Exploit Patterns of Local Genetic Correlation to Identify Shared Causal Loci Through Application to Genome-Wide Association Studies of Multiple Long Term Conditions
IF 1.7 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-06-19 DOI: 10.1002/gepi.70012
Rebecca Darlay, Rupal L. Shah, Richard M. Dodds, Anand T. N. Nair, Ewan R. Pearson, Miles D. Witham, Heather J. Cordell, ADMISSION Research Collaborative

Genetic correlation analysis can provide useful insight into the shared genetic basis between traits or conditions of interest. However, most genome-wide analyses only inform about the degree of global (overall) genetic similarity and do not identify the specific genomic regions that give rise to this similarity. Identification of the key genomic regions contributing to shared genetic correlation between traits could allow the genes in these regions to be prioritised for investigation of potential shared biological mechanisms. In recent years, several statistical tools (e.g. LAVA, ρ-HESS, SUPERGNOVA and LOGODetect) have been developed to investigate local (in contrast to global) genetic correlation. These tools partition the genome into multiple segments and provide estimates of the genetic correlation captured by each individual segment. We applied these tools to publicly available European ancestry genome-wide association study (GWAS) summary statistics for three pairs of commonly occurring conditions: hypertension with atrial fibrillation and flutter, hypertension with chronic kidney disease, and hypertension with type 2 diabetes. Despite each of the methods aiming to address the same question, the results were found to be inconsistent across tools, with some identified regions overlapping and others implicated only by a single tool. Computer simulations using genetic data from UK Biobank, carried out under known generating conditions, suggest that LAVA and, to a lesser extent, ρ-HESS, provide the most reliable identification of genuine shared genetic factors. A newly-developed tool, HDL-L, also performed highly competitively. Here we highlight the similarities and differences between the results obtained from these methods and discuss some potential reasons underlying these differences.

Genetic Epidemiology, Volume 49, Issue 5. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70012
Citations: 0
A Robust Association Test Leveraging Unknown Genetic Interactions: Application to Cystic Fibrosis Lung Disease
IF 1.7 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-06-17 DOI: 10.1002/gepi.70013
Sangook Kim, Yu-Chung Lin, Lisa J. Strug

For complex traits such as lung disease in Cystic Fibrosis (CF), Gene x Gene or Gene x Environment interactions can impact disease severity, but these remain largely unknown. Unaccounted-for genetic interactions introduce a distributional shift in the quantitative trait across the genotypic groups. Joint location and scale tests, or tests of full distributional differences across genotype groups, can account for unknown genetic interactions and increase power for gene identification compared with the conventional association test. Here we propose a new joint location and scale (JLS) test, a quantile regression-based JLS (qJLS), that addresses previous limitations. Specifically, qJLS is free of distributional assumptions and thus applies to non-Gaussian traits; is as powerful as the existing JLS tests under Gaussian traits; and is computationally efficient for genome-wide association studies (GWAS). Our simulation studies, which model unknown genetic interactions, demonstrate that qJLS is robust to skewed and heavy-tailed error distributions and is as powerful as other JLS tests in the literature under normality. Without any unknown genetic interaction, qJLS shows a large increase in power with non-Gaussian traits over conventional association tests and is slightly less powerful under normality. We applied the qJLS method to the Canadian CF Gene Modifier Study (n = 1,997) and identified a genome-wide significant variant, rs9513900 on chromosome 13, that had not previously been reported to contribute to CF lung disease. qJLS provides a powerful alternative to conventional genetic association tests where interactions may contribute to a quantitative trait.
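
To make the idea of a joint location and scale test concrete, here is a minimal sketch of one generic way to combine a location-test p-value and a scale-test p-value: Fisher's method, which for two independent p-values has a closed form because the combined statistic follows a chi-square distribution with 4 degrees of freedom under the null. This is not the qJLS procedure proposed in the abstract; the function name and the two input p-values are hypothetical, and independence of the two component tests is an assumption.

```python
import math


def fisher_combine_two(p_location, p_scale):
    """Combine two independent p-values with Fisher's method.

    W = -2 * (ln p1 + ln p2) is chi-square with 4 df under the null, whose
    survival function is exp(-w / 2) * (1 + w / 2). Substituting w gives
    p1 * p2 * (1 - ln(p1 * p2)).
    """
    for p in (p_location, p_scale):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    product = p_location * p_scale
    return product * (1.0 - math.log(product))


# Hypothetical p-values from a mean-difference test and a variance-difference test.
joint_p = fisher_combine_two(0.03, 0.20)
```

A variant that shifts the trait variance across genotype groups without a strong mean shift can yield a small scale p-value, so the combined p-value can be smaller than the location p-value alone. That is the intuition behind the power gain the abstract describes when interactions are present.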

Genetic Epidemiology, Volume 49, Issue 5. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70013
Citations: 0
Uncovering Ethnicity-Specific Recessive Loci for Alzheimer's Disease in 89 Dominican Families Using Family-Based WGS Analysis
IF 1.7 Zone 4 (Medicine) Q3 GENETICS & HEREDITY Pub Date: 2025-06-09 DOI: 10.1002/gepi.70014
Sanghun Lee, Julian Hecker, Badri N. Vardarajan, Rachel S. Kelly, Nicole Prince, Kristina Mullin, Sharon M. Lutz, Georg Hahn, Jessica Lasky-Su, Richard P. Mayeux, Rudolph E. Tanzi, Christoph Lange, Dmitry Prokopenko

In a sample of 89 Dominican families from the National Institute on Aging's Alzheimer's Disease Sequencing Project (ADSP), where at least one family member had a confirmed Alzheimer's disease (AD) diagnosis, we conducted an exploratory recessive whole-genome sequencing (WGS) analysis using family-based association testing (FBAT-GEE). This method tests jointly for affection status and age-at-onset under a recessive inheritance mode. Our analysis identified a genome-wide significant association for rs847697 in the PDK2 gene on chromosome 17, near the MAPT gene previously implicated in AD through linkage studies. Additionally, we detected four suggestive loci (p-value < 1 × 10⁻⁶). Given the unexpected strength of these associations in a modest sample size, we rigorously reviewed data quality, ruling out technical artifacts. The PDK2 association was driven by a small subset of families, aligning with recessive inheritance expectations. However, it could not be replicated in other AD datasets including Estudio Familiar de Influencia Genetica en Alzheimer (EFIGA), the National Institute of Mental Health (NIMH), and European Americans from NIA ADSP, suggesting a possible population-specific or ancestry-related effect. This study highlights the effectiveness of the FBAT approach in detecting unique genetic associations in smaller, isolated populations, findings that might be diluted in larger biobank studies where these populations are underrepresented.

Genetic Epidemiology, Volume 49, Issue 5.
Citations: 0