
Genetic Epidemiology: Latest Articles

Additional article of this Special Issue was previously published in another issue of Genetic Epidemiology. That is:
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-11-25 | DOI: 10.1002/gepi.22604

Gorfine, M., Qu, C., Peters, U., & Hsu, L. (2024). Unveiling challenges in Mendelian randomization for gene-environment interaction. Genetic Epidemiology, 48, 164–189. https://doi.org/10.1002/gepi.22552

Citations: 0
A Novel One-Sample Mendelian Randomization Approach for Count-Type Outcomes That Is Robust to Correlated and Uncorrelated Pleiotropic Effects
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-11-05 | DOI: 10.1002/gepi.22602
Janaka S. S. Liyanage, Jane S. Hankins, Jeremie H. Estepp, Deokumar Srivastava, Sara R. Rashkin, Clifford Takemoto, Yun Li, Yuehua Cui, Motomi Mori, Mitchell J. Weiss, Guolian Kang

We propose two novel one-sample Mendelian randomization (MR) approaches to causal inference from count-type health outcomes, tailored to both equidispersion and overdispersion conditions. Selecting valid single-nucleotide polymorphisms (SNPs) as instrumental variables (IVs) poses a key challenge for MR approaches, as it requires meeting the necessary IV assumptions. To bolster the proposed approaches by addressing violations of IV assumptions, we incorporate a process for removing invalid SNPs that violate the assumptions. In simulations, our proposed approaches demonstrate robustness to the violations, delivering valid estimates, and interpretable type-I errors and statistical power. This increases the practical applicability of the models. We applied the proposed approaches to evaluate the causal effect of fetal hemoglobin (HbF) on the vaso-occlusive crisis and acute chest syndrome (ACS) events in patients with sickle cell disease (SCD) and revealed the causal relation between HbF and ACS events in these patients. We also developed a user-friendly Shiny web application to facilitate researchers' exploration of causal relations.
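The distinction between equidispersion and overdispersion mentioned in the abstract can be illustrated with a dispersion index (sample variance divided by sample mean). The following is a minimal sketch, not the authors' method: the function name `dispersion_index` and the example counts are hypothetical. A ratio near 1 is consistent with a Poisson-like (equidispersed) outcome, while a ratio well above 1 suggests overdispersion.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance / sample mean for a list of event counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when all counts are zero")
    return variance(counts) / m


if __name__ == "__main__":
    # Hypothetical per-patient event counts (illustrative only, not study data).
    counts = [0, 1, 0, 2, 7, 0, 1, 12, 0, 3]
    print(f"dispersion index: {dispersion_index(counts):.2f}")
```

In practice a formal test or a model comparison (for example Poisson versus negative binomial) would be used rather than this raw ratio alone.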

Citations: 0
Estimating Causal Effects on a Disease Progression Trait Using Bivariate Mendelian Randomisation
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-10-24 | DOI: 10.1002/gepi.22600
Siyang Cai, Frank Dudbridge

Genome-wide association studies (GWAS) have provided large numbers of genetic markers that can be used as instrumental variables in a Mendelian Randomisation (MR) analysis to assess the causal effect of a risk factor on an outcome. An extension of MR analysis, multivariable MR, has been proposed to handle multiple risk factors. However, adjusting or stratifying the outcome on a variable that is associated with it may induce collider bias. For an outcome that represents progression of a disease, conditioning by selecting only the cases may cause a biased MR estimation of the causal effect of the risk factor of interest on the progression outcome. Recently, we developed instrument effect regression and corrected weighted least squares (CWLS) to adjust for collider bias in observational associations. In this paper, we highlight the importance of adjusting for collider bias in MR with a risk factor of interest and disease progression as the outcome. A generalised version of the instrument effect regression and CWLS adjustment is proposed based on a multivariable MR model. We highlight the assumptions required for this approach and demonstrate its utility for bias reduction. We give an illustrative application to the effect of smoking initiation and smoking cessation on Crohn's disease prognosis, finding no evidence to support a causal effect.
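The collider bias described above can be seen in a small simulation. This is a sketch under stated assumptions, not the instrument effect regression or CWLS adjustment from the paper: the liability model, the threshold of 1.0, and the names `simulate_collider` and `pearson` are all hypothetical. Two independent causes of disease onset are uncorrelated in the full sample but become negatively correlated once only cases are selected, so if the second cause also drives progression, the risk factor would appear associated with progression among cases.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_collider(n=20000, seed=1):
    """Return (correlation in everyone, correlation among cases only)."""
    rng = random.Random(seed)
    risk, other, is_case = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)  # risk factor of interest
        u = rng.gauss(0, 1)  # independent cause of disease onset
        liability = x + u + rng.gauss(0, 1)
        risk.append(x)
        other.append(u)
        is_case.append(liability > 1.0)  # assumed disease threshold
    r_all = pearson(risk, other)
    r_cases = pearson(
        [x for x, c in zip(risk, is_case) if c],
        [u for u, c in zip(other, is_case) if c],
    )
    return r_all, r_cases


if __name__ == "__main__":
    r_all, r_cases = simulate_collider()
    print(f"all individuals: r = {r_all:.3f}; cases only: r = {r_cases:.3f}")
```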

Citations: 0
Integrative Multi-Omics Approach for Improving Causal Gene Identification
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-10-23 | DOI: 10.1002/gepi.22601
Austin King, Chong Wu

Transcriptome-wide association studies (TWAS) have been widely used to identify thousands of likely causal genes for diseases and complex traits using predicted expression models. However, most existing TWAS methods rely on gene expression alone and overlook other regulatory mechanisms of gene expression, including DNA methylation and splicing, that contribute to the genetic basis of these complex traits and diseases. Here we introduce a multi-omics method that integrates gene expression, DNA methylation, and splicing data to improve the identification of genes associated with traits of interest. Through simulations and by analyzing genome-wide association study (GWAS) summary statistics for 24 complex traits, we show that our integrated method, which leverages these complementary omics biomarkers, achieves higher statistical power and improves the accuracy of likely causal gene identification in blood tissues over individual omics methods. Finally, we apply our integrated model to a lung cancer GWAS data set, demonstrating the integrated model's improved identification of prioritized genes for lung cancer risk.

Citations: 0
Correction to the 2024 Annual Meeting of the International Genetic Epidemiology Society
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-10-16 | DOI: 10.1002/gepi.22599

(2024), The 2024 Annual Meeting of the International Genetic Epidemiology Society. Genetic Epidemiology, 48: 344-398. https://doi.org/10.1002/gepi.22598

In the originally published article, several abstracts were inadvertently left out. They appear on the following pages.

We apologize for this error.

Citations: 0
Fine-Mapping the Results From Genome-Wide Association Studies of Primary Biliary Cholangitis Using SuSiE and h2-D2
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-10-06 | DOI: 10.1002/gepi.22592
Aida Gjoka, Heather J. Cordell

The main goal of fine-mapping is the identification of relevant genetic variants that have a causal effect on some trait of interest, such as the presence of a disease. From a statistical point of view, fine mapping can be seen as a variable selection problem. Fine-mapping methods are often challenging to apply because of the presence of linkage disequilibrium (LD), that is, regions of the genome where the variants interrogated have high correlation. Several methods have been proposed to address this issue. Here we explore the ‘Sum of Single Effects’ (SuSiE) method, applied to real data (summary statistics) from a genome-wide meta-analysis of the autoimmune liver disease primary biliary cholangitis (PBC). Fine-mapping in this data set was previously performed using the FINEMAP program; we compare these previous results with those obtained from SuSiE, which provides an arguably more convenient and principled way of generating ‘credible sets’, that is, sets of predictors that are correlated with the response variable. This allows us to appropriately acknowledge the uncertainty when selecting the causal effects for the trait. We focus on the results from SuSiE-RSS, which fits the SuSiE model to summary statistics, such as z-scores, along with a correlation matrix. We also compare the SuSiE results to those obtained using a more recently developed method, h2-D2, which uses the same inputs. Overall, we find the results from SuSiE-RSS and, to a lesser extent, h2-D2, to be quite concordant with those previously obtained using FINEMAP. The resulting genes and biological pathways implicated are therefore also similar to those previously obtained, providing valuable confirmation of these previously reported results. Detailed examination of the credible sets identified suggests that, although for the majority of the loci (33 out of 56) the results from SuSiE-RSS seem most plausible, there are some loci (5 out of 56 loci) where the results from h2-D2 seem more compelling. Computer simulations suggest that, overall, SuSiE-RSS generally has slightly higher power, better precision, and better ability to identify the true number of causal variants in a region than h2-D2, although there are some scenarios where the power of h2-D2 is higher. Thus, in real data analysis, the use of complementary approaches such as both SuSiE and h2-D2 is potentially warranted.
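The idea of a credible set can be made concrete with a short sketch. It assumes the common working definition, namely the smallest group of variants whose posterior probabilities for a single causal effect sum to at least a chosen coverage. It is not the SuSiE, FINEMAP, or h2-D2 implementation, and the function name `credible_set`, the variant identifiers, and the probabilities are hypothetical.

```python
def credible_set(probabilities, coverage=0.95):
    """Return the smallest list of variants whose normalised probabilities
    sum to at least `coverage`, taken in order of decreasing probability."""
    total = sum(probabilities.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, p in ranked:
        chosen.append(variant)
        cumulative += p / total
        if cumulative >= coverage:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical posterior probabilities for four variants in high LD.
    probs = {"rs_a": 0.60, "rs_b": 0.30, "rs_c": 0.06, "rs_d": 0.04}
    print(credible_set(probs))  # ['rs_a', 'rs_b', 'rs_c']
```

A larger credible set signals more uncertainty about which variant in the region is causal, which is how such sets acknowledge the ambiguity introduced by LD.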

Citations: 0
GWASBrewer: An R Package for Simulating Realistic GWAS Summary Statistics
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-10-06 | DOI: 10.1002/gepi.22594
Jean Morrison

Many statistical genetics analysis methods make use of GWAS summary statistics. Best statistical practice requires evaluating these methods in realistic simulation experiments. However, simulating summary statistics by first simulating individual genotype and phenotype data is extremely computationally demanding. This high cost may force researchers to conduct overly simplistic simulations that fail to accurately measure method performance. Alternatively, summary statistics can be simulated directly from their theoretical distribution. Although this is a common need among statistical genetics researchers, no software packages exist for comprehensive GWAS summary statistic simulation. We present GWASBrewer, an open source R package for direct simulation of GWAS summary statistics. We show that statistics simulated by GWASBrewer have the same distribution as statistics generated from individual level data, and can be produced at a fraction of the computational expense. Additionally, GWASBrewer can simulate standard error estimates, something that is typically not done when sampling summary statistics directly. GWASBrewer is highly flexible, allowing the user to simulate data for multiple traits connected by causal effects and with complex distributions of effect sizes. We demonstrate example uses of GWASBrewer for evaluating Mendelian randomization, polygenic risk score, and heritability estimation methods.
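Sampling summary statistics directly from their theoretical distribution can be sketched as follows. This is not GWASBrewer's API or algorithm (GWASBrewer is an R package); it is a simplified Python illustration that assumes independent variants (no LD), a trait standardised to unit variance, and small per-variant effects, so that the standard error is approximately 1 / sqrt(2 N f (1 - f)) for allele frequency f and sample size N. The function name `simulate_sumstats` and all inputs are hypothetical.

```python
import math
import random


def simulate_sumstats(betas, freqs, n, seed=0):
    """Draw one estimated effect per variant from N(beta, se^2).

    betas: true per-allele effects on a unit-variance trait
    freqs: effect-allele frequencies, each strictly between 0 and 1
    n:     GWAS sample size
    """
    rng = random.Random(seed)
    rows = []
    for beta, f in zip(betas, freqs):
        se = 1.0 / math.sqrt(n * 2.0 * f * (1.0 - f))  # approximate standard error
        beta_hat = rng.gauss(beta, se)
        rows.append({"beta_hat": beta_hat, "se": se, "z": beta_hat / se})
    return rows


if __name__ == "__main__":
    for row in simulate_sumstats([0.0, 0.05], [0.5, 0.2], n=10000, seed=1):
        print(row)
```

No individual-level genotypes or phenotypes are generated here, which is where the computational saving described in the abstract comes from.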

Citations: 0
A Mixed-Effect Kernel Machine Regression Model for Integrative Analysis of Alpha Diversity in Microbiome Studies
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-09-30 | DOI: 10.1002/gepi.22596
Runzhe Li, Mo Li, Ni Zhao

Increasing evidence suggests that human microbiota plays a crucial role in many diseases. Alpha diversity, a commonly used summary statistic that captures the richness and/or evenness of the microbial community, has been associated with many clinical conditions. However, individual studies that assess the association between alpha diversity and clinical conditions often provide inconsistent results due to insufficient sample size, heterogeneous study populations and technical variability. In practice, meta-analysis tools have been applied to integrate data from multiple studies. However, these methods do not consider the heterogeneity caused by sequencing protocols, and the contribution of each study to the final model depends mainly on its sample size (or variance estimate). To combine studies with distinct sequencing protocols, a robust statistical framework for integrative analysis of microbiome datasets is needed. Here, we propose a mixed-effect kernel machine regression model to assess the association of alpha diversity with a phenotype of interest. Our approach readily incorporates the study-specific characteristics (including sequencing protocols) to allow for flexible modeling of microbiome effect via a kernel similarity matrix. Within the proposed framework, we provide three hypothesis testing approaches to answer different questions that are of interest to researchers. We evaluate the model performance through extensive simulations based on two distinct data generation mechanisms. We also apply our framework to data from HIV reanalysis consortium to investigate gut dysbiosis in HIV infection.
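Alpha diversity, as described above, summarises richness and/or evenness within a sample. A minimal sketch of two standard measures follows: observed richness (number of taxa present) and the Shannon index. The function names and the example counts are hypothetical, and the kernel machine model from the paper is not reproduced here.

```python
import math


def richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical taxon counts for one sample (illustrative only).
    sample = [30, 0, 55, 10, 5]
    print(richness(sample), round(shannon(sample), 3))
```

Two samples with the same richness can differ in Shannon index when one is dominated by a few taxa, which is the evenness component the abstract refers to.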

Citations: 0
Enhancing Gene Expression Predictions Using Deep Learning and Functional Annotations
IF 1.7 | Zone 4, Medicine | Q3 GENETICS & HEREDITY | Pub Date: 2024-09-30 | DOI: 10.1002/gepi.22595
Pratik Ramprasad, Jingchen Ren, Wei Pan

Transcriptome-wide association studies (TWAS) aim to uncover genotype–phenotype relationships through a two-stage procedure: predicting gene expression from genotypes using an expression quantitative trait locus (eQTL) data set, then testing the predicted expression for trait associations. Accurate gene expression prediction in stage 1 is crucial, as it directly impacts the power to identify associations in stage 2. Currently, the first stage of such studies is primarily conducted using linear models like elastic net regression, which fail to capture the nonlinear relationships inherent in biological systems. Deep learning methods have the potential to model such nonlinear effects, but have yet to demonstrably outperform linear methods at this task. To address this gap, we propose a new deep learning architecture to predict gene expression from genotypic variation across individuals. Our method utilizes a learnable input scaling layer in conjunction with a convolutional encoder to capture nonlinear effects and higher-order interactions without compromising on interpretability. We further augment this approach to allow for parameter sharing across multiple networks, enabling us to utilize prior information for individual variants in the form of functional annotations. Evaluations on real-world genomic data show that our method consistently outperforms elastic net regression across a large set of heritable genes. Furthermore, our model statistically significantly improved predictive performance by leveraging functional annotations, whereas elastic net regression failed to show equivalent gains when using the same information, suggesting that our method can capture nonlinear functional information beyond the capability of linear models.
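The two-stage procedure can be sketched for the linear case that the abstract uses as its baseline. Assuming stage 1 has produced per-variant prediction weights w (for example from elastic net), a commonly used summary-statistic form of the stage 2 test combines GWAS z-scores z and an LD correlation matrix R as z_TWAS = (w · z) / sqrt(wᵀ R w). This is not the deep learning architecture proposed in the paper; the function name `twas_z` and the example inputs are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Gene-level z-score from linear prediction weights, GWAS z-scores,
    and an LD correlation matrix (list of lists)."""
    k = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(k) for j in range(k)
    )
    if variance <= 0:
        raise ValueError("w' R w must be positive")
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    weights = [0.5, 0.5]  # hypothetical stage 1 weights
    gwas_z = [2.0, 2.0]   # hypothetical GWAS z-scores
    independent = [[1.0, 0.0], [0.0, 1.0]]
    perfect_ld = [[1.0, 1.0], [1.0, 1.0]]
    print(twas_z(weights, gwas_z, independent))  # about 2.83
    print(twas_z(weights, gwas_z, perfect_ld))   # 2.0
```

The example shows why stage 1 accuracy and LD both matter: the same GWAS signal yields a weaker gene-level statistic when the contributing variants are fully correlated and so carry no independent evidence.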

Citations: 0
Powerful Rare-Variant Association Analysis of Secondary Phenotypes
IF 1.7 Zone 4 Medicine Q3 GENETICS & HEREDITY Pub Date: 2024-09-30 DOI: 10.1002/gepi.22589
Hanyun Liu, Hong Zhang

Most genome-wide association studies are based on case-control designs, which provide abundant resources for secondary phenotype analyses. However, such studies suffer from biased sampling of primary phenotypes, and traditional statistical methods can lead to seriously distorted results when they are applied to secondary phenotypes without accounting for the biased sampling mechanism. To our knowledge, there are no statistical methods specifically tailored for rare variant association analysis with secondary phenotypes. In this article, we propose two novel joint test statistics for identifying secondary-phenotype-associated rare variants, based on prospective likelihood and retrospective likelihood, respectively. We also exploit the assumption of gene-environment independence in the retrospective likelihood to improve statistical power, and adopt a two-step strategy to balance statistical power and robustness. Simulations and a real-data application are conducted to demonstrate the superior performance of our proposed methods.

Citations: 0