Pub Date: 2019-01-01; Epub Date: 2020-02-11; DOI: 10.1159/000504171
Jonathan T L Kang, Noah A Rosenberg
Background: Many statistics for measuring linkage disequilibrium (LD) take the form of a normalization of the LD coefficient D. Different normalizations produce statistics with different ranges, interpretations, and arguments favoring their use.
Methods: Here, to compare the mathematical properties of these normalizations, we consider 5 of these normalized statistics, describing their upper bounds, the mean values of their maxima over the set of possible allele frequency pairs, and the size of the allele frequency regions accessible given specified values of the statistics.
Results: We produce detailed characterizations of these properties for the statistics d and ρ, analogous to computations previously performed for r². We examine the relationships among the statistics, uncovering conditions under which some of them have close connections.
Conclusion: The results contribute insight into LD measurement, particularly the understanding of differences in the features of different LD measures when computed on the same data.
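As a rough illustration of what "normalizing D" means, the sketch below computes D = pAB - pApB together with two widely used normalizations, D' and r². It is only a minimal example under stated assumptions: the function name `ld_statistics` and its argument names are hypothetical, and the statistics d and ρ are omitted because the abstract does not give their definitions.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying both allele A and allele B
    p_a, p_b: frequencies of alleles A and B, each strictly between 0 and 1
    """
    d = p_ab - p_a * p_b  # LD coefficient D = pAB - pA*pB
    # Largest |D| attainable at these allele frequencies, given the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 divides D^2 by the product of the allele-frequency variances.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


d, d_prime, r2 = ld_statistics(p_ab=0.3, p_a=0.5, p_b=0.4)
```

With these inputs, D = 0.1, D' = 0.5, and r² = 1/6, which shows how the same D can look quite different depending on the normalization.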
Citation: Mathematical Properties of Linkage Disequilibrium Statistics Defined by Normalization of the Coefficient D = pAB - pApB. Human Heredity. 2019;84(3):127-143. https://doi.org/10.1159/000504171
Pub Date: 2019-01-01; Epub Date: 2020-10-19; DOI: 10.1159/000510062
Soukaina Essadssi, Ibtihal Benhsaien, Amina Bakhchane, Hicham Charoute, Houria Abdelghaffar, Ahmed Aziz Bousfiha, Abdelhamid Barakat
Background: The recombination-activating gene 1 and 2 (RAG1/RAG2) proteins are essential for initiating the V(D)J recombination process, which produces a diverse repertoire of antigen receptor genes and establishes adaptive immunity. RAG1 mutations can lead to multiple forms of combined immunodeficiency.
Methods: In this report, whole exome sequencing was performed in a Moroccan child suffering from combined immunodeficiency, with T and B lymphopenia, autoimmune hemolytic anemia, and cytomegalovirus (CMV) infection.
Results: After filtering data and Sanger sequencing validation, one homozygous mutation c.2446G>A (p.Gly816Arg) was identified in the RAG1 gene.
Conclusion: This finding expands the spectrum of immunological and genetic profiles linked to RAG1 mutation. It also illustrates the need to consider RAG1 immunodeficiency in the presence of autoimmune hemolytic anemia and CMV infection, even when the immunological phenotype appears more or less normal.
Citation: A Homozygous RAG1 Gene Mutation in a Case of Combined Immunodeficiency: Clinical, Molecular, and Computational Analysis. Human Heredity. 2019;84(6):272-278. https://doi.org/10.1159/000510062
Pub Date: 2019-01-01; Epub Date: 2020-07-28; DOI: 10.1159/000508558
Camille M Moore, Sean A Jacobson, Tasha E Fingerlin
Introduction: When analyzing data from large-scale genetic association studies, such as targeted or genome-wide resequencing studies, it is common to assume a single genetic model, such as dominant or additive, for all tests of association between a given genetic variant and the phenotype. However, for many variants, the chosen model will result in poor model fit and may lack statistical power due to model misspecification.
Objective: We develop power and sample size calculations for tests of gene and gene × environment interaction, allowing for misspecification of the true mode of genetic susceptibility.
Methods: The power calculations are based on a likelihood ratio test framework and are implemented in an open-source R package ("genpwr").
Results: We use these methods to develop an analysis plan for a resequencing study in idiopathic pulmonary fibrosis and show that using a 2-degree-of-freedom test can increase power to detect recessive genetic effects while maintaining power to detect dominant and additive effects.
Conclusions: Understanding the impact of model misspecification can aid in designing studies and developing analysis plans that maximize power to detect a range of true underlying genetic effects. In particular, these calculations help identify when a multiple-degree-of-freedom test or other robust test of association may be advantageous.
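To make the idea of a 2-degree-of-freedom likelihood ratio test concrete, the sketch below applies a likelihood ratio (G) test to case-control genotype counts, treating the three genotypes as unordered categories so that no dominant, additive, or recessive model is assumed. This is an illustrative stand-in only: it is not code from the genpwr package, it performs a test rather than a power calculation, and the function name `genotype_lrt_2df` is hypothetical.

```python
import math


def genotype_lrt_2df(case_counts, control_counts):
    """Likelihood ratio (G) test of genotype-by-status independence.

    case_counts, control_counts: counts for the three genotypes, in the
    same order for both groups (for example AA, Aa, aa). Every genotype
    is assumed to be observed at least once overall.
    Returns (G, p_value), where G is referred to a chi-square
    distribution with 2 degrees of freedom.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to G
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * math.log(observed / expected)
    # For a chi-square variable with 2 df, P(X > g) = exp(-g / 2) exactly.
    return g, math.exp(-g / 2.0)


# Hypothetical counts in which the rare homozygote is enriched among cases.
g_stat, p_value = genotype_lrt_2df((40, 40, 20), (45, 45, 10))
```

Because the test does not commit to a single genetic model, it spends an extra degree of freedom but does not depend on the coding of the heterozygote, which is the trade-off the abstract describes.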
Citation: Power and Sample Size Calculations for Genetic Association Studies in the Presence of Genetic Model Misspecification. Human Heredity. 2019;84(6):256-271. https://doi.org/10.1159/000508558
Pub Date: 2019-01-01; Epub Date: 2019-12-19; DOI: 10.1159/000504170
Qinqin Jin, Gang Shi
Meta-analyses are widely used in genome-wide association studies to combine the results obtained from multiple studies. Classical random-effects methods treat genetic heterogeneity as a random effect and consider it as a portion of the variance associated with a fixed effect of the variant. Recent work suggests performing hypothesis testing with the null hypothesis under which neither fixed nor random effects exist for a variant. This method has been shown to perform better than classical random-effects methods. In this work, we propose a meta-analysis of testing single nucleotide polymorphism (SNP)-environment interaction in the presence of genetic heterogeneity. We introduced the random effects of the SNP and SNP-environment interaction under test into a meta-regression model to account for heterogeneity. A test for the SNP-environment interaction was formulated to test for fixed and random effects of the interaction simultaneously. Similarly, a test for total genetic effects was formulated to test for fixed effects of the SNP and the SNP-environment interaction together with their random effects. We performed simulations to study the null distribution and statistical power of the proposed tests. We show that the new methods have higher power than classical random-effects and fixed-effects meta-regression methods when heterogeneity effects are large.
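For context, the classical fixed-effects and random-effects estimators that the abstract uses as comparators can be sketched as follows. This is a minimal sketch assuming inverse-variance weighting and the DerSimonian-Laird moment estimator of between-study variance, which is one common choice and is not named in the abstract. It does not implement the proposed joint tests of fixed and random effects, and the function name `fixed_and_random_effects` is hypothetical.

```python
def fixed_and_random_effects(betas, variances):
    """Combine per-study effect estimates with classical meta-analysis.

    betas: per-study effect estimates (for example interaction coefficients)
    variances: their squared standard errors, all positive
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effects (inverse-variance weighted) estimate.
    beta_fe = sum(w * b for w, b in zip(weights, betas)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effects estimate.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = sum_w - sum(w * w for w in weights) / sum_w
    # DerSimonian-Laird between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects estimate: add tau2 to every study variance.
    weights_re = [1.0 / (v + tau2) for v in variances]
    beta_re = sum(w * b for w, b in zip(weights_re, betas)) / sum(weights_re)
    return {"beta_fe": beta_fe, "q": q, "tau2": tau2, "beta_re": beta_re}


# Two hypothetical studies with very different estimates and equal precision.
result = fixed_and_random_effects(betas=[0.0, 1.0], variances=[0.01, 0.01])
```

In this example Q = 50 and the estimated between-study variance is 0.49, so the heterogeneity is treated as extra variance around a common mean effect of 0.5.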
Citation: Meta-Analysis of SNP-Environment Interaction with Heterogeneity. Human Heredity. 2019;84(3):117-126. https://doi.org/10.1159/000504170
Background: It is necessary to investigate the frequency of BRCA1 and BRCA2 mutations in Hakka populations due to the variations in breast cancer epidemiology and genetics.
Methods: 359 breast cancer patients and 66 ovarian cancer patients were included in this retrospective clinical study. Mutations of BRCA1 and BRCA2 were detected in blood samples by semiconductor sequencing.
Results: The sensitivities of the tumor markers CEA, CA15-3, CA12-5, and CA199 for breast cancer screening were 16.44, 15.11, 8.44, and 7.56%, respectively; the combination of these 4 tumor markers reached the highest sensitivity (31.11%). For ovarian cancer, the tumor markers in order of decreasing sensitivity were CA12-5 (54.05%), HE-4 (54.05%), CA72-4 (51.35%), and CEA (2.70%), and the combination of these 4 tumor markers had the best sensitivity (75.68%). In breast cancer patients, we found 5 (1.39%) patients with mutations in BRCA1 and 13 (3.62%) with mutations in BRCA2, for a total carrier rate of 5.01% (18/359). For ovarian cancer patients, the corresponding results were 3 (4.54%) mutations, 2 (3.03%) mutations, and 7.58% (5/66), respectively. The proportion of BRCA mutations was 5.41% (23/425) in breast and ovarian cancer patients of a Hakka population. The pathogenic, likely pathogenic, and benign mutations, and mutations of uncertain significance in this study mainly occurred in exon 14 of the BRCA1 gene, and exon 10 and exon 11 of the BRCA2 gene.
Conclusions: Understanding the spectrum and frequency of BRCA1 and BRCA2 mutations in a Hakka population will assist in the prevention and control of hereditary breast and ovarian cancers in this population.
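The total carrier rates quoted in the Results follow directly from the reported counts. The short check below recomputes them; the helper name `percent` is hypothetical and only the counts stated in the abstract are used.

```python
def percent(carriers, total):
    """Carrier rate as a percentage, rounded to two decimal places."""
    return round(100.0 * carriers / total, 2)


breast_rate = percent(18, 359)              # breast cancer patients
ovarian_rate = percent(5, 66)               # ovarian cancer patients
combined_rate = percent(18 + 5, 359 + 66)   # both groups pooled
```

The three values reproduce the 5.01%, 7.58%, and 5.41% reported above.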
Citation: Heming Wu, Qiuming Wang, Xuemin Guo, Qinghua Liu, Qunji Zhang, Qingyan Huang, Zhikang Yu. Frequency of BRCA1 and BRCA2 Mutations in Individuals with Breast and Ovarian Cancer in a Chinese Hakka Population Using Next-Generation Sequencing. Human Heredity. 2019;84(4-5):160-169. https://doi.org/10.1159/000505268
Pub Date: 2019-01-01; Epub Date: 2020-06-15; DOI: 10.1159/000507576
Li Liu, Richard J Caselli
Excess of heterozygosity (H) is a widely used measure of genetic diversity of a population. As high-throughput sequencing and genotyping data become readily available, it has been applied to investigating the associations of genome-wide genetic diversity with human diseases and traits. However, these studies often report contradictory results. In this paper, we present a meta-analysis of five whole-exome studies to examine the association of H scores with Alzheimer's disease. We show that the mean H score of a group is not associated with the disease status, but it is associated with the sample size. Across all five studies, the group with more samples has a significantly lower H score than the group with fewer samples. To remove potential confounders in empirical data sets, we perform computer simulations to create artificial genomes controlled for the number of polymorphic loci, the sample size, and the allele frequency. Analyses of these simulated data confirm the negative correlation between the sample size and the H score. Furthermore, we find that genomes with a large number of rare variants also have inflated H scores. Together, these biases can lead to spurious associations between genetic diversity and the phenotype of interest. Based on these findings, we recommend that studies balance the sample sizes when using genome-wide H scores to assess the genetic diversity of different populations, which should help improve the reproducibility of future research.
Citation: Unbalanced Sample Size Introduces Spurious Correlations to Genome-Wide Heterozygosity Analyses. Human Heredity. 2019;84(4-5):197-202. https://doi.org/10.1159/000507576
Pub Date: 2019-01-01; Epub Date: 2020-05-16; DOI: 10.1159/000506008
Jianjun Zhang, Qiuying Sha, Han Hao, Shuanglin Zhang, Xiaoyi Raymond Gao, Xuexia Wang
Motivation: The risk of many complex diseases is determined by an interplay of genetic and environmental factors. The examination of gene-environment interactions (G×Es) for multiple traits can yield valuable insights about the etiology of the disease and increase power in detecting disease-associated genes. However, the methods for testing G×Es for multiple traits are very limited.
Method: We developed novel approaches to test G×Es for multiple traits in sequencing association studies. We first perform a transformation of multiple traits by using either principal component analysis or standardization analysis. Then, we detect the effects of G×Es using novel proposed tests: testing the effect of an optimally weighted combination of G×Es (TOW-GE) and/or variable weight TOW-GE (VW-TOW-GE). Finally, we employ Fisher's combination test to combine the p values.
Results: Extensive simulation studies show that the type I error rates of the proposed methods are well controlled. Compared to the interaction sequence kernel association test (ISKAT), TOW-GE is more powerful when there are only rare risk and protective variants; VW-TOW-GE is more powerful when there are both rare and common variants. Both TOW-GE and VW-TOW-GE are robust to directions of effects of causal G×Es. Application to the COPDGene Study demonstrates that our proposed methods are very effective.
Conclusions: Our proposed methods are useful tools in the identification of G×Es for multiple traits. The proposed methods can be used not only to identify G×Es for common variants, but also for rare variants. Therefore, they can be employed in identifying G×Es in both genome-wide association studies and next-generation sequencing data analyses.
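The final step of the Method, Fisher's combination test, can be sketched in a few lines. The example assumes the p values being combined are independent and strictly positive; the abstract does not say exactly which p values are combined, and the function name `fisher_combined_p` is hypothetical. The TOW-GE and VW-TOW-GE statistics themselves are not implemented here.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    Every p value must be greater than 0.
    Returns (statistic, combined_p_value).
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Chi-square survival function for an even number of degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


# Two hypothetical p values, one for each transformed trait.
stat, combined = fisher_combined_p([0.05, 0.05])
```

The closed-form survival function avoids a dependency on a statistics library; it is valid here because the degrees of freedom, 2k, are always even.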
Citation: Test Gene-Environment Interactions for Multiple Traits in Sequencing Association Studies. Human Heredity. 2019;84(4-5):170-196. https://doi.org/10.1159/000506008
Pub Date: 2019-01-01; Epub Date: 2019-10-21; DOI: 10.1159/000502738
Ai Ni, Jaya M Satagopan
Background and aims: There is considerable interest in epidemiology to estimate an additive interaction effect between two risk factors in case-control studies. An additive interaction is defined as the differential reduction in absolute risk associated with one factor between different levels of the other factor. A stratified two-phase case-control design is commonly used in epidemiology to reduce the cost of assembling covariates. It is crucial to obtain valid estimates of the model parameters by accounting for the underlying stratification scheme to obtain accurate and precise estimates of additive interaction effects. The aim of this paper is to examine the properties of different methods for estimating model parameters and additive interaction effects under a stratified two-phase case-control design.
Methods: Using simulations, we investigate the properties of three existing methods, namely stratum-specific offset, inverse-probability weighting, and multiple imputation for estimating model parameters and additive interaction effects. We also illustrate these properties using data from two published epidemiology studies.
Results: Simulation studies show that the multiple imputation method performs well when both the true and analysis models are additive (i.e., do not include multiplicative interaction terms) but does not provide a discernible advantage over the offset method when the analysis models are non-additive (i.e., include multiplicative interaction terms). The offset method exhibits the best overall properties when the analysis model contains multiplicative interaction effects.
Conclusion: When estimating additive interaction between risk factors in stratified two-phase case-control studies, we recommend estimating model parameters using multiple imputation when the analysis model is additive, and we recommend the offset method when the analysis model is non-additive.
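The definition of additive interaction given in the Background can be written as a simple contrast of absolute risks. The sketch below assumes two binary risk factors and takes the four absolute risks as given; it does not address how those risks are estimated under a two-phase design (offset, weighting, or imputation), and the function name `additive_interaction` and the example risks are hypothetical.

```python
def additive_interaction(risk):
    """Additive interaction contrast for two binary risk factors.

    risk: dict mapping (a, b) to the absolute risk at level a of the first
    factor and level b of the second factor, with a and b in {0, 1}.
    Returns the difference between the absolute risk difference for the
    first factor when the second is present and when it is absent.
    """
    effect_a_when_b1 = risk[(1, 1)] - risk[(0, 1)]
    effect_a_when_b0 = risk[(1, 0)] - risk[(0, 0)]
    return effect_a_when_b1 - effect_a_when_b0


# Hypothetical absolute risks for the four exposure combinations.
example_risks = {(0, 0): 0.05, (1, 0): 0.10, (0, 1): 0.15, (1, 1): 0.30}
contrast = additive_interaction(example_risks)
```

Here the first factor raises risk by 0.05 when the second is absent and by 0.15 when it is present, so the contrast is 0.10; a value of zero would indicate no interaction on the additive scale.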
Citation: Estimating Additive Interaction Effect in Stratified Two-Phase Case-Control Design. Human Heredity. 2019;84(1):90-108. https://doi.org/10.1159/000502738