ENHANCING POWER OF SCORE TESTS FOR REGRESSION MODELS VIA FISHER TRANSFORMATION
Masao Ueki
Journal of the Japanese Society of Computational Statistics, published 2018-04-20
DOI: https://doi.org/10.5183/JJSCS.1702001_234
Citation count: 1
Abstract
A simple method is presented to enhance the statistical power of score tests for regression models via the Fisher transformation (or Fisher’s z-transformation) by exploiting a relationship with the partial correlation coefficient. Simulation studies mimicking marginal association and gene-environment interaction analyses for genome-wide association studies (GWASs) under a case-control design demonstrate that the Fisher transformation enhances the power of the score tests while maintaining the type I error asymptotically. The smaller the sample size, the more pronounced the enhancement, at the expense of inflated type I error because the asymptotic approximation becomes invalid. Accordingly, the proposed method may be applied when the sample size is large enough for the asymptotic approximation to be valid. An illustration with real GWAS data is also presented.

1. Fisher-transformation of score tests for regression models

Suppose that $n$ response variables $\boldsymbol{y} = (y_1, \ldots, y_n)^T$ and an $n \times p$ design matrix $\boldsymbol{X} = (\boldsymbol{x}_1, \ldots, \boldsymbol{x}_n)^T$ are observed, where $\boldsymbol{x}_i$ is a $p$-dimensional column vector of explanatory variables for subject $i \in \{1, \ldots, n\}$. Let $f(y_i \mid \boldsymbol{x}_i)$ denote the probability distribution of $y_i$ conditional on $\boldsymbol{x}_i$ for each $i$. Here, the probability density function of a continuous random variable or the probability mass function of a discrete random variable is referred to as a probability distribution (Dobson, 2002). Assume that the conditional expectation of $y_i$, transformed through some differentiable monotone function (i.e. the link function), is written as $\boldsymbol{x}_i^T \boldsymbol{\beta}$, in which $\boldsymbol{\beta}$ is the vector of the corresponding $p$ regression coefficients. Then, denote the log-likelihood for the $i$th sample by $l(\boldsymbol{x}_i^T \boldsymbol{\beta}) = \log f(y_i \mid \boldsymbol{x}_i)$. Throughout, it is assumed that each $y_i$ is independently distributed given $\boldsymbol{x}_i$. The above regression framework includes the generalized linear models (McCullagh and Nelder, 1989; Dobson, 2002) and regression with heavy-tailed error distributions (Lange and Sinsheimer, 1993).
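As a concrete instance of the per-sample log-likelihood $l(\boldsymbol{x}_i^T \boldsymbol{\beta})$, the following minimal sketch assumes logistic regression (a generalized linear model with the logit link), for which $l(\eta) = y\eta - \log(1 + e^{\eta})$ with $\eta = \boldsymbol{x}_i^T \boldsymbol{\beta}$. The function names and the toy data are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def linear_predictor(x, beta):
    """Linear predictor eta = x^T beta for one subject."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def loglik_logistic(y, eta):
    """Per-sample log-likelihood l(eta) = y*eta - log(1 + exp(eta)) for binary y."""
    # Numerically stable evaluation of log(1 + exp(eta)).
    softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return y * eta - softplus

def total_loglik(ys, xs, beta):
    """Sum of per-sample terms; valid because the y_i are independent given x_i."""
    return sum(loglik_logistic(y, linear_predictor(x, beta)) for y, x in zip(ys, xs))

# Toy data (assumed): first column is an intercept, second a single covariate.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 1, 1]
beta = [0.0, 0.0]

total = total_loglik(ys, xs, beta)
print(total)
```

With all coefficients equal to zero, every subject has fitted probability 1/2, so each term contributes $-\log 2$ regardless of the response.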
Suppose that $\boldsymbol{X}$ is partitioned into two parts as $(\boldsymbol{X}_1, \boldsymbol{X}_2)$, where $\boldsymbol{X}_1$ is a collection of $q$ ($q < p$) explanatory variables to be tested for association with $\boldsymbol{y}$ and $\boldsymbol{X}_2$ is a set of $p - q$ covariates to be adjusted for. Correspondingly, let $\boldsymbol{\beta} = (\boldsymbol{\beta}_1^T, \boldsymbol{\beta}_2^T)^T$ and $\boldsymbol{x}_i = (\boldsymbol{x}_{1,i}^T, \boldsymbol{x}_{2,i}^T)^T$. In this article, $\boldsymbol{X}$ is assumed to be of full column rank.

1.1. Fisher-transformed score test: single parameter case

This subsection considers the case of $q = 1$, and hence the corresponding regression coefficient is written as $\beta_1$ with a non-bold letter. In genome-wide association study (GWAS)

∗Biostatistics Center, Kurume University, 67 Asahi-machi, Kurume, Fukuoka 830-0011, Japan. Present affiliation: Statistical Genetics Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan. E-mail: uekimrsd@nifty.com
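The link between a score test and a correlation coefficient in the single-parameter case ($q = 1$) can be illustrated in a special case. The sketch below assumes logistic regression in which the only adjustment covariate in $\boldsymbol{X}_2$ is an intercept, so the partial correlation reduces to the ordinary Pearson correlation $r$ between the tested variable and the response. In that setting the standard score statistic equals $n r^2$. The code also evaluates Fisher's z-transformation $z = \operatorname{atanh}(r)$, which is never smaller than $r$ in absolute value. The function names and data are hypothetical, and how the paper constructs its test statistic from the transformed quantity is not shown in this excerpt.

```python
import math

def score_stat_intercept_only(x, y):
    """Score statistic for H0: beta1 = 0 in logistic regression with an intercept only."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n  # fitted probability under the null model
    # Score: derivative of the log-likelihood with respect to beta1 at the null fit.
    u = sum(xi * (yi - ybar) for xi, yi in zip(x, y))
    # Null information for beta1 after adjusting for the intercept.
    v = ybar * (1.0 - ybar) * sum((xi - xbar) ** 2 for xi in x)
    return u * u / v

def pearson_r(x, y):
    """Pearson correlation between x and y."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher's z-transformation, z = atanh(r) = 0.5 * log((1 + r) / (1 - r))."""
    return math.atanh(r)

# Toy data (assumed): x is a genotype-like count in {0, 1, 2}, y is case/control status.
x = [0, 1, 2, 0, 1, 2, 1, 0, 2, 1]
y = [0, 0, 1, 0, 1, 1, 0, 0, 1, 1]

n = len(y)
stat = score_stat_intercept_only(x, y)
r = pearson_r(x, y)
z = fisher_z(r)
print(stat, n * r * r, r, z)
```

For these data the score statistic and $n r^2$ coincide, which is the kind of relationship that makes a correlation-scale transformation applicable to a score test.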