Abstract Although generalized linear mixed models are mainly used for analyzing clustered or longitudinal data, they can also be used for smoothing by restricting changes in the fit at the knots of regression splines. The resulting models are usually called semiparametric mixed models (SPMMs). We investigate the effect of smoothing using SPMMs on the correlation and variance parameter estimates for serially correlated longitudinal normal, Poisson and binary data. Through simulations, we compare the performance of SPMMs to that of simpler methods for estimating the nonlinear association, such as fractional polynomials and parametric nonlinear functions. Simulation results suggest that, in general, SPMMs recover the true curves very well and yield reasonable estimates of the correlation and variance parameters. However, for binary outcomes, SPMMs produce biased estimates of the variance parameters for highly serially correlated data. We apply these methods to a dataset investigating the association between CD4 cell count and time since seroconversion for HIV-infected men enrolled in the Multicenter AIDS Cohort Study.
M. Mullah, A. Benedetti. "Effect of Smoothing in Generalized Linear Mixed Models on the Estimation of Covariance Parameters for Longitudinal Data." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0026
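The smoothing device this abstract relies on, penalized splines expressed as mixed models, can be sketched in a few lines: treating the truncated-line basis coefficients at the knots as random effects is equivalent to ridge-penalizing them, with the penalty weight given by the ratio of the error and random-effect variances. The basis, knot placement, and smoothing parameter below are illustrative choices, not the paper's settings; an actual SPMM would estimate the smoothing parameter by (RE)ML.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data with a smooth nonlinear trend.
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)

# Truncated-line spline: fixed part [1, x], plus one basis column per knot.
knots = np.quantile(x, np.linspace(0.1, 0.9, 15))
X = np.column_stack([np.ones(n), x])                # unpenalized fixed effects
Z = np.maximum(x[:, None] - knots[None, :], 0.0)    # knot coefficients

# Treating the knot coefficients as random effects with variance sigma_u^2 is
# equivalent to ridge-penalizing them with lambda = sigma_eps^2 / sigma_u^2;
# in an SPMM, lambda would be estimated by (RE)ML rather than fixed as here.
lam = 1e-3
C = np.column_stack([X, Z])
P = np.diag([0.0, 0.0] + [lam] * len(knots))        # penalize only the Z part
coef = np.linalg.solve(C.T @ C + P, C.T @ y)
fit = C @ coef
```

The fixed/random split is what lets standard mixed-model software do the smoothing: the penalty never touches the intercept and slope, only the knot deviations.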
Abstract We present an integer-valued ARCH model which can be used for modeling time series of counts with under-, equi-, or overdispersion. The introduced model has a conditional binomial distribution, and it is shown to be strictly stationary and ergodic. The unknown parameters are estimated by three methods: conditional maximum likelihood, conditional least squares and maximum likelihood type penalty function estimation. The asymptotic distributions of the estimators are derived. A real application of the novel model to epidemic surveillance is briefly discussed. Finally, a generalization of the introduced model is considered by introducing an integer-valued GARCH model.
M. Ristić, C. Weiß, Ana D. Janjić. "A Binomial Integer-Valued ARCH Model." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0051
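A binomial INARCH(1)-type process of the kind described above is easy to simulate: the count at each step is conditionally binomial with a success probability driven by the previous count. The linear parameterization below is an illustrative assumption, not necessarily the paper's exact recursion.

```python
import numpy as np

def simulate_binomial_inarch(T, N=20, a=0.2, b=0.5, seed=1):
    """Simulate a binomial INARCH(1)-type series: Y_t | Y_{t-1} ~ Bin(N, p_t)
    with p_t = a + b * Y_{t-1} / N.  The constraints a > 0, b >= 0, a + b < 1
    keep p_t inside (0, 1).  (Illustrative parameterization; the paper's
    exact model may differ.)"""
    rng = np.random.default_rng(seed)
    y = np.empty(T, dtype=int)
    y[0] = rng.binomial(N, a / (1.0 - b))   # start near the stationary mean
    for t in range(1, T):
        y[t] = rng.binomial(N, a + b * y[t - 1] / N)
    return y

# Stationary mean of the conditional-mean recursion: N * a / (1 - b) = 8.
series = simulate_binomial_inarch(5000)
```

Because the conditional mean is linear in the previous count, the series behaves like an AR(1) in its mean, with lag-one autocorrelation governed by b.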
Abstract In randomized clinical trials, we often encounter ordinal categorical responses with repeated measurements. We propose a model-free approach using the generalized odds ratio (GOR) to measure the relative treatment effect. We develop procedures for testing equality of treatment effects and derive interval estimators for the GOR. We further develop a simple procedure for testing the treatment-by-period interaction. To illustrate the use of the test procedures and interval estimators developed here, we consider two real-life data sets: one studying the gender effect on ordinal-scale pain scores after hip joint resurfacing surgeries, and the other investigating the effect of an active hypnotic drug on ordinal categories of time to falling asleep in insomnia patients.
K. Lui. "Testing Equality in Ordinal Data with Repeated Measurements: A Model-Free Approach." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0075
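The point estimate behind the GOR is simple to compute from two ordinal samples: compare every between-group pair and take the ratio of concordant to discordant pairs, ignoring ties. A minimal sketch, with invented toy scores:

```python
import numpy as np

def generalized_odds_ratio(x, y):
    """GOR estimate P(X > Y) / P(X < Y) over all between-group pairs,
    ignoring tied pairs (Agresti's generalized odds ratio)."""
    x = np.asarray(x)[:, None]
    y = np.asarray(y)[None, :]
    return np.sum(x > y) / np.sum(x < y)

# Toy ordinal pain scores (1 = none ... 5 = severe); invented numbers.
group_a = [1, 2, 2, 3, 3, 4]
group_b = [2, 3, 3, 4, 4, 5]
gor = generalized_odds_ratio(group_b, group_a)  # 23 favorable vs 5 unfavorable pairs
```

A GOR above 1 indicates that a randomly chosen response from the first group tends to exceed one from the second; swapping the arguments inverts the ratio.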
Abstract Interval estimation of the proportion parameter in the analysis of binary outcome data arising in cluster studies is often an important problem in many biomedical applications. In this paper, we propose two approaches based on the profile likelihood and Wilson score. We compare them with two existing methods recommended for complex survey data and some other methods that are simple extensions of well-known methods such as the likelihood, the generalized estimating equation of Zeger and Liang and the ratio estimator approach of Rao and Scott. An extensive simulation study is conducted for a variety of parameter combinations for the purposes of evaluating and comparing the performance of these methods in terms of coverage and expected lengths. Applications to biomedical data are used to illustrate the proposed methods.
Krishna K. Saha, Daniel Miller, Suojin Wang. "A Comparison of Some Approximate Confidence Intervals for a Single Proportion for Clustered Binary Outcome Data." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0024
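One of the two proposed approaches builds on the Wilson score interval. A minimal sketch of the idea of adapting it to clustered data is to deflate the sample size by a design effect; this simple variance-inflation correction stands in for, but is not, the paper's exact proposal.

```python
import math

def wilson_interval(p_hat, n_eff, z=1.959964):
    """Wilson score interval for a single proportion, allowing an effective
    (design-effect-deflated) sample size for clustered data."""
    denom = 1.0 + z ** 2 / n_eff
    centre = (p_hat + z ** 2 / (2.0 * n_eff)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n_eff + z ** 2 / (4.0 * n_eff ** 2))
    return centre - half, centre + half

# Hypothetical clustered study: 30 clusters of m = 10 subjects, intracluster
# correlation 0.05; deflate n by DEFF = 1 + (m - 1) * icc.
n, m, icc = 300, 10, 0.05
deff = 1.0 + (m - 1) * icc
lo, hi = wilson_interval(0.3, n / deff)
lo0, hi0 = wilson_interval(0.3, n)   # interval ignoring clustering
```

The cluster-adjusted interval is necessarily wider than the naive one, reflecting the loss of information from within-cluster correlation.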
Schatzkin et al. and other authors demonstrated that the ratios of some conditional statistics, such as the true positive fraction, are equal to the ratios of unconditional statistics, such as disease detection rates; therefore, we can calculate these ratios between two screening tests on the same population even if patients with negative tests are not followed up with a reference procedure and the true and false negative rates are unknown. We demonstrate that this same property applies to an expected utility metric. We also demonstrate that while simple estimates of relative specificities and relative areas under ROC curves (AUC) do depend on the unknown negative rates, we can write these ratios in terms of disease prevalence, and the dependence of these ratios on a posited prevalence is often weak, particularly if that prevalence is small or the performance of the two screening tests is similar. Therefore, we can estimate relative specificity or AUC with little loss of accuracy if we use an approximate value of disease prevalence.
Samuel Frank, Abigail Craig. "Using Relative Statistics and Approximate Disease Prevalence to Compare Screening Tests." International Journal of Biostatistics, 2016-11-01, pp. 1-9. doi:10.1515/IJB-2016-0017
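The algebra behind writing relative specificity in terms of a posited prevalence follows from decomposing the overall positive rate: P(T+) = TPF·prev + FPF·(1 - prev), where TPF·prev is the observable detection rate. A small sketch with hypothetical rates shows how weakly the ratio depends on the posited prevalence:

```python
def relative_specificity(pos_rate_1, det_rate_1, pos_rate_2, det_rate_2, prev):
    """Relative specificity of two screening tests from observable rates and
    a posited prevalence.  From P(T+) = TPF*prev + FPF*(1 - prev) and the
    observable detection rate TPF*prev, the false positive fraction is
    FPF = (positive rate - detection rate) / (1 - prev), so spec = 1 - FPF."""
    spec_1 = 1.0 - (pos_rate_1 - det_rate_1) / (1.0 - prev)
    spec_2 = 1.0 - (pos_rate_2 - det_rate_2) / (1.0 - prev)
    return spec_1 / spec_2

# Hypothetical observable rates for two tests on the same screened population.
r_low = relative_specificity(0.10, 0.004, 0.12, 0.005, prev=0.005)
r_high = relative_specificity(0.10, 0.004, 0.12, 0.005, prev=0.02)
# Quadrupling the posited prevalence barely moves the ratio.
```

With a small prevalence, the (1 - prev) denominators are all close to 1, which is exactly why an approximate prevalence suffices.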
Abstract The Bland–Altman method has been widely used for assessing agreement between two methods of measurement. However, sample size estimation for this method has remained an open problem. We propose a new method of sample size estimation for Bland–Altman agreement assessment. In the Bland–Altman method, the conclusion on agreement is based on comparing the width of the confidence interval for the limits of agreement (LOAs) to a predefined clinical agreement limit. Under the theory of statistical inference, we derive sample size formulae that depend on the predetermined levels of α and β, the mean and standard deviation of the differences between the two measurements, and the predefined limits. With this new method, sample sizes are calculated under parameter settings that occur frequently in method comparison studies, and Monte Carlo simulation is used to obtain the corresponding powers. The simulation results show that the achieved powers coincide with the predetermined power levels, validating the correctness of the method. The proposed sample size estimation can be applied when using the Bland–Altman method to assess agreement between two methods of measurement.
Mengfei Lu, Weihua Zhong, Yu-xiu Liu, Hua-zhang Miao, Yong-Chang Li, Mu-Huo Ji. "Sample Size for Assessing Agreement between Two Methods of Measurement by Bland−Altman Method." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0039
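The Monte Carlo check described above can be sketched directly: simulate paired differences, form the confidence limits of the limits of agreement with the usual large-sample standard error, and count how often both fall inside the clinical limits. This only evaluates a candidate sample size by simulation; the paper derives closed-form sample-size formulae.

```python
import numpy as np

def ba_power(n, mu_d, sd_d, delta, n_sim=2000, seed=0):
    """Monte Carlo power that both 95 % confidence limits of the limits of
    agreement fall inside the clinical limits (-delta, delta).

    Uses the large-sample standard error of a limit of agreement,
    se(LOA) ~= s * sqrt(1/n + z^2 / (2*(n - 1))).
    """
    rng = np.random.default_rng(seed)
    z = 1.959964
    hits = 0
    for _ in range(n_sim):
        d = rng.normal(mu_d, sd_d, n)                 # paired differences
        m, s = d.mean(), d.std(ddof=1)
        se_loa = s * np.sqrt(1.0 / n + z ** 2 / (2.0 * (n - 1)))
        inside = (m - z * s - z * se_loa > -delta) and (m + z * s + z * se_loa < delta)
        hits += inside
    return hits / n_sim
```

For example, with zero mean difference, unit standard deviation, and clinical limits of ±3, fifty pairs give high power while ten pairs do not.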
Abstract In phase II and/or III clinical trials, several competing treatments are compared, with the goal of assessing their performance at the end of the study while the trial design minimizes risk to the patients in the trial according to a given allocation optimality criterion. Recently, a new type of clinical trial, the staggered-start trial, in which different treatments enter the same trial at different times, has been proposed in some studies. Basic questions for this design are whether optimality can still be achieved, under what conditions, and, if so, how to allocate incoming patients to treatments to achieve it. Here we propose and study a class of adaptive designs for staggered-start clinical trials. For a given optimality criterion, we show that as long as the initial sample sizes at the beginning of the successive trials are not too large relative to the total sample size, the proposed design still achieves the optimality criterion asymptotically for the allocation proportions, as in ordinary trials; if the initial sample sizes are of about the same magnitude as the total sample size, full optimality cannot be achieved. The proposed method is simple to use and is illustrated with several examples and a simulation study.
A. Yuan, Qizhai Li, Ming Xiong, M. Tan. "Adaptive Design for Staggered-Start Clinical Trial." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0011
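The catch-up phenomenon the abstract describes, allocation proportions converging to their targets even when an arm enters late with a small initial batch, can be illustrated with a simple deterministic rule: send each new patient to the open arm whose running proportion falls furthest below its renormalized target. This rule, the entry times, and the batch size are all illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def staggered_allocation(total_n, entry_times, target, init_per_arm=5):
    """Sequential allocation for a staggered-start trial (illustrative rule).

    Arms open at the patient indices in `entry_times`; a newly opened arm
    first receives `init_per_arm` patients, after which each patient goes to
    the open arm whose running allocation proportion falls furthest below
    its (renormalized over open arms) target proportion."""
    k = len(entry_times)
    counts = np.zeros(k)
    open_arms = np.zeros(k, dtype=bool)
    pending = []                               # initial batches for new arms
    for t in range(total_n):
        for j, e in enumerate(entry_times):
            if t == e:
                open_arms[j] = True
                pending += [j] * init_per_arm
        if pending:
            arm = pending.pop(0)
        else:
            tgt = np.where(open_arms, target, 0.0)
            tgt = tgt / tgt.sum()
            arm = int(np.argmax(tgt - counts / max(t, 1)))
        counts[arm] += 1
    return counts / total_n

# A third arm enters after 300 of 2000 patients; allocation still reaches
# the target proportions because the late start is small relative to n.
props = staggered_allocation(2000, entry_times=[0, 0, 300],
                             target=np.array([0.3, 0.3, 0.4]))
```

If instead the late arm opened near the end of the trial, there would be too few remaining patients to catch up, mirroring the abstract's negative result for large initial sample sizes.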
Asanao Shimokawa, Y. Narita, S. Shibui, E. Miyaoka
Abstract In many scenarios, a patient in medical research is treated as a statistical unit. However, in some scenarios, we are interested in treating aggregate data as a statistical unit. In such situations, each set of aggregated data is considered to be a concept in a symbolic representation, and each concept occupies a hyperrectangle or multiple points in the variable space. To construct a tree-structured model from these aggregate survival data, we propose a new approach in which a datum can be included in several terminal nodes of a tree. By constructing a model under this condition, we expect to obtain a more flexible model while retaining the interpretive ease of a hierarchical structure. In this approach, the survival function of concepts that are partially included in a node is constructed using the Kaplan–Meier method, where the numbers of events and at-risk subjects at each time point are replaced by the expected numbers of individual descriptions of the concepts. We present an application of the proposed model using primary brain tumor patient data. As a result, we obtained a new interpretation of the data in comparison with classical survival tree modeling methods.
Asanao Shimokawa, Y. Narita, S. Shibui, E. Miyaoka. "Tree Based Method for Aggregate Survival Data Modeling." International Journal of Biostatistics, 2016-11-01. doi:10.1515/ijb-2015-0071
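The weighted Kaplan–Meier construction described above, where events and risk sets are replaced by expected counts, amounts to letting each observation contribute a fractional weight. A minimal sketch of that idea (not the paper's exact estimator):

```python
import numpy as np

def weighted_km(times, events, weights):
    """Kaplan-Meier estimator with fractional weights: each observation
    contributes its weight to the risk set and, if an event, to the event
    count -- mimicking how a concept partially assigned to a node would
    contribute its expected number of individual descriptions."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=float)
    weights = np.asarray(weights, dtype=float)
    surv = 1.0
    at_risk = weights.sum()
    out_t, out_s = [], []
    for t in np.unique(times):
        mask = times == t
        d = (weights * events)[mask].sum()      # weighted events at time t
        if d > 0:
            surv *= 1.0 - d / at_risk
            out_t.append(t)
            out_s.append(surv)
        at_risk -= weights[mask].sum()          # remove everyone observed at t
    return np.array(out_t), np.array(out_s)

# With unit weights this reduces to the ordinary Kaplan-Meier estimate.
t_out, s_out = weighted_km([1, 2, 3, 4], [1, 1, 0, 1], np.ones(4))
```

Because only the ratio of event weight to risk-set weight enters each factor, rescaling all weights leaves the curve unchanged; only the relative partial memberships matter.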
Abstract Understanding treatment heterogeneity is essential to the development of precision medicine, which seeks to tailor medical treatments to subgroups of patients with similar characteristics. One of the challenges of achieving this goal is that we usually do not have a priori knowledge of the grouping of patients with respect to treatment effect. To address this problem, we consider a heterogeneous regression model which allows the coefficients for treatment variables to be subject-dependent with unknown grouping information. We develop a concave fusion penalized method for estimating the grouping structure and the subgroup-specific treatment effects, and derive an alternating direction method of multipliers algorithm for its implementation. We also study the theoretical properties of the proposed method and show that under suitable conditions there exists a local minimizer that equals the oracle least squares estimator based on a priori knowledge of the true grouping information with high probability. This provides theoretical support for making statistical inference about the subgroup-specific treatment effects using the proposed method. The proposed method is evaluated in simulation studies and illustrated with real data from an AIDS Clinical Trials Group study.
Shujie Ma, Jian Huang, Zhiwei Zhang, Mingming Liu. "Exploration of Heterogeneous Treatment Effects via Concave Fusion." International Journal of Biostatistics, 2016-07-13. doi:10.1515/ijb-2018-0026
K. Linn, Bilwaj Gaonkar, J. Doshi, C. Davatzikos, R. Shinohara
Abstract Understanding structural changes in the brain that are caused by a particular disease is a major goal of neuroimaging research. Multivariate pattern analysis (MVPA) comprises a collection of tools that can be used to understand complex disease effects across the brain. We discuss several important issues that must be considered when analyzing data from neuroimaging studies using MVPA. In particular, we focus on the consequences of confounding by non-imaging variables such as age and sex on the results of MVPA. After reviewing current practice to address confounding in neuroimaging studies, we propose an alternative approach based on inverse probability weighting. Although the proposed method is motivated by neuroimaging applications, it is broadly applicable to many problems in machine learning and predictive modeling. We demonstrate the advantages of our approach on simulated and real data examples.
K. Linn, Bilwaj Gaonkar, J. Doshi, C. Davatzikos, R. Shinohara. "Addressing Confounding in Predictive Models with an Application to Neuroimaging." International Journal of Biostatistics, 2016-05-01, pp. 31-44. doi:10.1515/ijb-2015-0030
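The core of the inverse probability weighting idea can be sketched in a few lines: model the probability of class membership given the confounders, then weight each observation by the inverse of its class probability so that both classes are reweighted toward the overall confounder distribution. This is a generic sketch of IPW under an assumed simulated example; the paper's estimator may differ, for instance in stabilization or trimming of the weights.

```python
import numpy as np

def logistic_fit(X, y, n_iter=25):
    """Logistic regression by Newton's method; X already includes an
    intercept column."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        H = X.T @ (X * (p * (1 - p))[:, None]) + 1e-8 * np.eye(X.shape[1])
        w = w + np.linalg.solve(H, X.T @ (y - p))
    return w

def ipw_weights(confounders, label):
    """Inverse probability weights that reweight each class toward the
    overall confounder distribution."""
    X = np.column_stack([np.ones(len(label)), confounders])
    p = 1.0 / (1.0 + np.exp(-X @ logistic_fit(X, label)))  # P(label=1 | conf)
    return np.where(label == 1, 1.0 / p, 1.0 / (1.0 - p))

# Simulated confounding: patients (label 1) are ~10 years older than controls,
# so an unweighted classifier could learn age rather than disease.
rng = np.random.default_rng(2)
n = 2000
label = rng.integers(0, 2, n)
age = np.where(label == 1, rng.normal(60, 10, n), rng.normal(50, 10, n))
w = ipw_weights(age, label)
wm1 = np.average(age[label == 1], weights=w[label == 1])
wm0 = np.average(age[label == 0], weights=w[label == 0])
# After weighting, the two classes have nearly identical mean age.
```

Passing these weights as training-sample weights to a classifier removes the incentive to exploit the confounder, which is the strategy the abstract proposes for MVPA.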