Penalized Empirical Likelihood-Based Variable Selection for Longitudinal Data Analysis
Tharshanna Nadarajah, A. Variyath, J. Loredo-Osti
American Journal of Mathematical and Management Sciences, vol. 40 (1), pp. 241-260. Published 2020-10-28. DOI: https://doi.org/10.1080/01966324.2020.1837042
Abstract: Longitudinal data with a large number of covariates have become common in many applications, such as epidemiology, clinical research, and therapeutic evaluation. Identifying a sub-model that adequately represents the data is necessary for easy interpretation. Existing information-theoretic approaches such as AIC and BIC are useful but computationally inefficient, because they require evaluating all possible subsets. A newer class of penalized likelihood methods, such as LASSO and SCAD, is efficient in these situations. However, all of these methods rely on parametric modeling of the response of interest, and the joint likelihood function for longitudinal data is challenging to specify, particularly for correlated discrete outcome data. For such situations, we propose penalized empirical likelihood (PEL) based on generalized estimating equations (GEE), by which variable selection and coefficient estimation are carried out simultaneously. We discuss its characteristics and asymptotic properties and present an efficient computational algorithm for optimizing PEL. Simulation studies show that when the model assumptions hold, its performance is comparable to that of existing methods, and when the model is misspecified, our method has clear advantages over them. We have applied the method to two case examples.
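
The abstract names SCAD as one of the penalties used in penalized likelihood methods. As an illustration only, the sketch below evaluates the standard SCAD penalty of Fan and Li (2001) for a coefficient vector. The abstract does not say which penalty the paper's PEL uses, so pairing SCAD with PEL here is an assumption. The function names `scad_penalty` and `total_penalty`, the tuning parameter `lam`, and the default `a = 3.7` are likewise illustrative choices, not taken from the paper.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient (assumes lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as LASSO.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def total_penalty(beta, lam, a=3.7):
    """Sum of SCAD penalties over all coefficients in beta."""
    return sum(scad_penalty(b, lam, a) for b in beta)


if __name__ == "__main__":
    # Hypothetical coefficient vector, chosen only to hit all three regions.
    beta = [0.0, 0.5, -2.0, 10.0]
    for b in beta:
        print(b, scad_penalty(b, lam=1.0))
    print("total:", total_penalty(beta, lam=1.0))
```

In a penalized criterion of this kind, a penalty term such as `total_penalty` (typically scaled by the sample size) is added to the negative log-likelihood, or here to the empirical log-likelihood ratio built from the GEE. Because the penalty is flat for large coefficients, strong effects are left nearly unshrunk while small ones are driven to zero.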