{"title":"Model averaging for generalized linear models in fragmentary data prediction","authors":"Chao-Qun Yuan, Yang Wu, Fang Fang","doi":"10.1080/24754269.2022.2105486","DOIUrl":null,"url":null,"abstract":"ABSTRACT Fragmentary data is becoming more and more popular in many areas which brings big challenges to researchers and data analysts. Most existing methods dealing with fragmentary data consider a continuous response while in many applications the response variable is discrete. In this paper, we propose a model averaging method for generalized linear models in fragmentary data prediction. The candidate models are fitted based on different combinations of covariate availability and sample size. The optimal weight is selected by minimizing the Kullback–Leibler loss in the completed cases and its asymptotic optimality is established. Empirical evidences from a simulation study and a real data analysis about Alzheimer disease are presented.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"344 - 352"},"PeriodicalIF":0.7000,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Theory and Related Fields","FirstCategoryId":"96","ListUrlMain":"https://doi.org/10.1080/24754269.2022.2105486","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
Citations: 2
Abstract
Fragmentary data are increasingly common in many areas, which poses major challenges for researchers and data analysts. Most existing methods for fragmentary data assume a continuous response, whereas in many applications the response variable is discrete. In this paper, we propose a model averaging method for generalized linear models in fragmentary data prediction. The candidate models are fitted on different combinations of covariate availability and sample size. The optimal weights are selected by minimizing the Kullback–Leibler loss on the complete cases, and their asymptotic optimality is established. Empirical evidence from a simulation study and a real-data analysis of Alzheimer's disease is presented.
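The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the response is binary with a logistic link, the three availability patterns are hypothetical, the candidate models are combined by averaging their linear predictors, and the average negative Bernoulli log-likelihood on the complete cases stands in for the Kullback–Leibler criterion. The helper names (`fit_logistic`, `fit_weights`, `kl_loss`) are invented here, and exponentiated-gradient updates are simply one convenient way to keep the weights on the simplex.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    # Stable evaluation of log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def fit_logistic(X, y, cols, steps=200, lr=0.5):
    # Plain gradient ascent on the Bernoulli log-likelihood (intercept + cols).
    beta = [0.0] * (len(cols) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, yi in zip(X, y):
            feats = [1.0] + [x[c] for c in cols]
            err = yi - sigmoid(sum(b * f for b, f in zip(beta, feats)))
            for j, f in enumerate(feats):
                grad[j] += err * f
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

def eta(beta, x, cols):
    # Linear predictor of one candidate model for one subject.
    return beta[0] + sum(b * x[c] for b, c in zip(beta[1:], cols))

def kl_loss(w, etas, y):
    # Average negative Bernoulli log-likelihood of the weighted linear predictor.
    total = 0.0
    for row, yi in zip(etas, y):
        e = sum(wk * ek for wk, ek in zip(w, row))
        total += softplus(e) - yi * e
    return total / len(y)

def fit_weights(etas, y, steps=300, lr=0.5):
    # Exponentiated-gradient updates keep w on the simplex; keep the best iterate.
    K = len(etas[0])
    w = [1.0 / K] * K
    best_w, best_loss = w[:], kl_loss(w, etas, y)
    for _ in range(steps):
        grad = [0.0] * K
        for row, yi in zip(etas, y):
            r = sigmoid(sum(wk * ek for wk, ek in zip(w, row))) - yi
            for k in range(K):
                grad[k] += r * row[k] / len(y)
        w = [wk * math.exp(-lr * g) for wk, g in zip(w, grad)]
        total = sum(w)
        w = [wk / total for wk in w]
        loss = kl_loss(w, etas, y)
        if loss < best_loss:
            best_w, best_loss = w[:], loss
    return best_w, best_loss

# Simulated fragmentary data: each subject observes one hypothetical pattern.
random.seed(0)
true_beta = [-0.5, 1.0, -1.0, 0.5]
patterns = [[0], [0, 1], [0, 1, 2]]
X, y = [], []
for i in range(300):
    x = [random.gauss(0, 1) for _ in range(3)]
    p = sigmoid(true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], x)))
    y.append(1 if random.random() < p else 0)
    avail = patterns[i % 3]
    X.append([v if j in avail else None for j, v in enumerate(x)])

# Each candidate model is fitted on every subject that observes its covariates.
betas = []
for cols in patterns:
    idx = [i for i, x in enumerate(X) if all(x[c] is not None for c in cols)]
    betas.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx], cols))

# Weights are chosen on the complete cases only.
cc = [i for i, x in enumerate(X) if None not in x]
etas = [[eta(b, X[i], cols) for b, cols in zip(betas, patterns)] for i in cc]
y_cc = [y[i] for i in cc]
weights, loss = fit_weights(etas, y_cc)
uniform_loss = kl_loss([1.0 / 3] * 3, etas, y_cc)
print("weights:", weights)
print("loss at fitted weights:", loss, "loss at equal weights:", uniform_loss)
```

In this toy setup, the candidate model using only the first covariate is fitted on all 300 subjects, while the full model is fitted on the 100 complete cases, which mirrors the trade-off between covariate availability and sample size mentioned in the abstract. The weights are then tuned on the complete cases alone, since only those subjects allow every candidate model to be evaluated.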