Full-semiparametric-likelihood-based inference for non-ignorable missing data

arXiv: Methodology | Pub Date: 2019-08-04 | DOI: 10.5705/ss.202019.0243

Yukun Liu, Pengfei Li, J. Qin


Citations: 3

Abstract


During the past few decades, missing-data problems have been studied extensively, with a focus on the ignorable missing case, where the missing probability depends only on observable quantities. By contrast, research into non-ignorable missing data problems is quite limited. The main difficulty in solving such problems is that the missing probability and the regression likelihood function are tangled together in the likelihood presentation, and the model parameters may not be identifiable even under strong parametric model assumptions. In this paper we discuss a semiparametric model for non-ignorable missing data and propose a maximum full semiparametric likelihood estimation method, which is an efficient combination of the parametric conditional likelihood and the marginal nonparametric biased sampling likelihood. The extra marginal likelihood contribution can not only produce efficiency gain but also identify the underlying model parameters without additional assumptions. We further show that the proposed estimators for the underlying parameters and the response mean are semiparametrically efficient. Extensive simulations and a real data analysis demonstrate the advantage of the proposed method over competing methods.
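To make the ignorable versus non-ignorable distinction concrete, here is a minimal simulation sketch. It is not the authors' estimator: the data-generating model, the logistic missingness forms, the coefficients, and the function name `simulate_missing_mean` are all illustrative assumptions that do not come from the abstract. The sketch only shows that a complete-case mean is biased in both settings, and that inverse-probability weighting corrects it when the observation probability is known.

```python
import numpy as np


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


def simulate_missing_mean(n=200_000, seed=0):
    """Compare response-mean estimates under two assumed missingness models.

    All modelling choices here are hypothetical and chosen for illustration.
    The true response mean is 1.0 by construction.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)              # covariate, always observed
    y = 1.0 + x + rng.normal(size=n)    # response, E[y] = 1.0

    # Ignorable case: the observation probability depends on x only.
    pi_ign = sigmoid(0.5 + x)
    d_ign = rng.random(n) < pi_ign

    # Non-ignorable case: the observation probability depends on y itself.
    pi_non = sigmoid(0.5 - y)
    d_non = rng.random(n) < pi_non

    def weighted_mean(d, w):
        # Normalised (Hajek-type) weighted mean over the observed responses.
        return np.sum(y[d] * w[d]) / np.sum(w[d])

    return {
        "complete_case_ignorable": y[d_ign].mean(),
        "ipw_ignorable": weighted_mean(d_ign, 1.0 / pi_ign),
        "complete_case_nonignorable": y[d_non].mean(),
        # "Oracle" weights: they use the true pi(x, y), unknown in practice.
        "oracle_ipw_nonignorable": weighted_mean(d_non, 1.0 / pi_non),
    }


if __name__ == "__main__":
    for name, value in simulate_missing_mean().items():
        print(f"{name}: {value:.3f}")
```

In the ignorable case the weights depend only on `x`, which is observed for every unit, so they could in principle be estimated from the data. In the non-ignorable case the oracle weights depend on `y`, which is unobserved exactly where it is needed. This is one way to read the identifiability difficulty the abstract describes.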
