Bayesian inference for nonprobability samples with nonignorable missingness

IF 2.1, Zone 4 (Mathematics), JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE), Statistical Analysis and Data Mining, Pub Date: 2024-02-21, DOI: 10.1002/sam.11667
Zhan Liu, Xuesong Chen, Ruohan Li, Lanbao Hou

Abstract

Nonprobability samples, especially web survey data, are now available in many fields. However, such samples suffer from selection bias, which leads to biased estimates. Missingness, especially nonignorable missingness, may also occur in nonprobability samples, so making inference from nonprobability samples with nonignorable missingness is challenging. In this article, we propose a Bayesian approach for making inference about the population from nonprobability samples with nonignorable missingness. In our method, separate logistic regression models are used to estimate the selection probabilities and the response probabilities, and a superpopulation model describes the relationship between the study variable and the covariates. Bayesian and approximate Bayesian methods are proposed to estimate the response model parameters and the superpopulation model parameters, respectively. Specifically, the estimating functions for the response model parameters and the superpopulation model parameters are used to derive the approximate posterior distribution in superpopulation model estimation. Simulation studies are conducted to investigate the finite-sample performance of the proposed method. Data from the Pew Research Center and the Behavioral Risk Factor Surveillance System are used to show that the proposed method performs better than other approaches.
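To make the selection-bias part of the abstract concrete, the following is a minimal sketch and not the authors' Bayesian procedure. It simulates a finite population, draws a nonprobability sample whose inclusion depends on a covariate, fits a logistic regression for the selection probabilities, and compares the naive sample mean with an inverse-probability-weighted mean. Everything in it is an assumption made for illustration: the simulated data, the coefficient values, the function names, and in particular the simplification that the selection indicator and covariate are observed for the whole population (in practice a reference probability sample would typically be needed). Nonresponse and the superpopulation model are left out entirely.

```python
import numpy as np


def expit(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, s, n_iter=25):
    """Fit a logistic regression of s on X by Newton-Raphson (plain maximum likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (s - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate_and_estimate(seed=0, N=20000):
    """Return (population mean, naive sample mean, weighted sample mean)."""
    rng = np.random.default_rng(seed)

    # Hypothetical finite population: the study variable y depends on covariate x.
    x = rng.normal(size=N)
    y = 1.0 + 2.0 * x + rng.normal(size=N)

    # Hypothetical selection mechanism: units with larger x are more likely to join.
    s = rng.random(N) < expit(-2.0 + 1.0 * x)

    # Selection model: logistic regression of the inclusion indicator on x.
    # Assumption: s and x are known for every population unit.
    X = np.column_stack([np.ones(N), x])
    beta_hat = fit_logistic(X, s.astype(float))

    # Inverse-probability weights for the sampled units, then a ratio (Hajek-type) mean.
    w = 1.0 / expit(X[s] @ beta_hat)
    naive_mean = y[s].mean()
    ipw_mean = np.sum(w * y[s]) / np.sum(w)
    return y.mean(), naive_mean, ipw_mean


pop_mean, naive_mean, ipw_mean = simulate_and_estimate()
print(f"population mean: {pop_mean:.3f}")
print(f"naive sample mean: {naive_mean:.3f}")
print(f"weighted sample mean: {ipw_mean:.3f}")
```

Because selection favors units with large x, and y increases with x, the naive mean overshoots the population mean, while the weighted mean moves back toward it. The abstract's method additionally handles nonignorable missingness through a separate response model and estimates parameters in a Bayesian framework, which this sketch does not attempt.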
Source journal
Statistical Analysis and Data Mining
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
CiteScore: 3.20
Self-citation rate: 7.70%
Articles published: 43
Journal description: Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques and discuss their application to real problems in a way that is accessible and beneficial to domain experts across science, engineering, and commerce. The journal focuses on papers that satisfy one or more of the following criteria: (1) solve data analysis problems associated with massive, complex datasets; (2) develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines, e.g., statistics, computer science, electrical engineering, and operations research; (3) formulate and solve high-impact real-world problems that challenge existing paradigms via new statistical and/or computational models; (4) provide surveys of prominent research topics.