Diagnosing and Handling Common Violations of Missing at Random.

Psychometrika (IF 2.9; Zone 2, Psychology; JCR Q1, Mathematics, Interdisciplinary Applications). Pub Date: 2023-12-01. Epub Date: 2023-01-04. DOI: 10.1007/s11336-022-09896-0
Feng Ji, Sophia Rabe-Hesketh, Anders Skrondal
Journal article. Pages 1123-1143. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10656344/pdf/

Abstract

Ignorable likelihood (IL) approaches are often used to handle missing data when estimating a multivariate model, such as a structural equation model. In this case, the likelihood is based on all available data, and no model is specified for the missing data mechanism. Inference proceeds via maximum likelihood or Bayesian methods, including multiple imputation without auxiliary variables. Such IL approaches are valid under a missing at random (MAR) assumption. Rabe-Hesketh and Skrondal ("Ignoring non-ignorable missingness", Presidential Address at the International Meeting of the Psychometric Society, Beijing, China, 2015; Psychometrika, 2023) consider a violation of MAR in which a variable A can affect missingness of another variable B even when A is not observed. They show that this case can be handled by discarding more data before proceeding with IL approaches. This data-deletion approach is similar to the sequential estimation of Mohan et al. (in: Advances in Neural Information Processing Systems, 2013), which is based on their ordered factorization theorem, but is preferable for parametric models. Which kind of data deletion or ordered factorization to employ depends on the nature of the MAR violation. In this article, we therefore propose two diagnostic tests: a likelihood-ratio test for a heteroscedastic regression model and a kernel conditional independence test. We also develop a test-based estimator that first uses diagnostic tests to determine which MAR violation appears to be present and then proceeds with the corresponding data-deletion estimator. Simulations show that the test-based estimator outperforms IL when the missing data problem is severe and performs similarly otherwise.
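The data-deletion idea in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the article: a bivariate normal model, A missing completely at random, and missingness of B depending on A through a logistic function. The variable names, sample size, and coefficients are hypothetical, and the code is not the authors' implementation. It discards B wherever A is missing, which leaves a monotone missingness pattern, and then estimates the mean of B by the standard factored maximum likelihood for that pattern (regress B on A in complete cases, then average the fitted line over all observed A). The comparison is with the simple mean of observed B, not with a full IL fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating model: A ~ N(0, 1), B = A + noise, so E[B] = 0.
a = rng.normal(size=n)
b = a + rng.normal(size=n)

# Assumed missingness mechanisms (illustrative only):
# A is missing completely at random with probability 0.3;
# B is more likely to be observed when A is large, whether or not A is observed.
a_obs = rng.random(n) > 0.3
b_obs = rng.random(n) < 1.0 / (1.0 + np.exp(-1.5 * a))

# Simple mean of every observed B value (ignores the missingness mechanism).
naive_mean_b = b[b_obs].mean()

# Data deletion: discard B wherever A is missing, leaving a monotone pattern
# in which B is only ever used together with an observed A.
b_kept = b_obs & a_obs

# Factored ML for the monotone bivariate normal pattern:
# fit B on A in complete cases, then average over all observed A.
slope, intercept = np.polyfit(a[b_kept], b[b_kept], deg=1)
deletion_mean_b = intercept + slope * a[a_obs].mean()

print(f"true mean of B:          0.000")
print(f"mean of observed B:      {naive_mean_b:.3f}")
print(f"data-deletion estimate:  {deletion_mean_b:.3f}")
```

In this setup the mean of observed B is pulled upward because units with large A are over-represented, while the estimate after deletion stays close to the true value of zero. The price is that some observed B values are thrown away, which is why the choice of what to delete matters.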

Source journal
Psychometrika (Mathematics, Interdisciplinary Applications)
CiteScore: 4.40
Self-citation rate: 10.00%
Articles published: 72
Review time: >12 weeks
About the journal: The journal Psychometrika is devoted to the advancement of theory and methodology for behavioral data in psychology, education and the social and behavioral sciences generally. Its coverage is offered in two sections: Theory and Methods (T&M), and Application Reviews and Case Studies (ARCS). T&M articles present original research and reviews on the development of quantitative models, statistical methods, and mathematical techniques for evaluating data from psychology, the social and behavioral sciences and related fields. Application Reviews can be integrative, drawing together disparate methodologies for applications, or comparative and evaluative, discussing advantages and disadvantages of one or more methodologies in applications. Case Studies highlight methodology that deepens understanding of substantive phenomena through more informative data analysis, or more elegant data description.