Analysis With Missing Data in Prevention Research

J. Graham, S. Hofer, A. Piccinin
NIDA Research Monograph, volume 142 1, pages 13-63. Published 1997-01-01. DOI: 10.1037/10222-010 (https://doi.org/10.1037/10222-010)
Citations: 311

Abstract

Missing data problems have been a thorn in the side of prevention researchers for years. Although some solutions to these problems have been available in the statistical literature, they have not found their way into mainstream prevention research. This chapter is meant to serve as an introduction to the systematic application of the missing data analysis solutions presented recently by Little and Rubin (1987) and others. The chapter does not describe a complete strategy, but it is relevant for (1) missing data analysis with continuous (but not categorical) data; (2) data that are reasonably normally distributed; and (3) solutions to missing data problems for analyses related to the general linear model, in particular analyses that use (or can use) a covariance matrix as input. The examples in the chapter come from drug prevention research. The chapter discusses (1) the problem of wanting to ask respondents more questions than most individuals can answer; (2) the problem of attrition and some solutions; and (3) the problem of special measurement procedures that are too expensive or time-consuming to obtain for all subjects.
The authors end with several conclusions: (1) whenever possible, researchers should use the Expectation-Maximization (EM) algorithm (or another maximum likelihood procedure, including the multiple-group structural equation modeling procedure, or, where appropriate, multiple imputation) for analyses involving missing data (the chapter provides concrete examples); (2) if researchers must use other analyses, they should keep in mind that these produce biased results and should not be relied upon for final analyses; (3) when data are missing, the appropriate missing data analysis procedures do not generate something out of nothing but do make the most of the data available; (4) when data are missing, researchers should work hard (especially when planning a study) to find the cause of missingness and include that cause in the analysis models; and (5) researchers should sample the cases originally missing (whenever possible) and adjust the EM algorithm parameter estimates accordingly.
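As a rough illustration of the kind of EM estimation the abstract recommends, the sketch below estimates the means, variances, and covariance of two variables when some values of the second are missing. It is not taken from the chapter: the function name em_bivariate, its arguments, and the simplifying assumptions (bivariate normal data, x fully observed, y missing at random given x, missing values coded as None) are all hypothetical choices made for this example.

```python
def em_bivariate(xs, ys, n_iter=500, tol=1e-10):
    """EM estimates of means and covariance for (x, y) when some y are None.

    Illustrative assumptions: bivariate normal data, x fully observed,
    y missing at random given x. Returns (mean_x, mean_y, var_x, var_y,
    cov_xy) using maximum likelihood (divide-by-n) variances.
    """
    n = len(xs)
    # x is complete, so its ML mean and variance are computed directly.
    mx = sum(xs) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    # Starting values for y come from the available cases.
    obs = [y for y in ys if y is not None]
    my = sum(obs) / len(obs)
    vy = sum((y - my) ** 2 for y in obs) / len(obs)
    cxy = 0.0
    for _ in range(n_iter):
        slope = cxy / vx
        resid_var = vy - slope * cxy  # Var(y | x) under current estimates
        s_y = s_yy = s_xy = 0.0
        for x, y in zip(xs, ys):
            if y is None:
                # E-step: expected y and expected y^2 given the observed x.
                ey = my + slope * (x - mx)
                eyy = ey * ey + resid_var
            else:
                ey, eyy = y, y * y
            s_y += ey
            s_yy += eyy
            s_xy += x * ey
        # M-step: update the parameters from the expected sufficient statistics.
        new_my = s_y / n
        new_vy = s_yy / n - new_my ** 2
        new_cxy = s_xy / n - mx * new_my
        change = max(abs(new_my - my), abs(new_vy - vy), abs(new_cxy - cxy))
        my, vy, cxy = new_my, new_vy, new_cxy
        if change < tol:
            break
    return mx, my, vx, vy, cxy


if __name__ == "__main__":
    # Hypothetical data: the last two respondents did not provide y.
    x_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y_values = [1.1, 1.9, 3.2, 3.8, None, None]
    print(em_bivariate(x_values, y_values))
```

The resulting variances and covariance form the kind of covariance matrix that the abstract describes as input for analyses related to the general linear model. Note that the E-step adds the residual variance to the expected y squared; filling in predicted values alone would understate the variance of y.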