How to Classify, Detect, and Manage Univariate and Multivariate Outliers, With Emphasis on Pre-Registration

International Review of Social Psychology. Pub Date: 2019-04-30. DOI: 10.5334/IRSP.289. Impact factor: 2. Zone 4 (Psychology). JCR Q3, PSYCHOLOGY, SOCIAL.
C. Leys, Marie Delacre, Youri L. Mora, D. Lakens, Christophe Ley
Citations: 133

Abstract

Researchers often lack knowledge about how to deal with outliers when analyzing their data. Even more frequently, researchers do not pre-specify how they plan to manage outliers. In this paper we aim to improve research practices by outlining what you need to know about outliers. We start by providing a functional definition of outliers. We then lay down an appropriate nomenclature/classification of outliers. This nomenclature is used to understand what kinds of outliers can be encountered and serves as a guideline to make appropriate decisions regarding the conservation, deletion, or recoding of outliers. These decisions might impact the validity of statistical inferences as well as the reproducibility of our experiments. To be able to make informed decisions about outliers you first need proper detection tools. We remind readers why the most common outlier detection methods are problematic and recommend the use of the median absolute deviation to detect univariate outliers, and of the Mahalanobis-MCD distance to detect multivariate outliers. An R package was created that can be used to easily perform these detection tests. Finally, we promote the use of pre-registration to avoid flexibility in data analysis when handling outliers.
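The abstract recommends the median absolute deviation (MAD) for detecting univariate outliers. A minimal Python sketch of that idea follows. It is not the authors' R package: the function name `mad_outliers`, the cutoff of 3, and the consistency constant 1.4826 (which makes the MAD comparable to a standard deviation under normality) are assumptions chosen for illustration, and the cutoff in particular should be fixed in advance rather than tuned to the data.

```python
from statistics import median


def mad_outliers(values, threshold=3.0, scale=1.4826):
    """Return the values lying more than `threshold` scaled MADs from the median.

    `threshold` and `scale` are illustrative defaults (assumptions), not
    values taken from the abstract above.
    """
    values = list(values)
    center = median(values)
    # Scaled median of the absolute deviations from the median.
    mad = scale * median([abs(v - center) for v in values])
    if mad == 0:
        # More than half the values equal the median; the rule is undefined,
        # so this sketch flags nothing.
        return []
    lower = center - threshold * mad
    upper = center + threshold * mad
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(mad_outliers(scores))
```

Because both the centre and the spread are medians, a single extreme value such as 100 barely moves the cutoffs, which is the reason this rule is preferred over rules based on the mean and standard deviation.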
Source journal
CiteScore: 5.20
Self-citation rate: 8.00%
Articles published: 7
Review time: 16 weeks
About the journal: The International Review of Social Psychology (IRSP) is supported by the Association pour la Diffusion de la Recherche Internationale en Psychologie Sociale (A.D.R.I.P.S.). It publishes empirical research and theoretical notes in all areas of social psychology. Articles are preferably written in English but can also be written in French. The journal was created to reflect research advances in a field where theoretical and fundamental questions inevitably carry social significance and implications. It emphasizes the scientific quality of its publications in every area of social psychology. Any kind of research can be considered, as long as the results significantly enhance the understanding of a general social psychological phenomenon and the methodology is appropriate.