A method to simulate multivariate outliers with known Mahalanobis distances for normal and non-normal data

Oscar L. Olvera Astivia
Methods in Psychology (Online), volume 11, Article 100157. Published 2024-07-30.
DOI: 10.1016/j.metip.2024.100157
URL: https://www.sciencedirect.com/science/article/pii/S2590260124000237

Abstract

Monte Carlo simulations and theoretical analyses have repeatedly demonstrated the impact of outliers on statistical analysis. Most simulation studies generate outliers using one of two general approaches: by multiplying an arbitrary point by a constant or through a finite mixture. The latter can be extended to multivariate settings by defining the Mahalanobis distance between the centroids of two clusters of points. Nevertheless, when researchers aim to simulate individual data points with population-level Mahalanobis distances, the number of available procedures is very limited. This article generalizes one of the few existing methods to simulate an arbitrary number of outliers in an arbitrary number of dimensions, for both multivariate normal and non-normal data. A small simulation demonstration showcases how this methodology enables new simulation designs that were either unpopular or not possible due to the lack of a data-generating algorithm. A discussion of potential implications highlights the importance of considering multivariate outliers in simulation settings.
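To make the idea of an individual point with a known population-level Mahalanobis distance concrete, the following is a minimal sketch and not the article's algorithm. It relies on a standard identity: if Sigma = L L' (Cholesky factor) and u is a unit vector, then x = mu + d L u lies at Mahalanobis distance exactly d from mu. The function names (`cholesky`, `point_at_distance`, `mahalanobis`) and the example `mu` and `sigma` are illustrative assumptions, and the sketch covers only the case where the population mean and covariance are taken as given.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular L with L L' = sigma (sigma must be positive definite)."""
    p = len(sigma)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def point_at_distance(mu, sigma, d, rng):
    """Return one point whose population Mahalanobis distance from mu is d."""
    p = len(mu)
    L = cholesky(sigma)
    # Random direction: scale a standard normal vector to unit length.
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(v * v for v in z))
    u = [v / norm for v in z]
    # x = mu + d * L u, so (x - mu)' Sigma^{-1} (x - mu) = d^2 * u'u = d^2.
    return [mu[i] + d * sum(L[i][k] * u[k] for k in range(i + 1)) for i in range(p)]


def mahalanobis(x, mu, sigma):
    """Population Mahalanobis distance via forward substitution on L."""
    L = cholesky(sigma)
    diff = [a - b for a, b in zip(x, mu)]
    w = []
    for i in range(len(diff)):
        # Solve L w = diff one row at a time; the squared distance is w'w.
        w.append((diff[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i])
    return math.sqrt(sum(v * v for v in w))


# Hypothetical population parameters, chosen only for illustration.
rng = random.Random(1)
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.5, 0.3],
         [0.5, 1.0, 0.4],
         [0.3, 0.4, 1.0]]

outlier = point_at_distance(mu, sigma, 5.0, rng)
print(outlier, mahalanobis(outlier, mu, sigma))
```

The printed distance should recover the requested value of 5 up to floating-point error. The direction `u` is drawn at random here; fixing it instead would place the outlier along a chosen axis of the covariance structure.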
