Outlier detection under interval and fuzzy uncertainty: algorithmic solvability and computational complexity

V. Kreinovich, Praveen Patangay, L. Longpré, S. Starks, Cynthia Campos
Published in: 22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003
Publication date: 2003-07-24
DOI: 10.1109/NAFIPS.2003.1226818 (https://doi.org/10.1109/NAFIPS.2003.1226818)
Citations: 15

Abstract

In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ lies outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma, E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In real life, we often know only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we have only intervals of possible values for the bounds $E - k_0 \cdot \sigma$ and $E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems, and provide efficient algorithms that solve some of these problems (under reasonable conditions). We also provide algorithms that estimate the degree of "outlier-ness" of a given value $x$, measured as the largest value $k_0$ for which $x$ is outside the corresponding $k_0$-sigma interval.
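The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The function names are hypothetical, the standard deviation uses the population form (dividing by n), which is an assumption since the abstract does not say which form is meant, and the interval case is handled by brute force over all 2^n endpoint combinations. That enumeration is exact for the two extreme bounds because E + k0*sigma is convex and E - k0*sigma is concave in the data, so each attains its extreme over a box at a corner, but it takes exponential time and is practical only for small n.

```python
from itertools import product
from statistics import fmean, pstdev


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for point-valued data."""
    e = fmean(values)
    sigma = pstdev(values)  # assumption: population form, divides by n
    return e - k0 * sigma, e + k0 * sigma


def is_outlier(x, values, k0):
    """Classic test: x is an outlier if it lies outside the k0-sigma interval."""
    lower, upper = k_sigma_interval(values, k0)
    return x < lower or x > upper


def outlier_degree(x, values):
    """Supremum of the k0 for which x is outside the k0-sigma interval."""
    e = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0 if x == e else float("inf")
    return abs(x - e) / sigma


def guaranteed_outlier(x, intervals, k0):
    """True if x is outside the k0-sigma interval for every choice of
    x_i within its range. Brute force over all 2^n endpoint corners."""
    min_lower = float("inf")
    max_upper = float("-inf")
    for corner in product(*intervals):
        lower, upper = k_sigma_interval(corner, k0)
        min_lower = min(min_lower, lower)
        max_upper = max(max_upper, upper)
    return x < min_lower or x > max_upper


if __name__ == "__main__":
    points = [1, 2, 3, 4, 5]
    ranges = [(v - 0.1, v + 0.1) for v in points]
    print(is_outlier(5.9, points, 2))          # point-valued test
    print(guaranteed_outlier(5.9, ranges, 2))  # interval-valued test
    print(outlier_degree(10, points))
```

With the sample data, the value 5.9 is flagged by the point-valued test but is not a guaranteed outlier once each measurement is widened by 0.1, because some admissible data set produces a 2-sigma interval that contains it.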