V. Kreinovich, Praveen Patangay, L. Longpré, S. Starks, Cynthia Campos
Outlier detection under interval and fuzzy uncertainty: algorithmic solvability and computational complexity
22nd International Conference of the North American Fuzzy Information Processing Society, NAFIPS 2003. Published 2003-07-24. DOI: 10.1109/NAFIPS.2003.1226818
Citations: 15
Abstract
In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ is outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma, E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In real life, we often have only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we only have intervals of possible values for the bounds $E - k_0 \cdot \sigma$ and $E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems and provide efficient algorithms that solve some of these problems (under reasonable conditions). We also provide algorithms that estimate the degree of "outlier-ness" of a given value $x$, measured as the largest value $k_0$ for which $x$ is outside the corresponding $k_0$-sigma interval.
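The tests described in the abstract can be illustrated with a minimal Python sketch. All function names here are hypothetical, and using the sample standard deviation with the n - 1 denominator is an assumption, since the abstract does not fix the convention. The interval version is a brute-force check over all 2^n endpoint combinations, not one of the efficient algorithms the paper refers to. It relies on the fact that E + k0·σ is a convex function of the data and E - k0·σ is a concave one, so their extreme values over a box of intervals are attained at endpoint combinations.

```python
from itertools import product
from statistics import mean, stdev


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for exactly known sample values."""
    e = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation, n - 1 denominator
    return e - k0 * sigma, e + k0 * sigma


def is_outlier(x, values, k0):
    """Classical test: x is an outlier if it lies outside the k0-sigma interval."""
    lo, hi = k_sigma_interval(values, k0)
    return x < lo or x > hi


def outlier_degree(x, values):
    """Return |x - E| / sigma; x is outside the k0-sigma interval for every k0 below it."""
    return abs(x - mean(values)) / stdev(values)


def is_guaranteed_outlier(x, intervals, k0):
    """Interval test: x must lie outside every possible k0-sigma interval.

    Brute force over all 2^n endpoint combinations (illustration only,
    exponential time). intervals is a list of (lower, upper) pairs.
    """
    bounds = [k_sigma_interval(v, k0) for v in product(*intervals)]
    lows, highs = zip(*bounds)
    return x < min(lows) or x > max(highs)
```

For example, with values 1, 2, 3, 4, 5 and k0 = 2, the value 7 is an outlier for the exact data, but it is no longer a guaranteed outlier once each measurement is only known to within ±0.5, because some admissible data sets produce a 2-sigma interval that contains 7.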