Comparing Outlier Detection Methods using Boxplot Generalized Extreme Studentized Deviate and Sequential Fences
A. Fitrianto, W. Z. A. Wan Muhamad, Suliana Kriswan, B. Susetyo
Aceh International Journal of Science and Technology. Published 2022-04-24. DOI: https://doi.org/10.13170/aijst.11.1.23809
Citations: 3
Abstract
Outlier identification is essential in data analysis because undetected outliers can lead to incorrect statistical inference. This study aimed to compare the performance of the Boxplot, Generalized Extreme Studentized Deviate (Generalized ESD), and Sequential Fences methods in identifying outliers. A published dataset was used in the study. Preliminary outlier identification showed that the data did not contain outliers. The performance of each outlier detection method was therefore evaluated by contaminating the original data with a few outliers. The contamination was carried out by replacing the two smallest and largest observations with outliers. The analysis was conducted using SAS version 9.2 for both the original and the contaminated data. We found that the Sequential Fences method performed better at identifying outliers than the Boxplot and Generalized ESD methods.
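The contamination procedure and the Boxplot rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SAS 9.2 code. The sample values, the injected outliers, the conventional fence multiplier of 1.5, the quartile interpolation method, and the reading of the contamination as two replacements at each end are all assumptions made for illustration. The helper names `boxplot_outliers` and `contaminate` are hypothetical.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional boxplot multiplier (assumed here).
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def contaminate(values, low_outliers, high_outliers):
    """Replace the smallest and largest observations with the given outliers.

    As many observations are dropped at each end as outliers are supplied,
    so the sample size stays the same.
    """
    ordered = sorted(values)
    core = ordered[len(low_outliers):len(ordered) - len(high_outliers)]
    return list(low_outliers) + core + list(high_outliers)


# Hypothetical sample, NOT the published dataset used in the study.
clean = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0,
         16.0, 17.0, 18.0, 19.0, 20.0, 21.0]

# Assumed reading: two low and two high observations are replaced.
dirty = contaminate(clean,
                    low_outliers=[-40.0, -35.0],
                    high_outliers=[70.0, 75.0])

print(boxplot_outliers(clean))  # → []
print(boxplot_outliers(dirty))  # → [-40.0, -35.0, 70.0, 75.0]
```

In this toy case the Boxplot rule flags nothing in the clean sample and recovers all four injected values, because the quartiles are unaffected by the replaced extremes. Generalized ESD and Sequential Fences are not sketched here, since both depend on critical values that the abstract does not specify.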