Modeling Anomalies Prevalent in Sensor Network Deployments: A Representative Ground Truth
Authors: Giovani Rimon Abuaitah, Bin Wang
Venue: 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems
Published: 2013-08-14
DOI: 10.1109/MASCOTS.2013.57 (https://doi.org/10.1109/MASCOTS.2013.57)
Citations: 1
Abstract
The performance of anomaly detection algorithms is usually measured using the total residual error, a metric calculated by comparing the labels assigned by the detection algorithm against a reference ground truth. Obtaining a highly expressive ground truth is itself a challenging task, if not an infeasible one. Often, a dataset is labeled manually by domain experts, but manual labeling is error-prone. In real-world sensor network deployments, labeling a sensor dataset is even more difficult because of the large number of samples, the complexity of visualizing the data, and the uncertainty about whether anomalies exist at all. This paper proposes an automated technique that uses highly representative anomaly models for labeling. We demonstrate its effectiveness by evaluating a classification algorithm with our designed anomaly models as ground truth, and show that the classification accuracy is similar to that obtained with manually labeled real-world data points.
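The general idea of model-based labeling can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the anomaly models, the classifier, or the exact form of the error metric. Here a single hypothetical "spike" model is injected into clean readings so that the labels are known by construction, a toy threshold detector stands in for the classification algorithm, and the error is taken to be the fraction of mislabeled samples.

```python
import random


def inject_anomalies(clean, spike_rate=0.05, spike_size=10.0, seed=0):
    """Return (readings, labels): a copy of `clean` with synthetic spikes.

    labels[i] is 1 where a spike was injected and 0 elsewhere. The spike
    model is a hypothetical stand-in, not the paper's anomaly models.
    """
    rng = random.Random(seed)
    readings, labels = [], []
    for value in clean:
        if rng.random() < spike_rate:
            # Shift the reading up or down by a fixed amount and mark it.
            readings.append(value + rng.choice([-1, 1]) * spike_size)
            labels.append(1)
        else:
            readings.append(value)
            labels.append(0)
    return readings, labels


def threshold_detector(readings, baseline, tolerance):
    """Toy detector (assumed): flag readings far from a known baseline."""
    return [1 if abs(r - baseline) > tolerance else 0 for r in readings]


def error_rate(predicted, truth):
    """Fraction of samples whose predicted label differs from ground truth."""
    if not truth or len(predicted) != len(truth):
        raise ValueError("label sequences must be non-empty and equal length")
    mismatches = sum(p != t for p, t in zip(predicted, truth))
    return mismatches / len(truth)


# Constant "clean" signal, so every injected spike is trivially separable.
clean = [20.0] * 200
readings, truth = inject_anomalies(clean)
predicted = threshold_detector(readings, baseline=20.0, tolerance=5.0)
print(error_rate(predicted, truth))
```

Because the labels come from the injection step rather than from a human annotator, the ground truth is exact for the synthetic anomalies. Whether such labels are representative of real deployments depends entirely on how realistic the anomaly models are, which is the question the paper addresses.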