Title: A User Behavior Anomaly Detection Approach Based on Sequence Mining over Data Streams
Authors: Yong Zhou, Yijie Wang, Xingkong Ma
Venue: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Publication date: 2016-12-01
DOI: 10.1109/PDCAT.2016.086 (https://doi.org/10.1109/PDCAT.2016.086)
Citations: 3
Abstract
Designing a low-latency, accurate approach to user behavior anomaly detection over data streams has become a major challenge. Existing studies cannot meet both the latency and the accuracy requirements because of the large number of subsequences and the sequential relationships among behaviors. This paper presents BADSM, a user behavior anomaly detection approach based on sequence mining over data streams that seeks to address this challenge. BADSM uses a self-adaptive behavior pruning algorithm to divide the data stream into behaviors adaptively and to reduce the number of subsequences, which improves the efficiency of sequence mining. In addition, a top-k abnormal scoring algorithm reduces the complexity of traversal and produces a quantitative detection result, which improves accuracy. We design and implement a streaming anomaly detection system based on BADSM to perform online detection. Extensive experiments confirm that, compared with the classic sequence mining approach PrefixSpan, BADSM reduces processing delay by at least 36.8% and the false positive rate by 6.4%.
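The abstract names two steps, dividing the stream into behaviors and ranking them by an abnormality score, without giving their details. The sketch below is only a rough illustration of that general pipeline and is not the BADSM algorithm: it assumes a fixed time-gap threshold in place of the paper's self-adaptive pruning, and a simple rarity score in place of the paper's top-k abnormal scoring. The names `split_into_behaviors`, `top_k_abnormal`, `max_gap`, and `history` are all hypothetical.

```python
import heapq
from collections import Counter
from typing import Iterable, List, Tuple

Event = Tuple[float, str]      # (timestamp in seconds, action label)
Behavior = Tuple[str, ...]     # ordered actions forming one behavior


def split_into_behaviors(events: Iterable[Event], max_gap: float = 30.0) -> List[Behavior]:
    """Cut a time-ordered event stream wherever two events are more than max_gap apart.

    Assumption: a fixed gap threshold stands in for the paper's
    self-adaptive pruning, whose details the abstract does not give.
    """
    behaviors: List[Behavior] = []
    current: List[str] = []
    last_ts = None
    for ts, action in events:
        if last_ts is not None and ts - last_ts > max_gap and current:
            behaviors.append(tuple(current))
            current = []
        current.append(action)
        last_ts = ts
    if current:
        behaviors.append(tuple(current))
    return behaviors


def top_k_abnormal(behaviors: Iterable[Behavior], history: Counter, k: int = 3) -> List[Tuple[float, Behavior]]:
    """Return the k behaviors with the highest rarity score.

    Assumption: the score is 1 - (relative frequency in history), so a
    behavior never seen before scores 1.0. This is a placeholder metric,
    not the scoring function used in the paper.
    """
    total = sum(history.values())
    scored = []
    for behavior in behaviors:
        score = 1.0 - history[behavior] / total if total else 1.0
        scored.append((score, behavior))
    # heapq.nlargest avoids sorting the full list when only k items are needed.
    return heapq.nlargest(k, scored, key=lambda item: item[0])


if __name__ == "__main__":
    stream = [(0, "login"), (1, "read"), (2, "logout"),
              (100, "login"), (101, "delete"), (102, "delete")]
    seen = Counter({("login", "read", "logout"): 9,
                    ("login", "delete", "delete"): 1})
    for score, behavior in top_k_abnormal(split_into_behaviors(stream), seen, k=1):
        print(round(score, 3), behavior)
```

Returning a numeric score rather than a yes/no label mirrors the "quantitative detection result" the abstract mentions; how BADSM actually computes and thresholds that score is not stated in this record.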