{"title":"重新审视流式异常检测:基准和评估","authors":"Yang Cao, Yixiao Ma, Ye Zhu, Kai Ming Ting","doi":"10.1007/s10462-024-10995-w","DOIUrl":null,"url":null,"abstract":"<div><p>Anomaly detection in streaming data is an important task for many real-world applications, such as network security, fraud detection, and system monitoring. However, streaming data often exhibit concept drift, which means that the data distribution changes over time. This poses a significant challenge for many anomaly detection algorithms, as they need to adapt to the evolving data to maintain high detection accuracy. Existing streaming anomaly detection algorithms lack a unified evaluation framework that validly assesses their performance and robustness under different types of concept drifts and anomalies. In this paper, we conduct a systematic technical review of the state-of-the-art methods for anomaly detection in streaming data. We propose a new data generator, called <i>SCAR</i> (<b>S</b>treaming data generator with <b>C</b>ustomizable <b>A</b>nomalies and concept d<b>R</b>ifts), that can synthesize streaming data based on synthetic and real-world datasets from different domains. Furthermore, we adapt four static anomaly detection models to the streaming setting using a generic reconstruction strategy as baselines, and then compare them systematically with 9 existing streaming anomaly detection algorithms on 76 synthesized datasets that have various types of anomalies and concept drifts. The challenges and future research directions for anomaly detection in streaming data are also presented.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 1","pages":""},"PeriodicalIF":10.7000,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-10995-w.pdf","citationCount":"0","resultStr":"{\"title\":\"Revisiting streaming anomaly detection: benchmark and evaluation\",\"authors\":\"Yang Cao, Yixiao Ma, Ye Zhu, Kai Ming Ting\",\"doi\":\"10.1007/s10462-024-10995-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Anomaly detection in streaming data is an important task for many real-world applications, such as network security, fraud detection, and system monitoring. However, streaming data often exhibit concept drift, which means that the data distribution changes over time. This poses a significant challenge for many anomaly detection algorithms, as they need to adapt to the evolving data to maintain high detection accuracy. Existing streaming anomaly detection algorithms lack a unified evaluation framework that validly assesses their performance and robustness under different types of concept drifts and anomalies. In this paper, we conduct a systematic technical review of the state-of-the-art methods for anomaly detection in streaming data. We propose a new data generator, called <i>SCAR</i> (<b>S</b>treaming data generator with <b>C</b>ustomizable <b>A</b>nomalies and concept d<b>R</b>ifts), that can synthesize streaming data based on synthetic and real-world datasets from different domains. Furthermore, we adapt four static anomaly detection models to the streaming setting using a generic reconstruction strategy as baselines, and then compare them systematically with 9 existing streaming anomaly detection algorithms on 76 synthesized datasets that have various types of anomalies and concept drifts. The challenges and future research directions for anomaly detection in streaming data are also presented.</p></div>\",\"PeriodicalId\":8449,\"journal\":{\"name\":\"Artificial Intelligence Review\",\"volume\":\"58 1\",\"pages\":\"\"},\"PeriodicalIF\":10.7000,\"publicationDate\":\"2024-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://link.springer.com/content/pdf/10.1007/s10462-024-10995-w.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence Review\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10462-024-10995-w\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-024-10995-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Revisiting streaming anomaly detection: benchmark and evaluation
Anomaly detection in streaming data is an important task for many real-world applications, such as network security, fraud detection, and system monitoring. However, streaming data often exhibit concept drift, which means that the data distribution changes over time. This poses a significant challenge for many anomaly detection algorithms, as they need to adapt to the evolving data to maintain high detection accuracy. Existing streaming anomaly detection algorithms lack a unified evaluation framework that validly assesses their performance and robustness under different types of concept drifts and anomalies. In this paper, we conduct a systematic technical review of the state-of-the-art methods for anomaly detection in streaming data. We propose a new data generator, called SCAR (Streaming data generator with Customizable Anomalies and concept dRifts), that can synthesize streaming data based on synthetic and real-world datasets from different domains. Furthermore, we adapt four static anomaly detection models to the streaming setting using a generic reconstruction strategy as baselines, and then compare them systematically with 9 existing streaming anomaly detection algorithms on 76 synthesized datasets that have various types of anomalies and concept drifts. The challenges and future research directions for anomaly detection in streaming data are also presented.
期刊介绍:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.