Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction
Caihong Wang, Du Xu, Zonghang Li
arXiv - CS - Cryptography and Security, 2024-09-18
DOI: arxiv-2409.11890 (https://doi.org/arxiv-2409.11890)
Citations: 0
Abstract
In the era of rapid Internet development, log data has become indispensable
for recording the operations of computer devices and software. These data
provide valuable insights into system behavior and necessitate thorough
analysis. Recent advances in text analysis have enabled deep learning to
achieve significant breakthroughs in log anomaly detection. However, the high
cost of manual annotation and the dynamic nature of usage scenarios present
major challenges to effective log analysis. This study proposes a novel log
feature extraction model called DualGCN-LogAE, designed to adapt to various
scenarios. It leverages the expressive power of large models for log content
analysis and the capability of graph structures to encapsulate correlations
between logs. It retains key log information while integrating the causal
relationships between logs to achieve effective feature extraction.
Additionally, we introduce Log2graphs, an unsupervised log anomaly detection
method based on the feature extractor. By employing graph clustering algorithms
for log anomaly detection, Log2graphs enables the identification of abnormal
logs without the need for labeled data. We comprehensively evaluate the feature
extraction capability of DualGCN-LogAE and the anomaly detection performance of
Log2graphs using public log datasets across five different scenarios. Our
evaluation metrics include detection accuracy and graph clustering quality
scores. Experimental results demonstrate that the log features extracted by
DualGCN-LogAE outperform those obtained by other methods on classic
classifiers. Moreover, Log2graphs surpasses existing unsupervised log detection
methods, providing a robust tool for advancing log anomaly detection research.
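
To make the idea of label-free detection by graph clustering concrete, here is a minimal sketch under stated assumptions. It is not DualGCN-LogAE or the Log2graphs algorithm: the feature vectors are invented toy values standing in for extracted log features, and the function names, the cosine-similarity threshold, and the rule "clusters smaller than `min_cluster_size` are anomalous" are all hypothetical choices made for illustration only.

```python
from collections import defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_by_similarity(features, threshold=0.9):
    """Build a similarity graph and return one cluster id per vector.

    Two vectors share an edge when their cosine similarity is at least
    `threshold` (a hypothetical setting); clusters are the connected
    components of that graph, found with a small union-find.
    """
    n = len(features)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(features[i], features[j]) >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def flag_anomalies(features, threshold=0.9, min_cluster_size=2):
    """Mark vectors that fall in clusters smaller than `min_cluster_size`."""
    labels = cluster_by_similarity(features, threshold)
    sizes = defaultdict(int)
    for label in labels:
        sizes[label] += 1
    return [sizes[label] < min_cluster_size for label in labels]


# Toy stand-ins for log feature vectors (assumed, not from the paper).
toy_features = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.1],
    [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.9],  # dissimilar to the rest, so it ends up isolated
]
flags = flag_anomalies(toy_features)
print(flags)
```

No labels are used anywhere: the decision comes only from how the vectors group in the graph, which is the property the abstract attributes to unsupervised detection. A real system would replace the toy vectors with learned features and the connected-components step with a proper graph clustering algorithm.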