Log2graphs: An Unsupervised Framework for Log Anomaly Detection with Efficient Feature Extraction

Caihong Wang, Du Xu, Zonghang Li
arXiv - CS - Cryptography and Security. Published 2024-09-18. DOI: arxiv-2409.11890 (https://doi.org/arxiv-2409.11890)

Abstract

In the era of rapid Internet development, log data has become indispensable for recording the operations of computer devices and software. These data provide valuable insights into system behavior and necessitate thorough analysis. Recent advances in text analysis have enabled deep learning to achieve significant breakthroughs in log anomaly detection. However, the high cost of manual annotation and the dynamic nature of usage scenarios present major challenges to effective log analysis. This study proposes a novel log feature extraction model called DualGCN-LogAE, designed to adapt to various scenarios. It leverages the expressive power of large models for log content analysis and the capability of graph structures to encapsulate correlations between logs. It retains key log information while integrating the causal relationships between logs to achieve effective feature extraction. Additionally, we introduce Log2graphs, an unsupervised log anomaly detection method based on the feature extractor. By employing graph clustering algorithms for log anomaly detection, Log2graphs enables the identification of abnormal logs without the need for labeled data. We comprehensively evaluate the feature extraction capability of DualGCN-LogAE and the anomaly detection performance of Log2graphs using public log datasets across five different scenarios. Our evaluation metrics include detection accuracy and graph clustering quality scores. Experimental results demonstrate that the log features extracted by DualGCN-LogAE outperform those obtained by other methods on classic classifiers. Moreover, Log2graphs surpasses existing unsupervised log detection methods, providing a robust tool for advancing log anomaly detection research.
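
The abstract does not spell out the pipeline, so the following is only a toy sketch of the general idea of turning logs into graphs and clustering them without labels. It is not DualGCN-LogAE or the authors' Log2graphs implementation: the learned feature extractor is replaced here by a plain set of transition edges per session, and the graph clustering step is replaced by union-find over Jaccard similarity. The function names, the `threshold` and `min_cluster_size` parameters, and the sample sessions are all hypothetical.

```python
from itertools import combinations


def session_to_edges(events):
    """Turn an ordered list of log template IDs into a set of directed transition edges."""
    return frozenset(zip(events, events[1:]))


def jaccard(a, b):
    """Jaccard similarity between two edge sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_sessions(sessions, threshold=0.5):
    """Group sessions whose transition graphs are similar.

    Assumption: a stand-in for graph clustering, implemented as union-find
    over all pairs with Jaccard similarity >= threshold.
    """
    graphs = [session_to_edges(s) for s in sessions]
    parent = list(range(len(graphs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(graphs)), 2):
        if jaccard(graphs[i], graphs[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(graphs)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def flag_anomalies(sessions, threshold=0.5, min_cluster_size=2):
    """Flag sessions that land in clusters smaller than min_cluster_size."""
    flagged = []
    for members in cluster_sessions(sessions, threshold):
        if len(members) < min_cluster_size:
            flagged.extend(members)
    return sorted(flagged)


# Hypothetical sessions: three similar read/write flows and one error loop.
sessions = [
    ["open", "read", "write", "close"],
    ["open", "read", "write", "close"],
    ["open", "read", "read", "write", "close"],
    ["open", "error", "retry", "error", "abort"],
]
anomalies = flag_anomalies(sessions)
print(anomalies)
```

In this sketch the first three sessions share most of their transitions and fall into one cluster, while the error loop shares none and is flagged because its cluster has a single member. No labels are used at any point, which is the property the abstract emphasizes.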