DIP

Proceedings of the 2014 ACM Southeast Regional Conference
Pub Date: 2022-04-18
DOI: 10.1145/3476883.3520226

Daniel Plaisted, Mengjun Xie


Citations: 3

Abstract

Certain classes of log analysis models, such as those for log anomaly detection, take as input sequences of parsed log messages in which the tokens belonging to each message's template are marked. Such models therefore commonly rely on a log parser, a program that detects the template of each message in a log file. Prior work has shown that even the most accurate log parsers in the literature fail to achieve high accuracy on the log files of certain systems. This paper presents DIP, a tree-based log parser. Its primary methodological innovation is the mechanism it uses to decide whether two very similar messages share a template. Whereas many existing parsers consider only the percentage of matching tokens between the two messages, DIP also examines the actual tokens at which the messages disagree, and deems the pair to share a template if and only if each of those tokens satisfies one of a set of three conditions. Our experiments show that DIP achieves a higher average accuracy than each of the 13 parsers tested in a 2019 survey of log parsers. We also give evidence that it achieves this accuracy without sacrificing runtime.
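The contrast between a ratio-only test and a test that also inspects the disagreeing tokens can be illustrated with a minimal Python sketch. The abstract does not state DIP's three conditions, so the three predicates below (`both_contain_digit`, `both_look_like_paths`, `either_is_wildcard`) are hypothetical placeholders, as are the function names, the whitespace tokenization, and the 0.5 threshold. This is not the authors' implementation and omits the tree structure entirely.

```python
from typing import Callable, List, Sequence, Tuple

Condition = Callable[[str, str], bool]


def matching_ratio(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of positions at which two equal-length token lists agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def disagreements(a: Sequence[str], b: Sequence[str]) -> List[Tuple[str, str]]:
    """Token pairs at the positions where the two messages differ."""
    return [(x, y) for x, y in zip(a, b) if x != y]


# Hypothetical placeholder conditions; the paper's actual three conditions
# are not given in the abstract.
def both_contain_digit(x: str, y: str) -> bool:
    return any(c.isdigit() for c in x) and any(c.isdigit() for c in y)


def both_look_like_paths(x: str, y: str) -> bool:
    return "/" in x and "/" in y


def either_is_wildcard(x: str, y: str) -> bool:
    return "<*>" in (x, y)


PLACEHOLDER_CONDITIONS: Tuple[Condition, ...] = (
    both_contain_digit,
    both_look_like_paths,
    either_is_wildcard,
)


def same_template_ratio_only(a: Sequence[str], b: Sequence[str],
                             threshold: float = 0.5) -> bool:
    """Baseline: decide using only the percentage of matching tokens."""
    return matching_ratio(a, b) >= threshold


def same_template(a: Sequence[str], b: Sequence[str],
                  threshold: float = 0.5,
                  conditions: Sequence[Condition] = PLACEHOLDER_CONDITIONS) -> bool:
    """Similar messages share a template only if every disagreeing token
    pair satisfies at least one of the supplied conditions."""
    if matching_ratio(a, b) < threshold:
        return False
    return all(any(cond(x, y) for cond in conditions)
               for x, y in disagreements(a, b))


closed_1 = "Connection from 10.0.0.1 closed".split()
closed_2 = "Connection from 10.0.0.2 closed".split()
opened_1 = "Connection from 10.0.0.1 opened".split()

print(same_template(closed_1, closed_2))             # differ only in an address
print(same_template_ratio_only(closed_1, opened_1))  # ratio alone merges them
print(same_template(closed_1, opened_1))             # disagreement check splits them
```

In this toy input, the messages that differ only in an IP address are grouped together, while "closed" versus "opened" is treated as a template difference even though the matching ratio is identical in both comparisons. A ratio-only rule cannot make that distinction.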
