{"title":"amulog: A general log analysis framework for comparison and combination of diverse template generation methods*","authors":"Satoru Kobayashi, Yuya Yamashiro, Kazuki Otomo, Kensuke Fukuda","doi":"10.1002/nem.2195","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>One of the ways to analyze unstructured log messages from large-scale IT systems is to classify log messages with log templates generated by template generation methods. However, there is currently no common knowledge pertained to the comparison and practical use of log template generation methods because they are implemented on the basis of diverse environments. To this end, we design and implement amulog, a general log analysis framework for comparing and combining diverse log template generation methods. Amulog consists of three key functions: (1) parsing log messages into headers and segmented messages, (2) classifying the log messages using a scalable template-matching method, and (3) storing the structured data in a database. This framework helps us easily utilize time-series data corresponding to the log templates for further analysis. We evaluate amulog with a log dataset collected from a nation-wide academic network and demonstrate that it classifies the log data in a reasonable amount of time even with over 100,000 log template candidates. The template-matching method in amulog also reduces 75% processing time for template generation and keeps the accuracy when combined with an existing structure-based template generation method. In order to show the effectiveness of amulog in comparing log template generation methods, we demonstrate that the appropriate template generation methods and accuracy metrics largely depend on the purpose of further analysis by comparing the accuracy of six existing log template generation methods with 10 different accuracy metrics on amulog.</p>\n </div>","PeriodicalId":14154,"journal":{"name":"International Journal of Network Management","volume":"32 4","pages":""},"PeriodicalIF":1.5000,"publicationDate":"2021-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Network Management","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/nem.2195","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 2
Abstract
One of the ways to analyze unstructured log messages from large-scale IT systems is to classify log messages with log templates generated by template generation methods. However, there is currently no common knowledge pertained to the comparison and practical use of log template generation methods because they are implemented on the basis of diverse environments. To this end, we design and implement amulog, a general log analysis framework for comparing and combining diverse log template generation methods. Amulog consists of three key functions: (1) parsing log messages into headers and segmented messages, (2) classifying the log messages using a scalable template-matching method, and (3) storing the structured data in a database. This framework helps us easily utilize time-series data corresponding to the log templates for further analysis. We evaluate amulog with a log dataset collected from a nation-wide academic network and demonstrate that it classifies the log data in a reasonable amount of time even with over 100,000 log template candidates. The template-matching method in amulog also reduces 75% processing time for template generation and keeps the accuracy when combined with an existing structure-based template generation method. In order to show the effectiveness of amulog in comparing log template generation methods, we demonstrate that the appropriate template generation methods and accuracy metrics largely depend on the purpose of further analysis by comparing the accuracy of six existing log template generation methods with 10 different accuracy metrics on amulog.
期刊介绍:
Modern computer networks and communication systems are increasing in size, scope, and heterogeneity. The promise of a single end-to-end technology has not been realized and likely never will occur. The decreasing cost of bandwidth is increasing the possible applications of computer networks and communication systems to entirely new domains. Problems in integrating heterogeneous wired and wireless technologies, ensuring security and quality of service, and reliably operating large-scale systems including the inclusion of cloud computing have all emerged as important topics. The one constant is the need for network management. Challenges in network management have never been greater than they are today. The International Journal of Network Management is the forum for researchers, developers, and practitioners in network management to present their work to an international audience. The journal is dedicated to the dissemination of information, which will enable improved management, operation, and maintenance of computer networks and communication systems. The journal is peer reviewed and publishes original papers (both theoretical and experimental) by leading researchers, practitioners, and consultants from universities, research laboratories, and companies around the world. Issues with thematic or guest-edited special topics typically occur several times per year. Topic areas for the journal are largely defined by the taxonomy for network and service management developed by IFIP WG6.6, together with IEEE-CNOM, the IRTF-NMRG and the Emanics Network of Excellence.