Characterizing and Detecting Anti-Patterns in the Logging Code

2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE) Pub Date : 2017-05-20 DOI:10.1109/ICSE.2017.15

Boyuan Chen, Z. Jiang

Citations: 90

Abstract

Snippets of logging code are output statements (e.g., LOG.info or System.out.println) that developers insert into a software system. Although more logging code can provide more context about the system's runtime behavior, it is undesirable to instrument the system with too much logging code because of the maintenance overhead. Furthermore, excessive logging may cause unexpected side effects such as performance slowdown or high disk I/O bandwidth usage. Recent studies show that there are no well-defined coding guidelines for effective logging. Previous research on logging code mainly tackles the problems of where-to-log and what-to-log. Very few works address the problem of how-to-log (developing and maintaining high-quality logging code). In this paper, we study the problem of how-to-log by characterizing and detecting anti-patterns in the logging code. As the majority of the logging code evolves together with the feature code, the remaining set of logging code changes usually contains the fixes to the anti-patterns. We have manually examined 352 pairs of independently changed logging code snippets from three well-maintained open source systems: ActiveMQ, Hadoop, and Maven. Our analysis identified six different anti-patterns in the logging code. To demonstrate the value of our findings, we have encoded these anti-patterns into a static code analysis tool, LCAnalyzer. Case studies show that LCAnalyzer has an average recall of 95% and an average precision of 60%, and that it can be used to automatically detect previously unknown anti-patterns in the source code. To gather feedback, we have filed 64 representative instances of the logging code anti-patterns from the most recent releases of ten open source software systems. Among them, 46 instances (72%) have already been accepted by their developers.
