{"title":"Characterizing and Detecting Anti-Patterns in the Logging Code","authors":"Boyuan Chen, Z. Jiang","doi":"10.1109/ICSE.2017.15","DOIUrl":null,"url":null,"abstract":"Snippets of logging code are output statements (e.g., LOG.info or System.out.println) that developers insert into a software system. Although more logging code can provide more execution context of the system's behavior during runtime, it is undesirable to instrument the system with too much logging code due to maintenance overhead. Furthermore, excessive logging may cause unexpected side-effects like performance slow-down or high disk I/O bandwidth. Recent studies show that there are no well-defined coding guidelines for performing effective logging. Previous research on the logging code mainly tackles the problems of where-to-log and what-to-log. There are very few works trying to address the problem of how-to-log (developing and maintaining high-quality logging code). In this paper, we study the problem of how-to-log by characterizing and detecting the anti-patterns in the logging code. As the majority of the logging code is evolved together with the feature code, the remaining set of logging code changes usually contains the fixes to the anti-patterns. We have manually examined 352 pairs of independently changed logging code snippets from three well-maintenance open source systems: ActiveMQ, Hadoop and Maven. Our analysis has resulted in six different anti-patterns in the logging code. To demonstrate the value of our findings, we have encoded these anti-patterns into a static code analysis tool, LCAnalyzer. Case studies show that LCAnalyzer has an average recall of 95% and precision of 60% and can be used to automatically detect previously unknown anti-patterns in the source code. To gather feedback, we have filed 64 representative instances of the logging code anti-patterns from the most recent releases of ten open source software systems. Among them, 46 instances (72%) have already been accepted by their developers.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"53 1","pages":"71-81"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"90","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2017.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 90
Abstract
Snippets of logging code are output statements (e.g., LOG.info or System.out.println) that developers insert into a software system. Although more logging code can provide more execution context of the system's behavior during runtime, it is undesirable to instrument the system with too much logging code due to maintenance overhead. Furthermore, excessive logging may cause unexpected side-effects like performance slow-down or high disk I/O bandwidth. Recent studies show that there are no well-defined coding guidelines for performing effective logging. Previous research on the logging code mainly tackles the problems of where-to-log and what-to-log. There are very few works trying to address the problem of how-to-log (developing and maintaining high-quality logging code). In this paper, we study the problem of how-to-log by characterizing and detecting the anti-patterns in the logging code. As the majority of the logging code is evolved together with the feature code, the remaining set of logging code changes usually contains the fixes to the anti-patterns. We have manually examined 352 pairs of independently changed logging code snippets from three well-maintenance open source systems: ActiveMQ, Hadoop and Maven. Our analysis has resulted in six different anti-patterns in the logging code. To demonstrate the value of our findings, we have encoded these anti-patterns into a static code analysis tool, LCAnalyzer. Case studies show that LCAnalyzer has an average recall of 95% and precision of 60% and can be used to automatically detect previously unknown anti-patterns in the source code. To gather feedback, we have filed 64 representative instances of the logging code anti-patterns from the most recent releases of ten open source software systems. Among them, 46 instances (72%) have already been accepted by their developers.