Mitigating the Uncertainty and Imprecision of Log-Based Code Coverage Without Requiring Additional Logging Statements

Impact Factor: 6.5 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Software Engineering) · IEEE Transactions on Software Engineering · Pub Date: 2024-07-29 · DOI: 10.1109/TSE.2024.3435067

Xiaoyan Xu;Filipe R. Cogo;Shane McIntosh

IEEE Transactions on Software Engineering, vol. 50, no. 9, pp. 2350-2362. https://ieeexplore.ieee.org/document/10613788/

Citations: 0

Abstract

Understanding code coverage is an important precursor to software maintenance activities (e.g., better testing). Although modern code coverage tools provide key insights, they typically rely on code instrumentation, resulting in significant performance overhead. An alternative approach to code instrumentation is to process an application's source code and the associated log traces in tandem. This so-called “log-based code coverage” approach does not impose the same performance overhead as code instrumentation. Chen et al. proposed LogCoCo, a tool that implements log-based code coverage for Java. While LogCoCo breaks important new ground, it has fundamental limitations, namely: uncertainty due to the lack of logging statements in conditional branches, and imprecision caused by dependency injection. In this study, we propose Log2Cov, a tool that generates log-based code coverage for programs written in Python and addresses uncertainty and imprecision issues. We evaluate Log2Cov on three large and active open-source systems. More specifically, we compare the performance of Log2Cov to that of Coverage.py, an instrumentation-based coverage tool for Python. Our results indicate that 1) Log2Cov achieves high precision without introducing runtime overhead; and 2) uncertainty and imprecision can be reduced by up to 11% by statically analyzing the program's source code and execution logs, without requiring additional logging instrumentation from developers. While our enhancements make substantial improvements, we find that future work is needed to handle conditional statements and exception handling blocks to achieve parity with instrumentation-based approaches. We conclude the paper by drawing attention to these promising directions for future work.
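To make the idea of log-based code coverage concrete, the following is a minimal sketch and not the Log2Cov or LogCoCo implementation. It assumes that log messages are unique string constants, that a block runs to completion once entered (no exceptions or early returns), and that only `if` statements branch. The helper names (`log_message`, `block_status`, `label_block`, `label_function`), the sample `SOURCE` program, and the labels `must`, `may`, and `must-not` are all hypothetical choices for illustration. The sketch shows where uncertainty comes from: a branch with no logging statement can only be labeled `may`.

```python
import ast

# Hypothetical program under analysis (not taken from the paper).
SOURCE = '''
import logging
log = logging.getLogger(__name__)

def handle(x):
    log.info("start handle")
    if x > 0:
        log.info("positive branch")
        y = x * 2
    else:
        y = -x
    if y > 10:
        y = 10
    return y
'''

LOG_METHODS = {"debug", "info", "warning", "error", "critical"}


def log_message(stmt):
    """Return the constant message of a call like log.info("..."), else None."""
    if (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Call)
            and isinstance(stmt.value.func, ast.Attribute)
            and stmt.value.func.attr in LOG_METHODS
            and stmt.value.args
            and isinstance(stmt.value.args[0], ast.Constant)
            and isinstance(stmt.value.args[0].value, str)):
        return stmt.value.args[0].value
    return None


def block_status(stmts, observed, parent):
    """Decide whether a block was executed, given the observed log messages."""
    if parent == "must-not":
        return "must-not"
    messages = [m for m in map(log_message, stmts) if m is not None]
    if not messages:
        return "may"        # no logging statement: coverage is uncertain
    if any(m in observed for m in messages):
        return "must"       # a message from this block appears in the logs
    return "must-not"       # the block logs, but nothing was observed


def label_block(stmts, observed, labels, status):
    """Assign a status to every statement line in a block, recursing into ifs."""
    for stmt in stmts:
        labels[stmt.lineno] = status
        if isinstance(stmt, ast.If):
            for branch in (stmt.body, stmt.orelse):
                if branch:
                    branch_state = block_status(branch, observed, status)
                    label_block(branch, observed, labels, branch_state)


def label_function(func, observed):
    """Map line numbers of a function body to must / may / must-not."""
    labels = {}
    status = block_status(func.body, observed, "may")
    label_block(func.body, observed, labels, status)
    return labels


tree = ast.parse(SOURCE)
func = next(n for n in tree.body if isinstance(n, ast.FunctionDef))
labels = label_function(func, observed={"start handle", "positive branch"})
for lineno, status in sorted(labels.items()):
    print(lineno, status)
```

In this sketch the `else` branch and the body of the second `if` contain no logging statements, so they stay `may` regardless of which messages appear in the logs. That is the kind of uncertainty the abstract attributes to missing logging statements in conditional branches.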




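Source journal

IEEE Transactions on Software Engineering (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 9.70

Self-citation rate: 10.80%

Publication count: 724

Review time: 6 months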

Journal description: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:

a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.