Mimicking Production Behavior With Generated Mocks

IEEE Transactions on Software Engineering (IF 5.6; Zone 1, Computer Science; JCR Q1, Computer Science, Software Engineering). Pub Date: 2024-09-11. DOI: 10.1109/TSE.2024.3458448

Deepika Tiwari;Martin Monperrus;Benoit Baudry

IEEE Transactions on Software Engineering, vol. 50, no. 11, pp. 2921-2946. Full text: https://ieeexplore.ieee.org/document/10677447/


Abstract

Mocking allows testing program units in isolation. A developer who writes tests with mocks faces two challenges: design realistic interactions between a unit and its environment; and understand the expected impact of these interactions on the behavior of the unit. In this paper, we propose to monitor an application in production to generate tests that mimic realistic execution scenarios through mocks. Our approach operates in three phases. First, we instrument a set of target methods for which we want to generate tests, as well as the methods that they invoke, which we refer to as mockable method calls. Second, in production, we collect data about the context in which target methods are invoked, as well as the parameters and the returned value for each mockable method call. Third, offline, we analyze the production data to generate test cases with realistic inputs and mock interactions. The approach is automated and implemented in an open-source tool called rick. We evaluate our approach with three real-world, open-source Java applications. rick monitors the invocation of 128 methods in production across the three applications and captures their behavior. Based on this captured data, rick generates test cases that include realistic initial states and test inputs, as well as mocks and stubs. All the generated test cases are executable, and 52.4% of them successfully mimic the complete execution context of the target methods observed in production. The mock-based oracles are also effective at detecting regressions within the target methods, complementing each other in their fault-finding ability. We interview 5 developers from the industry who confirm the relevance of using production observations to design mocks and stubs. Our experimental findings clearly demonstrate the feasibility and added value of generating mocks from production interactions.
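The three phases described in the abstract can be illustrated with a toy sketch. It is not rick's implementation: rick targets Java applications, whereas this example uses Python's standard `unittest.mock` only to show the idea. The names `PriceService`, `Cart`, `Recorder`, and `mock_from` are hypothetical and do not come from the paper.

```python
from unittest.mock import Mock


class PriceService:
    """Hypothetical collaborator; its calls play the role of mockable method calls."""

    def price_of(self, item):
        return {"apple": 0.5, "pear": 0.75}[item]


class Cart:
    """Hypothetical unit under test; total() plays the role of a target method."""

    def __init__(self, prices):
        self.prices = prices

    def total(self, items):
        return sum(self.prices.price_of(item) for item in items)


class Recorder:
    """Stand-in for instrumentation: logs parameters and return value of each call."""

    def __init__(self, inner):
        self._inner = inner
        self.calls = []

    def __getattr__(self, name):
        method = getattr(self._inner, name)

        def wrapper(*args):
            result = method(*args)
            self.calls.append((name, args, result))
            return result

        return wrapper


def mock_from(calls):
    """Stand-in for offline generation: stub a mock with the recorded return values."""
    results_by_method = {}
    for name, _args, result in calls:
        results_by_method.setdefault(name, []).append(result)
    mock = Mock()
    for name, results in results_by_method.items():
        # An iterable side_effect yields one recorded value per call, in order.
        getattr(mock, name).side_effect = results
    return mock


# "Production" run: observe one invocation of the target method.
recorder = Recorder(PriceService())
observed_input = ["apple", "pear", "apple"]
observed_output = Cart(recorder).total(observed_input)

# Offline: replay the observation as an isolated, mock-based test.
mock_prices = mock_from(recorder.calls)
replayed_output = Cart(mock_prices).total(observed_input)

# Oracles: same output, same parameters passed to the mock, same number of calls.
assert replayed_output == observed_output
replayed_args = [call.args for call in mock_prices.price_of.call_args_list]
assert replayed_args == [args for _name, args, _result in recorder.calls]
assert mock_prices.price_of.call_count == len(recorder.calls)
```

The three assertions at the end check different things (the returned value, the arguments passed to the collaborator, and how often it was called), which gives a feel for how several mock-based oracles can catch different regressions in the same target method.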


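Source journal

IEEE Transactions on Software Engineering (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 9.70
Self-citation rate: 10.80%
Articles published: 724
Review time: 6 months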

Journal description: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.
