Two is better than one: digital siblings to improve autonomous driving testing
Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella
Empirical Software Engineering. Published: 2024-05-17. DOI: 10.1007/s10664-024-10458-4
Citations: 0
Abstract
Simulation-based testing is an important step toward ensuring the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, whether for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we enhance simulation-based testing by introducing the notion of digital siblings: a multi-simulator approach that tests a given autonomous vehicle on multiple general-purpose simulators built with different technologies, which operate collectively as an ensemble in the testing process. We illustrate our approach with a case study focused on testing the lane-keeping component of an autonomous vehicle. We use two open-source simulators as digital siblings, and we empirically compare this multi-simulator approach against a digital twin of a physical scaled autonomous vehicle on a large set of test cases. Our approach requires generating and running test cases for each individual simulator, in the form of sequences of road points. Test cases are then migrated between simulators, using feature maps to characterize the exercised driving conditions. Finally, the joint predicted failure probability is computed, and a failure is reported only when the siblings agree. Our empirical evaluation shows that the ensemble failure predictor formed by the digital siblings is superior to each individual simulator at predicting the failures of the digital twin. We discuss the findings of our case study and detail how our approach can help researchers interested in the automated testing of autonomous driving software.
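The abstract describes the ensemble decision rule only at a high level. The following minimal Python sketch illustrates one plausible reading of it: each sibling simulator returns a predicted failure probability for a migrated test case, a joint probability is computed, and a failure is reported only when the siblings agree. All names here (`SiblingPrediction`, `ensemble_predicts_failure`), the use of the mean as the joint probability, and the 0.5 threshold are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

# A test case is a sequence of road points, as in the abstract; the
# concrete 2D-coordinate representation below is an assumption.
RoadPoint = Tuple[float, float]
TestCase = List[RoadPoint]

@dataclass
class SiblingPrediction:
    simulator: str              # identifier of the sibling simulator (illustrative)
    failure_probability: float  # that sibling's predicted failure probability

def ensemble_predicts_failure(predictions: List[SiblingPrediction],
                              threshold: float = 0.5) -> bool:
    """Report a failure only when the siblings agree.

    The joint probability is taken here as the mean of the siblings'
    predictions, and 0.5 is an assumed decision threshold; the paper
    specifies only that a joint failure probability is computed and
    that agreement among the siblings is required.
    """
    joint_probability = mean(p.failure_probability for p in predictions)
    all_agree = all(p.failure_probability >= threshold for p in predictions)
    return all_agree and joint_probability >= threshold

# Example: both siblings flag the same migrated test case as failing.
siblings = [
    SiblingPrediction("sibling_A", 0.81),
    SiblingPrediction("sibling_B", 0.67),
]
print(ensemble_predicts_failure(siblings))  # True
```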
About the journal:
Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories.
The journal also offers industrial experience reports detailing the application of software technologies (processes, methods, or tools) and their effectiveness in industrial settings.
Empirical Software Engineering promotes the publication of industry-relevant research to address the significant gap between research and practice.