Software Testing, Verification and Reliability: Latest Articles


Fault tolerance and metamorphic relation prediction

Software Testing, Verification and Reliability

Pub Date : 2024-08-27 DOI: 10.1002/stvr.1896

Yves Le Traon, Tao Xie

Citations: 0

Validity Matters: Uncertainty‐Guided Testing of Deep Neural Networks

Software Testing, Verification and Reliability

Pub Date : 2024-08-22 DOI: 10.1002/stvr.1894

Zhouxian Jiang, Honghui Li, Rui Wang, Xuetao Tian, Ci Liang, Fei Yan, Junwen Zhang, Zhen Liu

Despite numerous applications of deep learning technologies on critical tasks in various domains, advanced deep neural networks (DNNs) face persistent safety and security challenges, such as overconfidence in predicting out‐of‐distribution samples and susceptibility to adversarial examples. Thorough testing by exploring the input space serves as a key strategy to ensure the robustness and trustworthiness of these networks. However, existing testing methods focus on disclosing more erroneous model behaviours, overlooking the validity of the generated test inputs. To mitigate this issue, we investigate devising a valid test input generation method for DNNs from a predictive uncertainty perspective. Through a large‐scale empirical study across 11 predictive uncertainty metrics for DNNs, we explore the correlation between validity and uncertainty of test inputs. Our findings reveal that the predictive entropy‐based and ensemble‐based uncertainty metrics effectively characterize input validity. Building on these insights, we introduce UCTest, an uncertainty‐guided deep learning testing approach, to efficiently generate valid and authentic test inputs. We formulate a joint optimization objective: to uncover the model's misbehaviours by maximizing the loss function and concurrently generate valid test inputs by minimizing uncertainty. Extensive experiments demonstrate that our approach outperforms the current testing methods in generating valid test inputs. Furthermore, incorporating natural variation through data augmentation techniques into UCTest effectively boosts the diversity of generated test inputs.
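
The joint objective described in this abstract (maximize loss while minimizing uncertainty) can be illustrated with a minimal sketch. The weighted difference used in `joint_score`, the `weight` parameter, and the example probability vectors are assumptions made for illustration; the abstract does not state how UCTest combines the two terms, and predictive entropy is only one of the uncertainty metrics it mentions.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cross_entropy(probs, true_label):
    """Loss of the prediction with respect to the true class index."""
    return -math.log(max(probs[true_label], 1e-12))

def joint_score(probs, true_label, weight=1.0):
    """Hypothetical joint objective: reward high loss, penalize high uncertainty."""
    return cross_entropy(probs, true_label) - weight * predictive_entropy(probs)

# Illustrative softmax outputs for an input whose true class is 0.
confident_wrong = [0.05, 0.90, 0.05]  # misclassified with low uncertainty
uncertain = [0.34, 0.33, 0.33]        # near-uniform, high uncertainty

# A search guided by this score would prefer the first candidate.
print(joint_score(confident_wrong, 0) > joint_score(uncertain, 0))
```

Under this assumed scoring, an input that the model misclassifies confidently ranks above one that merely confuses the model, which matches the abstract's goal of finding failures on inputs that still look valid.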


Citations: 0

Improving Web Element Localization by Using a Large Language Model

Software Testing, Verification and Reliability

Pub Date : 2024-08-16 DOI: 10.1002/stvr.1893

Michel Nass, Emil Alégroth, Robert Feldt

Web‐based test automation heavily relies on accurately finding web elements. Traditional methods compare attributes but do not grasp the context and meaning of elements and words. The emergence of large language models (LLMs) like GPT‐4, which can show human‐like reasoning abilities on some tasks, offers new opportunities for software engineering and web element localization. This paper introduces and evaluates VON Similo LLM, an enhanced web element localization approach. Using an LLM, it selects the most likely web element from the top‐ranked ones identified by the existing VON Similo method, ideally aiming to get closer to human‐like selection accuracy. An experimental study was conducted using 804 web element pairs from 48 real‐world web applications. We measured the number of correctly identified elements as well as the execution times, comparing the effectiveness and efficiency of VON Similo LLM against the baseline algorithm. In addition, motivations from the LLM were recorded and analysed for 140 instances. VON Similo LLM demonstrated improved performance, reducing failed localizations from 70 to 40 (out of 804), a 43% reduction. Despite its slower execution time and additional costs of using the GPT‐4 model, the LLM's human‐like reasoning showed promise in enhancing web element localization. LLM technology can enhance web element localization in GUI test automation, reducing false positives and potentially lowering maintenance costs. However, further research is necessary to fully understand LLMs' capabilities, limitations and practical use in GUI testing.
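
The reported 43% figure can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract (804 pairs, 70 and 40 failed localizations); the two accuracy values are derived here from those counts and are not quoted from the paper.

```python
def relative_reduction(before, after):
    """Fraction by which a count dropped relative to its starting value."""
    return (before - after) / before

TOTAL_PAIRS = 804
failed_baseline = 70  # VON Similo
failed_llm = 40       # VON Similo LLM

reduction = relative_reduction(failed_baseline, failed_llm)
accuracy_baseline = (TOTAL_PAIRS - failed_baseline) / TOTAL_PAIRS
accuracy_llm = (TOTAL_PAIRS - failed_llm) / TOTAL_PAIRS

print(f"reduction in failed localizations: {reduction:.1%}")
print(f"derived baseline accuracy: {accuracy_baseline:.1%}")
print(f"derived LLM-assisted accuracy: {accuracy_llm:.1%}")
```

The reduction is 30/70, which rounds to the 43% stated in the abstract.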


Citations: 0

Scenario‐Driven Metamorphic Testing for Autonomous Driving Simulators

Software Testing, Verification and Reliability

Pub Date : 2024-07-24 DOI: 10.1002/stvr.1892

Yifan Zhang, Dave Towey, Matthew Pike, Jia Cheng Han, Zhi Quan Zhou, Chenghao Yin, Qian Wang, Chen Xie

The proliferation of driver‐assistance features in vehicles has resulted in a growing interest among the public in fully autonomous driving systems (ADSs). However, the integration of software and hardware in these complex systems presents significant testing challenges, particularly with respect to ensuring passenger safety. To address these challenges, simulation has emerged as a crucial step in the testing of ADSs. This paper presents a solution to the challenges faced in testing ADSs, with a focus on the validation of ADS simulators. The proposed approach involves using simulations and metamorphic testing (MT) to generate multiple concrete metamorphic relations (MRs) for testing ADS simulators. In order to accomplish this goal, we introduce three metamorphic relation patterns (MRPs). Each MRP is accompanied by a metamorphic relation input pattern (MRIP) that aids in generating detailed MRs. These MRs are designed to identify potential issues within the ADS simulator. To simplify the testing process and facilitate MT for testers, a self‐evolving scenario‐testing framework is also presented. The framework allows testers to improve test cases and MRs iteratively until issues detected are confirmed. The benefits and limitations of the framework are demonstrated using an industry case study. Overall, this study offers a practical solution to the challenges in testing ADSs and provides useful insights into improving testing efficiency for researchers and practitioners in the field.


Citations: 0

Boosting Multimode Ruling in DHR Architecture With Metamorphic Relations

Software Testing, Verification and Reliability

Pub Date : 2024-07-24 DOI: 10.1002/stvr.1890

Ruosi Li, Xianglong Kong, Wei Guo, Jingdong Guo, Hongfa Li, Fan Zhang

The DHR architecture provides a revolutionary security defense structure for cyberspace. The multimode ruling in DHR is expected to alleviate the oracle problem, which still suffers from the existence of common model vulnerability. In this work, we design a test segmentation method to transform multimode ruling to a metamorphic testing problem. The text test input that causes inconsistency of heterogeneous executors is converted to a condition set, and we extract subsets of conditions based on its syntax tree. The original test can exploit a specific vulnerability; the follow‐up tests are composed of different subsets of conditions within the original test. We collect the execution matrix for the follow‐up tests to analyse the impact of each subset of conditions on the ruling decision. Metamorphic relations are extracted based on the localization of independent conditions, that is, the subsets of conditions that can impact the ruling decision independently. The executors in an inconsistent ruling should be examined with metamorphic testing methods, rather than the traditional majority voting mechanism. The proposed test segmentation and improved multimode ruling methods are evaluated on two DHR‐based cases, SQL injection in cyber‐range system and deserialization attack in ‐ project. The experimental results show that our test segmentation can help to locate malicious expressions and that the metamorphic testing‐based multimode ruling can generate more correct results than the majority voting mechanism, with an average performance loss of 15.8%.
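
The test segmentation idea in this abstract (follow-up tests built from subsets of the original test's conditions, an execution matrix, and localization of conditions that change the ruling on their own) can be sketched as follows. Everything concrete here is hypothetical: the executor functions, the condition strings, and the simplification that an "independent condition" is a single condition that makes the executors disagree. The paper derives subsets from a syntax tree, which this sketch does not attempt.

```python
from itertools import combinations

def follow_up_tests(conditions):
    """Yield every non-empty proper subset of the original test's conditions."""
    for size in range(1, len(conditions)):
        for subset in combinations(conditions, size):
            yield subset

def execution_matrix(conditions, executors):
    """Map each follow-up test to the tuple of outputs from all executors."""
    return {subset: tuple(run(subset) for run in executors)
            for subset in follow_up_tests(conditions)}

def independent_conditions(matrix):
    """Single conditions that, on their own, make the executors disagree."""
    return [subset[0] for subset, outputs in matrix.items()
            if len(subset) == 1 and len(set(outputs)) > 1]

# Hypothetical heterogeneous executors: two reject the tautology, one accepts it.
def safe_executor(subset):
    return "rejected" if "1=1" in subset else "ok"

def vulnerable_executor(subset):
    return "ok"

conditions = ("id=5", "1=1", "name='a'")
matrix = execution_matrix(
    conditions, [safe_executor, safe_executor, vulnerable_executor])
print(independent_conditions(matrix))
```

In this toy setup the only condition that splits the executors by itself is the tautology, which is the kind of malicious expression the abstract says segmentation helps to locate.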


Citations: 0

Boosting Metamorphic Relation Prediction via Code Representation Learning: An Empirical Study

Software Testing, Verification and Reliability

Pub Date : 2024-07-08 DOI: 10.1002/stvr.1889

Xuedan Zheng, Mingyue Jiang, Zhi Quan Zhou

Metamorphic testing (MT) is an effective testing technique having a broad range of applications. One key task for MT is the identification of metamorphic relations (MRs), which is a fundamental mechanism in MT and is critical to the automation of MT. Prior studies have proposed approaches for predicting MRs (PMR). One major idea behind these PMR approaches is to represent program source code information via manually designed code features and then to apply machine‐learning–based classifiers to automatically predict whether a specific MR can be applied on the target program. Nevertheless, the human‐involved procedure of selecting and extracting code features is costly, and it may not be easy to obtain sufficiently comprehensive features for representing source code. To overcome this limitation, in this study, we explore and evaluate the effectiveness of code representation learning techniques for PMR. By applying neural code representation models for automatically mapping program source code to code vectors, the PMR procedure can be boosted with learned code representations. We develop 32 PMR instances by, respectively, combining 8 code representation models with 4 typical classification models and conduct an extensive empirical study to investigate the effectiveness of code representation learning techniques in the context of MR prediction. Our findings reveal that code representation learning can positively contribute to the prediction of MRs and provide insights into the practical usage of code representation models in the context of MR prediction. Our findings could help researchers and practitioners to gain a deeper understanding of the strength of code representation learning for PMR and, hence, pave the way for future research in deriving or extracting MRs from program source code.
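
The count of 32 PMR instances follows from pairing each of the 8 code representation models with each of the 4 classifiers. A minimal sketch of that enumeration is shown below; the model and classifier names are placeholders, since the abstract does not list the specific ones used.

```python
from itertools import product

# Placeholder names: the abstract does not name the models or classifiers.
representation_models = [f"representation_model_{i}" for i in range(1, 9)]
classifiers = [f"classifier_{i}" for i in range(1, 5)]

# Each PMR instance pairs one representation model with one classifier.
pmr_instances = list(product(representation_models, classifiers))

print(len(pmr_instances))
print(pmr_instances[0])
```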


Citations: 0

Unsafe code detection in Rust and metamorphic testing of autonomous driving systems

Software Testing, Verification and Reliability

Pub Date : 2024-07-08 DOI: 10.1002/stvr.1891

Yves Le Traon, Tao Xie

Citations: 0

Perception simplex: Verifiable collision avoidance in autonomous vehicles amidst obstacle detection faults

Software Testing, Verification and Reliability

Pub Date : 2024-05-28 DOI: 10.1002/stvr.1879

Ayoosh Bansal, Hunmin Kim, Simon Yu, Bo Li, Naira Hovakimyan, Marco Caccamo, Lui Sha

Advances in deep learning have revolutionized cyber‐physical applications, including the development of autonomous vehicles. However, real‐world collisions involving autonomous control of vehicles have raised significant safety concerns regarding the use of deep neural networks (DNNs) in safety‐critical tasks, particularly perception. The inherent unverifiability of DNNs poses a key challenge in ensuring their safe and reliable operation. In this work, we propose perception simplex (), a fault‐tolerant application architecture designed for obstacle detection and collision avoidance. We analyse an existing LiDAR‐based classical obstacle detection algorithm to establish strict bounds on its capabilities and limitations. Such analysis and verification have not been possible for deep learning‐based perception systems yet. By employing verifiable obstacle detection algorithms, identifies obstacle existence detection faults in the output of unverifiable DNN‐based object detectors. When faults with potential collision risks are detected, appropriate corrective actions are initiated. Through extensive analysis and software‐in‐the‐loop simulations, we demonstrate that provides deterministic fault tolerance against obstacle existence detection faults, establishing a robust safety guarantee.
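
The cross-check described in this abstract, where a verifiable detector is used to catch obstacles that the DNN-based detector failed to report, can be sketched roughly as below. The function names, the 2D point representation, the distance tolerance, and the action labels are all assumptions for illustration and are not taken from the paper.

```python
import math

def missed_obstacles(verified_obstacles, dnn_detections, tolerance_m=1.0):
    """Obstacles from the verifiable detector with no DNN detection nearby."""
    return [obstacle for obstacle in verified_obstacles
            if all(math.dist(obstacle, detection) > tolerance_m
                   for detection in dnn_detections)]

def choose_action(verified_obstacles, dnn_detections, tolerance_m=1.0):
    """Trigger a corrective action when an existence detection fault is found."""
    faults = missed_obstacles(verified_obstacles, dnn_detections, tolerance_m)
    if faults:
        return ("corrective_action", faults)
    return ("continue", [])

# Hypothetical (x, y) positions in metres relative to the vehicle.
verified = [(10.0, 0.5), (25.0, -1.0)]
dnn = [(10.2, 0.4)]  # the DNN output lacks the second obstacle

print(choose_action(verified, dnn))
```

The design point the sketch tries to capture is that the safety decision depends only on the output of the component whose limits have been analysed, while the DNN output is treated as something to be checked.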


Citations: 0

Investigating fault injection techniques in hardware‐based deep neural networks and mutation‐based fault localization

Software Testing, Verification and Reliability

Pub Date : 2024-05-07 DOI: 10.1002/stvr.1880

Yves Le Traon, Tao Xie

Citations: 0

MetaSem: metamorphic testing based on semantic information of autonomous driving scenes

Software Testing, Verification and Reliability

Pub Date : 2024-05-01 DOI: 10.1002/stvr.1878

Zhen Yang, Song Huang, Tongtong Bai, Yongming Yao, Yang Wang, Changyou Zheng, Chunyan Xia

The development of artificial intelligence and information communication technology has significantly propelled advancements in autonomous driving. The advent of autonomous driving has a profound impact on societal development and transportation methods. However, as intelligent systems, autonomous driving systems (ADSs) often make wrong judgements in specific scenarios, resulting in accidents. There is an urgent need for comprehensive testing and validation of ADSs. Metamorphic testing (MT) techniques have demonstrated effectiveness in testing ADSs. Nevertheless, existing testing methods primarily encompass relatively simple metamorphic relations (MRs) that only verify ADSs from a single perspective. To ensure the safety of ADSs, it is essential to consider the various elements of driving scenarios during the testing process. Therefore, this paper proposes MetaSem, a novel metamorphic testing method based on semantic information of autonomous driving scenes. Based on semantic information of the autonomous driving scenes and traffic regulations, we design 11 MRs targeting different scenario elements. Three transformation modules are developed to execute addition, deletion and replacement operations on various scene elements within the images. Finally, corresponding evaluation metrics are defined based on MRs. MetaSem automatically discovers inconsistent behaviours according to the evaluation metrics. Our empirical study on three advanced and popular autonomous driving models demonstrates that MetaSem not only efficiently generates visually natural and realistic scene images but also detects 11,787 inconsistent behaviours on three driving models.
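
The general shape of a metamorphic check like the ones this abstract describes (transform a scene, run the model on both versions, and count violations of the expected relation) can be sketched as follows. The toy model, the dictionary scene format, the "add a pedestrian" transformation, and the "must slow down" relation are hypothetical stand-ins; MetaSem operates on images and defines 11 MRs that the abstract does not enumerate.

```python
def count_inconsistencies(model, scenes, transform, relation):
    """Count scenes where the source and follow-up outputs violate the relation."""
    count = 0
    for scene in scenes:
        source_output = model(scene)
        follow_up_output = model(transform(scene))
        if not relation(source_output, follow_up_output):
            count += 1
    return count

def add_pedestrian(scene):
    """Hypothetical 'addition' transformation on a symbolic scene."""
    return {**scene, "pedestrian_ahead": True}

def must_slow_down(source_speed, follow_up_speed):
    """Hypothetical MR: a pedestrian ahead should lower the predicted speed."""
    return follow_up_speed < source_speed

def toy_model(scene):
    """Deliberately faulty stand-in for a driving model: ignores pedestrians at night."""
    if scene["pedestrian_ahead"] and not scene["night"]:
        return 10.0
    return 50.0

scenes = [
    {"night": False, "pedestrian_ahead": False},
    {"night": True, "pedestrian_ahead": False},
]
print(count_inconsistencies(toy_model, scenes, add_pedestrian, must_slow_down))
```

No ground-truth speed is needed for either scene; the violation is detected purely from the relation between the two outputs, which is what makes metamorphic testing useful when an oracle is unavailable.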


Citations: 0
