A. Bombarda, S. Bonfanti, A. Gargantini, Yu Lei, Feng Duan
In this paper, we present an approach to conformance testing based on abstract state machines (ASMs) that combines model refinement and test execution (RATE), together with its application to three case studies. The RATE approach consists of generating test sequences from ASMs and checking conformance between code and models over multiple iterations. The process follows these steps: (1) model the system as an abstract state machine; (2) validate and verify the model; (3) generate test sequences automatically from the ASM model; (4) execute the tests against the implementation and compute code coverage; (5) if coverage is below the desired threshold, refine the abstract state machine model to add the uncovered functionalities and return to step 2. We have applied the proposed approach to three case studies: a traffic light control system (TLCS), the IEEE 11073-20601 personal health device (PHD) protocol and the mechanical ventilator Milano (MVM). By applying RATE, at each refinement level we increased code coverage and identified faults or conformance errors in all three case studies. The fault detection capability of RATE was also confirmed by mutation analysis, which showed that many mutants can be killed even by the most abstract models.
‘RATE: A model-based testing approach that combines model refinement and test execution’. Software Testing, Verification & Reliability, published 2022-12-18. DOI: 10.1002/stvr.1835.
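The refine-and-retest loop at the heart of RATE can be sketched as follows. This is a minimal sketch with hypothetical stand-ins (`measure_coverage`, `refine`) for the paper's actual ASM modelling and test-generation tooling:

```python
def rate(model, measure_coverage, refine, threshold=0.9, max_refinements=5):
    """Iterate RATE steps 2-5: test from the model, measure coverage,
    and refine the model while coverage stays below the threshold."""
    coverage = 0.0
    for _ in range(max_refinements):
        # Steps 2-4: validate the model, generate tests from it, execute
        # them against the implementation and measure code coverage.
        coverage = measure_coverage(model)
        if coverage >= threshold:      # step 5: coverage is good enough
            break
        model = refine(model)          # otherwise refine and loop back
    return model, coverage

# Toy demo: each refinement level uncovers more of the implementation.
coverage_per_level = {0: 0.40, 1: 0.70, 2: 0.95}
final_model, final_cov = rate(0, coverage_per_level.__getitem__, lambda m: m + 1)
```

With the toy data above, the loop stops at refinement level 2, where coverage first exceeds the threshold.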
Deep neural networks (DNNs) are increasingly used as components of larger software systems that need to process complex data, such as images, written text and audio/video signals. DNN predictions cannot be assumed to be always correct, for several reasons: the huge input space being handled, the ambiguity of some input data and the intrinsic properties of learning algorithms, which can provide only statistical guarantees. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor: an additional component that estimates the reliability of the predictions made by untrusted (e.g., DNN) components and activates an automated healing procedure when they are likely to fail, ensuring that the deep learning-based system (DLS) does not cause damage even while its main functionality is suspended.
‘Uncertainty quantification for deep neural networks: An empirical comparison and usage guidelines’ by Michael Weiss and P. Tonella. Software Testing, Verification & Reliability, published 2022-12-14. DOI: 10.1002/stvr.1840.
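The supervisor pattern described above can be sketched as a thin wrapper around the untrusted predictor. The predictor, threshold and fallback below are illustrative placeholders, not the concrete uncertainty estimators the paper compares:

```python
def supervise(predict, inputs, confidence_threshold=0.8):
    """Accept the DNN's prediction only when its confidence estimate is
    high enough; otherwise trigger the healing procedure (here a safe
    fallback label, standing in for e.g. handing control to a human)."""
    outputs = []
    for x in inputs:
        label, confidence = predict(x)
        if confidence >= confidence_threshold:
            outputs.append(label)      # trust the DNN on this input
        else:
            outputs.append("HEAL")     # suspend main functionality safely
    return outputs

# Toy predictor: confident on positive inputs, uncertain otherwise.
toy_predict = lambda x: ("ok", 0.95) if x > 0 else ("ok", 0.30)
decisions = supervise(toy_predict, [1, -1, 2])
```

The key design point is that the supervisor never inspects the prediction itself, only its confidence estimate, so it can wrap any failure-prone component.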
This issue contains two very different papers, in terms both of subject and of the proposed test and verification techniques. The first paper focuses on testing the robustness of digital TV (DTV) receivers through (non)compliance fuzz testing. The second focuses on a model-based approach to verifying multitasking control software, proposing an OS-in-the-Loop (OiL) verification framework. The first paper, ‘A fuzzing-based test-creation approach for evaluating digital TV receivers via transport streams’ by Fabricio Izumi, Eddie B. de Lima Filho, Lucas C. Cordeiro, Orlewilson Maia, Rômulo Fabrício, Bruno Farias and Aguinaldo Silva, concerns the generation of noncompliance tests using grammar-based guided fuzzing. The originality of this contribution resides in the nature of the test subjects: DTV receivers, their (mis)configurations and transport streams. It extends conformance testing towards robustness improvement: instead of checking whether the receiver behaves as expected, the goal is to verify its response to inaccurate or inconsistent data produced by fuzzing-based input generation. Finally, the approach is supported by a complete evaluation framework, which includes a testing environment, audio and video verification algorithms and a strategy for test creation (recommended by Paul Strooper, Rob Hierons and Yves Le Traon). The second paper, ‘OS-in-the-Loop verification for multi-tasking control software’ by Yunja Choi, presents an original approach to the verification of embedded control software, specifically an OiL verification framework. The framework models the embedded operating system and composes the interactions of the OS model and the device controllers via an algorithm described in the paper; multitasking is handled through this composition mechanism. The framework makes it possible to apply various verification methods for multitasking (random simulation, dynamic concolic testing and model checking). The application of OiL verification to a small case study illustrates the benefit of the framework, which has also been successfully applied to two typical pieces of multitasking embedded software from industry (recommended by Benoit Baudry, Rob Hierons and Yves Le Traon). We hope you will find these papers interesting and inspiring for your future work.
‘Fuzz testing for digital TV receivers and multitasking control software verification’ by Yves Le Traon and Tao Xie. Software Testing, Verification & Reliability, published 2022-12-07. DOI: 10.1002/stvr.1836.
Embedded control software that controls safety-critical IoT devices requires systematic and comprehensive verification to ensure safe operation of the device. However, rigorous verification in this domain has not been feasible, owing to the high complexity of embedded control software, which is characterized by the frequent use of multi-tasking, interrupts and periodic alarms. Recognizing that two major factors, scalability and exactness, are extremely difficult to achieve at the same time yet critical for effective and efficient verification in this domain, this work introduces a domain-specific compositional OS-in-the-Loop (OiL) verification approach and sets out to push the boundary on both factors. The suggested approach (1) models the behavior of the underlying operating system to limit the search space using the notion of controlled concurrency, (2) performs heterogeneous composition of controllers with the formal OS model to reduce verification complexity and (3) applies state-of-the-art verification techniques for comprehensive verification up to a given search depth.
‘OS-in-the-Loop verification for multi-tasking control software’ by Yunja Choi. Software Testing, Verification & Reliability, published 2022-11-17. DOI: 10.1002/stvr.1834.
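The core idea of controlled concurrency is that the OS model, not the verifier, decides which task interleavings are possible, shrinking the search space. A toy sketch of bounded exploration under such a composition (the tasks, step function and schedule set are hypothetical, not the paper's formal OS model):

```python
from itertools import product

def explore(allowed_schedules, step, init_state):
    """Enumerate states reachable under only the task interleavings the
    OS model permits (controlled concurrency), up to a bounded depth."""
    reachable = set()
    for schedule in allowed_schedules:
        state = init_state
        for task in schedule:          # compose OS scheduling decisions
            state = step(state, task)  # with controller behaviour
        reachable.add(state)
    return reachable

# Toy composition: task 'A' increments a shared counter, 'B' doubles it;
# here the (hypothetical) OS model allows every interleaving of length 2.
step = lambda s, task: s + 1 if task == "A" else s * 2
states = explore(product("AB", repeat=2), step, 0)
```

A real OS model would permit far fewer schedules than the full product, which is precisely where the complexity reduction comes from.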
Fabrício Izumi, E. Filho, L. Cordeiro, O. Maia, Rômulo Fabrício, B. Farias, Aguinaldo Silva
Digital TV (DTV) receivers are usually submitted to testing systems for conformity and robustness assessment, and their approval implies correct operation under a given DTV specification protocol. However, many broadcasters inadvertently misconfigure their devices and transmit wrong information concerning data structures and protocol format. Since most receivers were not designed to operate under such conditions, malfunction and incorrect behaviour may appear, often reported as field problems, thus compromising a system's operation. Moreover, the way those problems are introduced into DTV signals presents some randomness, but within known restrictions imposed by the underlying transport protocols used in DTV systems, which resembles fuzzing techniques. Any deviation can cause problems, depending on the specific implementation. This error scenario is addressed here, and a novel receiver robustness evaluation methodology based on non-compliance tests using grammar-based guided fuzzing is proposed. In particular, devices are submitted to unforeseen conditions and incorrect configurations, created with guided fuzzing based on real problems, protocol structure and system architecture, so that resources for handling them can be provided and correct operation ensured. Experiments using this fuzzing scheme have shown its efficacy and provided opportunities to improve the robustness of commercial DTV platforms.
‘A fuzzing-based test-creation approach for evaluating digital TV receivers via transport streams’. Software Testing, Verification & Reliability, published 2022-10-02. DOI: 10.1002/stvr.1833.
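Grammar-based guided fuzzing of the kind described above can be sketched with a miniature grammar. The grammar below is hypothetical, a stand-in for the paper's attack grammars, which are derived from the actual DTV transport protocol structure:

```python
import random

# Hypothetical miniature grammar for a transport-stream-like packet.
GRAMMAR = {
    "<packet>":  [["<sync>", "<pid>", "<payload>"]],
    "<sync>":    [["0x47"]],
    "<pid>":     [["0x0000"], ["0x1FFF"]],
    "<payload>": [["DATA"], ["DATA", "<payload>"]],
}

def derive(symbol, rng, depth=6):
    """Randomly expand a grammar symbol into a list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]                # terminal token
    rules = GRAMMAR[symbol]
    rule = rng.choice(rules) if depth > 0 else rules[0]  # force termination
    return [tok for s in rule for tok in derive(s, rng, depth - 1)]

def fuzz(tokens, rng):
    """Guided-fuzzing step: make exactly one field non-compliant."""
    mutated = list(tokens)
    mutated[rng.randrange(len(mutated))] = "0xFF"  # invalid value
    return mutated

rng = random.Random(0)
packet = derive("<packet>", rng)       # a grammar-valid packet
bad_packet = fuzz(packet, rng)         # the same packet, now non-compliant
```

Because the mutation is applied to a grammar-valid derivation, the result is "almost valid" input, exactly the kind of inaccurate-but-plausible data the receiver must tolerate.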
This issue contains two papers. The first focuses on combinatorial testing and the second on model checking. The first paper, ‘Combinatorial methods for dynamic grey-box SQL injection testing’ by Bernhard Garn, Jovan Zivanovic, Manuel Leithner and Dimitris E. Simos, concerns combinatorial testing for SQL injection. Code injection attacks, and in particular SQL injection (SQLi) attacks, remain among the most critical threats to web applications. These attacks exploit vulnerabilities, which must be actively hunted down to deploy a secure system. Leveraging combinatorial testing, the authors propose novel attack grammars to generate SQLi attacks against MySQL-compatible databases. One original aspect of this contribution is the dynamic optimization and adaptation of the attack grammars to the context. This context-sensitive adaptation technique is supported by a prototype tool named SQLInjector+ and is validated and benchmarked on a representative set of web applications under test. The contribution is accompanied by a nice addition to the field: a simple framework called WAFTF for testing the filtering techniques of web application firewalls such as ModSecurity (recommended by Yves Le Traon). The second paper, ‘Comprehensive evaluation of file systems robustness with SPIN model checking’ by Jingcheng Yuan, Toshiaki Aoki and Xiaoyun Guo, presents a study that comprehensively evaluates the robustness of file systems using a model checking approach, covering the majority of mainstream file system types in both single-thread and multi-thread modes. In particular, to abstract real file systems, the authors developed Promela models optimized to avoid state explosion during model checking and used the SPIN model checker to detect corner-case errors during an unexpected power outage. The authors analysed counterexamples generated by model checking to derive an improved file system model capable of preventing errors in most mainstream file system types, then rechecked the improved model and verified the absence of all critical errors.
‘Combinatorial testing and model checking’ by Yves Le Traon and Tao Xie. Software Testing, Verification & Reliability, published 2022-08-10. DOI: 10.1002/stvr.1832.
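Combinatorial generation from an SQLi attack grammar can be illustrated in a few lines. The parameter values below are illustrative only; the paper's grammars are far richer and are adapted dynamically to the target's context:

```python
from itertools import product

# Illustrative parameter values for a toy SQLi attack grammar.
PREFIX  = ["'", '"', ""]                 # quote used to break out of the query
BOOLEAN = ["OR 1=1", "OR 'a'='a'"]       # always-true predicate
COMMENT = ["--", "#", ""]                # trailing-clause neutralizer

def attack_strings():
    """Full combinatorial product over the grammar's parameters."""
    return [f"{p} {b} {c}".strip()
            for p, b, c in product(PREFIX, BOOLEAN, COMMENT)]
```

Even this toy grammar yields 3 × 2 × 3 = 18 distinct attack strings; combinatorial coverage criteria (e.g. pairwise) then let a tester trade exhaustiveness for test-suite size.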
R. Gopinath, Jie M. Zhang, Marinos Kintis, Mike Papadakis
from EFSM specifications using numerous coverage criteria, which are evaluated using mutation analysis. The authors present their results and provide recommendations for practitioners. 3. The third paper is ‘Learning-based Mutant Reduction using Fine-grained Mutation Operators’ by Shin Hong and Yunho Kim. This paper proposes MUTRAIN, a technique for reducing the cost of mutation testing. It uses cost-considerate linear regression to learn a mutation model that allows the mutation score to be predicted from a much smaller set of fine-grained mutation operators.
‘Mutation analysis and its industrial applications’. Software Testing, Verification & Reliability, published 2022-08-05. DOI: 10.1002/stvr.1830.
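The idea of predicting the full mutation score from a cheap operator subset via linear regression can be sketched as follows. The training data is entirely hypothetical, and MUTRAIN additionally weighs operator cost, which this sketch omits:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ a*x + b (single feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Hypothetical training data: mutation score measured on a cheap
# fine-grained operator subset vs. the full score on the same programs.
cheap_scores = [0.2, 0.4, 0.6, 0.8]
full_scores  = [0.3, 0.5, 0.7, 0.9]
a, b = fit_line(cheap_scores, full_scores)

# At prediction time, only the cheap subset needs to be executed.
predicted_full = a * 0.5 + b
```

Once the model is fitted, mutation testing on new programs runs only the small operator subset, and the regression recovers an estimate of the full score.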
In existing computer systems, file systems are indispensable for organizing user data and system code. However, several studies have reported file system errors that cause significant data loss or system crashes. Most of these errors are due to external failures, such as an unexpected power outage, yet comprehensively evaluating file system robustness to detect them is challenging. Different types of file systems use different data structures and algorithms for various applications. Moreover, file system errors may be triggered by unpredictable external conditions. In addition, a file system works in the operating system's kernel layer as a passive module and runs in multi-thread mode, which makes file system testing time-intensive. Furthermore, the large number of states in file systems leads to state explosion during exhaustive checking. In this study, we comprehensively evaluated the robustness of file systems with respect to multiple properties using a model checking approach. The evaluation covered the majority of mainstream file system types and included both single-thread and multi-thread modes. We developed Promela models that abstract the real file systems and checked them using the SPIN model checker; the models were optimized to avoid state explosion during model checking. Using model checking, we successfully detected corner-case errors during an unexpected power outage. By analysing counterexamples generated by model checking, we derived an improved file system model capable of preventing errors in most mainstream file system types. Finally, we rechecked the improved file system model and verified the absence of all critical errors.
‘Comprehensive evaluation of file systems robustness with SPIN model checking’ by Jingcheng Yuan, Toshiaki Aoki and Xiaoyun Guo. Software Testing, Verification & Reliability, published 2022-07-20. DOI: 10.1002/stvr.1828.
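A toy explicit-state check in the spirit of the paper's Promela/SPIN models can make the power-outage scenario concrete. The two-step write and its safety property below are hypothetical simplifications, not the paper's actual models:

```python
def crash_states(init=(0, 0)):
    """On-disk (data, metadata) states observable if power is lost
    before, between or after the two steps of a write operation."""
    data, meta = init
    after_data = (1, meta)   # step 1: data block written
    after_meta = (1, 1)      # step 2: metadata committed
    return {init, after_data, after_meta}

def consistent(state):
    data, meta = state
    # Safety property: metadata must never point at an unwritten block.
    return meta == 0 or data == 1

all_safe = all(consistent(s) for s in crash_states())
```

Note the ordering matters: committing metadata before writing the data block would admit the inconsistent crash state `(0, 1)`, which is exactly the kind of corner case a model checker surfaces as a counterexample.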
This special issue contains articles that are extended versions of some of the best papers presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2020). ICST is intended as a common forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the area of Software Testing, Verification and Validation. The articles, covering diverse topics in software testing and verification, are ‘Fostering the Diversity of Exploratory Testing in Web Applications’ by Leveau et al., ‘RVPRIO: A Tool for Prioritizing Runtime Verification Violations’ by Cabral et al., and ‘Automated Black-Box Testing of Nominal and Error Scenarios in RESTful APIs’ by Corradini et al. In the first article, the authors investigate exploratory testing, a form of software testing that leverages business expertise, in the context of web applications. They propose a new approach that monitors online interactions performed by testers to suggest new interactions, thus enabling deeper explorations of the applications. In the second article, the authors leverage machine learning to prioritise violations reported by runtime verification, leading to the discovery of previously unknown bugs in open-source projects. In the third article, the authors develop black-box testing techniques for RESTful APIs, a mainstream approach for web API design, leading to the discovery of new faults in already deployed web services. We would like to thank the authors for submitting their contributions and the reviewers for their excellent job. We would also like to thank Rob Hierons for his kind guidance and great patience with this volume.
{"title":"IEEE International Conference on Software Testing, Verification and Validation (ICST 2020)","authors":"C. Pasareanu, A. Zeller","doi":"10.1002/stvr.1829","DOIUrl":"https://doi.org/10.1002/stvr.1829","url":null,"abstract":"This special issue contains articles which are extended versions of some of the best papers presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2020). ICST is intended as a common forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the area of Software Testing, Verification and Validation. The articles are ‘ Fostering the Diversity of Exploratory Testing in Web Applications ’ , by Leveau et al., ‘ RVPRIO: a Tool for Prioritizing Runtime Verification Violations ’ , by Cabral et al., and ‘ Automated Black-Box Testing of Nominal and Error Scenarios in RESTful APIs ’ , by Corradini et al., covering diverse topics in software testing and verification. In the first article, the authors investigate exploratory testing, a form of software testing that leverages business expertise, in the context of web applications. They propose a new approach that monitors online interactions performed by testers to suggest new interactions, thus enabling deeper explorations of the applications. In the second article, the authors leverage machine learning to prioritise violations reported by runtime verification, leading to the discovery of previously unknown bugs in open-source projects. In the third article, the authors develop black-box testing techniques for RESTful APIs, a mainstream approach for web API design, leading to the discovery of new faults in already deployed web services. We would like to thank the authors for submitting their contributions and the reviewers for their excellent job. 
We would also like to thank Rob Hierons for his kind guidance and great patience with this volume.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"18 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72519647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning (ML) may enable effective automated test generation. We characterize emerging research at this intersection, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges, by performing a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test oracles, including test verdict, property‐based, and expected‐output oracles. Supervised learning—often based on neural networks—and reinforcement learning—often based on Q‐learning—are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.
{"title":"The integration of machine learning into automated test generation: A systematic mapping study","authors":"Afonso Fontes, Gregory Gay","doi":"10.1002/stvr.1845","DOIUrl":"https://doi.org/10.1002/stvr.1845","url":null,"abstract":"Machine learning (ML) may enable effective automated test generation. We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges in this intersection by performing. We perform a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property‐based, and expected output oracles. Supervised learning—often based on neural networks—and reinforcement learning—often based on Q‐learning—are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work‐to‐date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, ML algorithms employed—and how they are applied—benchmarks, and replicability. 
Our findings can serve as a roadmap and inspiration for researchers in this field.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"28 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75501628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
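The mapping study above notes that reinforcement learning for test generation, often Q-learning, is typically evaluated with testing metrics tied to the reward function (e.g., coverage). The following toy sketch shows that coupling: a single-state Q-learner picks test inputs and is rewarded only for branches not yet covered. The function under test, the candidate inputs, and the hyperparameters are all illustrative assumptions, not taken from any surveyed publication.

```python
import random

def fut(x):
    """Function under test: returns the set of branch labels it executes."""
    branches = set()
    if x > 10:
        branches.add("gt10")
        if x % 2 == 0:
            branches.add("gt10_even")
    else:
        branches.add("le10")
    return branches

ACTIONS = [0, 5, 11, 12]           # candidate test inputs (assumed domain)
q = {a: 0.0 for a in ACTIONS}      # single-state Q-table: one value per input
covered = set()                    # branches covered so far
ALPHA, EPSILON = 0.5, 0.2          # learning rate, exploration rate

random.seed(0)
for _ in range(200):
    # epsilon-greedy selection of the next test input
    if random.random() < EPSILON:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=q.get)
    new = fut(a) - covered         # reward = number of newly covered branches
    covered |= new
    q[a] += ALPHA * (len(new) - q[a])

print(sorted(covered))
```

Because the reward is "new branches covered", inputs lose value once their branches are exhausted, which pushes exploration toward uncovered behaviour; this is also why, as the study observes, such agents end up being evaluated by the same coverage metric that defines their reward.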