A. Bombarda, S. Bonfanti, A. Gargantini, Yu Lei, Feng Duan
In this paper, we present an approach to conformance testing based on abstract state machines (ASMs) that combines model refinement and test execution (RATE), together with its application to three case studies. The RATE approach consists of generating test sequences from ASMs and checking conformance between code and models over multiple iterations. The process follows these steps: (1) model the system as an abstract state machine; (2) validate and verify the model; (3) generate test sequences automatically from the ASM model; (4) execute the tests against the implementation and compute code coverage; (5) if coverage is below the desired threshold, refine the abstract state machine model to add the uncovered functionalities and return to step 2. We have applied the proposed approach to three case studies: a traffic light control system (TLCS), the IEEE 11073-20601 personal health device (PHD) protocol and the mechanical ventilator Milano (MVM). By applying RATE, at each refinement level we increased code coverage and identified faults or conformance errors in all three case studies. The fault detection capability of RATE was also confirmed by mutation analysis, which showed that many mutants can be killed even by the most abstract models.
‘RATE: A model-based testing approach that combines model refinement and test execution’. Software Testing, Verification & Reliability, published 2022-12-18. DOI: 10.1002/stvr.1835.
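The refine-and-retest loop at the heart of RATE can be sketched as follows. This is a minimal sketch with hypothetical stand-ins (`measure_coverage`, `refine`) for the paper's actual ASM modelling and test-generation tooling:

```python
def rate(model, measure_coverage, refine, threshold=0.9, max_refinements=5):
    """Iterate RATE steps 2-5: test from the model, measure coverage,
    and refine the model while coverage stays below the threshold."""
    coverage = 0.0
    for _ in range(max_refinements):
        # Steps 2-4: validate the model, generate tests from it, execute
        # them against the implementation and measure code coverage.
        coverage = measure_coverage(model)
        if coverage >= threshold:      # step 5: coverage is good enough
            break
        model = refine(model)          # otherwise refine and loop back
    return model, coverage

# Toy demo: each refinement level uncovers more of the implementation.
coverage_per_level = {0: 0.40, 1: 0.70, 2: 0.95}
final_model, final_cov = rate(0, coverage_per_level.__getitem__, lambda m: m + 1)
```

With the toy data above, the loop stops at refinement level 2, where coverage first exceeds the threshold.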
Deep neural networks (DNNs) are increasingly used as components of larger software systems that need to process complex data, such as images, written text and audio/video signals. DNN predictions cannot be assumed to be always correct, for several reasons: the huge input space being handled, the ambiguity of some input data and the intrinsic properties of learning algorithms, which can provide only statistical guarantees. Hence, developers have to cope with some residual error probability. An architectural pattern commonly adopted to manage failure-prone components is the supervisor: an additional component that estimates the reliability of the predictions made by untrusted (e.g., DNN) components and activates an automated healing procedure when they are likely to fail, ensuring that the deep learning-based system (DLS) does not cause damage even while its main functionality is suspended.
‘Uncertainty quantification for deep neural networks: An empirical comparison and usage guidelines’ by Michael Weiss and P. Tonella. Software Testing, Verification & Reliability, published 2022-12-14. DOI: 10.1002/stvr.1840.
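The supervisor pattern described above can be sketched as a thin wrapper around the untrusted predictor. The predictor, threshold and fallback below are illustrative placeholders, not the concrete uncertainty estimators the paper compares:

```python
def supervise(predict, inputs, confidence_threshold=0.8):
    """Accept the DNN's prediction only when its confidence estimate is
    high enough; otherwise trigger the healing procedure (here a safe
    fallback label, standing in for e.g. handing control to a human)."""
    outputs = []
    for x in inputs:
        label, confidence = predict(x)
        if confidence >= confidence_threshold:
            outputs.append(label)      # trust the DNN on this input
        else:
            outputs.append("HEAL")     # suspend main functionality safely
    return outputs

# Toy predictor: confident on positive inputs, uncertain otherwise.
toy_predict = lambda x: ("ok", 0.95) if x > 0 else ("ok", 0.30)
decisions = supervise(toy_predict, [1, -1, 2])
```

The key design point is that the supervisor never inspects the prediction itself, only its confidence estimate, so it can wrap any failure-prone component.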
This issue contains two very different papers, in terms both of subject and of the proposed test and verification techniques. The first paper focuses on testing the robustness of digital TV (DTV) receivers through (non)compliance fuzz testing. The second focuses on a model-based approach to verifying multitasking control software, proposing an OS-in-the-Loop (OiL) verification framework. The first paper, ‘A fuzzing-based test-creation approach for evaluating digital TV receivers via transport streams’ by Fabricio Izumi, Eddie B. de Lima Filho, Lucas C. Cordeiro, Orlewilson Maia, Rômulo Fabrício, Bruno Farias and Aguinaldo Silva, concerns the generation of noncompliance tests using grammar-based guided fuzzing. The originality of this contribution resides in the nature of the test subjects: DTV receivers, their (mis)configurations and transport streams. It extends conformance testing towards robustness improvement: instead of checking whether the receiver behaves as expected, the goal is to verify its response to inaccurate or inconsistent data produced by fuzzing-based input generation. Finally, the approach is supported by a complete evaluation framework, which includes a testing environment, audio and video verification algorithms and a strategy for test creation (recommended by Paul Strooper, Rob Hierons and Yves Le Traon). The second paper, ‘OS-in-the-Loop verification for multi-tasking control software’ by Yunja Choi, presents an original approach to the verification of embedded control software, specifically an OiL verification framework. The framework models the embedded operating system and composes the interactions of the OS model and the device controllers via an algorithm described in the paper; multitasking is handled through this composition mechanism. The framework makes it possible to apply various verification methods for multitasking (random simulation, dynamic concolic testing and model checking). The application of OiL verification to a small case study illustrates the benefit of the framework, which has also been successfully applied to two typical pieces of multitasking embedded software from industry (recommended by Benoit Baudry, Rob Hierons and Yves Le Traon). We hope you will find these papers interesting and inspiring for your future work.
‘Fuzz testing for digital TV receivers and multitasking control software verification’ by Yves Le Traon and Tao Xie. Software Testing, Verification & Reliability, published 2022-12-07. DOI: 10.1002/stvr.1836.
Embedded control software that controls safety-critical IoT devices requires systematic and comprehensive verification to ensure safe operation of the device. However, rigorous verification in this domain has not been feasible, owing to the high complexity of embedded control software, which is characterized by the frequent use of multi-tasking, interrupts and periodic alarms. Recognizing that two major factors, scalability and exactness, are extremely difficult to achieve at the same time yet critical for effective and efficient verification in this domain, this work introduces a domain-specific compositional OS-in-the-Loop (OiL) verification approach and sets out to push the boundary on both factors. The suggested approach (1) models the behavior of the underlying operating system to limit the search space using the notion of controlled concurrency, (2) performs heterogeneous composition of controllers with the formal OS model to reduce verification complexity and (3) applies state-of-the-art verification techniques for comprehensive verification up to a given search depth.
‘OS-in-the-Loop verification for multi-tasking control software’ by Yunja Choi. Software Testing, Verification & Reliability, published 2022-11-17. DOI: 10.1002/stvr.1834.
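The core idea of controlled concurrency is that the OS model, not the verifier, decides which task interleavings are possible, shrinking the search space. A toy sketch of bounded exploration under such a composition (the tasks, step function and schedule set are hypothetical, not the paper's formal OS model):

```python
from itertools import product

def explore(allowed_schedules, step, init_state):
    """Enumerate states reachable under only the task interleavings the
    OS model permits (controlled concurrency), up to a bounded depth."""
    reachable = set()
    for schedule in allowed_schedules:
        state = init_state
        for task in schedule:          # compose OS scheduling decisions
            state = step(state, task)  # with controller behaviour
        reachable.add(state)
    return reachable

# Toy composition: task 'A' increments a shared counter, 'B' doubles it;
# here the (hypothetical) OS model allows every interleaving of length 2.
step = lambda s, task: s + 1 if task == "A" else s * 2
states = explore(product("AB", repeat=2), step, 0)
```

A real OS model would permit far fewer schedules than the full product, which is precisely where the complexity reduction comes from.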
Fabrício Izumi, E. Filho, L. Cordeiro, O. Maia, Rômulo Fabrício, B. Farias, Aguinaldo Silva
Digital TV (DTV) receivers are usually submitted to testing systems for conformity and robustness assessment, and their approval implies correct operation under a given DTV specification protocol. However, many broadcasters inadvertently misconfigure their devices and transmit wrong information concerning data structures and protocol format. Since most receivers were not designed to operate under such conditions, malfunction and incorrect behaviour may appear, often reported as field problems, thus compromising a system's operation. Moreover, the way those problems are introduced into DTV signals presents some randomness, but within known restrictions imposed by the underlying transport protocols used in DTV systems, which resembles fuzzing techniques. Any deviation can cause problems, depending on the specific implementation. This error scenario is addressed here, and a novel receiver robustness evaluation methodology based on non-compliance tests using grammar-based guided fuzzing is proposed. In particular, devices are submitted to unforeseen conditions and incorrect configurations, created with guided fuzzing based on real problems, protocol structure and system architecture, so that resources for handling them can be provided and correct operation ensured. Experiments using this fuzzing scheme have shown its efficacy and provided opportunities to improve the robustness of commercial DTV platforms.
‘A fuzzing-based test-creation approach for evaluating digital TV receivers via transport streams’. Software Testing, Verification & Reliability, published 2022-10-02. DOI: 10.1002/stvr.1833.
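Grammar-based guided fuzzing of the kind described above can be sketched with a miniature grammar. The grammar below is hypothetical, a stand-in for the paper's attack grammars, which are derived from the actual DTV transport protocol structure:

```python
import random

# Hypothetical miniature grammar for a transport-stream-like packet.
GRAMMAR = {
    "<packet>":  [["<sync>", "<pid>", "<payload>"]],
    "<sync>":    [["0x47"]],
    "<pid>":     [["0x0000"], ["0x1FFF"]],
    "<payload>": [["DATA"], ["DATA", "<payload>"]],
}

def derive(symbol, rng, depth=6):
    """Randomly expand a grammar symbol into a list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]                # terminal token
    rules = GRAMMAR[symbol]
    rule = rng.choice(rules) if depth > 0 else rules[0]  # force termination
    return [tok for s in rule for tok in derive(s, rng, depth - 1)]

def fuzz(tokens, rng):
    """Guided-fuzzing step: make exactly one field non-compliant."""
    mutated = list(tokens)
    mutated[rng.randrange(len(mutated))] = "0xFF"  # invalid value
    return mutated

rng = random.Random(0)
packet = derive("<packet>", rng)       # a grammar-valid packet
bad_packet = fuzz(packet, rng)         # the same packet, now non-compliant
```

Because the mutation is applied to a grammar-valid derivation, the result is "almost valid" input, exactly the kind of inaccurate-but-plausible data the receiver must tolerate.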
This issue contains two papers. The first focuses on combinatorial testing and the second on model checking. The first paper, ‘Combinatorial methods for dynamic grey-box SQL injection testing’ by Bernhard Garn, Jovan Zivanovic, Manuel Leithner and Dimitris E. Simos, concerns combinatorial testing for SQL injection. Code injection attacks, and in particular SQL injection (SQLi) attacks, remain among the most critical threats to web applications. These attacks exploit vulnerabilities, which must be actively hunted down to deploy a secure system. Leveraging combinatorial testing, the authors propose novel attack grammars to generate SQLi attacks against MySQL-compatible databases. One original aspect of this contribution is the dynamic optimization and adaptation of the attack grammars to the context. This context-sensitive adaptation technique is supported by a prototype tool named SQLInjector+ and is validated and benchmarked on a representative set of web applications under test. The contribution is accompanied by a nice addition to the field: a simple framework called WAFTF for testing the filtering techniques of web application firewalls such as ModSecurity (recommended by Yves Le Traon). The second paper, ‘Comprehensive evaluation of file systems robustness with SPIN model checking’ by Jingcheng Yuan, Toshiaki Aoki and Xiaoyun Guo, presents a study that comprehensively evaluates the robustness of file systems using a model checking approach, covering the majority of mainstream file system types in both single-thread and multi-thread modes. In particular, to abstract real file systems, the authors developed Promela models optimized to avoid state explosion during model checking and used the SPIN model checker to detect corner-case errors during an unexpected power outage. The authors analysed counterexamples generated by model checking to derive an improved file system model capable of preventing errors in most mainstream file system types, then rechecked the improved model and verified the absence of all critical errors.
‘Combinatorial testing and model checking’ by Yves Le Traon and Tao Xie. Software Testing, Verification & Reliability, published 2022-08-10. DOI: 10.1002/stvr.1832.
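Combinatorial generation from an SQLi attack grammar can be illustrated in a few lines. The parameter values below are illustrative only; the paper's grammars are far richer and are adapted dynamically to the target's context:

```python
from itertools import product

# Illustrative parameter values for a toy SQLi attack grammar.
PREFIX  = ["'", '"', ""]                 # quote used to break out of the query
BOOLEAN = ["OR 1=1", "OR 'a'='a'"]       # always-true predicate
COMMENT = ["--", "#", ""]                # trailing-clause neutralizer

def attack_strings():
    """Full combinatorial product over the grammar's parameters."""
    return [f"{p} {b} {c}".strip()
            for p, b, c in product(PREFIX, BOOLEAN, COMMENT)]
```

Even this toy grammar yields 3 × 2 × 3 = 18 distinct attack strings; combinatorial coverage criteria (e.g. pairwise) then let a tester trade exhaustiveness for test-suite size.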
R. Gopinath, Jie M. Zhang, Marinos Kintis, Mike Papadakis
from EFSM specifications using numerous coverage criteria, which are evaluated using mutation analysis. The authors present their results and provide recommendations for practitioners. 3. The third paper is ‘Learning-based Mutant Reduction using Fine-grained Mutation Operators’ by Shin Hong and Yunho Kim. This paper proposes MUTRAIN, a technique for reducing the cost of mutation testing. It uses cost-considerate linear regression to learn a mutation model that allows the mutation score to be predicted from a much smaller set of fine-grained mutation operators.
‘Mutation analysis and its industrial applications’. Software Testing, Verification & Reliability, published 2022-08-05. DOI: 10.1002/stvr.1830.
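The idea of predicting the full mutation score from a cheap operator subset via linear regression can be sketched as follows. The training data is entirely hypothetical, and MUTRAIN additionally weighs operator cost, which this sketch omits:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ a*x + b (single feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Hypothetical training data: mutation score measured on a cheap
# fine-grained operator subset vs. the full score on the same programs.
cheap_scores = [0.2, 0.4, 0.6, 0.8]
full_scores  = [0.3, 0.5, 0.7, 0.9]
a, b = fit_line(cheap_scores, full_scores)

# At prediction time, only the cheap subset needs to be executed.
predicted_full = a * 0.5 + b
```

Once the model is fitted, mutation testing on new programs runs only the small operator subset, and the regression recovers an estimate of the full score.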
In existing computer systems, file systems are indispensable for organizing user data and system code. However, several studies have reported file system errors that cause significant data loss or system crashes. Most of these errors are due to external failures, such as an unexpected power outage, yet comprehensively evaluating file system robustness to detect them is challenging. Different types of file systems use different data structures and algorithms for various applications. Moreover, file system errors may be triggered by unpredictable external conditions. In addition, a file system works in the operating system's kernel layer as a passive module and runs in multi-thread mode, which makes file system testing time-intensive. Furthermore, the large number of states in file systems leads to state explosion during exhaustive checking. In this study, we comprehensively evaluated the robustness of file systems with respect to multiple properties using a model checking approach. The evaluation covered the majority of mainstream file system types and included both single-thread and multi-thread modes. We developed Promela models that abstract the real file systems and checked them using the SPIN model checker; the models were optimized to avoid state explosion during model checking. Using model checking, we successfully detected corner-case errors during an unexpected power outage. By analysing counterexamples generated by model checking, we derived an improved file system model capable of preventing errors in most mainstream file system types. Finally, we rechecked the improved file system model and verified the absence of all critical errors.
‘Comprehensive evaluation of file systems robustness with SPIN model checking’ by Jingcheng Yuan, Toshiaki Aoki and Xiaoyun Guo. Software Testing, Verification & Reliability, published 2022-07-20. DOI: 10.1002/stvr.1828.
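A toy explicit-state check in the spirit of the paper's Promela/SPIN models can make the power-outage scenario concrete. The two-step write and its safety property below are hypothetical simplifications, not the paper's actual models:

```python
def crash_states(init=(0, 0)):
    """On-disk (data, metadata) states observable if power is lost
    before, between or after the two steps of a write operation."""
    data, meta = init
    after_data = (1, meta)   # step 1: data block written
    after_meta = (1, 1)      # step 2: metadata committed
    return {init, after_data, after_meta}

def consistent(state):
    data, meta = state
    # Safety property: metadata must never point at an unwritten block.
    return meta == 0 or data == 1

all_safe = all(consistent(s) for s in crash_states())
```

Note the ordering matters: committing metadata before writing the data block would admit the inconsistent crash state `(0, 1)`, which is exactly the kind of corner case a model checker surfaces as a counterexample.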
This special issue contains articles that are extended versions of some of the best papers presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2020). ICST is intended as a common forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the area of Software Testing, Verification and Validation. The articles, covering diverse topics in software testing and verification, are ‘Fostering the Diversity of Exploratory Testing in Web Applications’ by Leveau et al., ‘RVPRIO: A Tool for Prioritizing Runtime Verification Violations’ by Cabral et al., and ‘Automated Black-Box Testing of Nominal and Error Scenarios in RESTful APIs’ by Corradini et al. In the first article, the authors investigate exploratory testing, a form of software testing that leverages business expertise, in the context of web applications. They propose a new approach that monitors online interactions performed by testers to suggest new interactions, thus enabling deeper explorations of the applications. In the second article, the authors leverage machine learning to prioritise violations reported by runtime verification, leading to the discovery of previously unknown bugs in open-source projects. In the third article, the authors develop black-box testing techniques for RESTful APIs, a mainstream approach for web API design, leading to the discovery of new faults in already deployed web services. We would like to thank the authors for submitting their contributions and the reviewers for their excellent job. We would also like to thank Rob Hierons for his kind guidance and great patience with this volume.
{"title":"IEEE International Conference on Software Testing, Verification and Validation (ICST 2020)","authors":"C. Pasareanu, A. Zeller","doi":"10.1002/stvr.1829","DOIUrl":"https://doi.org/10.1002/stvr.1829","url":null,"abstract":"This special issue contains articles which are extended versions of some of the best papers presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2020). ICST is intended as a common forum for researchers, scientists, engineers and practitioners throughout the world to present their latest research findings, ideas, developments and applications in the area of Software Testing, Verification and Validation. The articles are ‘ Fostering the Diversity of Exploratory Testing in Web Applications ’ , by Leveau et al., ‘ RVPRIO: a Tool for Prioritizing Runtime Verification Violations ’ , by Cabral et al., and ‘ Automated Black-Box Testing of Nominal and Error Scenarios in RESTful APIs ’ , by Corradini et al., covering diverse topics in software testing and verification. In the first article, the authors investigate exploratory testing, a form of software testing that leverages business expertise, in the context of web applications. They propose a new approach that monitors online interactions performed by testers to suggest new interactions, thus enabling deeper explorations of the applications. In the second article, the authors leverage machine learning to prioritise violations reported by runtime verification, leading to the discovery of previously unknown bugs in open-source projects. In the third article, the authors develop black-box testing techniques for RESTful APIs, a mainstream approach for web API design, leading to the discovery of new faults in already deployed web services. We would like to thank the authors for submitting their contributions and the reviewers for their excellent job. 
We would also like to thank Rob Hierons for his kind guidance and great patience with this volume.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"18 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72519647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine learning (ML) may enable effective automated test generation. We characterize emerging research at this intersection, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges, by performing a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test oracles, including test verdict, property‐based, and expected‐output oracles. Supervised learning—often based on neural networks—and reinforcement learning—often based on Q‐learning—are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work to date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, the ML algorithms employed and how they are applied, benchmarks, and replicability. Our findings can serve as a roadmap and inspiration for researchers in this field.
{"title":"The integration of machine learning into automated test generation: A systematic mapping study","authors":"Afonso Fontes, Gregory Gay","doi":"10.1002/stvr.1845","DOIUrl":"https://doi.org/10.1002/stvr.1845","url":null,"abstract":"Machine learning (ML) may enable effective automated test generation. We characterize emerging research, examining testing practices, researcher goals, ML techniques applied, evaluation, and challenges in this intersection by performing. We perform a systematic mapping study on a sample of 124 publications. ML generates input for system, GUI, unit, performance, and combinatorial testing or improves the performance of existing generation methods. ML is also used to generate test verdicts, property‐based, and expected output oracles. Supervised learning—often based on neural networks—and reinforcement learning—often based on Q‐learning—are common, and some publications also employ unsupervised or semi‐supervised learning. (Semi‐/Un‐)Supervised approaches are evaluated using both traditional testing metrics and ML‐related metrics (e.g., accuracy), while reinforcement learning is often evaluated using testing metrics tied to the reward function. The work‐to‐date shows great promise, but there are open challenges regarding training data, retraining, scalability, evaluation complexity, ML algorithms employed—and how they are applied—benchmarks, and replicability. 
Our findings can serve as a roadmap and inspiration for researchers in this field.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"28 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75501628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
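The mapping study above notes that reinforcement learning for test generation, often Q-learning, is typically evaluated with testing metrics tied to the reward function (e.g., coverage). The following toy sketch shows that coupling: a single-state Q-learner picks test inputs and is rewarded only for branches not yet covered. The function under test, the candidate inputs, and the hyperparameters are all illustrative assumptions, not taken from any surveyed publication.

```python
import random

def fut(x):
    """Function under test: returns the set of branch labels it executes."""
    branches = set()
    if x > 10:
        branches.add("gt10")
        if x % 2 == 0:
            branches.add("gt10_even")
    else:
        branches.add("le10")
    return branches

ACTIONS = [0, 5, 11, 12]           # candidate test inputs (assumed domain)
q = {a: 0.0 for a in ACTIONS}      # single-state Q-table: one value per input
covered = set()                    # branches covered so far
ALPHA, EPSILON = 0.5, 0.2          # learning rate, exploration rate

random.seed(0)
for _ in range(200):
    # epsilon-greedy selection of the next test input
    if random.random() < EPSILON:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=q.get)
    new = fut(a) - covered         # reward = number of newly covered branches
    covered |= new
    q[a] += ALPHA * (len(new) - q[a])

print(sorted(covered))
```

Because the reward is "new branches covered", inputs lose value once their branches are exhausted, which pushes exploration toward uncovered behaviour; this is also why, as the study observes, such agents end up being evaluated by the same coverage metric that defines their reward.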