Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00016
Samuel Peacock, Lin Deng, J. Dehlinger, Suranjan Chakraborty
Mutation testing is a technique that is effective both for designing tests and for evaluating an existing test suite. Although mutation testing has been developed over many years to be applicable to, and effective for, different types of software systems and programming languages, it has not yet seen wide industrial use. One primary reason that keeps developers and testers from using mutation testing is its high computational cost. Specifically, the need to manually identify equivalent mutants is a major obstacle that makes mutation testing very time-consuming and labor-intensive. This paper addresses this limitation and proposes a machine learning-based approach that designs and trains an abstract syntax tree recurrent neural network model to automatically classify equivalent mutants during mutation testing. A pilot study with 582 mutants shows that the proposed approach can automatically classify equivalent mutants with an accuracy higher than 90%. The approach can significantly reduce the manual effort and time spent on identifying equivalent mutants.
Title : Automatic Equivalent Mutants Classification Using Abstract Syntax Tree Neural Networks
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
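To make the equivalent-mutant problem from the abstract above concrete, the following minimal Python sketch contrasts a killable mutant with an equivalent one. The function names and the chosen mutation (replacing `>` in a max function) are illustrative assumptions, not taken from the paper, and the paper's AST-based neural classifier is not reproduced here.

```python
from itertools import product

def max_original(a, b):
    # Program under test.
    return a if a > b else b

def max_equivalent_mutant(a, b):
    # Relational operator replaced: > became >=.
    # When a == b both branches return the same value, so no input
    # can distinguish this mutant from the original.
    return a if a >= b else b

def max_killable_mutant(a, b):
    # Relational operator replaced: > became <.
    # This computes the minimum, so any input with a != b kills it.
    return a if a < b else b

def killing_inputs(mutant, inputs):
    """Return the inputs on which the mutant's output differs from the original's."""
    return [(a, b) for a, b in inputs if mutant(a, b) != max_original(a, b)]

# Small exhaustive input space, used only for illustration.
INPUTS = list(product(range(-3, 4), repeat=2))

print(len(killing_inputs(max_equivalent_mutant, INPUTS)))  # 0
print(len(killing_inputs(max_killable_mutant, INPUTS)))    # 42
```

Running inputs like this can only fail to find a killing input; it cannot prove equivalence in general, which is why equivalent mutants are typically identified by hand or, as proposed above, by a learned classifier.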
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00040
Sylvain Hallé, R. Khoury
The paper presents a theoretical foundation for test sequence generation based on an input specification. The set of possible test sequences is first partitioned according to a generic “triaging” function, which can be created from a state-machine specification in various ways. The notion of a coverage metric is then expressed in terms of the categories produced by this function. Many existing test generation problems, such as t-way state or transition coverage, become particular cases of this generic framework. We then present algorithms for generating sets of test sequences that provide guaranteed full coverage with respect to a metric, by building and processing a special type of graph called a Cayley graph. An implementation of these concepts is then experimentally evaluated against existing techniques and is shown to provide better performance in terms of running time and test suite size.
Title : Test Sequence Generation with Cayley Graphs
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
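The idea of expressing coverage through the categories of a triaging function, as described in the abstract above, can be sketched for the special case of t-way transition coverage. The state machine, the function names, and the concrete form of the triaging function below are assumptions made for illustration; the sketch only measures coverage and does not build a Cayley graph or generate sequences.

```python
# Hypothetical state machine: (state, event) -> next state.
TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "stop"): "idle",
}

def run(events, initial="idle"):
    """Replay a test sequence and return the list of transitions it takes."""
    state, taken = initial, []
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        taken.append(key)
        state = TRANSITIONS[key]
    return taken

def categories(events, t):
    """Assumed triaging function: the set of t consecutive transitions seen."""
    taken = run(events)
    return {tuple(taken[i:i + t]) for i in range(len(taken) - t + 1)}

def all_categories(t):
    """Every chain of t consecutive transitions the machine allows.

    Assumes every state is reachable, which holds for this toy machine.
    """
    chains = {(key,) for key in TRANSITIONS}
    for _ in range(t - 1):
        chains = {
            chain + (key,)
            for chain in chains
            for key in TRANSITIONS
            if key[0] == TRANSITIONS[chain[-1]]
        }
    return chains

def coverage(suite, t):
    """Fraction of all t-way categories hit by at least one sequence in the suite."""
    covered = set()
    for events in suite:
        covered |= categories(events, t)
    return len(covered) / len(all_categories(t))

suite = [["start", "pause", "resume", "stop"]]
print(coverage(suite, 1))  # 1.0
print(coverage(suite, 2))  # 0.5
```

The single sequence covers every transition (t = 1) but only three of the six possible pairs of consecutive transitions (t = 2), which is the kind of gap a generation algorithm with guaranteed full coverage is meant to close.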
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00049
Pedro M. Fernandes, Manuel Lopes, R. Prada
The automation of functional testing in software has allowed developers to continuously check for negative impacts on functionality throughout the iterative phases of development. This is not the case for User eXperience (UX), which has hitherto relied almost exclusively on testing with real users. User testing is a slow endeavour that can become a bottleneck in the development of interactive systems. To address this problem, we propose an agent-based approach for automatic UX testing. We develop agents with basic problem-solving skills and a core affect model, allowing us to model an artificial affective state as they traverse different levels of a game. Although this research is still at an early stage, we believe the results presented here make a strong case for the use of intelligent agents endowed with affective computing models for automating UX testing.
Title : Agents for Automated User Experience Testing
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00022
Jaganmohan Chandrasekaran, Yu Lei, R. Kacker, D. R. Kuhn
Recent advancements in the field of deep learning have enabled its application in Autonomous Driving Systems (ADS). A Deep Neural Network (DNN) model is often used to perform tasks such as pedestrian detection, object detection, and steering control in ADS. Unfortunately, DNN models can exhibit incorrect or unexpected behavior in real-world scenarios. There is a need to rigorously test these models with real-world driving scenarios so that safety-critical bugs can be detected before deployment in the real world. In this paper, we propose a combinatorial approach to testing DNN models. Our approach generates test images by applying a set of combinations of basic image transformation operations to a seed image. First, we identify a set of valid transformation operations, or simply transformations. Next, we design an input parameter model based on the valid transformations and generate a t-way (t=2) combinatorial test set. Each test represents a combination of transformations and can be used to produce a test image. We execute the test images on a DNN model and distinguish between consistent and inconsistent behavior using a relation. We conducted an experimental evaluation of our approach on three DNN models that are used in the Udacity challenge. Our results suggest that test images generated by our approach can effectively identify inconsistent behaviors and can significantly increase neuron coverage. To the best of our knowledge, our work is the first effort to use a combinatorial testing approach to generate test images based on image transformations for testing DNNs used in ADS.
Title : A Combinatorial Approach to Testing Deep Neural Network-based Autonomous Driving Systems
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
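The step in the abstract above that turns an input parameter model into a t-way (t=2) test set can be sketched as follows. The parameter names and values are hypothetical, and the greedy selection is a simple stand-in heuristic, not the generation tool used by the authors. Each resulting test is a combination of transformations that would then be applied to a seed image.

```python
from itertools import combinations, product

# Hypothetical input parameter model: one parameter per transformation.
PARAMS = {
    "blur": ["off", "on"],
    "brightness": ["none", "darker", "brighter"],
    "rotation": ["0deg", "5deg", "10deg"],
    "fog": ["off", "on"],
}

def pairs_of(test):
    """All pairs of (parameter, value) settings that a single test covers."""
    return {
        ((a, test[a]), (b, test[b]))
        for a, b in combinations(sorted(test), 2)
    }

def pairwise_suite(params):
    """Greedy 2-way generation: repeatedly pick the candidate covering most new pairs."""
    names = sorted(params)
    candidates = [
        dict(zip(names, values))
        for values in product(*(params[n] for n in names))
    ]
    # Every pair that must be covered at least once.
    uncovered = set()
    for candidate in candidates:
        uncovered |= pairs_of(candidate)
    suite = []
    while uncovered:
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite

suite = pairwise_suite(PARAMS)
print(len(suite), "tests instead of", 2 * 3 * 3 * 2, "exhaustive combinations")
for test in suite[:3]:
    print(test)
```

Enumerating all candidates is only practical for small models like this one; the point of the sketch is that every pair of transformation settings appears in at least one test while the suite stays well below the exhaustive product.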
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00050
Maral Azizi
The most effective regression testing algorithms have long running times and often require dynamic or static code analysis, making them unsuitable for modern software development environments, where the interval between software deliveries can be less than a minute. More recently, some researchers have developed information retrieval-based (IR-based) techniques for prioritizing tests, on the premise that tests more similar to the code changes have a higher likelihood of finding bugs. The vast majority of these techniques are based on standard term similarity calculation, which can be imprecise. One reason for the low accuracy of these techniques is that the original query is often short and therefore does not return the relevant test cases. In such cases, the query needs reformulation. The current state of research lacks methods to increase the quality of the query in the regression testing domain. Our research aims to address this problem, and we conjecture that enhancing the quality of the queries can improve the performance of IR-based regression test case prioritization (RTP). Our empirical evaluation with six open-source programs shows that our approach improves the accuracy of IR-based RTP and increases the regression fault detection rate compared to common prioritization techniques.
Title : QRTest: Automatic Query Reformulation for Information Retrieval Based Regression Test Case Prioritization
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
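The short-query problem described in the abstract above can be illustrated with a minimal term-similarity ranker. Everything here is an assumption for illustration: the test corpus, the hand-written expansion table, and the use of plain term-frequency cosine similarity. It is not the QRTest reformulation method, only a sketch of why adding related terms to a query can change the ranking.

```python
import math
from collections import Counter

# Hypothetical test corpus: test name -> text extracted from the test.
TESTS = {
    "test_login_user": "login user password session",
    "test_parse_empty_input": "parser parse empty input token",
    "test_render_page": "render page html template",
}

# Hypothetical expansion table; a real reformulation step would derive
# related terms automatically rather than from a hand-written map.
EXPANSIONS = {"tokenizer": ["token", "parser"]}

def vectorize(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity of two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def reformulate(query):
    """Append expansion terms for every query term that has them."""
    terms = query.lower().split()
    extra = [e for term in terms for e in EXPANSIONS.get(term, [])]
    return " ".join(terms + extra)

def rank(query):
    """Tests ordered by descending similarity to the query; ties broken by name."""
    q = vectorize(query)
    scores = {name: cosine(q, vectorize(text)) for name, text in TESTS.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

change_query = "tokenizer"  # term taken from a hypothetical code change
print(rank(change_query))               # every score is 0.0
print(rank(reformulate(change_query)))  # test_parse_empty_input ranks first
```

With the original one-word query no test shares a term with the change, so the ranking carries no information; after expansion the parser test moves to the top.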
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00052
Taghreed Bagies, A. Jannesari
The execution of software tests is costly and time-consuming. To accelerate test execution, researchers have applied several methods to run tests in parallel. One method of parallelizing test execution is to use a GPU to distribute test inputs among several threads running in parallel. In this paper, we investigate three programming models (CUDA Unified Memory, CUDA Non-Unified Memory, and OpenMP GPU offloading) for parallelizing test execution and discuss the challenges of using them. We use eleven benchmarks and parallelize their test suites using these models. Our study reveals some limitations (e.g., cache size, branch divergence, and load imbalance) of using GPUs to execute tests in parallel.
Title : An Empirical Study of Parallelizing Test Execution Using CUDA Unified Memory and OpenMP GPU Offloading
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
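One of the limitations named in the abstract above, load imbalance, can be illustrated with a toy cost model. This is plain Python arithmetic, not GPU code, and the per-input costs are made-up numbers rather than measurements from the study. It assumes test inputs are assigned to threads round-robin and that the parallel run finishes only when the slowest thread does.

```python
def static_assignment(costs, n_threads):
    """Assign input i to thread i % n_threads and return each thread's total cost."""
    loads = [0] * n_threads
    for i, cost in enumerate(costs):
        loads[i % n_threads] += cost
    return loads

def efficiency(loads):
    """Useful work divided by (number of threads x time of the slowest thread)."""
    return sum(loads) / (len(loads) * max(loads))

balanced = static_assignment([3, 3, 3, 3], n_threads=4)
skewed = static_assignment([1, 1, 1, 9], n_threads=4)

print(balanced, efficiency(balanced))         # [3, 3, 3, 3] 1.0
print(skewed, round(efficiency(skewed), 2))   # [1, 1, 1, 9] 0.33
```

Both suites contain the same total work, but in the skewed case three threads sit idle while one processes the expensive input, so only about a third of the available thread time is useful.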
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00017
Saeyoon Oh, Seongmin Lee, S. Yoo
Higher Order Mutation (HOM) has been proposed to avoid equivalent mutants and improve the scalability of mutation testing, but generating useful HOMs remains an expensive search problem in its own right. We propose a new approach to generate Strongly Subsuming Higher Order Mutants (SSHOM) using a recently introduced Causal Program Dependence Analysis (CPDA). CPDA itself is based on program mutation and provides a quantitative estimate of how often a change in the value of one program element will cause a change in the value of another program element. Our SSHOM generation approach chooses pairs of program elements using heuristics based on CPDA, applies First Order Mutation (FOM) to the chosen pairs, and generates an HOM by combining the two FOMs.
Title : Effectively Sampling Higher Order Mutants Using Causal Effect
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
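The notion of combining two first-order mutants into a higher-order mutant, as described in the abstract above, can be sketched on a tiny example. The program, the mutants, and the test inputs are assumptions chosen for illustration, and the check uses a commonly cited definition of strong subsumption (the HOM's kill set is non-empty and contained in the intersection of its constituent FOMs' kill sets). The CPDA-based selection heuristic itself is not reproduced.

```python
def original(x):
    return x > 0 and x < 10

def fom1(x):
    # First-order mutant: first comparison negated (> became <=).
    return x <= 0 and x < 10

def fom2(x):
    # First-order mutant: second comparison negated (< became >=).
    return x > 0 and x >= 10

def hom(x):
    # Second-order mutant: both first-order changes applied together.
    return x <= 0 and x >= 10

def kill_set(mutant, tests):
    """Test inputs on which the mutant's output differs from the original's."""
    return {t for t in tests if mutant(t) != original(t)}

def strongly_subsumes(hom_kills, fom_kill_sets):
    """Non-empty kill set contained in the intersection of the FOM kill sets."""
    common = set.intersection(*fom_kill_sets)
    return bool(hom_kills) and hom_kills <= common

TESTS = [-3, 0, 5, 9, 10, 15]
k1, k2, kh = kill_set(fom1, TESTS), kill_set(fom2, TESTS), kill_set(hom, TESTS)

print(sorted(k1))  # [-3, 0, 5, 9]
print(sorted(k2))  # [5, 9, 10, 15]
print(sorted(kh))  # [5, 9]
print(strongly_subsumes(kh, [k1, k2]))  # True
```

Here every test that kills the combined mutant also kills both first-order mutants, and fewer tests kill it than kill either constituent. Most pairs of first-order mutants do not combine this way, which is why a heuristic for choosing promising pairs is useful.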
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00028
A. Vescan, C. Serban, G. Crişan
Developing high-quality software systems is important but difficult to achieve because systems constantly grow in size and complexity. Early detection of software defects has therefore become a must, and various methods are being researched. The Software Defects Rules Discovery (SDRD) approach aims to address the generation of rules for software defects. The model is based on code metric values and uses the ant colony system algorithm to discover the best solution. The experiments considered one theoretical example and two open-source projects with various setups; in total, 22 different configurations were investigated. The results indicate that, among the metrics considered, those that yield the best rules are CBO (Coupling Between Objects), LOC (Lines Of Code), and NPM (Number of Private Methods).
Title : Software Defects Rules Discovery
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00041
M. Zafar, W. Afzal, Eduard Paul Enoiu, A. Stratis, Ola Sellin
The abstract test cases generated through model-based testing (MBT) need to be concretized to make them executable on the software under test (SUT). Multiple researchers have proposed different solutions, e.g., utilizing adapters for the concretization of abstract test cases and the generation of test scripts. In this paper, we propose our Model-Based Test scrIpt GenEration fRamework (TIGER) based on GraphWalker (GW), an open-source MBT tool. The framework is capable of generating test scripts for embedded software controlling functions of a cyber-physical system, such as the passenger trains developed at Bombardier Transportation AB. The framework follows a set of defined mapping rules for the concretization of abstract test cases. We have evaluated the generated test scripts in terms of fault detection using an industrial case study. We induced faults in the model of the SUT based on three mutation operators to generate faulty test scripts. The aim of generating faulty test scripts is to produce failed test steps and to guarantee the absence of faults in the SUT. Moreover, we also generated test scripts using the correct version of the model and executed them to analyse their behaviour in comparison with manually written test scripts. The results show that the test scripts generated by GW using the proposed framework are executable, provide 100% requirements coverage, and can be used to uncover faults at the software-in-the-loop simulation level of sub-system testing.
Title : A Model-Based Test Script Generation Framework for Embedded Software
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)
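The concretization step mentioned in the abstract above, where mapping rules turn an abstract test case into an executable script, can be sketched as a simple lookup. The model element names, the signal names, and the `set_signal`/`expect_signal` script commands are all hypothetical placeholders; they are not TIGER's actual mapping rules or any real test-script API.

```python
# Hypothetical mapping rules: abstract model element -> concrete script line.
MAPPING_RULES = {
    "e_RequestDoorClose": "set_signal('door_close_cmd', 1)",
    "v_DoorsClosed": "expect_signal('doors_closed', 1)",
    "e_RequestDoorOpen": "set_signal('door_close_cmd', 0)",
    "v_DoorsOpen": "expect_signal('doors_closed', 0)",
}

def concretize(abstract_test):
    """Translate a path of model elements into the lines of a test script."""
    lines = []
    for step, element in enumerate(abstract_test, start=1):
        if element not in MAPPING_RULES:
            raise KeyError(f"no mapping rule for model element {element!r}")
        lines.append(f"# step {step}: {element}")
        lines.append(MAPPING_RULES[element])
    return "\n".join(lines)

# An abstract test case: a path through the (hypothetical) model.
abstract_test = ["e_RequestDoorClose", "v_DoorsClosed", "e_RequestDoorOpen", "v_DoorsOpen"]
print(concretize(abstract_test))
```

Raising an error for an unmapped element, rather than skipping it, keeps a generated script from silently omitting a step when the model changes but the rules do not.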
Pub Date : 2021-04-01 DOI: 10.1109/ICSTW52544.2021.00015
Lars van Hijfte, Ana Oprescu
The Equivalent Mutant Problem is an ongoing research problem that has led to the creation of multiple reusable but non-standardized mutant datasets. However, cross-study evaluation is still tedious. To tackle this problem, we propose MutantBench, a novel open-source comparison framework that is designed with a focus on the FAIR data principles and adoptability by the community. With this framework, tools addressing the Equivalent Mutant Problem can be empirically evaluated using the same dataset, which in turn allows for improved cross-study evaluation. We also combine existing datasets, resulting in a mutant dataset containing 4400 mutants, of which 1416 are equivalent. This exceeds the previously largest mutant dataset by more than a thousand equivalent mutants.
Title : MutantBench: an Equivalent Mutant Problem Comparison Framework
Venue : 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW)