TimelyRep: Timing deterministic replay for Android web applications
Yanqiang Liu, Fangge Yan, Mingyuan Xia, Zhengwei Qi, Xue Liu
With the constantly growing and changing requirements of app users, web techniques are used in mobile application development for better cross-platform compatibility and online updates. As embedded web content grows in complexity, debugging web apps becomes a critical need. Web replay tools can record program inputs and reproduce the same execution for debugging and performance tuning. However, traditional replay approaches are largely intended for apps with desktop interaction methods (keyboard, mouse) and require modification to the browser, which limits their applicability on mobile platforms.
{"title":"TimelyRep: Timing deterministic replay for Android web applications","authors":"Yanqiang Liu, Fangge Yan, Mingyuan Xia, Zhengwei Qi, Xue Liu","doi":"10.1002/stvr.1745","DOIUrl":"https://doi.org/10.1002/stvr.1745","url":null,"abstract":"With the constantly growing and changing requirements of app users, web techniques are used in mobile application development for better cross‐platform compatibility and online update. As the embedded web contents gain complexity, debugging web apps become a critical demand. Web replay tools can record program inputs and reproduce the same execution for debugging and performance tuning. However, traditional replay approaches are largely intended for apps with desktop interaction methods (keyboard, mouse) and require modification to the browser, which limits their applicability in mobile platforms.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"206 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73579939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conference Virtualization
R. Hierons, Tao Xie

Due to the ongoing COVID-19 outbreak, conference virtualization has happened, is happening, and will happen for the recent past, ongoing, and upcoming periods, respectively. An ongoing example of conference virtualization is ICSE 2020, the largest conference in software engineering. ACM has recently formed the ACM Presidential Task Force on What Conferences Can Do to Replace Face-to-Face Meetings; in May 2020, this task force released a guide to best practices on virtual conferences (https://www.acm.org/virtual-conferences). The availability of videoconferencing and webinar systems such as Zoom has made online live presentations of accepted conference papers easy and low cost. One may wonder whether the current technology and platform availability for live presentations can facilitate innovations in disseminating journal papers, going beyond the current common practice of partnership between journals and conferences, e.g., journal-first papers. For example, a journal may consider organizing a virtual journal summit every year or every half year, at which authors of papers accepted or published in that journal present their work live online. Indeed, community discussion is needed before these kinds of innovations are put into action. We welcome your thoughts on possible innovations in disseminating journal papers (especially in the face of conference virtualization) and on their potential pros and cons.

This issue contains two papers. In the first paper, Lucas R. Andrade, Patricia D. L. Machado, and Wilkerson L. Andrade address the problem of predicting the fault detection capability of a test suite. It has previously been observed that although code coverage is often seen as important, the actual coverage achieved by a test suite is a poor predictor of effectiveness. To address this, recent work has introduced metrics (forms of operational coverage) that combine code coverage with information from an operational profile that models the expected usage of the system. This paper reports on the outcomes of a case study that considered 46 versions of a proprietary system. To estimate the effectiveness of a test suite, the authors used the number of post-release bugs reported (the fewer found, the more effective the test suite). Interestingly, both versions of statement coverage correlated negatively with post-release bug-fixing activity, and the correlation was stronger for operational statement coverage. (Recommended by Lori L. Pollock.)

In the second paper, Yanqiang Liu, Fangge Yan, Mingyuan Xia, Zhengwei Qi, and Xue Liu present TimelyRep, an efficient and deterministic replay tool for web-enabled mobile applications. TimelyRep achieves deterministic replay of program states and low replay delays in the face of the high input rate of mobile interaction. In particular, TimelyRep includes a mechanism for delivering an HTTP response stream with deterministic ordering, content, and delays, without modifying the browser core or the operating system. TimelyRep also includes a mechanism to control replay delays in JavaScript space that applies to both mobile web embeddings and traditional web browsers. The paper reports on an evaluation with two real-world web game applications that exhibit complex nondeterminism and intensive user inputs. The results show that TimelyRep is useful for reproducing program bugs and keeps delays low for touch-intensive web games. (Recommended by Robert Hierons.)
{"title":"Conference Virtualization","authors":"R. Hierons, Tao Xie","doi":"10.1002/stvr.1749","DOIUrl":"https://doi.org/10.1002/stvr.1749","url":null,"abstract":"Due to the ongoing COVID-19 outbreak, conference virtualization has happened, is happening, and will happen for the recent past, ongoing, and upcoming periods, respectively. An ongoing example of conference virtualization is ICSE 2020, the largest conference in software engineering. ACM has recently formed the ACM Presidential Task Force on What Conferences Can Do to Replace Faceto-Face Meetings; in May 2020, this task force released a guide to best practices on virtual conferences (https://www.acm.org/virtual-conferences). The availability of videoconferencing and/or Webinar systems such as Zoom has made online live presentations of accepted conference papers easy and low cost. One may wonder whether the current technology and platform availability for live presentations can facilitate some innovations of disseminating journal papers, going beyond the current common practice of partnership between journals and conferences, e.g., journal-first papers. For example, a journal may consider organizing a virtual journal summit every year or every half of a year for authors of accepted or published papers in that journal to present their papers in an online live manner. Indeed, community discussion is needed before these kinds of innovations are put into action. We welcome your thoughts on possible innovations of disseminating journal papers (especially in the face of conference virtualization), and these innovations’ potential pros and cons. This issue contains two papers. In the first paper, Lucas R. Andrade, Patricia D. L. Machado, and Wilkerson. L. Andrade address the problem of predicting the fault detection capability of a test suite. It has previously been observed that although code coverage is often seen as being important, the actual coverage achieved by a test suite is a poor predictor of effectiveness. To address this, recent work has introduced metrics (forms of Operational Coverage) that combine code coverage with information from an operational profile that models the expected usage of the system. This paper reports on the outcomes of a case study that considered 46 versions of a proprietary system. In order to provide an estimate of the effectiveness of a test suite, the authors used the number of post-release bugs reported (the fewer found, the more effective the test suite). Interestingly, it was found that there was a negative correlation between measures of test suite effectiveness with both versions of statement coverage but that the correlation was stronger with operational statement coverage. (Recommended by Lori L. Pollock). In the second paper, Yanqiang Liu, Fangge Yan, Mingyuan Xia, Zhengwei Qi, and Xue Liu present TimelyRep, an efficient and deterministic replay tool for web-enabled mobile applications. TimelyRep achieves deterministic replay of program states and low replay delays in face of the high input rate of mobile interaction. 
In particular, TimelyRep includes a mechanism for delivering an HTTP response stream with de","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"92 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90342739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Can operational profile coverage explain post-release bug detection?
Lucas R. Andrade, Patricia D. L. Machado, Wilkerson L. Andrade

To deliver reliable software, developers may rely on the fault detection capability of test suites. To evaluate this capability, they can apply code coverage metrics before a software release. However, recent research results have shown that these metrics may not provide a solid basis for this evaluation. Moreover, fixing a fault has a cost, and not all faults have the same impact on software reliability. In this sense, operational testing aims at assessing the parts of the system that are more valuable for users. The goal of this work is to investigate whether traditional code coverage, and code coverage merged with operational information, can be related to post-release bug detection. We focus on proprietary software under continuous delivery. We performed an exploratory case study in which branch and statement coverage metrics were collected for each version of a proprietary software system, together with real usage data of the system. We then measured how well code coverage levels explain bug-fixing activity after release. We found that traditional statement coverage has a moderate negative correlation with bug-fixing activities, whereas statement coverage merged with the operational profile has a large negative correlation with higher confidence. Developers can consider operational information as an important factor that should be analysed, among others, together with code coverage to assess the fault detection capability of a test suite.
{"title":"Can operational profile coverage explain post‐release bug detection?","authors":"Lucas Andrade, Patricia D. L. Machado, W. Andrade","doi":"10.1002/stvr.1735","DOIUrl":"https://doi.org/10.1002/stvr.1735","url":null,"abstract":"To deliver reliable software, developers may rely on the fault detection capability of test suites. To evaluate this capability, they can apply code coverage metrics before a software release. However, recent research results have shown that these metrics may not provide a solid basis for this evaluation. Moreover, the fixing of a fault has a cost, and not all faults have the same impact regarding software reliability. In this sense, operational testing aims at assessing parts of the system that are more valuable for users. The goal of this work is to investigate whether traditional code coverage and code coverage merged with operational information can be related to post‐release bug detection. We focus on the scope of proprietary software under continuous delivery. We performed an exploratory case study where code branch and statement coverage metrics were collected for each version of a proprietary software together with real usage data of the system. We then measured the ability to explain the bug‐fixing activity after version release using code coverage levels. We found that traditional statement coverage has a moderate negative correlation with bug‐fixing activities, whereas statement coverage merged with the operational profile has a large negative correlation with higher confidence. Developers can consider operational information as an important factor of influence that should be analysed, among other factors, together with code coverage to assess the fault detection capability of a test suite.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"31 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79926995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Working Across Boundaries
R. Hierons, Tao Xie

This editorial was written during a period of extreme difficulty for many individuals, families, and nations in the ongoing COVID-19 outbreak. We can only hope that the measures taken are successful and that the situation has improved considerably. We also do not pretend that we have anything to add regarding health, social, or economic issues. However, the crisis has shown the role that Computer Science can play in informing policy. Society requires evidence, and computers are often involved in producing such evidence via, for example, simulation. It is here that we, as a community, can contribute through advances in testing, verification, and reliability in areas such as scientific computing and computer simulations, and perhaps also in AI/data science to help expedite the search for treatments. As a recent example discussed in social media, when commenting on pandemic simulation code used to model control measures against COVID-19, Prof. Guido Salvaneschi said in his tweet: “Ever wondered about the ‘impact’ of research on programming languages and software engineering? Political decisions affecting hundreds of millions are being taken based on thousands of lines of 13+ years old C code that allegedly nobody understands anymore. #COVID19 #cs” (https://twitter.com/guidosalva/status/1242049884347412482). There is already some truly excellent work making advances in these areas, and we are confident that the community will rise to the challenge.

This issue contains two papers. In the first paper, Simons and Lefticaru introduce a new model-based testing approach based on the use of a Stream X-Machine (SXM) specification. SXMs provide a state-based formalism, and there is a traditional approach to testing from an SXM. This approach typically assumes that the underlying functions/operations have been implemented correctly but that these functions may be integrated (into a state machine) in the wrong way. There are a number of automated test generation approaches for SXMs, and the authors make two main additional contributions to this area. First, they introduce a number of novel optimisations into test generation. Second, they observe that SXM test generation algorithms return abstract test cases (sequences of functions); the paper shows how corresponding concrete test data can be generated. The approach has been implemented and evaluated on case studies, with the tool also checking that a specification satisfies certain desirable properties. (Recommended by Hyunsook Do.)

In the second paper, Pouria Derakhshanfar, Xavier Devroey, Gilles Perrouin, Andy Zaidman, and Arie van Deursen introduce behavioural model seeding, a new seeding approach that learns class usages from both the system source code under test and existing test cases. The learned class usages are represented in a state-machine-based behavioural model, which is then used to guide search-based crash reproduction, generating a test case (i.e., objects and sequences of method calls) that reproduces a given crash.
{"title":"Working Across Boundaries","authors":"R. Hierons, Tao Xie","doi":"10.1002/stvr.1734","DOIUrl":"https://doi.org/10.1002/stvr.1734","url":null,"abstract":"This editorial was written during a period of extreme difficulty for many individuals, families, and nations in the ongoing COVID-19 outbreak. We can only hope that measures taken are successful and that the situation has improved considerably. We also do not pretend that we have anything to add regarding health, social, or economic issues. However, the crisis has shown the role that Computer Science can play in informing policy. Society requires evidence and computers are often involved in producing such evidence via, for example, simulation. It is here that we, as a community, can contribute through advances in testing, verification, and reliability in areas such as Scientific Computing and Computer Simulations and maybe also AI/data sciences for helping expedite the process of finding treatment. As a recent example discussed in social media, when commenting on pandemic simulation code used to model control measures against COVID-19, Prof. Guido Salvaneschi said in his tweet: “Ever wondered about the “impact“ of research on programming languages and software engineering? Political decisions affecting hundreds of millions are being taken based on thousands of lines of 13+ years old C code that allegedly nobody understands anymore. #COVID19 #cs” (https://twitter.com/guidosalva/status/1242049884347412482). There is already some truly excellent work for making advances in these areas and we are confident that the community will rise to the challenge. This issue contains two papers. In the first paper, Simons and Lefticaru introduce a new Model-Based Testing approach, which is based on the use of a Stream X-machine (SXM) specification. SXMs provide a state-based formalism and there is a traditional approach to testing from an SXM. This approach typically assumes that the underlying functions/operations have been implemented correctly but these functions may be integrated (into a state machine) in the wrong way. There are a number of automated test generation approaches for SXMs and the authors make two main additional contributions to this area. First, they introduce a number of novel optimisations into test generation. Second, they observe that SXM test generation algorithms return abstract test cases (sequences of functions); the paper shows how corresponding concrete test data can be generated. The approach has been implemented and evaluated on case studies, with the tool also checking that a specification satisfies certain desirable properties. (Recommended by Hyunsook Do). In the second paper, Pouria Derakhshanfar, Xavier Devroey, Gilles Perrouin, Andy Zaidman, and Arie van Deursen introduce behavioural model seeding, a new seeding approach for learning class usages from both the system source code under test and existing test cases. The learned class usages are represented in a state-machine-based behavioural model. 
The behavioural model is then used to guide search-based crash reproduction, which generates a test case (i.e., objects and seque","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"77 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83872654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
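A rough sketch of the behavioural-model-seeding idea from the second paper: class usages learned from code and tests form a state machine, and walks over it yield plausible method-call sequences with which to seed the search. The model shape and the sampling strategy here are simplified assumptions, not the authors' implementation.

```typescript
// Toy behavioural model: class usages as a state machine, sampled to seed a
// search-based generator. Model shape and sampling are simplified assumptions.

type Transition = { call: string; to: string };

// Usage observed in source code and existing tests of some resource class.
const model = new Map<string, Transition[]>([
  ["new",    [{ call: "open()", to: "opened" }]],
  ["opened", [{ call: "write(data)", to: "opened" },
              { call: "close()", to: "closed" }]],
  ["closed", []],
]);

// A random walk yields a plausible method-call sequence, usable as an initial
// (seeded) individual for search-based crash reproduction.
function sampleUsage(start: string, maxLen: number): string[] {
  const seq: string[] = [];
  let state = start;
  for (let i = 0; i < maxLen; i++) {
    const outs = model.get(state) ?? [];
    if (outs.length === 0) break;
    const t = outs[Math.floor(Math.random() * outs.length)];
    seq.push(t.call);
    state = t.to;
  }
  return seq;
}

console.log(sampleUsage("new", 6)); // e.g. ["open()", "write(data)", "close()"]
```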
Facilitating program performance profiling via evolutionary symbolic execution
Andrea Aquino, Pietro Braione, G. Denaro, P. Salza
Performance profiling can benefit from test cases that hit high-cost executions of programs. In this paper, we investigate the problem of automatically generating test cases that trigger the worst-case execution of programs and propose a novel technique that solves this problem with an unprecedented combination of symbolic execution and evolutionary algorithms. Our technique, which we refer to as ‘Evolutionary Symbolic Execution’, embraces the execution cost of the program paths as the fitness function to pursue the worst execution. It defines an original set of evolutionary operators, based on symbolic execution, which suitably sample the possible program paths to make the search process effective. Specifically, our technique defines a memetic algorithm that (i) incrementally evolves by steering symbolic execution to traverse new program paths that comply with execution conditions combined and refined from the worst program paths collected so far and (ii) periodically applies local optimizations to the execution conditions of the worst currently identified program path to further speed up the identification of the worst path. We report on a set of initial experiments indicating that our technique succeeds in generating good worst-case test cases for programs with which existing approaches cannot cope. We also show that, as far as generating worst-case test cases is concerned, the distinguishing evolutionary operators based on symbolic execution that we define in this paper are more effective than traditional operators that directly manipulate the program inputs.
{"title":"Facilitating program performance profiling via evolutionary symbolic execution","authors":"Andrea Aquino, Pietro Braione, G. Denaro, P. Salza","doi":"10.1002/stvr.1719","DOIUrl":"https://doi.org/10.1002/stvr.1719","url":null,"abstract":"Performance profiling can benefit from test cases that hit high‐cost executions of programs. In this paper, we investigate the problem of automatically generating test cases that trigger the worst‐case execution of programs and propose a novel technique that solves this problem with an unprecedented combination of symbolic execution and evolutionary algorithms. Our technique, which we refer to as ‘Evolutionary Symbolic Execution’, embraces the execution cost of the program paths as the fitness function to pursue the worst execution. It defines an original set of evolutionary operators, based on symbolic execution, which suitably sample the possible program paths to make the search process effective. Specifically, our technique defines a memetic algorithm that (i) incrementally evolves by steering symbolic execution to traverse new program paths that comply with execution conditions combined and refined from the currently collected worse program paths and (ii) periodically applies local optimizations to the execution conditions of the worst currently identified program path to further speed up the identification of the worst path. We report on a set of initial experiments indicating that our technique succeeds in generating good worst‐case test cases for programs with which existing approaches cannot cope. Also, we show that, as far as the problem of generating worst‐case test cases is concerned, the distinguishing evolutionary operators based on symbolic execution that we define in this paper are more effective than traditional operators that directly manipulate the program inputs.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"68 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88764593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A verified and optimized Stream X-Machine testing method, with application to cloud service certification
A. Simons, R. Lefticaru

The Stream X-Machine (SXM) testing method provides strong and repeatable guarantees of functional correctness, up to a specification. These qualities make the method attractive for software certification, especially in the domain of brokered cloud services, where arbitrage seeks to substitute functionally equivalent services from alternative providers. However, practical obstacles include the difficulty in providing a correct specification, the translation of abstract paths into feasible concrete tests and the large size of generated test suites. We describe a novel SXM verification and testing method, which automatically checks specifications for completeness and determinism, prior to generating complete test suites with full grounding information. Three optimization steps achieve up to a 10-fold reduction in the size of the test suite, removing infeasible and redundant tests. The method is backed by a set of tools to validate and verify the SXM specification, generate technology-agnostic test suites and ground these in SOAP, REST or rich-client service implementations. The method was initially validated using seven specifications, three cloud platforms and five grounding strategies.
{"title":"A verified and optimized Stream X‐Machine testing method, with application to cloud service certification","authors":"A. Simons, R. Lefticaru","doi":"10.1002/stvr.1729","DOIUrl":"https://doi.org/10.1002/stvr.1729","url":null,"abstract":"The Stream X‐Machine (SXM) testing method provides strong and repeatable guarantees of functional correctness, up to a specification. These qualities make the method attractive for software certification, especially in the domain of brokered cloud services, where arbitrage seeks to substitute functionally equivalent services from alternative providers. However, practical obstacles include the difficulty in providing a correct specification, the translation of abstract paths into feasible concrete tests and the large size of generated test suites. We describe a novel SXM verification and testing method, which automatically checks specifications for completeness and determinism, prior to generating complete test suites with full grounding information. Three optimization steps achieve up to a 10‐fold reduction in the size of the test suite, removing infeasible and redundant tests. The method is backed by a set of tools to validate and verify the SXM specification, generate technology‐agnostic test suites and ground these in SOAP, REST or rich‐client service implementations. The method was initially validated using seven specifications, three cloud platforms and five grounding strategies.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"46 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89126084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Model-based hypothesis testing of uncertain software systems
Matteo Camilli, Angelo Gargantini, Patrizia Scandurra

There is an increasing demand for reliable software systems that fulfil their requirements in different operational environments and cope with uncertainty, which can be introduced both at design time and at runtime because of the lack of control over third-party system components and because of complex interactions among software, hardware infrastructures, and physical phenomena. This article addresses the discrepancy between data measured at runtime and the design-time formal specification by using an inverse uncertainty quantification approach. Namely, we introduce a methodology called METRIC and its supporting toolchain to quantify and mitigate software system uncertainty during testing by combining (on-the-fly) model-based testing and Bayesian inference. Our approach connects probabilistic input/output conformance theory with statistical hypothesis testing in order to assess whether the behaviour of the system under test corresponds to its probabilistic formal specification, provided in terms of a Markov decision process. An uncertainty-aware model-based test case generation strategy is used to collect evidence from software components affected by sources of uncertainty. Test results serve as input to a Bayesian inference process that updates beliefs on model parameters encoding uncertain quality attributes of the system under test. This article describes our approach from both theoretical and practical perspectives. An extensive empirical evaluation has been conducted to assess the cost-effectiveness of our approach. We show that, under the same effort constraints, our uncertainty-aware testing strategy increases the accuracy of the uncertainty quantification process by up to a factor of 50 with respect to traditional model-based testing methods.
{"title":"Model‐based hypothesis testing of uncertain software systems","authors":"Matteo Camilli, A. Gargantini, P. Scandurra","doi":"10.1002/stvr.1730","DOIUrl":"https://doi.org/10.1002/stvr.1730","url":null,"abstract":"Nowadays, there exists an increasing demand for reliable software systems able to fulfill their requirements in different operational environments and to cope with uncertainty that can be introduced both at design‐time and at runtime because of the lack of control over third‐party system components and complex interactions among software, hardware infrastructures and physical phenomena. This article addresses the problem of the discrepancy between measured data at runtime and the design‐time formal specification by using an inverse uncertainty quantification approach. Namely, we introduce a methodology called METRIC and its supporting toolchain to quantify and mitigate software system uncertainty during testing by combining (on‐the‐fly) model‐based testing and Bayesian inference. Our approach connects probabilistic input/output conformance theory with statistical hypothesis testing in order to assess if the behaviour of the system under test corresponds to its probabilistic formal specification provided in terms of a Markov decision process. An uncertainty‐aware model‐based test case generation strategy is used as a means to collect evidence from software components affected by sources of uncertainty. Test results serve as input to a Bayesian inference process that updates beliefs on model parameters encoding uncertain quality attributes of the system under test. This article describes our approach from both theoretical and practical perspectives. An extensive empirical evaluation activity has been conducted in order to assess the cost‐effectiveness of our approach. We show that, under same effort constraints, our uncertainty‐aware testing strategy increases the accuracy of the uncertainty quantification process up to 50 times with respect to traditional model‐based testing methods.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"8 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90232108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Complexity vulnerability analysis using symbolic execution
K. S. Luckow, Rody Kersten, C. Pasareanu

We describe techniques based on symbolic execution for finding software vulnerabilities that are due to algorithmic complexity. Such vulnerabilities allow an attacker to mount denial-of-service attacks that deny service to benign users or otherwise disable a software system.
{"title":"Complexity vulnerability analysis using symbolic execution","authors":"K. S. Luckow, Rody Kersten, C. Pasareanu","doi":"10.1002/stvr.1716","DOIUrl":"https://doi.org/10.1002/stvr.1716","url":null,"abstract":"We describe techniques based on symbolic execution for finding software vulnerabilities that are due to algorithmic complexity. Such vulnerabilities allow an attacker to mount denial‐of‐service attacks to deny service to benign users or to otherwise disable a software system.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"33 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91294024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance mutation testing
Pedro Delgado-Pérez, A. B. Sánchez, Sergio Segura, I. Medina-Bulo
Performance bugs are known to be a major threat to the success of software products. Performance tests aim to detect performance bugs by executing the program through test cases and checking whether it exhibits a noticeable performance degradation. The principles of mutation testing, a well-established testing technique for the assessment of test suites through the injection of artificial faults, could be exploited to evaluate and improve the detection power of performance tests. However, the application of mutation testing to assess performance tests, henceforth called performance mutation testing (PMT), is a novel research topic with numerous open challenges. In previous papers, we identified some key challenges related to PMT. In this work, we go a step further and explore the feasibility of applying PMT at the source-code level in general-purpose languages. To do so, we revisit concepts associated with classical mutation testing and design seven novel mutation operators to model known bug-inducing patterns. As a proof of concept, we applied traditional mutation operators as well as performance mutation operators to open-source C++ programs. The results reveal the potential of the new performance mutants to help assess and enhance performance tests when compared with traditional mutants. A review of live mutants in these programs suggests that they can induce the design of special test inputs. In addition to these promising results, our work brings a whole new set of challenges related to PMT, which will hopefully serve as a starting point for new contributions in the area.
{"title":"Performance mutation testing","authors":"Pedro Delgado-Pérez, A. B. Sánchez, Sergio Segura, I. Medina-Bulo","doi":"10.1002/stvr.1728","DOIUrl":"https://doi.org/10.1002/stvr.1728","url":null,"abstract":"Performance bugs are known to be a major threat to the success of software products. Performance tests aim to detect performance bugs by executing the program through test cases and checking whether it exhibits a noticeable performance degradation. The principles of mutation testing, a well‐established testing technique for the assessment of test suites through the injection of artificial faults, could be exploited to evaluate and improve the detection power of performance tests. However, the application of mutation testing to assess performance tests, henceforth called performance mutation testing (PMT), is a novel research topic with numerous open challenges. In previous papers, we identified some key challenges related to PMT. In this work, we go a step further and explore the feasibility of applying PMT at the source‐code level in general‐purpose languages. To do so, we revisit concepts associated with classical mutation testing and design seven novel mutation operators to model known bug‐inducing patterns. As a proof of concept, we applied traditional mutation operators as well as performance mutation operators to open‐source C++ programs. The results reveal the potential of the new performance‐mutants to help assess and enhance performance tests when compared with traditional mutants. A review of live mutants in these programs suggests that they can induce the design of special test inputs. In addition to these promising results, our work brings a whole new set of challenges related to PMT, which will hopefully serve as a starting point for new contributions in the area.","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"49 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80915481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Special issue: ISSRE 2018, the 29th IEEE International Symposium on Software Reliability Engineering
R. Natella, Sudipto Ghosh

This special issue contains extended versions of five papers from the 29th IEEE International Symposium on Software Reliability Engineering (ISSRE 2018). ISSRE focuses on innovative techniques and tools for assessing, predicting, and improving the reliability, safety, and security of software products. The symposium emphasizes scientific methods, industrial relevance, rigorous empirical validation, and the shared value of practical tools and experiences. ISSRE boasts large industry participation, with authors and participants from international corporations. Based on the reviews from the programme committee members and discussions with the editors-in-chief regarding the relevance of the papers to the journal's topics of interest, we invited the authors of seven papers to extend their work and submit to this special issue. The extended papers went through several rounds of revision during the rigorous peer-review process. The papers were reviewed by a panel of experts that included, but was not limited to, members of the ISSRE 2018 Program Committee. Five papers successfully completed the review process and are included in this special issue.

The first paper, Using Mutants to Help Developers Distinguish and Debug (Compiler) Faults by Josie Holmes and Alex Groce, introduces a distance metric for failing test cases based on the intuition that failing tests that kill the same mutants are likely related to the same fault. This issue is especially relevant for very large test suites, as in the ‘compiler fuzzer taming’ problem. The paper evaluates the metric on two widely used real-world compilers by combining the metric with state-of-the-art methods for fault identification and localization.

The second paper, Testing Microservice Architectures for Operational Reliability by Roberto Pietrantuono, Stefano Russo, and Antonio Guerriero, proposes a method for quantitatively assessing the probability of failures (‘operational reliability’) in the context of microservice applications, where the usage profile changes often for reasons such as frequent releases. The method achieves significant improvements in the accuracy and efficiency of reliability assessment on three open-source applications.

The third paper, Model-based Hypothesis Testing of Uncertain Software Systems by Matteo Camilli, Angelo Gargantini, and Patrizia Scandurra, presents a methodology that combines model-based testing with Bayesian reasoning to test systems with stochastic QoS properties using a model with uncertain parameters. The paper provides a detailed and reproducible case study demonstrating the methodology.

The fourth paper, Fully Automated HTML and Javascript Rewriting for Constructing a Self-healing Web Proxy by Thomas Durieux, Youssef Hamadi, and Martin Monperrus, applies the failure-oblivious computing principle to web applications. Errors are masked through HTML and Javascript code rewriting (e.g., to skip the faulty line) with an HTTP proxy and a browser extension.
{"title":"Special issue: ISSRE 2018, the 29th IEEE International Symposium on Software Reliability Engineering","authors":"R. Natella, Sudipto Ghosh","doi":"10.1002/stvr.1732","DOIUrl":"https://doi.org/10.1002/stvr.1732","url":null,"abstract":"This special issue contains extended versions of five papers from the 29th IEEE International Symposium on Software Reliability Engineering (ISSRE 2018). ISSRE is focused on innovative techniques and tools for assessing, predicting, and improving the reliability, safety, and security of software products. The symposium emphasizes scientific methods, industrial relevance, rigorous empirical validation, and shared value of practical tools and experiences. ISSRE boasts a large industry participation, with authors and participants from international corporations. Based on the reviews from the programme committee members and discussions with the editorsin-chief regarding the relevance of the papers to the journal’s topics of interest, we invited the authors of seven papers to extend their work and submit to this special issue. The extended papers went through several rounds of revision during the rigorous peer-review process. The papers were reviewed by a panel of experts that included, but was not limited to, members of the ISSRE 2018 Program Committee. Five papers successfully completed the review process and are included in this special issue. The first paper, Using Mutants to Help Developers Distinguish and Debug (Compiler) Faults by Josie Holmes and Alex Groce, introduces a distance metric for failing test cases based on the intuition that failing tests that kill the same mutants are likely related to the same fault. This issue is especially relevant for very large test suites, as in the ‘compiler fuzzer taming’ problem. The paper evaluates the metric on two widely used real-world compilers by combining the metric with state-of-the-art methods for fault identification and localization. The second paper, Testing Microservice Architectures for Operational Reliability by Roberto Pietrantuono, Stefano Russo, and Antonio Guerriero, proposes a method for quantitatively assessing the probability of failures (‘operational reliability’) in the context of microservice applications, where the usage profile changes often for reasons such as frequent releases. The method achieves significant improvements in terms of accuracy and efficiency of reliability assessment on three open-source applications. The third paper, Model-based Hypothesis Testing of Uncertain Software Systems by Matteo Camilli, Angelo Gargantini, and Patrizia Scandurra, presents a methodology for combining model-based testing with Bayesian reasoning for testing systems with stochastic QoS properties using a model with uncertain parameters. The paper provides a detailed and reproducible case study for demonstrating the methodology. The fourth paper, Fully Automated HTML and Javascript Rewriting for Constructing a Self-healing Web Proxy by Thomas Durieux, Youssef Hamadi, and Martin Monperrus, applies the failure-oblivious computing principle to web applications. 
Errors are masked through HTML and Javascript code rewriting (e.g., to skip the faulty line) with an HTTP proxy and a browser extensio","PeriodicalId":49506,"journal":{"name":"Software Testing Verification & Reliability","volume":"51 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2020-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83299003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
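The intuition behind the first paper's metric can be sketched as a distance over killed-mutant sets: failing tests that kill similar mutants are probably triggered by the same fault, so clustering by this distance helps tame a fuzzer's flood of failures. A Jaccard distance is one simple realization, assumed here purely for illustration; the paper's actual metric is more sophisticated.

```typescript
// Distance between failing tests via the mutants they kill: tests that kill
// similar mutant sets likely expose the same fault.

function mutantDistance(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  for (const m of a) if (b.has(m)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : 1 - intersection / union; // 0 = identical, 1 = disjoint
}

// Mutants killed by three failing fuzzer-generated tests.
const killed = {
  t1: new Set(["m1", "m2", "m3"]),
  t2: new Set(["m1", "m2", "m4"]), // near t1: probably the same compiler fault
  t3: new Set(["m7", "m8"]),       // far from both: probably a distinct fault
};

console.log(mutantDistance(killed.t1, killed.t2)); // 0.5
console.log(mutantDistance(killed.t1, killed.t3)); // 1
```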