
Latest publications: 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)

Organization Committee
DOI: 10.1109/icst.2019.00007
Citations: 0
BugsJS: a Benchmark of JavaScript Bugs
Péter Gyimesi, Béla Vancsics, Andrea Stocco, D. Mazinanian, Árpád Beszédes, R. Ferenc, A. Mesbah
JavaScript is a popular programming language that is also error-prone due to its asynchronous, dynamic, and loosely-typed nature. In recent years, numerous techniques have been proposed for analyzing and testing JavaScript applications. However, our survey of the literature in this area revealed that the proposed techniques are often evaluated on different datasets of programs and bugs. The lack of a commonly used benchmark limits the ability to perform fair and unbiased comparisons for assessing the efficacy of new techniques. To fill this gap, we propose BugsJS, a benchmark of 453 real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs, comprising 444k LOC in total. Each bug is accompanied by its bug report, the test cases that detect it, as well as the patch that fixes it. BugsJS features a rich interface for accessing the faulty and fixed versions of the programs and executing the corresponding test cases, which facilitates conducting highly-reproducible empirical studies and comparisons of JavaScript analysis and testing tools.
DOI: 10.1109/ICST.2019.00019
Citations: 60
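The abstract describes each BugsJS entry as a bug report plus the tests that detect the bug and the patch that fixes it. The following is a minimal in-memory sketch of that data model; it is not the real BugsJS interface, and the project name, URL, and test path used below are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical model of one benchmark entry: report, detecting tests, patch.
@dataclass(frozen=True)
class Bug:
    project: str            # one of the benchmark's server-side programs
    bug_id: int
    report_url: str         # link to the original bug report
    detecting_tests: tuple  # test cases that fail on the faulty version
    patch: str              # diff that fixes the bug

class Benchmark:
    def __init__(self):
        self._bugs = {}

    def add(self, bug: Bug) -> None:
        self._bugs[(bug.project, bug.bug_id)] = bug

    def bug(self, project: str, bug_id: int) -> Bug:
        return self._bugs[(project, bug_id)]

    def bugs_for(self, project: str):
        return [b for b in self._bugs.values() if b.project == project]

bench = Benchmark()
bench.add(Bug("express", 1, "https://example.org/issues/1",
              ("test/app.router.js",),
              "--- a/lib/router.js\n+++ b/lib/router.js"))
assert bench.bug("express", 1).detecting_tests == ("test/app.router.js",)
```

A study using such a registry would check out the faulty version, run the detecting tests to confirm they fail, apply the patch, and confirm they pass.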
AADL-Based Safety Analysis Approaches for Safety-Critical Systems
Xiaomin Wei
Ensuring system safety is significant for safety-critical systems. To improve system safety in system architecture models, Architecture Analysis and Design Language (AADL) is used to model safety-critical systems. My thesis provides several safety analysis approaches for AADL models. To make it more effective, model transformation rules from AADL models to target formal models are formulated for the integration of formal methods into safety analysis approaches. The automatic transformation can reduce the degree of application difficulty of formal methods for engineers.
DOI: 10.1109/ICST.2019.00058
Citations: 0
SmokeOut: An Approach for Testing Clustering Implementations
Vincenzo Musco, Xin Yin, Iulian Neamtiu
Clustering is a key Machine Learning technique, used in many high-stakes domains from medicine to self-driving cars. Many clustering algorithms have been proposed, and these algorithms have been implemented in many toolkits. Clustering users assume that clustering implementations are correct, reliable, and for a given algorithm, interchangeable. We challenge these assumptions. We introduce SmokeOut, an approach and tool that pits clustering implementations against each other (and against themselves) while controlling for algorithm and dataset, to find datasets where clustering outcomes differ when they shouldn't, and measure this difference. We ran SmokeOut on 7 clustering algorithms (3 deterministic and 4 nondeterministic) implemented in 7 widely-used toolkits, and run in a variety of scenarios on the Penn Machine Learning Benchmark (162 datasets). SmokeOut has revealed that clustering implementations are fragile: on a given input dataset and using a given clustering algorithm, clustering outcomes and accuracy vary widely between (1) successive runs of the same toolkit; (2) different input parameters for that tool; (3) different toolkits.
DOI: 10.1109/ICST.2019.00057
Citations: 7
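The core idea above, pitting runs of a nondeterministic clustering implementation against each other on the same input, can be sketched with a toy 1D k-means. This is only an illustration of the comparison scheme, not the SmokeOut tool itself; the data and seeds are invented.

```python
import random

def kmeans_1d(points, k, seed, iters=20):
    """Toy 1D k-means with seed-dependent random initialization."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(p - centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def same_partition(labels_a, labels_b):
    """Compare two clusterings up to relabeling of cluster ids."""
    def groups(labels):
        g = {}
        for i, l in enumerate(labels):
            g.setdefault(l, set()).add(i)
        return sorted(map(frozenset, g.values()), key=min)
    return groups(labels_a) == groups(labels_b)

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 9.0, 9.1]
runs = [kmeans_1d(data, k=3, seed=s) for s in range(5)]
# Count runs whose partition disagrees with the first run's partition.
disagreements = sum(1 for r in runs[1:] if not same_partition(runs[0], r))
```

A SmokeOut-style harness would apply the same comparison across toolkits and parameter settings, flagging datasets where the disagreement count is nonzero when the algorithm should be deterministic.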
On the Evolution of Keyword-Driven Test Suites
Renaud Rwemalika, Marinos Kintis, Mike Papadakis, Yves Le Traon, Pierre Lorrach
Many companies rely on software testing to verify that their software products meet their requirements. However, test quality and, in particular, the quality of end-to-end testing is relatively hard to achieve. The problem becomes challenging when software evolves, as end-to-end test suites need to adapt and conform to the evolved software. Unfortunately, end-to-end tests are particularly fragile as any change in the application interface, e.g., application flow, location or name of graphical user interface elements, necessitates a change in the tests. This paper presents an industrial case study on the evolution of Keyword-Driven test suites, also known as Keyword-Driven Testing (KDT). Our aim is to demonstrate the problem of test maintenance, identify the benefits of Keyword-Driven Testing and overall improve the understanding of test code evolution (at the acceptance testing level). This information will support the development of automatic techniques, such as test refactoring and repair, and will motivate future research. To this end, we identify, collect and analyze test code changes across the evolution of industrial KDT test suites for a period of eight months. We show that the problem of test maintenance is largely due to test fragility (most commonly-performed changes are due to locator and synchronization issues) and test clones (over 30% of keywords are duplicated). We also show that the better test design of KDT test suites has the potential for drastically reducing (approximately 70%) the number of test code changes required to support software evolution. To further validate our results, we interview testers from BGL BNP Paribas and report their perceptions on the advantages and challenges of keyword-driven testing.
DOI: 10.1109/ICST.2019.00040
Citations: 14
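One finding above is that over 30% of keywords in the studied KDT suites are duplicated. A simple way to surface such clones is to group keywords by their underlying step sequences; the sketch below does exactly that. The keyword names and steps are invented examples, not taken from the paper's industrial subject.

```python
def find_keyword_clones(keywords):
    """Group keyword names whose step sequences are identical."""
    by_steps = {}
    for name, steps in keywords.items():
        by_steps.setdefault(tuple(steps), []).append(name)
    return [sorted(names) for names in by_steps.values() if len(names) > 1]

# A tiny keyword-driven suite: keyword name -> list of lower-level steps.
suite = {
    "Login As Admin": ["Open Browser", "Input Text user admin", "Click Button login"],
    "Admin Sign In":  ["Open Browser", "Input Text user admin", "Click Button login"],
    "Open Dashboard": ["Go To /dashboard", "Wait Until Page Contains Dashboard"],
}
clones = find_keyword_clones(suite)
# Fraction of keywords that participate in at least one clone group.
duplication_ratio = sum(len(g) for g in clones) / len(suite)
```

Merging each clone group into a single shared keyword is one of the refactorings such an analysis would motivate.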
Testing Android Incoming Calls
A. C. Paiva, Marco A. Gonçalves, André R. Barros
Mobile applications are increasingly present in our daily lives. Being increasingly dependent on apps, we all want to make sure apps work as expected. One way to increase confidence in and quality of software is through testing. However, the existing approaches and tools still do not provide sufficient solutions for testing mobile apps with features different from those found in desktop or web applications. In particular, there are guidelines that mobile developers should follow and that may be tested automatically but, as far as we know, there are no tools that are able to do it. The iMPAcT tool combines exploration, reverse engineering and testing to check if mobile apps follow best practices to implement specific behavior called UI Patterns. Examples of UI Patterns within this catalog are: orientation, background-foreground, side drawer, and tab-scroll, among others. For each of these behaviors (UI Patterns), the iMPAcT tool has a corresponding Test Pattern that checks whether the UI Pattern implementation follows the guidelines. This paper presents an extension to the iMPAcT tool. It enables testing whether Android apps work properly after receiving an incoming call, i.e., whether the state of the screen after the call is the same as before getting the call. It formalizes the problem, describes the overall approach and the architecture of the tool, and reports an experiment performed over 61 public mobile apps.
DOI: 10.1109/ICST.2019.00053
Citations: 5
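The incoming-call test pattern described above reduces to a simple oracle: the UI state captured before the interruption must equal the state captured after the call ends. The sketch below models that oracle; the `App` class is a stand-in written for illustration, not the iMPAcT tool's API.

```python
class App:
    """Stand-in for an Android app's visible state (hypothetical)."""
    def __init__(self):
        self.screen = {"activity": "Main", "fields": {"query": "robots"}}
        self._saved = None

    def on_incoming_call(self):
        # A well-behaved app persists its state when interrupted...
        self._saved = {"activity": self.screen["activity"],
                       "fields": dict(self.screen["fields"])}
        self.screen = {"activity": "InCall", "fields": {}}

    def on_call_ended(self):
        # ...and restores it once the call ends.
        self.screen = self._saved

def passes_incoming_call_pattern(app):
    """Oracle: screen state after the call must equal the state before it."""
    before = app.screen
    app.on_incoming_call()
    app.on_call_ended()
    return app.screen == before

assert passes_incoming_call_pattern(App())
```

An app that fails to save or restore its state in `on_incoming_call`/`on_call_ended` would make the oracle return `False`, which is the defect class the extension targets.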
Using Data Flow-Based Coverage Criteria for Black-Box Integration Testing of Distributed Software Systems
Dominik Hellhake, Tobias Schmid, S. Wagner
Modern automotive E/E systems are implemented as distributed real-time software systems. The constantly growing complexity of safety-relevant software functions makes testing during system integration of such systems increasingly important. Systematic metrics are required to guide the testing process during system integration by providing coverage measures and stopping criteria, but few studied approaches exist. For this purpose, we introduce a data-flow-based observation scheme which captures the interplay behavior of the involved ECUs during test execution and failure occurrences. In addition, we introduce a data-flow-based coverage criterion designed for black-box integration. By applying the observation scheme to test cases and associated faults found during execution, we first analyze similarities in data flow coverage. By further analyzing the data flow of failures that slipped through the phase of system integration testing, we evaluate the usefulness of test gaps identified by using the suggested coverage criterion. We found major differences in the usage of data flow between undetected failures and existing test cases. In addition, we found that for the studied system under test the occurrence of failures is not necessarily a direct consequence of the test execution, due to functional dependencies and side effects. Overall, these findings highlight the potential and limitations of data-flow-based measures to be formalized as coverage or stopping criteria for the integration testing of distributed software systems.
DOI: 10.1109/ICST.2019.00051
Citations: 6
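A data-flow coverage measure of the kind described above can be sketched as follows: extract (sender, signal, receiver) interactions from a bus trace, then report the fraction of specified interactions that the executed tests exercised, plus the uncovered ones as test gaps. This is a hedged illustration of the idea; the ECU and signal names are invented, and the paper's actual observation scheme is richer.

```python
# Specified data flows between ECUs: (sending ECU, signal, receiving ECU).
specified = {
    ("EngineECU", "rpm", "DashboardECU"),
    ("EngineECU", "rpm", "GearboxECU"),
    ("BrakeECU", "pressure", "StabilityECU"),
}

def dataflow_coverage(trace):
    """trace: observed (sender, signal, receiver) triples from test runs.

    Returns the covered fraction of specified flows and the remaining gaps.
    """
    covered = {t for t in trace if t in specified}
    return len(covered) / len(specified), specified - covered

cov, gaps = dataflow_coverage([
    ("EngineECU", "rpm", "DashboardECU"),
    ("BrakeECU", "pressure", "StabilityECU"),
    ("BrakeECU", "pressure", "StabilityECU"),  # repeated flows count once
])
```

The `gaps` set is what the paper calls test gaps: specified interactions no executed test has exercised, which are candidates for new integration tests.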
Testing for Implicit Inconsistencies in Documentation and Implementation
Devika Sondhi
The thesis aims to provide test generation techniques that go beyond coverage-based criteria, with the objective of highlighting inconsistencies between the documentation and the implementation. We leverage the domain knowledge gained from developers' expertise and existing resources to generate test cases.
DOI: 10.1109/ICST.2019.00059
Citations: 0
[Copyright notice]
DOI: 10.1109/icst.2019.00003
Citations: 0
An Empirical Study on the Use of Defect Prediction for Test Case Prioritization
David Paterson, José Campos, Rui Abreu, G. M. Kapfhammer, G. Fraser, Phil McMinn
Test case prioritization has been extensively researched as a means for reducing the time taken to discover regressions in software. While many different strategies have been developed and evaluated, prior experiments have shown them to not be effective at prioritizing test suites to find real faults. This paper presents a test case prioritization strategy based on defect prediction, a technique that analyzes code features, such as the number of revisions and authors, to estimate the likelihood that any given Java class will contain a bug. Intuitively, if defect prediction can accurately predict the class that is most likely to be buggy, a tool can prioritize tests to rapidly detect the defects in that class. We investigated how to configure a defect prediction tool, called Schwa, to maximize the likelihood of an accurate prediction, surfacing the link between perfect defect prediction and test case prioritization effectiveness. Using 6 real-world Java programs containing 395 real faults, we conducted an empirical evaluation comparing this paper's strategy, called G-clef, against eight existing test case prioritization strategies. The experiments reveal that using defect prediction to prioritize test cases reduces the number of test cases required to find a fault by on average 9.48% when compared with existing coverage-based strategies, and 10.4% when compared with existing history-based strategies.
DOI: 10.1109/ICST.2019.00041
Citations: 32
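The intuition stated above can be sketched in a few lines: rank each test by the highest predicted defect likelihood among the classes it covers, so tests touching the likeliest-buggy class run first. This is a minimal illustration in the spirit of G-clef, not its actual algorithm; the class names, scores, and coverage map are invented.

```python
# Hypothetical per-class defect likelihoods, e.g. from a predictor like Schwa.
defect_scores = {
    "OrderService": 0.91,
    "Cart": 0.40,
    "Logger": 0.05,
}

# Which classes each test exercises (invented coverage data).
test_coverage = {
    "testCheckout": ["OrderService", "Cart"],
    "testLogging": ["Logger"],
    "testCartAdd": ["Cart"],
}

def prioritize(tests, scores):
    """Order tests by the max defect score of any class they cover."""
    return sorted(tests,
                  key=lambda t: max(scores.get(c, 0.0) for c in tests[t]),
                  reverse=True)

order = prioritize(test_coverage, defect_scores)
# testCheckout runs first: it covers OrderService, the likeliest-buggy class.
```

If the predictor is accurate, a fault in `OrderService` is found after one test instead of up to three, which is the effect the paper measures against coverage- and history-based baselines.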