Efficient Incrementalized Runtime Checking of Linear Measures on Lists. A. Gyori, P. Garg, E. Pek, P. Madhusudan. doi:10.1109/ICST.2017.35

We present mechanisms to specify and efficiently check, at runtime, assertions that express structural properties and aggregate measures of dynamically manipulated linked-list data structures. Checking assertions involving the structure, disjointness, and aggregation measures of lists and list segments typically requires linear or quadratic time in the size of the heap. Our main contribution is an incrementalization instrumentation that tracks properties of data structures dynamically as the program executes and leads to orders-of-magnitude speedups in assertion checking in many scenarios. Our incrementalization incurs a constant overhead on updates to list structures but enables checking assertions in constant time, independent of the size of the heap. We define a general class of functions on lists, called linear measures, which are amenable to our incrementalization technique. We demonstrate the effectiveness of our technique by showing orders-of-magnitude speedups in two scenarios: one stemming from assertions at the level of APIs of list-manipulating libraries, and the other from dynamic detection of security attacks caused by malicious rootkits.
A Framework for Failure Diagnosis. Mojdeh Golagha. doi:10.1109/ICST.2017.75

Testing and debugging are among the most expensive and challenging phases of the software development life-cycle. One important cost factor in the debugging process is the time required to analyze failures and repair the underlying faults. Two types of methods that can help testers reduce this analysis time are Failure Clustering and Fault Localization. Although there is a plethora of these methods in the literature, gaps remain that prevent their operationalization in real-world contexts. In addition, the abundance of methods makes it difficult for practitioners to select the one suitable for their specific domain. To fill these gaps and bring the state of the art closer to practice, we develop a framework for failure diagnosis. To devise this framework, we evaluate existing methods to investigate the possibility of finding a method (or methods) that would be effective in different contexts. We then introduce a methodology for adapting these methods to different contexts with a priori parameter settings. This framework will empower practitioners to debug quickly and reliably.
SAGA Toolbox: Interactive Testing of Guarded Assertions. Daniel Flemström, T. Gustafsson, A. Kobetski. doi:10.1109/ICST.2017.59

This paper presents the SAGA toolbox. It centers on the development of tests, and the analysis of test results, in the Guarded Assertions (GA) format. Such a test defines when to test and what to expect in that state. The SAGA toolbox lets the user describe the test and at the same time get immediate feedback on the test result, based on a trace from the System Under Test (SUT). The feedback is visual, using plots of the trace. This enables the test engineer to explore the data and to use an agile development method, since the data is already there. Moreover, the SAGA toolbox lets the test engineer change test-stimuli plots to study the effect they have on a test; it can later generate computer programs that feed these test stimuli to the SUT. This enables an interactive feedback loop in which immediate feedback on changes to the test, or to the test stimuli, indicates whether the test is correct and whether it passed or failed.
Generic and Effective Specification of Structural Test Objectives. M. Marcozzi, Mickaël Delahaye, Sébastien Bardin, N. Kosmatov, V. Prevosto. doi:10.1109/ICST.2017.48

A large amount of research has been carried out to automate white-box testing. While a wide range of different and sometimes heterogeneous code-coverage criteria have been proposed, no generic formalism exists to describe them all, and available test-automation tools usually support only a small subset of them. We introduce a new specification language, called HTOL (Hyperlabel Test Objectives Language), which provides a powerful generic mechanism for defining a wide range of test objectives. HTOL comes with a formal semantics and can encode all standard criteria except full mutation. Besides specification, HTOL is appealing in the context of test automation, as it allows handling criteria in a unified way.
Uncertainty-Driven Black-Box Test Data Generation. Neil Walkinshaw, G. Fraser. doi:10.1109/ICST.2017.30

We can never be certain that a software system is correct simply by testing it, but with every additional successful test we become less uncertain about its correctness. In the absence of source code or elaborate specifications and models, tests are usually generated or chosen randomly. However, rather than choosing tests randomly, it would be preferable to choose those tests that decrease our uncertainty about correctness the most. To guide test generation, we apply what is referred to in machine learning as the "Query Strategy Framework": we infer a behavioural model of the system under test and select those tests about which the inferred model is "least certain". Running these tests on the system under test thus directly targets those parts about which the tests so far have failed to inform the model. We provide an implementation that uses a genetic programming engine for model inference to enable an uncertainty-sampling technique known as "query by committee", and we evaluate it on eight subject systems from the Apache Commons Math framework and JodaTime. The results indicate that test generation using uncertainty sampling outperforms conventional and Adaptive Random Testing.
Assessing and Improving the Mutation Testing Practice of PIT. Thomas Laurent, Anthony Ventresque, Mike Papadakis, Christopher Henard, Yves Le Traon. doi:10.1109/ICST.2017.47

Mutation testing is extensively used in software testing studies. However, popular mutation testing tools use a restrictive set of mutants that does not conform to community standards or the mutation testing literature. This can be problematic, since the effectiveness of mutation strongly depends on the mutants used. To investigate this issue, we form an extended set of mutants and implement it in a popular mutation testing tool named PIT. We then show that in real-world projects the original mutants of PIT are easier to kill and lead to tests that score statistically lower than those derived from the extended mutant set for 35% to 70% of the studied classes. These results raise serious concerns about the validity of mutation-based experiments that use PIT. To further demonstrate the strengths of the extended mutants, we also performed an analysis using a benchmark with mutation-adequate test cases and identified equivalent mutants. Our results confirm that the extended mutants are more effective than a) the original version of PIT and b) two other popular mutation testing tools (Major and muJava). In particular, the extended mutants are more effective by 23%, 12%, and 7% than the mutants of the original PIT, Major, and muJava, respectively. They are also at least as strong as the mutants of all three other tools combined. To support future research, we make the new version of PIT, equipped with the extended mutants, publicly available.
The Fitness Function for the Job: Search-Based Generation of Test Suites That Detect Real Faults. Gregory Gay. doi:10.1109/ICST.2017.38

Search-based test generation, if effective at fault detection, can lower the cost of testing. Such techniques rely on fitness functions to guide the search. Ultimately, these functions represent test goals that approximate, but do not ensure, fault detection. The need to rely on approximations raises two questions: can fitness functions produce effective tests and, if so, which should be used to generate them? To answer these questions, we assessed the fault-detection capabilities of the EvoSuite framework and eight of its fitness functions on 353 real faults from the Defects4J database. Our analysis found that the strongest indicator of effectiveness is a high level of code coverage; consequently, the branch coverage fitness function is the most effective. Our findings indicate that fitness functions that thoroughly explore system structure should be used as primary generation objectives, supported by secondary fitness functions that vary the scenarios explored.