
Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis: Latest Publications

Learning user interface element interactions
Christian Degott, N. P. Borges, A. Zeller
When generating tests for graphical user interfaces, one central problem is to identify how individual UI elements can be interacted with—clicking, long- or right-clicking, swiping, dragging, typing, or more. We present an approach based on reinforcement learning that automatically learns which interactions can be used for which elements, and uses this information to guide test generation. We model the problem as an instance of the multi-armed bandit problem (MAB problem) from probability theory, and show how its traditional solutions work on test generation, with and without relying on previous knowledge. The resulting guidance yields higher coverage. In our evaluation, our approach shows improvements in statement coverage between 18% (when not using any previous knowledge) and 20% (when reusing previously generated models).
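A minimal sketch (not the authors' implementation) of how the multi-armed bandit formulation could drive interaction selection: each UI element keeps an estimated success probability per interaction type, and an epsilon-greedy policy trades off exploiting the best-known interaction against exploring the others. The interaction names, prior counts, and reward signal below are illustrative assumptions.

```python
import random

INTERACTIONS = ["click", "long_click", "swipe", "type_text"]  # assumed action set

class InteractionBandit:
    """Epsilon-greedy multi-armed bandit over interaction types for one UI element."""

    def __init__(self, epsilon=0.1, prior_success=1, prior_failure=1):
        self.epsilon = epsilon
        # Count-based estimates; non-uniform priors can encode previously learned models.
        self.successes = {a: prior_success for a in INTERACTIONS}
        self.failures = {a: prior_failure for a in INTERACTIONS}

    def estimate(self, action):
        s, f = self.successes[action], self.failures[action]
        return s / (s + f)

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(INTERACTIONS)        # explore
        return max(INTERACTIONS, key=self.estimate)   # exploit

    def update(self, action, reward):
        # reward = 1 if the interaction triggered a reaction (e.g., a new state), else 0
        if reward:
            self.successes[action] += 1
        else:
            self.failures[action] += 1

# Usage: one bandit per UI element; the test generator asks it which interaction to try next.
bandit = InteractionBandit()
for _ in range(100):
    action = bandit.choose()
    observed_reaction = random.random() < (0.8 if action == "click" else 0.2)  # stand-in for the app
    bandit.update(action, observed_reaction)
print({a: round(bandit.estimate(a), 2) for a in INTERACTIONS})
```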
Citations: 39
DeepHunter: a coverage-guided fuzz testing framework for deep neural networks
Xiaofei Xie, L. Ma, Felix Juefei-Xu, Minhui Xue, Hongxu Chen, Yang Liu, Jianjun Zhao, Bo Li, Jianxiong Yin, S. See
The past decade has seen the great potential of applying deep neural network (DNN) based software to safety-critical scenarios, such as autonomous driving. Similar to traditional software, DNNs could exhibit incorrect behaviors, caused by hidden defects, leading to severe accidents and losses. In this paper, we propose DeepHunter, a coverage-guided fuzz testing framework for detecting potential defects of general-purpose DNNs. To this end, we first propose a metamorphic mutation strategy to generate new semantically preserved tests, and leverage multiple extensible coverage criteria as feedback to guide the test generation. We further propose a seed selection strategy that combines both diversity-based and recency-based seed selection. We implement and incorporate 5 existing testing criteria and 4 seed selection strategies in DeepHunter. Large-scale experiments demonstrate that (1) our metamorphic mutation strategy is useful to generate new valid tests with the same semantics as the original seed, by up to a 98% validity ratio; (2) the diversity-based seed selection generally weighs more than recency-based seed selection in boosting the coverage and in detecting defects; (3) DeepHunter outperforms the state of the arts by coverage as well as the quantity and diversity of defects identified; (4) guided by corner-region based criteria, DeepHunter is useful to capture defects during the DNN quantization for platform migration.
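A highly simplified sketch of the coverage-guided loop the abstract describes: pick a seed, apply a semantics-preserving ("metamorphic") mutation, and keep the mutant only if it increases a coverage measure. The mutation, the coverage proxy, and the uniform seed selection below are placeholder assumptions, not DeepHunter's actual criteria.

```python
import random

def metamorphic_mutate(image):
    """Placeholder for a semantics-preserving mutation (small, bounded pixel noise)."""
    return [min(1.0, max(0.0, px + random.uniform(-0.05, 0.05))) for px in image]

def coverage_of(image):
    """Placeholder coverage signature; a real criterion would record activated neuron regions."""
    return frozenset(i for i, px in enumerate(image) if px > 0.5)

def fuzz(initial_seeds, iterations=1000):
    seeds = list(initial_seeds)
    covered = set()
    for img in seeds:
        covered |= coverage_of(img)
    for _ in range(iterations):
        # Seed selection would mix diversity and recency; uniform choice here for brevity.
        seed = random.choice(seeds)
        mutant = metamorphic_mutate(seed)
        new_cov = coverage_of(mutant)
        if not new_cov <= covered:      # the mutant exercises something new
            covered |= new_cov
            seeds.append(mutant)        # keep it for further mutation
    return seeds, covered

seeds, covered = fuzz([[random.random() for _ in range(64)] for _ in range(5)])
print(len(seeds), "seeds,", len(covered), "covered items")
```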
Citations: 306
Identifying error code misuses in complex system
Wensheng Tang
Many complex software systems use error codes to differentiate error states. Therefore, it is crucial to ensure those error codes are used correctly. Misuses of error codes can lead to subtle but fatal system failures. These errors are especially difficult to debug, since the failure points are usually far away from the root causes. Existing static analysis approaches to detecting error handling bugs mainly focus on how an error code is propagated or used in a program. However, they do not consider whether an error code is correctly chosen for propagation or usage within different program contexts, and thus fail to detect many error code misuse bugs. In this work, we conduct an empirical study on error code misuses in a mature commercial system. We collect error code issues from the commit history and identify three main causes. To further resolve this problem, we propose a static approach that can automatically detect error code misuses. Our approach takes error code definition and error domain assignment as the input, and uses a novel static analysis method to detect the occurrence of the three categories of error code misuses in the source code.
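As an illustration only (the paper targets a mature commercial system with a dedicated static analysis), the core check that a returned error code belongs to the error domain assigned to its subsystem can be sketched as below; the domains, codes, and functions are made up.

```python
# Hypothetical error-domain assignment: which error codes each subsystem may return.
ERROR_DOMAINS = {
    "storage": {"E_DISK_FULL", "E_IO"},
    "network": {"E_TIMEOUT", "E_CONN_RESET"},
}

# Hypothetical facts extracted by a front end: function -> (its domain, error codes it returns).
FUNCTIONS = {
    "read_block":  ("storage", {"E_IO"}),
    "send_packet": ("network", {"E_TIMEOUT", "E_DISK_FULL"}),  # E_DISK_FULL is a misuse here
}

def find_misuses(functions, domains):
    """Report every returned error code that falls outside the function's assigned domain."""
    misuses = []
    for fn, (domain, returned) in functions.items():
        allowed = domains[domain]
        for code in returned - allowed:
            misuses.append((fn, code, domain))
    return misuses

for fn, code, domain in find_misuses(FUNCTIONS, ERROR_DOMAINS):
    print(f"{fn}: returns {code}, which is outside its '{domain}' error domain")
```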
Citations: 1
Automatically testing self-driving cars with search-based procedural content generation
Alessio Gambi, Marc Müller, G. Fraser
Self-driving cars rely on software which needs to be thoroughly tested. Testing self-driving car software in real traffic is not only expensive but also dangerous, and has already caused fatalities. Virtual tests, in which self-driving car software is tested in computer simulations, offer a more efficient and safer alternative compared to naturalistic field operational tests. However, creating suitable test scenarios is laborious and difficult. In this paper we combine procedural content generation, a technique commonly employed in modern video games, and search-based testing, a testing technique proven to be effective in many domains, in order to automatically create challenging virtual scenarios for testing self-driving car software. Our AsFault prototype implements this approach to generate virtual roads for testing lane keeping, one of the defining features of autonomous driving. Evaluation on two different self-driving car software systems demonstrates that AsFault can generate effective virtual road networks that succeed in revealing software failures, which manifest as cars departing their lane. Compared to random testing AsFault was not only more efficient, but also caused up to twice as many lane departures.
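A toy sketch of the search-based procedural-content-generation idea: a road is encoded as a sequence of turn angles, and a simple (1+1) evolutionary loop mutates the road and keeps variants whose simulated lane departure count is at least as high. The encoding and the fitness function below are stand-ins; AsFault's actual representation, simulator, and fitness are more elaborate.

```python
import random

def random_road(segments=10):
    """A road as a list of per-segment turning angles in degrees (assumed encoding)."""
    return [random.uniform(-60, 60) for _ in range(segments)]

def mutate(road):
    child = road[:]
    i = random.randrange(len(child))
    child[i] += random.uniform(-30, 30)
    return child

def fitness(road):
    """Stand-in for running the lane-keeping software in a simulator and counting
    lane departures; here sharp direction changes are assumed to be harder to handle."""
    return sum(abs(a - b) > 70 for a, b in zip(road, road[1:]))

def search(generations=200):
    best = random_road()
    best_fit = fitness(best)
    for _ in range(generations):
        child = mutate(best)
        child_fit = fitness(child)
        if child_fit >= best_fit:   # keep roads that provoke at least as many departures
            best, best_fit = child, child_fit
    return best, best_fit

road, departures = search()
print("most challenging road found provokes", departures, "simulated lane departures")
```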
Citations: 151
JNI program analysis with automatically extracted C semantic summary
Sungho Lee
From Oracle JVM to Android Runtime, most Java runtime environments officially support Java Native Interface (JNI) for interaction between Java and C. Using JNI, developers can improve Java program performance or reuse existing libraries implemented in C. At the same time, differences between the languages can lead to various kinds of unexpected bugs when developers do not understand the differences or comprehensive interoperation semantics completely. Furthermore, existing program analysis techniques do not cover the interoperation, which can reduce the quality of JNI programs. We propose a JNI program analysis technique that analyzes Java and C code of JNI programs using analyzers targeting each language respectively. The C analyzer generates a semantic summary for each C function callable from Java and the Java analyzer constructs call graphs using the semantic summaries and Java code. In addition to the call graph construction, we extend the analysis technique to detect four bug types that can occur in the interoperation between the languages. We believe that our approach would be able to detect genuine bugs as well as improve the quality of JNI programs.
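One way to picture the "semantic summary" idea, using an assumed and much-simplified data model rather than the paper's format: for each native C function callable from Java, record which Java methods it may call back into and which exceptions it may raise, then merge those facts into the Java-side call graph.

```python
from dataclasses import dataclass, field

@dataclass
class CSummary:
    """Assumed per-C-function summary produced by a C analyzer."""
    name: str
    java_callbacks: set = field(default_factory=set)      # Java methods invoked through JNI
    thrown_exceptions: set = field(default_factory=set)   # exceptions raised from native code

# Java-side call edges discovered by a Java analyzer (caller -> callees), native bodies unresolved.
java_call_graph = {
    "App.main": {"App.nativeParse"},
    "App.onError": set(),
}

# Summaries keyed by the Java native method each C function implements (hypothetical names).
summaries = {
    "App.nativeParse": CSummary("Java_App_nativeParse",
                                java_callbacks={"App.onError"},
                                thrown_exceptions={"java/lang/IllegalArgumentException"}),
}

def merge(call_graph, summaries):
    """Extend the Java call graph with edges that flow through native code."""
    merged = {k: set(v) for k, v in call_graph.items()}
    for native_method, summary in summaries.items():
        merged.setdefault(native_method, set()).update(summary.java_callbacks)
    return merged

for caller, callees in merge(java_call_graph, summaries).items():
    print(caller, "->", sorted(callees))
```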
Citations: 3
Some challenges for software testing research (invited talk paper)
N. Alshahwan, Andrea Ciancone, M. Harman, Yue Jia, Ke Mao, Alexandru Marginean, A. Mols, Hila Peleg, Federica Sarro, Ilya Zorin
This paper outlines 4 open challenges for Software Testing in general and Search Based Software Testing in particular, arising from our experience with the Sapienz System Deployment at Facebook. The challenges may also apply more generally, thereby representing opportunities for the research community to further benefit from the growing interest in automated test design in industry.
Citations: 8
Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis
{"title":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","authors":"","doi":"10.1145/3293882","DOIUrl":"https://doi.org/10.1145/3293882","url":null,"abstract":"","PeriodicalId":20624,"journal":{"name":"Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88147987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Ukwikora: continuous inspection for keyword-driven testing
Renaud Rwemalika, Marinos Kintis, Mike Papadakis, Yves Le Traon, Pierre Lorrach
Automation of acceptance test suites becomes necessary in the context of agile software development practices, which require rapid feedback on the quality of code changes. To this end, companies try to automate their acceptance tests as much as possible. Unfortunately, the growth of the automated test suites, by several automation testers, gives rise to potential test smells, i.e., poorly designed test code, being introduced in the test code base, which in turn may increase the cost of maintaining the code and creating new tests. In this paper, we investigate this problem in the context of our industrial partner, BGL BNP Paribas, and introduce Ukwikora, an automated tool that statically analyzes acceptance test suites, enabling the continuous inspection of the test code base. Ukwikora targets code written in the Robot Framework syntax, a popular framework for writing Keyword-Driven tests. Ukwikora has been successfully deployed at BGL BNP Paribas, detecting issues otherwise unknown to the automation testers, such as the presence of duplicated test code, dead test code and dependency issues among the tests. The success of our case study reinforces the need for additional research and tooling for acceptance test suites.
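A rough sketch of one of the inspections mentioned (dead test code), under a much-simplified view of Robot Framework files: collect user keywords defined in a `*** Keywords ***` section and flag those never referenced elsewhere in the suite. Real Robot Framework parsing, and Ukwikora itself, handle far more syntax than this.

```python
def split_sections(lines):
    """Group lines of a simplified .robot file by their *** Section *** headers."""
    sections, current = {}, None
    for line in lines:
        if line.strip().startswith("***"):
            current = line.strip().strip("* ").lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections

def dead_keywords(robot_source):
    lines = robot_source.splitlines()
    sections = split_sections(lines)
    # Simplifying assumption: keyword definitions start at column 0 inside the Keywords section.
    defined = {l.strip() for l in sections.get("keywords", []) if l and not l.startswith((" ", "\t"))}
    used = set()
    for line in lines:
        if line.startswith((" ", "\t")):   # indented lines are test or keyword steps
            used.update(k for k in defined if k in line)
    return defined - used

suite = """*** Test Cases ***
Valid Login
    Open Login Page
    Submit Credentials

*** Keywords ***
Open Login Page
    Log    opening page
Submit Credentials
    Log    submitting
Reset Password
    Log    never called
"""
print("potentially dead keywords:", dead_keywords(suite))
```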
Citations: 2
Interactive metamorphic testing of debuggers
S. Tolksdorf, Daniel Lehmann, Michael Pradel
When improving their code, developers often turn to interactive debuggers. The correctness of these tools is crucial, because bugs in the debugger itself may mislead a developer, e.g., to believe that executed code is never reached or that a variable has another value than in the actual execution. Yet, debuggers are difficult to test because their input consists of both source code and a sequence of debugging actions, such as setting breakpoints or stepping through code. This paper presents the first metamorphic testing approach for debuggers. The key idea is to transform both the debugged code and the debugging actions in such a way that the behavior of the original and the transformed inputs should differ only in specific ways. For example, adding a breakpoint should not change the control flow of the debugged program. To support the interactive nature of debuggers, we introduce interactive metamorphic testing. It differs from traditional metamorphic testing by determining the input transformation and the expected behavioral change it causes while the program under test is running. Our evaluation applies the approach to the widely used debugger in the Chromium browser, where it finds eight previously unknown bugs with a true positive rate of 51%. All bugs have been confirmed by the developers, and one bug has even been marked as release-blocking.
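The metamorphic relation the abstract mentions ("adding a breakpoint should not change the control flow") can be illustrated with Python's own tracing facilities as a stand-in for a real debugger: record the executed line numbers with and without an extra no-op "breakpoint" and require the traces to match. This is an analogy to convey the relation, not the Chromium debugger setup used in the paper.

```python
import sys

def trace_lines(func, breakpoint_line=None):
    """Run func under a tracer and return the executed line numbers.
    If breakpoint_line is set, the tracer 'pauses' there (a no-op stand-in
    for an interactive breakpoint), which must not change the trace."""
    executed = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            executed.append(frame.f_lineno)
            if frame.f_lineno == breakpoint_line:
                pass  # a real harness would stop, inspect state, then resume
        return tracer

    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(None)
    return executed

def program_under_debug():
    total = 0
    for i in range(3):
        total += i
    return total

baseline = trace_lines(program_under_debug)
with_breakpoint = trace_lines(program_under_debug, breakpoint_line=baseline[1])
# Metamorphic check: the breakpoint must not alter which lines execute, or in what order.
assert baseline == with_breakpoint, "debugger changed control flow: potential bug"
print("relation holds:", baseline == with_breakpoint)
```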
Citations: 23
Effective and efficient API misuse detection via exception propagation and search-based testing
M. Kechagia, Xavier Devroey, Annibale Panichella, Georgios Gousios, A. Deursen
Application Programming Interfaces (APIs) typically come with (implicit) usage constraints. The violations of these constraints (API misuses) can lead to software crashes. Even though there are several tools that can detect API misuses, most of them suffer from a very high rate of false positives. We introduce Catcher, a novel API misuse detection approach that combines static exception propagation analysis with automatic search-based test case generation to effectively and efficiently pinpoint crash-prone API misuses in client applications. We validate Catcher against 21 Java applications, targeting misuses of the Java platform's API. Our results indicate that Catcher is able to generate test cases that uncover 243 (unique) API misuses that result in crashes. Our empirical evaluation shows that Catcher can detect a large number of misuses (77 cases) that would remain undetected by the traditional coverage-based test case generator EvoSuite. Additionally, on average, Catcher is eight times faster than EvoSuite in generating test cases for the identified misuses. Finally, we find that the majority of the exceptions triggered by Catcher are unexpected to developers, i.e., not only unhandled in the source code but also not listed in the documentation of the client applications.
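A simplified dynamic analogue of the idea, assumed for illustration only (Catcher itself combines static exception propagation with EvoSuite-style search and targets the Java platform API): feed generated inputs to an API entry point and flag any raised exception the API's documentation does not declare, since such crashes point at crash-prone misuses or documentation gaps. The API function, its documented exception set, and the input generator are hypothetical.

```python
import random

def parse_port(value: str) -> int:
    """Hypothetical API under test. Documented to raise ValueError for invalid input."""
    port = int(value)                # raises ValueError on non-numeric strings
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    return 65536 // (port % 10)      # hidden defect: ZeroDivisionError when the port ends in 0

DOCUMENTED = {ValueError}

def random_inputs(n=500):
    alphabet = "0123456789abc"
    for _ in range(n):
        yield "".join(random.choice(alphabet) for _ in range(random.randint(1, 6)))

def find_undeclared_exceptions(api, documented, inputs):
    """Report exception types not covered by the documentation, with one triggering input each."""
    findings = {}
    for arg in inputs:
        try:
            api(arg)
        except Exception as exc:     # deliberately broad: every crash is of interest
            if type(exc) not in documented:
                findings.setdefault(type(exc).__name__, arg)
    return findings

print(find_undeclared_exceptions(parse_port, DOCUMENTED, random_inputs()))
```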
Citations: 7