As a software system evolves, its test suite can accumulate redundancies over time. Test minimization aims to remove redundant test cases. However, current techniques remove whole test cases from the test suite using test adequacy criteria, such as code coverage. This has two limitations: (1) by removing a whole test case, its test assertions are also lost, which can weaken test suite effectiveness, and (2) partly redundant test cases, i.e., tests with redundant test statements, are ignored. We propose a novel approach for fine-grained test case minimization. Our analysis is based on the inference of a test suite model that enables automated test reorganization within test cases. It removes redundancies at the test statement level while preserving the coverage and test assertions of the test suite. We evaluated our approach, implemented in a tool called Testler, on the test suites of 15 open source projects. Our analysis shows that over 4,639 (24%) of the tests in these test suites are partly redundant, with over 11,819 redundant test statements in total. Our results show that Testler removes 43% of the redundant test statements, reducing the number of partly redundant tests by 52%. As a result, test suite execution time is reduced by up to 37% (20% on average), while maintaining the original statement coverage, branch coverage, test assertions, and fault detection capability.
{"title":"Fine-Grained Test Minimization","authors":"Arash Vahabzadeh, Andrea Stocco, A. Mesbah","doi":"10.1145/3180155.3180203","DOIUrl":"https://doi.org/10.1145/3180155.3180203","url":null,"abstract":"As a software system evolves, its test suite can accumulate redundancies over time. Test minimization aims at removing redundant test cases. However, current techniques remove whole test cases from the test suite using test adequacy criteria, such as code coverage. This has two limitations, namely (1) by removing a whole test case the corresponding test assertions are also lost, which can inhibit test suite effectiveness, (2) the issue of partly redundant test cases, i.e., tests with redundant test statements, is ignored. We propose a novel approach for fine-grained test case minimization. Our analysis is based on the inference of a test suite model that enables automated test reorganization within test cases. It enables removing redundancies at the test statement level, while preserving the coverage and test assertions of the test suite. We evaluated our approach, implemented in a tool called Testler, on the test suites of 15 open source projects. Our analysis shows that over 4,639 (24%) of the tests in these test suites are partly redundant, with over 11,819 redundant test statements in total. Our results show that Testler removes 43% of the redundant test statements, reducing the number of partly redundant tests by 52%. As a result, test suite execution time is reduced by up to 37% (20% on average), while maintaining the original statement coverage, branch coverage, test assertions, and fault detection capability.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"42 1","pages":"210-221"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73386120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shafiul Azam Chowdhury, Soumik Mohian, Sidharth Mehra, Siddhant Gawsane, Taylor T. Johnson, Christoph Csallner
Cyber-physical system (CPS) development tool chains are widely used in the design, simulation, and verification of CPS data-flow models. Commercial CPS tool chains such as MathWorks' Simulink generate artifacts such as code binaries that are widely deployed in embedded systems. Hardening such tool chains by testing is crucial, since formally verifying them is currently infeasible. Existing differential testing frameworks such as CyFuzz cannot generate models rich in language features, partly because they do not leverage the available informal Simulink specifications. Furthermore, no study of existing Simulink models is available that could guide CyFuzz toward generating realistic models. To address these shortcomings, we created the first large collection of public Simulink models and used the collected models' properties to guide random model generation. To further guide model generation, we systematically collected semi-formal Simulink specifications. In our experiments on several hundred models, the resulting SLforge generator was more effective and efficient than the state-of-the-art tool CyFuzz. SLforge also found 8 new confirmed bugs in Simulink.
{"title":"Automatically Finding Bugs in a Commercial Cyber-Physical System Development Tool Chain With SLforge","authors":"Shafiul Azam Chowdhury, Soumik Mohian, Sidharth Mehra, Siddhant Gawsane, Taylor T. Johnson, Christoph Csallner","doi":"10.1145/3180155.3180231","DOIUrl":"https://doi.org/10.1145/3180155.3180231","url":null,"abstract":"Cyber-physical system (CPS) development tool chains are widely used in the design, simulation, and verification of CPS data-flow models. Commercial CPS tool chains such as MathWorks' Simulink generate artifacts such as code binaries that are widely deployed in embedded systems. Hardening such tool chains by testing is crucial since formally verifying them is currently infeasible. Existing differential testing frameworks such as CyFuzz can not generate models rich in language features, partly because these tool chains do not leverage the available informal Simulink specifications. Furthermore, no study of existing Simulink models is available, which could guide CyFuzz to generate realistic models. To address these shortcomings, we created the first large collection of public Simulink models and used the collected models' properties to guide random model generation. To further guide model generation we systematically collected semi-formal Simulink specifications. In our experiments on several hundred models, the resulting SLforge generator was more effective and efficient than the state-of-the-art tool CyFuzz. SLforge also found 8 new confirmed bugs in Simulink.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"45 1","pages":"981-992"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85905650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pei Wang, Qinkun Bao, Li Wang, Shuai Wang, Zhaofeng Chen, Tao Wei, Dinghao Wu
The prosperity of smartphone markets has raised new concerns about software security on mobile platforms, leading to a growing demand for effective software obfuscation techniques. Due to various differences between the mobile and desktop ecosystems, obfuscation faces both technical and non-technical challenges when applied to mobile software. Although quite a few software security solution providers have launched mobile app obfuscation services, it is unclear how real-world mobile developers perform obfuscation as part of their software engineering practices. Our research takes a first step toward systematically studying the deployment of software obfuscation techniques in mobile software development. Using an automated but coarse-grained method, we computed the likelihood of an app being obfuscated for over a million app samples crawled from the Apple App Store. We then inspected the top 6,600 instances and identified 601 obfuscated versions of 539 iOS apps. By analyzing this sample set with extensive manual effort, we made various observations that reveal the status quo of mobile obfuscation in the real world, providing insights into understanding and improving software protection on mobile platforms.
{"title":"Software Protection on the Go: A Large-Scale Empirical Study on Mobile App Obfuscation","authors":"Pei Wang, Qinkun Bao, Li Wang, Shuai Wang, Zhaofeng Chen, Tao Wei, Dinghao Wu","doi":"10.1145/3180155.3180169","DOIUrl":"https://doi.org/10.1145/3180155.3180169","url":null,"abstract":"The prosperity of smartphone markets has raised new concerns about software security on mobile platforms, leading to a growing demand for effective software obfuscation techniques. Due to various differences between the mobile and desktop ecosystems, obfuscation faces both technical and non-technical challenges when applied to mobile software. Although there have been quite a few software security solution providers launching their mobile app obfuscation services, it is yet unclear how real-world mobile developers perform obfuscation as part of their software engineering practices. Our research takes a first step to systematically studying the deployment of software obfuscation techniques in mobile software development. With the help of an automated but coarse-grained method, we computed the likelihood of an app being obfuscated for over a million app samples crawled from Apple App Store. We then inspected the top 6600 instances and managed to identify 601 obfuscated versions of 539 iOS apps. By analyzing this sample set with extensive manual effort, we made various observations that reveal the status quo of mobile obfuscation in the real world, providing insights into understanding and improving the situation of software protection on mobile platforms.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"26-36"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86228934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated unit testing reduces the manual effort of writing unit test drivers/stubs and generating unit test inputs. However, automatically generated unit test drivers/stubs raise false alarms because they often over-approximate the real contexts of a target function f and allow infeasible executions of f. To solve this problem, we have developed a concolic unit testing technique, CONBRIO. To provide realistic context to f, it constructs an extended unit of f that consists of f and functions closely relevant to f. Also, CONBRIO filters out false alarms by checking the feasibility of the corresponding symbolic execution path with regard to f's symbolic calling contexts, obtained by combining the symbolic execution paths of f's closely related predecessor functions. In experiments on the crash bugs of 15 real-world C programs, CONBRIO shows both high bug detection ability (i.e., 91.0% of the target bugs detected) and high precision (i.e., a true-to-false alarm ratio of 1:4.5). CONBRIO also detects 14 new bugs in 9 target C programs studied in papers on crash bug detection techniques.
{"title":"Precise Concolic Unit Testing of C Programs Using Extended Units and Symbolic Alarm Filtering","authors":"Yunho Kim, Yunja Choi, Moonzoo Kim","doi":"10.1145/3180155.3180253","DOIUrl":"https://doi.org/10.1145/3180155.3180253","url":null,"abstract":"Automated unit testing reduces manual effort to write unit test drivers/stubs and generate unit test inputs. However, automatically generated unit test drivers/stubs raise false alarms because they often over-approximate real contexts of a target function f and allow infeasible executions off. To solve this problem, we have developed a concolic unit testing technique CONBRIO. To provide realistic context to f, it constructs an extended unit of f that consists of f and closely relevant functions to f. Also, CONBRIO filters out a false alarm by checking feasibility of a corresponding symbolic execution path with regard to f 's symbolic calling contexts obtained by combining symbolic execution paths of f 's closely related predecessor functions. In the experiments on the crash bugs of 15 real-world C programs, CONBRIO shows both high bug detection ability (i.e. 91.0% of the target bugs detected) and high precision (i.e. a true to false alarm ratio is 1:4.5). Also, CONBRIO detects 14 new bugs in 9 target C programs studied in papers on crash bug detection techniques.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"53 1","pages":"315-326"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85087600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Castelein, M. Aniche, Mozhan Soltani, Annibale Panichella, A. Deursen
Database-centric systems rely heavily on SQL queries to manage and manipulate their data. These SQL commands can range from very simple selections to queries involving several tables, subqueries, and grouping operations. And, as with any important piece of code, developers should properly test SQL queries. To test a SQL query completely, developers need to create test data that exercise all possible coverage targets in the query, e.g., JOINs and WHERE predicates. This task can be challenging and time-consuming for complex queries. Previous studies have modeled test data generation as a constraint satisfaction problem and, with the help of SAT solvers, generated the required data. However, such approaches have strong limitations, such as partial support for queries with JOINs, subqueries, and strings (which are commonly used in SQL queries). In this paper, we model test data generation for SQL queries as a search-based problem. We then devise and evaluate three different approaches based on random search, biased random search, and genetic algorithms (GAs). The GA, in particular, uses a fitness function based on information extracted from the physical query plan of a database engine as search guidance. We evaluate each approach on 2,135 queries extracted from three open source systems and one industrial software system. Our results show that the GA is able to completely cover 98.6% of all queries in the dataset, requiring only a few seconds per query. Moreover, it does not suffer from the limitations affecting state-of-the-art techniques.
{"title":"Search-Based Test Data Generation for SQL Queries","authors":"J. Castelein, M. Aniche, Mozhan Soltani, Annibale Panichella, A. Deursen","doi":"10.1145/3180155.3180202","DOIUrl":"https://doi.org/10.1145/3180155.3180202","url":null,"abstract":"Database-centric systems strongly rely on SQL queries to manage and manipulate their data. These SQL commands can range from very simple selections to queries that involve several tables, subqueries, and grouping operations. And, as with any important piece of code, developers should properly test SQL queries. In order to completely test a SQL query, developers need to create test data that exercise all possible coverage targets in a query, e.g., JOINs and WHERE predicates. And indeed, this task can be challenging and time-consuming for complex queries. Previous studies have modeled the problem of generating test data as a constraint satisfaction problem and, with the help of SAT solvers, generate the required data. However, such approaches have strong limitations, such as partial support for queries with JOINs, subqueries, and strings (which are commonly used in SQL queries). In this paper, we model test data generation for SQL queries as a search-based problem. Then, we devise and evaluate three different approaches based on random search, biased random search, and genetic algorithms (GAs). The GA, in particular, uses a fitness function based on information extracted from the physical query plan of a database engine as search guidance. We then evaluate each approach in 2,135 queries extracted from three open source software and one industrial software system. Our results show that GA is able to completely cover 98.6% of all queries in the dataset, requiring only a few seconds per query. Moreover, it does not suffer from the limitations affecting state-of-the art techniques.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"22 1","pages":"1220-1230"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83393138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Asher Trockman, Shurui Zhou, Christian Kästner, Bogdan Vasilescu
In fast-paced, reuse-heavy, and distributed software development, the transparency provided by social coding platforms like GitHub is essential to decision making. Developers infer the quality of projects using visible cues, known as signals, collected from personal profile and repository pages. We report on a large-scale, mixed-methods empirical study of npm packages that explores the emerging phenomenon of repository badges, with which maintainers signal underlying qualities about their projects to contributors and users. We investigate which qualities maintainers intend to signal and how well badges correlate with those qualities. After surveying developers, mining 294,941 repositories, and applying statistical modeling and time-series analyses, we find that non-trivial badges, which display the build status, test coverage, and up-to-dateness of dependencies, are mostly reliable signals, correlating with more tests, better pull requests, and fresher dependencies. Displaying such badges correlates with best practices, but the effects do not always persist.
{"title":"Adding Sparkle to Social Coding: An Empirical Study of Repository Badges in the npm Ecosystem","authors":"Asher Trockman, Shurui Zhou, Christian Kästner, Bogdan Vasilescu","doi":"10.1145/3183440.3190335","DOIUrl":"https://doi.org/10.1145/3183440.3190335","url":null,"abstract":"In fast-paced, reuse-heavy, and distributed software development, the transparency provided by social coding platforms like GitHub is essential to decision making. Developers infer the quality of projects using visible cues, known as signals, collected from personal profile and repository pages. We report on a large-scale, mixed-methods empirical study of npm packages that explores the emerging phenomenon of repository badges, with which maintainers signal underlying qualities about their projects to contributors and users. We investigate which qualities maintainers intend to signal and how well badges correlate with those qualities. After surveying developers, mining 294,941 repositories, and applying statistical modeling and time-series analyses, we find that non-trivial badges, which display the build status, test coverage, and up-to-dateness of dependencies, are mostly reliable signals, correlating with more tests, better pull requests, and fresher dependencies. Displaying such badges correlates with best practices, but the effects do not always persist.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"30 1","pages":"511-522"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81388868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, several automated GUI testing techniques for Android apps have been proposed. These tools have been shown to be effective in achieving good test coverage and in finding bugs without human intervention. Being automated, these tools typically run for a long time (say, several hours), either until they saturate test coverage or until a testing time budget expires. Thus, they are not good at generating concise regression test suites that could be used during incremental development of an app and in regression testing. We propose a heuristic technique that helps create a small regression test suite for an Android app from a large test suite generated by an automated Android GUI testing tool. The key insight behind our technique is that if we can identify and remove some common forms of redundancies introduced by existing automated GUI testing tools, then we can drastically lower the time required to minimize a GUI test suite. We have implemented our algorithm in a prototype tool called DetReduce. We applied DetReduce to several Android apps and found that it reduces a test suite by an average factor of 16.9× in size and 14.7× in running time. We also found that for a test suite generated by running SwiftHand and a randomized test generation algorithm for 8 hours, DetReduce minimizes the test suite in an average of 14.6 hours.
{"title":"DetReduce: Minimizing Android GUI Test Suites for Regression Testing","authors":"Wontae Choi, Koushik Sen, G. Necula, Wenyu Wang","doi":"10.1145/3180155.3180173","DOIUrl":"https://doi.org/10.1145/3180155.3180173","url":null,"abstract":"In recent years, several automated GUI testing techniques for Android apps have been proposed. These tools have been shown to be effective in achieving good test coverage and in finding bugs without human intervention. Being automated, these tools typically run for a long time (say, for several hours), either until they saturate test coverage or until a testing time budget expires. Thus, these automated tools are not good at generating concise regression test suites that could be used for testing in incremental development of the apps and in regression testing. We propose a heuristic technique that helps create a small regression test suite for an Android app from a large test suite generated by an automated Android GUI testing tool. The key insight behind our technique is that if we can identify and remove some common forms of redundancies introduced by existing automated GUI testing tools, then we can drastically lower the time required to minimize a GUI test suite. We have implemented our algorithm in a prototype tool called DetReduce. We applied DetReduce to several Android apps and found that DetReduce reduces a test-suite by an average factor of16.9× in size and14.7× in running time. We also found that for a test suite generated by running SwiftHand and a randomized test generation algorithm for 8 hours, DetReduce minimizes the test suite in an average of 14.6 hours.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"40 1","pages":"445-455"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81421152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This research investigates how object-oriented inheritance is actually used in practice. The aim is to close the gap between inheritance guidance and inheritance practice. It is based on detailed analyses of 2,440 inheritance hierarchies drawn from 14 open-source systems. The original contributions made by this paper concern pragmatic assessment of inheritance hierarchy design quality. The findings show that inheritance is very widely used but that most of the usage patterns that occur in practice are simple in structure. They are so simple that they may not require much inheritance-specific design consideration. On the other hand, the majority of classes defined using inheritance actually appear within a relatively small number of large, complex hierarchies. While some of these large hierarchies appear to have a consistent structure, often based on a problem domain model or a design pattern, others do not. Another contribution is that the quality of hierarchies, especially the large problematic ones, may be assessed in practice based on size, shape, and the definition and invocation of novel methods – all properties that can be detected automatically.
{"title":"Inheritance Usage Patterns in Open-Source Systems","authors":"Jamie Stevenson, M. Wood","doi":"10.1145/3180155.3180168","DOIUrl":"https://doi.org/10.1145/3180155.3180168","url":null,"abstract":"This research investigates how object-oriented inheritance is actually used in practice. The aim is to close the gap between inheritance guidance and inheritance practice. It is based on detailed analyses of 2440 inheritance hierarchies drawn from 14 open-source systems. The original contributions made by this paper concern pragmatic assessment of inheritance hierarchy design quality. The findings show that inheritance is very widely used but that most of the usage patterns that occur in practice are simple in structure. They are so simple that they may not require much inheritance-specific design consideration. On the other hand, the majority of classes defined using inheritance actually appear within a relatively small number of large, complex hierarchies. While some of these large hierarchies appear to have a consistent structure, often based on a problem domain model or a design pattern, others do not. Another contribution is that the quality of hierarchies, especially the large problematic ones, may be assessed in practice based on size, shape, and the definition and invocation of novel methods – all properties that can be detected automatically.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"58 1","pages":"245-255"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81621033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christopher Vendome, D. Germán, M. D. Penta, G. Bavota, M. Vásquez, D. Poshyvanyk
Software licenses dictate how source code or binaries can be modified, reused, and redistributed. In the case of open source projects, software licenses generally fit into two main categories, permissive and restrictive, depending on the degree to which they allow redistribution or modification under licenses different from the original one(s). Developers and organizations can also modify existing licenses, creating custom licenses with specific permissive/restrictive terms. Having such a variety of software licenses can create confusion among software developers, and can easily result in the introduction of licensing bugs, not necessarily limited to well-known license incompatibilities. In this work, we report a study aimed at characterizing licensing bugs by (i) building a catalog categorizing the types of licensing bugs developers and other stakeholders face, and (ii) understanding the implications licensing bugs have on the software projects they affect. The presented study is the result of the manual analysis of 1,200 discussions related to licensing bugs carried out in issue trackers and in five legal mailing lists of open source communities. Our findings uncover new types of licensing bugs not addressed in prior literature, and a detailed assessment of their implications.
{"title":"To Distribute or Not to Distribute? Why Licensing Bugs Matter","authors":"Christopher Vendome, D. Germán, M. D. Penta, G. Bavota, M. Vásquez, D. Poshyvanyk","doi":"10.1145/3180155.3180221","DOIUrl":"https://doi.org/10.1145/3180155.3180221","url":null,"abstract":"Software licenses dictate how source code or binaries can be modified, reused, and redistributed. In the case of open source projects, software licenses generally fit into two main categories, permissive and restrictive, depending on the degree to which they allow redistribution or modification under licenses different from the original one(s). Developers and organizations can also modify existing licenses, creating custom licenses with specific permissive/restrictive terms. Having such a variety of software licenses can create confusion among software developers, and can easily result in the introduction of licensing bugs, not necessarily limited to well-known license incompatibilities. In this work, we report a study aimed at characterizing licensing bugs by (i) building a catalog categorizing the types of licensing bugs developers and other stakeholders face, and (ii) understanding the implications licensing bugs have on the software projects they affect. The presented study is the result of the manual analysis of 1,200 discussions related to licensing bugs carried out in issue trackers and in five legal mailing lists of open source communities. Our findings uncover new types of licensing bugs not addressed in prior literature, and a detailed assessment of their implications.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"27 1","pages":"268-279"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82480580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rafael Dutra, Kevin Laeufer, J. Bachrach, Koushik Sen
In software and hardware testing, generating multiple inputs that satisfy a given set of constraints is an important problem, with applications in fuzz testing and stimulus generation. However, it is challenging to perform this sampling efficiently while generating a diverse set of inputs that satisfy the constraints. We developed a new algorithm, QuickSampler, which requires a small number of solver calls to produce millions of samples that satisfy the constraints with high probability. We evaluate QuickSampler on large real-world benchmarks and show that it can produce unique valid solutions orders of magnitude faster than other state-of-the-art sampling tools, with a distribution that is reasonably close to uniform in practice.
{"title":"Efficient Sampling of SAT Solutions for Testing","authors":"Rafael Dutra, Kevin Laeufer, J. Bachrach, Koushik Sen","doi":"10.1145/3180155.3180248","DOIUrl":"https://doi.org/10.1145/3180155.3180248","url":null,"abstract":"In software and hardware testing, generating multiple inputs which satisfy a given set of constraints is an important problem with applications in fuzz testing and stimulus generation. However, it is a challenge to perform the sampling efficiently, while generating a diverse set of inputs which satisfy the constraints. We developed a new algorithm QuickSampler which requires a small number of solver calls to produce millions of samples which satisfy the constraints with high probability. We evaluate QuickSampler on large real-world benchmarks and show that it can produce unique valid solutions orders of magnitude faster than other state-of-the-art sampling tools, with a distribution which is reasonably close to uniform in practice.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"4 1","pages":"549-559"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82515437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}