Wuxia Jin, Yuanfang Cai, R. Kazman, Gang Zhang, Q. Zheng, Ting Liu
Dependencies among software entities are the basis for much software analytics research and for many architecture analysis tools. Dynamically typed languages, such as Python, JavaScript, and Ruby, tolerate the lack of explicit type references, making certain syntactic dependencies indiscernible in source code. We call these possible dependencies, in contrast with the explicit dependencies that are directly referenced in source code. Type inference techniques have been widely studied and applied, but existing architecture analysis research and tools have not taken possible dependencies into consideration. The fundamental question is: to what extent do these missing possible dependencies impact architecture analysis? To answer this question, we conducted an empirical study of 105 Python projects, using type inference techniques to manifest possible dependencies. Our study revealed that the architectural impact of possible dependencies is substantial, higher than that of explicit dependencies: (1) file-level possible dependencies account for at least 27.93% of all file-level dependencies, and create dependency structures different from those formed by explicit dependencies alone, with an average difference of 30.71%; (2) adding possible dependencies significantly improves the precision (0.52%~14.18%), recall (31.73%~39.12%), and F1 scores (22.13%~32.09%) of capturing co-change relations; (3) on average, a file involved in possible dependencies influences 28% more files and 42% more dependencies within architectural sub-spaces than a file involved in only explicit dependencies; (4) on average, a file involved in possible dependencies consumes 32% more maintenance effort. Consequently, maintainability scores reported by existing tools make a system written in these dynamic languages appear to be better modularized than it actually is. This evidence strongly suggests that possible dependencies have a more significant impact than explicit dependencies on architecture quality, and that architecture analysis research and tools should assess, and even emphasize, the architectural impact of possible dependencies due to dynamic typing.
{"title":"Exploring the Architectural Impact of Possible Dependencies in Python Software","authors":"Wuxia Jin, Yuanfang Cai, R. Kazman, Gang Zhang, Q. Zheng, Ting Liu","doi":"10.1145/3324884.3416619","DOIUrl":"https://doi.org/10.1145/3324884.3416619","url":null,"abstract":"Dependencies among software entities are the basis for many software analytic research and architecture analysis tools. Dynamically typed languages, such as Python, JavaScript and Ruby, tolerate the lack of explicit type references, making certain syntactic dependencies indiscernible in source code. We call these possible dependencies, in contrast with the explicit dependencies that are directly referenced in source code. Type inference techniques have been widely studied and applied, but existing architecture analytic research and tools have not taken possible dependencies into consideration. The fundamental question is, to what extent will these missing possible dependencies impact the architecture analysis? To answer this question, we conducted an empirical study with 105 Python projects, using type inference techniques to manifest possible dependencies. Our study revealed that the architectural impact of possible dependencies is substantial-higher than that of explicit dependencies: (1) file-level possible dependencies account for at least 27.93% of all file-level dependencies, and create different dependency structures than that of explicit dependencies only, with an average difference of 30.71%; (2) adding possible dependencies significantly improves the precision (0.52%~14.18%), recall(31.73%~39.12%), and F1 scores (22.13%~32.09%) of capturing co-change relations; (3) on average, a file involved in possible dependencies influences 28% more files and 42% more dependencies within architectural sub-spaces than a file involved in just explicit dependencies; (4) on average, a file involved in possible dependencies consumes 32% more maintenance effort. Consequently, maintainability scores reported by existing tools make a system written in these dynamic languages appear to be better modularized than it actually is. This evidence stronglysuggests that possible dependencies have a more significant impact than explicit dependencies on architecture quality, that architecture analysis and tools should assess and even emphasize the architectural impact of possible dependencies due to dynamic typing.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126467851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A common problem in MPI programs is deadlock: a situation in which two or more processes are blocked indefinitely due to a circular communication dependency. Automatically detecting deadlock is difficult because of its schedule-dependent nature. This paper presents a predictive analysis for single-path MPI programs that observes a single program execution and then determines whether any other feasible schedule of the program can lead to a deadlock. The analysis works by identifying problematic communication patterns in a dependency graph to form a set of deadlock candidates. The deadlock candidates are filtered by an abstract machine and ultimately tested for reachability by an SMT solver with an efficient encoding for deadlock. This approach quickly yields a set of high-probability deadlock candidates useful for reasoning about complex codes, and in many cases achieves higher overall performance than other state-of-the-art analyses. The analysis is sound and complete for single-path MPI programs on a given input.
{"title":"A Predictive Analysis for Detecting Deadlock in MPI Programs","authors":"Yu Huang, B. Ogles, Eric Mercer","doi":"10.1145/3324884.3416588","DOIUrl":"https://doi.org/10.1145/3324884.3416588","url":null,"abstract":"A common problem in MPI programs is deadlock: when two or more processes are blocked indefinitely due to a circular communication dependency. Automatically detecting deadlock is difficult due to its schedule-dependent nature. This paper presents a predictive analysis for single-path MPI programs that observes a single program execution and then determines whether any other feasible schedule of the program can lead to a deadlock. The analysis works by identifying problematic communication patterns in a dependency graph to form a set of deadlock candidates. The deadlock candidates are filtered by an abstract machine and ultimately tested for reachability by an SMT solver with an efficient encoding for deadlock. This approach quickly yields a set of high probability deadlock candidates useful for reasoning about complex codes and yields higher performance overall in many cases compared to other state-of-the-art analyses. The analysis is sound and complete for single-path MPI programs on a given input.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126513406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated testing of mobile apps has received significant attention in recent years from researchers and practitioners alike. In this paper, we report on the largest empirical study to date aimed at understanding the test automation culture prevalent among mobile app developers. We systematically examined more than 3.5 million repositories on GitHub and identified more than 12,000 non-trivial, real-world Android apps. We then analyzed these non-trivial apps to investigate (1) the prevalence of test automation; (2) the working habits of mobile app developers with regard to automated testing; and (3) the correlation between the adoption of test automation and the popularity of projects. Among other findings, we found that (1) only 8% of the mobile app development projects leverage automated testing practices; (2) developers tend to follow the same test automation practices across projects; and (3) popular projects, measured in terms of the number of contributors, stars, and forks on GitHub, are more likely to adopt test automation practices. To understand the rationale behind our observations, we further conducted a survey with 148 professional and experienced developers contributing to the subject apps. Our findings shed light on current practices and future research directions pertaining to test automation for mobile app development.
{"title":"Test Automation in Open-Source Android Apps: A Large-Scale Empirical Study","authors":"Jun-Wei Lin, Navid Salehnamadi, S. Malek","doi":"10.1145/3324884.3416623","DOIUrl":"https://doi.org/10.1145/3324884.3416623","url":null,"abstract":"Automated testing of mobile apps has received significant attention in recent years from researchers and practitioners alike. In this paper, we report on the largest empirical study to date, aimed at understanding the test automation culture prevalent among mobile app developers. We systematically examined more than 3.5 million repositories on GitHub and identified more than 12,000 non-trivial and real-world Android apps. We then analyzed these non-trivial apps to investigate (1) the prevalence of adoption of test automation; (2) working habits of mobile app developers in regards to automated testing; and (3) the correlation between the adoption of test automation and the popularity of projects. Among others, we found that (1) only 8% of the mobile app development projects leverage automated testing practices; (2) developers tend to follow the same test automation practices across projects; and (3) popular projects, measured in terms of the number of contributors, stars, and forks on GitHub, are more likely to adopt test automation practices. To understand the rationale behind our observations, we further conducted a survey with 148 professional and experienced developers contributing to the subject apps. Our findings shed light on the current practices and future research directions pertaining to test automation for mobile app development.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131369775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jupyter notebooks, documents that contain live code, equations, visualizations, and narrative text, are now among the most popular means to compute, present, discuss, and disseminate scientific findings. In principle, Jupyter notebooks should make it easy to reproduce and extend scientific computations and their findings; in practice, this is not the case. The individual code cells in a Jupyter notebook can be executed in any order, with identifier usages preceding their definitions and results preceding their computations. In a sample of 936 published notebooks that would be executable in principle, we found that 73% of them would not be reproducible with straightforward approaches, requiring humans to infer (and often guess) the order in which the authors created the cells. In this paper, we present an approach to (1) automatically satisfy dependencies between code cells to reconstruct possible execution orders of the cells; and (2) instrument code cells to mitigate the impact of non-reproducible statements (e.g., calls to random functions) in Jupyter notebooks. Our Osiris prototype takes a notebook as input and outputs the possible execution schemes that reproduce the exact notebook results. In our sample, Osiris was able to reconstruct such schemes for 82.23% of all executable notebooks, more than three times better than the state of the art; the resulting reordered code is valid program code and thus available for further testing and analysis.
{"title":"Assessing and Restoring Reproducibility of Jupyter Notebooks","authors":"Jiawei Wang, Tzu-yang Kuo, Li Li, A. Zeller","doi":"10.1145/3324884.3416585","DOIUrl":"https://doi.org/10.1145/3324884.3416585","url":null,"abstract":"Jupyter notebooks-documents that contain live code, equations, visualizations, and narrative text-now are among the most popular means to compute, present, discuss and disseminate scientific findings. In principle, Jupyter notebooks should easily allow to reproduce and extend scientific computations and their findings; but in practice, this is not the case. The individual code cells in Jupyter notebooks can be executed in any order, with identifier usages preceding their definitions and results preceding their computations. In a sample of 936 published notebooks that would be executable in principle, we found that 73% of them would not be reproducible with straightforward approaches, requiring humans to infer (and often guess) the order in which the authors created the cells. In this paper, we present an approach to (1) automatically satisfy dependencies between code cells to reconstruct possible execution orders of the cells; and (2) instrument code cells to mitigate the impact of non-reproducible statements (i.e., random functions) in Jupyter notebooks. Our Osiris prototype takes a notebook as input and outputs the possible execution schemes that reproduce the exact notebook results. In our sample, Osiris was able to reconstruct such schemes for 82.23% of all executable notebooks, which has more than three times better than the state-of-the-art; the resulting reordered code is valid program code and thus available for further testing and analysis.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115726853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yinxing Xue, Mingliang Ma, Yun Lin, Yulei Sui, Jiaming Ye, T. Peng
Reentrancy bugs, one of the most severe classes of vulnerabilities in smart contracts, have caused huge financial losses in recent years. Researchers have proposed many approaches to detecting them. However, empirical studies have shown that these approaches suffer from undesirable false positives and false negatives when the code under detection involves interaction between multiple smart contracts. In this paper, we propose an accurate and efficient cross-contract reentrancy detection approach for practical use. Rather than designing rule-of-thumb heuristics, we conducted a large empirical study of 11,714 real-world contracts from Etherscan against three well-known general-purpose security tools for reentrancy detection, and manually summarized the reentrancy scenarios that state-of-the-art approaches cannot address. Based on this empirical evidence, we present Clairvoyance, a cross-function and cross-contract static analysis that detects reentrancy vulnerabilities in the real world with significantly higher accuracy. To reduce false negatives, we enable, for the first time, a cross-contract call chain analysis by tracking possibly tainted paths. To reduce false positives, we systematically summarized five major path protective techniques (PPTs) to support fast yet precise path feasibility checking. We implemented our approach and compared Clairvoyance with five state-of-the-art tools on 17,770 real-world contracts. The results show that Clairvoyance yields the best detection accuracy among all tools and also finds 101 previously unknown reentrancy vulnerabilities.
{"title":"Cross-Contract Static Analysis for Detecting Practical Reentrancy Vulnerabilities in Smart Contracts","authors":"Yinxing Xue, Mingliang Ma, Yun Lin, Yulei Sui, Jiaming Ye, T. Peng","doi":"10.1145/3324884.3416553","DOIUrl":"https://doi.org/10.1145/3324884.3416553","url":null,"abstract":"Reentrancy bugs, one of the most severe vulnerabilities in smart contracts, have caused huge financial loss in recent years. Researchers have proposed many approaches to detecting them. However, empirical studies have shown that these approaches suffer from undesirable false positives and false negatives, when the code under detection involves the interaction between multiple smart contracts. In this paper, we propose an accurate and efficient cross-contract reentrancy detection approach in practice. Rather than design rule-of-thumb heuristics, we conduct a large empirical study of 11714 real-world contracts from Etherscan against three well-known general-purpose security tools for reentrancy detection. We manually summarized the reentrancy scenarios where the state-of-the-art approaches cannot address. Based on the empirical evidence, we present Clairvoyance, a cross-function and cross-contract static analysis to detect reentrancy vulnerabilities in real world with significantly higher accuracy. To reduce false negatives, we enable, for the first time, a cross-contract call chain analysis by tracking possibly tainted paths. To reduce false positives, we systematically summarized five major path protective techniques (PPTs) to support fast yet precise path feasibility checking. We implemented our approach and compared Clairvoyance with five state-of-the-art tools on 17770 real-worlds contracts. The results show that Clairvoyance yields the best detection accuracy among all the five tools and also finds 101 unknown reentrancy vulnerabilities.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129965552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We address the problem of identifying performance changes in the evolution of configurable software systems. Finding optimal configurations and the configuration options that influence performance is already difficult, but in the light of software evolution, configuration-dependent performance changes may lurk in a potentially large number of different versions of the system. In this work, we combine two perspectives, variability and time, into a novel one. We propose an approach to identify configuration-dependent performance changes retrospectively across the variants and versions of a software system. In a nutshell, we iteratively sample pairs of configurations and versions and measure the respective performance, which we use to update a model of likelihoods for performance changes. Pursuing a search strategy that selectively and incrementally measures further pairs, we increase the accuracy of the identified change points related to configuration options and their interactions. We have conducted a number of experiments both on controlled synthetic data sets and in real-world scenarios with different software systems. Our evaluation demonstrates that we can pinpoint performance shifts to individual configuration options and interactions, as well as to the commits introducing change points, with high accuracy and at scale. Experiments on three real-world systems explore the effectiveness and practicality of our approach.
{"title":"Identifying Software Performance Changes Across Variants and Versions","authors":"Stefan Mühlbauer, S. Apel, Norbert Siegmund","doi":"10.1145/3324884.3416573","DOIUrl":"https://doi.org/10.1145/3324884.3416573","url":null,"abstract":"We address the problem of identifying performance changes in the evolution of configurable software systems. Finding optimal configurations and configuration options that influence performance is already difficult, but in the light of software evolution, configuration-dependent performance changes may lurk in a potentially large number of different versions of the system. In this work, we combine two perspectives-variability and time-into a novel perspective. We propose an approach to identify configuration-dependent performance changes retrospectively across the software variants and versions of a software system. In a nutshell, we iteratively sample pairs of configurations and versions and measure the respective performance, which we use to update a model of likelihoods for performance changes. Pursuing a search strategy with the goal of measuring selectively and incrementally further pairs, we increase the accuracy of identified change points related to configuration options and interactions. We have conducted a number of experiments both on controlled synthetic data sets as well as in real-world scenarios with different software systems. Our evaluation demonstrates that we can pinpoint performance shifts to individual configuration options and interactions as well as commits introducing change points with high accuracy and at scale. Experiments on three real-world systems explore the effectiveness and practicality of our approach.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126849241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The design and development of production-grade microservice backends is a tedious and error-prone task. In particular, such backends must be capable of handling all Functional Requirements (FRs) and all Non-Functional Requirements (NFRs), such as security, including all operational requirements, such as monitoring. This becomes even more difficult when there are many clients with different roles, each linked to diverse (non-)functional requirements, and when many existing services are involved that have to be considered in a consistent way. In this paper, we present a model-driven approach that automatically generates client-specific production-grade backends by incorporating previously expressed architectural knowledge from an interpretable specification of the targeted APIs and the NFRs.
CCS Concepts: • Software and its engineering → Abstraction, modeling and modularity; System modeling languages; Software architectures; Software development techniques.
{"title":"Automated Generation of Client-Specific Backends Utilizing Existing Microservices and Architectural Knowledge","authors":"Nils Wieber","doi":"10.1145/3324884.3415283","DOIUrl":"https://doi.org/10.1145/3324884.3415283","url":null,"abstract":"The design and development of production-grade microservice backends is a tedious and error-prone task. In particular, they must be capable of handling all Functional Requirements (FRs) and all Non-Functional Requirements (NFRs) (like security) including all operational requirements (like monitoring). This becomes even more difficult if there are many clients with different roles, linked to diverse (non-)functional requirements and many existing services are involved, which have to consider these in a consistent way. In this paper we present a model-driven approach that automatically generates client-specific production-grade backends by incorporating previously expressed architectural knowledge out of an interpretable specification of the targeted APIs and the NFRs.CCS CONCEPTS • Software and its engineering →Abstraction, modeling and modularity; System modeling languages; Software architectures; Software development techniques.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126131749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent advances in deep neural networks (DNNs) have led to object detectors (ODs) that can rapidly process pictures or videos and recognize the objects they contain. Despite the promising progress by industrial manufacturers such as Amazon and Google in commercializing deep learning-based ODs as a standard computer vision service, ODs, like traditional software, may still produce incorrect results. These errors, in turn, can lead to severe negative outcomes for users. For instance, an autonomous driving system that fails to detect pedestrians can cause accidents or even fatalities. However, despite their importance, principled, systematic methods for testing ODs do not yet exist. To fill this critical gap, we introduce the design and realization of Metaod, a metamorphic testing system specifically designed for ODs to effectively uncover erroneous detection results. To this end, we (1) synthesize natural-looking images by inserting extra object instances into background images, and (2) design metamorphic conditions asserting the equivalence of OD results between the original and synthetic images, after excluding the prediction results on the inserted objects. Metaod is designed as a streamlined workflow that performs object extraction, selection, and insertion. We develop a set of practical techniques to realize an effective workflow and to generate diverse, natural-looking images for testing. Evaluated on four commercial OD services and four pretrained models provided by the TensorFlow API, Metaod found tens of thousands of detection failures. To further demonstrate the practical usage of Metaod, we use the synthetic images that cause erroneous detection results to retrain the model. Our results show that the model performance is significantly improved, from an mAP score of 9.3 to an mAP score of 10.5.
{"title":"Metamorphic Object Insertion for Testing Object Detection Systems","authors":"Shuai Wang, Z. Su","doi":"10.1145/3324884.3416584","DOIUrl":"https://doi.org/10.1145/3324884.3416584","url":null,"abstract":"Recent advances in deep neural networks (DNNs) have led to object detectors (ODs) that can rapidly process pictures or videos, and recognize the objects that they contain. Despite the promising progress by industrial manufacturers such as Amazon and Google in commercializing deep learning-based ODs as a standard computer vision service, ODs — similar to traditional software — may still produce incorrect results. These errors, in turn, can lead to severe negative outcomes for the users. For instance, an autonomous driving system that fails to detect pedestrians can cause accidents or even fatalities. However, despite their importance, principled, systematic methods for testing ODs do not yet exist. To fill this critical gap, we introduce the design and realization of Metaod, a metamorphic testing system specifically designed for ODs to effectively uncover erroneous detection results. To this end, we (1) synthesize natural-looking images by inserting extra object instances into background images, and (2) design metamorphic conditions asserting the equivalence of OD results between the original and synthetic images after excluding the prediction results on the inserted objects. Metaod is designed as a streamlined workflow that performs object extraction, selection, and insertion. We develop a set of practical techniques to realize an effective workflow, and generate diverse, natural-looking images for testing. Evaluated on four commercial OD services and four pretrained models provided by the TensorFlow API, Metaod found tens of thousands of detection failures. To further demonstrate the practical usage of Metaod, we use the synthetic images that cause erroneous detection results to retrain the model. Our results show that the model performance is significantly increased, from an mAP score of 9.3 to an mAP score of 10.5.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121680177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Derakhshanfar, Xavier Devroey, Annibale Panichella, A. Zaidman, A. Deursen
Approaches for automatic crash reproduction aim to generate test cases that reproduce crashes starting from the crash stack traces. These tests help developers during debugging. One of the most promising techniques in this research field leverages search-based software testing to generate crash-reproducing test cases. In this paper, we introduce Botsing, an open-source search-based crash reproduction framework for Java. Botsing implements state-of-the-art and novel approaches for crash reproduction. The well-documented architecture of Botsing makes it an easy-to-extend framework, which can hence be used to implement new approaches that improve crash reproduction. We have applied Botsing to a wide range of crashes collected from open-source systems. Furthermore, we conducted a qualitative assessment of the crash-reproducing test cases with our industrial partners. In both cases, Botsing could reproduce a notable number of the given stack traces.
Demo video: https://www.youtube.com/watch?v=k6XaQjHqe48
Botsing website: https://stamp-project.github.io/botsing/
{"title":"Botsing, a Search-based Crash Reproduction Framework for Java","authors":"P. Derakhshanfar, Xavier Devroey, Annibale Panichella, A. Zaidman, A. Deursen","doi":"10.1145/3324884.3415299","DOIUrl":"https://doi.org/10.1145/3324884.3415299","url":null,"abstract":"Approaches for automatic crash reproduction aim to generate test cases that reproduce crashes starting from the crash stack traces. These tests help developers during their debugging practices. One of the most promising techniques in this research field leverages search-based software testing techniques for generating crash reproducing test cases. In this paper, we introduce Botsing, an open-source search-based crash reproduction framework for Java. Botsing implements state-of-the-art and novel approaches for crash reproduction. The well-documented architecture of Botsing makes it an easy-to-extend framework, and can hence be used for implementing new approaches to improve crash reproduction. We have applied Botsing to a wide range of crashes collected from open source systems. Furthermore, we conducted a qualitative assessment of the crash-reproducing test cases with our industrial partners. In both cases, Botsing could reproduce a notable amount of the given stack traces.Demo. video: https://www.youtube.com/watch?v=k6XaQjHqe48 Botsing website: https://stamp-project.github.io/botsing/","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115022287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, has emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenging because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbed BiLO-CPDP, which is the first to formulate automated CPDP model discovery from the perspective of bi-level programming. In particular, bi-level programming carries out the optimization at two nested levels in a hierarchical manner: the upper-level optimization routine searches for the right combination of transfer learner and classifier, while the nested lower-level optimization routine optimizes the corresponding hyper-parameter settings. To evaluate BiLO-CPDP, we conducted experiments on 20 projects to compare it with 21 existing CPDP techniques, as well as with its single-level optimization variant and with Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP achieves better prediction performance than all 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant in all cases. Furthermore, the unique bi-level formalization in BiLO-CPDP also permits allocating more budget to the upper level, which significantly boosts performance.
{"title":"BiLO-CPDP: Bi-Level Programming for Automated Model Discovery in Cross-Project Defect Prediction","authors":"Kewei Li, Zilin Xiang, Tao-An Chen, K. Tan","doi":"10.1145/3324884.3416617","DOIUrl":"https://doi.org/10.1145/3324884.3416617","url":null,"abstract":"Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, have emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenge because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbed BiLO-CPDP, which is the first of its kind to formulate the automated CPDP model discovery from the perspective of bi-level programming. In particular, the bi-level programming proceeds the optimization with two nested levels in a hierarchical manner. Specifically, the upper-level optimization routine is designed to search for the right combination of transfer learner and classifier while the nested lower-level optimization routine aims to optimize the corresponding hyper-parameter settings. To evaluate BiLO-CPDP, we conduct experiments on 20 projects to compare it with a total of 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP champions better prediction performance than all other 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant on all cases. Furthermore, the unique bi-level formalization in BiLO-CPDP also permits to allocate more budget to the upper-level, which significantly boosts the performance.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126974679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}