
2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE): Latest Papers

Good Things Come In Threes: Improving Search-based Crash Reproduction With Helper Objectives
P. Derakhshanfar, Xavier Devroey, A. Zaidman, A. Deursen, Annibale Panichella
Writing a test case that reproduces a reported software crash is a common practice for identifying the root cause of an anomaly in the software under test. However, this task is usually labor-intensive and time-consuming. Hence, evolutionary intelligence approaches have been successfully applied to assist developers during debugging by generating a test case that reproduces a reported crash. These approaches use a single fitness function called Crash Distance to guide the search process toward reproducing a target crash. Despite the reported achievements, these approaches fail to reproduce some crashes due to a lack of test diversity (premature convergence). In this study, we introduce a new approach, called MO-HO, that addresses this issue via multi-objectivization. In particular, we introduce two new Helper-Objectives for crash reproduction, namely test length (to minimize) and method sequence diversity (to maximize), in addition to Crash Distance. We assessed MO-HO using five multi-objective evolutionary algorithms (NSGA-II, SPEA2, PESA-II, MOEA/D, FEMO) on 124 non-trivial crashes stemming from open-source projects. Our results indicate that SPEA2 is the best-performing multi-objective algorithm for MO-HO. We evaluated this best-performing algorithm for MO-HO against the state of the art: the single-objective approach (Single-Objective Search) and the decomposition-based multi-objectivization approach (De-MO). Our results show that MO-HO reproduces five crashes that cannot be reproduced by the current state of the art. In addition, MO-HO improves the effectiveness (+10% and +8% in reproduction ratio) and the efficiency in 34.6% and 36% of crashes (i.e., significantly lower running time) compared to Single-Objective Search and De-MO, respectively. For some crashes, the improvements are very large, being up to +93.3% for reproduction ratio and −92% for the required running time.
DOI: https://doi.org/10.1145/3324884.3416643
Citations: 9
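The multi-objectivization idea in the MO-HO abstract above (Crash Distance and test length to minimize, method sequence diversity to maximize) can be illustrated with a small Pareto-dominance check. This is only a sketch under stated assumptions: the `Fitness` tuple, the `dominates` and `pareto_front` helpers, and the sample values are hypothetical and are not taken from the paper or its tooling.

```python
from typing import List, NamedTuple


class Fitness(NamedTuple):
    """Hypothetical fitness record for one candidate test case."""
    crash_distance: float  # to minimize (0 would mean the crash is reproduced)
    test_length: int       # to minimize
    diversity: float       # to maximize


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    # Negate diversity so that all three objectives are minimized uniformly.
    va = (a.crash_distance, a.test_length, -a.diversity)
    vb = (b.crash_distance, b.test_length, -b.diversity)
    no_worse = all(x <= y for x, y in zip(va, vb))
    strictly_better = any(x < y for x, y in zip(va, vb))
    return no_worse and strictly_better


def pareto_front(population: List[Fitness]) -> List[Fitness]:
    """Keep the candidates that no other candidate dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]


# Made-up values, purely for illustration.
population = [
    Fitness(0.5, 10, 0.2),
    Fitness(0.5, 8, 0.2),   # same distance and diversity, shorter test
    Fitness(0.9, 3, 0.7),   # far from the crash, but short and diverse
    Fitness(0.2, 20, 0.1),  # closest to the crash, but long
]
front = pareto_front(population)
```

With a single objective, only the candidate with the lowest Crash Distance would be preferred. With the helper objectives, the short and diverse candidate also stays on the front, which is one way a search can retain diversity.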
Broadening Horizons of Multilingual Static Analysis: Semantic Summary Extraction from C Code for JNI Program Analysis
Sungho Lee, Hyogun Lee, Sukyoung Ryu
Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their interoperation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid interoperation bugs. Similarly, active research on static analysis and bug detection has focused on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show the practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.
DOI: https://doi.org/10.1145/3324884.3416558
Citations: 25
Test Automation in Open-Source Android Apps: A Large-Scale Empirical Study
Jun-Wei Lin, Navid Salehnamadi, S. Malek
Automated testing of mobile apps has received significant attention in recent years from researchers and practitioners alike. In this paper, we report on the largest empirical study to date, aimed at understanding the test automation culture prevalent among mobile app developers. We systematically examined more than 3.5 million repositories on GitHub and identified more than 12,000 non-trivial and real-world Android apps. We then analyzed these non-trivial apps to investigate (1) the prevalence of adoption of test automation; (2) the working habits of mobile app developers with regard to automated testing; and (3) the correlation between the adoption of test automation and the popularity of projects. Among other findings, we found that (1) only 8% of the mobile app development projects leverage automated testing practices; (2) developers tend to follow the same test automation practices across projects; and (3) popular projects, measured in terms of the number of contributors, stars, and forks on GitHub, are more likely to adopt test automation practices. To understand the rationale behind our observations, we further conducted a survey with 148 professional and experienced developers contributing to the subject apps. Our findings shed light on the current practices and future research directions pertaining to test automation for mobile app development.
DOI: https://doi.org/10.1145/3324884.3416623
Citations: 21
Assessing and Restoring Reproducibility of Jupyter Notebooks
Jiawei Wang, Tzu-yang Kuo, Li Li, A. Zeller
Jupyter notebooks, documents that contain live code, equations, visualizations, and narrative text, are now among the most popular means to compute, present, discuss, and disseminate scientific findings. In principle, Jupyter notebooks should make it easy to reproduce and extend scientific computations and their findings; but in practice, this is not the case. The individual code cells in Jupyter notebooks can be executed in any order, with identifier usages preceding their definitions and results preceding their computations. In a sample of 936 published notebooks that would be executable in principle, we found that 73% of them would not be reproducible with straightforward approaches, requiring humans to infer (and often guess) the order in which the authors created the cells. In this paper, we present an approach to (1) automatically satisfy dependencies between code cells to reconstruct possible execution orders of the cells; and (2) instrument code cells to mitigate the impact of non-reproducible statements (i.e., random functions) in Jupyter notebooks. Our Osiris prototype takes a notebook as input and outputs the possible execution schemes that reproduce the exact notebook results. In our sample, Osiris was able to reconstruct such schemes for 82.23% of all executable notebooks, which is more than three times better than the state of the art; the resulting reordered code is valid program code and thus available for further testing and analysis.
DOI: https://doi.org/10.1145/3324884.3416585
Citations: 31
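The first step named in the Jupyter abstract above, satisfying dependencies between code cells to reconstruct a possible execution order, can be sketched with Python's standard `ast` module. This is a minimal illustration and not the Osiris algorithm: the helpers `names_in` and `execution_order` are hypothetical, scoping is ignored, and a cell that both reads and rebinds the same name (such as `x = x + 1`) is treated as self-contained.

```python
import ast
from typing import List, Set, Tuple


def names_in(cell: str) -> Tuple[Set[str], Set[str]]:
    """Return (defined, used) names for one code cell, ignoring scoping."""
    defined: Set[str] = set()
    used: Set[str] = set()
    for node in ast.walk(ast.parse(cell)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def execution_order(cells: List[str]) -> List[int]:
    """Greedily pick cells whose needed names were defined by earlier picks."""
    info = [names_in(c) for c in cells]
    # Names that no cell defines (builtins such as print) are not dependencies.
    all_defined: Set[str] = set().union(*(d for d, _ in info))
    available: Set[str] = set()
    remaining = list(range(len(cells)))
    order: List[int] = []
    while remaining:
        for i in remaining:
            defined, used = info[i]
            needs = (used - defined) & all_defined
            if needs <= available:
                order.append(i)
                available |= defined
                remaining.remove(i)
                break
        else:
            raise ValueError("no dependency-respecting order exists")
    return order


# Cells stored in an order that fails when run top to bottom.
cells = ["print(total)", "total = sum(values)", "values = [1, 2, 3]"]
order = execution_order(cells)
```

Here the stored order would raise a `NameError` on the first cell, while the reconstructed order runs `values`, then `total`, then the `print`. A real tool would also need to compare outputs against those saved in the notebook, which this sketch does not attempt.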
Cross-Contract Static Analysis for Detecting Practical Reentrancy Vulnerabilities in Smart Contracts
Yinxing Xue, Mingliang Ma, Yun Lin, Yulei Sui, Jiaming Ye, T. Peng
Reentrancy bugs, one of the most severe vulnerabilities in smart contracts, have caused huge financial losses in recent years. Researchers have proposed many approaches to detecting them. However, empirical studies have shown that these approaches suffer from undesirable false positives and false negatives when the code under detection involves the interaction between multiple smart contracts. In this paper, we propose an accurate and efficient cross-contract reentrancy detection approach for practical use. Rather than designing rule-of-thumb heuristics, we conduct a large empirical study of 11714 real-world contracts from Etherscan against three well-known general-purpose security tools for reentrancy detection. We manually summarized the reentrancy scenarios that the state-of-the-art approaches cannot address. Based on the empirical evidence, we present Clairvoyance, a cross-function and cross-contract static analysis to detect reentrancy vulnerabilities in the real world with significantly higher accuracy. To reduce false negatives, we enable, for the first time, a cross-contract call chain analysis by tracking possibly tainted paths. To reduce false positives, we systematically summarized five major path protective techniques (PPTs) to support fast yet precise path feasibility checking. We implemented our approach and compared Clairvoyance with five state-of-the-art tools on 17770 real-world contracts. The results show that Clairvoyance yields the best detection accuracy among all the tools compared and also finds 101 unknown reentrancy vulnerabilities.
DOI: https://doi.org/10.1145/3324884.3416553
Citations: 44
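For readers unfamiliar with the bug class named in the reentrancy abstract above, the following toy model shows why an external call made before a state update is dangerous. It is plain Python rather than a smart contract language, it illustrates the vulnerability only and not the Clairvoyance analysis, and every name in it (`VulnerableVault`, `run_attack`, the account labels) is hypothetical.

```python
from typing import Callable, Dict, List


class VulnerableVault:
    """Toy ledger that pays out before it clears the caller's balance."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}
        self.funds = 0

    def deposit(self, who: str, amount: int) -> None:
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def withdraw(self, who: str, receive: Callable[[int], None]) -> None:
        amount = self.balances.get(who, 0)
        if amount > 0 and self.funds >= amount:
            self.funds -= amount
            receive(amount)          # external call runs first ...
            self.balances[who] = 0   # ... and the balance is cleared too late


def run_attack(vault: VulnerableVault, max_reentries: int = 2) -> int:
    """Re-enter withdraw from inside the payout callback; return the total taken."""
    taken: List[int] = []

    def receive(amount: int) -> None:
        taken.append(amount)
        if len(taken) <= max_reentries:
            vault.withdraw("attacker", receive)  # balance is still uncleared

    vault.withdraw("attacker", receive)
    return sum(taken)


vault = VulnerableVault()
vault.deposit("victim", 20)
vault.deposit("attacker", 10)
stolen = run_attack(vault)
```

The attacker deposited 10 but withdraws 30, because each nested call still sees the original balance. Clearing the balance before calling `receive` would close this particular hole in the toy model.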
Identifying Software Performance Changes Across Variants and Versions
Stefan Mühlbauer, S. Apel, Norbert Siegmund
We address the problem of identifying performance changes in the evolution of configurable software systems. Finding optimal configurations and configuration options that influence performance is already difficult, but in the light of software evolution, configuration-dependent performance changes may lurk in a potentially large number of different versions of the system. In this work, we combine two perspectives, variability and time, into a novel perspective. We propose an approach to identify configuration-dependent performance changes retrospectively across the software variants and versions of a software system. In a nutshell, we iteratively sample pairs of configurations and versions and measure the respective performance, which we use to update a model of likelihoods for performance changes. Pursuing a search strategy with the goal of measuring selectively and incrementally further pairs, we increase the accuracy of identified change points related to configuration options and interactions. We have conducted a number of experiments both on controlled synthetic data sets and in real-world scenarios with different software systems. Our evaluation demonstrates that we can pinpoint performance shifts to individual configuration options and interactions as well as commits introducing change points with high accuracy and at scale. Experiments on three real-world systems explore the effectiveness and practicality of our approach.
DOI: https://doi.org/10.1145/3324884.3416573
Citations: 20
Automated Generation of Client-Specific Backends Utilizing Existing Microservices and Architectural Knowledge
Nils Wieber
The design and development of production-grade microservice backends is a tedious and error-prone task. In particular, they must be capable of handling all Functional Requirements (FRs) and all Non-Functional Requirements (NFRs) (like security), including all operational requirements (like monitoring). This becomes even more difficult if there are many clients with different roles, linked to diverse (non-)functional requirements, and many existing services are involved, which have to consider these in a consistent way. In this paper, we present a model-driven approach that automatically generates client-specific production-grade backends by incorporating previously expressed architectural knowledge out of an interpretable specification of the targeted APIs and the NFRs.
CCS Concepts: Software and its engineering → Abstraction, modeling and modularity; System modeling languages; Software architectures; Software development techniques.
DOI: https://doi.org/10.1145/3324884.3415283
Citations: 1
Metamorphic Object Insertion for Testing Object Detection Systems
Shuai Wang, Z. Su
Recent advances in deep neural networks (DNNs) have led to object detectors (ODs) that can rapidly process pictures or videos and recognize the objects that they contain. Despite the promising progress by industrial manufacturers such as Amazon and Google in commercializing deep learning-based ODs as a standard computer vision service, ODs, similar to traditional software, may still produce incorrect results. These errors, in turn, can lead to severe negative outcomes for the users. For instance, an autonomous driving system that fails to detect pedestrians can cause accidents or even fatalities. However, despite their importance, principled, systematic methods for testing ODs do not yet exist. To fill this critical gap, we introduce the design and realization of Metaod, a metamorphic testing system specifically designed for ODs to effectively uncover erroneous detection results. To this end, we (1) synthesize natural-looking images by inserting extra object instances into background images, and (2) design metamorphic conditions asserting the equivalence of OD results between the original and synthetic images after excluding the prediction results on the inserted objects. Metaod is designed as a streamlined workflow that performs object extraction, selection, and insertion. We develop a set of practical techniques to realize an effective workflow and generate diverse, natural-looking images for testing. Evaluated on four commercial OD services and four pretrained models provided by the TensorFlow API, Metaod found tens of thousands of detection failures. To further demonstrate the practical usage of Metaod, we use the synthetic images that cause erroneous detection results to retrain the model. Our results show that the model performance is significantly increased, from an mAP score of 9.3 to an mAP score of 10.5.
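The metamorphic condition described in the abstract above (detections on the synthetic image, minus those on the inserted object, should match detections on the original image) can be sketched as a small check. This is an illustration under stated assumptions, not Metaod's implementation: the `Detection` shape, the IoU-based matching, the 0.5 threshold, and the function names are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[str, Box]              # (label, bounding box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def relation_holds(original: List[Detection],
                   synthetic: List[Detection],
                   inserted_box: Box,
                   threshold: float = 0.5) -> bool:
    """Check that inserting an object left the other detections unchanged."""
    # Exclude predictions that sit on the inserted object.
    rest = [d for d in synthetic if iou(d[1], inserted_box) < threshold]
    if len(rest) != len(original):
        return False
    # Every original detection needs a same-label, well-overlapping match.
    return all(
        any(d[0] == e[0] and iou(d[1], e[1]) >= threshold for e in rest)
        for d in original
    )


original = [("person", (0.0, 0.0, 10.0, 20.0))]
inserted_box = (30.0, 30.0, 40.0, 40.0)
synthetic_ok = [("person", (0.0, 0.0, 10.0, 20.0)), ("dog", inserted_box)]
synthetic_bad = [("dog", inserted_box)]  # the person is no longer detected

ok = relation_holds(original, synthetic_ok, inserted_box)
bad = relation_holds(original, synthetic_bad, inserted_box)
```

A violated relation, as in the second case, flags a candidate detection failure without needing ground-truth labels for the background image, which is the appeal of metamorphic testing here.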
DOI: https://doi.org/10.1145/3324884.3416584 (published 2020-09-01)
Citations: 51
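The metamorphic condition described in the abstract above (detections on the original and synthetic images should agree once predictions on the inserted object are excluded) can be illustrated with a minimal sketch. This is not MetaOD's implementation: the `Detection` type, the IoU-based matching rule, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


class Detection(NamedTuple):
    label: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _covered(first: List[Detection], second: List[Detection], thr: float) -> bool:
    """True if every detection in `first` has a same-label match in `second`."""
    return all(
        any(d.label == e.label and iou(d.box, e.box) >= thr for e in second)
        for d in first
    )


def relation_holds(original: List[Detection], synthetic: List[Detection],
                   inserted_box: Box, thr: float = 0.5) -> bool:
    """Assumed metamorphic relation: after dropping predictions that overlap
    the inserted object, both detection sets must match each other."""
    remaining = [d for d in synthetic if iou(d.box, inserted_box) < thr]
    return _covered(original, remaining, thr) and _covered(remaining, original, thr)
```

Under these assumptions, a detector that reports a pedestrian on the original image but loses it once an unrelated object is inserted elsewhere makes `relation_holds` return `False`, which would be flagged as a detection failure.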
Botsing, a Search-based Crash Reproduction Framework for Java
P. Derakhshanfar, Xavier Devroey, Annibale Panichella, A. Zaidman, A. Deursen
Approaches for automatic crash reproduction aim to generate test cases that reproduce crashes starting from the crash stack traces. These tests help developers during their debugging practices. One of the most promising techniques in this research field leverages search-based software testing techniques for generating crash-reproducing test cases. In this paper, we introduce Botsing, an open-source search-based crash reproduction framework for Java. Botsing implements state-of-the-art and novel approaches for crash reproduction. The well-documented architecture of Botsing makes it an easy-to-extend framework, and it can hence be used for implementing new approaches to improve crash reproduction. We have applied Botsing to a wide range of crashes collected from open-source systems. Furthermore, we conducted a qualitative assessment of the crash-reproducing test cases with our industrial partners. In both cases, Botsing could reproduce a notable number of the given stack traces. Demo video: https://www.youtube.com/watch?v=k6XaQjHqe48 Botsing website: https://stamp-project.github.io/botsing/
DOI: https://doi.org/10.1145/3324884.3415299 (published 2020-08-31)
Citations: 16
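To make the idea of reproducing a crash "starting from the crash stack traces" more concrete, the sketch below parses a Java stack trace and checks whether an observed trace matches a target one up to a chosen frame. It is only an illustration and not Botsing's code or API: the function names, the regular expression, and the matching criterion (same exception type and identical top frames) are assumptions.

```python
import re
from typing import List, NamedTuple, Tuple


class Frame(NamedTuple):
    method: str  # fully qualified class and method name
    line: int    # source line reported in the trace


# Matches lines such as "    at com.example.Foo.bar(Foo.java:12)".
FRAME_RE = re.compile(r"^\s*at\s+([\w.$<>]+)\([\w$]+\.java:(\d+)\)\s*$")


def parse_trace(text: str) -> Tuple[str, List[Frame]]:
    """Split a stack trace into its exception type and its list of frames."""
    lines = text.strip().splitlines()
    exception = lines[0].split(":", 1)[0].strip()
    frames = []
    for line in lines[1:]:
        match = FRAME_RE.match(line)
        if match:  # frames without file/line info are skipped in this sketch
            frames.append(Frame(match.group(1), int(match.group(2))))
    return exception, frames


def reproduces(target: str, observed: str, target_frame: int) -> bool:
    """Assumed criterion: same exception type and the same first
    `target_frame` frames (frame 1 is the top of the stack)."""
    t_exc, t_frames = parse_trace(target)
    o_exc, o_frames = parse_trace(observed)
    if not 1 <= target_frame <= len(t_frames):
        raise ValueError("target_frame is outside the target stack trace")
    return t_exc == o_exc and t_frames[:target_frame] == o_frames[:target_frame]
```

A search-based tool would call a check of this kind on the exception thrown by each candidate test; choosing a deeper target frame makes the criterion stricter.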
BiLO-CPDP: Bi-Level Programming for Automated Model Discovery in Cross-Project Defect Prediction
Kewei Li, Zilin Xiang, Tao-An Chen, K. Tan
Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, has emerged as a promising way to predict software defects when the available data about the target project is insufficient. However, developing such a model is challenging because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbed BiLO-CPDP, which is the first of its kind to formulate the automated CPDP model discovery from the perspective of bi-level programming. In particular, bi-level programming performs the optimization with two nested levels in a hierarchical manner. Specifically, the upper-level optimization routine is designed to search for the right combination of transfer learner and classifier, while the nested lower-level optimization routine aims to optimize the corresponding hyper-parameter settings. To evaluate BiLO-CPDP, we conduct experiments on 20 projects to compare it with a total of 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP achieves better prediction performance than all 21 existing CPDP techniques on 70% of the projects, while being overwhelmingly superior to Auto-Sklearn and its single-level optimization variant in all cases. Furthermore, the unique bi-level formalization in BiLO-CPDP also permits allocating more budget to the upper level, which significantly boosts the performance.
DOI: https://doi.org/10.1145/3324884.3416617 (published 2020-08-31)
Citations: 30
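The two nested levels described in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the BiLO-CPDP algorithm: the upper level simply enumerates every (transfer learner, classifier) pair, the lower level is a plain random search over a single hyper-parameter in [0, 1), and the `score` callback stands in for training and evaluating a CPDP model.

```python
import random
from typing import Callable, List, Optional, Tuple

Combo = Tuple[str, str]  # (transfer learner name, classifier name)


def lower_level(combo: Combo, score: Callable[[Combo, float], float],
                budget: int, rng: random.Random) -> Tuple[Optional[float], float]:
    """Tune one hyper-parameter for a fixed combination by random search."""
    best_param, best_score = None, float("-inf")
    for _ in range(budget):
        param = rng.random()
        value = score(combo, param)
        if value > best_score:
            best_param, best_score = param, value
    return best_param, best_score


def upper_level(learners: List[str], classifiers: List[str],
                score: Callable[[Combo, float], float],
                budget: int, seed: int = 0) -> Tuple[Combo, Optional[float], float]:
    """Pick the combination whose tuned score is highest.

    Each candidate is judged only after the lower level has tuned it,
    which is the nesting that makes the formulation bi-level.
    """
    rng = random.Random(seed)
    best = None
    for learner in learners:
        for classifier in classifiers:
            combo = (learner, classifier)
            param, value = lower_level(combo, score, budget, rng)
            if best is None or value > best[2]:
                best = (combo, param, value)
    if best is None:
        raise ValueError("at least one learner and one classifier are required")
    return best
```

In this sketch `budget` is the number of lower-level evaluations per combination and should be at least 1. A realistic setup would replace the exhaustive upper-level loop with a search procedure and split a global budget between the two levels.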