
Latest publications from the 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Proving Termination by k-Induction
Jianhui Chen, Fei He
We propose a novel approach to proving the termination of imperative programs by k-induction. With our approach, the termination proving problem can be formalized as a k-inductive invariant synthesis task. On the one hand, k-induction uses weaker invariants than those required by the standard inductive approach. On the other hand, the base case of k-induction, which unrolls the program, can provide a stronger pre-condition for invariant synthesis. As a result, the termination arguments of our approach can be synthesized more efficiently than with the standard method. We implement a prototype of our k-inductive approach. The experimental results demonstrate the effectiveness and efficiency of our approach.
DOI: https://doi.org/10.1145/3324884.3418929 (published 2020-09-01)
Citations: 2
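The k-induction scheme the abstract describes can be illustrated on a toy finite transition system. The sketch below is not the paper's tool (which targets imperative programs and synthesizes termination arguments); the states, transitions, and property are illustrative assumptions chosen so that ordinary (1-)induction fails while 2-induction succeeds, because the base case rules out the spurious unreachable predecessor.

```python
# Toy k-induction check over an explicit finite transition system.
STATES = [0, 1, 2, 3]
INIT = [0]
T = {0: [1], 1: [0], 2: [2], 3: [2]}   # edge 3 -> 2 is unreachable "noise"

def P(s):
    return s != 2                       # the invariant to prove

def paths(k):
    """All state sequences with k transitions, starting anywhere."""
    result = [(s,) for s in STATES]
    for _ in range(k):
        result = [p + (t,) for p in result for t in T.get(p[-1], [])]
    return result

def k_induction(k):
    # Base case: P holds on every state reachable within k - 1 steps.
    frontier, seen = set(INIT), set()
    for _ in range(k):
        if not all(P(s) for s in frontier):
            return False
        seen |= frontier
        frontier = {t for s in frontier for t in T.get(s, [])} - seen
    # Inductive step: k consecutive P-states force P on the next state.
    return all(P(p[-1]) for p in paths(k) if all(P(s) for s in p[:-1]))
```

Here `k_induction(1)` fails: the unreachable edge 3 -> 2 breaks the step case. `k_induction(2)` succeeds, since no P-state can reach state 3, so no two consecutive P-states ever precede the bad state; this is the sense in which k-induction tolerates weaker invariants than plain induction.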
Summary-Based Symbolic Evaluation for Smart Contracts
Yu Feng, E. Torlak, R. Bodík
This paper presents Solar, a system for the automatic synthesis of adversarial contracts that exploit vulnerabilities in a victim smart contract. To make the synthesis tractable, we introduce a query language as well as summary-based symbolic evaluation, which significantly reduces the number of instructions that our synthesizer needs to evaluate symbolically, without compromising the precision of the vulnerability query. We encoded common vulnerabilities of smart contracts and evaluated Solar on the entire data set from Etherscan. Our experiments demonstrate the benefits of summary-based symbolic evaluation and show that Solar outperforms state-of-the-art smart contract analyzers TEETHER, Mythril, and Contract Fuzzer in terms of running time and precision.
DOI: https://doi.org/10.1145/3324884.3416646 (published 2020-09-01)
Citations: 13
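The core idea of summary-based symbolic evaluation is to replace instruction-by-instruction evaluation of a callee with a precomputed input-output summary. Below is a minimal sketch of that idea; the toy IR, the `times8` helper, and the summary table are hypothetical illustrations, not Solar's actual query language or EVM encoding. Symbolic values are modeled as plain strings.

```python
# Symbolic values are strings; 'add' builds a symbolic sum expression.
FUNCS = {
    "times8": [("add", "r", "arg", "arg"),    # r   = arg + arg
               ("add", "r", "r", "r"),        # r   = 4 * arg (symbolically)
               ("add", "ret", "r", "r")],     # ret = 8 * arg
}
SUMMARIES = {"times8": lambda x: f"(8*{x})"}  # precomputed input-output relation

def exec_add(ins, env):
    _, dst, a, b = ins
    env[dst] = f"({env[a]}+{env[b]})"

def run(program, use_summaries):
    env, steps = {"x": "x"}, 0
    for ins in program:
        if ins[0] == "call":
            _, fn, dst, src = ins
            if use_summaries:
                env[dst] = SUMMARIES[fn](env[src])   # one step per call
                steps += 1
            else:
                local = {"arg": env[src]}
                for body_ins in FUNCS[fn]:           # evaluate the whole body
                    exec_add(body_ins, local)
                    steps += 1
                env[dst] = local["ret"]
        else:
            exec_add(ins, env)
            steps += 1
    return env, steps

PROG = [("call", "times8", "y", "x"), ("add", "z", "y", "x")]
env_full, steps_full = run(PROG, use_summaries=False)
env_sum, steps_sum = run(PROG, use_summaries=True)
```

With the summary, the call costs one evaluation step instead of three, and the symbolic value stays compact; on real contracts such savings compound across thousands of instructions, which is the reduction the abstract reports.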
Detecting and Explaining Self-Admitted Technical Debts with Attention-based Neural Networks
Xin Wang
Self-Admitted Technical Debt (SATD) is a sub-type of technical debt, introduced to represent technical debts that developers intentionally take on during software development. While SATDs can yield short-term benefits, they often must be paid back later at a higher cost, e.g., by introducing bugs or increasing the complexity of the software. To cope with these issues, our community has proposed various machine learning-based approaches to detect SATDs. These approaches, however, are either not generic, typically requiring manual feature engineering, or do not provide effective means to explain the predicted outcomes. To that end, we propose to the community a novel approach, HATD (Hybrid Attention-based method for self-admitted Technical Debt detection), which detects and explains SATDs using attention-based neural networks. Through extensive experiments on 445,365 comments in 20 projects, we show that HATD is effective in detecting SATDs on both in-the-lab and in-the-wild datasets under both within-project and cross-project settings. HATD also outperforms the state-of-the-art approaches in detecting and explaining SATDs.
DOI: https://doi.org/10.1145/3324884.3416583 (published 2020-09-01)
Citations: 22
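The way attention weights can double as an explanation, pointing at the comment tokens that drove a prediction, can be sketched with plain dot-product attention over hand-crafted token features. The keyword list, two-dimensional "embedding", and query vector below are hypothetical stand-ins for HATD's learned components.

```python
import math

DEBT_HINTS = {"todo", "fixme", "hack", "workaround", "temporary"}

def embed(token):
    """Tiny hand-crafted stand-in for a learned token embedding."""
    t = token.lower().strip(":,.!")
    return (1.0 if t in DEBT_HINTS else 0.0, len(t) / 10.0)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(tokens, query=(2.0, 0.0)):
    """Rank tokens by attention weight; top tokens 'explain' a prediction."""
    scores = [sum(q * f for q, f in zip(query, embed(t))) for t in tokens]
    return sorted(zip(tokens, softmax(scores)), key=lambda p: -p[1])

ranked = attend("TODO: temporary hack until the parser is rewritten".split())
```

In this toy run the debt-marker tokens receive the highest attention weights, so the same scores that feed the classifier also serve as the human-readable explanation.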
Exploring the Architectural Impact of Possible Dependencies in Python Software
Wuxia Jin, Yuanfang Cai, R. Kazman, Gang Zhang, Q. Zheng, Ting Liu
Dependencies among software entities are the basis for much software analytics research and many architecture analysis tools. Dynamically typed languages, such as Python, JavaScript, and Ruby, tolerate the lack of explicit type references, making certain syntactic dependencies indiscernible in source code. We call these possible dependencies, in contrast with the explicit dependencies that are directly referenced in source code. Type inference techniques have been widely studied and applied, but existing architecture analysis research and tools have not taken possible dependencies into consideration. The fundamental question is: to what extent will these missing possible dependencies impact architecture analysis? To answer this question, we conducted an empirical study with 105 Python projects, using type inference techniques to manifest possible dependencies. Our study revealed that the architectural impact of possible dependencies is substantial, higher than that of explicit dependencies: (1) file-level possible dependencies account for at least 27.93% of all file-level dependencies, and create dependency structures different from those of explicit dependencies only, with an average difference of 30.71%; (2) adding possible dependencies significantly improves the precision (0.52%~14.18%), recall (31.73%~39.12%), and F1 scores (22.13%~32.09%) of capturing co-change relations; (3) on average, a file involved in possible dependencies influences 28% more files and 42% more dependencies within architectural sub-spaces than a file involved in just explicit dependencies; (4) on average, a file involved in possible dependencies consumes 32% more maintenance effort. Consequently, maintainability scores reported by existing tools make a system written in these dynamic languages appear better modularized than it actually is. This evidence strongly suggests that possible dependencies have a more significant impact than explicit dependencies on architecture quality, and that architecture analyses and tools should assess, and even emphasize, the architectural impact of possible dependencies introduced by dynamic typing.
DOI: https://doi.org/10.1145/3324884.3416619 (published 2020-09-01)
Citations: 7
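The distinction between explicit and possible dependencies can be sketched with Python's own `ast` module: explicit dependencies show up as import statements, while possible dependencies arise from duck-typed attribute accesses that may resolve to a definition in another file. The three in-memory "files" below are hypothetical, and a real analysis would add type inference to prune candidates, as the paper does.

```python
import ast

FILES = {
    "util.py": "def render(data):\n    return str(data)\n",
    "view.py": "import util\n\ndef show(d):\n    return util.render(d)\n",
    "main.py": "def run(obj):\n    return obj.render(42)\n",
}

def defs_in(src):
    """Names of functions defined in a source file."""
    return {n.name for n in ast.walk(ast.parse(src))
            if isinstance(n, ast.FunctionDef)}

def explicit_deps(src):
    """Dependencies visible as import statements."""
    deps = set()
    for n in ast.walk(ast.parse(src)):
        if isinstance(n, ast.Import):
            deps |= {a.name for a in n.names}
        elif isinstance(n, ast.ImportFrom) and n.module:
            deps.add(n.module)
    return deps

def possible_deps(src, others):
    """Duck-typed attribute accesses that *may* resolve to another file."""
    attrs = {n.attr for n in ast.walk(ast.parse(src))
             if isinstance(n, ast.Attribute)}
    return {fname for fname, other in others.items() if attrs & defs_in(other)}
```

Here `main.py` has no import at all, yet `obj.render(42)` may well dispatch into `util.py`; a tool that only counts imports would miss that edge entirely, which is exactly the blind spot the study quantifies.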
The Symptom, Cause and Repair of Workaround
Daohan Song, Hao Zhong, Li Jia
In software development, issue tracker systems are widely used to manage bug reports. In such a system, a bug report can be filed, diagnosed, assigned, and fixed. In the standard process, a bug can be resolved as fixed, invalid, duplicated, or won't fix. Although the above resolutions are well-defined and easy to understand, a bug report can also end with a lesser-known resolution: workaround. Compared with other resolutions, the definition of workarounds is more ambiguous. Beyond the problem reported in a bug report, the resolution of a workaround raises further questions. Some of these questions are important for users, especially programmers who build their projects upon others' code (e.g., libraries). Although some early studies have analyzed API workarounds, many research questions on workarounds are still open. For example, which bugs are resolved as workarounds? Why is a bug report resolved as a workaround? What are the repairs of workarounds? In this experience paper, we conduct the first empirical study to explore the above research questions. In particular, we analyzed 221 real workarounds collected from Apache projects. Our results lead to some interesting and useful answers to all the above questions. For example, we find that most bug reports are resolved as workarounds because their problems reside in libraries (24.43%), settings (18.55%), and clients (10.41%). Many of these bugs are difficult to fix fully and cleanly. As a late-breaking result, we can only briefly introduce our study, but we present a detailed plan to extend it to a full paper.
ACM Reference Format: Daohan Song, Hao Zhong, and Li Jia. 2020. The Symptom, Cause and Repair of Workaround. In 35th IEEE/ACM International Conference on Automated Software Engineering (ASE '20), September 21–25, 2020, Virtual Event, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3324884.3418910
DOI: https://doi.org/10.1145/3324884.3418910 (published 2020-09-01)
Citations: 4
A Predictive Analysis for Detecting Deadlock in MPI Programs
Yu Huang, B. Ogles, Eric Mercer
A common problem in MPI programs is deadlock: when two or more processes are blocked indefinitely due to a circular communication dependency. Automatically detecting deadlock is difficult due to its schedule-dependent nature. This paper presents a predictive analysis for single-path MPI programs that observes a single program execution and then determines whether any other feasible schedule of the program can lead to a deadlock. The analysis works by identifying problematic communication patterns in a dependency graph to form a set of deadlock candidates. The deadlock candidates are filtered by an abstract machine and ultimately tested for reachability by an SMT solver with an efficient encoding for deadlock. This approach quickly yields a set of high probability deadlock candidates useful for reasoning about complex codes and yields higher performance overall in many cases compared to other state-of-the-art analyses. The analysis is sound and complete for single-path MPI programs on a given input.
DOI: https://doi.org/10.1145/3324884.3416588 (published 2020-09-01)
Citations: 2
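At its core, the circular communication dependency the abstract refers to is a cycle in a wait-for graph: each blocked process points at the peer it expects a message from. The sketch below shows only that core check; the paper's actual analysis layers match-candidate enumeration, an abstract machine, and an SMT encoding on top of it.

```python
def blocked_cycle(wait_for):
    """wait_for maps a blocked process to the peer it expects a message from.
    Returns one circular wait as a list of processes, or None."""
    for start in wait_for:
        seen, p = [], start
        while p in wait_for and p not in seen:
            seen.append(p)
            p = wait_for[p]
        if p in seen:                       # walked back into the chain
            return seen[seen.index(p):]
    return None

# P0 waits on P1, P1 on P2, P2 on P0: a classic three-way deadlock.
deadlock = blocked_cycle({"P0": "P1", "P1": "P2", "P2": "P0"})

# P0 waits on P1, but P1 is not blocked: no circular wait.
ok = blocked_cycle({"P0": "P1"})
```

This check is cheap precisely because blocking point-to-point receives give each process at most one outgoing wait edge; the hard part the paper addresses is predicting which wait-for graphs are *feasible* under some other schedule of the observed execution.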
Problems and Opportunities in Training Deep Learning Software Systems: An Analysis of Variance
H. Pham, Shangshu Qian, Jiannan Wang, Thibaud Lutellier, Jonathan Rosenthal, Lin Tan, Yaoliang Yu, Nachiappan Nagappan
Deep learning (DL) training algorithms utilize nondeterminism to improve models' accuracy and training efficiency. Hence, multiple identical training runs (e.g., identical training data, algorithm, and network) produce different models with different accuracies and training times. In addition to these algorithmic factors, DL libraries (e.g., TensorFlow and cuDNN) introduce additional variance (referred to as implementation-level variance) due to parallelism, optimization, and floating-point computation. This work is the first to study the variance of DL systems and the awareness of this variance among researchers and practitioners. Our experiments on three datasets with six popular networks show large overall accuracy differences among identical training runs. Even after excluding weak models, the accuracy difference is 10.8%. In addition, implementation-level factors alone cause the accuracy difference across identical training runs to be up to 2.9%, the per-class accuracy difference to be up to 52.4%, and the training time difference to be up to 145.3%. All core libraries (TensorFlow, CNTK, and Theano) and low-level libraries (e.g., cuDNN) exhibit implementation-level variance across all evaluated versions. Our researcher and practitioner survey shows that 83.8% of the 901 participants are unaware of or unsure about any implementation-level variance. In addition, our literature survey shows that only 19.5±3% of papers in recent top software engineering (SE), artificial intelligence (AI), and systems conferences use multiple identical training runs to quantify the variance of their DL approaches. This paper raises awareness of DL variance and directs SE researchers to challenging tasks such as creating deterministic DL implementations to facilitate debugging and improving the reproducibility of DL software and results.
DOI: https://doi.org/10.1145/3324884.3416545 (published 2020-09-01)
Citations: 92
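The run-to-run variance the study measures can be reproduced with a deliberately tiny experiment: the data and algorithm are fixed, and only the RNG stream (weight initialization and example order) changes between "identical" runs. The perceptron and dataset below are illustrative assumptions, not the paper's setup.

```python
import random

def train_run(seed, epochs=5):
    """One 'identical' run: same data and algorithm, different RNG stream."""
    rng = random.Random(seed)
    data = [((i / 10, j / 10), 1 if i + j > 10 else 0)
            for i in range(10) for j in range(10)]
    # Nondeterministic weight initialization.
    w0, w1, b = rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(epochs):
        rng.shuffle(data)                  # nondeterministic example order
        for (x1, x2), y in data:
            pred = 1 if w0 * x1 + w1 * x2 + b > 0 else 0
            err = y - pred
            w0 += 0.01 * err * x1
            w1 += 0.01 * err * x2
            b += 0.01 * err
    return sum((1 if w0 * x1 + w1 * x2 + b > 0 else 0) == y
               for (x1, x2), y in data) / len(data)

accuracies = [train_run(seed) for seed in range(5)]
spread = max(accuracies) - min(accuracies)
```

With lucky seeds some runs may coincide; the accuracy spread across many such runs is exactly the quantity the paper reports (up to 10.8% even after excluding weak models), and the same protocol, repeated identical runs, is what it recommends for quantifying a DL approach's variance.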
On the Effectiveness of Unified Debugging: An Extensive Study on 16 Program Repair Systems
Samuel Benton, Xia Li, Yiling Lou, Lingming Zhang
Automated debugging techniques, including fault localization and program repair, have been studied for over a decade. However, the only existing connection between fault localization and program repair is that fault localization computes the potential buggy elements for program repair to patch. Recently, a pioneering work, ProFL, explored the idea of unified debugging to unify fault localization and program repair in the other direction for thefi rst time to boost both areas. More specifically, ProFL utilizes the patch execution results from one state-of-the-art repair system, PraPR, to help improve state-of-the-art fault localization. In this way, ProFL not only improves fault localization for manual repair, but also extends the application scope of automated repair to all possible bugs (not only the small ratio of bugs that can be automaticallyfi xed). However, ProFL only considers one APR system (i.e., PraPR), and it is not clear how other existing APR systems based on different designs contribute to unified debugging. In this work, we perform an extensive study of the unified-debugging approach on 16 state-of-the-art program repair systems for thefi rst time. Our experimental results on the widely studied Defects4J benchmark suite reveal various practical guidelines for unified debugging, such as (1) nearly all the studied 16 repair systems can positively contribute to unified debugging despite their varying repairing capabilities, (2) repair systems targeting multi-edit patches can bring extraneous noise into unified debugging, (3) repair systems with more executed/plausible patches tend to perform better for unified debugging, and (4) unified debugging effectiveness does not rely on the availability of correct patches in automated repair. Based on our results, we further propose an advanced unified debugging technique, UniDebug++, which can localize over 20% more bugs within Top-1 positions than state-of-the-art unified debugging technique, ProFL.
自动调试技术,包括故障定位和程序修复,已经研究了十多年。然而,故障定位和程序修复之间唯一存在的联系是故障定位计算程序修复要修补的潜在错误元素。最近,一项开创性的工作,ProFL,首次探索了统一调试的思想,将故障定位和程序修复在另一个方向上统一起来,从而促进了这两个领域的发展。更具体地说,ProFL利用来自最先进的修复系统PraPR的补丁执行结果来帮助改进最先进的故障定位。这样,ProFL不仅提高了人工修复的故障定位,而且将自动修复的应用范围扩展到所有可能的错误(而不仅仅是可以自动修复的一小部分错误)。但是,ProFL只考虑一个APR系统(即PraPR),并且不清楚基于不同设计的其他现有APR系统如何对统一调试做出贡献。在这项工作中,我们首次对16个最先进的程序维修系统进行了统一调试方法的广泛研究。我们在广泛研究的缺陷4j基准套件上的实验结果揭示了统一调试的各种实用指南,例如:(1)几乎所有研究的16个修复系统都可以对统一调试做出积极贡献,尽管它们的修复能力不同;(2)针对多编辑补丁的修复系统可能会给统一调试带来无关的噪音;(3)具有更多执行/可信补丁的修复系统往往在统一调试中表现更好。(4)统一调试的有效性不依赖于自动修复中正确补丁的可用性。基于我们的结果,我们进一步提出了一种先进的统一调试技术UniDebug++,它可以比最先进的统一调试技术ProFL多定位20%以上的Top-1位置的bug。
Samuel Benton, Xia Li, Yiling Lou, Lingming Zhang. "On the Effectiveness of Unified Debugging: An Extensive Study on 16 Program Repair Systems." 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). DOI: 10.1145/3324884.3416566
Citations: 25
Towards Robust Production Machine Learning Systems: Managing Dataset Shift
Hala Abdelkader
The advances in machine learning (ML) have stimulated the integration of its capabilities into software systems. However, there is a tangible gap between software engineering and machine learning practices that is delaying the progress of intelligent service development. Software organisations are devoting effort to adjusting software engineering processes and practices to facilitate the integration of machine learning models. Machine learning researchers, for their part, are focusing on improving the interpretability of machine learning models to support overall system robustness. Our research focuses on bridging this gap through a methodology that evaluates the robustness of machine-learning-enabled software engineering systems. In particular, this methodology will automate the evaluation of the robustness properties of software systems against dataset shift problems in ML. It will also feature a notification mechanism that facilitates the debugging of ML components.
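As a loose illustration of the kind of check such a methodology might automate, the sketch below flags a dataset shift when a production feature sample's mean drifts improbably far from the training distribution. The z-score test, threshold, and data are assumptions for illustration, not the paper's method:

```python
# Hypothetical covariate-shift check: compare a live sample's mean
# against training statistics and signal when the drift is improbable.
import math
import statistics

def detect_shift(train_values, live_values, z_threshold=3.0):
    """Return True if the live sample's mean is improbably far from
    the training mean (a crude dataset-shift signal)."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    live_mu = statistics.mean(live_values)
    # Standard error of the live sample's mean under training stats.
    se = sigma / math.sqrt(len(live_values))
    return abs(live_mu - mu) / se > z_threshold

train = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 0.98]
stable = [1.0, 0.97, 1.02, 1.01]
shifted = [2.0, 2.1, 1.9, 2.05]

print(detect_shift(train, stable))   # stable sample: no alert
print(detect_shift(train, shifted))  # drifted sample: notify for debugging
```

In a production system, a positive result would feed the notification mechanism the abstract mentions, prompting engineers to inspect the affected ML component.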
Hala Abdelkader. "Towards Robust Production Machine Learning Systems: Managing Dataset Shift." 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). DOI: 10.1145/3324884.3415281
Citations: 7
A Framework for Automated Test Mocking of Mobile Apps
M. Fazzini, Alessandra Gorla, A. Orso
Mobile apps interact with their environment extensively, and these interactions can complicate testing activities because test cases may need a complete environment to be executed. Interactions with the environment can also introduce test flakiness, for instance when the environment behaves in non-deterministic ways. For these reasons, it is common to create test mocks that eliminate the need for (part of) the environment to be present during testing. Manual mock creation, however, can be extremely time consuming and error-prone. Moreover, the generated mocks can typically only be used in the context of the specific tests for which they were created. To address these issues, we propose MOKA, a general framework for collecting and generating reusable test mocks in an automated way. MOKA leverages the ability to observe a large number of interactions between an application and its environment and uses an iterative approach to generate two alternative types of mocks with different reusability characteristics: advanced mocks generated through program synthesis (ideally) and basic record-replay-based mocks (as a fallback solution). In this paper, we describe the new ideas behind MOKA, its main characteristics, a preliminary empirical study, and a set of possible applications that could benefit from our framework.
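The record-replay fallback described above can be sketched minimally. The `GpsService` dependency and the `call` API are invented for illustration and are not MOKA's actual interface:

```python
# Hypothetical record-replay mock: in record mode, calls are forwarded
# to the real environment and logged; in replay mode, logged responses
# are returned without the environment being present.
class RecordReplayMock:
    def __init__(self, real=None):
        self.real = real   # real environment object (record mode only)
        self.log = {}      # (method, args) -> recorded response

    def call(self, method, *args):
        key = (method, args)
        if self.real is not None:                 # record mode
            self.log[key] = getattr(self.real, method)(*args)
        return self.log[key]                      # replay (or just-recorded)

# Invented environment dependency, e.g. a location service.
class GpsService:
    def position(self, unit):
        return (51.5, -0.12) if unit == "deg" else None

recorder = RecordReplayMock(real=GpsService())
recorder.call("position", "deg")         # interaction observed and logged

replayer = RecordReplayMock()            # no environment in the test run
replayer.log = recorder.log              # reuse the recorded interactions
print(replayer.call("position", "deg"))  # served from the log
```

A synthesis-based mock would go one step further and generalize the logged pairs into a small program, so it could also answer argument values never observed during recording.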
M. Fazzini, Alessandra Gorla, A. Orso. "A Framework for Automated Test Mocking of Mobile Apps." 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE). DOI: 10.1145/3324884.3418927
Citations: 6
Journal
2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)