Speedoo: Prioritizing Performance Optimization Opportunities
Zhifei Chen, Bihuan Chen, Lu Xiao, Xiao Wang, Lin Chen, Yang Liu, Baowen Xu
Performance problems are widespread in modern software systems. Existing performance optimization techniques, including profiling-based and pattern-based techniques, usually fail to consider the architectural impact of methods, which can easily slow down overall system performance. This paper contributes a new approach, named Speedoo, to identify groups of methods that should be treated together and deserve high priority for performance optimization. The uniqueness of Speedoo lies in measuring and ranking the performance optimization opportunities of a method based on 1) its architectural impact and 2) its optimization potential. For each highly ranked method, we locate a respective Optimization Space based on 5 performance patterns generalized from empirical observations. The top-ranked optimization spaces are suggested to developers as potential optimization opportunities. Our evaluation on three real-life projects demonstrates that 18.52% to 42.86% of the methods in the top-ranked optimization spaces indeed underwent performance optimization in these projects, outperforming YourKit, a state-of-the-art profiling tool, by 2 to 3 times. An important implication of this study is that developers should treat methods in an optimization space as a group rather than as individuals in performance optimization. The proposed approach can provide guidelines and reduce developers' manual effort.
{"title":"Speedoo: Prioritizing Performance Optimization Opportunities","authors":"Zhifei Chen, Bihuan Chen, Lu Xiao, Xiao Wang, Lin Chen, Yang Liu, Baowen Xu","doi":"10.1145/3180155.3180229","DOIUrl":"https://doi.org/10.1145/3180155.3180229","url":null,"abstract":"Performance problems widely exist in modern software systems. Existing performance optimization techniques, including profiling-based and pattern-based techniques, usually fail to consider the architectural impacts among methods that easily slow down the overall system performance. This paper contributes a new approach, named Speedoo, to identify groups of methods that should be treated together and deserve high priorities for performance optimization. The uniqueness of Speedoo is to measure and rank the performance optimization opportunities of a method based on 1) the architectural impact and 2) the optimization potential. For each highly ranked method, we locate a respective Optimization Space based on 5 performance patterns generalized from empirical observations. The top ranked optimization spaces are suggested to developers as potential optimization opportunities. Our evaluation on three real-life projects has demonstrated that 18.52% to 42.86% of methods in the top ranked optimization spaces indeed undertook performance optimization in the projects. This outperforms one of the state-of-the-art profiling tools YourKit by 2 to 3 times. An important implication of this study is that developers should treat methods in an optimization space together as a group rather than as individuals in performance optimization. The proposed approach can provide guidelines and reduce developers' manual effort.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"26 1","pages":"811-821"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81245534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Evolution of Requirements Practices in Software Startups
Catarina Gralha, D. Damian, A. Wasserman, M. Goulão, João Araújo
We use Grounded Theory to study the evolution of requirements practices in 16 software startups as they grow and introduce new products and services. These startups operate in a dynamic environment, under significant time and market pressure, and rarely have time for systematic requirements analysis. Our theory describes the evolution of practice along six dimensions that emerged as relevant to their requirements activities: requirements artefacts, knowledge management, requirements-related roles, planning, technical debt and product quality. Beyond the relationships among the dimensions, our theory also explains the turning points that drove the evolution along these dimensions. These changes are reactive rather than planned, suggesting an overall pragmatic lightness, i.e., flexibility, in the startups' evolution towards engineering practices for requirements. Our theory organises knowledge about evolving requirements practice in maturing startups, and provides practical insights for startups assessing their own evolution as they face challenges to their growth. Our research also suggests that a startup's evolution along the six dimensions is not fundamental to its success, but has significant effects on its product, its employees and the company as a whole.
{"title":"The Evolution of Requirements Practices in Software Startups","authors":"Catarina Gralha, D. Damian, A. Wasserman, M. Goulão, João Araújo","doi":"10.1145/3180155.3180158","DOIUrl":"https://doi.org/10.1145/3180155.3180158","url":null,"abstract":"We use Grounded Theory to study the evolution of requirements practices of 16 software startups as they grow and introduce new products and services. These startups operate in a dynamic environment, with significant time and market pressure, and rarely have time for systematic requirements analysis. Our theory describes the evolution of practice along six dimensions that emerged as relevant to their requirements activities: requirements artefacts, knowledge management, requirements-related roles, planning, technical debt and product quality. Beyond the relationships among the dimensions, our theory also explains the turning points that drove the evolution along these dimensions. These changes are reactive, rather than planned, suggesting an overall pragmatic lightness, i.e., flexibility, in the startups' evolution towards engineering practices for requirements. Our theory organises knowledge about evolving requirements practice in maturing startups, and provides practical insights for startups' assessing their own evolution as they face challenges to their growth. Our research also suggests that a startup's evolution along the six dimensions is not fundamental to its success, but has significant effects on their product, their employees and the company.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"157 1","pages":"823-833"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86344810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FAST Approaches to Scalable Similarity-Based Test Case Prioritization
Breno Miranda, Emilio Cruciani, R. Verdecchia, A. Bertolino
Many test case prioritization criteria have been proposed for speeding up fault detection. Among them, similarity-based approaches give priority to the test cases that are most dissimilar from those already selected. However, the proposed criteria do not scale up to the test suites of many thousands or even millions of test cases found in modern industrial systems, so simple heuristics are used instead. We introduce the FAST family of test case prioritization techniques, which radically changes this landscape by borrowing algorithms commonly exploited in the big data domain to find similar items. FAST techniques provide scalable similarity-based test case prioritization in both white-box and black-box fashion. The results from experimentation on real-world C and Java subjects show that the fastest members of the family outperform other black-box approaches in efficiency with no significant impact on effectiveness, and also outperform white-box approaches, including greedy ones, if preparation time is not counted. A simulation study of scalability shows that one FAST technique can prioritize a million test cases in less than 20 minutes.
{"title":"FAST Approaches to Scalable Similarity-Based Test Case Prioritization","authors":"Breno Miranda, Emilio Cruciani, R. Verdecchia, A. Bertolino","doi":"10.1145/3180155.3180210","DOIUrl":"https://doi.org/10.1145/3180155.3180210","url":null,"abstract":"Many test case prioritization criteria have been proposed for speeding up fault detection. Among them, similarity-based approaches give priority to the test cases that are the most dissimilar from those already selected. However, the proposed criteria do not scale up to handle the many thousands or even some millions test suite sizes of modern industrial systems and simple heuristics are used instead. We introduce the FAST family of test case prioritization techniques that radically changes this landscape by borrowing algorithms commonly exploited in the big data domain to find similar items. FAST techniques provide scalable similarity-based test case prioritization in both white-box and black-box fashion. The results from experimentation on real world C and Java subjects show that the fastest members of the family outperform other black-box approaches in efficiency with no significant impact on effectiveness, and also outperform white-box approaches, including greedy ones, if preparation time is not counted. A simulation study of scalability shows that one FAST technique can prioritize a million test cases in less than 20 minutes.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"52 1","pages":"222-232"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85674237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-granular Conflict and Dependency Analysis in Software Engineering Based on Graph Transformation
Leen Lambers, D. Strüber, G. Taentzer, K. Born, J. Huebert
Conflict and dependency analysis (CDA) of graph transformation has been shown to be a versatile foundation for understanding interactions in many software engineering domains, including software analysis and design, model-driven engineering, and testing. In this paper, we propose a novel static CDA technique that is multi-granular in the sense that it can detect all conflicts and dependencies on multiple granularity levels. Specifically, we provide an efficient algorithm suite for computing binary, coarse-grained, and fine-grained conflicts and dependencies: binary granularity indicates the presence or absence of conflicts and dependencies, coarse granularity focuses on the root causes of conflicts and dependencies, and fine granularity shows each conflict and dependency in full detail. In doing so, we address specific performance and usability requirements that we identified in a literature survey of CDA usage scenarios. In an experimental evaluation, our algorithm suite computes conflicts and dependencies rapidly. Finally, we present a user study in which the participants found our coarse-grained results more understandable than the fine-grained ones reported by a state-of-the-art tool. Our overall contribution is twofold: (i) we significantly speed up the computation of fine-grained and binary CDA results, and (ii) we complement them with coarse-grained ones, which offer usability benefits for numerous use cases.
{"title":"Multi-granular Conflict and Dependency Analysis in Software Engineering Based on Graph Transformation","authors":"Leen Lambers, D. Strüber, G. Taentzer, K. Born, J. Huebert","doi":"10.1145/3180155.3180258","DOIUrl":"https://doi.org/10.1145/3180155.3180258","url":null,"abstract":"Conflict and dependency analysis (CDA) of graph transformation has been shown to be a versatile foundation for understanding interactions in many software engineering domains, including software analysis and design, model-driven engineering, and testing. In this paper, we propose a novel static CDA technique that is multi-granular in the sense that it can detect all conflicts and dependencies on multiple granularity levels. Specifically, we provide an efficient algorithm suite for computing binary, coarse-grained, and fine-grained conflicts and dependencies: Binary granularity indicates the presence or absence of conflicts and dependencies, coarse granularity focuses on root causes for conflicts and dependencies, and fine granularity shows each conflict and dependency in full detail. Doing so, we can address specific performance and usability requirements that we identified in a literature survey of CDA usage scenarios. In an experimental evaluation, our algorithm suite computes conflicts and dependencies rapidly. Finally, we present a user study, in which the participants found our coarse-grained results more understandable than the fine-grained ones reported in a state-of-the-art tool. Our overall contribution is twofold: (i) we significantly speed up the computation of fine-grained and binary CDA results and, (ii) complement them with coarse-grained ones, which offer usability benefits for numerous use cases.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"10 1","pages":"716-727"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79696837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
UFO: Predictive Concurrency Use-After-Free Detection
Jeff Huang
Use-After-Free (UAF) vulnerabilities are caused by a program operating on a dangling pointer and can be exploited to compromise critical software systems. While there have been many tools to mitigate UAF vulnerabilities, UAF remains one of the most common attack vectors. UAF is particularly difficult to detect in concurrent programs, in which a UAF may only occur under rare thread schedules. In this paper, we present a novel technique, UFO, that can precisely predict UAFs based on a single observed execution trace, with a provably higher detection capability than existing techniques and no false positives. The key technical advancement of UFO is an extended maximal thread causality model that captures the largest possible set of feasible traces that can be inferred from a given multithreaded execution trace. By formulating UAF detection as a constraint solving problem atop this model, we can explore a much larger thread scheduling space than classical happens-before based techniques. We have evaluated UFO on several large, complex real-world C/C++ programs, including Chromium and Firefox. UFO scales to real-world systems with hundreds of millions of events in their executions, and has detected a large number of real concurrency UAFs.
{"title":"UFO: Predictive Concurrency Use-After-Free Detection","authors":"Jeff Huang","doi":"10.1145/3180155.3180225","DOIUrl":"https://doi.org/10.1145/3180155.3180225","url":null,"abstract":"Use-After-Free (UAF) vulnerabilities are caused by the program operating on a dangling pointer and can be exploited to compromise critical software systems. While there have been many tools to mitigate UAF vulnerabilities, UAF remains one of the most common attack vectors. UAF is particularly di cult to detect in concurrent programs, in which a UAF may only occur with rare thread schedules. In this paper, we present a novel technique, UFO, that can precisely predict UAFs based on a single observed execution trace with a provably higher detection capability than existing techniques with no false positives. The key technical advancement of UFO is an extended maximal thread causality model that captures the largest possible set of feasible traces that can be inferred from a given multithreaded execution trace. By formulating UAF detection as a constraint solving problem atop this model, we can explore a much larger thread scheduling space than classical happens-before based techniques. We have evaluated UFO on several real-world large complex C/C++ programs including Chromium and FireFox. UFO scales to real-world systems with hundreds of millions of events in their execution and has detected a large number of real concurrency UAFs.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"11 1","pages":"609-619"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81904680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CCAligner: A Token Based Large-Gap Clone Detector
Pengcheng Wang, Jeffrey Svajlenko, Yanzhao Wu, Yun Xu, C. Roy
Copying code and then pasting it with a large number of edits is a common activity in software development, and the pasted code is a kind of complicated Type-3 clone. Due to the large number of edits, we consider such clones large-gap clones. Large-gap clones can reflect the extension of code, such as changes and improvements. Existing state-of-the-art clone detectors suffer from several limitations in detecting large-gap clones. In this paper, we propose a tool, CCAligner, which uses code windows with an allowed edit distance of e for matching to detect large-gap clones. In our approach, a novel e-mismatch index is designed, and an asymmetric similarity coefficient is used as the similarity measure. We thoroughly evaluate CCAligner both for large-gap clone detection and for general Type-1, Type-2 and Type-3 clone detection. The results show that CCAligner performs better than other competing tools in large-gap clone detection, and has the best execution time for 10 MLOC inputs with good precision and recall in general Type-1 to Type-3 clone detection. Compared with existing state-of-the-art tools, CCAligner is the best performing large-gap clone detection tool, and remains competitive with the best clone detectors in general Type-1, Type-2 and Type-3 clone detection.
{"title":"CCAligner: A Token Based Large-Gap Clone Detector","authors":"Pengcheng Wang, Jeffrey Svajlenko, Yanzhao Wu, Yun Xu, C. Roy","doi":"10.1145/3180155.3180179","DOIUrl":"https://doi.org/10.1145/3180155.3180179","url":null,"abstract":"Copying code and then pasting with large number of edits is a common activity in software development, and the pasted code is a kind of complicated Type-3 clone. Due to large number of edits, we consider the clone as a large-gap clone. Large-gap clone can reflect the extension of code, such as change and improvement. The existing state-of-the-art clone detectors suffer from several limitations in detecting large-gap clones. In this paper, we propose a tool, CCAligner, using code window that considers e edit distance for matching to detect large-gap clones. In our approach, a novel e-mismatch index is designed and the asymmetric similarity coefficient is used for similarity measure. We thoroughly evaluate CCAligner both for large-gap clone detection, and for general Type-1, Type-2 and Type-3 clone detection. The results show that CCAligner performs better than other competing tools in large-gap clone detection, and has the best execution time for 10MLOC input with good precision and recall in general Type-1 to Type-3 clone detection. Compared with existing state-of-the-art tools, CCAligner is the best performing large-gap clone detection tool, and remains competitive with the best clone detectors in general Type-1, Type-2 and Type-3 clone detection.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"47 1","pages":"1066-1077"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74521572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Open Source Barriers to Entry, Revisited: A Sociotechnical Perspective
Christopher J. Mendez, Hema Susmita Padala, Zoe Steine-Hanson, C. Hilderbrand, Amber Horvath, Charles Hill, L. Simpson, Nupoor Patil, A. Sarma, M. Burnett
Research has revealed that significant barriers exist when entering Open-Source Software (OSS) communities, and that women disproportionately experience such barriers. However, this research has focused mainly on social and cultural factors, ignoring the environment itself: the tools and infrastructure. To shed some light on how tools and infrastructure might factor into OSS barriers to entry, we conducted a field study with five teams of software professionals, who worked through five use-cases to analyze the tools and infrastructure used in their OSS projects. These software professionals found tool/infrastructure barriers in 7% to 71% of the use-case steps that they analyzed, most of which are tied to newcomer barriers established in the literature. Further, over 80% of the barrier types they found include attributes that are biased against women.
{"title":"Open Source Barriers to Entry, Revisited: A Sociotechnical Perspective","authors":"Christopher J. Mendez, Hema Susmita Padala, Zoe Steine-Hanson, C. Hilderbrand, Amber Horvath, Charles Hill, L. Simpson, Nupoor Patil, A. Sarma, M. Burnett","doi":"10.1145/3180155.3180241","DOIUrl":"https://doi.org/10.1145/3180155.3180241","url":null,"abstract":"Research has revealed that significant barriers exist when entering Open-Source Software (OSS) communities and that women disproportionately experience such barriers. However, this research has focused mainly on social/cultural factors, ignoring the environment itself — the tools and infrastructure. To shed some light onto how tools and infrastructure might somehow factor into OSS barriers to entry, we conducted a field study with five teams of software professionals, who worked through five use-cases to analyze the tools and infrastructure used in their OSS projects. These software professionals found tool/infrastructure barriers in 7% to 71% of the use-case steps that they analyzed, most of which are tied to newcomer barriers that have been established in the literature. Further, over 80% of the barrier types they found include attributes that are biased against women.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"41 1","pages":"1004-1015"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79064628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatically Generating Search Heuristics for Concolic Testing
Sooyoung Cha, Seongjoon Hong, Junhee Lee, Hakjoo Oh
We present a technique to automatically generate search heuristics for concolic testing. A key challenge in concolic testing is how to effectively explore the program's execution paths to achieve high code coverage in a limited time budget. Concolic testing employs a search heuristic to address this challenge, which favors exploring particular types of paths that are most likely to maximize the final coverage. However, manually designing a good search heuristic is nontrivial and typically ends up with suboptimal and unstable outcomes. The goal of this paper is to overcome this shortcoming of concolic testing by automatically generating search heuristics. We define a class of search heuristics, namely parameterized heuristics, and present an algorithm that efficiently finds an optimal heuristic for each subject program. Experimental results with open-source C programs show that our technique successfully generates search heuristics that significantly outperform existing manually-crafted heuristics in terms of branch coverage and bug-finding.
{"title":"Automatically Generating Search Heuristics for Concolic Testing","authors":"Sooyoung Cha, Seongjoon Hong, Junhee Lee, Hakjoo Oh","doi":"10.1145/3180155.3180166","DOIUrl":"https://doi.org/10.1145/3180155.3180166","url":null,"abstract":"We present a technique to automatically generate search heuristics for concolic testing. A key challenge in concolic testing is how to effectively explore the program's execution paths to achieve high code coverage in a limited time budget. Concolic testing employs a search heuristic to address this challenge, which favors exploring particular types of paths that are most likely to maximize the final coverage. However, manually designing a good search heuristic is nontrivial and typically ends up with suboptimal and unstable outcomes. The goal of this paper is to overcome this shortcoming of concolic testing by automatically generating search heuristics. We define a class of search heuristics, namely a parameterized heuristic, and present an algorithm that efficiently finds an optimal heuristic for each subject program. Experimental results with open-source C programs show that our technique successfully generates search heuristics that significantly outperform existing manually-crafted heuristics in terms of branch coverage and bug-finding.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"10 1","pages":"1244-1254"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74248160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Are Code Examples on an Online Q&A Forum Reliable?: A Study of API Misuse on Stack Overflow
Tianyi Zhang, Ganesha Upadhyaya, Anastasia Reinhardt, Hridesh Rajan, Miryung Kim
Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may contain potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse has three main causes: missing control constructs, missing or incorrect ordering of API calls, and incorrect guard conditions. Even posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. These results call for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples.
{"title":"Are Code Examples on an Online Q&A Forum Reliable?: A Study of API Misuse on Stack Overflow","authors":"Tianyi Zhang, Ganesha Upadhyaya, Anastasia Reinhardt, Hridesh Rajan, Miryung Kim","doi":"10.1145/3180155.3180260","DOIUrl":"https://doi.org/10.1145/3180155.3180260","url":null,"abstract":"Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may have potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse is caused by three main reasons—missing control constructs, missing or incorrect order of API calls, and incorrect guard conditions. Even the posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. This study result calls for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"20 1","pages":"886-896"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78586293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context-Aware Patch Generation for Better Automated Program Repair
Ming Wen, Junjie Chen, Rongxin Wu, Dan Hao, S. Cheung
The effectiveness of search-based automated program repair is limited in the number of correct patches that can be successfully generated. There are two causes of this limitation. First, the search space does not contain the correct patch. Second, the search space is huge, and therefore the correct patch cannot be generated (i.e., correct patches are either generated after incorrect plausible ones or not generated within the time budget). To increase the likelihood of including the correct patches in the search space, we propose to work at a fine granularity in terms of AST nodes. This, however, further enlarges the search space, increasing the challenge of finding the correct patches. We address this challenge by devising a strategy to prioritize candidate patches based on their likelihood of being correct. Specifically, we study the use of AST nodes' context information to estimate this likelihood. In this paper, we propose CapGen, a context-aware patch generation technique. The novelty that allows CapGen to produce more correct patches lies in three aspects: (1) its fine-granularity design enables it to find more correct fixing ingredients; (2) its context-aware prioritization of mutation operators enables it to constrain the search space; and (3) three context-aware models enable it to rank correct patches at high positions, before incorrect plausible ones. We evaluate CapGen on Defects4J and compare it with state-of-the-art program repair techniques. Our evaluation shows that CapGen outperforms and complements existing techniques. CapGen achieves a high precision of 84.00% and can prioritize the correct patches before 98.78% of the incorrect plausible ones.
{"title":"Context-Aware Patch Generation for Better Automated Program Repair","authors":"Ming Wen, Junjie Chen, Rongxin Wu, Dan Hao, S. Cheung","doi":"10.1145/3180155.3180233","DOIUrl":"https://doi.org/10.1145/3180155.3180233","url":null,"abstract":"The effectiveness of search-based automated program repair is limited in the number of correct patches that can be successfully generated. There are two causes of such limitation. First, the search space does not contain the correct patch. Second, the search space is huge and therefore the correct patch cannot be generated (ie correct patches are either generated after incorrect plausible ones or not generated within the time budget). To increase the likelihood of including the correct patches in the search space, we propose to work at a fine granularity in terms of AST nodes. This, however, will further enlarge the search space, increasing the challenge to find the correct patches. We address the challenge by devising a strategy to prioritize the candidate patches based on their likelihood of being correct. Specifically, we study the use of AST nodes' context information to estimate the likelihood. In this paper, we propose CapGen, a context-aware patch generation technique. The novelty which allows CapGen to produce more correct patches lies in three aspects: (1) The fine-granularity design enables it to find more correct fixing ingredients; (2) The context-aware prioritization of mutation operators enables it to constrain the search space; (3) Three context-aware models enable it to rank correct patches at high positions before incorrect plausible ones. We evaluate CapGen on Defects4J and compare it with the state-of-the-art program repair techniques. Our evaluation shows that CapGen outperforms and complements existing techniques. CapGen achieves a high precision of 84.00% and can prioritize the correct patches before 98.78% of the incorrect plausible ones.","PeriodicalId":6560,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"1-11"},"PeriodicalIF":0.0,"publicationDate":"2018-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85447464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}