Towards Smoother Library Migrations: A Look at Vulnerable Dependency Migrations at Function Level for npm JavaScript Packages (pp. 559-563)
Pub Date: 2018-09-01 | DOI: 10.1109/ICSME.2018.00067
R. Zapata, R. Kula, Bodin Chinthanet, T. Ishio, Ken-ichi Matsumoto, Akinori Ihara
It has become common practice for software projects to adopt third-party libraries, giving developers full access to functions that would otherwise take time and effort to create themselves. Regardless of the migration effort involved, developers are encouraged to maintain their library dependencies by updating any outdated dependency, so as to remain safe from potential threats such as vulnerabilities. Through a manual inspection of 60 client projects drawn from three cases of high-severity vulnerabilities, we investigate whether or not clients are really safe from these threats. Surprisingly, our early results show evidence that up to 73.3% of outdated clients were actually safe from the threat. This is the first work to confirm that analysis at the library level is indeed an overestimation. This result paves the way for future studies to empirically investigate and validate this phenomenon, and is a step toward smoother library migration for client developers.
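Conceptually, the function-level check the authors argue for boils down to asking whether a client actually reaches the vulnerable function, not merely whether it depends on a vulnerable release. A minimal sketch of that idea follows; the package and function names are illustrative assumptions, and a textual scan is only a crude stand-in for real call-graph analysis:

```python
import re
from pathlib import Path

# Hypothetical case: lodash releases before 4.17.5 carried a prototype-
# pollution flaw reachable via functions such as defaultsDeep. Depending
# on an affected release is not the same as calling the affected function,
# so we look for actual call sites in the client's own sources.
VULNERABLE_FUNCTIONS = {"defaultsDeep", "merge"}  # assumed entry points

def calls_vulnerable_function(client_dir: str) -> bool:
    """Crude textual proxy for function-level reachability: does any
    .js file in the client reference a vulnerable function?"""
    pattern = re.compile(r"\b(?:%s)\s*\(" % "|".join(VULNERABLE_FUNCTIONS))
    for js_file in Path(client_dir).rglob("*.js"):
        if pattern.search(js_file.read_text(errors="ignore")):
            return True
    return False  # outdated dependency, but likely safe from this threat
```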
{"title":"Towards Smoother Library Migrations: A Look at Vulnerable Dependency Migrations at Function Level for npm JavaScript Packages","authors":"R. Zapata, R. Kula, Bodin Chinthanet, T. Ishio, Ken-ichi Matsumoto, Akinori Ihara","doi":"10.1109/ICSME.2018.00067","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00067","url":null,"abstract":"It has become common practice for software projects to adopt third-party libraries, allowing developers full access to functions that otherwise will take time and effort to create them-selves. Regardless of migration effort involved, developers are encouraged to maintain their library dependencies by updating any outdated dependency, so as to remain safe from potential threats such as vulnerabilities. Through a manual inspection of a total of 60 client projects from three cases of high severity vulnerabilities, we investigate whether or not clients are really safe from these threats. Surprisingly, our early results show evidence that up to 73.3% of outdated clients were actually safe from the threat. This is the first work to confirm that analysis at the library level is indeed an overestimation. This result to pave the path for future studies to empirically investigate and validate this phenomena, and is towards aiding a smoother library migration for client developers.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"43 1","pages":"559-563"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85839084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Binary code clone analysis is an important technique with a wide range of applications in software engineering (e.g., plagiarism detection, bug detection). The main challenge of the topic lies in semantics-equivalent code transformations (e.g., optimization, obfuscation), which can alter the representation of binary code tremendously. Another challenge is the trade-off between detection accuracy and coverage. Unfortunately, existing techniques still rely on semantics-less code features that are susceptible to such transformations. Moreover, they adopt either a purely static or a purely dynamic approach to detect binary code clones, and thus cannot achieve high accuracy and coverage simultaneously. In this paper, we propose a semantics-based hybrid approach to detect binary clone functions. We execute a template binary function with its test cases, and emulate the execution of every target function for clone comparison using the runtime information migrated from that template function. Semantic signatures are extracted during the execution of the template function and the emulation of the target function. Lastly, a similarity score is calculated from the signatures to measure their likeness. We implement the approach in a prototype system, designated BinMatch, which analyzes IA-32 binary code on the Linux platform. We evaluate BinMatch on eight real-world projects compiled with different compilation configurations and commonly-used obfuscation methods, performing over 100 million function-pair comparisons in total. The experimental results show that BinMatch is robust to semantics-equivalent code transformations. Moreover, it not only covers all target functions for clone analysis, but also improves detection accuracy compared to state-of-the-art solutions.
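The final scoring step can be pictured as a set similarity over the recorded semantic signatures. The sketch below uses Jaccard similarity over toy signature sets; the signature contents and the exact similarity measure in BinMatch are assumptions here:

```python
def similarity(template_sig: set, target_sig: set) -> float:
    """Jaccard similarity between two semantic signatures, each modeled
    here as a set of observed runtime facts (return values, library
    calls, outputs) -- an assumed stand-in for BinMatch's signatures."""
    if not template_sig and not target_sig:
        return 1.0
    return len(template_sig & target_sig) / len(template_sig | target_sig)

# Toy usage: facts recorded while executing the template function and
# emulating a candidate target function on migrated runtime state.
template = {("ret", 42), ("call", "memcpy"), ("out", 0x10)}
target   = {("ret", 42), ("call", "memcpy"), ("out", 0x20)}
print(similarity(template, target))  # 0.5
```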
{"title":"BinMatch: A Semantics-Based Hybrid Approach on Binary Code Clone Analysis","authors":"Yikun Hu, Yuanyuan Zhang, Juanru Li, Hui Wang, Bodong Li, Dawu Gu","doi":"10.1109/ICSME.2018.00019","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00019","url":null,"abstract":"Binary code clone analysis is an important technique which has a wide range of applications in software engineering (e.g., plagiarism detection, bug detection). The main challenge of the topic lies in the semantics-equivalent code transformation (e.g., optimization, obfuscation) which would alter representations of binary code tremendously. Another challenge is the trade-off between detection accuracy and coverage. Unfortunately, existing techniques still rely on semantics-less code features which are susceptible to the code transformation. Besides, they adopt merely either a static or a dynamic approach to detect binary code clones, which cannot achieve high accuracy and coverage simultaneously. In this paper, we propose a semantics-based hybrid approach to detect binary clone functions. We execute a template binary function with its test cases, and emulate the execution of every target function for clone comparison with the runtime information migrated from that template function. The semantic signatures are extracted during the execution of the template function and emulation of the target function. Lastly, a similarity score is calculated from their signatures to measure their likeness. We implement the approach in a prototype system designated as BinMatch which analyzes IA-32 binary code on the Linux platform. We evaluate BinMatch with eight real-world projects compiled with different compilation configurations and commonly-used obfuscation methods, totally performing over 100 million pairs of function comparison. The experimental results show that BinMatch is robust to the semantics-equivalent code transformation. Besides, it not only covers all target functions for clone analysis, but also improves the detection accuracy comparing to the state-of-the-art solutions.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"104-114"},"PeriodicalIF":0.0,"publicationDate":"2018-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83920015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DRLgencert: Deep Learning-Based Automated Testing of Certificate Verification in SSL/TLS Implementations (pp. 48-58)
Pub Date: 2018-08-16 | DOI: 10.1109/ICSME.2018.00014
Chao Chen, Wenrui Diao, Yingpei Zeng, Shanqing Guo, Chengyu Hu
The Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols are the foundation of network security. Certificate verification in SSL/TLS implementations is vital and may become the "weak link" in the whole network ecosystem. Previous research has focused on the automated testing of certificate verification, mainly by generating massive numbers of certificates through randomly combining parts of seed certificates for fuzzing. Although the generated certificates can meet semantic constraints, the cost is quite heavy, and the randomness limits performance. To fill this gap, in this paper we propose DRLGENCERT, the first framework to apply deep reinforcement learning to the automated testing of certificate verification in SSL/TLS implementations. DRLGENCERT accepts ordinary certificates as input and outputs newly generated certificates that trigger discrepancies with high efficiency. Thanks to deep reinforcement learning, when generating certificates our framework can choose the best next action according to the result of a previous modification, instead of relying on simple random combinations. At the same time, we developed a set of new techniques to support the overall design, such as a new feature extraction method for X.509 certificates and fine-grained differential testing. We also implemented a prototype of DRLGENCERT and carried out a series of real-world experiments. The results show DRLGENCERT is quite efficient: we obtained 84,661 discrepancy-triggering certificates from 181,900 certificate seeds, i.e., around 46.5% effectiveness. We evaluated six popular SSL/TLS implementations: GnuTLS, MatrixSSL, MbedTLS, NSS, OpenSSL, and wolfSSL. DRLGENCERT successfully discovered 23 serious certificate verification flaws, most of them previously unknown.
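The control loop — mutate a certificate, differentially test it, reward discrepancy-triggering actions — can be sketched as below. This tabular stand-in replaces the paper's deep RL model, and the action names, `mutate` helper, and oracle are all illustrative assumptions:

```python
import random
from collections import defaultdict

# Assumed action set; DRLGENCERT's real agent operates on extracted
# X.509 feature vectors with a deep model rather than a value table.
ACTIONS = ["flip_ca_flag", "truncate_san", "expire_not_after", "swap_sig_alg"]

def mutate(cert, action):
    """Assumed helper: apply one structural modification to a certificate
    (modeled here as a plain list of applied edits)."""
    return cert + [action]

def differential_test(cert) -> bool:
    """Assumed oracle standing in for fine-grained differential testing
    across GnuTLS, OpenSSL, etc.; randomized purely for illustration."""
    return random.random() < 0.3

def generate(seed_cert, episodes=100, alpha=0.5, eps=0.1):
    """Reward actions whose mutations make implementations disagree,
    so later episodes favor them over random combination."""
    q = defaultdict(float)  # action -> estimated value
    for _ in range(episodes):
        action = (random.choice(ACTIONS) if random.random() < eps
                  else max(ACTIONS, key=lambda a: q[a]))
        mutated = mutate(seed_cert, action)
        reward = 1.0 if differential_test(mutated) else -0.1
        q[action] += alpha * (reward - q[action])
        if reward > 0:
            yield mutated

discrepancy_triggering = list(generate(seed_cert=[]))
```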
{"title":"DRLgencert: Deep Learning-Based Automated Testing of Certificate Verification in SSL/TLS Implementations","authors":"Chao Chen, Wenrui Diao, Yingpei Zeng, Shanqing Guo, Chengyu Hu","doi":"10.1109/ICSME.2018.00014","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00014","url":null,"abstract":"The Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols are the foundation of network security. The certificate verification in SSL/TLS implementations is vital and may become the \"weak link\" in the whole network ecosystem. In previous works, some research focused on the automated testing of certificate verification, and the main approaches rely on generating massive certificates through randomly combining parts of seed certificates for fuzzing. Although the generated certificates could meet the semantic constraints, the cost is quite heavy, and the performance is limited due to the randomness. To fill this gap, in this paper, we propose DRLGENCERT, the first framework of applying deep reinforcement learning to the automated testing of certificate verification in SSL/TLS implementations. DRLGENCERT accepts ordinary certificates as input and outputs newly generated certificates which could trigger discrepancies with high efficiency. Benefited by the deep reinforcement learning, when generating certificates, our framework could choose the best next action according to the result of a previous modification, instead of simple random combinations. At the same time, we developed a set of new techniques to support the overall design, like new feature extraction method for X.509 certificates, fine-grained differential testing, and so forth. Also, we implemented a prototype of DRLGENCERT and carried out a series of real-world experiments. The results show DRLGENCERT is quite efficient, and we obtained 84,661 discrepancy-triggering certificates from 181,900 certificate seeds, say around 46.5% effectiveness. Also, we evaluated six popular SSL/TLS implementations, including GnuTLS, MatrixSSL, MbedTLS, NSS, OpenSSL, and wolfSSL. DRLGENCERT successfully discovered 23 serious certificate verification flaws, and most of them were previously unknown.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"48-58"},"PeriodicalIF":0.0,"publicationDate":"2018-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82546620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gistable: Evaluating the Executability of Python Code Snippets on GitHub (pp. 217-227)
Pub Date: 2018-08-14 | DOI: 10.1109/ICSME.2018.00031
Eric Horton, Chris Parnin
Software developers create and share code online to demonstrate programming language concepts and programming tasks. Code snippets can be a useful way to explain and demonstrate a programming concept, but may not always be directly executable. A code snippet can contain parse errors, or fail to execute if the environment contains unmet dependencies. This paper presents an empirical analysis of the executable status of Python code snippets shared through the GitHub gist system, and the ability of developers familiar with software configuration to correctly configure and run them. We find that 75.6% of gists require non-trivial configuration to overcome missing dependencies, configuration files, reliance on a specific operating system, or some other environment configuration. Our study also suggests the natural assumption developers make about resource names when resolving configuration errors is correct less than half the time. We also present Gistable, a database and extensible framework built on GitHub's gist system, which provides executable code snippets to enable reproducible studies in software engineering. Gistable contains 10,259 code snippets, approximately 5,000 with a Dockerfile to configure and execute them without import error. Gistable is publicly available at https://github.com/gistable/gistable.
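A rough version of such an executability check — parse first, then run in a fresh interpreter and bucket the failure mode — might look like the sketch below. Gistable itself isolates snippets in Docker; running untrusted gists directly as done here is only reasonable for snippets you trust:

```python
import subprocess
import sys
from pathlib import Path

def classify_snippet(snippet_path: str, timeout: int = 10) -> str:
    """Bucket a snippet's executability status, roughly in the spirit of
    the paper's analysis (categories here are assumptions)."""
    source = Path(snippet_path).read_text()
    try:
        compile(source, snippet_path, "exec")  # catch parse errors first
    except SyntaxError:
        return "parse-error"
    try:
        proc = subprocess.run([sys.executable, snippet_path],
                              capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode == 0:
        return "success"
    if "ModuleNotFoundError" in proc.stderr or "ImportError" in proc.stderr:
        return "missing-dependency"   # needs environment configuration
    return "other-failure"            # e.g. missing files, OS assumptions
```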
{"title":"Gistable: Evaluating the Executability of Python Code Snippets on GitHub","authors":"Eric Horton, Chris Parnin","doi":"10.1109/ICSME.2018.00031","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00031","url":null,"abstract":"Software developers create and share code online to demonstrate programming language concepts and programming tasks. Code snippets can be a useful way to explain and demonstrate a programming concept, but may not always be directly executable. A code snippet can contain parse errors, or fail to execute if the environment contains unmet dependencies. This paper presents an empirical analysis of the executable status of Python code snippets shared through the GitHub gist system, and the ability of developers familiar with software configuration to correctly configure and run them. We find that 75.6% of gists require non-trivial configuration to overcome missing dependencies, configuration files, reliance on a specific operating system, or some other environment configuration. Our study also suggests the natural assumption developers make about resource names when resolving configuration errors is correct less than half the time. We also present Gistable, a database and extensible framework built on GitHub's gist system, which provides executable code snippets to enable reproducible studies in software engineering. Gistable contains 10,259 code snippets, approximately 5,000 with a Dockerfile to configure and execute them without import error. Gistable is publicly available at https://github.com/gistable/gistable.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"58 1","pages":"217-227"},"PeriodicalIF":0.0,"publicationDate":"2018-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75822211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
COBOL to Java and Newspapers Still Get Delivered (pp. 583-586)
Pub Date: 2018-08-10 | DOI: 10.1109/ICSME.2018.00055
A. D. Marco, Valentin Iancu, Ira Asinofsky
This paper is an experience report on migrating an American newspaper company's business-critical IBM mainframe application to Linux servers by automatically translating the application's source code from COBOL to Java and converting the mainframe data store from VSAM KSDS files to an Oracle relational database. The mainframe application had supported daily home delivery of the newspaper since 1979. It was in need of modernization in order to increase interoperability and enable future convergence with newer enterprise systems as well as to reduce operating costs. Testing the modernized application proved to be the most vexing area of work. This paper explains the process that was employed to test functional equivalence between the legacy and modernized applications, the main testing challenges, and lessons learned after having operated and maintained the modernized application in production over the last eight months. The goal of delivering a functionally equivalent system was achieved, but problems remained to be solved related to new feature development, business domain knowledge transfer, and recruiting new software engineers to work on the modernized application.
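The functional-equivalence testing described here amounts to feeding the legacy and modernized systems the same input batches and diffing normalized outputs. A minimal sketch under that assumption (record keys and the normalization are hypothetical):

```python
def equivalence_mismatches(legacy: dict, modern: dict) -> list:
    """Compare normalized, record-level outputs of the legacy (COBOL/VSAM)
    and modernized (Java/Oracle) systems for one input batch. Keys are
    assumed record ids; values are assumed normalized rows, since raw
    output files differ byte-for-byte across the two platforms."""
    return [key for key in sorted(set(legacy) | set(modern))
            if legacy.get(key) != modern.get(key)]

# Toy batch: one delivery record drifted during migration.
old = {"sub-1001": ("DAILY", "07:00"), "sub-1002": ("SUNDAY", "08:00")}
new = {"sub-1001": ("DAILY", "07:00"), "sub-1002": ("SUNDAY", "09:00")}
print(equivalence_mismatches(old, new))  # ['sub-1002']
```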
{"title":"COBOL to Java and Newspapers Still Get Delivered","authors":"A. D. Marco, Valentin Iancu, Ira Asinofsky","doi":"10.1109/ICSME.2018.00055","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00055","url":null,"abstract":"This paper is an experience report on migrating an American newspaper company's business-critical IBM mainframe application to Linux servers by automatically translating the application's source code from COBOL to Java and converting the mainframe data store from VSAM KSDS files to an Oracle relational database. The mainframe application had supported daily home delivery of the newspaper since 1979. It was in need of modernization in order to increase interoperability and enable future convergence with newer enterprise systems as well as to reduce operating costs. Testing the modernized application proved to be the most vexing area of work. This paper explains the process that was employed to test functional equivalence between the legacy and modernized applications, the main testing challenges, and lessons learned after having operated and maintained the modernized application in production over the last eight months. The goal of delivering a functionally equivalent system was achieved, but problems remained to be solved related to new feature development, business domain knowledge transfer, and recruiting new software engineers to work on the modernized application.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"50 1","pages":"583-586"},"PeriodicalIF":0.0,"publicationDate":"2018-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80928332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Clone Recommendation for Refactoring Based on the Present and the Past (pp. 115-126)
Pub Date: 2018-07-30 | DOI: 10.1109/ICSME.2018.00021
Ruru Yue, Zhe Gao, Na Meng, Yingfei Xiong, Xiaoyin Wang, J. D. Morgenthaler
When many clones are detected in software programs, not all clones are equally important to developers. To help developers refactor code and improve software quality, various tools have been built to recommend clone-removal refactorings based on past and present information, such as the cohesion degree of individual clones or the co-evolution relations of clone peers. The existence of these tools inspired us to build an approach that considers as many factors as possible to recommend clones more accurately. This paper introduces CREC, a learning-based approach that recommends clones by extracting features from the current status and past history of software projects. Given a set of software repositories, CREC first automatically extracts the clone groups historically refactored (R-clones) and those not refactored (NR-clones) to construct the training set. CREC extracts 34 features to characterize the content and evolution behaviors of individual clones, as well as the spatial, syntactical, and co-change relations of clone peers. With these features, CREC trains a classifier that recommends clones for refactoring. We designed the largest feature set thus far for clone recommendation, and performed an evaluation on six large projects. The results show that our approach suggested refactorings with 83% and 76% F-scores in the within-project and cross-project settings, respectively. CREC significantly outperforms a state-of-the-art similar approach on our data set, with the latter achieving 70% and 50% F-scores. We also compared the effectiveness of different factors and different learning algorithms.
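The learning step can be pictured with any off-the-shelf classifier over the extracted features. A toy scikit-learn sketch follows; the three feature columns are invented placeholders for the paper's 34 features, and the paper does not prescribe this particular model:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Each row is a clone group; the columns (similarity, group size,
# co-change count) are assumed stand-ins for CREC's real features.
X = [[0.82, 5, 3], [0.31, 2, 0], [0.77, 4, 2],
     [0.12, 3, 1], [0.90, 6, 4], [0.25, 2, 0]]
y = [1, 0, 1, 0, 1, 0]  # 1 = historically refactored (R-clone), 0 = NR-clone

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=3, scoring="f1").mean())
```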
{"title":"Automatic Clone Recommendation for Refactoring Based on the Present and the Past","authors":"Ruru Yue, Zhe Gao, Na Meng, Yingfei Xiong, Xiaoyin Wang, J. D. Morgenthaler","doi":"10.1109/ICSME.2018.00021","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00021","url":null,"abstract":"When many clones are detected in software programs, not all clones are equally important to developers. To help developers refactor code and improve software quality, various tools were built to recommend clone-removal refactorings based on the past and the present information, such as the cohesion degree of individual clones or the co-evolution relations of clone peers. The existence of these tools inspired us to build an approach that considers as many factors as possible to more accurately recommend clones. This paper introduces CREC, a learning-based approach that recommends clones by extracting features from the current status and past history of software projects. Given a set of software repositories, CREC first automatically extracts the clone groups historically refactored (R-clones) and those not refactored (NR-clones) to construct the training set. CREC extracts 34 features to characterize the content and evolution behaviors of individual clones, as well as the spatial, syntactical, and co-change relations of clone peers. With these features, CREC trains a classifier that recommends clones for refactoring. We designed the largest feature set thus far for clone recommendation, and performed an evaluation on six large projects. The results show that our approach suggested refactorings with 83% and 76% F-scores in the within-project and cross-project settings. CREC significantly outperforms a state-of-the-art similar approach on our data set, with the latter one achieving 70% and 50% F-scores. We also compared the effectiveness of different factors and different learning algorithms.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"115-126"},"PeriodicalIF":0.0,"publicationDate":"2018-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90352697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessing Test Case Prioritization on Real Faults and Mutants (pp. 240-251)
Pub Date: 2018-07-23 | DOI: 10.1109/ICSME.2018.00033
Qi Luo, Kevin Moran, D. Poshyvanyk, M. D. Penta
Test Case Prioritization (TCP) is an important component of regression testing, allowing for earlier detection of faults or helping to reduce testing time and cost. While several TCP approaches exist in the research literature, a growing number of studies have evaluated them against synthetic software defects, called mutants. Hence, it is currently unclear to what extent TCP performance on mutants would be representative of the performance achieved on real faults. To answer this fundamental question, we conduct the first empirical study comparing the performance of TCP techniques applied to both real-world and mutation faults. The context of our study includes eight well-studied TCP approaches, 35k+ mutation faults, and 357 real-world faults from five Java systems in the Defects4J dataset. Our results indicate that the relative performance of the studied TCP techniques on mutants may not strongly correlate with performance on real faults, depending upon attributes of the subject programs. This suggests that, in certain contexts, the best performing technique on a set of mutants may not be the best technique in practice when applied to real faults. We also illustrate that these correlations vary for mutants generated by different operators depending on whether chosen operators reflect typical faults of a subject program. This highlights the importance, particularly for TCP, of developing mutation operators tailored for specific program domains.
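Comparing TCP techniques across fault sets presupposes an effectiveness metric; APFD (Average Percentage of Faults Detected) is the standard choice, assumed here since the abstract does not name the paper's exact measures:

```python
def apfd(prioritized_tests: list, failing_tests: dict) -> float:
    """Average Percentage of Faults Detected for one test ordering.
    `failing_tests` maps each fault to the set of tests detecting it."""
    n, m = len(prioritized_tests), len(failing_tests)
    position = {t: i + 1 for i, t in enumerate(prioritized_tests)}
    first_detect = sum(min(position[t] for t in tests)
                       for tests in failing_tests.values())
    return 1 - first_detect / (n * m) + 1 / (2 * n)

# A technique is scored separately on mutants and on real (Defects4J)
# faults, and the resulting performance rankings are then correlated.
order = ["t3", "t1", "t2", "t4"]
print(apfd(order, {"fault1": {"t1"}, "fault2": {"t2", "t4"}}))  # 0.5
```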
{"title":"Assessing Test Case Prioritization on Real Faults and Mutants","authors":"Qi Luo, Kevin Moran, D. Poshyvanyk, M. D. Penta","doi":"10.1109/ICSME.2018.00033","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00033","url":null,"abstract":"Test Case Prioritization (TCP) is an important component of regression testing, allowing for earlier detection of faults or helping to reduce testing time and cost. While several TCP approaches exist in the research literature, a growing number of studies have evaluated them against synthetic software defects, called mutants. Hence, it is currently unclear to what extent TCP performance on mutants would be representative of the performance achieved on real faults. To answer this fundamental question, we conduct the first empirical study comparing the performance of TCP techniques applied to both real-world and mutation faults. The context of our study includes eight well-studied TCP approaches, 35k+ mutation faults, and 357 real-world faults from five Java systems in the Defects4J dataset. Our results indicate that the relative performance of the studied TCP techniques on mutants may not strongly correlate with performance on real faults, depending upon attributes of the subject programs. This suggests that, in certain contexts, the best performing technique on a set of mutants may not be the best technique in practice when applied to real faults. We also illustrate that these correlations vary for mutants generated by different operators depending on whether chosen operators reflect typical faults of a subject program. This highlights the importance, particularly for TCP, of developing mutation operators tailored for specific program domains.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"26 1","pages":"240-251"},"PeriodicalIF":0.0,"publicationDate":"2018-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76875584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Effective Reformulation of Query for Code Search Using Crowdsourced Knowledge and Extra-Large Data Analytics (pp. 473-484)
Pub Date: 2018-07-23 | DOI: 10.1109/ICSME.2018.00057
M. M. Rahman, C. Roy
Software developers frequently issue generic natural language queries for code search while using code search engines (e.g., GitHub native search, Krugle). Such queries often do not lead to any relevant results due to vocabulary mismatch problems. In this paper, we propose a novel technique that automatically identifies relevant and specific API classes from the Stack Overflow Q&A site for a programming task written as a natural language query, and then reformulates the query for improved code search. We first collect candidate API classes from Stack Overflow using pseudo-relevance feedback and two term weighting algorithms, and then rank the candidates using Borda count and the semantic proximity between query keywords and the API classes. The semantic proximity was determined by an analysis of 1.3 million questions and answers from Stack Overflow. Experiments using 310 code search queries report that our technique suggests relevant API classes with 48% precision and 58% recall, which are 32% and 48% higher, respectively, than the state-of-the-art. Comparisons with two state-of-the-art studies and three popular search engines (Google, Stack Overflow, and GitHub native search) report that our reformulated queries (1) outperform the queries of the state-of-the-art, and (2) significantly improve the code search results provided by these contemporary search engines.
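The rank-fusion step named in the abstract, Borda count, is simple to state: a candidate ranked r-th in a list of k candidates earns k - r points, and points are summed across lists. A small sketch (the candidate API names are illustrative):

```python
from collections import defaultdict

def borda_rank(rankings: list) -> list:
    """Fuse several candidate API-class rankings (e.g., one per term
    weighting algorithm) into a single order via Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        k = len(ranking)
        for r, api_class in enumerate(ranking):
            scores[api_class] += k - r  # higher rank earns more points
    return sorted(scores, key=scores.get, reverse=True)

# Toy fusion of two candidate lists mined from Stack Overflow posts:
tfidf_list = ["HttpURLConnection", "URL", "BufferedReader"]
other_list = ["URL", "HttpURLConnection", "Socket"]
print(borda_rank([tfidf_list, other_list]))
```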
{"title":"Effective Reformulation of Query for Code Search Using Crowdsourced Knowledge and Extra-Large Data Analytics","authors":"M. M. Rahman, C. Roy","doi":"10.1109/ICSME.2018.00057","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00057","url":null,"abstract":"Software developers frequently issue generic natural language queries for code search while using code search engines (e.g., GitHub native search, Krugle). Such queries often do not lead to any relevant results due to vocabulary mismatch problems. In this paper, we propose a novel technique that automatically identifies relevant and specific API classes from Stack Overflow Q & A site for a programming task written as a natural language query, and then reformulates the query for improved code search. We first collect candidate API classes from Stack Overflow using pseudo-relevance feedback and two term weighting algorithms, and then rank the candidates using Borda count and semantic proximity between query keywords and the API classes. The semantic proximity has been determined by an analysis of 1.3 million questions and answers of Stack Overflow. Experiments using 310 code search queries report that our technique suggests relevant API classes with 48% precision and 58% recall which are 32% and 48% higher respectively than those of the state-of-the-art. Comparisons with two state-of-the-art studies and three popular search engines (e.g., Google, Stack Overflow, and GitHub native search) report that our reformulated queries (1) outperform the queries of the state-of-the-art, and (2) significantly improve the code search results provided by these contemporary search engines.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"17 1","pages":"473-484"},"PeriodicalIF":0.0,"publicationDate":"2018-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78728775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automating Software Development for Mobile Computing Platforms (pp. 749-754)
Pub Date: 2018-07-18 | DOI: 10.1109/ICSME.2018.00094
Kevin Moran
Mobile devices such as smartphones and tablets have become ubiquitous in today's modern computing landscape. The applications that run on these mobile devices (often referred to as "apps") have become a primary means of computing for millions of users and, as such, have garnered immense developer interest. These apps allow for unique, personal software experiences through touch-based UIs and a complex assortment of sensors. However, designing and implementing high-quality mobile apps can be a difficult process, primarily due to challenges unique to mobile development, including change-prone APIs and platform fragmentation, to name a few. This paper presents the motivation for and an overview of a dissertation that introduces new approaches for automating and improving mobile app design and development practices. Additionally, this paper discusses potential avenues for future research based upon the work conducted, as well as general lessons learned during the author's tenure as a doctoral student in the general areas of software engineering, maintenance, and evolution.
{"title":"Automating Software Development for Mobile Computing Platforms","authors":"Kevin Moran","doi":"10.1109/ICSME.2018.00094","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00094","url":null,"abstract":"Mobile devices such as smartphones and tablets have become ubiquitous in today's modern computing landscape. The applications that run on these mobile devices (often referred to as \"apps\") have become a primary means of computing for millions of users and, as such, have garnered immense developer interest. These apps allow for unique, personal software experiences through touch-based UIs and a complex assortment of sensors. However designing and implementing high quality mobile apps can be a difficult process. This is primarily due to challenges unique to mobile development including change-prone APIs and platform fragmentation, just to name a few. This paper presents the motivation and an overview of a dissertation which presents new approaches for automating and improving mobile app design and development practices. Additionally, this paper discusses potential avenues for future research based upon the work conducted, as well as general lessons learned during the author's tenure as a doctoral student in the general areas of software engineering, maintenance, and evolution.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"42 1","pages":"749-754"},"PeriodicalIF":0.0,"publicationDate":"2018-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76415570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Traceability Maintenance via Machine Learning Classification (pp. 369-380)
Pub Date: 2018-07-17 | DOI: 10.1109/ICSME.2018.00045
Chris Mills, Javier Escobar-Avila, S. Haiduc
Previous studies have shown that software traceability, the ability to link together related artifacts from different sources within a project (e.g., source code, use cases, documentation), improves project outcomes by assisting developers and other stakeholders with common tasks such as impact analysis and concept location. Establishing traceability links in a software system is an important and costly task, but only half the struggle. As the project undergoes maintenance and evolution, new artifacts are added and existing ones are changed, resulting in outdated traceability information. Therefore, specific steps need to be taken to make sure that traceability links are maintained in tandem with the rest of the project. In this paper we address this problem and propose a novel approach called TRAIL for maintaining traceability information in a system. The novelty of TRAIL lies in the fact that it leverages previously captured knowledge about project traceability to train a machine learning classifier which can then be used to derive new traceability links and update existing ones. We evaluated TRAIL on 11 commonly used traceability datasets from six software systems and compared it to seven popular Information Retrieval (IR) techniques, including the most common approaches used in previous work. The results indicate that TRAIL outperforms all IR approaches in terms of precision, recall, and F-score.
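The core move — learning a link classifier from previously captured traceability knowledge — can be sketched with a simple model over per-pair features. The features and numbers below are illustrative stand-ins, not the paper's actual feature set or model:

```python
from sklearn.linear_model import LogisticRegression

# Each row describes one (source artifact, target artifact) pair; the
# columns (an IR similarity score, shared-term count, co-change flag)
# are assumed features for the sake of illustration.
X_train = [[0.91, 12, 1], [0.08, 1, 0], [0.67, 7, 1],
           [0.15, 2, 0], [0.74, 9, 1], [0.22, 3, 0]]
y_train = [1, 0, 1, 0, 1, 0]  # 1 = link vetted as valid in project history

model = LogisticRegression().fit(X_train, y_train)
print(model.predict_proba([[0.55, 5, 1]])[0, 1])  # P(candidate link holds)
```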
{"title":"Automatic Traceability Maintenance via Machine Learning Classification","authors":"Chris Mills, Javier Escobar-Avila, S. Haiduc","doi":"10.1109/ICSME.2018.00045","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00045","url":null,"abstract":"Previous studies have shown that software traceability, the ability to link together related artifacts from different sources within a project (e.g., source code, use cases, documentation, etc.), improves project outcomes by assisting developers and other stakeholders with common tasks such as impact analysis, concept location, etc. Establishing traceability links in a software system is an important and costly task, but only half the struggle. As the project undergoes maintenance and evolution, new artifacts are added and existing ones are changed, resulting in outdated traceability information. Therefore, specific steps need to be taken to make sure that traceability links are maintained in tandem with the rest of the project. In this paper we address this problem and propose a novel approach called TRAIL for maintaining traceability information in a system. The novelty of TRAIL stands in the fact that it leverages previously captured knowledge about project traceability to train a machine learning classifier which can then be used to derive new traceability links and update existing ones. We evaluated TRAIL on 11 commonly used traceability datasets from six software systems and compared it to seven popular Information Retrieval (IR) techniques including the most common approaches used in previous work. The results indicate that TRAIL outperforms all IR approaches in terms of precision, recall, and F-score.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"16 1","pages":"369-380"},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86882358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}