Pub Date: 2022-10-01. DOI: 10.1109/icsme55016.2022.00085
Emanuele Iannone, Fabio Palomba
Software security concerns the creation of secure software, i.e., software that can withstand malicious attacks, starting from its earliest development phases. To this end, several automated and manual solutions have been developed to support developers in identifying and assessing security issues, e.g., software vulnerabilities. However, most of these solutions were not designed to cooperate synergistically or to run continuously in the context of evolving software, i.e., software subject to frequent maintenance and evolution activities. In this scenario, developers struggle to set up an effective line of defense against the security issues arising in their projects. This research fills this gap by investigating how vulnerabilities affect evolving software projects and by proposing novel solutions that improve and simplify the security verification and validation process. The paper concludes by presenting the open challenges in software security that we framed while conducting our research.
Title: "The Phantom Menace: Unmasking Security Issues in Evolving Software", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00063
Stefano Dalla Palma, D. D. Nucci, D. Tamburri
We propose DEFUSE, a language-agnostic tool for software defect prediction. The tool automatically collects and classifies failure data, enables the correction of those classifications, and builds machine learning models that detect defects based on those data. We instantiated the tool in the scope of Infrastructure-as-Code, the DevOps practice that enables the management and provisioning of infrastructure through machine-readable definition files. We present its architecture and provide examples of its application. Demo video: https://youtu.be/37mmLdCX3jU.
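The collect-classify-correct-train loop described above can be sketched in a few lines. This is an illustrative toy, not DEFUSE's actual API: the record fields, function names, and the churn-threshold "model" standing in for the tool's machine learning step are all assumptions made for the example.

```python
# Hypothetical sketch of a DEFUSE-style pipeline: label files from collected
# failure data, allow manual correction of labels, then fit a simple model.
from dataclasses import dataclass

@dataclass
class FileRecord:
    path: str
    churn: int               # lines changed in the observed period
    defective: bool = False  # label derived from collected failure data

def auto_label(records, failing_paths):
    """Automatic classification step: mark files implicated in failures."""
    for r in records:
        r.defective = r.path in failing_paths
    return records

def correct_label(records, path, defective):
    """Manual correction step for misclassified entries."""
    for r in records:
        if r.path == path:
            r.defective = defective
    return records

class ThresholdModel:
    """Toy stand-in for the ML step: flags files whose churn exceeds the
    mean churn of known-defective training files."""
    def fit(self, records):
        defective = [r.churn for r in records if r.defective]
        self.threshold = sum(defective) / len(defective) if defective else float("inf")
        return self
    def predict(self, churn):
        return churn >= self.threshold

records = [FileRecord("a.py", 120), FileRecord("b.py", 5), FileRecord("c.py", 90)]
records = auto_label(records, failing_paths={"a.py"})
records = correct_label(records, "c.py", defective=True)   # human fixes a label
model = ThresholdModel().fit(records)
print(model.predict(200))  # high-churn file flagged as likely defective
```

The design point the sketch mirrors is that the labeling, correction, and model-building stages are decoupled, which is what lets the tool stay language-agnostic.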
Title: "Defuse: A Data Annotator and Model Builder for Software Defect Prediction", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00064
Shriram Shanbhag, S. Chimalakonda, V. Sharma, Vikrant S. Kaulgud
Energy efficiency is an essential consideration in mobile application development, given that these apps run on battery-powered devices. This has led researchers to develop a set of energy design patterns that can help developers improve the energy efficiency of their applications. However, the adoption of these energy patterns in projects remains a challenge, given the lack of awareness about them among developers. To bridge this gap, we propose eTagger, a Google Chrome extension that tags GitHub issues from Android repositories with the associated energy patterns. eTagger works on embeddings generated by Sentence-BERT. We believe that labeling GitHub issues with energy patterns may foster their broader adoption, as GitHub is a prominent platform in collaborative software development. A preliminary evaluation of eTagger achieved an AUC-ROC of 0.73, with a precision of 0.58, a recall of 0.53, and an F1-score of 0.5. A demonstration of the tool is available at https://youtu.be/hP4pWJ4AKxE and related artifacts at https://rishalab.github.io/eTagger/.
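Embedding-based tagging of the kind eTagger performs can be illustrated with a toy nearest-pattern match. The real tool uses Sentence-BERT embeddings; the bag-of-words "embedding", the pattern descriptions, and the function names below are placeholders for this sketch only.

```python
# Illustrative sketch: tag an issue with the energy pattern whose description
# is closest in embedding space (cosine similarity over toy word-count vectors).
import math
from collections import Counter

def embed(text):
    # stand-in for a Sentence-BERT encoder; returns a sparse word-count vector
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

PATTERNS = {  # paraphrased example descriptions, not the paper's catalog
    "Dark UI Colors": "use dark colors in the ui to save display energy",
    "Reduce Size": "reduce the size of data transferred over the network",
}

def tag_issue(issue_text):
    scores = {name: cosine(embed(issue_text), embed(desc))
              for name, desc in PATTERNS.items()}
    return max(scores, key=scores.get)

print(tag_issue("battery drain when screen uses bright white ui colors"))
```

With a learned sentence encoder in place of the word-count vectors, the same nearest-description scheme generalizes to issues that share no literal words with a pattern description.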
Title: "eTagger - An Energy Pattern Tagging Tool for GitHub Issues in Android Projects", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00068
Davide Corradini, Amedeo Zampieri, Michele Pasqua, M. Ceccato
Over the past few years, several novel black-box testing approaches targeting RESTful APIs have been proposed. In order to assess their effectiveness, such testing strategies had to be implemented as a prototype tool and validated on empirical data. However, developing a testing tool is a time-consuming task, and reimplementing from scratch the same common basic features represents a waste of resources that causes a remarkable overhead in the "time to market" of research results. In this paper, we present RestTestGen, an extensible framework for implementing new automated black-box testing strategies for RESTful APIs. The framework provides a collection of commonly used components, such as a robust OpenAPI specification parser, dictionaries, input value generators, mutation operators, oracles, and others. Many of the provided components are customizable and extensible, enabling researchers and practitioners to quickly prototype, deploy, and evaluate their novel ideas. Additionally, the framework facilitates the development of novel black-box testing strategies by guiding researchers, by means of abstract components that explicitly identify those parts of the framework requiring a concrete implementation. As an adoption example, we show how we can implement nominal and error black-box testing strategies for RESTful APIs by reusing primitives and features provided by the framework, and by concretely extending very few abstract components. RestTestGen is open-source, actively maintained, and publicly available on GitHub at https://github.com/SeUniVr/RestTestGen.
Title: "RestTestGen: An Extensible Framework for Automated Black-box Testing of RESTful APIs", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00032
Alexander Schultheiss, P. M. Bittner, Thomas Thüm, Timo Kehrer
In clone-and-own, the predominant paradigm for developing multi-variant software systems in practice, a new variant of a software system is created by copying and adapting an existing one. While clone-and-own is flexible, it causes high maintenance effort in the long run, as cloned variants evolve in parallel; certain changes, such as bug fixes, need to be propagated between variants manually. Building on the principle of cherry-picking, and by collecting lightweight domain knowledge about cloned variants and software changes, a recent line of research proposes to automate such synchronization tasks when migration to a software product line is not feasible. However, it is still unclear how far this synchronization can actually be pushed. We conduct an empirical study in which we quantify the potential to automate the synchronization of variants in clone-and-own. We simulate variant synchronization using the history of a real-world multi-variant software system as a case study. Our results indicate that existing patching techniques propagate changes with an accuracy of up to 85% if applied consistently from the start of a project. This can be further improved to 93% by exploiting lightweight domain knowledge about which features are affected by a change and which variants implement the affected features. Based on our findings, we conclude that there is potential to automate the synchronization of cloned variants through existing patching techniques.
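The feature-based targeting that lifts accuracy from 85% to 93% boils down to a set intersection: propagate a change only to variants implementing a feature the change touches. The function name, variant names, and feature mapping below are illustrative, assumed for this sketch rather than taken from the study's tooling.

```python
# Minimal sketch of knowledge-guided cherry-picking in clone-and-own: the
# feature map is the "lightweight domain knowledge" the paper exploits.
def variants_to_sync(change_features, variant_features):
    """Return the variants a change should be propagated to: those whose
    implemented features intersect the features the change affects."""
    return sorted(
        variant for variant, feats in variant_features.items()
        if change_features & feats
    )

variant_features = {
    "variant_a": {"login", "export"},
    "variant_b": {"login"},
    "variant_c": {"reports"},
}

# a bug fix touching the 'export' feature goes to variant_a only
print(variants_to_sync({"export"}, variant_features))
```

Without the feature map, a naive scheme would attempt the cherry-pick on every variant and rely on the patch failing to apply, which is where the accuracy gap comes from.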
Title: "Quantifying the Potential to Automate the Synchronization of Variants in Clone-and-Own", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00048
Victoria Bogachenkova, Linh Nguyen, Felipe Ebert, Alexander Serebrenik, Fernando Castor
Code review is a popular software engineering practice. The success of code reviews can be threatened by confusion experienced by code reviewers. Research has, on the one hand, studied the reasons for confusion in code reviews and, on the other hand, analyzed source code patterns, so-called "atoms of confusion", that have been shown to lead to misunderstanding in lab settings. However, to the best of our knowledge, no research has investigated a possible cause-and-effect relationship between atoms of confusion and confusion in code reviews. Another important aspect still unstudied is how those atoms of confusion evolve across pull requests. In this emerging results paper, we report an exploratory case study to provide a deeper understanding of atoms of confusion, more specifically, whether atoms of confusion are related to confusion in code reviews and how they persist across pull requests. With the help of an existing tool for the detection of atoms of confusion, and a manual analysis of code review comments, we observed that statistical analysis did not show any relationship between atoms of confusion and the presence of confusion comments in code reviews. Additionally, we found evidence that atoms of confusion are mostly not removed in pull requests. Based on these results, we formulate hypotheses on atoms of confusion in the code review context, to be confirmed or rejected by future studies.
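For readers unfamiliar with the term, an atom of confusion is a tiny, syntactically valid pattern that lab studies found people misread. The catalog was originally built for C; operator precedence is one atom from it, and the Python rendition below is our own illustrative analogue, not an example from this paper.

```python
# Operator-precedence atom of confusion, transposed to Python: unary minus
# binds more loosely than **, so readers often misjudge the value below.
confusing = -2 ** 2    # parsed as -(2 ** 2), i.e. -4, not (-2) ** 2 == 4
clarified = -(2 ** 2)  # same value, with the precedence made explicit

print(confusing, clarified)  # both print -4
```

A "removal" of this atom in a pull request, in the study's sense, would be a diff replacing the first form with the parenthesized one while leaving behavior unchanged.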
Title: "Evaluating Atoms of Confusion in the Context of Code Reviews", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00024
Jiamou Sun, Zhenchang Xing, Xiwei Xu, Liming Zhu, Qinghua Lu
Security databases describe the characteristics of discovered vulnerabilities in text for future study and patching. However, because different maintainers have different perspectives on vulnerabilities, they often describe the same vulnerability in different ways, creating obstacles to gathering comprehensive information about a vulnerability from different databases. To mitigate this problem, Common Vulnerabilities and Exposures (CVE) was established to identify each publicly disclosed vulnerability with a unique CVE ID, which vulnerability databases maintained by different vendors and organizations can reference in their vulnerability reports. Despite the wide adoption of CVE, traceability issues are still prevalent. Our empirical study of vulnerability traceability across four representative security databases (NVD, IBM X-Force, ExploitDB, Openwall) shows that the number of CVE records is growing rapidly, and that traceability delays and missing references have become severe problems for these databases. To address these issues, we develop an automatic traceability recovery method that recommends related external vulnerability reports for the reports in one database. As vulnerability reports from different databases differ in content detail and length, our approach does not match reports at the document level; instead, it extracts seven distinctive vulnerability key aspects that are widely present in vulnerability descriptions. As a proof of concept, we apply our method to recommend reports from IBM X-Force, ExploitDB, and Openwall for NVD reports. We use NVD as the target because it is a de-facto standard vulnerability database that contains the most comprehensive list of vulnerabilities. Our experiments with a wide range of NLP methods show that our aspect-level matching can achieve high MRR and accuracy for traceability recovery across heterogeneous vulnerability databases.
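The aspect-level matching idea can be sketched as scoring report pairs per aspect and aggregating, rather than comparing whole documents. The aspect names, the Jaccard similarity, and the plain-average aggregation below are simplified stand-ins for the paper's NLP-based extraction and matching.

```python
# Sketch of aspect-level report matching: compare extracted key aspects
# one by one and average the per-aspect similarities.
def aspect_similarity(a, b):
    """Token-overlap (Jaccard) similarity between two aspect values."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def match_score(report_a, report_b):
    """Average similarity over the aspects both reports describe."""
    shared = report_a.keys() & report_b.keys()
    if not shared:
        return 0.0
    return sum(aspect_similarity(report_a[k], report_b[k]) for k in shared) / len(shared)

# hypothetical aspect extractions from two databases' reports
nvd = {"product": "foo server", "attack_type": "sql injection",
       "impact": "information disclosure"}
xforce = {"product": "foo server", "attack_type": "sql injection"}

print(match_score(nvd, xforce))
```

Scoring only the shared aspects is what makes the comparison robust to the length and detail differences between databases that the paper highlights.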
Title: "Heterogeneous Vulnerability Report Traceability Recovery by Vulnerability Aspect Matching", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00073
Neela Sawant, Srinivasan H. Sengamedu
Automatic identification of coding best practices can scale the development of code and application analyzers. We present Doc2BP, a deep learning tool that identifies coding best practices in software documentation. Natural language descriptions are mapped to an informative embedding space, optimized under the dual objectives of binary and few-shot classification. The binary objective powers general classification into known best practice categories using a deep learning classifier. The few-shot objective facilitates example-based classification into novel categories by matching embeddings with user-provided examples at run-time, without having to retrain the underlying model. We analyze the effects of manually and synthetically labeled examples, context, and cross-domain information. We have applied Doc2BP to Java, Python, AWS Java SDK, and AWS CloudFormation documentation. Compared to prior works that primarily leverage keyword heuristics, and to our own part-of-speech pattern baselines, we obtain a 3-5% F1-score improvement for Java and Python, and 15-20% for the AWS Java SDK and AWS CloudFormation. Experiments with four few-shot use cases show promising results (5-shot accuracy of 99%+ for Java NullPointerException and AWS Java metrics, 65% for AWS CloudFormation numerics, and 35% for Python best practices). Doc2BP has contributed new rules and improved specifications in Amazon's code and application analyzers: (a) 500+ new checks in cfn-lint, an open-source AWS CloudFormation linter; (b) over 97% automated coverage of metrics APIs and related practices in Amazon DevOps Guru; (c) support for nullable AWS APIs in Amazon CodeGuru's Java NullPointerException (NPE) detector; (d) 200+ new best practices for Java, Python, and the respective AWS SDKs in Amazon CodeGuru; and (e) a 2% reduction in false positives in Amazon CodeGuru's Java resource leak detector.
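The few-shot objective described above amounts to nearest-neighbor matching against a run-time support set, with no retraining. The toy word-overlap similarity, the category names, and the example sentences below are assumptions for this sketch; Doc2BP uses a learned embedding space instead.

```python
# Sketch of run-time few-shot classification: assign a documentation sentence
# to the category of its most similar user-provided support example.
def similarity(a, b):
    """Toy stand-in for embedding similarity (token Jaccard)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def few_shot_classify(sentence, support):
    """support: {category: [example sentences]} supplied at run-time,
    so novel categories need no model retraining."""
    best = max(
        ((cat, max(similarity(sentence, ex) for ex in examples))
         for cat, examples in support.items()),
        key=lambda pair: pair[1],
    )
    return best[0]

support = {  # hypothetical user-provided few-shot examples
    "null-check": ["this method may return null", "returns null if absent"],
    "resource-release": ["callers must close the stream"],
}

print(few_shot_classify("the call returns null when the key is absent", support))
```

The practical payoff is the one the paper claims: a user can stand up a new best-practice category by writing a handful of example sentences, without touching the underlying model.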
Title: "Learning-based Identification of Coding Best Practices from Software Documentation", in 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME).
Pub Date: 2022-10-01. DOI: 10.1109/ICSME55016.2022.00057
Olivier Nourry, Yutaro Kashiwa, B. Lin, G. Bavota, Michele Lanza, Yasutaka Kamei
Energy consumption in mobile applications is a key area of software engineering research, since any advance could affect billions of devices. Several software-based energy calculation tools can now provide close estimates of the energy consumed by mobile applications without relying on physical hardware, offering new opportunities to conduct large-scale energy studies on mobile devices. In these studies, one key data collection step is generating events, since doing so exercises specific parts of the code and, as a consequence, allows assessing their energy consumption. Because manually generating events by interacting with applications is time-consuming and not scalable, large-scale studies often use software-based tools that automate event generation to profile devices. Existing tools rely on randomly generated events, which undermines the reproducibility and generalizability of such studies. We present AIP (Android Instrumentation Profiler), an alternative to existing software-based event generation tools such as Monkey. AIP uses instrumented tests as the source of event generation, which enables targeting complex use cases for energy consumption estimation, as well as creating fully reproducible events and execution traces, while maintaining the scalability of other state-of-the-art tools. The tool and a demo video can be found at https://github.com/ONourry/AndroidInstrumentationProfiler.
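The reproducibility contrast between random and test-driven event generation can be shown in miniature. The event vocabulary, trace format, and function names below are invented for this sketch; AIP itself drives real Android instrumented tests rather than strings.

```python
# Illustrative contrast: Monkey-style random events vs. a scripted,
# fully reproducible event trace derived from an instrumented test.
import random

def monkey_events(n, seed=None):
    """Random stream: without a fixed seed, two runs exercise different
    parts of the app, so measured energy is hard to reproduce."""
    rng = random.Random(seed)
    actions = ["tap", "swipe", "back", "rotate"]
    return [rng.choice(actions) for _ in range(n)]

def instrumented_events(test_script):
    """Instrumented test: the script *is* the trace, so every run replays
    the same events and can be tied to the energy measured alongside it."""
    return list(test_script)

script = ["tap:login", "tap:compose", "swipe:inbox", "back"]

# two replays of the same instrumented test yield identical traces
print(instrumented_events(script) == instrumented_events(script))
```

Note that seeding Monkey also makes a single run repeatable, but only the scripted form lets a study name *which* use case (e.g., "compose a message") each energy figure belongs to.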
{"title":"AIP: Scalable and Reproducible Execution Traces in Energy Studies on Mobile Devices","authors":"Olivier Nourry, Yutaro Kashiwa, B. Lin, G. Bavota, Michele Lanza, Yasutaka Kamei","doi":"10.1109/ICSME55016.2022.00057","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00057","url":null,"abstract":"Energy consumption in mobile applications is a key area of software engineering studies, since any advance could affect billions of devices. Currently, several software-based energy calculation tools can provide close estimates of the energy consumed by mobile applications without relying on physical hardware, offering new opportunities to conduct large-scale energy studies in mobile devices. In these studies, one key step of data collection is generating events, since it allows exercising specific parts of the code and, as a consequence, assessing their energy consumption. Given the fact that manually generating events by interacting with applications is time-consuming and not scalable, large-scale studies often use software-based tools to automate event generation to profile devices. Existing tools rely on randomly generated events, which undermines the reproducibility and generalizability of such studies.We present AIP (Android Instrumentation Profiler), an alternative to existing software-based event generation tools such as Monkey. AIP uses instrumented tests as a source of event generation, which enables the targeting of complex use cases for energy consumption estimations, as well as the creation of fully reproducible events and execution traces, while maintaining the scaling abilities of other state-of-the-art tools. 
The tool and demo video can be found on https://github.com/ONourry/AndroidInstrumentationProfiler.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131116375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-01DOI: 10.1109/ICSME55016.2022.00070
Tim Vahlbrock, Martin Guddat, Tom Vierjahn
Modern software is subject to continuous change, and so are its interfaces to other software. Introducing breaking changes to an interface requires its consumers to adapt their own code base to compensate. Oftentimes, the sheer number of required changes makes manual migration a large effort. Additionally, the places that require changes may be hard to find using simple pattern matching. Both issues have led to the development of tools like ClangMR, which automatically find and adapt affected pieces of code. Such automatic tools, however, assume that the correctness of the applied changes will be verified by tests, which makes them risky to use on projects with low test coverage. In this paper we present VSCode Migrate, an IDE extension for performing semi-automatic migrations, enabling large refactorings in low-coverage projects. The locations that need to be refactored can be found using alternative matching strategies, including AST-based ones, and the changes to perform can be generated automatically. However, instead of applying the changes immediately, VSCode Migrate lets the developer verify and modify each adaptation. If a change is not sufficiently covered, additional tests can be added before the change is applied. These mechanisms provide the safeguarding necessary for projects with low test coverage.
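To illustrate the AST-based matching the abstract mentions (this is a generic sketch, not VSCode Migrate's actual implementation, and `old_api` is a hypothetical deprecated function), a minimal Python example using the standard `ast` module to locate call sites that simple text search could miss or over-match:

```python
import ast

# Hypothetical consumer code containing a call to a deprecated API.
SOURCE = """
from legacy import old_api

def handler(data):
    return old_api(data, verbose=True)  # needs migration
"""

class CallFinder(ast.NodeVisitor):
    """Collect (line, column) locations of calls to a function by name."""

    def __init__(self, target):
        self.target = target
        self.matches = []

    def visit_Call(self, node):
        # Match plain-name calls like old_api(...); attribute calls
        # such as module.old_api(...) would need an extra case.
        if isinstance(node.func, ast.Name) and node.func.id == self.target:
            self.matches.append((node.lineno, node.col_offset))
        self.generic_visit(node)

finder = CallFinder("old_api")
finder.visit(ast.parse(SOURCE))
print(finder.matches)  # one call site, inside handler()
```

Unlike pattern matching on raw text, this approach ignores comments, strings, and the import line itself, reporting only genuine call expressions.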
{"title":"VSCode Migrate: Semi-Automatic Migrations for Low Coverage Projects","authors":"Tim Vahlbrock, Martin Guddat, Tom Vierjahn","doi":"10.1109/ICSME55016.2022.00070","DOIUrl":"https://doi.org/10.1109/ICSME55016.2022.00070","url":null,"abstract":"Modern software is subject to continuous change and so are its interfaces to other software. Introducing breaking changes to an interface requires its consumers to make adaptions to their own code base in order to compensate. Oftentimes, the number of changes requires a large effort when performed manually. Additionally, the places that require changes may be hard to find using simple pattern matching. Both have lead to the development of tools like ClangMR, which automatically find and adapt affected pieces of code. Such automatic tools, however, assume that the correctness of the applied changes will be verified by tests. This makes them risky to use for projects with a low test coverage.In this paper we present VSCode Migrate, an IDE extension to perform semi-automatic migrations, enabling large refactorings in low coverage projects. The locations that need to be refactored can be found using alternative matching strategies, including AST based ones, and the changes to perform can be generated automatically. However, instead of applying the changes immediately, VSCode Migrate lets the developer verify and modify each adaption. If a change is not sufficiently covered, additional tests can be added before the change is applied. 
These mechanisms provide the safeguarding necessary for projects with low test coverage.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116915940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}