
Latest publications from the 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Aligning Technical Debt Prioritization with Business Objectives: A Multiple-Case Study
Pub Date : 2018-07-15 DOI: 10.1109/ICSME.2018.00075
R. R. Almeida, U. Kulesza, Christoph Treude, D'angellys Cavalcanti Feitosa, Aliandro Lima
Technical debt (TD) is a metaphor to describe the trade-off between short-term workarounds and long-term goals in software development. Despite being widely used to explain technical issues in business terms, industry and academia still lack a proper way to manage technical debt while explicitly considering business priorities. In this paper, we report on a multiple-case study of how two big software development companies handle technical debt items, and we show how taking the business perspective into account can improve the decision making for the prioritization of technical debt. We also propose a first step toward an approach that uses business process management (BPM) to manage technical debt. We interviewed a set of IT business stakeholders, and we collected and analyzed different sets of technical debt items, comparing how these items would be prioritized using a purely technical versus a business-oriented approach. We found that the use of business process management to support technical debt management makes the technical debt prioritization decision process more aligned with business expectations. We also found evidence that the business process management approach can help technical debt management achieve business objectives.
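The tension the abstract describes can be made concrete with a small ranking sketch. The items, scores, and weights below are hypothetical, purely to illustrate how a business-aware score can reorder a purely technical prioritization; the paper's actual BPM-based approach is richer than this.

```python
def rank(items, key):
    """Return item names ordered from highest to lowest priority."""
    return [it["name"] for it in sorted(items, key=key, reverse=True)]

# Hypothetical technical debt backlog: "severity" is the technical view,
# "business_impact" the stakeholder view (both on a 1-10 scale).
debt_items = [
    {"name": "slow-batch-job",  "severity": 3, "business_impact": 9},
    {"name": "legacy-parser",   "severity": 8, "business_impact": 2},
    {"name": "payment-retries", "severity": 5, "business_impact": 8},
]

technical_order = rank(debt_items, key=lambda it: it["severity"])
business_aware = rank(debt_items, key=lambda it: 0.4 * it["severity"]
                                              + 0.6 * it["business_impact"])

print(technical_order)  # ['legacy-parser', 'payment-retries', 'slow-batch-job']
print(business_aware)   # ['payment-retries', 'slow-batch-job', 'legacy-parser']
```

With business impact taken into account, the customer-facing items move to the top of the backlog, mirroring the reordering effect the study reports.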
Pages: 655-664
Citations: 23
A Practical Approach to the Automatic Classification of Security-Relevant Commits
Pub Date : 2018-07-06 DOI: 10.1109/ICSME.2018.00058
A. Sabetta, M. Bezzi
The lack of reliable sources of detailed information on the vulnerabilities of open-source software (OSS) components is a major obstacle to maintaining a secure software supply chain and an effective vulnerability management process. Standard sources of advisories and vulnerability data, such as the National Vulnerability Database (NVD), are known to suffer from poor coverage and inconsistent quality. To reduce our dependency on these sources, we propose an approach that uses machine-learning to analyze source code repositories and to automatically identify commits that are security-relevant (i.e., that are likely to fix a vulnerability). We treat the source code changes introduced by commits as documents written in natural language, classifying them using standard document classification methods. Combining independent classifiers that use information from different facets of commits, our method can yield high precision (80%) while ensuring acceptable recall (43%). In particular, the use of information extracted from the source code changes yields a substantial improvement over the best known approach in state of the art, while requiring a significantly smaller amount of training data and employing a simpler architecture.
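The core idea, treating each facet of a commit as a natural-language document and combining per-facet classifiers, can be sketched in a few lines. The commits, tokens, and the AND-combination rule below are illustrative assumptions, not the authors' actual model or data.

```python
# Sketch: one Naive Bayes document classifier per commit facet (log message,
# code-change tokens); a commit is flagged as security-relevant only when both
# classifiers agree -- a simple precision-oriented combination.
import math
from collections import Counter

class NaiveBayes:
    def fit(self, docs, labels):
        self.priors, self.counts, self.vocab = {}, {}, set()
        for label in set(labels):
            self.priors[label] = labels.count(label) / len(labels)
            self.counts[label] = Counter()
        for doc, label in zip(docs, labels):
            words = doc.split()
            self.counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, doc):
        def log_prob(label):
            c, total = self.counts[label], sum(self.counts[label].values())
            return math.log(self.priors[label]) + sum(
                math.log((c[w] + 1) / (total + len(self.vocab)))  # Laplace smoothing
                for w in doc.split())
        return max(self.priors, key=log_prob)

# Hypothetical training commits: (message, diff tokens, security-relevant?)
commits = [
    ("fix buffer overflow in parser", "memcpy bounds check len", 1),
    ("sanitize user input", "escape validate input", 1),
    ("update changelog", "changelog version bump", 0),
    ("refactor rendering code", "render loop cleanup", 0),
]
msg_clf = NaiveBayes().fit([m for m, _, _ in commits], [y for _, _, y in commits])
diff_clf = NaiveBayes().fit([d for _, d, _ in commits], [y for _, _, y in commits])

def security_relevant(message, diff):
    return msg_clf.predict(message) == 1 and diff_clf.predict(diff) == 1
```

Requiring agreement between independent facet classifiers trades recall for precision, which is the balance the abstract reports (80% precision at 43% recall).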
Pages: 579-582
Citations: 58
AutoSpearman: Automatically Mitigating Correlated Software Metrics for Interpreting Defect Models
Pub Date : 2018-06-26 DOI: 10.1109/ICSME.2018.00018
Jirayus Jiarpakdee, C. Tantithamthavorn, Christoph Treude
The interpretation of defect models heavily relies on software metrics that are used to construct them. However, such software metrics are often correlated in defect models. Prior work often uses feature selection techniques to remove correlated metrics in order to improve the performance of defect models. Yet, the interpretation of defect models may be misleading if feature selection techniques produce subsets of inconsistent and correlated metrics. In this paper, we investigate the consistency and correlation of the subsets of metrics that are produced by nine commonly-used feature selection techniques. Through a case study of 13 publicly-available defect datasets, we find that feature selection techniques produce inconsistent subsets of metrics and do not mitigate correlated metrics, suggesting that feature selection techniques should not be used and correlation analyses must be applied when the goal is model interpretation. Since correlation analyses often involve manual selection of metrics by a domain expert, we introduce AutoSpearman, an automated metric selection approach based on correlation analyses. Our evaluation indicates that AutoSpearman yields the highest consistency of subsets of metrics among training samples and mitigates correlated metrics, while impacting model performance by 1-2%pts. Thus, to automatically mitigate correlated metrics when interpreting defect models, we recommend future studies use AutoSpearman in lieu of commonly-used feature selection techniques.
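A simplified sketch of the Spearman step is shown below (the full AutoSpearman also applies a variance-inflation-factor analysis); the metric values and the 0.7 threshold are hypothetical.

```python
# Repeatedly find a pair of metrics whose Spearman rank correlation exceeds a
# threshold and drop the metric that is, on average, more correlated with the
# remaining ones.

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)  # ties ignored for brevity
    return r

def spearman(xs, ys):
    # Spearman rho = Pearson correlation of the rank-transformed data
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def select_metrics(metrics, threshold=0.7):
    names = list(metrics)
    while True:
        pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
                 if abs(spearman(metrics[a], metrics[b])) > threshold]
        if not pairs:
            return names
        a, b = pairs[0]
        def avg_corr(m):
            return sum(abs(spearman(metrics[m], metrics[o]))
                       for o in names if o != m) / (len(names) - 1)
        names.remove(a if avg_corr(a) >= avg_corr(b) else b)

# Hypothetical defect-model metrics: size and churn are near-duplicates.
data = {
    "loc":        [10, 200, 30, 400, 50],
    "churn":      [12, 210, 33, 390, 55],  # tracks loc almost perfectly
    "developers": [3, 1, 4, 1, 5],
}
print(select_metrics(data))  # one of the loc/churn pair is dropped
```

Because only one of each correlated pair survives, a model built on the selected subset can be interpreted without two metrics competing to explain the same signal.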
Pages: 92-103
Citations: 39
Beyond Metadata: Code-Centric and Usage-Based Analysis of Known Vulnerabilities in Open-Source Software
Pub Date : 2018-06-15 DOI: 10.1109/ICSME.2018.00054
Serena Elisa Ponta, H. Plate, A. Sabetta
The use of open-source software (OSS) is ever-increasing, and so is the number of open-source vulnerabilities being discovered and publicly disclosed. The gains obtained from the reuse of community-developed libraries may be offset by the cost of detecting, assessing, and mitigating their vulnerabilities in a timely manner. In this paper we present a novel method to detect, assess and mitigate OSS vulnerabilities that improves on state-of-the-art approaches, which commonly depend on metadata to identify vulnerable OSS dependencies. Our solution instead is code-centric and combines static and dynamic analysis to determine the reachability of the vulnerable portion of libraries used (directly or transitively) by an application. Taking this usage into account, our approach then supports developers in choosing among the existing non-vulnerable library versions. Vulas, the tool implementing our code-centric and usage-based approach, is officially recommended by SAP to scan its Java software, and has been successfully used to perform more than 250000 scans of about 500 applications since December 2016. We report on our experience and on the lessons we learned when maturing the tool from a research prototype to an industrial-grade solution.
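The code-centric reachability idea can be illustrated with a toy static call graph. The graph, function names, and vulnerable constructs below are invented for illustration; this is not the Vulas implementation.

```python
# Starting from the application's entry points, walk a static call graph and
# report which known-vulnerable library constructs are actually reachable.
from collections import deque

def reachable_vulnerabilities(call_graph, entry_points, vulnerable):
    seen, queue = set(entry_points), deque(entry_points)
    while queue:  # breadth-first traversal of the call graph
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return sorted(seen & vulnerable)

call_graph = {  # app code calls into a library, which calls deeper constructs
    "app.main":        ["lib.parse", "lib.render"],
    "lib.parse":       ["lib.unsafe_deserialize"],  # vulnerable and reachable
    "lib.render":      [],
    "lib.unused_eval": [],                          # vulnerable, never called
}
vulnerable = {"lib.unsafe_deserialize", "lib.unused_eval"}
print(reachable_vulnerabilities(call_graph, ["app.main"], vulnerable))
# ['lib.unsafe_deserialize']
```

Only the reachable construct matters for prioritizing an upgrade, which is what distinguishes this usage-based view from a purely metadata-based dependency match.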
Pages: 449-460
Citations: 57
A Simple NLP-Based Approach to Support Onboarding and Retention in Open Source Communities
Pub Date : 2018-06-07 DOI: 10.1109/ICSME.2018.00027
Christoph Stanik, Lloyd Montgomery, Daniel Martens, D. Fucci, W. Maalej
Successful open source communities are constantly looking for new members and helping them become active developers. A common approach for developer onboarding in open source projects is to let newcomers focus on relevant yet easy-to-solve issues to familiarize themselves with the code and the community. The goal of this research is twofold. First, we aim at automatically identifying issues that newcomers can resolve, by analyzing the history of resolved issues using only their titles and descriptions. Second, we aim at automatically identifying issues that can be resolved by newcomers who later become active developers. We mined the issue trackers of three large open source projects and extracted natural language features from the title and description of resolved issues. In a series of experiments, we optimized and compared the accuracy of four supervised classifiers to address our research goals. Random Forest achieved up to 91% precision (F1-score 72%) for the first goal, while for the second goal, Decision Tree achieved a precision of 92% (F1-score 91%). A qualitative evaluation gave insights on what information in the issue description is helpful for newcomers. Our approach can be used to automatically identify, label, and recommend issues for newcomers in open source software projects based only on the text of the issues.
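As a minimal illustration of the classification setup, the sketch below extracts word features from an issue's title and description and learns a single decision stump. The issues and labels are hypothetical, and the paper's actual classifiers (Random Forest, Decision Tree, and others) are far more capable than this one-feature learner.

```python
# Learn the single word whose presence best separates newcomer-resolvable
# issues from the rest, using only each issue's title and description.

def words(issue):
    return set((issue["title"] + " " + issue["description"]).lower().split())

def train_stump(issues, labels):
    vocab = set().union(*(words(i) for i in issues))
    def accuracy(w):  # how often "w present" agrees with the label
        return sum((w in words(i)) == bool(y) for i, y in zip(issues, labels))
    return max(vocab, key=accuracy)

issues = [  # hypothetical issue-tracker data
    {"title": "Fix typo in docs",       "description": "good first issue"},
    {"title": "Update README example",  "description": "good first issue"},
    {"title": "Rewrite scheduler core", "description": "complex refactoring"},
    {"title": "Redesign plugin API",    "description": "complex breaking change"},
]
labels = [1, 1, 0, 0]  # 1 = resolved by a newcomer

best_word = train_stump(issues, labels)

def newcomer_friendly(issue):
    return best_word in words(issue)
```

A real pipeline would of course use many such features at once, but the stump already shows how purely textual signals can drive the recommendation.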
Pages: 172-182
Citations: 28
Adapting Neural Text Classification for Improved Software Categorization
Pub Date : 2018-06-05 DOI: 10.1109/ICSME.2018.00056
Alexander LeClair, Zachary Eberhart, Collin McMillan
Software Categorization is the task of organizing software into groups that broadly describe the behavior of the software, such as "editors" or "science." Categorization plays an important role in several maintenance tasks, such as repository navigation and feature elicitation. Current approaches attempt to cast the problem as text classification, to make use of the rich body of literature from the NLP domain. However, as we show in this paper, such algorithms are generally not applicable off-the-shelf to source code; we found that they work well when high-level project descriptions are available, but suffer very large performance penalties when classifying source code and comments only. We propose a set of adaptations to a state-of-the-art neural classification algorithm and perform two evaluations: one with reference data from Debian end-user programs, and one with a set of C/C++ libraries that we hired professional programmers to annotate. We show that our proposed approach achieves performance exceeding that of previous software classification techniques as well as a state-of-the-art neural text classification technique.
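The paper adapts a neural classifier; as a deliberately simple illustration of casting categorization as text classification, the sketch below (with invented categories and project descriptions) assigns a project to the category whose word profile best overlaps its description.

```python
# Nearest-profile categorization: build a bag-of-words profile per category
# from example descriptions, then score a new description against each profile.
from collections import Counter

def profile(texts):
    bag = Counter()
    for t in texts:
        bag.update(t.lower().split())
    return bag

training = {  # hypothetical category -> example project descriptions
    "editors": ["text editor with syntax highlighting",
                "lightweight code editor"],
    "science": ["numerical simulation of particle physics",
                "statistics and data analysis toolkit"],
}
profiles = {cat: profile(texts) for cat, texts in training.items()}

def categorize(description):
    tokens = description.lower().split()
    # Counter returns 0 for unseen words, so the sum is an overlap score
    return max(profiles, key=lambda c: sum(profiles[c][t] for t in tokens))

print(categorize("a fast editor for code"))      # editors
print(categorize("physics simulation library"))  # science
```

The abstract's key finding is precisely that such description-driven classification degrades badly when only source code and comments are available, motivating their adaptations.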
Pages: 461-472
Citations: 27
On the Evolution of Technical Lag in the npm Package Dependency Network
Pub Date : 2018-06-05 DOI: 10.1109/ICSME.2018.00050
Alexandre Decan, T. Mens, Eleni Constantinou
Software packages developed and distributed through package managers extensively depend on other packages. These dependencies are regularly updated, for example to add new features, resolve bugs or fix security issues. In order to take full advantage of the benefits of this type of reuse, developers should keep their dependencies up to date by relying on the latest releases. In practice, however, this is not always possible, and packages lag behind with respect to the latest version of their dependencies. This phenomenon is described as technical lag in the literature. In this paper, we perform an empirical study of technical lag in the npm dependency network by investigating its evolution for over 1.4M releases of 120K packages and 8M dependencies between these releases. We explore how technical lag increases over time, taking into account the release type and the use of package dependency constraints. We also discuss how technical lag can be reduced by relying on the semantic versioning policy.
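A minimal way to quantify the phenomenon: under a simplified semver model with hypothetical releases, the lag induced by a caret-style constraint can be expressed as the set of releases the constraint can never reach.

```python
# Simplified semver lag: the highest release a caret constraint allows,
# and every newer release it locks the package out of.

def parse(v):
    return tuple(int(x) for x in v.split("."))

def satisfies(version, caret_base):
    # caret ("^") semantics for major > 0: same major, not older than the base
    v, b = parse(version), parse(caret_base)
    return v[0] == b[0] and v >= b

def technical_lag(caret_base, releases):
    allowed = max((r for r in releases if satisfies(r, caret_base)), key=parse)
    missed = [r for r in releases if parse(r) > parse(allowed)]
    return allowed, missed

releases = ["1.2.0", "1.3.0", "1.3.1", "2.0.0", "2.1.0"]
print(technical_lag("1.2.0", releases))
# ('1.3.1', ['2.0.0', '2.1.0']) -- the ^1.2.0 constraint misses the 2.x line
```

This is also why the abstract points to semantic versioning policy as a lever: a constraint that tracks compatible releases keeps the "missed" list to breaking releases only.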
Pages: 404-414
Citations: 60
Understanding the Role of Reporting in Work Item Tracking Systems for Software Development: An Industrial Case Study
Pub Date : 2018-05-27 DOI: 10.1145/3183440.3195071
Pavneet Singh Kochhar, S. Swierc, Trevor Carnahan, Hitesh Sajnani, M. Nagappan
Work item tracking systems such as Visual Studio Team Services, JIRA, BugZilla and GitHub issue tracker are widely used by software engineers. These systems are used to track work items such as features, user stories and bugs, plan sprints, distribute tasks across the team and prioritize the team's work. Such systems can help teams track progress and manage the shipping of software. While these tracking systems give data about different work items in tabular format, using a reporting tool on top of them can help teams visualize the data related to their projects, such as how many bugs are open and closed and which work items are assigned to a team member. While tools like Visual Studio and JIRA provide reporting services, it is important to understand how users leverage them in their projects to help improve the reporting services. In this study, we conduct an empirical investigation on the usage of Analytics Service - a reporting service provided by Visual Studio Team Services (VSTS) to build dashboards and reports out of work item tracking data. In particular, we want to understand why and how users interact with Analytics Service and what outcomes and business decisions stakeholders derive from reports built using Analytics Service. We conducted semi-structured interviews and surveys with users of Analytics Service to understand usage and challenges. Our report on qualitative and quantitative analysis can help organizations and engineers building similar tools or services.
Pages: 605-614
Citations: 0
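The dashboard summaries this study examines can be sketched over plain tabular work-item data. The record fields and the `bug_burndown_summary` helper below are illustrative assumptions of ours, not the actual VSTS Analytics Service schema or API:

```python
from collections import Counter
from datetime import date

# Toy work-item records in the tabular shape a tracker such as VSTS or JIRA
# exports; the field names and values here are illustrative assumptions, not
# the real Analytics Service schema.
WORK_ITEMS = [
    {"id": 1, "type": "Bug", "state": "Closed", "assignee": "ana", "created": date(2018, 3, 1)},
    {"id": 2, "type": "Bug", "state": "Active", "assignee": "ben", "created": date(2018, 3, 4)},
    {"id": 3, "type": "User Story", "state": "Active", "assignee": "ana", "created": date(2018, 3, 5)},
    {"id": 4, "type": "Bug", "state": "Closed", "assignee": "ana", "created": date(2018, 3, 7)},
]

def bug_burndown_summary(items):
    """Aggregate the open/closed bug counts and per-assignee open load
    that a work-item tracking dashboard would typically chart."""
    bugs = [it for it in items if it["type"] == "Bug"]
    by_state = Counter(it["state"] for it in bugs)
    open_per_assignee = Counter(
        it["assignee"] for it in items if it["state"] != "Closed"
    )
    return {
        "open_bugs": by_state.get("Active", 0),
        "closed_bugs": by_state.get("Closed", 0),
        "open_items_per_assignee": dict(open_per_assignee),
    }

summary = bug_burndown_summary(WORK_ITEMS)
print(summary)
```

In a real deployment the same aggregation would be pushed down to the reporting service's query layer rather than computed client-side over exported rows.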
Methods and Tools for Focusing and Prioritizing the Testing Effort
Pub Date : 2018-03-15 DOI: 10.1109/ICSME.2018.00089
D. D. Nucci
Software testing is essential to any software development process, yet it is an extremely expensive activity. Despite its importance, recent studies have shown that developers rarely test their applications and that most programming sessions end without any test execution. New methods and tools that better allocate developers' effort are therefore needed to increase system reliability and reduce testing costs. In this work we focus on three activities that can optimize testing: bug prediction, test case prioritization, and energy leak detection. Although the effort devoted by the research community over the last decades has led to interesting results, we highlight aspects that could be improved and propose empirical investigations and novel approaches. Finally, we provide a set of open issues that the research community should address in the future.
{"title":"Methods and Tools for Focusing and Prioritizing the Testing Effort","authors":"D. D. Nucci","doi":"10.1109/ICSME.2018.00089","DOIUrl":"https://doi.org/10.1109/ICSME.2018.00089","url":null,"abstract":"Software testing is essential for any software development process, representing an extremely expensive activity. Despite its importance recent studies showed that developers rarely test their application and most programming sessions end without any test execution. Indeed, new methods and tools able to better allocating the developers effort are needed to increment the system reliability and to reduce the testing costs. In this work we focus on three activities able to optimize testing activities, specifically, bug prediction, test case prioritization, and energy leaks detection. Indeed, despite the effort devoted in the last decades by the research community led to interesting results, we highlight some aspects that might be improved and propose empirical investigations and novel approaches. Finally, we provide a set of open issues that should be addressed by the research community in the future.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"23 1","pages":"722-726"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73452784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
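Test case prioritization, one of the three activities the abstract names, is commonly illustrated with the greedy "additional coverage" heuristic: repeatedly pick the test that covers the most not-yet-covered statements. The sketch below is that textbook strategy under an assumed per-test statement-coverage map, not the author's specific tooling:

```python
def prioritize_additional_coverage(coverage):
    """Greedy 'additional' prioritization: repeatedly pick the test that
    covers the most still-uncovered statements; ties broken by test name.
    `coverage` maps test name -> set of covered statement ids."""
    remaining = dict(coverage)
    covered, order = set(), []
    while remaining:
        # gain = number of still-uncovered statements each test would add
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        if not remaining[best] - covered:
            # no remaining test adds coverage; append the rest in stable order
            order.extend(sorted(remaining))
            break
        order.append(best)
        covered |= remaining.pop(best)
    return order

# Hypothetical statement coverage per test, purely for illustration
cov = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {5},
    "t4": {1, 2},
}
print(prioritize_additional_coverage(cov))
```

Running this orders `t1` first (three new statements), then `t2` and `t3` (one new statement each), and `t4` last because everything it covers is already covered.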
Statistical Translation of English Texts to API Code Templates
A. Nguyen, Peter C. Rigby, THANH VAN NGUYEN, Dharani Palani, Mark Karanfil, T. Nguyen
We develop T2API, a context-sensitive, graph-based statistical translation approach that takes as input an English description of a programming task and synthesizes the corresponding API code template for the task. We train T2API to statistically learn the alignments between English and API elements and determine the relevant API elements. The training is done on StackOverflow, a bilingual corpus on which developers discuss programming problems in two types of language: English and programming language. T2API considers both the context of the words in the input query and the context of API elements that often go together in the corpus. The derived API elements with their relevance scores are assembled into an API usage by GraSyn, a novel graph-based API synthesis algorithm that generates a graph representing an API usage from a large code corpus. Importantly, it is capable of generating new API usages from previously seen sub-usages. We curate a test benchmark of 250 real-world StackOverflow posts. Across the benchmark, T2API's synthesized snippets have the correct API elements with a median top-1 precision and recall of 67% and 100%, respectively. Four professional developers and five graduate students judged that 77% of our top synthesized API code templates are useful to solve the problem presented in the StackOverflow posts.
{"title":"Statistical Translation of English Texts to API Code Templates","authors":"A. Nguyen, Peter C. Rigby, THANH VAN NGUYEN, Dharani Palani, Mark Karanfil, T. Nguyen","doi":"10.1109/ICSE-C.2017.81","DOIUrl":"https://doi.org/10.1109/ICSE-C.2017.81","abstract":"We develop T2API, a context-sensitive, graph-based statistical translation approach that takes as input an English description of a programming task and synthesizes the corresponding API code template for the task. We train T2API to statistically learn the alignments between English and API elements and determine the relevant API elements. The training is done on StackOverflow, a bilingual corpus on which developers discuss programming problems in two types of language: English and programming language. T2API considers both the context of the words in the input query and the context of API elements that often go together in the corpus. The derived API elements with their relevance scores are assembled into an API usage by GraSyn, a novel graph-based API synthesis algorithm that generates a graph representing an API usage from a large code corpus. Importantly, it is capable of generating new API usages from previously seen sub-usages. We curate a test benchmark of 250 real-world StackOverflow posts. Across the benchmark, T2API's synthesized snippets have the correct API elements with a median top-1 precision and recall of 67% and 100%, respectively. Four professional developers and five graduate students judged that 77% of our top synthesized API code templates are useful to solve the problem presented in the StackOverflow posts.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"1 1","pages":"194-205"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83668841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
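T2API's alignment learning is a full statistical translation model trained on StackOverflow. As a much-simplified illustration of the first step the abstract describes, ranking API elements against an English query, the toy co-occurrence model below is our own assumption, not the paper's algorithm; the corpus entries and API names are invented:

```python
from collections import Counter, defaultdict

# Toy "bilingual" corpus: (English description, API elements used), in the
# spirit of StackOverflow posts pairing text with code. Purely illustrative.
CORPUS = [
    ("read text file", ["FileReader.new", "BufferedReader.readLine"]),
    ("read file line by line", ["BufferedReader.readLine"]),
    ("write text file", ["FileWriter.new", "FileWriter.write"]),
]

def train_cooccurrence(corpus):
    """Count how often each English word co-occurs with each API element."""
    scores = defaultdict(Counter)
    for text, apis in corpus:
        for word in text.split():
            for api in apis:
                scores[word][api] += 1
    return scores

def rank_apis(scores, query, top_k=2):
    """Sum per-word co-occurrence counts over the query words and return
    the top-k API elements by that crude relevance score."""
    total = Counter()
    for word in query.split():
        total.update(scores.get(word, Counter()))
    return [api for api, _ in total.most_common(top_k)]

model = train_cooccurrence(CORPUS)
print(rank_apis(model, "read file"))
```

T2API goes well beyond this: it weights alignments statistically, accounts for the context of both query words and co-occurring API elements, and then assembles the ranked elements into a usage graph with GraSyn, which this sketch does not attempt.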
Journal
2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)