2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR): Latest Papers

Raising MSR Researchers: An Experience Report on Teaching a Graduate Seminar Course in Mining Software Repositories (MSR)

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901780

A. Hassan

This experience report discusses my views on raising MSR researchers through a graduate-level seminar course. A key goal of this report is to kick-start a discussion on this topic within our growing community, a discussion for which there is rarely a suitable venue. Yet it is an essential discussion to have as a community grows, especially given the rapid growth of the MSR community over the past decade.

Citations: 6

An Empirical Study on the Practice of Maintaining Object-Relational Mapping Code in Java Systems

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901758

T. Chen, Weiyi Shang, Jinqiu Yang, A. Hassan, Michael W. Godfrey, Mohamed N. Nasser, P. Flora

Databases have become one of the most important components in modern software systems. For example, web services, cloud computing systems, and online transaction processing systems all rely heavily on databases. To abstract the complexity of accessing a database, developers make use of Object-Relational Mapping (ORM) frameworks. ORM frameworks provide an abstraction layer between the application logic and the underlying database. Such an abstraction layer automatically maps objects in object-oriented languages to database records, which significantly reduces the amount of boilerplate code that needs to be written. Despite the advantages of using ORM frameworks, we observed several difficulties in maintaining ORM code (i.e., code that makes use of ORM frameworks) when cooperating with our industrial partner. After conducting studies on other open source systems, we find that such difficulties are common in other Java systems. Our study finds that i) ORM cannot completely encapsulate database accesses in objects or abstract the underlying database technology, which may cause ORM code changes to be more scattered; ii) ORM code changes are more frequent than regular code changes, but there is a lack of tools that help developers verify ORM code at compilation time; and iii) changes to ORM code are more commonly due to performance or security reasons; however, traditional static code analyzers need to be extended to capture the peculiarities of ORM code in order to detect such problems. Our study highlights the hidden maintenance costs of using ORM frameworks and provides some initial insights into potential approaches to help maintain ORM code. Future studies should carefully examine ORM code, especially given the rising use of ORM in modern software systems.

Citations: 21

Grouping Android Tag Synonyms on Stack Overflow

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901750

S. Beyer, M. Pinzger

On Stack Overflow, more than 38,000 diverse tags are used to classify posts. The Stack Overflow community provides tag synonyms to reduce the number of tags that have the same or similar meaning. In our previous research, we used those synonym pairs to derive a number of strategies to create tag synonyms automatically. In this work, we continue this line of research and present an approach to group tag synonyms into meaningful topics. We represent our synonyms as directed, weighted graphs, and investigate several graph community detection algorithms to build meaningful groups of tags, also called tag communities. We apply our approach to the tags obtained from Android-related Stack Overflow posts and evaluate the resulting tag communities quantitatively with various community metrics. In addition, we evaluate our approach qualitatively through a manual inspection and comparison of a random sample of tag communities. Our results show that we can cluster the Android tags into 2,481 meaningful tag communities. We also show how these tag communities can be used to derive trends of topics of Android-related questions on Stack Overflow.
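
The idea of representing tag synonyms as a directed, weighted graph and grouping them can be sketched roughly as follows. This is only an illustration, not the authors' method: the abstract does not name the community detection algorithms used, so the sketch swaps in plain weakly connected components as a simple stand-in, and the tag pairs, weights, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (synonym, master tag, weight) triples; NOT data from the paper.
SYNONYM_EDGES = [
    ("android-listview", "listview", 1.0),
    ("listviews", "listview", 0.8),
    ("android-sqlite", "sqlite", 1.0),
    ("sqlite3", "sqlite", 0.6),
    ("android-maps", "google-maps", 0.9),
]


def build_graph(edges):
    """Store the directed, weighted graph as {source: {target: weight}}."""
    graph = defaultdict(dict)
    for source, target, weight in edges:
        graph[source][target] = weight
        graph[target]  # touching the key registers sink-only tags as nodes
    return graph


def weakly_connected_groups(graph, min_weight=0.0):
    """Group tags that are linked when edge direction is ignored.

    Edges lighter than min_weight are dropped first. This is a stand-in
    for a real community detection algorithm, chosen for brevity.
    """
    undirected = defaultdict(set)
    for source, targets in graph.items():
        undirected[source]  # keep isolated tags as singleton groups
        for target, weight in targets.items():
            if weight < min_weight:
                continue
            undirected[source].add(target)
            undirected[target].add(source)

    seen, groups = set(), []
    for start in sorted(undirected):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(undirected[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    for group in weakly_connected_groups(build_graph(SYNONYM_EDGES)):
        print(group)
```

Raising `min_weight` splits weakly linked tags into their own groups, which hints at why edge weights matter when choosing a grouping algorithm.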

Citations: 23

On Mining Crowd-Based Speech Documentation

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901771

P. Moslehi, Bram Adams, J. Rilling

Despite the globalization of software development, relevant documentation of a project, such as requirements and design documents, is often still missing, incomplete, or outdated. However, parts of that documentation can be found outside the project, where it is fragmented across hundreds of textual web documents like blog posts, email messages, and forum posts, as well as multimedia documents such as screencasts and podcasts. Since dissecting and filtering multimedia information based on its relevance to a given project is an inherently difficult task, it is necessary to provide an automated approach for mining this crowd-based documentation. In this paper, we are interested in mining the speech part of YouTube screencasts, since this part typically contains the rationale and insights of a screencast. We introduce a methodology that transcribes the speech and analyzes the transcribed text using various Information Extraction (IE) techniques, and present a case study to illustrate the applicability of our mining methodology. In this case study, we extract use case scenarios from WordPress tutorial videos and show how their content can supplement existing documentation. We then evaluate how well existing rankings of video content are able to pinpoint the most relevant videos for a given scenario.

Citations: 10

Cold-Start Software Analytics

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901740

Jin Guo, Mona Rahimi, J. Cleland-Huang, A. Rasin, J. Hayes, Michael Vierhauser

Software project artifacts such as source code, requirements, and change logs represent a gold mine of actionable information. As a result, software analytic solutions have been developed to mine repositories and answer questions such as "who is the expert?", "which classes are fault prone?", or even "who are the domain experts for these fault-prone classes?" Analytics often require training and configuring in order to maximize performance within the context of each project. A cold-start problem exists when a function is applied within a project context without first configuring the analytic functions on project-specific data. This scenario exists because of the non-trivial effort necessary to instrument a project environment with candidate tools and algorithms and to empirically evaluate alternate configurations. We address the cold-start problem by comparatively evaluating "best-of-breed" and "profile-driven" solutions, both of which reuse known configurations in new project contexts. We describe and evaluate our approach against 20 project datasets for the three analytic areas of artifact connectivity, fault prediction, and finding the expert, and show that the best-of-breed approach outperformed the profile-driven approach in all three areas; however, while it delivered acceptable results for artifact connectivity and finding the expert, both techniques underperformed for cold-start fault prediction.

Citations: 25

Findings from GitHub: Methods, Datasets and Limitations

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901776

Valerio Cosentino, Javier Luis Cánovas Izquierdo, Jordi Cabot

GitHub, one of the most popular social coding platforms, is the platform of reference when mining open source repositories to learn from past experiences. In recent years, a number of research papers have been published reporting findings based on data mined from GitHub. As the community continues to deepen its understanding of software engineering thanks to the analysis performed on this platform, we believe it is worthwhile to reflect on how research papers have addressed the task of mining GitHub repositories over the last few years. In this regard, we present a meta-analysis of 93 research papers that addresses three main dimensions of those papers: i) the empirical methods employed, ii) the datasets they used, and iii) the limitations reported. Results of our meta-analysis show some concerns regarding the dataset collection process and size, the low level of replicability, poor sampling techniques, the lack of longitudinal studies, and the scarce variety of methodologies.

Citations: 84

Got Technical Debt? Surfacing Elusive Technical Debt in Issue Trackers

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901754

S. Bellomo, R. Nord, I. Ozkaya, Mary Popeck

Concretely communicating technical debt and its consequences is of common interest to both researchers and software engineers. In the absence of validated tools and techniques to achieve this goal with repeatable results, developers resort to ad hoc practices. Most commonly they report using issue trackers or their existing backlog management practices to capture and track technical debt. In a manual examination of 1,264 issues from four issue trackers from open source industry and government projects, we identified 109 examples of technical debt. Our study reveals that technical debt and its related concepts have entered the vernacular of developers as they discuss development tasks through issue trackers. Even when issues are not explicitly tagged as technical debt, it is possible to identify technical debt items in these issue trackers using a categorization method we developed. We use our results and data to motivate an improved definition and an approach to explicitly report technical debt in issue trackers.

Citations: 31

Examining Programmer Practices for Locally Handling Exceptions

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2903497

Mary Beth Kery, Claire Le Goues, B. Myers

Many have argued that the current try/catch mechanism for handling exceptions in Java is flawed. A major complaint is that programmers often write minimal and low-quality handlers. We used the Boa tool to examine a large number of Java projects on GitHub to provide empirical evidence about how programmers currently deal with exceptions. We found that programmers handle exceptions locally in catch blocks much of the time, rather than propagating by throwing an Exception. Programmers make heavy use of actions like Log, Print, Return, or Throw in catch blocks, and also frequently copy code between handlers. We found that bad practices like empty catch blocks or catching Exception are indeed widespread. We discuss evidence that programmers may misjudge risk when catching Exception, and face a tension between handlers that directly address local program statement failure and handlers that consider the program-wide implications of an exception. Some of these issues might be addressed by future tools that autocomplete more complete handlers.
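
To make the two bad practices named above concrete (empty catch blocks and catching the generic Exception type), here is a rough sketch that counts them in a snippet of Java source. It is an assumption-laden illustration, not the study's Boa-based analysis: it uses a simple regular expression, so it skips catch blocks whose bodies contain nested braces, and the sample Java code and function name are hypothetical.

```python
import re

# Matches `catch (Type name) { body }` where the body has no nested braces.
# A regex is a crude approximation; a real study would use a Java parser.
CATCH_RE = re.compile(
    r"catch\s*\(\s*(?:final\s+)?([\w.\s|]+?)\s+\w+\s*\)\s*\{([^{}]*)\}"
)

# Hypothetical Java snippet used only to exercise the sketch.
SAMPLE_JAVA = """
try { read(); } catch (IOException e) { log.warn("read failed", e); }
try { parse(); } catch (Exception e) { }
try { run(); } catch (IllegalStateException | IllegalArgumentException ex) { return; }
"""


def summarize_catch_blocks(java_source):
    """Count catch blocks, empty handlers, and handlers catching Exception."""
    total = empty = catches_exception = 0
    for match in CATCH_RE.finditer(java_source):
        # Multi-catch lists several types separated by `|`.
        caught_types = [name.strip() for name in match.group(1).split("|")]
        body = match.group(2)
        total += 1
        if not body.strip():
            empty += 1
        if "Exception" in caught_types or "java.lang.Exception" in caught_types:
            catches_exception += 1
    return {"total": total, "empty": empty, "catches_exception": catches_exception}


if __name__ == "__main__":
    print(summarize_catch_blocks(SAMPLE_JAVA))
```

In the sample, the second handler is both empty and generic, which is the combination the abstract flags as risky because it silently swallows every failure.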

Citations: 30

Externalization of Software Behavior by the Mining of Norms

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901744

Daniel Avery, K. Dam, Bastin Tony Roy Savarimuthu, A. Ghose

Open Source Software Development (OSSD) often suffers from conflicting views and actions due to the perceived flat and open ecology of an open source community. This often manifests itself as a lack of codified knowledge that is easily accessible to community members. How decisions are made and what is expected of a software system are often described in detail through the many forms of social communication that take place within a community. These social interactions form norms that are influential in dictating what behaviors are expected in a community and of the system. In this paper, we provide a tool that mines these social interactions (in the form of bug reports) and extracts norms of the system, externalizing this information into a codified form that lets others within the community become aware of the norms without having witnessed the social interactions.

Citations: 10

Characterization of the Xen Project Code Review Process: an Experience Report

Pub Date : 2016-05-14 DOI: 10.1145/2901739.2901778

Daniel Izquierdo-Cortazar, Lars Kurth, Jesus M. Gonzalez-Barahona, Santiago Dueñas, Nelson Sekitoleko

Many software development projects have introduced mandatory code review for every change to the code. This means that the project needs to devote a significant effort to review all proposed changes, and that their merging into the code base may get considerably delayed. Therefore, all those projects need to understand how code review is working, and the delays it is causing in time to merge. This is the case in the Xen project, which performs peer review using mailing lists. During the first half of 2015, some people in the project observed a large and sustained increase in the number of messages related to code review, which had started some years before. This observation led to concerns on whether the code review process was having some trouble, and too large an impact on the overall development process. Those concerns were addressed with a quantitative study, which is presented in this paper. Based on the information in code review messages, some metrics were defined to infer delays imposed by code review. The study produced quantitative data suitable for informed discussion, which the project is using to understand its code review process, and to take decisions to improve it.

Citations: 4
