Kaifeng Huang, Bihuan Chen, Xin Peng, Daihong Zhou, Y. Wang, Yang Liu, Wenyun Zhao
{"title":"ClDiff: Generating Concise Linked Code Differences","authors":"Kaifeng Huang, Bihuan Chen, Xin Peng, Daihong Zhou, Y. Wang, Yang Liu, Wenyun Zhao","doi":"10.1145/3238147.3238219","DOIUrl":null,"url":null,"abstract":"Analyzing and understanding source code changes is important in a variety of software maintenance tasks. To this end, many code differencing and code change summarization methods have been proposed. For some tasks (e.g. code review and software merging), however, those differencing methods generate too fine-grained a representation of code changes, and those summarization methods generate too coarse-grained a representation of code changes. Moreover, they do not consider the relationships among code changes. Therefore, the generated differences or summaries make it not easy to analyze and understand code changes in some software maintenance tasks. In this paper, we propose a code differencing approach, named ClDiff, to generate concise linked code differences whose granularity is in between the existing code differencing and code change summarization methods. The goal of ClDiff is to generate more easily understandable code differences. ClDiff takes source code files before and after changes as inputs, and consists of three steps. First, it pre-processes the source code files by pruning unchanged declarations from the parsed abstract syntax trees. Second, it generates concise code differences by grouping fine-grained code differences at or above the statement level and describing high-level changes in each group. Third, it links the related concise code differences according to five pre-defined links. Experiments with 12 Java projects (74,387 commits) and a human study with 10 participants have indicated the accuracy, conciseness, performance and usefulness of ClDiff.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"17 1","pages":"679-690"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3238147.3238219","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33
Abstract
Analyzing and understanding source code changes is important in a variety of software maintenance tasks. To this end, many code differencing and code change summarization methods have been proposed. For some tasks (e.g. code review and software merging), however, those differencing methods generate too fine-grained a representation of code changes, and those summarization methods generate too coarse-grained a representation of code changes. Moreover, they do not consider the relationships among code changes. Therefore, the generated differences or summaries make it not easy to analyze and understand code changes in some software maintenance tasks. In this paper, we propose a code differencing approach, named ClDiff, to generate concise linked code differences whose granularity is in between the existing code differencing and code change summarization methods. The goal of ClDiff is to generate more easily understandable code differences. ClDiff takes source code files before and after changes as inputs, and consists of three steps. First, it pre-processes the source code files by pruning unchanged declarations from the parsed abstract syntax trees. Second, it generates concise code differences by grouping fine-grained code differences at or above the statement level and describing high-level changes in each group. Third, it links the related concise code differences according to five pre-defined links. Experiments with 12 Java projects (74,387 commits) and a human study with 10 participants have indicated the accuracy, conciseness, performance and usefulness of ClDiff.