Understanding Code Change with Micro-Changes
Lei Chen, Michele Lanza, Shinpei Hayashi
arXiv - CS - Software Engineering, 2024-09-16
DOI: arxiv-2409.09923 (https://doi.org/arxiv-2409.09923)
Abstract
A crucial activity in software maintenance and evolution is comprehending
the changes developers make when they submit a pull request or commit to a
repository. Typically, code changes are represented in
the form of code diffs, textual representations highlighting the differences
between two file versions, depicting the added, removed, and changed lines.
Developers must interpret this simplistic representation and mentally
lift it to a higher abstraction level, one that more closely resembles natural
language descriptions and eases the creation of a mental model of the changes.
However, the textual diff-based representation is cumbersome, and the lifting
requires considerable domain knowledge and programming skills. We present an
approach, based on the concept of micro-change, to overcome these difficulties,
translating code diffs into a series of pre-defined change operations, which
can be described in natural language. We present a catalog of micro-changes,
together with an automated micro-change detector. To evaluate our approach, we
performed an empirical study on a large set of open-source repositories,
focusing on a subset of our micro-change catalog, namely those related to
changes affecting the conditional logic. We found that our detector is capable
of explaining more than 67% of the changes taking place in the systems under
study.
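
To make the idea of lifting a diff to a named change operation concrete, here is a minimal sketch for changes to conditional logic. It is an illustration under stated assumptions, not the detector described in the paper: it works on Python snippets through the standard `ast` module, and the function name `classify_condition_change` and the labels it returns ("conjunct added", "condition negated", and so on) are hypothetical, chosen for this example rather than taken from the authors' catalog.

```python
import ast


def _condition(src: str) -> ast.expr:
    """Parse a snippet whose first statement is an `if` and return its test."""
    node = ast.parse(src).body[0]
    if not isinstance(node, ast.If):
        raise ValueError("expected an if statement")
    return node.test


def _operands(expr: ast.expr, op_type: type) -> list[str]:
    """Split `a and b and c` (or the `or` equivalent) into dumped operands."""
    if isinstance(expr, ast.BoolOp) and isinstance(expr.op, op_type):
        return [ast.dump(value) for value in expr.values]
    return [ast.dump(expr)]


def _is_negation_of(outer: ast.expr, inner: ast.expr) -> bool:
    """True when `outer` is exactly `not inner`."""
    return (
        isinstance(outer, ast.UnaryOp)
        and isinstance(outer.op, ast.Not)
        and ast.dump(outer.operand) == ast.dump(inner)
    )


def classify_condition_change(before: str, after: str) -> str:
    """Return a hypothetical natural-language label for a condition change."""
    old, new = _condition(before), _condition(after)
    if ast.dump(old) == ast.dump(new):
        return "condition unchanged"
    if _is_negation_of(new, old) or _is_negation_of(old, new):
        return "condition negated"
    for op_type, label in ((ast.And, "conjunct"), (ast.Or, "disjunct")):
        old_parts = _operands(old, op_type)
        new_parts = _operands(new, op_type)
        # All old operands survive and at least one new operand appears.
        if len(new_parts) > len(old_parts) and all(p in new_parts for p in old_parts):
            return f"{label} added"
        # All remaining operands were already present before the change.
        if len(old_parts) > len(new_parts) and all(p in old_parts for p in new_parts):
            return f"{label} removed"
    return "unclassified"


if __name__ == "__main__":
    print(classify_condition_change("if x > 0: pass", "if x > 0 and y: pass"))
    print(classify_condition_change("if ready: pass", "if not ready: pass"))
```

The sketch compares syntax trees rather than text lines, so a change that only reformats the condition is reported as "condition unchanged". Anything outside the handful of patterns it knows falls through to "unclassified", which mirrors the notion that a detector explains only a share of the observed changes.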