Automatic Data-Driven Software Change Identification via Code Representation Learning
Tjaša Heričko
Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering, published 2023-06-14
DOI: 10.1145/3593434.3593505 (https://doi.org/10.1145/3593434.3593505)
Citations: 1
Abstract
Changes to a software project are inevitable, as software requires continuous adaptation, improvement, and correction throughout maintenance. Identifying the purpose and impact of changes made to the codebase is critical in software engineering. However, manually identifying and characterizing software changes can be time-consuming and tedious, adding to the workload of software engineers. To address this challenge, several attempts have been made to automatically identify and demystify the intents of software changes based on software artifacts such as commit change logs, issue reports, change messages, source code files, and software documentation. These existing approaches have limitations, including a lack of data, limited performance, and an inability to evaluate compound changes. This paper presents a doctoral research proposal that aims to automate the identification of commit-level changes in software projects using software repository mining and code representation learning models. The research background, state of the art, research objectives, research agenda, and threats to validity are discussed.