Automated description generation for software patches

IF 3.8 | Zone 2 (Computer Science) | JCR Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS | Information and Software Technology | Pub Date: 2024-07-29 | DOI: 10.1016/j.infsof.2024.107543
Thanh Trong Vu, Tuan-Dung Bui, Thanh-Dat Do, Thu-Trang Nguyen, Hieu Dinh Vo, Son Nguyen
Information and Software Technology, volume 177, Article 107543. Full text: https://www.sciencedirect.com/science/article/pii/S0950584924001484
Citation count: 0

Abstract

Software patches are pivotal in refining and evolving codebases: they fix bugs, close vulnerabilities, and deliver optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, writing descriptions manually is time-consuming, and the results vary in quality and detail. In this paper, we propose PatchExplainer, an approach that addresses these challenges by framing patch description generation as a machine translation task. In PatchExplainer, we leverage explicit representations of critical elements, historical context, and syntactic conventions. Moreover, the translation model in PatchExplainer is designed with an awareness of description similarity. In particular, the model is explicitly trained to recognize and incorporate similarities among patch descriptions that have been clustered into groups, improving its ability to generate accurate and consistent descriptions across similar patches. Training has two objectives: maximizing similarity and accurately predicting the group to which a description belongs. Our experimental results on a large dataset of real-world software patches show that PatchExplainer consistently outperforms existing methods, with improvements of up to 189% in BLEU, 5.7x in Exact Match rate, and 154% in Semantic Similarity, affirming its effectiveness in generating software patch descriptions.
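To make the reported figures easier to read, the following is a minimal sketch of how an Exact Match rate and a relative improvement could be computed. The abstract does not define the metrics precisely, so the normalization used here (lowercasing and collapsing whitespace) and the function names `exact_match_rate`, `relative_improvement`, and `improvement_factor` are assumptions for illustration, not the authors' evaluation code.

```python
def exact_match_rate(generated, references):
    """Fraction of generated descriptions equal to their reference.

    Assumption: texts are compared after lowercasing and collapsing
    whitespace; the paper may normalize differently or not at all.
    """
    if len(generated) != len(references):
        raise ValueError("generated and references must have the same length")
    if not references:
        return 0.0

    def normalize(text):
        return " ".join(text.lower().split())

    hits = sum(normalize(g) == normalize(r) for g, r in zip(generated, references))
    return hits / len(references)


def relative_improvement(new, baseline):
    """Relative gain in percent: 100 * (new - baseline) / baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (new - baseline) / baseline


def improvement_factor(new, baseline):
    """Multiplicative gain, e.g. 5.7 means the new score is 5.7 times the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return new / baseline
```

Under these definitions, a 189% improvement means the new score is 2.89 times the baseline, whereas a 5.7x improvement means it is 5.7 times the baseline, so the two kinds of figures in the abstract are not on the same scale.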

Source journal
Information and Software Technology (Engineering & Technology; Computer Science: Software Engineering)
CiteScore: 9.10
Self-citation rate: 7.70%
Articles published: 164
Review time: 9.6 weeks

Journal description: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development.

Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results, and much more. Read the Guide for Authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.