Deep Semantic Feature Learning with Embedded Static Metrics for Software Defect Prediction

Guisheng Fan, Xuyang Diao, Huiqun Yu, Kang Yang, Liqiong Chen
Published in: 2019 26th Asia-Pacific Software Engineering Conference (APSEC), 2019-12-01
DOI: 10.1109/APSEC48747.2019.00041 (https://doi.org/10.1109/APSEC48747.2019.00041)
Citations: 17

Abstract

Software defect prediction, which locates defective code snippets, can help developers find potential bugs and allocate their testing effort. Traditional defect prediction features are static code metrics, which contain only statistical information about programs and fail to capture program semantics, which degrades defect prediction performance. To take full advantage of both the semantics and the static metrics of programs, we propose a framework called Defect Prediction via Attention Mechanism (DP-AM) in this paper. Specifically, DP-AM first extracts token vectors from the abstract syntax trees (ASTs) of programs and encodes them as numerical vectors through mapping and word embedding. It then feeds these numerical vectors into a Recurrent Neural Network to automatically learn semantic features of programs. After that, it applies a self-attention mechanism to further build relationships among these features. Furthermore, it employs a global attention mechanism to select the most significant features among them. Finally, we combine these semantic features with traditional static metrics for accurate software defect prediction. We evaluate our method in terms of F1-measure on seven open-source Apache Java projects. Our experimental results show that DP-AM improves F1-measure by 11% on average compared with state-of-the-art methods.
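The pipeline described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the abstract does not specify the RNN variant, the attention scoring functions, the layer sizes, or how the semantic features and static metrics are combined. The sketch below assumes a plain tanh RNN, scaled dot-product self-attention, a single learned context vector for global attention, and simple concatenation before a logistic output. All function names, AST token names, metric values, and dimensions are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_vocab(token_seqs):
    """Mapping step: give each distinct AST token an integer id (0 = unknown)."""
    vocab = {}
    for seq in token_seqs:
        for tok in seq:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab

def encode(seq, vocab):
    """Turn a token sequence into an array of integer ids."""
    return np.array([vocab.get(tok, 0) for tok in seq])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def rnn(x, W_x, W_h, b):
    """Plain tanh RNN (assumed variant); returns all hidden states, shape (T, H)."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.stack(states)

def self_attention(H):
    """Scaled dot-product self-attention relating every time step to every other."""
    scores = H @ H.T / np.sqrt(H.shape[1])
    return softmax(scores, axis=-1) @ H

def global_attention(H, w):
    """Weight each time step by its score against context vector w, then sum."""
    alpha = softmax(H @ w)
    return alpha @ H

def init_params(vocab_size, emb_dim, hidden, n_metrics):
    """Random, untrained weights; sizes are illustrative only."""
    def w(*shape):
        return rng.normal(scale=0.1, size=shape)
    return {
        "E": w(vocab_size, emb_dim),      # word-embedding table
        "W_x": w(emb_dim, hidden),
        "W_h": w(hidden, hidden),
        "b": np.zeros(hidden),
        "w": w(hidden),                   # global-attention context vector
        "w_out": w(hidden + n_metrics),   # logistic output layer
        "b_out": 0.0,
    }

def predict(token_ids, metrics, p):
    """Return a defect probability for one file."""
    x = p["E"][token_ids]                        # embedding lookup, (T, D)
    H = rnn(x, p["W_x"], p["W_h"], p["b"])       # semantic features, (T, H)
    H = self_attention(H)                        # relate features to each other
    s = global_attention(H, p["w"])              # pooled semantic vector, (H,)
    z = np.concatenate([s, metrics])             # assumed combination: concatenation
    return 1.0 / (1.0 + np.exp(-(z @ p["w_out"] + p["b_out"])))

# Hypothetical AST token sequences for two source files.
seqs = [
    ["MethodDeclaration", "IfStatement", "MethodInvocation"],
    ["MethodDeclaration", "ForStatement"],
]
vocab = build_vocab(seqs)
params = init_params(len(vocab) + 1, emb_dim=8, hidden=16, n_metrics=3)
# Three placeholder static-metric values, assumed already normalized.
prob = predict(encode(seqs[0], vocab), np.array([0.4, 0.1, 0.7]), params)
print(f"defect probability (untrained): {prob:.3f}")
```

Because the weights are untrained, the printed probability carries no meaning; the sketch only shows how data flows from AST tokens through embedding, the recurrent layer, and the two attention stages to a prediction that also sees the static metrics.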