An Explanation Method for Models of Code

Proceedings of the ACM on Programming Languages (IF 2.2; JCR Q2, Computer Science, Software Engineering) Pub Date: 2023-10-16 DOI: 10.1145/3622826
Yu Wang, Ke Wang, Linzhang Wang

Abstract

This paper introduces a novel method, called WheaCha, for explaining the predictions of code models. Like attribution methods, WheaCha seeks to identify the input features responsible for a particular model prediction, but it differs from them in crucial ways. Specifically, for any given prediction, WheaCha separates an input program into "wheat" (i.e., the defining features that are the reason the model predicts the label it predicts) and the remaining "chaff". We realize WheaCha in a tool, HuoYan, and use it to explain four prominent code models: code2vec, seq-GNN, GGNN, and CodeBERT. Results show that (1) HuoYan is efficient, taking on average under twenty seconds to compute the wheat for an input program end to end (i.e., including model prediction time); (2) the wheat that all models use to make predictions consists predominantly of simple syntactic or even lexical properties (i.e., identifier names); (3) neither the latest explainability methods for code models (i.e., SIVAND and CounterFactual Explanations) nor the most noteworthy attribution methods (i.e., Integrated Gradients and SHAP) can precisely capture wheat. Finally, to demonstrate the usefulness of WheaCha, we assess whether its explanations can help end users identify defective code models (e.g., models trained on mislabeled data or models that learned spurious correlations from biased data). We find that, with WheaCha, users achieve far higher accuracy in identifying faulty models than with SIVAND, CounterFactual Explanations, Integrated Gradients, and SHAP.
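
To make the wheat/chaff split concrete, here is a minimal Python sketch. It is not the HuoYan algorithm, which the abstract does not describe. It assumes a simplified reading in which "wheat" is a small subset of input tokens that, on its own, still makes the model predict the original label. It finds such a subset by greedy one-at-a-time deletion, a swapped-in reduction technique. The names `toy_model` and `split_wheat_chaff`, the token-level features, and the keyword rule inside the toy model are all hypothetical stand-ins for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# A "model" here is any function from a token sequence to a predicted label.
Model = Callable[[Sequence[str]], str]


def toy_model(tokens: Sequence[str]) -> str:
    """Hypothetical stand-in for a code model (NOT code2vec, GGNN, etc.).

    It predicts the label "sort" whenever a telltale identifier appears,
    mimicking a model that relies on lexical cues.
    """
    if "sort" in tokens or "swap" in tokens:
        return "sort"
    return "other"


def split_wheat_chaff(
    tokens: Sequence[str], model: Model
) -> Tuple[str, List[str], List[str]]:
    """Greedy reduction (an assumption, not the paper's procedure).

    Drops tokens one at a time as long as the model's prediction on the
    remaining tokens is unchanged. What survives is treated as "wheat";
    everything else is "chaff".
    """
    label = model(tokens)
    kept = list(range(len(tokens)))  # indices of tokens still kept
    i = 0
    while i < len(kept):
        candidate = kept[:i] + kept[i + 1:]
        if model([tokens[j] for j in candidate]) == label:
            kept = candidate  # token i was not needed for this prediction
        else:
            i += 1  # token i is needed; keep it and move on
    kept_set = set(kept)
    wheat = [tokens[j] for j in kept]
    chaff = [tokens[j] for j in range(len(tokens)) if j not in kept_set]
    return label, wheat, chaff


tokens = "void f ( int [] a ) { swap ( a , 0 , 1 ) ; }".split()
label, wheat, chaff = split_wheat_chaff(tokens, toy_model)
print(label, wheat)
```

On this toy input the only token kept as wheat is the identifier `swap`, which loosely mirrors finding (2) about lexical properties. A real code model would need program-level features and a far more careful search than this sketch.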
Source journal
Proceedings of the ACM on Programming Languages (category: Engineering - Safety, Risk, Reliability and Quality)
CiteScore: 5.20
Self-citation rate: 22.20%
Articles published: 192