{"title":"An Explanation Method for Models of Code","authors":"Yu Wang, Ke Wang, Linzhang Wang","doi":"10.1145/3622826","DOIUrl":null,"url":null,"abstract":"This paper introduces a novel method, called WheaCha, for explaining the predictions of code models. Similar to attribution methods, WheaCha seeks to identify input features that are responsible for a particular prediction that models make. On the other hand, it differs from attribution methods in crucial ways. Specifically, WheaCha separates an input program into \"wheat\" (i.e., defining features that are the reason for which models predict the label that they predict) and the rest \"chaff\" for any given prediction. We realize WheaCha in a tool, HuoYan, and use it to explain four prominent code models: code2vec, seq-GNN, GGNN, and CodeBERT. Results show that (1) HuoYan is efficient — taking on average under twenty seconds to compute wheat for an input program in an end-to-end fashion (i.e., including model prediction time); (2) the wheat that all models use to make predictions is predominantly comprised of simple syntactic or even lexical properties (i.e., identifier names); (3) neither the latest explainability methods for code models (i.e., SIVAND and CounterFactual Explanations) nor the most noteworthy attribution methods (i.e., Integrated Gradients and SHAP) can precisely capture wheat. Finally, we set out to demonstrate the usefulness of WheaCha, in particular, we assess if WheaCha’s explanations can help end users to identify defective code models (e.g., trained on mislabeled data or learned spurious correlations from biased data). We find that, with WheaCha, users achieve far higher accuracy in identifying faulty models than SIVAND, CounterFactual Explanations, Integrated Gradients and SHAP.","PeriodicalId":20697,"journal":{"name":"Proceedings of the ACM on Programming Languages","volume":"14 2-3 1","pages":"0"},"PeriodicalIF":2.2000,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM on Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3622826","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
This paper introduces a novel method, called WheaCha, for explaining the predictions of code models. Similar to attribution methods, WheaCha seeks to identify input features that are responsible for a particular prediction that models make. On the other hand, it differs from attribution methods in crucial ways. Specifically, WheaCha separates an input program into "wheat" (i.e., defining features that are the reason for which models predict the label that they predict) and the rest "chaff" for any given prediction. We realize WheaCha in a tool, HuoYan, and use it to explain four prominent code models: code2vec, seq-GNN, GGNN, and CodeBERT. Results show that (1) HuoYan is efficient — taking on average under twenty seconds to compute wheat for an input program in an end-to-end fashion (i.e., including model prediction time); (2) the wheat that all models use to make predictions is predominantly comprised of simple syntactic or even lexical properties (i.e., identifier names); (3) neither the latest explainability methods for code models (i.e., SIVAND and CounterFactual Explanations) nor the most noteworthy attribution methods (i.e., Integrated Gradients and SHAP) can precisely capture wheat. Finally, we set out to demonstrate the usefulness of WheaCha, in particular, we assess if WheaCha’s explanations can help end users to identify defective code models (e.g., trained on mislabeled data or learned spurious correlations from biased data). We find that, with WheaCha, users achieve far higher accuracy in identifying faulty models than SIVAND, CounterFactual Explanations, Integrated Gradients and SHAP.