Analyzing and Improving the Robustness of Tabular Classifiers using Counterfactual Explanations

P. Rasouli, Ingrid Chieh Yu
{"title":"Analyzing and Improving the Robustness of Tabular Classifiers using Counterfactual Explanations","authors":"P. Rasouli, Ingrid Chieh Yu","doi":"10.1109/ICMLA52953.2021.00209","DOIUrl":null,"url":null,"abstract":"Recent studies have revealed that Machine Learning (ML) models are vulnerable to adversarial perturbations. Such perturbations can be intentionally or accidentally added to the original inputs, evading the classifier’s behavior to misclassify the crafted samples. A widely-used solution is to retrain the model using data points generated by various attack strategies. However, this creates a classifier robust to some particular evasions and can not defend unknown or universal perturbations. Counterfactual explanations are a specific class of post-hoc explanation methods that provide minimal modification to the input features in order to obtain a particular outcome from the model. In addition to the resemblance of counterfactual explanations to the universal perturbations, the possibility of generating instances from specific classes makes such approaches suitable for analyzing and improving the model’s robustness. Rather than explaining the model’s decisions in the deployment phase, we utilize the distance information obtained from counterfactuals and propose novel metrics to analyze the robustness of tabular classifiers. Further, we introduce a decision boundary modification approach using customized counterfactual data points to improve the robustness of the models without compromising their accuracy. Our framework addresses the robustness of black-box classifiers in the tabular setting, which is considered an under-explored research area. Through several experiments and evaluations, we demonstrate the efficacy of our approach in analyzing and improving the robustness of black-box tabular classifiers.","PeriodicalId":6750,"journal":{"name":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"8 1","pages":"1286-1293"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA52953.2021.00209","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Recent studies have revealed that Machine Learning (ML) models are vulnerable to adversarial perturbations. Such perturbations can be added to the original inputs, intentionally or accidentally, causing the classifier to misclassify the crafted samples. A widely used solution is to retrain the model on data points generated by various attack strategies. However, this yields a classifier that is robust only to those particular evasions and cannot defend against unknown or universal perturbations. Counterfactual explanations are a specific class of post-hoc explanation methods that apply a minimal modification to the input features in order to obtain a particular outcome from the model. Beyond their resemblance to universal perturbations, the ability of counterfactual methods to generate instances of specific classes makes them suitable for analyzing and improving a model's robustness. Rather than explaining the model's decisions in the deployment phase, we utilize the distance information obtained from counterfactuals and propose novel metrics to analyze the robustness of tabular classifiers. Further, we introduce a decision-boundary modification approach that uses customized counterfactual data points to improve the robustness of models without compromising their accuracy. Our framework addresses the robustness of black-box classifiers in the tabular setting, which is considered an under-explored research area. Through several experiments and evaluations, we demonstrate the efficacy of our approach in analyzing and improving the robustness of black-box tabular classifiers.
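To make the two ideas in the abstract concrete, the following is a minimal, hypothetical Python sketch, not the authors' implementation: `counterfactual_search`, `robustness_score`, and all parameters are illustrative stand-ins. It scores robustness as the mean distance from samples to their nearest found counterfactuals, then retrains the classifier on counterfactual points labelled with their originating class so that the decision boundary is pushed farther from the data.

```python
# Hypothetical sketch of the abstract's two ideas; not the paper's method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def counterfactual_search(model, x, target_class, step=0.05, max_iter=400, seed=0):
    """Random-search stand-in for a counterfactual generator: find a
    minimally perturbed copy of `x` that the black-box model assigns
    to `target_class`. Returns None if nothing is found."""
    rng = np.random.default_rng(seed)
    best, radius = None, step
    for _ in range(max_iter):
        candidate = x + rng.normal(scale=radius, size=x.shape)
        if model.predict(candidate.reshape(1, -1))[0] == target_class:
            if best is None or np.linalg.norm(candidate - x) < np.linalg.norm(best - x):
                best = candidate
        else:
            radius *= 1.02  # widen the search while nothing is found
    return best


def robustness_score(model, X, y):
    """Mean L2 distance from each sample to its nearest found counterfactual.
    A larger score suggests the boundary sits farther from the data, i.e.
    small perturbations are less likely to flip predictions."""
    dists = [np.linalg.norm(cf - x)
             for x, label in zip(X, y)
             if (cf := counterfactual_search(model, x, 1 - label)) is not None]
    return float(np.mean(dists)) if dists else float("inf")


# Toy tabular data and a black-box classifier.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("robustness before:", robustness_score(clf, X_te[:20], y_te[:20]))

# Boundary modification: generate counterfactuals for training points but
# keep the ORIGINAL labels, so retraining pushes the boundary beyond them.
cf_pts, cf_lbls = [], []
for x, label in zip(X_tr[:100], y_tr[:100]):
    cf = counterfactual_search(clf, x, 1 - label)
    if cf is not None:
        cf_pts.append(cf)
        cf_lbls.append(label)

clf_robust = RandomForestClassifier(random_state=0).fit(
    np.vstack([X_tr, cf_pts]), np.concatenate([y_tr, cf_lbls]))
print("accuracy after  :", clf_robust.score(X_te, y_te))
print("robustness after:", robustness_score(clf_robust, X_te[:20], y_te[:20]))
```

The key design choice in this sketch is labelling each counterfactual with the class of the point it was generated from: retraining then treats those near-boundary points as in-class, which moves the boundary outward without relying on any specific attack strategy. The paper's actual counterfactual generator, metrics, and boundary-modification procedure are more elaborate than this random-search stand-in.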