Interpreting Deep Neural Networks through Prototype Factorization

Subhajit Das, Panpan Xu, Zeng Dai, A. Endert, Liu Ren
{"title":"Interpreting Deep Neural Networks through Prototype Factorization","authors":"Subhajit Das, Panpan Xu, Zeng Dai, A. Endert, Liu Ren","doi":"10.1109/ICDMW51313.2020.00068","DOIUrl":null,"url":null,"abstract":"Typical deep neural networks (DNNs) are complex black-box models and their decision making process can be difficult to comprehend even for experienced machine learning practitioners. Therefore their use could be limited in mission-critical scenarios despite state-of-the-art performance on many challenging ML tasks. Through this work, we empower users to interpret DNNs with a post-hoc analysis protocol. We propose ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer in a pre-trained DNN as a collection of weighted prototypes, which are a small number of exemplars extracted from the original data (e.g. image patches, shapelets). Using the factorized weights and prototypes we build a surrogate model for interpretation by replacing the corresponding layer in the neural network. We identify a number of desired properties of ProtoFac including authenticity, interpretability, simplicity and propose the optimization objective and training procedure accordingly. The method is model-agnostic and can be applied to DNNs with varying architectures. It goes beyond per-sample feature-based explanation by providing prototypes as a condensed set of evidences used by the model for decision making. We applied ProtoFac to interpret pretrained DNNs for a variety of ML tasks including time series classification on electrocardiograms, and image classification. The result shows that ProtoFac is able to extract meaningful prototypes to explain the models' decisions while truthfully reflects the models' operation. We also evaluated human interpretability through Amazon Mechanical Turk (MTurk), showing that ProtoFac is able to produce interpretable and user-friendly explanations.","PeriodicalId":426846,"journal":{"name":"2020 International Conference on Data Mining Workshops (ICDMW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW51313.2020.00068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Typical deep neural networks (DNNs) are complex black-box models, and their decision-making process can be difficult to comprehend even for experienced machine learning practitioners. Their use may therefore be limited in mission-critical scenarios despite state-of-the-art performance on many challenging ML tasks. Through this work, we empower users to interpret DNNs with a post-hoc analysis protocol. We propose ProtoFac, an explainable matrix factorization technique that decomposes the latent representations at any selected layer in a pre-trained DNN into a collection of weighted prototypes, which are a small number of exemplars extracted from the original data (e.g., image patches, shapelets). Using the factorized weights and prototypes, we build a surrogate model for interpretation by replacing the corresponding layer in the neural network. We identify a number of desired properties of ProtoFac, including authenticity, interpretability, and simplicity, and propose the optimization objective and training procedure accordingly. The method is model-agnostic and can be applied to DNNs with varying architectures. It goes beyond per-sample feature-based explanation by providing prototypes as a condensed set of evidence used by the model for decision making. We applied ProtoFac to interpret pretrained DNNs on a variety of ML tasks, including time-series classification on electrocardiograms and image classification. The results show that ProtoFac extracts meaningful prototypes that explain the models' decisions while truthfully reflecting the models' operation. We also evaluated human interpretability through Amazon Mechanical Turk (MTurk), showing that ProtoFac produces interpretable and user-friendly explanations.
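The core operation described above (approximating a layer's latent matrix A by non-negative weights W over prototypes H drawn from real exemplars) can be written as: minimize ||A - W H||_F^2 subject to W >= 0 and each row of H equal to the latent vector of an actual training sample, which is the "authenticity" property. The sketch below is a minimal, hypothetical illustration of that idea in plain NumPy, not the authors' implementation: the function names, the projected least-squares weight solver, and the greedy prototype-refresh heuristic are all assumptions, and the paper's full objective adds interpretability and simplicity terms omitted here.

```python
import numpy as np

def factorize_with_prototypes(A, k, n_iter=50, seed=0):
    """Approximate A (n x d matrix of latent vectors) as W @ H, where the k
    rows of H are constrained to be actual rows of A, i.e. prototypes are
    real exemplars rather than free parameters (authenticity).
    Hypothetical simplification: ProtoFac's published objective also
    regularizes for interpretability and simplicity."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    proto_idx = rng.choice(n, size=k, replace=False)  # initial prototype picks
    for _ in range(n_iter):
        H = A[proto_idx]  # prototypes stay tied to real latent vectors
        # Non-negative weights via projected least squares (a simplification).
        W = np.clip(A @ H.T @ np.linalg.pinv(H @ H.T), 0.0, None)
        # Greedy refresh heuristic (an assumption, not the paper's optimizer):
        # swap the least-used prototype for the worst-reconstructed sample.
        errs = ((A - W @ H) ** 2).sum(axis=1)
        proto_idx[np.argmin(W.sum(axis=0))] = np.argmax(errs)
    return W, A[proto_idx], proto_idx
```

Building the surrogate model then amounts to projecting a new input's latent vector onto the prototypes and feeding the reconstruction to the rest of the network; the projection weights double as the explanation, since each coordinate measures how much one real exemplar contributed to the prediction:

```python
def surrogate_forward(z, H, head):
    """Hypothetical surrogate pass: `z` is the chosen layer's output for one
    input, `H` the prototype matrix, `head` the remainder of the network
    (any callable). Returns the prediction and the per-prototype weights."""
    w = np.clip(z @ H.T @ np.linalg.pinv(H @ H.T), 0.0, None)
    return head(w @ H), w
```

Because `proto_idx` points back at concrete training samples, each weight can be shown alongside the exemplar it refers to, e.g. an image patch or an ECG shapelet, which is what makes the explanation human-readable.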