从不透明模型中提取规则——一个稍微不同的视角

U. Johansson, Tuwe Löfström, Rikard König, Cecilia Sönströd, L. Niklasson
{"title":"从不透明模型中提取规则——一个稍微不同的视角","authors":"U. Johansson, Tuwe Löfström, Rikard König, Cecilia Sönströd, L. Niklasson","doi":"10.1109/ICMLA.2006.46","DOIUrl":null,"url":null,"abstract":"When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques directly producing transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. In order to reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data; i.e. test set instances, together with corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm, using training data, and standard decision tree algorithms. In addition, the same rules were also significantly more compact; thus providing better comprehensibility. The overall implication is that rules extracted in this fashion explain the predictions made on novel data better than rules extracted in the standard way; i.e. using training data only","PeriodicalId":297071,"journal":{"name":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Rule Extraction from Opaque Models-- A Slightly Different Perspective\",\"authors\":\"U. Johansson, Tuwe Löfström, Rikard König, Cecilia Sönströd, L. Niklasson\",\"doi\":\"10.1109/ICMLA.2006.46\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques directly producing transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. In order to reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data; i.e. test set instances, together with corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm, using training data, and standard decision tree algorithms. In addition, the same rules were also significantly more compact; thus providing better comprehensibility. The overall implication is that rules extracted in this fashion explain the predictions made on novel data better than rules extracted in the standard way; i.e. using training data only\",\"PeriodicalId\":297071,\"journal\":{\"name\":\"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-12-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2006.46\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 5th International Conference on Machine Learning and Applications (ICMLA'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2006.46","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 18

摘要

在进行预测建模时,关键的标准总是准确性。考虑到这一点,通常使用复杂的技术,如神经网络或集成,导致不透明的模型无法解释。当模型需要易于理解时,使用更简单的技术直接生成透明模型往往会牺牲准确性;这种权衡被称为准确性与可理解性的权衡。为了减少这种权衡,可以将不透明模型转换为另一种可解释的模型;称为规则提取的活动。本文认为规则提取算法应该从使用oracle数据中获益;即测试集实例,以及来自不透明模型的相应预测。使用17个公开可用数据集的实验清楚地表明,仅使用oracle数据提取的规则明显比使用相同算法(使用训练数据和标准决策树算法)提取的规则更准确。此外,同样的规则也明显更加紧凑;从而提供更好的可理解性。总的含义是,以这种方式提取的规则比以标准方式提取的规则更能解释对新数据做出的预测;即只使用训练数据
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Rule Extraction from Opaque Models-- A Slightly Different Perspective
When performing predictive modeling, the key criterion is always accuracy. With this in mind, complex techniques like neural networks or ensembles are normally used, resulting in opaque models impossible to interpret. When models need to be comprehensible, accuracy is often sacrificed by using simpler techniques directly producing transparent models; a tradeoff termed the accuracy vs. comprehensibility tradeoff. In order to reduce this tradeoff, the opaque model can be transformed into another, interpretable, model; an activity termed rule extraction. In this paper, it is argued that rule extraction algorithms should gain from using oracle data; i.e. test set instances, together with corresponding predictions from the opaque model. The experiments, using 17 publicly available data sets, clearly show that rules extracted using only oracle data were significantly more accurate than both rules extracted by the same algorithm, using training data, and standard decision tree algorithms. In addition, the same rules were also significantly more compact; thus providing better comprehensibility. The overall implication is that rules extracted in this fashion explain the predictions made on novel data better than rules extracted in the standard way; i.e. using training data only
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
An Efficient Heuristic for Discovering Multiple Ill-Defined Attributes in Datasets Robust Model Selection Using Cross Validation: A Simple Iterative Technique for Developing Robust Gene Signatures in Biomedical Genomics Applications Detecting Web Content Function Using Generalized Hidden Markov Model Naive Bayes Classification Given Probability Estimation Trees A New Machine Learning Technique Based on Straight Line Segments
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1