{"title":"Why the Naive Bayes approximation is not as Naive as it appears","authors":"C. Stephens, Hugo Flores, Ana Ruiz Linares","doi":"10.1109/IISA.2015.7388083","DOIUrl":null,"url":null,"abstract":"The Naive Bayes approximation and associated classifier is widely used in machine learning and data mining and offers very robust performance across a large spectrum of problem domains. As it depends on a very strong assumption - independence among features - this has been somewhat puzzling. Various hypotheses have been put forward to explain its success and moreover many generalizations have been proposed. In this paper we propose a set of \"local\" error measures - associated with the likelihood functions for particular subsets of attributes and for each class - and show explicitly how these local errors combine to give a \"global\" error associated to the full attribute set. By so doing we formulate a framework within which the phenomenon of error cancelation, or augmentation, can be quantitatively evaluated and its impact on classifier performance estimated and predicted a priori. These diagnostics also allow us to develop a deeper and more quantitative understanding of why the Naive Bayes approximation is so robust and under what circumstances one expects it to break down.","PeriodicalId":433872,"journal":{"name":"2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IISA.2015.7388083","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
The Naive Bayes approximation and its associated classifier are widely used in machine learning and data mining, offering robust performance across a large spectrum of problem domains. Because they depend on a very strong assumption - independence among features - this success has been somewhat puzzling. Various hypotheses have been put forward to explain it, and many generalizations have been proposed. In this paper we propose a set of "local" error measures - associated with the likelihood functions for particular subsets of attributes and for each class - and show explicitly how these local errors combine to give a "global" error associated with the full attribute set. In so doing we formulate a framework within which the phenomenon of error cancellation, or augmentation, can be quantitatively evaluated and its impact on classifier performance estimated and predicted a priori. These diagnostics also allow us to develop a deeper and more quantitative understanding of why the Naive Bayes approximation is so robust and under what circumstances one expects it to break down.
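To make the local/global error idea concrete, here is a minimal sketch in Python of one natural reading of the abstract: for each class c and attribute subset S, a "local" error can be taken as the gap between the empirical class-conditional likelihood P(x_S | c) and the Naive Bayes product of single-attribute marginals, measured in log space. The global error over the full attribute set then decomposes exactly into the within-group local errors plus a residual between-group term. The toy dataset, function names, and grouping below are illustrative assumptions, not the authors' code or definitions.

```python
import math

# Toy dataset: rows of ((x0, x1, x2), class) with binary attributes.
rows = [
    ((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 0), 0), ((0, 0, 1), 0),
    ((1, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 0), 1), ((0, 1, 1), 1),
]

def cond_prob(subset, values, c):
    """Empirical P(x_subset = values | class = c) from simple counts."""
    in_class = [x for x, y in rows if y == c]
    hits = [x for x in in_class
            if all(x[i] == v for i, v in zip(subset, values))]
    return len(hits) / len(in_class)

def local_error(x, c, group):
    """log P(x_group | c) - sum_{i in group} log P(x_i | c)."""
    joint = cond_prob(group, [x[i] for i in group], c)
    prod = 1.0
    for i in group:
        prod *= cond_prob((i,), (x[i],), c)
    return math.log(joint) - math.log(prod)

x, c = (0, 0, 1), 0
all_attrs = (0, 1, 2)
groups = [(0, 1), (2,)]   # an illustrative partition of the attribute set

# Global error over the full set, then its exact decomposition:
# global = (between-group term) + (sum of within-group local errors).
global_err = local_error(x, c, all_attrs)
within = sum(local_error(x, c, g) for g in groups)
between = global_err - within
print(f"global {global_err:+.4f} = within {within:+.4f} + between {between:+.4f}")
```

In this sketch a positive local error means the attributes in a subset are positively associated given the class, a negative one that they are negatively associated; local errors of opposite sign can cancel in the global sum, which is the cancellation (or augmentation) phenomenon the paper quantifies.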