Taxonomic Dimensionality Reduction in Bayesian Text Classification

2012 11th International Conference on Machine Learning and Applications. Pub Date: 2012-12-12. DOI: 10.1109/ICMLA.2012.93

Richard A. McAllister, John W. Sheppard

Abstract

Lexical abstraction hierarchies can be leveraged to provide semantic information that characterizes features of text corpora as a whole. This information may be used to determine the classification utility of the dimensions that describe a dataset. This paper presents a new method for preparing a dataset for probabilistic classification by determining, a priori, the utility of a very small subset of taxonomically-related dimensions via a Discriminative Multinomial Naive Bayes process. We show that this method yields significant improvements over both Discriminative Multinomial Naive Bayes and Bayesian network classifiers alone.
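The general idea of collapsing taxonomically related dimensions before probabilistic classification can be illustrated with a small sketch. This is not the authors' procedure: the abstract gives no algorithmic details, so the example below substitutes a plain multinomial Naive Bayes with Laplace smoothing for Discriminative Multinomial Naive Bayes, and the `HYPERNYM` table, the function names, and the toy documents are all hypothetical. It only shows how replacing words with a shared hypernym shrinks the feature space and lets the classifier generalize to a word it never saw in training.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy taxonomy (word -> hypernym); not taken from the paper.
HYPERNYM = {
    "dog": "animal", "cat": "animal", "horse": "animal",
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
}

def abstract_tokens(tokens, taxonomy=HYPERNYM):
    """Replace each token with its hypernym when the taxonomy has one."""
    return [taxonomy.get(t, t) for t in tokens]

def train_mnb(docs, labels, alpha=1.0):
    """Fit a plain multinomial Naive Bayes model with Laplace smoothing.

    Assumption: this stands in for the discriminative variant named in
    the abstract, whose training procedure is not described there.
    """
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, y in zip(docs, labels):
        word_counts[y].update(tokens)
        vocab.update(tokens)
    model = {"prior": {}, "loglik": {}, "vocab": vocab}
    for y, n in class_docs.items():
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        model["prior"][y] = math.log(n / len(labels))
        model["loglik"][y] = {
            w: math.log((word_counts[y][w] + alpha) / total) for w in vocab
        }
    return model

def predict(model, tokens):
    """Return the class with the highest log-posterior; unknown words are skipped."""
    scores = {
        y: model["prior"][y]
        + sum(model["loglik"][y][w] for w in tokens if w in model["vocab"])
        for y in model["prior"]
    }
    return max(scores, key=scores.get)

# Toy corpus (hypothetical data).
raw_docs = [["dog", "barks"], ["cat", "sleeps"], ["car", "drives"], ["truck", "drives"]]
labels = ["pets", "pets", "transport", "transport"]

# Collapse taxonomically related words into one dimension, then train.
model = train_mnb([abstract_tokens(d) for d in raw_docs], labels)

# "horse" never occurs in training, but its hypernym "animal" does.
prediction = predict(model, abstract_tokens(["horse", "runs"]))
```

In this toy setting the abstraction reduces the vocabulary from seven raw words to five dimensions, and the unseen word "horse" is scored through the shared "animal" dimension. A real taxonomy such as WordNet would replace the hand-written table, and choosing which dimensions to collapse is exactly the part the paper addresses and this sketch does not.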
