Using the H-Divergence to Prune Probabilistic Automata

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence. Pub Date: 2011-11-07. DOI: 10.1109/ICTAI.2011.114

Marc Bernard, Baptiste Jeudy, Jean-Philippe Peyrache, M. Sebban, F. Thollard


Abstract

A problem commonly encountered in probabilistic automata learning is the difficulty of dealing with large training samples and/or large alphabets. This is partly due to the size of the resulting Probabilistic Prefix Tree (PPT), to which state-merging learning algorithms are generally applied. In this paper, we propose a novel method to prune PPTs by making use of the H-divergence d_H, recently introduced in the field of domain adaptation. d_H is based on the classification error made by a hypothesis learned from unlabeled examples drawn from the two distributions being compared. Through a thorough comparison with state-of-the-art divergence measures, we provide experimental evidence that demonstrates the efficiency of our method based on this simple and intuitive criterion.
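As a rough illustration of the idea behind d_H (not the authors' pruning procedure, which the abstract does not detail), the sketch below estimates an empirical H-divergence between two samples of strings. It follows the usual empirical form from the domain adaptation literature: find the hypothesis that best separates the two samples and return 2 * (1 - its summed error rates). The hypothesis class (thresholds on string length) and the names `length_threshold_hypotheses` and `empirical_h_divergence` are assumptions chosen only to keep the example small.

```python
from typing import Callable, List, Sequence

# A hypothesis maps a string to True ("looks like sample P") or False ("looks like sample Q").
Hypothesis = Callable[[str], bool]


def length_threshold_hypotheses(max_len: int) -> List[Hypothesis]:
    """Hypothetical toy class: thresholds on string length, in both orientations.

    Including each threshold and its complement keeps the class symmetric,
    so identical samples get a divergence of 0.
    """
    hyps: List[Hypothesis] = []
    for t in range(max_len + 1):
        hyps.append(lambda x, t=t: len(x) <= t)
        hyps.append(lambda x, t=t: len(x) > t)
    return hyps


def empirical_h_divergence(
    sample_p: Sequence[str],
    sample_q: Sequence[str],
    hypotheses: Sequence[Hypothesis],
) -> float:
    """Empirical H-divergence: 2 * (1 - min over h of the summed error rates).

    For each hypothesis, the error is the fraction of sample_p it rejects
    plus the fraction of sample_q it accepts. No class labels are needed,
    only knowledge of which sample each string came from.
    """
    best = min(
        sum(1 for x in sample_p if not h(x)) / len(sample_p)
        + sum(1 for x in sample_q if h(x)) / len(sample_q)
        for h in hypotheses
    )
    return 2.0 * (1.0 - best)
```

With this toy class, two samples that a length threshold separates perfectly reach the maximum value of 2, while two identical samples give 0. In practice the hypothesis class would be richer and the minimization would be approximated by training a classifier to tell the two samples apart.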


