Fast transpose methods for kernel learning on sparse data

P. Haffner
{"title":"Fast transpose methods for kernel learning on sparse data","authors":"P. Haffner","doi":"10.1145/1143844.1143893","DOIUrl":null,"url":null,"abstract":"Kernel-based learning algorithms, such as Support Vector Machines (SVMs) or Perceptron, often rely on sequential optimization where a few examples are added at each iteration. Updating the kernel matrix usually requires matrix-vector multiplications. We propose a new method based on transposition to speedup this computation on sparse data. Instead of dot-products over sparse feature vectors, our computation incrementally merges lists of training examples and minimizes access to the data. Caching and shrinking are also optimized for sparsity. On very large natural language tasks (tagging, translation, text classification) with sparse feature representations, a 20 to 80-fold speedup over LIBSVM is observed using the same SMO algorithm. Theory and experiments explain what type of sparsity structure is needed for this approach to work, and why its adaptation to Maxent sequential optimization is inefficient.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd international conference on Machine learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1143844.1143893","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Kernel-based learning algorithms, such as Support Vector Machines (SVMs) or the Perceptron, often rely on sequential optimization where a few examples are added at each iteration. Updating the kernel matrix usually requires matrix-vector multiplications. We propose a new method based on transposition to speed up this computation on sparse data. Instead of computing dot products over sparse feature vectors, our method incrementally merges lists of training examples and minimizes access to the data. Caching and shrinking are also optimized for sparsity. On very large natural language tasks (tagging, translation, text classification) with sparse feature representations, a 20- to 80-fold speedup over LIBSVM is observed using the same SMO algorithm. Theory and experiments explain what type of sparsity structure is needed for this approach to work, and why its adaptation to Maxent sequential optimization is inefficient.
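To make the transposition idea concrete, the sketch below (not the paper's implementation; the names build_inverted_index and kernel_row_transpose are illustrative) shows how a linear-kernel row can be computed from an inverted (transposed) view of the training data: rather than taking one dot product per training example, we walk the new example's nonzero features and merge the postings lists of examples that contain them, so only examples sharing a feature are touched.

```python
# Minimal sketch of the transpose idea for a linear kernel on sparse data.
from collections import defaultdict

def build_inverted_index(examples):
    """examples: list of dicts {feature_id: value}.
    Returns the transposed view: feature_id -> [(example_id, value), ...]."""
    index = defaultdict(list)
    for i, x in enumerate(examples):
        for f, v in x.items():
            index[f].append((i, v))
    return index

def kernel_row_transpose(x_new, index, n_examples):
    """Accumulate <x_new, x_i> for all training examples i by merging the
    postings lists of x_new's nonzero features."""
    row = [0.0] * n_examples
    for f, v in x_new.items():
        for i, w in index.get(f, ()):  # only examples with feature f are visited
            row[i] += v * w
    return row

# Usage: three sparse training examples over a large feature space.
train = [{0: 1.0, 7: 2.0}, {7: 1.0}, {3: 0.5, 9: 1.0}]
idx = build_inverted_index(train)
print(kernel_row_transpose({7: 3.0, 9: 2.0}, idx, len(train)))  # [6.0, 3.0, 2.0]
```

The work done is proportional to the total length of the merged postings lists rather than to the number of training examples times their average density, which is where the gain on highly sparse feature representations comes from.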