Ordinal regression has become an effective way of learning user preferences, but most research focuses on single regression problems. In this paper we introduce collaborative ordinal regression, where multiple ordinal regression tasks are handled simultaneously. Rather than modeling each task individually, we explore the dependency between ranking functions through a hierarchical Bayesian model and assign a common Gaussian Process (GP) prior to all individual functions. Empirical studies show that our collaborative model outperforms the individual counterpart in preference learning applications.
{"title":"Collaborative ordinal regression","authors":"Shipeng Yu, Kai Yu, Volker Tresp, H. Kriegel","doi":"10.1145/1143844.1143981","DOIUrl":"https://doi.org/10.1145/1143844.1143981","url":null,"abstract":"Ordinal regression has become an effective way of learning user preferences, but most research focuses on single regression problems. In this paper we introduce collaborative ordinal regression, where multiple ordinal regression tasks are handled simultaneously. Rather than modeling each task individually, we explore the dependency between ranking functions through a hierarchical Bayesian model and assign a common Gaussian Process (GP) prior to all individual functions. Empirical studies show that our collaborative model outperforms the individual counterpart in preference learning applications.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115518576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The general approach of automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the problem of choosing the unlabeled examples for which class labels are queried, with the goal of learning a classifier. In contrast, we address the problem of active feature sampling for detecting irrelevant features. We propose a strategy to actively sample the values of new features on class-labeled examples, with the objective of feature relevance assessment. We derive an active feature sampling algorithm from an information-theoretic and statistical formulation of the problem. We present experimental results on synthetic, UCI, and real-world datasets to demonstrate that our active sampling algorithm can provide accurate estimates of feature relevance at lower data acquisition cost than random sampling and other previously proposed sampling algorithms.
{"title":"Active sampling for detecting irrelevant features","authors":"S. Veeramachaneni, E. Olivetti, P. Avesani","doi":"10.1145/1143844.1143965","DOIUrl":"https://doi.org/10.1145/1143844.1143965","url":null,"abstract":"The general approach for automatically driving data collection using information from previously acquired data is called active learning. Traditional active learning addresses the problem of choosing the unlabeled examples for which the class labels are queried with the goal of learning a classifier. In contrast we address the problem of active feature sampling for detecting useless features. We propose a strategy to actively sample the values of new features on class-labeled examples, with the objective of feature relevance assessment. We derive an active feature sampling algorithm from an information theoretic and statistical formulation of the problem. We present experimental results on synthetic, UCI and real world datasets to demonstrate that our active sampling algorithm can provide accurate estimates of feature relevance with lower data acquisition costs than random sampling and other previously proposed sampling algorithms.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128194112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, there has been considerable interest in learning with higher-order relations (i.e., three-way or higher) in the unsupervised and semi-supervised settings. Hypergraphs and tensors have been proposed as the natural way of representing these relations, and their corresponding algebras as the natural tools for operating on them. In this paper we argue that hypergraphs are not a natural representation for higher-order relations; indeed, pairwise as well as higher-order relations can be handled using graphs. We show that various formulations of the semi-supervised and unsupervised learning problems on hypergraphs reduce to the same graph-theoretic problem and can be analyzed using existing tools.
{"title":"Higher order learning with graphs","authors":"Sameer Agarwal, K. Branson, Serge J. Belongie","doi":"10.1145/1143844.1143847","DOIUrl":"https://doi.org/10.1145/1143844.1143847","url":null,"abstract":"Recently there has been considerable interest in learning with higher order relations (i.e., three-way or higher) in the unsupervised and semi-supervised settings. Hypergraphs and tensors have been proposed as the natural way of representing these relations and their corresponding algebra as the natural tools for operating on them. In this paper we argue that hypergraphs are not a natural representation for higher order relations, indeed pairwise as well as higher order relations can be handled using graphs. We show that various formulations of the semi-supervised and the unsupervised learning problem on hypergraphs result in the same graph theoretic problem and can be analyzed using existing tools.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116838882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
If used appropriately, prior knowledge can significantly improve the predictive accuracy of learning algorithms or reduce the amount of training data needed. In this paper we introduce a simple method for incorporating prior knowledge into support vector machines by modifying the hypothesis space rather than the optimization problem. The resulting optimization problem is amenable to solution by the constrained concave-convex procedure, which finds a local optimum. The paper discusses different kinds of prior knowledge and demonstrates the applicability of the approach in several characteristic experiments.
{"title":"Simpler knowledge-based support vector machines","authors":"Quoc V. Le, Alex Smola, Thomas Gärtner","doi":"10.1145/1143844.1143910","DOIUrl":"https://doi.org/10.1145/1143844.1143910","url":null,"abstract":"If appropriately used, prior knowledge can significantly improve the predictive accuracy of learning algorithms or reduce the amount of training data needed. In this paper we introduce a simple method to incorporate prior knowledge in support vector machines by modifying the hypothesis space rather than the optimization problem. The optimization problem is amenable to solution by the constrained concave convex procedure, which finds a local optimum. The paper discusses different kinds of prior knowledge and demonstrates the applicability of the approach in some characteristic experiments.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121845108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We investigate under what conditions clustering by learning a mixture of spherical Gaussians is (a) computationally tractable and (b) statistically possible. We show that projecting onto the principal components greatly aids EM in recovering the clustering; we present empirical evidence that, even with such a projection, there is still a large gap between the number of samples EM needs to recover the clustering and the number needed without computational restrictions; and we characterize the regime in which such a gap exists.
{"title":"An investigation of computational and informational limits in Gaussian mixture clustering","authors":"N. Srebro, Gregory Shakhnarovich, S. Roweis","doi":"10.1145/1143844.1143953","DOIUrl":"https://doi.org/10.1145/1143844.1143953","url":null,"abstract":"We investigate under what conditions clustering by learning a mixture of spherical Gaussians is (a) computationally tractable; and (b) statistically possible. We show that using principal component projection greatly aids in recovering the clustering using EM; present empirical evidence that even using such a projection, there is still a large gap between the number of samples needed to recover the clustering using EM, and the number of samples needed without computational restrictions; and characterize the regime in which such a gap exists.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"219 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122995443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph data are becoming increasingly common in, e.g., bioinformatics and text processing. A main difficulty in processing graph data lies in the intrinsic high dimensionality of graphs: when a graph is represented as a binary feature vector of indicators of all possible subgraphs, the dimensionality becomes too large for standard statistical methods. We propose an efficient method for learning a binomial mixture model in this feature space. By combining the l1 regularizer with a data structure called the DFS code tree, the MAP estimates of the non-zero parameters are computed efficiently by means of the EM algorithm. Our method is applied to the clustering of RNA graphs and compares favorably with graph kernels and the spectral graph distance.
{"title":"Clustering graphs by weighted substructure mining","authors":"K. Tsuda, Taku Kudo","doi":"10.1145/1143844.1143964","DOIUrl":"https://doi.org/10.1145/1143844.1143964","url":null,"abstract":"Graph data is getting increasingly popular in, e.g., bioinformatics and text processing. A main difficulty of graph data processing lies in the intrinsic high dimensionality of graphs, namely, when a graph is represented as a binary feature vector of indicators of all possible subgraphs, the dimensionality gets too large for usual statistical methods. We propose an efficient method for learning a binomial mixture model in this feature space. Combining the l1 regularizer and the data structure called DFS code tree, the MAP estimate of non-zero parameters are computed efficiently by means of the EM algorithm. Our method is applied to the clustering of RNA graphs, and is compared favorably with graph kernels and the spectral graph distance.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122647951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Boosting methods are known usually not to overfit training data, even as the size of the generated classifiers becomes large. Schapire et al. attempted to explain this phenomenon in terms of the margins the classifier achieves on training examples. Later, however, Breiman cast serious doubt on this explanation by introducing a boosting algorithm, arc-gv, that can generate a higher margin distribution than AdaBoost and yet performs worse. In this paper, we take a close look at Breiman's compelling but puzzling results. Although we can reproduce his main finding, we find that the poorer performance of arc-gv can be explained by the increased complexity of the base classifiers it uses, an explanation supported by our experiments and entirely consistent with the margins theory. Thus, we find that maximizing the margins is desirable, but not necessarily at the expense of other factors, especially base-classifier complexity.
{"title":"How boosting the margin can also boost classifier complexity","authors":"L. Reyzin, R. Schapire","doi":"10.1145/1143844.1143939","DOIUrl":"https://doi.org/10.1145/1143844.1143939","url":null,"abstract":"Boosting methods are known not to usually overfit training data even as the size of the generated classifiers becomes large. Schapire et al. attempted to explain this phenomenon in terms of the margins the classifier achieves on training examples. Later, however, Breiman cast serious doubt on this explanation by introducing a boosting algorithm, arc-gv, that can generate a higher margins distribution than AdaBoost and yet performs worse. In this paper, we take a close look at Breiman's compelling but puzzling results. Although we can reproduce his main finding, we find that the poorer performance of arc-gv can be explained by the increased complexity of the base classifiers it uses, an explanation supported by our experiments and entirely consistent with the margins theory. Thus, we find maximizing the margins is desirable, but not necessarily at the expense of other factors, especially base-classifier complexity.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129881115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We extend Support Vector Machines to input spaces that are sets by ensuring that the classifier is invariant to permutations of sub-elements within each input. Such permutations include reordering of scalars in an input vector, reordering of tuples in an input matrix, or reordering of general objects (in Hilbert spaces) within a set. This approach induces permutational invariance in the classifier, which can then be applied directly to unusual set-based representations of data. The permutation-invariant Support Vector Machine interleaves the Hungarian method for maximum-weight matching with the maximum-margin learning procedure. We effectively estimate and apply permutations to the input data points to maximize the classification margin while minimizing the data radius. This procedure has a strong theoretical justification via well-established error-probability bounds. Experiments are shown on character recognition, 3D object recognition, and various UCI datasets.
{"title":"Permutation invariant SVMs","authors":"Pannagadatta K. Shivaswamy, T. Jebara","doi":"10.1145/1143844.1143947","DOIUrl":"https://doi.org/10.1145/1143844.1143947","url":null,"abstract":"We extend Support Vector Machines to input spaces that are sets by ensuring that the classifier is invariant to permutations of sub-elements within each input. Such permutations include reordering of scalars in an input vector, re-orderings of tuples in an input matrix or re-orderings of general objects (in Hilbert spaces) within a set as well. This approach induces permutational invariance in the classifier which can then be directly applied to unusual set-based representations of data. The permutation invariant Support Vector Machine alternates the Hungarian method for maximum weight matching within the maximum margin learning procedure. We effectively estimate and apply permutations to the input data points to maximize classification margin while minimizing data radius. This procedure has a strong theoretical justification via well established error probability bounds. Experiments are shown on character recognition, 3D object recognition and various UCI datasets.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130178550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose efficient particle smoothing methods for generalized state-space models. Particle smoothing is an expensive O(N²) algorithm, where N is the number of particles. We overcome this problem by integrating dual-tree recursions and fast multipole techniques with forward-backward smoothers, a new generalized two-filter smoother, and a maximum a posteriori (MAP) smoother. Our experiments show that these improvements can substantially increase the practicality of particle smoothing.
{"title":"Fast particle smoothing: if I had a million particles","authors":"Mike Klaas, M. Briers, Nando de Freitas, A. Doucet, S. Maskell, Dustin Lang","doi":"10.1145/1143844.1143905","DOIUrl":"https://doi.org/10.1145/1143844.1143905","url":null,"abstract":"We propose efficient particle smoothing methods for generalized state-spaces models. Particle smoothing is an expensive O(N2) algorithm, where N is the number of particles. We overcome this problem by integrating dual tree recursions and fast multipole techniques with forward-backward smoothers, a new generalized two-filter smoother and a maximum a posteriori (MAP) smoother. Our experiments show that these improvements can substantially increase the practicality of particle smoothing.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124564120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elimination by aspects (EBA) is a probabilistic choice model describing how humans decide between several options. The options from which the choice is made are characterized by binary features and associated weights. For instance, when choosing which mobile phone to buy, the features to consider may be a long-lasting battery, a color screen, and so on. Existing methods for inferring the parameters of the model assume pre-specified features. However, the features that lead to the observed choices are not always known. Here, we present a non-parametric Bayesian model to infer the features of the options and the corresponding weights from choice data. We use the Indian buffet process (IBP) as a prior over the features. Inference using Markov chain Monte Carlo (MCMC) in conjugate IBP models has been described previously. The main contribution of this paper is an MCMC algorithm for the EBA model that can also be used for inference in other non-conjugate IBP models; this may broaden the use of IBP priors considerably.
{"title":"A choice model with infinitely many latent features","authors":"Dilan Görür, F. Jäkel, C. Rasmussen","doi":"10.1145/1143844.1143890","DOIUrl":"https://doi.org/10.1145/1143844.1143890","url":null,"abstract":"Elimination by aspects (EBA) is a probabilistic choice model describing how humans decide between several options. The options from which the choice is made are characterized by binary features and associated weights. For instance, when choosing which mobile phone to buy the features to consider may be: long lasting battery, color screen, etc. Existing methods for inferring the parameters of the model assume pre-specified features. However, the features that lead to the observed choices are not always known. Here, we present a non-parametric Bayesian model to infer the features of the options and the corresponding weights from choice data. We use the Indian buffet process (IBP) as a prior over the features. Inference using Markov chain Monte Carlo (MCMC) in conjugate IBP models has been previously described. The main contribution of this paper is an MCMC algorithm for the EBA model that can also be used in inference for other non-conjugate IBP models---this may broaden the use of IBP priors considerably.","PeriodicalId":124011,"journal":{"name":"Proceedings of the 23rd international conference on Machine learning","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126736516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}