Proceedings of the 23rd international conference on Machine learning: latest articles

Autonomous shaping: knowledge transfer in reinforcement learning

Proceedings of the 23rd international conference on Machine learning

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143906

G. Konidaris, A. Barto

We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks that are related but distinct. Such agents can be trained on a sequence of relatively easy tasks in order to develop a more informative measure of reward that can be transferred to improve performance on more difficult tasks without requiring a hand coded shaping function. We use a rod positioning task to show that this significantly improves performance even after a very brief training period.
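
For context, a minimal sketch of what a shaping reward does, assuming the standard potential-based form F(s, s') = gamma * phi(s') - phi(s). This is a well-known construction and not necessarily the one used in the paper. The potential `phi`, the grid states, and `GOAL` are hypothetical; in the paper's setting a hand-written heuristic like this would be replaced by a predictor learned from earlier tasks.

```python
from typing import Callable, Tuple

State = Tuple[int, int]  # hypothetical grid position (row, col)


def shaped_reward(reward: float, s: State, s_next: State,
                  phi: Callable[[State], float], gamma: float = 0.99) -> float:
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * phi(s_next) - phi(s)


# Hypothetical potential: negative Manhattan distance to an assumed goal cell.
GOAL: State = (3, 3)


def phi(s: State) -> float:
    return -float(abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1]))


# With zero environment reward, a step towards the goal earns a positive
# bonus and a step away earns a negative one.
bonus_towards = shaped_reward(0.0, (0, 0), (0, 1), phi, gamma=1.0)
bonus_away = shaped_reward(0.0, (0, 1), (0, 0), phi, gamma=1.0)
```

The agent receives denser feedback than the sparse task reward alone, which is the effect the abstract attributes to a more informative reward measure.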

Citations: 225

Learning a kernel function for classification with small training samples

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143895

T. Hertz, Aharon Bar-Hillel, D. Weinshall

When given a small sample, we show that classification with SVM can be considerably enhanced by using a kernel function learned from the training data prior to discrimination. This kernel is also shown to enhance retrieval based on data similarity. Specifically, we describe KernelBoost - a boosting algorithm which computes a kernel function as a combination of 'weak' space partitions. The kernel learning method naturally incorporates domain knowledge in the form of unlabeled data (i.e. in semi-supervised or transductive settings), and also in the form of labeled samples from relevant related problems (i.e. in a learning-to-learn scenario). The latter goal is accomplished by learning a single kernel function for all classes. We show comparative evaluations of our method on datasets from the UCI repository. We demonstrate performance enhancement on two challenging tasks: digit classification with kernel SVM, and facial image retrieval based on image similarity as measured by the learnt kernel.

Citations: 87

Nonstationary kernel combination

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143914

Darrin P. Lewis, T. Jebara, William Stafford Noble

The power and popularity of kernel methods stem in part from their ability to handle diverse forms of structured inputs, including vectors, graphs and strings. Recently, several methods have been proposed for combining kernels from heterogeneous data sources. However, all of these methods produce stationary combinations; i.e., the relative weights of the various kernels do not vary among input examples. This article proposes a method for combining multiple kernels in a nonstationary fashion. The approach uses a large-margin latent-variable generative model within the maximum entropy discrimination (MED) framework. Latent parameter estimation is rendered tractable by variational bounds and an iterative optimization procedure. The classifier we use is a log-ratio of Gaussian mixtures, in which each component is implicitly mapped via a Mercer kernel function. We show that the support vector machine is a special case of this model. In this approach, discriminative parameter estimation is feasible via a fast sequential minimal optimization algorithm. Empirical results are presented on synthetic data, several benchmarks, and on a protein function annotation task.
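
To make "stationary combination" concrete, here is a small sketch with two assumed base kernels (linear and Gaussian RBF) mixed with fixed weights. The function names and the weights are illustrative only. The weights do not depend on the inputs, which is the restriction the abstract says the proposed method removes; the nonstationary MED model itself is not reproduced here.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def linear_kernel(x: Vector, y: Vector) -> float:
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def stationary_combination(x: Vector, y: Vector,
                           weights: Tuple[float, float] = (0.5, 0.5)) -> float:
    """Fixed, input-independent weights: the same mix for every pair (x, y)."""
    w_lin, w_rbf = weights
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y)


k_same = stationary_combination([1.0, 0.0], [1.0, 0.0])
k_xy = stationary_combination([1.0, 0.0], [0.0, 2.0])
k_yx = stationary_combination([0.0, 2.0], [1.0, 0.0])
```

A nonnegative weighted sum of valid kernels is itself a valid kernel, so the combined function can be passed to any kernel method.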

Citations: 91

Two-dimensional solution path for support vector regression

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143969

G. Wang, D. Yeung, F. Lochovsky

Recently, a very appealing approach was proposed to compute the entire solution path for support vector classification (SVC) with very low extra computational cost. This approach was later extended to a support vector regression (SVR) model called ε-SVR. However, the method requires that the error parameter ε be set a priori, which is only possible if the desired accuracy of the approximation can be specified in advance. In this paper, we show that the solution path for ε-SVR is also piecewise linear with respect to ε. We further propose an efficient algorithm for exploring the two-dimensional solution space defined by the regularization and error parameters. As opposed to the algorithm for SVC, our proposed algorithm for ε-SVR initializes the number of support vectors to zero and then increases it gradually as the algorithm proceeds. As such, a good regression function possessing the sparseness property can be obtained after only a few iterations.
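
The role of the error parameter ε can be illustrated with the standard ε-insensitive loss, max(0, |y - f(x)| - ε): residuals smaller than ε cost nothing. The sketch below only evaluates that loss on a few hypothetical residuals; it does not implement the solution-path algorithm from the paper.

```python
from typing import Dict, List


def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float) -> float:
    """Zero inside the epsilon-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical residuals y - f(x) for three training points.
residuals: List[float] = [0.05, 0.2, -0.5]

# Number of points that incur no loss, for several tube widths.
zero_loss_counts: Dict[float, int] = {
    eps: sum(1 for r in residuals if epsilon_insensitive_loss(r, 0.0, eps) == 0.0)
    for eps in (0.0, 0.1, 0.3)
}
```

Widening the tube lets more points fall inside it at no cost, which is why the choice of ε affects how sparse the resulting regression function is.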

Citations: 36

Feature subset selection bias for classification learning

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143951

Surendra K. Singhi, Huan Liu

Feature selection is often applied to high-dimensional data prior to classification learning. Using the same training dataset in both selection and learning can result in so-called feature subset selection bias. This bias putatively can exacerbate data over-fitting and negatively affect classification performance. However, in current practice separate datasets are seldom employed for selection and learning, because dividing the training data into two datasets for feature selection and classifier learning respectively reduces the amount of data that can be used in either task. This work attempts to address this dilemma. We formalize selection bias for classification learning, analyze its statistical properties, and study factors that affect selection bias, as well as how the bias impacts classification learning via various experiments. This research endeavors to provide illustration and explanation why the bias may not cause negative impact in classification as much as expected in regression.

Citations: 98

Ranking individuals by group comparisons

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143898

Tzu-Kuo Huang, Chih-Jen Lin, R. C. Weng

This paper proposes new approaches to rank individuals from their group competition results. Many real-world problems are of this type. For example, ranking players from team games is important in some sports. We propose an exponential model to solve such problems. To estimate individual rankings through the proposed model we introduce two convex minimization formulas with easy and efficient solution procedures. Experiments on real bridge records and multi-class classification demonstrate the viability of the proposed model.
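
One plausible reading of an exponential model for group comparisons, assumed here and not confirmed by the abstract, is that a team's strength is the sum of its members' skills and that the win probability is a logistic function of the strength difference. The player names and skill values below are hypothetical, and the convex fitting procedures mentioned in the abstract are not reproduced.

```python
import math
from typing import Dict, Iterable


def team_win_probability(skills: Dict[str, float],
                         team_a: Iterable[str],
                         team_b: Iterable[str]) -> float:
    """P(team A beats team B) = exp(T_a) / (exp(T_a) + exp(T_b)),
    where T is the sum of the individual skills on each team."""
    t_a = sum(skills[p] for p in team_a)
    t_b = sum(skills[p] for p in team_b)
    return 1.0 / (1.0 + math.exp(t_b - t_a))


# Hypothetical individual skills.
skills = {"ann": 1.0, "bob": 0.0, "cy": 0.5, "di": 0.5}

p_even = team_win_probability(skills, ["ann", "bob"], ["cy", "di"])
p_strong = team_win_probability(skills, ["ann", "cy"], ["bob", "di"])
p_weak = team_win_probability(skills, ["bob", "di"], ["ann", "cy"])
```

Under this assumption, estimating individual rankings amounts to finding the skill values that best explain the observed team outcomes.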

Citations: 58

A duality view of spectral methods for dimensionality reduction

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143975

Lin Xiao, Jun Sun, Stephen P. Boyd

We present a unified duality view of several recently emerged spectral methods for nonlinear dimensionality reduction, including Isomap, locally linear embedding, Laplacian eigenmaps, and maximum variance unfolding. We discuss the duality theory for the maximum variance unfolding problem, and show that other methods are directly related to either its primal formulation or its dual formulation, or can be interpreted from the optimality conditions. This duality framework reveals close connections between these seemingly quite different algorithms. In particular, it resolves the myth about these methods in using either the top eigenvectors of a dense matrix, or the bottom eigenvectors of a sparse matrix --- these two eigenspaces are exactly aligned at primal-dual optimality.

Citations: 42

Local Fisher discriminant analysis for supervised dimensionality reduction

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143958

Masashi Sugiyama

Dimensionality reduction is one of the important preprocessing steps in high-dimensional data analysis. In this paper, we consider the supervised dimensionality reduction problem where samples are accompanied with class labels. Traditional Fisher discriminant analysis is a popular and powerful method for this purpose. However, it tends to give undesired results if samples in some class form several separate clusters, i.e., multimodal. In this paper, we propose a new dimensionality reduction method called local Fisher discriminant analysis (LFDA), which is a localized variant of Fisher discriminant analysis. LFDA takes local structure of the data into account so the multimodal data can be embedded appropriately. We also show that LFDA can be extended to non-linear dimensionality reduction scenarios by the kernel trick.

Citations: 373

The relationship between Precision-Recall and ROC curves

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143874

Jesse Davis, Mark H. Goadrich

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we also note differences in the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
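
The claim that linear interpolation is incorrect in PR space can be checked with a small sketch. It assumes the usual count-based interpolation: between two operating points, each additional true positive brings a fixed number of additional false positives, and precision is recomputed from the counts. The confusion-matrix counts below are hypothetical.

```python
from typing import List, Tuple


def precision(tp: float, fp: float) -> float:
    return tp / (tp + fp)


def interpolate_pr(tp_a: int, fp_a: int, tp_b: int, fp_b: int,
                   total_pos: int) -> List[Tuple[float, float]]:
    """(recall, precision) points from A to B, one per extra true positive."""
    # False positives gained per additional true positive between A and B.
    skew = (fp_b - fp_a) / (tp_b - tp_a)
    points = []
    for x in range(tp_b - tp_a + 1):
        tp = tp_a + x
        fp = fp_a + skew * x
        points.append((tp / total_pos, precision(tp, fp)))
    return points


# Hypothetical points: A has TP=5, FP=5; B has TP=10, FP=30; 20 positives.
points = interpolate_pr(5, 5, 10, 30, total_pos=20)

# Straight line between A (recall 0.25, precision 0.5) and
# B (recall 0.5, precision 0.25), evaluated at recall 0.3.
linear_guess = 0.5 + (0.25 - 0.5) * (0.3 - 0.25) / (0.5 - 0.25)
```

At recall 0.3 the count-based precision is 6/16 = 0.375, while the straight line suggests about 0.45, so linear interpolation overstates performance between the two points.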

Citations: 5171

Optimal kernel selection in Kernel Fisher discriminant analysis

Pub Date : 2006-06-25 DOI: 10.1145/1143844.1143903

Seung-Jean Kim, A. Magnani, Stephen P. Boyd

In Kernel Fisher discriminant analysis (KFDA), we carry out Fisher linear discriminant analysis in a high dimensional feature space defined implicitly by a kernel. The performance of KFDA depends on the choice of the kernel; in this paper, we consider the problem of finding the optimal kernel, over a given convex set of kernels. We show that this optimal kernel selection problem can be reformulated as a tractable convex optimization problem which interior-point methods can solve globally and efficiently. The kernel selection method is demonstrated with some UCI machine learning benchmark examples.

Citations: 165
