
Proceedings of the 23rd international conference on Machine learning: latest publications

Generalized spectral bounds for sparse LDA
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143925
B. Moghaddam, Yair Weiss, S. Avidan
We present a discrete spectral framework for the sparse or cardinality-constrained solution of a generalized Rayleigh quotient. This NP-hard combinatorial optimization problem is central to supervised learning tasks such as sparse LDA, feature selection and relevance ranking for classification. We derive a new generalized form of the Inclusion Principle for variational eigenvalue bounds, leading to exact and optimal sparse linear discriminants using branch-and-bound search. An efficient greedy (approximate) technique is also presented. The generalization performance of our sparse LDA algorithms is demonstrated with real-world UCI ML benchmarks and compared to a leading SVM-based gene selection algorithm for cancer classification.
Citations: 154
Efficient MAP approximation for dense energy functions
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143913
Marius Leordeanu, M. Hebert
We present an efficient method for maximizing energy functions with first and second order potentials, suitable for MAP labeling estimation problems that arise in undirected graphical models. Our approach is to relax the integer constraints on the solution in two steps. First we efficiently obtain the relaxed global optimum following a procedure similar to the iterative power method for finding the largest eigenvector of a matrix. Next, we map the relaxed optimum on a simplex and show that the new energy obtained has a certain optimal bound. Starting from this energy we follow an efficient coordinate ascent procedure that is guaranteed to increase the energy at every step and converge to a solution that obeys the initial integral constraints. We also present a sufficient condition for ascent procedures that guarantees the increase in energy at every step.
Citations: 36
Categorization in multiple category systems
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143938
J. Renders, Éric Gaussier, Cyril Goutte, F. Pacull, G. Csurka
We explore the situation in which documents have to be categorized into more than one category system, a situation we refer to as multiple-view categorization. More particularly, we address the case where two different categorizers have already been built based on not necessarily identical training sets, each one labeled using one category system. On top of these categorizers, considered as black boxes, we propose algorithms able to exploit a third training set containing a few examples annotated in both category systems. Such a situation arises, for example, in large companies where incoming mail has to be routed to several departments, each one relying on its own category system. We focus here on exploiting possible dependencies between category systems in order to refine the categorization decisions made by categorizers trained independently on different category systems. After a description of the multiple categorization problem, we present several possible solutions, based either on a categorization or a reweighting approach, and compare them on real data. Lastly, we show how the multimedia categorization problem can be cast as a multiple categorization problem and assess our methods in this framework.
Citations: 11
Local distance preservation in the GP-LVM through back constraints
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143909
Neil D. Lawrence, J. Q. Candela
The Gaussian process latent variable model (GP-LVM) is a generative approach to nonlinear low-dimensional embedding that provides a smooth probabilistic mapping from latent to data space. It is also a non-linear generalization of probabilistic PCA (PPCA) (Tipping & Bishop, 1999). While most approaches to non-linear dimensionality reduction focus on preserving local distances in data space, the GP-LVM focuses on exactly the opposite: being a smooth mapping from latent to data space, it keeps things apart in latent space that are far apart in data space. In this paper we first provide an overview of dimensionality reduction techniques, placing the emphasis on the kind of distance relation preserved. We then show how the GP-LVM can be generalized, through back constraints, to additionally preserve local distances. We give illustrative experiments on common data sets.
Citations: 265
Discriminative unsupervised learning of structured predictors
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143977
Linli Xu, Dana F. Wilkinson, F. Southey, Dale Schuurmans
We present a new unsupervised algorithm for training structured predictors that is discriminative and convex and avoids the use of EM. The idea is to formulate an unsupervised version of structured learning methods, such as maximum margin Markov networks, that can be trained via semidefinite programming. The result is a discriminative training criterion for structured predictors (like hidden Markov models) that remains unsupervised and does not create local minima. To reduce training cost, we reformulate the training procedure to mitigate the dependence on semidefinite programming, and finally propose a heuristic procedure that avoids semidefinite programming entirely. Experimental results show that the convex discriminative procedure can produce better conditional models than conventional Baum-Welch (EM) training.
Citations: 2
Practical solutions to the problem of diagonal dominance in kernel document clustering
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143892
Derek Greene, P. Cunningham
In supervised kernel methods, it has been observed that the performance of the SVM classifier is poor in cases where the diagonal entries of the Gram matrix are large relative to the off-diagonal entries. This problem, referred to as diagonal dominance, often occurs when certain kernel functions are applied to sparse high-dimensional data, such as text corpora. In this paper we investigate the implications of diagonal dominance for unsupervised kernel methods, specifically in the task of document clustering. We propose a selection of strategies for addressing this issue, and evaluate their effectiveness in producing more accurate and stable clusterings.
Citations: 484
Statistical debugging: simultaneous identification of multiple bugs
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143983
A. Zheng, Michael I. Jordan, B. Liblit, M. Naik, A. Aiken
We describe a statistical approach to software debugging in the presence of multiple bugs. Due to sparse sampling issues and complex interaction between program predicates, many generic off-the-shelf algorithms fail to select useful bug predictors. Taking inspiration from bi-clustering algorithms, we propose an iterative collective voting scheme for the program runs and predicates. We demonstrate successful debugging results on several real world programs and a large debugging benchmark suite.
Citations: 168
Robust Euclidean embedding
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143866
Lawrence Cayton, S. Dasgupta
We derive a robust Euclidean embedding procedure based on semidefinite programming that may be used in place of the popular classical multidimensional scaling (cMDS) algorithm. We motivate this algorithm by arguing that cMDS is not particularly robust and has several other deficiencies. General-purpose semidefinite programming solvers are too memory intensive for medium to large sized applications, so we also describe a fast subgradient-based implementation of the robust algorithm. Additionally, since cMDS is often used for dimensionality reduction, we provide an in-depth look at reducing dimensionality with embedding procedures. In particular, we show that it is NP-hard to find optimal low-dimensional embeddings under a variety of cost functions.
Citations: 76
A regularization framework for multiple-instance learning
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143869
Pak-Ming Cheung, J. Kwok
This paper focuses on kernel methods for multi-instance learning. Existing methods require the prediction for a bag to be identical to the maximum of the predictions for its individual instances. However, this is too restrictive, as only the sign is important in classification. In this paper, we provide a more complete regularization framework for MI learning by allowing the use of different loss functions between the outputs of a bag and its associated instances. This is especially important as we generalize the framework to multi-instance regression. Moreover, both bag and instance information can now be directly used in the optimization. Instead of using heuristics to solve the resultant non-linear optimization problem, we use the constrained concave-convex procedure, which has well-studied convergence properties. Experiments on both classification and regression data sets show that the proposed method leads to improved performance.
Citations: 84
Totally corrective boosting algorithms that maximize the margin
Pub Date: 2006-06-25 DOI: 10.1145/1143844.1143970
Manfred K. Warmuth, Jun Liao, G. Rätsch
We consider boosting algorithms that maintain a distribution over a set of examples. At each iteration a weak hypothesis is received and the distribution is updated. We motivate these updates as minimizing the relative entropy subject to linear constraints. For example, AdaBoost constrains the edge of the last hypothesis w.r.t. the updated distribution to be at most γ = 0. In some sense, AdaBoost is "corrective" w.r.t. the last hypothesis. A cleaner boosting method is to be "totally corrective": the edges of all past hypotheses are constrained to be at most γ, where γ is suitably adapted. Using new techniques, we prove the same iteration bounds for the totally corrective algorithms as for their corrective versions. Moreover, with adaptive γ, the algorithms provably maximize the margin. Experimentally, the totally corrective versions return smaller convex combinations of weak hypotheses than the corrective ones and are competitive with LPBoost, a totally corrective boosting algorithm with no regularization, for which no iteration bound is known.
Citations: 130