学习等变结构化输出SVM回归量

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI:10.1109/ICCV.2011.6126339

A. Vedaldi, Matthew B. Blaschko, Andrew Zisserman

{"title":"学习等变结构化输出SVM回归量","authors":"A. Vedaldi, Matthew B. Blaschko, Andrew Zisserman","doi":"10.1109/ICCV.2011.6126339","DOIUrl":null,"url":null,"abstract":"Equivariance and invariance are often desired properties of a computer vision system. However, currently available strategies generally rely on virtual sampling, leaving open the question of how many samples are necessary, on the use of invariant feature representations, which can mistakenly discard information relevant to the vision task, or on the use of latent variable models, which result in non-convex training and expensive inference at test time. We propose here a generalization of structured output SVM regressors that can incorporate equivariance and invariance into a convex training procedure, enabling the incorporation of large families of transformations, while maintaining optimality and tractability. Importantly, test time inference does not require the estimation of latent variables, resulting in highly efficient objective functions. This results in a natural formulation for treating equivariance and invariance that is easily implemented as an adaptation of off-the-shelf optimization software, obviating the need for ad hoc sampling strategies. Theoretical results relating to vicinal risk, and experiments on challenging aerial car and pedestrian detection tasks show the effectiveness of the proposed solution.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"394 1","pages":"959-966"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":"{\"title\":\"Learning equivariant structured output SVM regressors\",\"authors\":\"A. Vedaldi, Matthew B. Blaschko, Andrew Zisserman\",\"doi\":\"10.1109/ICCV.2011.6126339\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Equivariance and invariance are often desired properties of a computer vision system. However, currently available strategies generally rely on virtual sampling, leaving open the question of how many samples are necessary, on the use of invariant feature representations, which can mistakenly discard information relevant to the vision task, or on the use of latent variable models, which result in non-convex training and expensive inference at test time. We propose here a generalization of structured output SVM regressors that can incorporate equivariance and invariance into a convex training procedure, enabling the incorporation of large families of transformations, while maintaining optimality and tractability. Importantly, test time inference does not require the estimation of latent variables, resulting in highly efficient objective functions. This results in a natural formulation for treating equivariance and invariance that is easily implemented as an adaptation of off-the-shelf optimization software, obviating the need for ad hoc sampling strategies. Theoretical results relating to vicinal risk, and experiments on challenging aerial car and pedestrian detection tasks show the effectiveness of the proposed solution.\",\"PeriodicalId\":6391,\"journal\":{\"name\":\"2011 International Conference on Computer Vision\",\"volume\":\"394 1\",\"pages\":\"959-966\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"33\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Computer Vision\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCV.2011.6126339\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2011.6126339","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 33

摘要

等变性和不变性通常是计算机视觉系统所需要的特性。然而，目前可用的策略通常依赖于虚拟采样，留下了需要多少样本的问题;使用不变特征表示，这可能会错误地丢弃与视觉任务相关的信息;或者使用潜在变量模型，这导致非凸训练和昂贵的测试时推理。我们在这里提出了一种结构化输出SVM回归量的泛化，它可以将等方差和不变性纳入凸训练过程，从而可以在保持最优性和可追溯性的同时纳入大族变换。重要的是，测试时间推断不需要估计潜在变量，从而产生高效的目标函数。这就产生了一种处理等变性和不变性的自然公式，这种公式很容易作为现成优化软件的改编实现，从而避免了对特别采样策略的需要。与周边风险相关的理论结果以及具有挑战性的空中车辆和行人检测任务的实验结果表明了该方法的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Learning equivariant structured output SVM regressors

Equivariance and invariance are often desired properties of a computer vision system. However, currently available strategies generally rely on virtual sampling, leaving open the question of how many samples are necessary, on the use of invariant feature representations, which can mistakenly discard information relevant to the vision task, or on the use of latent variable models, which result in non-convex training and expensive inference at test time. We propose here a generalization of structured output SVM regressors that can incorporate equivariance and invariance into a convex training procedure, enabling the incorporation of large families of transformations, while maintaining optimality and tractability. Importantly, test time inference does not require the estimation of latent variables, resulting in highly efficient objective functions. This results in a natural formulation for treating equivariance and invariance that is easily implemented as an adaptation of off-the-shelf optimization software, obviating the need for ad hoc sampling strategies. Theoretical results relating to vicinal risk, and experiments on challenging aerial car and pedestrian detection tasks show the effectiveness of the proposed solution.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 International Conference on Computer Vision

自引率

0.00%

发文量

期刊最新文献

Robust and efficient parametric face alignment Video parsing for abnormality detection From learning models of natural image patches to whole image restoration Discriminative figure-centric models for joint action localization and recognition A general preconditioning scheme for difference measures in deformable registration