
Latest publications: IEEE Winter Conference on Applications of Computer Vision

Benchmarking large-scale Fine-Grained Categorization
A. Angelova, Philip M. Long
This paper presents a systematic evaluation of recent methods in the fine-grained categorization domain, which have shown significant promise. More specifically, we investigate an automatic segmentation algorithm, a region pooling algorithm which is akin to pose-normalized pooling [31] [28], and a multi-class optimization method. We considered the largest and most popular datasets for fine-grained categorization available in the field: the Caltech-UCSD 200 Birds dataset [27], the Oxford 102 Flowers dataset [19], the Stanford 120 Dogs dataset [16], and the Oxford 37 Cats and Dogs dataset [21]. We view this work from a practitioner's perspective, answering the question: which methods can build the best possible fine-grained recognition system that can be applied in practice? Our experiments provide insights into the relative merits of these methods. More importantly, after combining the methods, we achieve the top results in the field, outperforming the state-of-the-art methods by 4.8% and 10.3% on the birds and dogs datasets, respectively. Additionally, our method achieves a mAP of 37.92 on the 2012 ImageNet Fine-Grained Categorization Challenge [1], outperforming the winner of that challenge by 5.7 points.
DOI: 10.1109/WACV.2014.6836056 (published 2014-03-24)
Cited by: 10
Fast dense 3D reconstruction using an adaptive multiscale discrete-continuous variational method
Z. Kang, G. Medioni
We present a system for fast dense 3D reconstruction with a hand-held camera. Walking around a target object, we shoot sequential images in continuous shooting mode. High-quality camera poses are obtained offline using a structure-from-motion (SfM) algorithm with bundle adjustment. Multi-view stereo is solved using a new, efficient adaptive multiscale discrete-continuous variational method to generate depth maps with sub-pixel accuracy. Depth maps are then fused into a 3D model using volumetric integration with a truncated signed distance function (TSDF). Our system is accurate, efficient and flexible: accurate depth maps are estimated with sub-pixel accuracy in stereo matching; dense models can be obtained within minutes because the major algorithms are parallelized on multi-core processors and the GPU; and various tasks (e.g., reconstruction of objects in both indoor and outdoor environments at different scales) can be handled without task-specific hand-tuned parameters. We evaluate our system quantitatively and qualitatively on the Middlebury benchmark and on another dataset collected with a smartphone camera.
DOI: 10.1109/WACV.2014.6836118 (published 2014-03-24)
Cited by: 4
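The TSDF fusion step described in the abstract can be sketched along a single camera ray (a minimal illustration, not the authors' implementation; the function name, the +1 per-observation weighting, and the truncation constant are assumptions):

```python
import numpy as np

def fuse_depth_tsdf(tsdf, weights, voxel_z, depth, trunc=0.05):
    """Fuse one depth observation into a TSDF along a single camera ray.

    tsdf, weights : running per-voxel TSDF values and integration weights
    voxel_z       : depth of each voxel centre along the ray
    depth         : observed surface depth for this ray
    trunc         : truncation distance
    """
    sdf = depth - voxel_z                       # signed distance to the surface
    valid = sdf > -trunc                        # skip voxels far behind the surface
    d = np.clip(sdf, -trunc, trunc) / trunc     # truncate and normalise to [-1, 1]
    w_new = weights + valid                     # each observation adds weight 1
    fused = np.where(valid, (tsdf * weights + d) / np.maximum(w_new, 1), tsdf)
    return fused, w_new
```

Averaging two noisy observations of the same surface places the zero crossing of the fused TSDF (the recovered surface) between them.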
Segmentation and tracking of partial planar templates
Abdelsalam Masoud, W. Hoff
We present an algorithm that can segment and track partial planar templates from a sequence of images taken by a moving camera. By “partial planar template”, we mean that the template is the projection of a surface patch that is only partially planar; some of the points may correspond to other surfaces. The algorithm segments each image template to identify the pixels that belong to the dominant plane, and determines the three-dimensional structure of that plane. We show that our algorithm can track such patches over a larger visual angle than algorithms that assume patches arise from a single planar surface. The new tracking algorithm is expected to improve the accuracy of visual simultaneous localization and mapping, especially in outdoor natural scenes where planar features are rare.
DOI: 10.1109/WACV.2014.6835731 (published 2014-03-24)
Cited by: 2
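The abstract does not specify how the dominant plane is identified; a generic RANSAC sketch of dominant-plane labelling over the template's back-projected points (function name and thresholds are illustrative, not the paper's algorithm):

```python
import numpy as np

def dominant_plane_ransac(points, n_iters=200, thresh=0.01, rng=None):
    """Label the points that belong to the dominant plane of a patch.

    points : (N, 3) array of back-projected 3D points
    Returns (inlier_mask, (n, d)) with plane equation n . x + d = 0.
    """
    rng = np.random.default_rng(rng)
    best_mask, best_plane = None, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:                     # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ p0
        dist = np.abs(points @ n + d)        # point-to-plane distances
        mask = dist < thresh
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (n, d)
    return best_mask, best_plane
```

Points failing the inlier test would correspond to the "other surfaces" the abstract mentions.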
Gradient based efficient feature selection
S. Z. Gilani, F. Shafait, A. Mian
Selecting a reduced set of relevant and non-redundant features for supervised classification problems is a challenging task. We propose a gradient based feature selection method which can search the feature space efficiently and select a reduced set of representative features. We test our proposed algorithm on five small and medium sized pattern classification datasets as well as two large 3D face datasets for computer vision applications. Comparison with state-of-the-art wrapper and filter methods shows that our proposed technique yields better classification results with fewer evaluations of the target classifier. The feature subset selected by our algorithm is representative of the classes in the data and has the least variation in classification accuracy.
DOI: 10.1109/WACV.2014.6836102 (published 2014-03-24)
Cited by: 8
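The abstract leaves the gradient criterion unspecified. One common gradient-based screening scores each feature by the magnitude of the loss gradient with respect to its weight at w = 0; for logistic loss this reduces to correlation with the residual y - 0.5. A hedged sketch of that generic idea, not the paper's method (both function names are invented for illustration):

```python
import numpy as np

def gradient_feature_scores(X, y):
    """First-order screening score |dL/dw_j| at w = 0 for logistic loss.

    At w = 0 every prediction is 0.5, so the per-feature gradient is the
    correlation between the standardised feature and the residual y - 0.5.
    """
    Xs = (X - X.mean(0)) / (X.std(0) + 1e-12)
    resid = y - 0.5
    return np.abs(Xs.T @ resid) / len(y)

def select_top_k(X, y, k):
    """Keep the k features with the largest first-order scores."""
    scores = gradient_feature_scores(X, y)
    return np.argsort(scores)[::-1][:k]
```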
Viewpoint-independent book spine segmentation
L. Talker, Y. Moses
We propose a method to precisely segment books on bookshelves in images taken from general viewpoints. The proposed segmentation algorithm overcomes difficulties due to text and texture on book spines, various book orientations under perspective projection, and book proximity. A shape-dependent active contour is used as a first step to establish a set of book spine candidates. A subset of these candidates is then selected by imposing spatial constraints on the assembly of spine candidates, formulating the selection problem as the maximal weighted independent set (MWIS) of a graph. The segmented book spines may be used by recognition systems (e.g., library automation), or rendered in computer graphics applications. We also propose a novel application that uses the segmented book spines to assist users in bookshelf reorganization, or to modify the image to create a bookshelf with a tidier look. Our method was successfully tested on challenging sets of images.
DOI: 10.1109/WACV.2014.6836066 (published 2014-03-24)
Cited by: 6
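The MWIS selection over spine candidates can be illustrated by exact enumeration (the abstract does not state the solver used; the conflict pairs here stand in for its spatial constraints, and exhaustive search is only feasible for the few dozen candidates a shelf image yields):

```python
from itertools import combinations

def max_weight_independent_set(weights, conflicts):
    """Pick the heaviest subset of candidates with no conflicting pair.

    weights   : list of candidate scores
    conflicts : set of frozenset pairs {i, j} that spatially overlap
    """
    n = len(weights)
    best, best_w = frozenset(), 0.0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            # independence check: no selected pair may conflict
            if any(frozenset(p) in conflicts for p in combinations(subset, 2)):
                continue
            w = sum(weights[i] for i in subset)
            if w > best_w:
                best, best_w = frozenset(subset), w
    return best, best_w
```

For larger candidate sets one would switch to a branch-and-bound or approximate MWIS solver; the objective stays the same.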
Detecting 3D geometric boundaries of indoor scenes under varying lighting
Jie Ni, Tim K. Marks, Oncel Tuzel, F. Porikli
The goal of this research is to identify 3D geometric boundaries in a set of 2D photographs of a static indoor scene under unknown, changing lighting conditions. A 3D geometric boundary is a contour located at a 3D depth discontinuity or a discontinuity in the surface normal. These boundaries can be used effectively for reasoning about the 3D layout of a scene. To distinguish 3D geometric boundaries from 2D texture edges, we analyze the illumination subspace of local appearance at each image location. In indoor time-lapse photography and surveillance video, we frequently see images that are lit by unknown combinations of uncalibrated light sources. We introduce an algorithm for semi-binary nonnegative matrix factorization (SBNMF) to decompose such images into a set of lighting basis images, each of which shows the scene lit by a single light source. These basis images provide a natural, succinct representation of the scene, enabling tasks such as scene editing (e.g., relighting) and shadow edge identification.
DOI: 10.1109/WACV.2014.6836125 (published 2014-03-24)
Cited by: 1
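A hedged sketch of semi-binary NMF by alternating updates: the binary coefficients (which lights are on in which image) are chosen by exhaustive search over on/off patterns, and the nonnegative basis images by a multiplicative update. The authors' actual SBNMF update rules may differ; this only shows the factorization's shape:

```python
import numpy as np
from itertools import product

def semi_binary_nmf(V, k, n_iters=50, rng=0):
    """Factor V (pixels x images) ~= W @ H with W >= 0 and H binary.

    H[l, j] = 1 means light l is on in image j; W holds lighting basis
    images. H columns come from exhaustive search over the 2^k on/off
    patterns (feasible for a handful of lights); W from a multiplicative
    nonnegative least-squares style update.
    """
    rng = np.random.default_rng(rng)
    W = rng.random((V.shape[0], k))
    patterns = np.array(list(product([0, 1], repeat=k)), dtype=float).T  # k x 2^k
    H = np.zeros((k, V.shape[1]))
    for _ in range(n_iters):
        # binary step: best on/off pattern per image
        recon = W @ patterns                            # pixels x 2^k
        for j in range(V.shape[1]):
            errs = ((recon - V[:, [j]]) ** 2).sum(0)
            H[:, j] = patterns[:, errs.argmin()]
        # nonnegative step: multiplicative update for W
        W *= (V @ H.T) / np.maximum(W @ (H @ H.T), 1e-12)
    return W, H
```

Both alternating steps are non-increasing in the Frobenius reconstruction error, so the fit never gets worse than predicting all-lights-off.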
3D Metric Rectification using Angle Regularity
Aamer Zaheer, Sohaib Khan
This paper proposes Automatic Metric Rectification of projectively distorted 3D structures for man-made scenes using Angle Regularity. Man-made scenes, such as buildings, are characterized by a profusion of mutually orthogonal planes and lines. Assuming the availability of planar segmentation, we search for the rectifying 3D homography which maximizes the number of orthogonal plane-pairs in the structure. We formulate the orthogonality constraints in terms of the Absolute Dual Quadric (ADQ). Using RANSAC, we first estimate the ADQ which maximizes the number of planes meeting at right angles. A rectifying homography recovered from the ADQ is then used as an initial guess for nonlinear refinement. Quantitative experiments show that the method is highly robust to the amount of projective distortion, the number of outliers (i.e. non-orthogonal planes) and noise in structure recovery. Unlike previous literature, this method does not rely on any knowledge of the cameras or images, and no global model, such as Manhattan World, is imposed.
DOI: 10.1109/WACV.2014.6836121 (published 2014-03-24)
Cited by: 2
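The RANSAC score, the number of plane pairs meeting at (approximately) right angles, can be sketched as a pairwise test on plane normals (the angular tolerance is an assumed parameter, not one taken from the paper):

```python
import numpy as np

def count_orthogonal_pairs(normals, tol_deg=3.0):
    """RANSAC-style score: plane pairs whose normals are within
    tol_deg of perpendicular."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    # |cos(angle)| < sin(tol)  <=>  angle within tol of 90 degrees
    sin_tol = np.sin(np.deg2rad(tol_deg))
    count = 0
    for i in range(len(n)):
        for j in range(i + 1, len(n)):
            if abs(n[i] @ n[j]) < sin_tol:
                count += 1
    return count
```

In the paper's loop, each candidate ADQ yields a rectifying homography, the plane normals are transformed by it, and the hypothesis scoring the most orthogonal pairs wins.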
A combination of generative and discriminative models for fast unsupervised activity recognition from traffic scene videos
M. V. Krishna, Joachim Denzler
Recent approaches in traffic and crowd scene analysis make extensive use of non-parametric hierarchical Bayesian models for intelligent clustering of features into activities. Although this has yielded impressive results, it requires the use of time-consuming Bayesian inference during both training and classification. Therefore, we seek to limit Bayesian inference to the training stage, where unsupervised clustering is performed to extract semantically meaningful activities from the scene. In the testing stage, we use discriminative classifiers, taking advantage of their relative simplicity and fast inference. Experiments on publicly available datasets show that our approach is comparable in classification accuracy to state-of-the-art methods and provides a significant speed-up in the testing phase.
DOI: 10.1109/WACV.2014.6836042 (published 2014-03-24)
Cited by: 7
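The two-stage design, slow unsupervised clustering at training time and fast discriminative assignment at test time, can be sketched with simple stand-ins: k-means in place of the hierarchical Bayesian model, nearest-centroid in place of the discriminative classifier (both are substitutions for illustration, not the paper's models):

```python
import numpy as np

def kmeans(X, k, n_iters=50, rng=0):
    """Training stage stand-in: unsupervised discovery of 'activities'."""
    rng = np.random.default_rng(rng)
    C = X[rng.choice(len(X), k, replace=False)]       # initial centroids
    for _ in range(n_iters):
        labels = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                C[j] = X[labels == j].mean(0)
    return C, labels

def classify(C, x):
    """Testing stage stand-in: fast discriminative assignment."""
    return int(np.argmin(((C - x) ** 2).sum(-1)))
```

The point of the split is that the expensive inference runs once, offline; test-time classification is a cheap distance computation.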
A CRF approach to fitting a generalized hand skeleton model
R. Mihail, G. Blomquist, Nathan Jacobs
We present a new point distribution model capable of modeling joint subluxation (shifting) in rheumatoid arthritis (RA) patients and an approach to fitting this model to posteroanterior view hand radiographs. We formulate this shape fitting problem as inference in a conditional random field. This model combines potential functions that focus on specific anatomical structures and a learned shape prior. We evaluate our approach on two datasets: one containing relatively healthy hands and one containing hands of rheumatoid arthritis patients. We provide an empirical analysis of the relative value of different potential functions. We also show how to use the fitted hand skeleton to initialize a process for automatically estimating bone contours, which is a challenging, but important, problem in RA disease progression assessment.
DOI: 10.1109/WACV.2014.6836070 (published 2014-03-24)
Cited by: 5
Pedestrian detection in low resolution videos
Hisham Sager, W. Hoff
Pedestrian detection in low resolution videos can be challenging. In outdoor surveillance scenarios, the size of pedestrians in the images is often very small (around 20 pixels tall). The most common and successful approaches for single frame pedestrian detection use gradient-based features and a support vector machine classifier. We propose an extension of these ideas, and develop a new algorithm that extracts gradient features from a spatiotemporal volume, consisting of a short sequence of images (about one second in duration). The additional information provided by the motion of the person compensates for the loss of resolution. On standard datasets (PETS2001, VIRAT) we show a significant improvement in performance over single-frame detection.
DOI: 10.1109/WACV.2014.6836038 (published 2014-03-24)
Cited by: 10
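Extracting gradient features from a spatiotemporal volume can be sketched as a magnitude-weighted orientation histogram pooled over a short frame stack (a minimal illustration of the pooling idea only; the paper's descriptor and its cell layout are not specified in the abstract, and the function name is invented):

```python
import numpy as np

def spatiotemporal_gradient_histogram(volume, n_bins=8):
    """Orientation histogram over a short image sequence (t, y, x).

    Spatial gradients are pooled over the whole volume, weighted by
    magnitude, so motion across frames changes which gradients appear.
    """
    gy, gx = np.gradient(volume.astype(float), axis=(1, 2))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)              # unsigned orientation
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())           # magnitude-weighted vote
    return hist / (hist.sum() + 1e-12)
```

For roughly 20-pixel-tall pedestrians, pooling over about a second of frames supplies the motion evidence that a single low-resolution frame lacks.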