
2015 IEEE International Conference on Computer Vision (ICCV): Latest Publications

Query Adaptive Similarity Measure for RGB-D Object Recognition
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.25
Yanhua Cheng, Rui Cai, Chi Zhang, Zhiwei Li, Xin Zhao, Kaiqi Huang, Y. Rui
This paper studies the problem of improving the top-1 accuracy of RGB-D object recognition. Despite the impressive top-5 accuracies achieved by existing methods, their top-1 accuracies are not very satisfactory. The reasons are two-fold: (1) existing similarity measures are sensitive to object pose and scale changes, as well as intra-class variations, and (2) effectively fusing RGB and depth cues is still an open problem. To address these problems, this paper first proposes a new similarity measure based on dense matching, through which objects in comparison are warped and aligned to better tolerate variations. Towards RGB and depth fusion, we argue that a constant, golden weight doesn't exist: the two modalities contribute differently when comparing objects from different categories. To capture this dynamic characteristic, a group of matchers equipped with various fusion weights is constructed to explore the responses of dense matching under different fusion configurations. All the response scores are finally merged in a learning-to-combine fashion, which provides quite good generalization ability in practice. The proposed approach achieves the best results on several public benchmarks, e.g., 92.7% top-1 test accuracy on the Washington RGB-D object dataset, a 5.1% improvement over the state of the art.
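To make the matcher-bank idea concrete, here is a minimal numpy sketch, not the authors' implementation: per-patch RGB and depth matching costs are blended under a bank of fusion weights, each blend yields one dense-matching response, and a learned linear combination (random stand-in weights here) merges the responses into a single query-adaptive similarity. The array sizes and the 11-weight grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-patch matching costs between a query object and one
# gallery object: rows are query patches, columns are gallery candidates.
rgb_cost = rng.random((64, 64))
depth_cost = rng.random((64, 64))

# A bank of matchers, each with a different RGB/depth fusion weight alpha.
alphas = np.linspace(0.0, 1.0, 11)

def matcher_responses(rgb_cost, depth_cost, alphas):
    """One dense-matching response per fusion weight: blend the two cost
    maps, take the best match per query patch, and average the costs."""
    responses = []
    for a in alphas:
        cost = a * rgb_cost + (1.0 - a) * depth_cost
        responses.append(-cost.min(axis=1).mean())  # higher = more similar
    return np.array(responses)

# 'Learning to combine': the final similarity is a weighted sum of the
# per-matcher responses; the weights would be trained (random stand-ins here).
combine_w = rng.random(len(alphas))
combine_w /= combine_w.sum()
similarity = combine_w @ matcher_responses(rgb_cost, depth_cost, alphas)
print(similarity)
```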
Citations: 12
Model-Based Tracking at 300Hz Using Raw Time-of-Flight Observations
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.408
Jan Stühmer, Sebastian Nowozin, A. Fitzgibbon, R. Szeliski, Travis Perry, S. Acharya, D. Cremers, J. Shotton
Consumer depth cameras have dramatically improved our ability to track rigid, articulated, and deformable 3D objects in real-time. However, depth cameras have a limited temporal resolution (frame-rate) that restricts the accuracy and robustness of tracking, especially for fast or unpredictable motion. In this paper, we show how to perform model-based object tracking that reconstructs the object's depth at an order-of-magnitude higher frame-rate through simple modifications to an off-the-shelf depth camera. We focus on phase-based time-of-flight (ToF) sensing, which reconstructs each low frame-rate depth image from a set of short-exposure 'raw' infrared captures. These raw captures are taken in quick succession near the beginning of each depth frame, and differ in the modulation of their active illumination. We make two contributions. First, we detail how to perform model-based tracking against these raw captures. Second, we show that by reprogramming the camera to space the raw captures uniformly in time, we obtain a 10x higher frame-rate, and thereby improve the ability to track fast-moving objects.
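For background on what a 'raw' capture is, the sketch below shows the textbook four-bucket phase reconstruction that conventional phase-based ToF cameras apply before any tracking; the modulation frequency and the noise-free target are illustrative assumptions. The paper's contribution is to track against each raw capture individually instead of waiting for all four.

```python
import numpy as np

C = 299_792_458.0   # speed of light (m/s)
F_MOD = 50e6        # hypothetical modulation frequency (Hz)

def depth_from_raw(i0, i90, i180, i270):
    """Textbook 4-bucket phase reconstruction: one depth frame is assembled
    from four short-exposure raw correlation captures taken at
    0/90/180/270-degree phase offsets of the active illumination."""
    phase = np.mod(np.arctan2(i90 - i270, i0 - i180), 2 * np.pi)
    return C * phase / (4 * np.pi * F_MOD)

# Idealized raw captures for a flat target 1.2 m away (no noise, no ambient).
true_depth = 1.2
phi = 4 * np.pi * F_MOD * true_depth / C
raw = [np.cos(phi - np.deg2rad(o)) for o in (0, 90, 180, 270)]
print(depth_from_raw(*raw))   # ~1.2
```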
Citations: 20
Towards Pointless Structure from Motion: 3D Reconstruction and Camera Parameters from General 3D Curves
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.272
Irina Nurutdinova, A. Fitzgibbon
Modern structure from motion (SfM) remains dependent on point features to recover camera positions, meaning that reconstruction is severely hampered in low-texture environments, for example when scanning a plain coffee cup on an uncluttered table. We show how 3D curves can be used to refine camera position estimation in challenging low-texture scenes. In contrast to previous work, we allow the curves to be only partially observed in each image, meaning that, for the first time, curve-based SfM can be demonstrated in realistic scenes. The algorithm is based on bundle adjustment and so needs an initial estimate, but even a poor estimate from a few point correspondences can be substantially improved by including curves, suggesting that this method would benefit many existing systems.
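A minimal sketch of the core idea, bundle adjustment over a curve with latent point-to-curve correspondences, is below. It assumes a toy quadratic Bezier curve, two cameras with known identity rotations and unit focal length (the paper additionally refines camera parameters), and scipy's least_squares as the optimizer; every name and number is illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)

# Hypothetical setup: a quadratic Bezier space curve seen by two cameras.
P_true = np.array([[-1.0, 0.0, 4.0], [0.0, 1.0, 5.0], [1.0, 0.0, 6.0]])
cams = [np.zeros(3), np.array([0.5, 0.0, 0.0])]   # camera translations
n_obs = 8

def bezier(P, t):
    t = np.asarray(t)[:, None]
    return (1 - t) ** 2 * P[0] + 2 * t * (1 - t) * P[1] + t ** 2 * P[2]

def project(X, cam_t):
    Xc = X + cam_t
    return Xc[:, :2] / Xc[:, 2:]

# Simulate partially observed image curves: noisy 2D samples at unknown t.
t_true = np.linspace(0.1, 0.9, n_obs)
obs = [project(bezier(P_true, t_true), c) + 1e-3 * rng.standard_normal((n_obs, 2))
       for c in cams]

def residuals(x):
    # Unknowns: 9 control-point coordinates plus one curve parameter t per
    # 2D observation -- the point-to-curve correspondence is latent.
    P = x[:9].reshape(3, 3)
    ts = x[9:].reshape(len(cams), n_obs)
    r = [project(bezier(P, ts[i]), c) - obs[i] for i, c in enumerate(cams)]
    return np.concatenate(r).ravel()

# Start from a perturbed initial estimate (curve BA needs one) and refine.
x0 = np.concatenate([P_true.ravel() + 0.1 * rng.standard_normal(9),
                     np.tile(np.linspace(0.2, 0.8, n_obs), len(cams))])
sol = least_squares(residuals, x0)
print(sol.cost)   # near zero: curve and correspondences jointly refined
```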
Citations: 39
A Comprehensive Multi-Illuminant Dataset for Benchmarking of the Intrinsic Image Algorithms
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.28
Shida Beigpour, A. Kolb, Sven Kunz
In this paper, we provide a new, real photo dataset with precise ground-truth for intrinsic image research. Prior ground-truth datasets have been restricted to rather simple illumination conditions and scene geometries, or have been enhanced using image synthesis methods. The dataset provided in this paper is based on complex multi-illuminant scenarios under multi-colored illumination conditions and challenging cast shadows. We provide full per-pixel intrinsic ground-truth data for these scenarios, i.e., reflectance, specularity, shading, and illumination, as well as preliminary depth information. Furthermore, we evaluate three state-of-the-art intrinsic image recovery methods using our dataset.
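Benchmarking intrinsic decompositions typically scores a predicted reflectance against ground truth with a scale-invariant error, since the decomposition I = R x S is only defined up to a global scale. The snippet below is a minimal sketch of such a metric; the function name and toy images are assumptions, not the paper's evaluation code.

```python
import numpy as np

def si_mse(pred, gt):
    """Scale-invariant MSE: solve for the single scalar a minimizing
    ||a*pred - gt||^2 in closed form, then report the resulting MSE."""
    a = (pred * gt).sum() / max((pred * pred).sum(), 1e-12)
    return float(np.mean((a * pred - gt) ** 2))

# Toy check: an image built as reflectance * shading; a prediction that is
# correct up to a global scale scores ~0, one with shading left in does not.
rng = np.random.default_rng(2)
R = rng.random((4, 4))
S = rng.random((4, 4))
I = R * S
print(si_mse(0.5 * R, R))   # ~0: the scale is factored out
print(si_mse(I, R))         # > 0: shading contaminates the estimate
```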
Citations: 13
Joint Camera Clustering and Surface Segmentation for Large-Scale Multi-view Stereo
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.241
Runze Zhang, Shiwei Li, Tian Fang, Siyu Zhu, Long Quan
In this paper, we propose an optimal decomposition approach to large-scale multi-view stereo from an initial sparse reconstruction. The success of the approach depends on the introduction of surface-segmentation-based camera clustering rather than sparse-point-based camera clustering, which suffers from non-uniform reconstruction coverage and high redundancy. In detail, we introduce three criteria for camera clustering and surface segmentation for reconstruction, and then formulate these criteria into a constrained energy minimization problem. To solve this problem, we propose a joint optimization in a hierarchical framework to obtain the final surface segments and corresponding optimal camera clusters. On each level of the hierarchical framework, the camera clustering problem is formulated as the parameter estimation problem of a probability model solved by a General Expectation-Maximization algorithm, and the surface segmentation problem is formulated as a Markov Random Field model based on the probabilities estimated by the preceding camera clustering process. Experiments on several Internet datasets and aerial photo datasets demonstrate that the proposed approach generates more uniform and complete dense reconstructions with less redundancy, resulting in a more efficient multi-view stereo algorithm.
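As a loose illustration of the E-step/M-step alternation used for camera clustering, here is a self-contained isotropic-Gaussian EM over camera centers. The real method clusters with surface-segmentation-aware criteria inside a hierarchical framework, so the Gaussian model, cluster count, and data below are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical camera centers from a sparse reconstruction (two sites).
cams = np.vstack([rng.normal(0, 0.3, (20, 3)), rng.normal(5, 0.3, (20, 3))])

def em_camera_clusters(cams, k=2, iters=20):
    """Minimal isotropic-Gaussian EM: soft-assign cameras to k clusters
    (E-step), then refit cluster means, variances, and priors (M-step)."""
    mu = cams[rng.choice(len(cams), k, replace=False)]
    var = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        d2 = ((cams[:, None, :] - mu[None]) ** 2).sum(-1)        # N x k
        logp = np.log(pi) - 1.5 * np.log(var) - d2 / (2 * var)   # up to const
        resp = np.exp(logp - logp.max(1, keepdims=True))
        resp /= resp.sum(1, keepdims=True)                        # E-step
        nk = resp.sum(0)
        mu = (resp.T @ cams) / nk[:, None]                        # M-step
        d2 = ((cams[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (resp * d2).sum(0) / (3 * nk)
        pi = nk / len(cams)
    return resp.argmax(1)

print(em_camera_clusters(cams))   # two clean clusters of 20 cameras each
```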
Citations: 24
Multiple Hypothesis Tracking Revisited
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.533
Chanho Kim, Fuxin Li, A. Ciptadi, James M. Rehg
This paper revisits the classical multiple hypotheses tracking (MHT) algorithm in a tracking-by-detection framework. The success of MHT largely depends on the ability to maintain a small list of potential hypotheses, which can be facilitated with the accurate object detectors that are currently available. We demonstrate that a classical MHT implementation from the 1990s can come surprisingly close to the performance of state-of-the-art methods on standard benchmark datasets. In order to further utilize the strength of MHT in exploiting higher-order information, we introduce a method for training online appearance models for each track hypothesis. We show that appearance models can be learned efficiently via a regularized least squares framework, requiring only a few extra operations for each hypothesis branch. We obtain state-of-the-art results on popular tracking-by-detection datasets such as PETS and the recent MOT challenge.
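The regularized-least-squares appearance model admits a compact online form; the sketch below is one way to realize it, with the class name, feature dimension, and labels all assumptions. Keeping only the normal-equation statistics means each hypothesis branch costs a copy plus a rank-1 update, in the spirit of the 'few extra operations' claim.

```python
import copy
import numpy as np

class RidgeAppearanceModel:
    """Per-hypothesis appearance model via regularized least squares:
    w = (X^T X + lam*I)^-1 X^T y over the features seen along one track
    hypothesis. Storing only (X^T X, X^T y) makes branching cheap."""

    def __init__(self, dim, lam=1.0):
        self.A = lam * np.eye(dim)   # accumulates X^T X plus the regularizer
        self.b = np.zeros(dim)       # accumulates X^T y

    def update(self, x, y):
        self.A += np.outer(x, x)     # rank-1 update per new detection
        self.b += y * x

    def score(self, x):
        w = np.linalg.solve(self.A, self.b)
        return float(w @ x)

    def branch(self):
        return copy.deepcopy(self)   # spawn a child hypothesis

rng = np.random.default_rng(4)
model = RidgeAppearanceModel(dim=8)
target = rng.random(8)
for _ in range(10):
    model.update(target + 0.05 * rng.standard_normal(8), 1.0)  # detections
    model.update(rng.random(8), 0.0)                           # distractors
print(model.score(target), model.score(rng.random(8)))  # high vs. low
```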
Citations: 571
You are Here: Mimicking the Human Thinking Process in Reading Floor-Plans
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.255
Hang Chu, Dong Ki Kim, Tsuhan Chen
A human can easily find his or her way in an unfamiliar building by walking around and reading the floor-plan. We try to mimic and automate this human thinking process. More precisely, we introduce a new and useful task: locating a user in the floor-plan using only a camera and the floor-plan, without any other prior information. We address the problem with a novel matching-localization algorithm inspired by human logic. We demonstrate through experiments that our method outperforms state-of-the-art floor-plan-based localization methods by a large margin, while also being highly efficient for real-time applications.
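To give a feel for matching-based localization on a floor-plan (not the paper's algorithm, which matches camera observations rather than range scans), here is a toy grid sketch: every candidate cell's ray-cast wall-distance profile is compared against the profile 'observed' at the true position, with orientation assumed known for brevity.

```python
import numpy as np

# Hypothetical binary floor-plan grid: 1 = wall. A simple square room.
plan = np.zeros((40, 40), int)
plan[0, :] = plan[-1, :] = plan[:, 0] = plan[:, -1] = 1

def ray_profile(plan, pos, n_rays=16, max_range=60):
    """Distance to the nearest wall along n_rays directions from pos."""
    out = []
    for a in np.linspace(0, 2 * np.pi, n_rays, endpoint=False):
        d = np.array([np.cos(a), np.sin(a)])
        for r in range(1, max_range):
            y, x = np.round(pos + r * d).astype(int)
            if not (0 <= y < plan.shape[0] and 0 <= x < plan.shape[1]) or plan[y, x]:
                out.append(r)
                break
        else:
            out.append(max_range)
    return np.array(out, float)

# 'Observation': the profile a user would measure at an unknown spot.
true_pos = np.array([12.0, 25.0])
obs = ray_profile(plan, true_pos)

# Matching-localization: score every candidate cell against the observation.
best, best_err = None, np.inf
for y in range(2, 38):
    for x in range(2, 38):
        err = np.abs(ray_profile(plan, np.array([y, x], float)) - obs).mean()
        if err < best_err:
            best, best_err = (y, x), err
print(best)   # (12, 25): the candidate whose profile matches the observation
```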
Citations: 21
Component-Wise Modeling of Articulated Objects
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.268
Valsamis Ntouskos, Marta Sanzari, B. Cafaro, F. Nardi, Fabrizio Natola, F. Pirri, M. A. Garcia
We introduce a novel framework for modeling articulated objects based on the aspects of their components. By decomposing the object into components, we divide the problem into smaller modeling tasks. After obtaining 3D models for each component aspect by employing a shape deformation paradigm, we merge them together, forming the object components. The final model is obtained by assembling the components using an optimization scheme that fits the respective 3D models to the corresponding apparent contours in a reference pose. The results suggest that our approach can produce realistic 3D models of articulated objects in reasonable time.
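The assembly step is, at heart, a fitting problem between projected component geometry and observed contours. As a stand-in for that machinery, the sketch below solves the classical 2D similarity Procrustes alignment between two point sets; the contour data and the reduction to a single similarity transform are simplifying assumptions.

```python
import numpy as np

def procrustes_2d(src, dst):
    """Least-squares similarity transform (scale, rotation, translation)
    mapping src points onto dst -- the kind of fit used to place a
    component against its corresponding apparent contour."""
    sc, dc = src.mean(0), dst.mean(0)
    s, d = src - sc, dst - dc
    u, sig, vt = np.linalg.svd(s.T @ d)
    if np.linalg.det(vt.T @ u.T) < 0:   # enforce a proper rotation
        u[:, -1] *= -1
        sig[-1] *= -1
    R = vt.T @ u.T
    scale = sig.sum() / (s ** 2).sum()
    t = dc - scale * R @ sc
    return scale, R, t

rng = np.random.default_rng(5)
contour = rng.random((30, 2))           # toy 2D contour samples
angle = 0.4
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
moved = 1.7 * contour @ R_true.T + np.array([2.0, -1.0])
scale, R, t = procrustes_2d(contour, moved)
print(scale, np.allclose(scale * contour @ R.T + t, moved))  # 1.7 True
```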
Citations: 18
HICO: A Benchmark for Recognizing Human-Object Interactions in Images
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.122
Yu-Wei Chao, Zhan Wang, Yugeng He, Jiaxuan Wang, Jia Deng
We introduce a new benchmark "Humans Interacting with Common Objects" (HICO) for recognizing human-object interactions (HOI). We demonstrate the key features of HICO: a diverse set of interactions with common object categories, a list of well-defined, sense-based HOI categories, and an exhaustive labeling of co-occurring interactions with an object category in each image. We perform an in-depth analysis of representative current approaches and show that DNNs enjoy a significant edge. In addition, we show that semantic knowledge can significantly improve HOI recognition, especially for uncommon categories.
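Benchmarks of this kind are typically scored with mean average precision over the HOI categories, each image carrying multiple labels. A minimal sketch of that evaluation follows; the toy labels, scores, and category count are assumptions, not the benchmark's released protocol.

```python
import numpy as np

def average_precision(scores, labels):
    """Area under the precision-recall curve for one HOI category."""
    order = np.argsort(-scores)           # rank images by predicted score
    tp = labels[order].astype(float)
    prec = np.cumsum(tp) / (np.arange(len(tp)) + 1)
    return float((prec * tp).sum() / max(tp.sum(), 1))

# Toy multi-label setup: N images x C HOI categories (e.g. 'ride bicycle').
rng = np.random.default_rng(6)
labels = rng.random((100, 5)) < 0.2
scores = labels + 0.8 * rng.standard_normal((100, 5))  # noisy predictions
mAP = np.mean([average_precision(scores[:, c], labels[:, c])
               for c in range(5)])
print(mAP)
```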
Citations: 258
Learning Spatially Regularized Correlation Filters for Visual Tracking
Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.490
Martin Danelljan, Gustav Häger, F. Khan, M. Felsberg
Robust and accurate visual tracking is one of the most challenging computer vision problems. Due to the inherent lack of training data, a robust approach for constructing a target appearance model is crucial. Recently, discriminatively learned correlation filters (DCF) have been successfully applied to address this problem for tracking. These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood. However, the periodic assumption also introduces unwanted boundary effects, which severely degrade the quality of the tracking model. We propose Spatially Regularized Discriminative Correlation Filters (SRDCF) for tracking. A spatial regularization component is introduced in the learning to penalize correlation filter coefficients depending on their spatial location. Our SRDCF formulation allows the correlation filters to be learned on a significantly larger set of negative training samples, without corrupting the positive samples. We further propose an optimization strategy, based on the iterative Gauss-Seidel method, for efficient online learning of our SRDCF. Experiments are performed on four benchmark datasets: OTB-2013, ALOV++, OTB-2015, and VOT2014. Our approach achieves state-of-the-art results on all four datasets. On OTB-2013 and OTB-2015, we obtain an absolute gain of 8.0% and 8.2% respectively, in mean overlap precision, compared to the best existing trackers.
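The effect of the spatial regularizer is easy to see in one dimension. The following toy, a single 1D training signal, a Gaussian target response, and an assumed linearly growing weight w, solves the regularized normal equations explicitly (the paper uses an iterative Gauss-Seidel solver instead) and compares coefficient energy near the filter boundary under SRDCF-style weighting versus a uniform ridge.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 64

# 1D toy: one training signal x, a desired response y peaked at the target
# location, and a spatial weight w that grows away from the filter center.
x = rng.standard_normal(n)
y = np.exp(-0.5 * ((np.arange(n) - n // 2) / 2.0) ** 2)
w = 0.1 + 3.0 * np.abs(np.arange(n) - n // 2) / (n // 2)

# Circulant matrix so that C @ f is the circular correlation of x with f --
# the 'periodic assumption' the abstract refers to.
C = np.stack([np.roll(x, -k) for k in range(n)])

# Spatially regularized solution vs. a standard DCF-style uniform ridge.
f_srdcf = np.linalg.solve(C.T @ C + np.diag(w ** 2), C.T @ y)
f_dcf = np.linalg.solve(C.T @ C + 0.01 * np.eye(n), C.T @ y)

center = np.abs(np.arange(n) - n // 2) < n // 4
for name, f in (("SRDCF", f_srdcf), ("uniform ridge", f_dcf)):
    ratio = np.abs(f[~center]).mean() / np.abs(f[center]).mean()
    print(name, "boundary/center energy ratio:", round(float(ratio), 3))
```

The spatially weighted penalty drives the coefficients near the boundary toward zero, which is what suppresses the periodic boundary effects and lets the filter be trained on larger negative regions.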
Citations: 1749