2013 IEEE International Conference on Computer Vision: Latest Papers

Joint Segmentation and Pose Tracking of Human in Natural Videos
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.108
Taegyu Lim, Seunghoon Hong, Bohyung Han, J. Han
We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate the pose of the human from videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve the individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization: the segmentation step generates a foreground mask for human pose tracking, and the human pose tracking step provides a foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and present its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation.
Pages: 833-840
Citations: 16
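The alternation described in the abstract above (a segmentation step feeding a pose tracking step and vice versa, repeated until convergence) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `alternate`, `segment_step`, and `pose_step` are hypothetical, and plain scalars stand in for the foreground mask and the foreground response map purely to make the loop runnable.

```python
def alternate(segment_step, pose_step, response, max_iters=100, tol=1e-9):
    """Alternate two steps until neither output changes by more than tol.

    segment_step: maps a response value to a mask value (hypothetical stand-in).
    pose_step:    maps a mask value to a response value (hypothetical stand-in).
    Returns (mask, response, iterations_used).
    """
    mask = None
    for iteration in range(1, max_iters + 1):
        new_mask = segment_step(response)   # segmentation uses the current response
        new_response = pose_step(new_mask)  # pose tracking uses the new mask
        converged = (
            mask is not None
            and abs(new_mask - mask) < tol
            and abs(new_response - response) < tol
        )
        mask, response = new_mask, new_response
        if converged:
            break
    return mask, response, iteration


# Toy contraction mappings, chosen only so that the loop has a fixed point.
def toy_segment(response):
    return 0.5 * response + 1.0


def toy_pose(mask):
    return 0.5 * mask


mask, response, used = alternate(toy_segment, toy_pose, response=0.0)
```

With these toy mappings the fixed point is mask = 4/3 and response = 2/3. In a real system the convergence test would compare masks and response maps rather than scalars.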
Latent Data Association: Bayesian Model Selection for Multi-target Tracking
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.361
Aleksandr V. Segal, I. Reid
We propose a novel parametrization of the data association problem for multi-target tracking. In our formulation, the number of targets is implicitly inferred together with the data association, effectively solving data association and model selection as a single inference problem. The novel formulation allows us to interpret data association and tracking as a single Switching Linear Dynamical System (SLDS). We compute an approximate posterior solution to this problem using a dynamic programming/message passing technique. This inference-based approach allows us to incorporate richer probabilistic models into the tracking system. In particular, we incorporate inference over inliers/outliers and track termination times into the system. We evaluate our approach on publicly available datasets and demonstrate results competitive with, and in some cases exceeding, the state of the art.
Pages: 2904-2911
Citations: 58
Semi-dense Visual Odometry for a Monocular Camera
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.183
Jakob J. Engel, Jürgen Sturm, D. Cremers
We propose a fundamentally novel approach to real-time visual odometry for a monocular camera. It allows one to benefit from the simplicity and accuracy of dense tracking, which does not depend on visual features, while running in real time on a CPU. The key idea is to continuously estimate a semi-dense inverse depth map for the current frame, which in turn is used to track the motion of the camera using dense image alignment. More specifically, we estimate the depth of all pixels that have a non-negligible image gradient. Each estimate is represented as a Gaussian probability distribution over the inverse depth. We propagate this information over time, and update it with new measurements as new images arrive. In terms of tracking accuracy and computational speed, the proposed method compares favorably to both state-of-the-art dense and feature-based visual odometry and SLAM algorithms. As our method runs in real time on a CPU, it is of great practical value for robotics and augmented reality applications.
Pages: 1449-1456
Citations: 558
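The abstract above represents each inverse-depth estimate as a Gaussian and updates it as new measurements arrive. One standard way to combine two independent Gaussian estimates of the same quantity is the product-of-Gaussians (Kalman-style) update sketched below in Python. Whether the paper uses exactly this update is an assumption, and the function name `fuse_inverse_depth` is hypothetical.

```python
def fuse_inverse_depth(mean_prior, var_prior, mean_obs, var_obs):
    """Fuse two independent Gaussian estimates of one pixel's inverse depth.

    Each estimate is weighted by the other's variance, so the more certain
    estimate dominates; the fused variance is never larger than either input.
    """
    var_sum = var_prior + var_obs
    mean = (var_obs * mean_prior + var_prior * mean_obs) / var_sum
    var = (var_prior * var_obs) / var_sum
    return mean, var


# Two equally uncertain estimates: the result lies halfway between them
# and has half the variance of either one.
fused_mean, fused_var = fuse_inverse_depth(0.5, 0.04, 0.6, 0.04)
```

Working in inverse depth rather than depth is what the abstract states; the fusion rule itself applies to any scalar Gaussian estimate.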
Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.163
Elisabeta Marinoiu, Dragos Papava, C. Sminchisescu
Human motion analysis in images and video is a central computer vision problem. Yet, there are no studies that reveal how humans perceive other people in images and how accurate they are. In this paper we aim to unveil some of the processing, as well as the levels of accuracy, involved in the 3D perception of people from images by assessing human performance. Our contributions are: (1) the construction of an experimental apparatus that relates perception and measurement, in particular the visual and kinematic performance with respect to 3D ground truth when the human subject is presented with an image of a person in a given pose, (2) the creation of a dataset containing images, articulated 2D and 3D pose ground truth, as well as synchronized eye movement recordings of human subjects who were shown a variety of human body configurations, both easy and difficult, as well as their 're-enacted' 3D poses, (3) quantitative analysis revealing the human performance in 3D pose re-enactment tasks, the degree of stability in the visual fixation patterns of human subjects, and the way it correlates with different poses. We also discuss the implications of our findings for the construction of visual human sensing systems.
Pages: 1289-1296
Citations: 23
Handwritten Word Spotting with Corrected Attributes
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.130
Jon Almazán, Albert Gordo, A. Fornés, Ernest Valveny
We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to a unified representation of word images and strings, which seamlessly allows one to perform query-by-example, where the query is an image, and query-by-string, where the query is a string, interchangeably. We also propose a calibration scheme, based on Canonical Correlation Analysis, to correct the attribute scores; it greatly improves the results on a challenging dataset. We test our approach on two public datasets, showing state-of-the-art results.
Pages: 1017-1024
Citations: 59
Exploiting Reflection Change for Automatic Reflection Removal
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.302
Yu Li, M. S. Brown
This paper introduces an automatic method for removing reflection interference when imaging a scene behind a glass surface. Our approach exploits the subtle changes in the reflection with respect to the background in a small set of images taken at slightly different viewpoints. Key to this idea is the use of SIFT-flow to align the images such that a pixel-wise comparison can be made across the input set. Gradients with variation across the image set are assumed to belong to the reflected scenes, while constant gradients are assumed to belong to the desired background scene. By correctly labelling gradients as belonging to reflection or background, the background scene can be separated from the reflection interference. Unlike previous approaches that exploit motion, our approach does not make any assumptions regarding the geometry of the background or reflected scenes, nor does it require the reflection to be static. This makes our approach practical for use in casual imaging scenarios. Our approach is straightforward and produces good results compared with existing methods.
Pages: 2432-2439
Citations: 160
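The labelling idea in the abstract above (gradients that vary across the aligned image set are attributed to the reflection, constant ones to the background) can be illustrated with a deliberately simplified Python sketch. This is not the paper's actual criterion: the variance test, the `threshold` parameter, and the function name `label_gradients` are all assumptions made for illustration, and the images are assumed to be already aligned.

```python
from statistics import pvariance


def label_gradients(aligned_grads, threshold):
    """Label each pixel's gradient as 'reflection' or 'background'.

    aligned_grads: one list per image, each holding gradient magnitudes at
                   the same pixel positions (images assumed pre-aligned).
    threshold:     hypothetical variance cutoff separating the two classes.
    """
    labels = []
    for values in zip(*aligned_grads):  # values = one pixel across all images
        varies = pvariance(values) > threshold
        labels.append("reflection" if varies else "background")
    return labels


# Three aligned images, two pixels each: the first pixel's gradient is
# constant across views, the second changes from view to view.
grads = [
    [1.0, 0.9],
    [1.0, 0.1],
    [1.0, 0.5],
]
labels = label_gradients(grads, threshold=0.01)
```

A real pipeline would operate on full gradient fields after the SIFT-flow alignment that the abstract mentions, and would then reconstruct the background from the gradients labelled as background.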
Real-Time Solution to the Absolute Pose Problem with Unknown Radial Distortion and Focal Length
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.350
Z. Kukelova, Martin Bujnak, T. Pajdla
The problem of determining the absolute position and orientation of a camera from a set of 2D-to-3D point correspondences is one of the most important problems in computer vision, with a broad range of applications. In this paper we present a new solution to the absolute pose problem for a camera with unknown radial distortion and unknown focal length from five 2D-to-3D point correspondences. Our new solver is numerically more stable, more accurate, and significantly faster than the existing state-of-the-art minimal four-point absolute pose solvers for this problem. Moreover, our solver results in fewer solutions and can handle larger radial distortions. The new solver is straightforward and uses only simple concepts from linear algebra. Therefore it is simpler than the state-of-the-art Groebner basis solvers. We compare our new solver with the existing state-of-the-art solvers and show its usefulness on synthetic and real datasets.
Pages: 2816-2823
Citations: 101
Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.199
M. Narayana, A. Hanson, E. Learned-Miller
In moving camera videos, motion segmentation is commonly performed using the image plane motion of pixels, or optical flow. However, objects that are at different depths from the camera can exhibit different optical flows even if they share the same real-world motion. This can cause a depth-dependent segmentation of the scene. Our goal is to develop a segmentation algorithm that clusters pixels that have similar real-world motion irrespective of their depth in the scene. Our solution uses optical flow orientations instead of the complete vectors and exploits the well-known property that under camera translation, optical flow orientations are independent of object depth. We introduce a probabilistic model that automatically estimates the number of observed independent motions and results in a labeling that is consistent with real-world motion in the scene. The result of our system is that static objects are correctly identified as one segment, even if they are at different depths. Color features and information from previous frames in the video sequence are used to correct occasional errors due to the orientation-based segmentation. We present results on more than thirty videos from different benchmarks. The system is particularly robust on complex background scenes containing objects at significantly different depths.
Pages: 1577-1584
Citations: 125
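The property exploited in the abstract above, that under pure camera translation the orientation of optical flow does not depend on object depth, can be checked numerically. The Python sketch below assumes a pinhole camera with focal length `f`, a static scene point imaged at `(x, y)`, and the standard translational motion-field equations; the function names and all numeric values are illustrative and not taken from the paper.

```python
import math


def translational_flow(x, y, depth, translation, f):
    """Image-plane flow of a static point under pure camera translation.

    Depth only scales the flow vector (a 1/depth factor on both components),
    so it changes the magnitude but not the direction.
    """
    tx, ty, tz = translation
    u = (x * tz - f * tx) / depth
    v = (y * tz - f * ty) / depth
    return u, v


def orientation(u, v):
    """Flow orientation in radians."""
    return math.atan2(v, u)


camera_translation = (0.1, 0.0, 0.2)
near = translational_flow(10.0, -5.0, 2.0, camera_translation, f=500.0)
far = translational_flow(10.0, -5.0, 10.0, camera_translation, f=500.0)

# Same pixel, different depths: orientations agree, magnitudes do not.
same_direction = math.isclose(orientation(*near), orientation(*far))
magnitude_ratio = math.hypot(*near) / math.hypot(*far)
```

Here the point at depth 2 moves five times as fast in the image as the point at depth 10, yet both flows point the same way. That is why clustering on orientation avoids splitting a static background by depth, whereas clustering on full flow vectors does not.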
Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.227
Jiajia Luo, Wei Wang, H. Qi
Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. Noisy depth maps, varying lengths of action sequences, and free styles in performing actions may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. In particular, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometric relationships among features are also preserved in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves performance superior to state-of-the-art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.
Pages: 1809-1816
Citations: 215
An Enhanced Structure-from-Motion Paradigm Based on the Absolute Dual Quadric and Images of Circular Points
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.126
L. Calvet, Pierre Gurdjos
This work aims at introducing a new unified Structure from Motion (SfM) paradigm in which images of circular point-pairs can be combined with images of natural points. An imaged circular point-pair encodes the 2D Euclidean structure of a world plane and can easily be derived from the image of a planar shape, especially those including circles. A classical SfM method generally runs two steps: first, a projective factorization of all matched image points (into projective cameras and points), and second, a camera self-calibration that updates the obtained world from projective to Euclidean. This work shows how to introduce images of circular points in these two SfM steps, while its key contribution is to provide the theoretical foundations for combining "classical" linear self-calibration constraints with additional ones derived from such images. We show that the two proposed SfM steps clearly contribute to better results than the classical approach. We validate our contributions on synthetic and real images.
Citations: 2