
2011 IEEE Workshop on Applications of Computer Vision (WACV): latest publications

Comparing state-of-the-art visual features on invariant object recognition tasks
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711540
Nicolas Pinto, Youssef Barhomi, David D. Cox, J. DiCarlo
Tolerance (“invariance”) to identity-preserving image variation (e.g. variation in position, scale, pose, illumination) is a fundamental problem that any visual object recognition system, biological or engineered, must solve. While standard natural image database benchmarks are useful for guiding progress in computer vision, they can fail to probe the ability of a recognition system to solve the invariance problem [23, 24, 25]. Thus, to understand which computational approaches are making progress on solving the invariance problem, we compared and contrasted a variety of state-of-the-art visual representations using synthetic recognition tasks designed to systematically probe invariance. We successfully re-implemented a variety of state-of-the-art visual representations and confirmed their published performance on a natural image benchmark. We here report that most of these representations perform poorly on invariant recognition, but that one representation [21] shows significant performance gains over two baseline representations. We also show how this approach can more deeply illuminate the strengths and weaknesses of different visual representations and thus guide progress on invariant object recognition.
Citations: 72
Tracking planes with Time of Flight cameras and J-linkage
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711568
L. Schwarz, D. Mateus, Joé Lallemand, Nassir Navab
In this paper, we propose a method for detection and tracking of multiple planes in sequences of Time of Flight (ToF) depth images. Our approach extends the recent J-linkage algorithm for estimation of multiple model instances in noisy data to tracking. Instead of randomly selecting plane hypotheses in every image, we propagate plane hypotheses through the sequence of images, resulting in a significant reduction of computational load in every frame. We also introduce a multi-pass scheme that allows detecting and tracking planes of varying spatial extent along with their boundaries. Our qualitative and quantitative evaluation shows that the proposed method can robustly detect planes and consistently track the hypotheses through sequences of ToF images.
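The hypothesis-propagation idea can be sketched with a minimal plane-fitting routine: a plane hypothesis fitted in one frame is simply re-scored against the next frame's point cloud, and fresh random sampling is only needed when support collapses. The names below are illustrative, not the authors' code, and the J-linkage preference-set clustering itself is omitted.

```python
def plane_from_points(p1, p2, p3):
    """Fit a plane (unit normal n, offset d, with n.x + d = 0) through 3 non-collinear points."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1]*v[2] - u[2]*v[1],          # cross product u x v
         u[2]*v[0] - u[0]*v[2],
         u[0]*v[1] - u[1]*v[0]]
    norm = sum(c*c for c in n) ** 0.5
    n = [c / norm for c in n]
    d = -sum(n[i]*p1[i] for i in range(3))
    return n, d

def inliers(points, plane, tol=0.02):
    """Points within tol of the plane support (vote for) the hypothesis."""
    n, d = plane
    return [p for p in points
            if abs(sum(n[i]*p[i] for i in range(3)) + d) <= tol]

# A hypothesis from frame t is re-scored on frame t+1's depth point cloud;
# while its support stays high, no new random sampling is required.
plane = plane_from_points((0, 0, 0), (1, 0, 0), (0, 1, 0))   # the plane z = 0
cloud = [(0.2, 0.3, 0.001), (0.5, 0.1, -0.01), (0.4, 0.9, 0.8)]
support = inliers(cloud, plane)
```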
Citations: 12
Multiple ant tracking with global foreground maximization and variable target proposal distribution
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711555
Mary Fletcher, A. Dornhaus, M. Shin
Motion and behavior analysis of social insects such as ants requires tracking many ants over time. This process is highly labor-intensive and tedious. Automatic tracking is challenging as ants often interact with one another, resulting in frequent occlusions that cause drifts in tracking. In addition, tracking many objects is computationally expensive. In this paper, we present a robust and efficient method for tracking multiple ants. We first prevent drifts by maximizing the coverage of foreground pixels at a global scale. Second, we improve speed by reducing Markov chain length through dynamically changing the target proposal distribution for perturbed ant selection. Using a real dataset with ground truth, we demonstrate that our algorithm was able to improve the accuracy by 15% (resulting in 98% tracking accuracy) and the speed by 76%.
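The global foreground maximization objective can be illustrated as scoring a joint configuration of ant masks by the fraction of foreground pixels their union explains, so a single ant drifting off its target lowers the global score. A toy sketch over sets of pixel coordinates (illustrative, not the authors' implementation):

```python
def coverage_score(foreground, ant_masks):
    """Fraction of foreground pixels explained by the union of all ant masks.

    foreground: set of (row, col) foreground pixels from background subtraction.
    ant_masks:  one pixel set per tracked ant hypothesis.
    """
    covered = set().union(*ant_masks) if ant_masks else set()
    return len(foreground & covered) / len(foreground)

# Four foreground pixels; the two hypotheses jointly cover three of them,
# so the configuration scores 0.75. A drifted hypothesis covering only
# background would lower this global score and be rejected.
foreground = {(0, 0), (0, 1), (1, 0), (1, 1)}
score = coverage_score(foreground, [{(0, 0), (0, 1)}, {(1, 0), (2, 2)}])
```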
Citations: 24
Action recognition: A region based approach
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711517
Hakan Bilen, Vinay P. Namboodiri, L. Gool
We address the problem of recognizing actions in real-life videos. Space-time interest point-based approaches have been widely prevalent towards solving this problem. In contrast, more spatially extended features such as regions have not been so popular. The reason is that any local region based approach requires the motion flow information for a specific region to be collated temporally. This is challenging as the local regions are deformable and not well delineated from the surroundings. In this paper we address this issue by using robust tracking of regions and we show that it is possible to obtain region descriptors for classification of actions. This paper lays the groundwork for further investigation into region based approaches. Through this paper we make the following contributions: a) we advocate identification of salient regions based on motion segmentation; b) we adopt a state-of-the-art tracker for robust tracking of the identified regions rather than using isolated space-time blocks; c) we propose optical flow based region descriptors to encode the extracted trajectories in piece-wise blocks.
Citations: 12
Realistic stereo error models and finite optimal stereo baselines
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711535
Zhang Tao, T. Boult
Stereo reconstruction is an important research and application area, both for general 3D reconstruction and for operations like robotic navigation and remote sensing. This paper addresses the determination of parameters for a stereo system to optimize/minimize 3D reconstruction errors. Previous work on error analysis in stereo reconstruction optimized error in disparity space, which led to the erroneous conclusion that, ignoring matching errors, errors decrease when the baseline goes to infinity. In this paper, we derive the first formal error model based on the more realistic “point-of-closest-approach” ray model used in modern stereo systems. We then show that this results in a finite optimal baseline that minimizes reconstruction errors in all three world directions. We also show why previous oversimplified error analysis results in infinite baselines. We derive the mathematical relationship between the error variances and the stereo system parameters. In our analysis, we consider the situations where errors exist in only one camera as well as errors in both cameras. We have derived the results for both parallel and verged systems, though only the simpler models are presented algebraically herein. The paper includes simulations to highlight the results and validate the approximations in the error propagation.
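The “point-of-closest-approach” ray model the error analysis builds on can be illustrated by triangulating a 3D point as the midpoint of the shortest segment between the two viewing rays. The function below is a generic sketch of that construction (standard closest-point-between-lines algebra), not the paper's error model itself:

```python
def closest_approach_midpoint(p1, d1, p2, d2):
    """Midpoint of the shortest segment between rays p1 + s*d1 and p2 + t*d2.

    p1, p2: camera centers; d1, d2: ray directions (need not be unit length).
    Assumes the rays are not parallel (denominator would then vanish).
    """
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    w0 = [p1[i] - p2[i] for i in range(3)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p1[i] + s * d1[i] for i in range(3)]    # closest point on ray 1
    q2 = [p2[i] + t * d2[i] for i in range(3)]    # closest point on ray 2
    return [(q1[i] + q2[i]) / 2 for i in range(3)]

# Two cameras on a baseline of 2 looking at a point at depth 5:
# the rays intersect exactly, so the midpoint recovers (0, 0, 5).
q = closest_approach_midpoint((-1, 0, 0), (1, 0, 5), (1, 0, 0), (-1, 0, 5))
```

Perturbing `d1` or `d2` slightly (to model pixel noise) and watching how `q` moves is the kind of experiment whose variance the paper characterizes analytically as a function of baseline.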
Citations: 10
Classification of image registration problems using support vector machines
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711526
S. Oldridge, S. Fels, G. Miller
This paper introduces a system that automatically classifies image pairs based on the type of registration required to align them. The system uses support vector machines to classify between panoramas, high-dynamic-range images, focal stacks, super-resolution, and unrelated image pairs. A feature vector was developed to describe the images, and 1100 pairs were used to train and test the system with 5-fold cross validation. The system is able to classify the desired registration application using a 1:many classifier with an accuracy of 91.18%. Similarly, 1:1 classifiers were developed for each class with classification rates as follows: Panorama image pairs are classified at 93.15%, high-dynamic-range pairs at 97.56%, focal stack pairs at 95.68%, super-resolution pairs at 99.25%, and finally unrelated image pairs at 95.79%. An investigation into feature importance outlines the utility of each feature individually. In addition, the invariance of the classification system towards the size of the image used to calculate the feature vector was explored.
Citations: 3
Information fusion in low-resolution iris videos using Principal Components Transform
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711512
Raghavender R. Jillela, A. Ross, P. Flynn
The focus of this work is on improving the recognition performance of low-resolution iris video frames acquired under varying illumination. To facilitate this, an image-level fusion scheme with modest computational requirements is proposed. The proposed algorithm uses the evidence of multiple image frames of the same iris to extract discriminatory information via the Principal Components Transform (PCT). Experimental results on a subset of the MBGC NIR iris database demonstrate the utility of this scheme to achieve improved recognition accuracy when low-resolution probe images are compared against high-resolution gallery images.
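One common way to realize PCA/PCT-weighted frame fusion is to weight aligned frames by the leading eigenvector of their inter-frame covariance, so frames that carry most of the common signal dominate the fused result. The sketch below (pure-Python power iteration, illustrative names) shows that general idea only; it does not claim to match the authors' exact formulation.

```python
def pct_fuse(frames):
    """Fuse aligned, equal-length frames (flattened pixel lists) with weights
    taken from the leading principal component of their covariance."""
    n, m = len(frames), len(frames[0])
    means = [sum(f) / m for f in frames]
    centered = [[x - mu for x in f] for f, mu in zip(frames, means)]
    # n x n covariance between frames (frames are the variables here)
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / m
            for j in range(n)] for i in range(n)]
    v = [1.0] * n
    for _ in range(50):                      # power iteration
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    weights = [x / sum(v) for x in v]        # renormalize to sum to 1
    return [sum(weights[i] * frames[i][k] for i in range(n)) for k in range(m)]

# Two identical "frames" receive equal weights, so fusion reproduces them.
fused = pct_fuse([[1, 2, 3, 4], [1, 2, 3, 4]])
```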
Citations: 24
Illumination change compensation techniques to improve kinematic tracking
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711536
M. R. Bales, D. Forsthoefel, D. S. Wills, L. Wills
Illumination changes present challenging problems to video surveillance algorithms tasked with identifying and tracking objects. Illumination changes can drastically alter the appearance of a scene, causing truly salient features to be lost amid otherwise stable background. We describe an illumination change compensation method that identifies large, stable, chromatically distinct background features, called BigBackground regions, which are used as calibration anchors for scene correction. The benefits of this method are demonstrated for a computationally low-cost kinematic tracking application as it attempts to track objects during illumination changes. The BigBackground-based method is compared with other compensation techniques, and is found to successfully track 60% to 80% more objects during illumination changes. Video sequences of pedestrian and vehicular traffic are used for evaluation.
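The calibration-anchor idea can be caricatured as a single multiplicative gain estimated from a stable background region: if a BigBackground region darkens, the whole frame is rescaled so that region matches its reference appearance. A minimal sketch with hypothetical names, treating a frame as a dict mapping pixel coordinates to intensities:

```python
def compensate(frame, big_background_px, ref_mean):
    """Rescale a frame so a stable background region matches its reference mean.

    frame:             {(row, col): intensity} for the current frame.
    big_background_px: pixel coordinates of one stable background region.
    ref_mean:          that region's mean intensity under reference lighting.
    """
    cur_mean = sum(frame[p] for p in big_background_px) / len(big_background_px)
    gain = ref_mean / cur_mean
    return {p: v * gain for p, v in frame.items()}

# The anchor pixel reads 50 but should read 100, so a gain of 2 is
# applied to every pixel, restoring the scene's nominal brightness.
frame = {(0, 0): 50.0, (0, 1): 80.0}
bright = compensate(frame, [(0, 0)], ref_mean=100.0)
```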
Citations: 4
An analysis of facial shape and texture for recognition: A large scale evaluation on FRGC ver2.0
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711505
R. Abiantun, Utsav Prabhu, Keshav Seshadri, J. Heo, M. Savvides
Traditional approaches to face recognition have utilized aligned facial images containing both shape and texture information. This paper analyzes the contributions of the individual facial shape and texture components to face recognition. These two components are evaluated independently and we investigate methods to combine the information gained from each of them to enhance face recognition performance. The contributions of this paper are the following: (1) to the best of our knowledge, it is the first large-scale study of how face recognition is influenced by shape and texture as all of our results are benchmarked against traditional approaches on the challenging NIST FRGC ver2.0 experiment 4 dataset, (2) we empirically show that shape information is reasonably discriminative, (3) we demonstrate significant improvement in performance by registering texture with dense shape information, and finally (4) show that fusing shape and texture information consistently boosts recognition results across different subspace-based algorithms.
Citations: 5
Personalized video summarization with human in the loop
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711483
Bohyung Han, Jihun Hamm, Jack Sim
In automatic video summarization, a visual summary is typically constructed based on the analysis of low-level features, with little consideration of video semantics. However, the contextual and semantic information of a video is only marginally related to low-level features in practice, although they are useful for computing visual similarity between frames. Therefore, we propose a novel video summarization technique, where the semantically important information is extracted from a set of keyframes given by a human and the summary of a video is constructed based on automatic temporal segmentation using the analysis of inter-frame similarity to the keyframes. Toward this goal, we model a video sequence with a dissimilarity matrix based on a bidirectional similarity measure between every pair of frames, and subsequently characterize the structure of the video by a nonlinear manifold embedding. Then, we formulate video summarization as a variant of the 0–1 knapsack problem, which is solved efficiently by dynamic programming. The effectiveness of our algorithm is illustrated quantitatively and qualitatively using realistic videos collected from YouTube.
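The 0–1 knapsack formulation has a standard dynamic-programming solution: choose segments maximizing total importance subject to a summary-length budget. A minimal sketch (the importances and lengths are illustrative stand-ins, not the paper's similarity-derived scores):

```python
def summarize(values, lengths, budget):
    """0-1 knapsack via DP: maximize total segment importance with total
    segment length <= budget. Returns (best score, chosen segment indices)."""
    best = [(0.0, [])] * (budget + 1)        # best[w]: optimum at capacity w
    for i, (v, l) in enumerate(zip(values, lengths)):
        nxt = list(best)                     # next layer: item i now allowed
        for w in range(l, budget + 1):
            score, picks = best[w - l]       # take item i on top of old optimum
            if score + v > nxt[w][0]:
                nxt[w] = (score + v, picks + [i])
        best = nxt
    return best[budget]

# Three candidate segments with importances 6, 10, 12 and lengths 1, 2, 3;
# with budget 5 the optimum keeps segments 1 and 2 (total importance 22).
score, picks = summarize([6, 10, 12], [1, 2, 3], budget=5)
```

Iterating items in an outer loop and capacities in an inner loop over the previous layer enforces the 0–1 constraint (each segment used at most once), unlike the unbounded-knapsack recurrence.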
{"title":"Personalized video summarization with human in the loop","authors":"Bohyung Han, Jihun Hamm, Jack Sim","doi":"10.1109/WACV.2011.5711483","DOIUrl":"https://doi.org/10.1109/WACV.2011.5711483","url":null,"abstract":"In automatic video summarization, visual summary is constructed typically based on the analysis of low-level features with little consideration of video semantics. However, the contextual and semantic information of a video is marginally related to low-level features in practice although they are useful to compute visual similarity between frames. Therefore, we propose a novel video summarization technique, where the semantically important information is extracted from a set of keyframes given by human and the summary of a video is constructed based on the automatic temporal segmentation using the analysis of inter-frame similarity to the keyframes. Toward this goal, we model a video sequence with a dissimilarity matrix based on bidirectional similarity measure between every pair of frames, and subsequently characterize the structure of the video by a nonlinear manifold embedding. Then, we formulate video summarization as a variant of the 0–1 knapsack problem, which is solved by dynamic programming efficiently. The effectiveness of our algorithm is illustrated quantitatively and qualitatively using realistic videos collected from YouTube.","PeriodicalId":424724,"journal":{"name":"2011 IEEE Workshop on Applications of Computer Vision (WACV)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124903179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
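The 0–1 knapsack formulation in the abstract above can be sketched with a standard dynamic program over temporal segments. This is a sketch under stated assumptions: segment durations, importance scores, and the length budget below are hypothetical inputs, whereas in the paper the scores come from inter-frame similarity to the user's keyframes.

```python
def select_segments(durations, scores, budget):
    """0-1 knapsack over temporal segments: maximize total importance
    subject to a summary-length budget (durations and budget in frames)."""
    n = len(durations)
    # dp[c] = best achievable score with capacity c; keep[i][c] records choices
    dp = [0.0] * (budget + 1)
    keep = [[False] * (budget + 1) for _ in range(n)]
    for i in range(n):
        # Iterate capacity in reverse so each segment is used at most once
        for c in range(budget, durations[i] - 1, -1):
            cand = dp[c - durations[i]] + scores[i]
            if cand > dp[c]:
                dp[c] = cand
                keep[i][c] = True
    # Backtrack to recover the selected segment indices
    chosen, c = [], budget
    for i in range(n - 1, -1, -1):
        if keep[i][c]:
            chosen.append(i)
            c -= durations[i]
    return sorted(chosen), dp[budget]

durations = [3, 5, 2, 4]        # segment lengths (frames)
scores = [4.0, 6.0, 3.0, 5.0]   # hypothetical importance per segment
chosen, total = select_segments(durations, scores, budget=8)
print(chosen, total)  # → [0, 1] 10.0
```

The reverse capacity loop is what makes this the 0–1 (rather than unbounded) knapsack: each segment can enter the summary at most once, and the backtrack recovers which segments achieved the optimum.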