
Latest publications: 2011 IEEE Workshop on Applications of Computer Vision (WACV)

Multiple ant tracking with global foreground maximization and variable target proposal distribution
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711555
Mary Fletcher, A. Dornhaus, M. Shin
Motion and behavior analysis of social insects such as ants requires tracking many ants over time. This process is highly labor-intensive and tedious. Automatic tracking is challenging because ants often interact with one another, resulting in frequent occlusions that cause tracking drift. In addition, tracking many objects is computationally expensive. In this paper, we present a robust and efficient method for tracking multiple ants. We first prevent drift by maximizing the coverage of foreground pixels at a global scale. Second, we improve speed by reducing the Markov chain length through dynamically changing the target proposal distribution for perturbed ant selection. Using a real dataset with ground truth, we demonstrate that our algorithm improves accuracy by 15% (resulting in 98% tracking accuracy) and speed by 76%.
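A minimal sketch of the variable-proposal idea as we read it: during MCMC tracking moves, the ant chosen for perturbation is sampled in proportion to how poorly its current hypothesis explains the foreground, so chain length drops because moves concentrate on badly fit targets. The function names and the softmax weighting are illustrative assumptions, not the authors' code.

```python
import numpy as np

def proposal_weights(coverage_errors, temperature=1.0):
    # Softmax over per-ant foreground-coverage errors: badly fit ants
    # are proposed for perturbation more often (hypothetical weighting).
    w = np.exp(np.asarray(coverage_errors, dtype=float) / temperature)
    return w / w.sum()

def pick_target(rng, coverage_errors):
    # Sample the index of the ant to perturb in the next chain move.
    return rng.choice(len(coverage_errors), p=proposal_weights(coverage_errors))

rng = np.random.default_rng(0)
errors = [0.05, 0.40, 0.10]      # ant 1 currently explains foreground worst
print(pick_target(rng, errors))  # ant 1 is the most likely proposal
```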
Citations: 24
Information fusion in low-resolution iris videos using Principal Components Transform
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711512
Raghavender R. Jillela, A. Ross, P. Flynn
The focus of this work is on improving the recognition performance of low-resolution iris video frames acquired under varying illumination. To facilitate this, an image-level fusion scheme with modest computational requirements is proposed. The proposed algorithm uses the evidence of multiple image frames of the same iris to extract discriminatory information via the Principal Components Transform (PCT). Experimental results on a subset of the MBGC NIR iris database demonstrate the utility of this scheme to achieve improved recognition accuracy when low-resolution probe images are compared against high-resolution gallery images.
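A hedged sketch of one common reading of PCA/PCT image-level fusion, assuming the frames are already aligned: weight each frame by the leading eigenvector of the inter-frame covariance matrix. The function name and the eigenvector-weighting scheme are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def pct_fuse(frames):
    # Stack N aligned frames as rows: (N, H*W).
    X = np.stack([f.ravel().astype(float) for f in frames])
    # Leading eigenvector of the N x N inter-frame covariance matrix.
    vals, vecs = np.linalg.eigh(np.cov(X))
    w = np.abs(vecs[:, -1])
    w /= w.sum()                       # nonnegative fusion weights
    fused = (w[:, None] * X).sum(axis=0)
    return fused.reshape(frames[0].shape)

frames = [np.random.rand(64, 64) for _ in range(4)]  # stand-in iris frames
print(pct_fuse(frames).shape)                        # (64, 64)
```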
Citations: 24
Action recognition: A region based approach
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711517
Hakan Bilen, Vinay P. Namboodiri, L. Gool
We address the problem of recognizing actions in real-life videos. Space-time interest-point-based approaches have been widely used to solve this problem. In contrast, more spatially extended features such as regions have not been as popular. The reason is that any local-region-based approach requires the motion flow information for a specific region to be collated temporally. This is challenging because local regions are deformable and not well delineated from their surroundings. In this paper we address this issue by using robust tracking of regions, and we show that it is possible to obtain region descriptors for classification of actions. This paper lays the groundwork for further investigation into region-based approaches. Through this paper we make the following contributions: a) we advocate identification of salient regions based on motion segmentation; b) we adopt a state-of-the-art tracker for robust tracking of the identified regions rather than using isolated space-time blocks; c) we propose optical-flow-based region descriptors to encode the extracted trajectories in piece-wise blocks. We demonstrate the performance of our system on real-world data sets.
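A rough sketch of an optical-flow region descriptor in the spirit described, assuming dense flow fields and per-frame region masks from the tracker are available; the histogram binning and temporal blocking choices are illustrative, not the paper's exact encoding.

```python
import numpy as np

def flow_histogram(flow, mask, bins=8):
    # Magnitude-weighted orientation histogram of flow inside the region.
    # flow: (H, W, 2) dense flow; mask: (H, W) boolean region support.
    fx, fy = flow[..., 0][mask], flow[..., 1][mask]
    ang = np.arctan2(fy, fx) % (2 * np.pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi),
                           weights=np.hypot(fx, fy))
    return hist / (hist.sum() + 1e-8)

def region_descriptor(flows, masks, block=5, bins=8):
    # One averaged histogram per temporal block along the trajectory.
    hists = [flow_histogram(f, m, bins) for f, m in zip(flows, masks)]
    return np.concatenate([np.mean(hists[i:i + block], axis=0)
                           for i in range(0, len(hists), block)])
```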
Citations: 12
Tracking planes with Time of Flight cameras and J-linkage
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711568
L. Schwarz, D. Mateus, Joé Lallemand, Nassir Navab
In this paper, we propose a method for detection and tracking of multiple planes in sequences of Time of Flight (ToF) depth images. Our approach extends the recent J-linkage algorithm for estimation of multiple model instances in noisy data to tracking. Instead of randomly selecting plane hypotheses in every image, we propagate plane hypotheses through the sequence of images, resulting in a significant reduction of computational load in every frame. We also introduce a multi-pass scheme that allows detecting and tracking planes of varying spatial extent along with their boundaries. Our qualitative and quantitative evaluation shows that the proposed method can robustly detect planes and consistently track the hypotheses through sequences of ToF images.
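A simplified sketch of the propagation idea as stated: instead of drawing fresh random plane hypotheses in every frame, last frame's planes are refit to their inliers in the new depth cloud. The inlier tolerance and support threshold are placeholder values, and the helper names are our own.

```python
import numpy as np

def fit_plane(pts):
    # Total-least-squares plane (n, d) with n.p + d = 0 through 3D points.
    c = pts.mean(axis=0)
    n = np.linalg.svd(pts - c)[2][-1]
    return n, -n @ c

def propagate_hypotheses(prev_planes, cloud, tol=0.02, min_support=50):
    # Refit the previous frame's planes on the new ToF point cloud
    # (N, 3) rather than sampling fresh random hypotheses.
    kept = []
    for n, d in prev_planes:
        inliers = cloud[np.abs(cloud @ n + d) < tol]
        if len(inliers) >= min_support:
            kept.append(fit_plane(inliers))
    return kept
```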
Citations: 12
Illumination change compensation techniques to improve kinematic tracking
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711536
M. R. Bales, D. Forsthoefel, D. S. Wills, L. Wills
Illumination changes present challenging problems to video surveillance algorithms tasked with identifying and tracking objects. Illumination changes can drastically alter the appearance of a scene, causing truly salient features to be lost amid an otherwise stable background. We describe an illumination change compensation method that identifies large, stable, chromatically distinct background features, called BigBackground regions, which are used as calibration anchors for scene correction. The benefits of this method are demonstrated on a computationally low-cost kinematic tracking application that attempts to track objects during illumination changes. The BigBackground-based method is compared with other compensation techniques and is found to successfully track 60% to 80% more objects during illumination changes. Video sequences of pedestrian and vehicular traffic are used for evaluation.
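A minimal sketch, assuming the BigBackground anchor regions have already been segmented into a boolean mask: a per-channel gain computed over the anchors maps the current frame back to the reference frame's illumination. The gain-only correction model is our assumption; the paper may use a richer correction.

```python
import numpy as np

def illumination_gain(ref, cur, anchor_mask):
    # Per-channel gain mapping the current anchors' mean color back to
    # the reference frame's (assumed gain-only illumination model).
    # ref, cur: (H, W, 3) uint8 frames; anchor_mask: (H, W) bool.
    ref_mean = ref[anchor_mask].mean(axis=0).astype(float)
    cur_mean = cur[anchor_mask].mean(axis=0).astype(float)
    return ref_mean / np.maximum(cur_mean, 1e-6)

def compensate(cur, gain):
    # Rescale the whole frame so the anchor regions match the reference.
    return np.clip(cur.astype(float) * gain, 0, 255).astype(np.uint8)
```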
Citations: 4
Classification of image registration problems using support vector machines
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711526
S. Oldridge, S. Fels, G. Miller
This paper introduces a system that automatically classifies image pairs based on the type of registration required to align them. The system uses support vector machines to classify between panoramas, high-dynamic-range images, focal stacks, super-resolution, and unrelated image pairs. A feature vector was developed to describe the images, and 1100 pairs were used to train and test the system with 5-fold cross validation. The system is able to classify the desired registration application using a 1:Many classifier with an accuracy of 91.18%. Similarly, 1:1 classifiers were developed for each class, with classification rates as follows: panorama image pairs are classified at 93.15%, high-dynamic-range pairs at 97.56%, focal stack pairs at 95.68%, super-resolution pairs at 99.25%, and finally unrelated image pairs at 95.79%. An investigation into feature importance outlines the utility of each feature individually. In addition, the invariance of the classification system to the size of the image used to calculate the feature vector was explored. The classification accuracy of our system remains level at ∼91% until the image size is scaled to 10% (150 × 100 pixels), suggesting that our feature vector is image size invariant within this range.
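A hedged sketch of the evaluation setup using scikit-learn, with random placeholders standing in for the paper's 1100 image-pair feature vectors and five-class labels; the RBF kernel and scaling pipeline are assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholders: one feature vector per image pair; labels are
# 0=panorama, 1=HDR, 2=focal stack, 3=super-resolution, 4=unrelated.
X = np.random.rand(1100, 16)
y = np.random.randint(0, 5, 1100)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross validation
print(scores.mean())
```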
Citations: 3
Comparing state-of-the-art visual features on invariant object recognition tasks
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711540
Nicolas Pinto, Youssef Barhomi, David D. Cox, J. DiCarlo
Tolerance (“invariance”) to identity-preserving image variation (e.g. variation in position, scale, pose, illumination) is a fundamental problem that any visual object recognition system, biological or engineered, must solve. While standard natural image database benchmarks are useful for guiding progress in computer vision, they can fail to probe the ability of a recognition system to solve the invariance problem [23, 24, 25]. Thus, to understand which computational approaches are making progress on solving the invariance problem, we compared and contrasted a variety of state-of-the-art visual representations using synthetic recognition tasks designed to systematically probe invariance. We successfully re-implemented a variety of state-of-the-art visual representations and confirmed their published performance on a natural image benchmark. Here we report that most of these representations perform poorly on invariant recognition, but that one representation [21] shows significant performance gains over two baseline representations. We also show how this approach can more deeply illuminate the strengths and weaknesses of different visual representations and thus guide progress on invariant object recognition.
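A schematic of the evaluation loop implied here, assuming a `render(level)` generator that produces synthetic images and labels at a given variation level and a `classify` pipeline built on the representation under test; both callables are hypothetical placeholders, not the paper's protocol.

```python
import numpy as np

def invariance_curve(classify, render, levels):
    # Accuracy of a representation+classifier as identity-preserving
    # variation (e.g. position/scale jitter) increases.
    accs = []
    for level in levels:
        images, labels = render(level)          # hypothetical generator
        preds = np.array([classify(im) for im in images])
        accs.append(float(np.mean(preds == np.asarray(labels))))
    return accs
```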
Citations: 72
Temporally consistent multi-class video-object segmentation with the Video Graph-Shifts algorithm
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711561
Albert Y. C. Chen, Jason J. Corso
We present the Video Graph-Shifts (VGS) approach for efficiently incorporating temporal consistency into MRF energy minimization for multi-class video object segmentation. In contrast to previous methods, our dynamic temporal links avoid the computational overhead of using a fully connected spatiotemporal MRF, while still being able to deal with the uncertainties of exact inter-frame pixel correspondence. The dynamic temporal links are initialized flexibly to balance speed and accuracy, and are automatically revised whenever a label change (shift) occurs during the energy minimization process. We show on the benchmark CamVid database and our own wintry driving dataset that VGS effectively improves temporally inconsistent segmentation, with enhancements of up to 5% to 10% for those semantic classes with high intra-class variance. Furthermore, VGS processes each frame at pixel resolution in about one second, which provides a practical way of modeling complex probabilistic relationships in videos and solving them in near real-time.
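A toy sketch of the energy being minimized, under our reading: a standard unary-plus-Potts MRF augmented with one dynamic temporal link per pixel to its corresponding label in the previous frame. The data structures and weights are illustrative, not the paper's formulation.

```python
def vgs_energy(labels, unary, spatial_edges, temporal_links,
               lam=1.0, tau=1.0):
    # labels: {pixel: class}; unary: {pixel: {class: cost}};
    # spatial_edges: [(p, q), ...] within-frame Potts neighbors;
    # temporal_links: {pixel: previous-frame label} -- one dynamic link
    # per pixel instead of a fully connected spatiotemporal MRF.
    E = sum(unary[p][labels[p]] for p in labels)
    E += lam * sum(labels[p] != labels[q] for p, q in spatial_edges)
    E += tau * sum(labels[p] != prev for p, prev in temporal_links.items())
    return E
```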
Citations: 37
Aligning surfaces without aligning surfaces
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711500
Geoffrey Oxholm, K. Nishino
We introduce a novel method for matching and aligning 3D surfaces that do not have any overlapping surface information. When two matching surfaces do not overlap, all that remains in common between them is a thin strip along their borders. Aligning such fragments is challenging but crucial for various applications, such as reassembly of thin-shell ceramics from their broken pieces. Past work approaches this problem by relying heavily on simplistic assumptions about the shape of the object or its texture. Our method makes no such assumptions; instead, we leverage the geometric and photometric similarity of the matching surfaces along the break-line. We first encode the shape and color of the boundary contour of each fragment at various scales in a novel 2D representation. Reformulating contour matching as 2D image registration based on these scale-space images enables efficient and accurate break-line matching. We then align the fragments by estimating the rotation around the break-line, maximizing the geometric continuity across it with a least-squares minimization. We evaluate our method on real-world colonial artifacts recently excavated in Philadelphia, Pennsylvania. Our system dramatically increases the ease and efficiency with which users reassemble artifacts, as we demonstrate on three different vessels.
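The rotation step admits a closed form under simplifying assumptions (break-line axis through the origin, border point correspondences already established by the contour matching): minimizing the least-squares alignment error over the single rotation angle reduces to an atan2. This sketch is our reconstruction, not the authors' implementation.

```python
import numpy as np

def rotation_about_axis(points, targets, axis):
    # Angle theta about the unit break-line `axis` maximizing
    # sum_i targets_i . R(theta) @ points_i, which is the least-squares
    # rotation once translation is factored out (Rodrigues expansion).
    u = axis / np.linalg.norm(axis)
    a = np.sum(np.sum(points * targets, axis=1) - (points @ u) * (targets @ u))
    b = np.sum(np.sum(np.cross(u, points) * targets, axis=1))
    return np.arctan2(b, a)
```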
Citations: 4
An analysis of facial shape and texture for recognition: A large scale evaluation on FRGC ver2.0
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711505
R. Abiantun, Utsav Prabhu, Keshav Seshadri, J. Heo, M. Savvides
Traditional approaches to face recognition have utilized aligned facial images containing both shape and texture information. This paper analyzes the contributions of the individual facial shape and texture components to face recognition. These two components are evaluated independently, and we investigate methods to combine the information gained from each of them to enhance face recognition performance. The contributions of this paper are the following: (1) to the best of our knowledge, it is the first large-scale study of how face recognition is influenced by shape and texture, as all of our results are benchmarked against traditional approaches on the challenging NIST FRGC ver2.0 experiment 4 dataset; (2) we empirically show that shape information is reasonably discriminative; (3) we demonstrate significant improvement in performance by registering texture with dense shape information; and finally (4) we show that fusing shape and texture information consistently boosts recognition results across different subspace-based algorithms.
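A minimal sketch of score-level fusion of the two cues, assuming each pipeline yields match scores against a common gallery; z-score normalization followed by a weighted sum is one standard fusion rule, not necessarily the paper's exact scheme.

```python
import numpy as np

def fuse_scores(shape_scores, texture_scores, w=0.5):
    # Sum-rule fusion of the shape-only and texture-only recognizers'
    # match scores after z-score normalization of each modality.
    def z(s):
        s = np.asarray(s, dtype=float)
        return (s - s.mean()) / (s.std() + 1e-8)
    return w * z(shape_scores) + (1.0 - w) * z(texture_scores)
```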
Citations: 5