
Latest publications from the 2011 IEEE Workshop on Applications of Computer Vision (WACV)

A parallel region based object recognition system
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711487
Bor-Yiing Su, T. Brutch, K. Keutzer
Object recognition is a key problem in the field of computer vision. However, highly accurate object recognition systems are also computationally intensive, which limits their applicability. In this paper, we focus on a state-of-the-art object recognition system. We identify its key computations, examine efficient algorithms for parallelizing them, and develop a parallel object recognition system. The time taken by the training procedure on 127 images, with an average size of 0.15 M pixels, is reduced from 2332 seconds to 20 seconds. Similarly, the classification time for one 0.15 M-pixel image is reduced from 331 seconds to 2.78 seconds. This efficient implementation makes it practical to train on hundreds of images, and to analyze image databases containing hundreds or thousands of images, within minutes, which was previously not possible.
Citations: 2
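A quick check of the speedups implied by the timings above (numbers taken directly from the abstract):

```python
# Speedups implied by the reported timings.
train_before, train_after = 2332.0, 20.0   # training on 127 images, seconds
cls_before, cls_after = 331.0, 2.78        # classifying one 0.15 M-pixel image, seconds

train_speedup = train_before / train_after   # ~117x
cls_speedup = cls_before / cls_after         # ~119x
print(f"training: {train_speedup:.0f}x, classification: {cls_speedup:.0f}x")
```

Both phases come out at roughly two orders of magnitude, consistent with the claim that databases of hundreds of images become analyzable in minutes.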
Multi-view human action recognition system employing 2DPCA
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711513
Mohamed A. Naiel, M. Abdelwahab, M. El-Saban
A novel algorithm for view-invariant human action recognition is presented. The approach applies Two-Dimensional Principal Component Analysis (2DPCA) directly to the Motion Energy Image (MEI) or the Motion History Image (MHI), in both the spatial domain and the transform domain. Compared with the most recent reports in the field, the method reduces computational complexity by a factor of at least 66 and achieves the highest recognition accuracy per camera while maintaining minimum storage requirements. Experimental results on the Weizmann action and INRIA IXMAS datasets confirm the excellent properties of the proposed algorithm, showing its robustness and its ability to work with a small number of training sequences. The dramatic reduction in computational complexity promotes its use in real-time applications.
Citations: 29
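The two ingredients the abstract names, MEI and 2DPCA, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the difference threshold is hypothetical:

```python
import numpy as np

def motion_energy_image(frames, thresh=25):
    """Binary MEI: the union of thresholded frame differences over a window
    (thresh is a hypothetical value, not from the paper)."""
    diffs = [np.abs(frames[i + 1].astype(int) - frames[i].astype(int)) > thresh
             for i in range(len(frames) - 1)]
    return np.any(diffs, axis=0).astype(np.uint8)

def twodpca_projection(images, k):
    """2DPCA: build the image scatter matrix G = mean((A - mean)^T (A - mean))
    and project every image onto its top-k eigenvectors, so each m x n image
    becomes an m x k feature matrix (no vectorization, unlike classic PCA)."""
    mean = np.mean(images, axis=0)
    G = np.mean([(a - mean).T @ (a - mean) for a in images], axis=0)
    _, vecs = np.linalg.eigh(G)      # eigenvalues in ascending order
    return [a @ vecs[:, -k:] for a in images]
```

Keeping the image as a matrix is what makes 2DPCA cheap: the eigendecomposition is of an n x n matrix (image width), not of a (m*n) x (m*n) covariance as in vectorized PCA.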
Bayesian 3D model based human detection in crowded scenes using efficient optimization
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711553
Lu Wang, N. Yung
In this paper, we address the problem of human detection in crowded scenes using a Bayesian 3D-model-based method. Human candidates are first nominated by a head detector and a foot detector; optimization is then performed to find the best configuration of the candidates and their corresponding shape models. The solution is obtained by decomposing the mutually related candidates into un-occluded and occluded ones in each iteration, and then performing model matching for the un-occluded candidates. To this end, in addition to some obvious cues, we derive a graph that depicts the inter-object relations so that unreasonable decompositions are avoided. The merit of the proposed optimization procedure is that its computational cost is similar to greedy optimization methods while its performance is comparable to global optimization approaches. Model matching employs both prior knowledge and image likelihood, where the priors include the distribution of individual shape models and the real-world restriction on inter-object distance, and the image likelihood is provided by foreground extraction and edge information. After model matching, a validation-and-rejection strategy based on minimum description length is applied to confirm the candidates with reliable matching results. The proposed method is tested on both the publicly available Caviar dataset and a challenging dataset of our own. The experimental results demonstrate the effectiveness of our approach.
Citations: 9
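The iteration described above, repeatedly separating un-occluded candidates from occluded ones, can be illustrated with a toy peeling loop; the `occluded_by` relation stands in for the paper's inter-object graph and is purely hypothetical:

```python
def peel_layers(occluded_by):
    """occluded_by[c] = set of candidates standing in front of c. Repeatedly
    extract the candidates with no remaining occluder; each extracted layer
    is the set that model matching would process in that iteration."""
    remaining = set(occluded_by)
    layers = []
    while remaining:
        front = {c for c in remaining if not (occluded_by[c] & remaining)}
        if not front:                       # cyclic relation: break arbitrarily
            front = {next(iter(remaining))}
        layers.append(front)
        remaining -= front
    return layers
```

For example, `peel_layers({'a': set(), 'b': {'a'}, 'c': {'b'}})` yields `[{'a'}, {'b'}, {'c'}]`: matching `a` first exposes `b`, which in turn exposes `c`.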
Vehicle detection from low quality aerial LIDAR data
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711551
Bo Yang, Pramod Sharma, R. Nevatia
In this paper we propose a vehicle detection framework for low-resolution aerial range data. Our system consists of three steps: data mapping, 2D vehicle detection, and post-processing. First, we map the range data into 2D grayscale images using the depth information only. For this purpose we propose a novel local ground-plane estimation method, and the estimated ground plane is further refined by a global refinement process. We then compute the depth value of missing points (points for which no depth information is available) with an effective interpolation method. In the second step, to train a classifier for vehicles, we describe a method to generate more training examples from very few training annotations, and adopt the fast cascade Adaboost approach for detecting vehicles in 2D grayscale images. Finally, in the post-processing step we design a novel method to detect vehicles that are comprised of clusters of missing points. We evaluate our method on real aerial data, and the experiments demonstrate the effectiveness of our approach.
Citations: 12
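The first step, mapping range data to a 2D grayscale image using only depth, can be sketched as follows; filling missing returns with the median is a crude stand-in for the paper's interpolation method, and the ground plane is assumed to be already estimated:

```python
import numpy as np

def range_to_grayscale(height):
    """Map heights above the (assumed pre-estimated) ground plane to an
    8-bit grayscale image. NaNs mark missing LIDAR returns and are filled
    with the median valid height -- a crude stand-in for the paper's
    interpolation step."""
    valid = ~np.isnan(height)
    filled = np.where(valid, height, np.median(height[valid]))
    lo, hi = filled.min(), filled.max()
    norm = np.zeros_like(filled) if hi == lo else (filled - lo) / (hi - lo)
    return (norm * 255).astype(np.uint8)
```

The resulting image can then be fed to an ordinary 2D detector, which is the point of the mapping step.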
Human gait estimation using a wearable camera
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711514
Yoshihiro Watanabe, Tetsuo Hatanaka, T. Komuro, M. Ishikawa
We focus on the growing need for a technology that can achieve motion capture in outdoor environments. The conventional approaches have relied mainly on fixed installed cameras. With this approach, however, it is difficult to capture motion in everyday surroundings. This paper describes a new method for motion estimation using a single wearable camera. We focused on walking motion. The key point is how the system can estimate the original walking state using limited information from a wearable sensor. This paper describes three aspects: the configuration of the sensing system, gait representation, and the gait estimation method.
Citations: 14
2D Barcode localization and motion deblurring using a flutter shutter camera
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711498
W. Xu, Scott McCloskey
We describe a system for localizing and deblurring motion-blurred 2D barcodes. Previous work on barcode detection and deblurring has focused mainly on 1D barcodes and has employed traditional image acquisition, which is not robust to motion blur. Our solution is based on coded exposure imaging, which, as we show, enables well-posed deconvolution and decoding over a wider range of velocities. To support this solution, we developed a simple and effective approach for 2D barcode localization under motion blur, a metric for evaluating the quality of deblurred 2D barcodes, and an approach for motion direction estimation in coded exposure images. We tested our system on real camera images of three popular 2D barcode symbologies: Data Matrix, PDF417 and Aztec Code.
Citations: 52
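The key property of coded exposure is that the blur kernel, i.e. the binary flutter pattern, keeps energy at all frequencies, so deconvolution is well posed. A 1-D sketch with a toy 4-tap code and circular blur (the paper's actual code and 2D handling differ):

```python
import numpy as np

def deblur_coded(blurred, code):
    """Invert a 1-D circular coded-exposure blur by Fourier-domain division.
    A well-chosen binary code has no zeros in its spectrum, unlike the box
    kernel of an ordinary shutter, so the division is stable."""
    kernel = np.zeros(len(blurred))
    kernel[:len(code)] = np.asarray(code, float) / sum(code)
    return np.real(np.fft.ifft(np.fft.fft(blurred) / np.fft.fft(kernel)))
```

Blurring a known signal with the same kernel and calling `deblur_coded` recovers it to machine precision; with a box kernel (all ones) the spectrum can hit zeros and the division blows up, which is exactly why an ordinary shutter makes the problem ill-posed.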
Dense point-to-point correspondences between 3D faces using parametric remeshing for constructing 3D Morphable Models
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711481
M. Kaiser, Gernot Heym, Nicolas H. Lehment, D. Arsic, G. Rigoll
In this contribution a novel method to compute dense point-to-point correspondences between 3D faces is presented. The correspondences can be employed for various face processing applications, for example for building up a 3D Morphable Model (3DMM). Paths connecting landmarks are traced on the 3D facial surface and the resulting patches are mapped into a uv-space. Triangle quadrisection is used to build up remeshes with high point density for each 3D facial surface. Each vertex of a remesh has one corresponding vertex in another remesh, and all remeshes have the same connectivity. The quality of the point-to-point correspondences is demonstrated on the basis of two applications, namely morphing and constructing a 3DMM.
Citations: 3
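Triangle quadrisection, the remeshing primitive named above, replaces every triangle with four by splitting each edge at its midpoint; reusing midpoints across neighbouring triangles is what keeps the connectivity identical across remeshes. A minimal sketch:

```python
def quadrisect(vertices, triangles):
    """One level of triangle quadrisection: split every edge at its midpoint,
    reusing midpoints shared between neighbouring triangles so the remesh
    stays consistent, and replace each triangle with four."""
    vertices = list(vertices)
    midpoint = {}

    def mid(i, j):
        key = (min(i, j), max(i, j))        # undirected edge id
        if key not in midpoint:
            vi, vj = vertices[i], vertices[j]
            vertices.append(tuple((a + b) / 2 for a, b in zip(vi, vj)))
            midpoint[key] = len(vertices) - 1
        return midpoint[key]

    out = []
    for a, b, c in triangles:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        out += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, out
```

Each level multiplies the triangle count by four, which is how the remeshes quickly reach the high point density the paper needs.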
Real-time illumination-invariant motion detection in spatio-temporal image volumes
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711572
Johan Almbladh, K. Netzell
An algorithm for robust motion detection in video is proposed in this work. The algorithm continuously analyses the dense pixel volume formed by the current frame and its nearest neighbours in time. By assuming continuity of motion in space and time, pixels on slanted edges in this time-space pixel volume are considered to be in motion. This is in contrast to prevailing foreground-background models for motion detection, which consider a pixel's history in aggregation. By using an efficient data-reduction scheme and leveraging the logical bit-parallel operations of current CPUs, real-time performance is achieved even on resource-scarce embedded devices. Video surveillance applications demand efficient algorithms that robustly detect motion across a wide variety of conditions without the need for on-site parameter adjustment. Experiments with real-world video show robust motion detection results with the proposed method, especially under conditions normally considered difficult, such as continuously changing illumination.
Citations: 1
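The slanted-edge idea can be illustrated on a three-frame (t, y, x) volume: a static edge forms a vertical structure in the x-t slice, while a moving edge is slanted, i.e. it shows both spatial and temporal intensity change. A toy detector (thresholds hypothetical, none of the paper's data reduction or bit-parallel tricks):

```python
import numpy as np

def motion_mask(volume, s_thresh=10, t_thresh=10):
    """Flag pixels of the centre frame that lie on a spatial edge AND change
    over time -- the signature of a slanted edge in the time-space volume.
    volume has shape (3, height, width); thresholds are hypothetical."""
    v = volume.astype(float)
    t_grad = np.abs(v[2] - v[0])                # temporal change at the centre frame
    x_grad = np.abs(np.gradient(v[1], axis=1))  # spatial edge strength
    return (t_grad > t_thresh) & (x_grad > s_thresh)
```

A global illumination change raises `t_grad` everywhere but leaves flat regions without spatial edges, so requiring both gradients is one simple reason the slant criterion tolerates lighting variation better than a per-pixel background model.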
Computationally efficient retrieval-based tracking system and augmented reality for large-scale areas
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711533
Wei Guan, Suya You, U. Neumann
We present a retrieval-based tracking system that requires less computational time and cost. The system tracks a user's location from a small portion of an image captured by the camera, and then refines the camera pose by propagating matches to the whole image. Augmented information such as building names and locations is delivered to the user. This progressive way of processing image data not only provides the user with location information in real time but, more importantly, reduces feature matching time by limiting the search ranges. The proposed system contains two parts: offline database building and online user tracking. The database is composed of image patches with features and location information. Images are captured at different locations of interest from different viewing angles and distances, and are then partitioned into smaller patches. The location of a user can be calculated by querying one or more patches of the captured image. Moreover, thanks to the patch approach, the system can handle large occlusions in images. Experiments show that the proposed tracking system is efficient and robust in many different environments.
Citations: 1
Soft margin keyframe comparison: Enhancing precision of fraud detection in retail surveillance
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711552
Jiyan Pan, Quanfu Fan, Sharath Pankanti, Hoang Trinh, Prasad Gabbur, S. Miyazawa
We propose a novel approach for enhancing precision in a leading video analytics system that detects cashier fraud in grocery stores for loss prevention. While intelligent video analytics has recently become a promising means of loss prevention for retailers, most of the real-world systems suffer from a large number of false alarms, resulting in a significant waste of human labor during manual verification. Our proposed approach starts with the candidate fraudulent events detected by a state-of-the-art system. Such fraudulent events are a set of visually recognized checkout-related activities of the cashier without barcode associations. Instead of conducting costly video analysis, we extract a few keyframes to represent the essence of each candidate fraudulent event, and compare those keyframes to identify whether or not the event is a valid check-out process that involves consistent appearance changes on the lead-in belt, the scan area and the take-away belt. Our approach also performs a margin-based soft classification so that the user could trade off between saving human labor and preserving high recall. Experiments on days of surveillance videos collected from real grocery stores show that our algorithm can save about 50% of human labor while preserving over 90% of true alarms with small computational overhead.
我们提出了一种新的方法,以提高精确度在领先的视频分析系统,检测收银员欺诈在杂货店的损失预防。虽然智能视频分析最近已经成为零售商预防损失的一种很有前途的手段,但大多数现实世界的系统都存在大量的假警报,导致人工验证过程中大量浪费人力。我们建议的方法从最先进的系统检测到的候选欺诈事件开始。这种欺诈事件是一组没有条形码关联的收银员的视觉识别的结帐相关活动。我们没有进行昂贵的视频分析,而是提取了几个关键帧来表示每个候选欺诈事件的本质,并比较这些关键帧来确定该事件是否是一个有效的检查过程,该过程涉及导入带、扫描区域和带走带上一致的外观变化。我们的方法还执行基于边际的软分类,以便用户可以在节省人力和保持高召回率之间进行权衡。对从真实杂货店收集的几天监控视频进行的实验表明,我们的算法可以节省约50%的人力,同时以较小的计算开销保留90%以上的真实警报。
Citations: 5
Journal:
2011 IEEE Workshop on Applications of Computer Vision (WACV)