
Latest publications: 2011 IEEE Workshop on Applications of Computer Vision (WACV)

Active stereo vision for improving long range hearing using a Laser Doppler Vibrometer
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711554
Tao Wang, Rui Li, Zhigang Zhu, Yufu Qu
Laser Doppler Vibrometers (LDVs) have been widely applied for detecting vibrations in applications such as mechanics, bridge inspection, biometrics, and long-range surveillance, in which acoustic signatures can be obtained at a large distance. However, in both industrial and scientific applications, LDVs are manually controlled in surface selection, laser focusing, and acoustic acquisition. In this paper, we propose an active stereo vision approach to facilitate fast and automated laser pointing and tracking for long-range LDV hearing. The system contains: 1) a mirror on a Pan-Tilt-Unit (PTU) to reflect the laser beam to any location freely and quickly, and 2) two Pan-Tilt-Zoom (PTZ) cameras, one of which is mounted on the PTU and aligned with the laser beam. Distance measurement using the stereo vision system, together with triangulation between the camera and the LDV laser beam, allows us to quickly focus the laser beam on selected surfaces and to obtain acoustic signals at up to 200 meters in real time. We present promising results of collaborative visual and LDV measurements for laser pointing and focusing to achieve long-range audio detection.
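The ranging step underlying such a system is ordinary rectified-stereo triangulation. A minimal sketch, assuming a calibrated and rectified camera pair with known focal length and baseline (all numbers below are hypothetical, not the paper's calibration):

```python
import numpy as np

def stereo_range(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel offset of the target between the two views
    focal_px:     focal length expressed in pixels (assumed known)
    baseline_m:   distance between the two camera centers in meters (assumed known)
    """
    if disparity_px <= 0:
        raise ValueError("target must have positive disparity")
    return focal_px * baseline_m / disparity_px

# Hypothetical example: a 2000 px focal length, 0.5 m baseline, and 5 px
# disparity place the target at 200 m -- the upper end of the range above.
print(stereo_range(5.0, 2000.0, 0.5))  # 200.0
```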
Citations: 7
Exploratory analysis of time-lapse imagery with fast subset PCA
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711523
Austin Abrams, Emily Feder, Robert Pless
In surveillance and environmental monitoring applications, it is common to have millions of images of a particular scene. While tools exist to find particular events, anomalies, and human actions and behaviors, there has been little investigation of tools that allow more exploratory searches in the data. This paper proposes modifications to PCA that enable users to quickly recompute low-rank decompositions for selected spatial and temporal subsets of the data. This process returns decompositions orders of magnitude faster than general PCA, and they are close to optimal in terms of reconstruction error. We show examples of real exploratory data analysis across several applications, including an interactive web application.
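The operation being accelerated is a rank-k decomposition of a user-selected pixel/frame subset. A minimal numpy sketch of that baseline operation only; the paper's fast recomputation trick is not reproduced, and the matrix sizes are made up:

```python
import numpy as np

def lowrank_subset(data: np.ndarray, rows: np.ndarray, cols: np.ndarray, k: int):
    """Rank-k decomposition of a spatial (rows) / temporal (cols) subset.

    data: (num_pixels, num_frames) matrix of vectorized images
    rows: indices of the selected pixels; cols: indices of the selected frames
    Returns basis U (pixels x k) and coefficients C (k x frames).
    """
    sub = data[np.ix_(rows, cols)]            # carve out the requested subset
    U, s, Vt = np.linalg.svd(sub, full_matrices=False)
    return U[:, :k], s[:k, None] * Vt[:k]     # best rank-k factors (Eckart-Young)

# Toy scene: 1000 "pixels" x 200 "frames"; query a region over half the frames.
data = np.random.rand(1000, 200)
U, C = lowrank_subset(data, np.arange(300), np.arange(100), k=5)
print((U @ C).shape)  # (300, 100) low-rank reconstruction of the subset
```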
Citations: 5
Co-training framework of generative and discriminative trackers with partial occlusion handling
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711565
T. Dinh, G. Medioni
Partial occlusion is a challenging problem in object tracking. In online visual tracking, it is the critical factor causing drift. To address this problem, we propose a novel approach using a co-training framework of generative and discriminative trackers. Our approach is able to detect the occluding region and continuously update both the generative and discriminative models using information from the non-occluded part. The generative model encodes all of the appearance variations using a low-dimensional subspace, which provides a strong reacquisition ability. Meanwhile, the discriminative classifier, an online support vector machine, focuses on separating the object from the background using a Histograms of Oriented Gradients (HOG) feature set. For each search window, an occlusion likelihood map is generated by the two trackers through a co-decision process. If the two trackers disagree, the movement vote of KLT local features is used as a referee. Precise occlusion segmentation is performed using MeanShift. Finally, each tracker recovers the occluded part and updates its own model using the new non-occluded information. Experimental results on challenging sequences with different types of objects are presented. We also compare with other state-of-the-art methods to demonstrate the superiority and robustness of our tracking framework.
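The discriminative half of such a framework pairs HOG descriptors with an online linear SVM. A hedged sketch using skimage's hog and sklearn's SGDClassifier with hinge loss as the online SVM; the patch size, HOG parameters, and labels are illustrative, not the authors' settings:

```python
import numpy as np
from skimage.feature import hog
from sklearn.linear_model import SGDClassifier

# Linear SVM trained online with hinge loss; partial_fit lets the tracker
# refresh the model each frame with patches from the non-occluded region.
clf = SGDClassifier(loss="hinge")

def hog_vec(patch: np.ndarray) -> np.ndarray:
    # 64x64 grayscale patch -> HOG descriptor (hypothetical parameters)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

# Hypothetical per-frame update: object patches labeled 1, background 0.
obj = [np.random.rand(64, 64) for _ in range(4)]
bg = [np.random.rand(64, 64) for _ in range(4)]
X = np.stack([hog_vec(p) for p in obj + bg])
y = np.array([1] * 4 + [0] * 4)
clf.partial_fit(X, y, classes=[0, 1])   # incremental (online) update
print(clf.predict(X[:1]))               # score a candidate search window
```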
Citations: 43
PIRF-Nav 2: Speeded-up online and incremental appearance-based SLAM in an indoor environment
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711496
Noppharit Tongprasit, Aram Kawewong, O. Hasegawa
This paper presents a fast, online, and incremental solution to the appearance-based loop closure detection problem in an indoor environment. This problem is important for the navigation of mobile robots. Appearance-based Simultaneous Localization And Mapping (SLAM) for a highly dynamic environment, called Position Invariant Robust Feature Navigation (PIRF-Nav), was first proposed by Kawewong et al. in 2010. Their results showed major improvements over other state-of-the-art methods. However, the computational expense of PIRF-Nav exceeds real-time limits, and it consumes a tremendous amount of memory. These two factors hinder the use of PIRF-Nav in mobile robot applications. This study proposes (i) a modified PIRF extraction that makes the system more suitable for an indoor environment and (ii) new dictionary management that eliminates redundant searching and conserves memory. According to the results, our proposed method can finish tasks up to 12 times faster than PIRF-Nav with only a slight decline in recall, while precision remains 1. In addition, for a more challenging task, we collected additional data from a crowded university canteen during lunch time. Even in this cluttered environment, our proposed method performs better in real-time processing than other methods.
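At its core, appearance-based loop closure asks whether the current view's descriptor matches any previously stored place. A generic sketch of that test using cosine similarity; the descriptors, dimensions, and threshold are invented, and this is neither PIRF extraction nor the paper's dictionary management:

```python
import numpy as np

def loop_closure_candidates(query: np.ndarray, dictionary: np.ndarray,
                            threshold: float = 0.8) -> np.ndarray:
    """Return indices of stored places whose appearance matches the query.

    query:      (d,) appearance descriptor of the current view
    dictionary: (n_places, d) descriptors of previously visited places
    """
    q = query / np.linalg.norm(query)
    D = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
    sims = D @ q                          # cosine similarity to every place
    return np.where(sims > threshold)[0]

# Toy map of 500 places with 128-d descriptors; revisit place 42 with noise.
places = np.random.randn(500, 128)
query = places[42] + 0.01 * np.random.randn(128)
print(loop_closure_candidates(query, places))  # [42]
```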
Citations: 16
Multi-modal summarization of key events and top players in sports tournament videos
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711541
D. Tjondronegoro, Xiaohui Tao, Johannes Sasongko, C. Lau
To detect and annotate the key events of live sports videos, we need to tackle the semantic gaps of audio-visual information. Previous work has successfully extracted semantics from time-stamped web match reports, which are synchronized with the video contents. However, web and social media articles without time-stamps have not been fully leveraged, even though they are increasingly used to complement the coverage of major sporting tournaments. This paper aims to address this limitation using a novel multimodal summarization framework based on sentiment analysis and players' popularity. It uses audiovisual contents, web articles, blogs, and commentators' speech to automatically annotate and visualize the key events and key players in sports tournament coverage. The experimental results demonstrate that the automatically generated video summaries are aligned with the events identified from the official website match reports.
Citations: 26
Experimental evidence of a template aging effect in iris biometrics
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711508
S. Fenker, K. Bowyer
It has been widely accepted that iris biometric systems are not subject to a template aging effect. Baker et al. [1] recently presented the first published evidence of a template aging effect, using images acquired from 2004 through 2008 with an LG 2200 iris imaging system, representing a total of 13 subjects (26 irises). We report on a template aging study involving two different iris recognition algorithms, a larger number of subjects (43), a more modern imaging system (LG 4000), and a shorter time lapse (2 years). We also investigate the degree to which the template aging effect may be related to pupil dilation and/or contact lenses. We find evidence of a template aging effect, resulting in an increase in match Hamming distance and false reject rate.
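The match score referenced here is the fractional Hamming distance between two binary iris codes, computed only over bits valid in both occlusion masks. A small sketch; the code length and the number of flipped "aged" bits are illustrative, not values from the study:

```python
import numpy as np

def hamming_distance(code_a: np.ndarray, code_b: np.ndarray,
                     mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Fractional Hamming distance between two binary iris codes,
    counting only bits that are valid (unoccluded) in both masks."""
    valid = mask_a & mask_b
    if valid.sum() == 0:
        return 1.0                       # no comparable bits
    return np.count_nonzero((code_a ^ code_b) & valid) / valid.sum()

# Two toy 2048-bit codes: a genuine pair drifts apart as the template ages,
# pushing its distance toward the reject threshold.
rng = np.random.default_rng(0)
enroll = rng.integers(0, 2, 2048, dtype=np.uint8)
aged = enroll.copy()
flip = rng.choice(2048, 200, replace=False)    # 200 hypothetical aged bits
aged[flip] ^= 1
mask = np.ones(2048, dtype=np.uint8)
print(hamming_distance(enroll, aged, mask, mask))  # ~0.098
```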
Citations: 77
Robust modified L2 local optical flow estimation and feature tracking
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711571
T. Senst, Volker Eiselein, Rubén Heras Evangelio, T. Sikora
This paper describes a robust method for local optical flow estimation and KLT feature tracking performed on the GPU. To this end, we present an estimator based on the L2 norm with robust characteristics. To increase robustness at discontinuities, we propose a strategy to adapt the region size used. The GPU implementation of our approach achieves real-time (>25 fps) performance on High Definition (HD) video sequences while tracking several thousand points. The benefit of the suggested enhancement is illustrated on the Middlebury optical flow benchmark.
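The baseline being robustified is the classic least-squares (L2) Lucas-Kanade estimator, solved independently per window. A numpy sketch of that baseline under a constant-flow-per-window assumption; the paper's modified norm and adaptive region size are not reproduced:

```python
import numpy as np

def lucas_kanade(I0: np.ndarray, I1: np.ndarray, x: int, y: int, r: int = 7):
    """Least-squares (L2) Lucas-Kanade flow for one (2r+1)^2 window.

    Solves the 2x2 normal equations
      [sum Ix^2   sum IxIy] [u]   [sum IxIt]
      [sum IxIy   sum Iy^2] [v] = -[sum IyIt]
    centered at pixel (x, y).
    """
    Iy, Ix = np.gradient(I0)                  # spatial gradients (rows, cols)
    It = I1 - I0                              # temporal derivative
    win = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    ix, iy, it = Ix[win].ravel(), Iy[win].ravel(), It[win].ravel()
    A = np.array([[ix @ ix, ix @ iy],
                  [ix @ iy, iy @ iy]])
    b = -np.array([ix @ it, iy @ it])
    return np.linalg.solve(A, b)              # flow vector (u, v)

# Synthetic check: shift a smooth test image by one pixel in x.
xx, yy = np.meshgrid(np.arange(64, dtype=float), np.arange(64, dtype=float))
I0 = np.sin(xx / 8.0) + np.cos(yy / 8.0)
I1 = np.sin((xx - 1) / 8.0) + np.cos(yy / 8.0)
print(lucas_kanade(I0, I1, 32, 32))           # approx [1.0, 0.0]
```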
Citations: 15
An overview of automatic event detection in soccer matches
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711480
Samuel Felix de Sousa, A. Araújo, D. Menotti
Sports video analysis has received special attention from researchers due to its popularity and the general interest in semantic analysis. Soccer videos in particular represent an interesting field for research, allowing many types of applications: indexing, summarization, recognition of players' behavior, and so forth. Many approaches have been applied to field extraction and recognition, arc and goalmouth detection, ball and player tracking, and high-level techniques such as team tactics detection and soccer model definition. In this paper, we provide a hierarchy and classify approaches into it based on their analysis level, i.e., low, middle, and high. We present an overview of soccer event identification and discuss general issues related to it in order to provide relevant information about what has been done in soccer video processing.
Citations: 23
Detecting people carrying objects based on an optical flow motion model
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711518
T. Senst, Rubén Heras Evangelio, T. Sikora
Detecting people carrying objects is a commonly formulated problem and a first step in monitoring interactions between people and objects. Recent work relies on a precise foreground object segmentation, which is often difficult to achieve in video surveillance sequences due to poor contrast between the foreground objects and the scene background, abruptly changing lighting conditions, and small camera vibrations. To cope with these difficulties, we propose an approach based on motion statistics. We use a Gaussian mixture motion model (GMMM) and, based on that model, define a novel speed- and direction-independent motion descriptor in order to detect carried baggage as those regions that do not fit the motion model of an average walking person. The system was tested on the public dataset PETS2006 and on a more challenging dataset including abrupt lighting changes and bad color contrast, and compared with existing systems, showing very promising results.
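The underlying idea is to fit a Gaussian mixture to the flow vectors of a walking person and flag regions whose motion the mixture explains poorly. A sketch using sklearn's GaussianMixture on synthetic flow vectors; the component count and motion modes are invented, and the paper's speed- and direction-independent descriptor is not reproduced:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Flow vectors (u, v) sampled over a detected person. A walking person forms
# a few coherent motion modes; a carried object that swings differently
# yields vectors the fitted mixture explains poorly.
rng = np.random.default_rng(1)
body = rng.normal([2.0, 0.0], 0.2, size=(300, 2))   # dominant walking motion
legs = rng.normal([2.5, 0.6], 0.3, size=(100, 2))   # second motion mode
bag = rng.normal([0.2, -1.5], 0.2, size=(40, 2))    # deviating region

gmm = GaussianMixture(n_components=2).fit(np.vstack([body, legs]))
scores = gmm.score_samples(bag)                     # log-likelihood per vector
# Fraction of "bag" vectors scoring below every training vector: ~1.0 flagged.
print((scores < gmm.score_samples(body).min()).mean())
```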
Citations: 35
Segmenting color images into surface patches by exploiting sparse depth data
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711558
B. Dellen, G. Alenyà, S. Foix, C. Torras
We present a new method for segmenting color images into their composite surfaces by combining color segmentation with model-based fitting that exploits sparse depth data acquired using time-of-flight (Swissranger, PMD CamCube) and stereo techniques. The main target of our work is the segmentation of plant structures, i.e., leaves, from color-depth images, and the extraction of color and 3D shape information for automating manipulation tasks. Since segmentation is performed in the dense color space, even sparse, incomplete, or noisy depth information can be used. This kind of data often represents a major challenge for methods operating directly in the 3D data space. To achieve our goal, we construct a three-stage segmentation hierarchy by segmenting the color image at different resolutions, assuming that "true" surface boundaries must appear at some point along the hierarchy. 3D surfaces are then fitted to the color-segment areas using depth data. The segments that minimize the fitting error are selected and used to construct a new segmentation. An additional region-merging and growing stage is then applied to avoid over-segmentation and to label previously unclustered points. Experimental results demonstrate that the method successfully segments a variety of domestic objects and plants into quadratic surfaces. At the end of the procedure, the sparse depth data is completed using the extracted surface models, resulting in dense depth maps. For stereo, the resulting disparity maps are compared with ground truth and the average error is computed.
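Fitting a quadratic surface z = ax² + by² + cxy + dx + ey + f to the sparse depth samples inside one color segment is a linear least-squares problem, and evaluating the fit densifies the depth. A minimal sketch with invented sample data:

```python
import numpy as np

def fit_quadratic_surface(x, y, z):
    """Least-squares fit of z = a x^2 + b y^2 + c xy + d x + e y + f
    to sparse depth samples (x, y, z) inside one color segment."""
    A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

def eval_surface(coeffs, x, y):
    """Evaluate the fitted surface to densify depth over the segment."""
    a, b, c, d, e, f = coeffs
    return a * x**2 + b * y**2 + c * x * y + d * x + e * y + f

# Toy segment: 50 sparse depth points on a gently curved leaf-like patch.
rng = np.random.default_rng(2)
x, y = rng.uniform(0, 1, 50), rng.uniform(0, 1, 50)
z = 0.3 * x**2 - 0.2 * x * y + 0.8 * y + 1.0 + rng.normal(0, 0.005, 50)
coeffs = fit_quadratic_surface(x, y, z)
print(np.round(coeffs, 2))   # close to [0.3, 0, -0.2, 0, 0.8, 1.0]
```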
Citations: 18