
Latest publications — 2011 IEEE Workshop on Applications of Computer Vision (WACV)

Closed-form solutions to multiple-view homography estimation
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711566
P. Schroeder, A. Bartoli, P. Georgel, Nassir Navab
The quality of a mosaic depends on the projective alignment of the images involved. After point-correspondences between the images have been established, bundle adjustment finds an alignment considered optimal under certain hypotheses. This procedure minimizes a nonlinear cost and has to be initialized with care. It is very common to compose inter-frame homographies computed with standard methods in order to get an initial global alignment. This technique is suboptimal if there is noise or missing homographies, as it typically uses a small part of the available data. We propose four new closed-form solutions. They all provide non-heuristic initial alignments using all the known inter-frame homographies. Our methods are tested with synthetic and real data and are compared to the standard method. These experiments reveal that our methods are more accurate, taking advantage of the redundant information available in the set of inter-frame homographies.
Cited by: 16
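The "standard method" the abstract refers to — chaining inter-frame homographies back to a reference image — can be sketched as follows. This is a minimal illustration in our own code, not the authors' implementation; the convention that `inter_frame[i]` maps image i into image i+1 is an assumption:

```python
import numpy as np

def compose_global_homographies(inter_frame):
    """Given H_i mapping image i -> image i+1, return the list of
    homographies mapping each image back into the frame of image 0."""
    H_global = [np.eye(3)]
    for H in inter_frame:
        # A point in image i+1 goes back to image i via inv(H), then to image 0.
        H_next = H_global[-1] @ np.linalg.inv(H)
        H_next /= H_next[2, 2]  # fix the projective scale
        H_global.append(H_next)
    return H_global
```

Because each global homography multiplies every preceding inter-frame estimate, noise compounds along the chain — which is why closed-form solutions that use all available homographies jointly can give better initializations.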
TranslatAR: A mobile augmented reality translator
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711545
Victor Fragoso, Steffen Gauglitz, S. Zamora, Jim Kleban, M. Turk
We present a mobile augmented reality (AR) translation system, using a smartphone's camera and touchscreen, that requires the user to simply tap on the word of interest once in order to produce a translation, presented as an AR overlay. The translation seamlessly replaces the original text in the live camera stream, matching background and foreground colors estimated from the source images. For this purpose, we developed an efficient algorithm for accurately detecting the location and orientation of the text in a live camera stream that is robust to perspective distortion, and we combine it with OCR and a text-to-text translation engine. Our experimental results, using the ICDAR 2003 dataset and our own set of video sequences, quantify the accuracy of our detection and analyze the sources of failure among the system's components. With the OCR and translation running in a background thread, the system runs at 26 fps on a current generation smartphone (Nokia N900) and offers a particularly easy-to-use and simple method for translation, especially in situations in which typing or correct pronunciation (for systems with speech input) is cumbersome or impossible.
Cited by: 98
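One concrete piece of the pipeline above — estimating foreground and background colors of a text region before repainting it — can be approximated with a simple two-means split on intensities. This is our own sketch under the assumption of a roughly bimodal grayscale patch containing both text and background pixels; it is not the authors' code:

```python
import numpy as np

def estimate_fg_bg(gray_patch):
    """Split a grayscale text patch into (foreground, background) mean
    intensities with a simple two-means iteration on the threshold.
    Assumes the patch contains both dark text and brighter background."""
    t = gray_patch.mean()
    for _ in range(20):
        fg = gray_patch[gray_patch < t]   # darker pixels: assumed text
        bg = gray_patch[gray_patch >= t]  # brighter pixels: assumed background
        new_t = (fg.mean() + bg.mean()) / 2.0
        if abs(new_t - t) < 1e-3:
            break
        t = new_t
    return float(fg.mean()), float(bg.mean())
```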
Cell image analysis: Algorithms, system and applications
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711528
T. Kanade, Zhaozheng Yin, Ryoma Bise, Seungil Huh, Sungeun Eom, Michael F. Sandbothe, Mei Chen
We present several algorithms for cell image analysis, including microscopy image restoration, cell event detection, and cell tracking in large populations. The algorithms are integrated into an automated system capable of quantifying cell proliferation metrics in vitro in real time. This offers unique opportunities for biological applications such as efficient discovery of cell behavior in response to different cell culturing conditions and adaptive experiment control. We quantitatively evaluated our system's performance on 16 microscopy image sequences, with accuracy satisfactory for biologists' needs. We have also developed a public website compatible with the system's local user interface, allowing biologists to conveniently check their experiment progress online. The website will serve as a community resource that allows other research groups to upload their cell images for analysis and comparison.
Cited by: 110
Saliency retargeting: An approach to enhance image aesthetics
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711486
L. Wong, Kok-Lim Low
A photograph with visually dominant subjects generally induces stronger aesthetic interest. Inspired by this, we have developed a new approach to enhance image aesthetics through saliency retargeting. Our method alters low-level image features of the objects in a photograph such that their computed saliency measurements in the modified image become consistent with the intended order of their visual importance. The goal of our approach is to produce an image that redirects viewers' attention to the most important objects in the image, thus making these objects the main subjects. Since many modified images can satisfy the same specified order of visual importance, we trained an aesthetics score prediction model to pick the one with the best aesthetics. Results from our user experiments support the effectiveness of our approach.
Cited by: 35
Saliency detection based on proto-objects and topic model
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711493
Zhidong Li, Jie Xu, Yang Wang, G. Geers, Jun Yang
This paper proposes a novel computational framework for saliency detection that integrates saliency map computation and proto-object detection. The proto-objects are detected from the saliency map using a latent topic model, and the detected proto-objects are then utilized to improve the saliency map computation. Extensive experiments are performed on two publicly available datasets. The experimental results show that the proposed framework outperforms state-of-the-art methods.
Cited by: 4
3D Object recognition using a voting algorithm in a real-world environment
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711497
S. Tangruamsub, Keisuke Takada, O. Hasegawa
This paper presents a novel 3D object recognition method. Its objective is to overcome the shortcomings of appearance-based methods, which lack a spatial relationship between the parts of an object, and of other 3D model methods, which require complicated computation. The proposed method is based on a voting process, and appearance estimation is introduced to deal with faulty detections. We tested our method for object detection and pose estimation, and the results showed that it improved average precision and detection time compared to other methods.
Cited by: 6
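The voting process at the core of the method is not spelled out in the abstract; as a generic illustration (our own, with hypothetical vote tuples), a Hough-style tally over object/pose hypotheses looks like:

```python
from collections import Counter

def vote(matches):
    """matches: iterable of (object_id, pose_bin) hypotheses, one per
    matched feature. Returns the winning hypothesis and its vote count."""
    tally = Counter(matches)
    (obj, pose), count = tally.most_common(1)[0]
    return obj, pose, count
```

Each matched feature casts one vote; the hypothesis accumulating the most votes is accepted.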
Multi-modal visual concept classification of images via Markov random walk over tags
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711531
M. Kawanabe, Alexander Binder, Christina Müller, W. Wojcikiewicz
Automatic annotation of images is a challenging task in computer vision because of the “semantic gap” between high-level visual concepts and image appearances. User tags attached to images can provide further information to bridge this gap, even though they are partially uninformative and misleading. In this work, we investigate multi-modal visual concept classification based on visual features and user tags via kernel-based classifiers. One issue is how to construct kernels between sets of tags. We deploy Markov random walks on graphs of key tags to incorporate co-occurrence between them; this procedure acts as a smoothing of tag-based features. Our experimental results on the ImageCLEF2010 PhotoAnnotation benchmark show that the proposed method outperforms a baseline relying solely on visual information as well as a recently published state-of-the-art approach.
Cited by: 14
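The smoothing step described above — a Markov random walk over a tag co-occurrence graph — can be sketched roughly as follows (our own minimal version; the authors' graph construction and walk length may differ):

```python
import numpy as np

def smooth_tags(tag_matrix, steps=2):
    """tag_matrix: (n_images, n_tags) binary indicator matrix.
    Returns tag features smoothed by a `steps`-step random walk."""
    # Tag-tag co-occurrence counts, with self-loops so mass can stay put.
    W = tag_matrix.T @ tag_matrix + np.eye(tag_matrix.shape[1])
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    return tag_matrix @ np.linalg.matrix_power(P, steps)
```

The effect is that an image gains fractional weight on tags it was never given but that co-occur with its tags — the smoothing of tag-based features the abstract describes.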
One video is sufficient? Human activity recognition using active video composition
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711564
M. Ryoo, Wonpil Yu
In this paper, we present a novel human activity recognition approach that only requires a single video example per activity. We introduce the paradigm of active video composition, which enables one-example recognition of complex activities. The idea is to automatically create a large number of semi-artificial training videos called composed videos by manipulating an original human activity video. A methodology to automatically compose activity videos having different backgrounds, translations, scales, actors, and movement structures is described in this paper. Furthermore, an active learning algorithm to model the temporal structure of the human activity has been designed, preventing the generation of composed training videos violating the structural constraints of the activity. The intention is to generate composed videos having correct organizations, and take advantage of them for the training of the recognition system. In contrast to previous passive recognition systems relying only on given training videos, our methodology actively composes necessary training videos that the system is expected to observe in its environment. Experimental results illustrate that a single fully labeled video per activity is sufficient for our methodology to reliably recognize human activities by utilizing composed training videos.
Cited by: 14
Car-Rec: A real time car recognition system
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711559
D. Jang, M. Turk
Recent advances in computer vision have significantly reduced the difficulty of object classification and recognition. Robust feature detector and descriptor algorithms are particularly useful, forming the basis for many recognition and classification applications. These algorithms have been used in divergent bag-of-words and structural matching approaches. This work demonstrates a recognition application, based upon the SURF feature descriptor algorithm, which fuses bag-of-words and structural verification techniques. The resulting system is applied to the domain of car recognition and achieves accurate (> 90%) and real-time performance when searching databases containing thousands of images.
Cited by: 63
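The bag-of-words half of such a retrieval pipeline — quantized descriptors turned into tf-idf histograms, with structural (geometric) verification as a separate re-ranking step — can be sketched as follows (our own illustration, not the Car-Rec code):

```python
import numpy as np

def tfidf_histograms(word_lists, vocab_size):
    """word_lists: per-image lists of visual-word ids.
    Returns L2-normalized tf-idf rows, one per image."""
    tf = np.zeros((len(word_lists), vocab_size))
    for i, words in enumerate(word_lists):
        for w in words:
            tf[i, w] += 1.0
    df = (tf > 0).sum(axis=0)                     # document frequency per word
    idf = np.log(len(word_lists) / np.maximum(df, 1))
    hist = tf * idf
    norms = np.linalg.norm(hist, axis=1, keepdims=True)
    return hist / np.maximum(norms, 1e-12)
```

Cosine similarity between rows then ranks database images, and the top candidates would be re-checked structurally.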
Realistic stereo error models and finite optimal stereo baselines
Pub Date : 2011-01-05 DOI: 10.1109/WACV.2011.5711535
Zhang Tao, T. Boult
Stereo reconstruction is an important research and application area, both for general 3D reconstruction and for operations such as robotic navigation and remote sensing. This paper addresses the determination of parameters for a stereo system that optimize/minimize 3D reconstruction errors. Previous work on error analysis in stereo reconstruction optimized error in disparity space, which led to the erroneous conclusion that, ignoring matching errors, errors decrease as the baseline goes to infinity. In this paper, we derive the first formal error model based on the more realistic “point-of-closest-approach” ray model used in modern stereo systems. We then show that this results in a finite optimal baseline that minimizes reconstruction errors in all three world directions. We also show why the previous oversimplified error analysis results in infinite baselines. We derive the mathematical relationship between the error variances and the stereo system parameters. In our analysis, we consider situations where errors exist in only one camera as well as in both cameras. We have derived results for both parallel and verged systems, though only the simpler models are presented algebraically herein. The paper includes simulations to highlight the results and validate the approximations in the error propagation. The results should allow stereo system designers, or those using motion-stereo, to improve their systems.
Cited by: 10
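The disparity-space analysis the paper argues against is easy to state: with focal length f, baseline B, and disparity noise sigma_d, depth z = f*B/d propagates to a first-order depth error of roughly z**2 / (f*B) * sigma_d, which shrinks without bound as B grows — hence the (erroneous) infinite optimal baseline. A one-line sketch of that classical model (our illustration of the standard formula, not the paper's corrected model):

```python
def depth_error(z, f, B, sigma_d):
    """Classical first-order propagation of disparity noise into depth
    error for depth z = f*B/d: sigma_z ~ z**2 / (f*B) * sigma_d."""
    return (z ** 2) / (f * B) * sigma_d
```

Under this model, doubling the baseline always halves the predicted error, with no optimum — the oversimplification the paper's point-of-closest-approach model corrects.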