IEEE Winter Conference on Applications of Computer Vision: Latest Publications

Data-driven road detection
J. Álvarez, M. Salzmann, N. Barnes
In this paper, we tackle the problem of road detection from RGB images. In particular, we follow a data-driven approach to segmenting the road pixels in an image. To this end, we introduce two road detection methods: a top-down approach that builds an image-level road prior based on the traffic pattern observed in an input image, and a bottom-up technique that estimates the probability that an image superpixel belongs to the road surface in a nonparametric manner. Both algorithms work on the principle of label transfer, in the sense that the road prior is constructed directly from the ground-truth segmentations of training images. Our experimental evaluation on four different datasets shows that this approach outperforms existing top-down and bottom-up techniques, and is key to making road detection algorithms robust to dataset bias.
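A minimal sketch of the nonparametric, bottom-up label-transfer idea, assuming superpixels are already described by feature vectors; the k-NN vote, the exponential distance weighting, and all names here are illustrative assumptions, not the paper's exact design:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def road_probability(sp_features, train_features, train_road_labels, k=10):
    """Estimate P(road) per superpixel by transferring labels from the
    k nearest training superpixels in feature space."""
    nn = NearestNeighbors(n_neighbors=k).fit(train_features)
    dist, idx = nn.kneighbors(sp_features)          # (n_sp, k) each
    w = np.exp(-dist)                               # closer neighbours vote more
    votes = train_road_labels[idx]                  # binary road/non-road labels
    return (w * votes).sum(axis=1) / w.sum(axis=1)  # weighted vote per superpixel
```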
Citations: 6
Determining underwater vehicle movement from sonar data in relatively featureless seafloor tracking missions
A. Spears, A. Howard, M. West, Thomas Collins
Navigation through underwater environments is challenging given the lack of accurate positioning systems. Determining underwater vehicle movement from an integrated acoustic sonar sensor would greatly increase the autonomous navigation capabilities of underwater vehicles. A forward-looking sonar sensor may be used to determine autonomous vehicle movement using filtering and optical flow algorithms. Optical flow algorithms have shown excellent results in vision image processing. However, they have proven difficult to apply to sonar data due to the high level of noise present, as well as the widely varying appearance of objects from frame to frame. For the bottom-tracking applications considered, the simplifying assumption can be made that all features move with an equivalent direction and magnitude between frames. Statistical analysis of all estimated feature movements then provides an accurate estimate of the overall shift, which translates directly to the vehicle movement. Results using acoustic sonar data are presented that illustrate the effectiveness of this methodology.
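Because all seafloor features are assumed to share one direction and magnitude of motion between frames, the per-feature estimates can be pooled into a single robust statistic. A minimal sketch of that pooling step, assuming matched feature coordinates in consecutive frames; the median and the 3-sigma inlier gate are illustrative choices, not the paper's stated statistics:

```python
import numpy as np

def estimate_frame_shift(prev_pts, curr_pts):
    """Reduce per-feature displacements to one global frame-to-frame shift."""
    d = curr_pts - prev_pts                     # (N, 2) displacement vectors
    shift = np.median(d, axis=0)                # robust to noisy sonar matches
    residual = np.linalg.norm(d - shift, axis=1)
    inliers = residual < 3.0 * residual.std() + 1e-9
    return d[inliers].mean(axis=0)              # refine on consensus inliers
```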
Citations: 5
Automatic 3D change detection for glaucoma diagnosis
Lu Wang, V. Kallem, Mayank Bansal, J. Eledath, H. Sawhney, Denise J. Pearson, R. Stone
Important diagnostic criteria for glaucoma are changes in the 3D structure of the optic disc due to optic nerve damage. We propose an automatic approach for detecting these changes in 3D models reconstructed from fundus images of the same patient taken at different times. For each time session, only two uncalibrated fundus images are required. The approach applies a 6-point algorithm to estimate relative camera pose, assuming a constant camera focal length. To deal with the instability of 3D reconstruction from fundus images, our approach keeps multiple candidate reconstruction solutions for each image pair. The best 3D reconstruction is found by optimizing the 3D registration of all images after an iterative bundle adjustment that tolerates possible structure changes. The 3D structure changes are detected by evaluating the reprojection errors of feature points in image space. We validate the approach by comparing the diagnosis results with manual grading by human experts on a fundus image dataset.
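A minimal sketch of the reprojection-error test used to flag structure change, assuming a 3x4 projection matrix P recovered by the bundle adjustment; the threshold tau and all names are hypothetical:

```python
import numpy as np

def reprojection_errors(P, X, x):
    """Residuals between projected 3D points X (N,3) and observed 2D features x (N,2)."""
    Xh = np.hstack([X, np.ones((len(X), 1))])  # (N, 4) homogeneous points
    proj = (P @ Xh.T).T                        # project with 3x4 camera matrix P
    proj = proj[:, :2] / proj[:, 2:3]          # perspective division
    return np.linalg.norm(proj - x, axis=1)    # per-point pixel residual

# changed = reprojection_errors(P, X, x) > tau   # tau: assumed pixel threshold
```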
Citations: 3
Online discriminative dictionary learning for visual tracking
Fan Yang, Zhuolin Jiang, L. Davis
Dictionary learning has been applied to various computer vision problems, such as image restoration, object classification and face recognition. In this work, we propose a tracking framework based on sparse representation and online discriminative dictionary learning. By associating dictionary items with label information, the learned dictionary is both reconstructive and discriminative, which better distinguishes target objects from the background. During tracking, the best target candidate is selected by a joint decision measure. Reliable tracking results and augmented training samples are accumulated into two sets to update the dictionary. Both online dictionary learning and the proposed joint decision measure are important for the final tracking performance. Experiments show that our approach outperforms several recently proposed trackers.
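A minimal sketch of the joint decision idea described above: each target candidate is scored by combining a discriminative classifier response on its sparse code with its reconstruction error under the learned dictionary. Here D (dictionary, columns are atoms), W (linear classifier over sparse codes), the OMP sparsity level, and the weight lam are all illustrative assumptions, not the paper's learned quantities:

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def best_candidate(D, W, candidates, n_nonzero=16, lam=0.5):
    """Pick the candidate maximizing (classifier score - lam * reconstruction error)."""
    scores = []
    for y in candidates:                                    # y: candidate feature vector
        a = orthogonal_mp(D, y, n_nonzero_coefs=n_nonzero)  # sparse code on dictionary D
        recon_err = np.linalg.norm(y - D @ a)               # reconstructive term
        cls_score = float(W @ a)                            # discriminative term
        scores.append(cls_score - lam * recon_err)
    return int(np.argmax(scores))
```

Weighing both terms is the point of a reconstructive-plus-discriminative dictionary: a candidate must look like the target and be classified as the target.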
Citations: 35
Ant tracking with occlusion tunnels
Thomas Fasciano, A. Dornhaus, M. Shin
The automated tracking of social insects, such as ants, can efficiently provide unparalleled amounts of data for the study of complex group behaviors. However, a high level of occlusion, along with similarity in appearance and motion, can cause tracking to drift to an incorrect ant. In this paper, we reduce drifting by using occlusion to identify incorrect ants and prevent the tracking from drifting to them. The key idea is that a set of ants enters occlusion, moves through it, and then exits. We do not attempt to track through occlusions but simply find the set of objects that enters and exits each one. Knowing that a track must stay within the set of ants exiting a given occlusion, we reduce drifting by preventing tracking to ants outside the occlusion. Using four 5000-frame video sequences of an ant colony, we demonstrate that the use of occlusion tunnels reduces tracking error by 30% for drifting to another ant and by 7% for early termination of tracking.
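A minimal sketch of the occlusion-tunnel bookkeeping this implies, assuming integer track ids; the class and method names are hypothetical. The ids that entered an occlusion are the only legal assignments for ants exiting it, which is what prevents drift to outside ants:

```python
from dataclasses import dataclass, field

@dataclass
class OcclusionTunnel:
    """Records which track ids entered an occlusion; exits must come from that set."""
    entered: set = field(default_factory=set)
    exited: set = field(default_factory=set)

    def enter(self, track_id: int) -> None:
        self.entered.add(track_id)

    def exit_candidates(self) -> set:
        # the only legal assignments for an ant leaving this occlusion
        return self.entered - self.exited

    def assign_exit(self, track_id: int) -> None:
        assert track_id in self.exit_candidates(), "assignment would drift outside the tunnel"
        self.exited.add(track_id)
```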
Citations: 15
Multi-leaf alignment from fluorescence plant images
Xi Yin, Xiaoming Liu, Jin Chen, D. Kramer
In this paper, we propose a multi-leaf alignment framework based on Chamfer matching to study the problem of leaf alignment from fluorescence images of plants, which will provide a leaf-level analysis of photosynthetic activities. Different from the naive procedure of aligning leaves iteratively using the Chamfer distance, the new algorithm aims to find the best alignment of multiple leaves simultaneously in an input image. We formulate an optimization problem over an objective function with three terms: the average Chamfer distance of the aligned leaves, the number of leaves, and the difference between the mask synthesized from the leaf candidates and the original image mask. Gradient descent is used to minimize the objective function. A quantitative evaluation framework is also formulated to test the performance of our algorithm. Experimental results show that the proposed multi-leaf alignment optimization performs substantially better than the Chamfer matching baseline in terms of both accuracy and efficiency.
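A minimal sketch of the three-term objective described above, assuming binary leaf template masks and edge maps already transformed into image coordinates; the weights l1 and l2 are hypothetical, and scipy's distance transform stands in for the Chamfer-matching machinery:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def objective(leaf_masks, leaf_edges, image_mask, image_edges, l1=1.0, l2=10.0):
    """Mean Chamfer distance + leaf-count penalty + synthesized-vs-observed mask mismatch."""
    dt = distance_transform_edt(~image_edges)           # distance to nearest image edge
    chamfer = np.mean([dt[e].mean() for e in leaf_edges])
    synthesized = np.any(np.stack(leaf_masks), axis=0)  # union of leaf templates
    mask_diff = np.logical_xor(synthesized, image_mask).mean()
    return chamfer + l1 * len(leaf_masks) + l2 * mask_diff
```

The mask term is what couples the leaves: it rewards configurations whose union covers the plant without overlap spilling outside it, which a per-leaf Chamfer score alone cannot capture.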
Citations: 27
Combining semantic scene priors and haze removal for single image depth estimation
Ke Wang, Enrique Dunn, Joseph Tighe, Jan-Michael Frahm
We consider the problem of estimating the relative depth of a scene from a monocular image. The dark channel prior, a statistical observation about haze-free images, has previously been leveraged for haze removal and relative depth estimation. However, as a local measure, it fails to account for higher-order semantic relationships among scene elements. We propose a dual channel prior to identify pixels that are unlikely to comply with the dark channel assumption and would otherwise lead to erroneous depth estimates. We further leverage semantic segmentation information and patch match label propagation to enforce semantically consistent geometric priors. Experiments illustrate the quantitative and qualitative advantages of our approach when compared to state-of-the-art methods.
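For context, the dark channel prior the paper builds on is simple to compute: in haze-free outdoor images most patches contain some pixel that is dark in at least one color channel, so large dark-channel values indicate haze and hence greater scene depth. A minimal sketch, assuming a float HxWx3 image; the patch size 15 follows common usage of He et al.'s prior, not necessarily this paper's setting:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Min over RGB channels, then min over a local patch."""
    per_pixel_min = img.min(axis=2)                   # darkest color channel per pixel
    return minimum_filter(per_pixel_min, size=patch)  # darkest pixel in each patch
```

The paper's dual channel prior then flags pixels where this assumption breaks down, so their depth estimates are not trusted.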
Citations: 14
Beyond PASCAL: A benchmark for 3D object detection in the wild
Yu Xiang, Roozbeh Mottaghi, S. Savarese
3D object detection and pose estimation methods have become popular in recent years, since they can handle ambiguities in 2D images and also provide a richer description of objects than 2D object detectors. However, most datasets for 3D recognition are limited to a small number of images per category or are captured in controlled environments. In this paper, we contribute the PASCAL3D+ dataset, a novel and challenging dataset for 3D object detection and pose estimation. PASCAL3D+ augments 12 rigid categories of PASCAL VOC 2012 [4] with 3D annotations. Furthermore, more images are added for each category from ImageNet [3]. PASCAL3D+ images exhibit much more variability than existing 3D datasets, and on average there are more than 3,000 object instances per category. We believe this dataset will provide a rich testbed for studying 3D detection and pose estimation and will help significantly push forward research in this area. We provide results for variations of DPM [6] on our new dataset for object detection and viewpoint estimation in different scenarios, which can serve as baselines for the community. Our benchmark is available online at http://cvgl.stanford.edu/projects/pascal3d.
Citations: 726
Bayesian Optimization with an Empirical Hardness Model for approximate Nearest Neighbour Search
Julieta Martinez, J. Little, Nando de Freitas
Nearest Neighbour Search in high-dimensional spaces is a common problem in Computer Vision. Although no algorithm better than linear search is known, approximate algorithms are commonly used to tackle this problem. The drawback of using such algorithms is that their performance depends highly on parameter tuning. While this process can be automated using standard empirical optimization techniques, tuning is still time-consuming. In this paper, we propose to use Empirical Hardness Models to reduce the number of parameter configurations that Bayesian Optimization has to try, speeding up the optimization process. Evaluation on standard benchmarks of SIFT and GIST descriptors shows the viability of our approach.
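A minimal sketch of how an empirical hardness model can reduce the configurations Bayesian Optimization has to try: fit a cheap regressor on already-evaluated (configuration, cost) pairs and keep only the most promising candidates for the expensive evaluation step. The random-forest model, the numeric configuration encoding, and the 20% keep rate are illustrative assumptions, not the paper's design:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def prune_candidates(history_cfgs, history_costs, candidate_cfgs, keep=0.2):
    """Keep only the candidates the hardness model predicts to be cheap."""
    hardness = RandomForestRegressor(n_estimators=100)
    hardness.fit(history_cfgs, history_costs)         # learn config -> observed cost
    predicted = hardness.predict(candidate_cfgs)
    n_keep = max(1, int(keep * len(candidate_cfgs)))
    best = np.argsort(predicted)[:n_keep]             # lowest predicted cost first
    return np.asarray(candidate_cfgs)[best]
```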
Citations: 2
A lp-norm MTMKL framework for simultaneous detection of multiple facial action units
Xiao Zhang, M. Mahoor, S. Mavadati, J. Cohn
Facial action unit (AU) detection is a challenging topic in computer vision and pattern recognition. Most existing approaches design classifiers to detect AUs individually or in combination, without considering the intrinsic relations among AUs. This paper presents a novel method, lp-norm multi-task multiple kernel learning (MTMKL), that jointly learns the classifiers for detecting the absence and presence of multiple AUs. lp-norm MTMKL is an extension of regularized multi-task learning that learns kernels shared among all tasks from a given set of base kernels within Support Vector Machines (SVMs). Our approach has several advantages over existing methods: (1) AU detection is transformed into a multi-task learning problem in which, given a specific frame, multiple AUs are detected simultaneously by exploiting their inter-relations; and (2) lp-norm multiple kernel learning is applied to increase the discriminative power of the classifiers. Our experimental results on the CK+ and DISFA databases show that the proposed method outperforms state-of-the-art methods for AU detection.
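A minimal sketch of one single-task lp-norm MKL step that the multi-task framework generalizes: alternate between training an SVM on the weighted sum of base kernels and updating the kernel weights under an lp-norm constraint. Labels y are assumed to be in {-1, +1}; the closed-form weight update follows the standard lp-MKL rule, and everything here simplifies the paper's multi-task, shared-kernel formulation:

```python
import numpy as np
from sklearn.svm import SVC

def lp_mkl_step(kernels, betas, y, p=2.0, C=1.0):
    """One alternating step: SVM on the combined kernel, then lp-norm weight update."""
    K = sum(b * Km for b, Km in zip(betas, kernels))   # weighted kernel sum
    svm = SVC(C=C, kernel="precomputed").fit(K, y)
    alpha = np.zeros(len(y))
    alpha[svm.support_] = np.abs(svm.dual_coef_[0])    # recover alpha_i from alpha_i*y_i
    ay = alpha * y
    # squared margin contribution ||w_m||^2 of each base kernel
    wnorm2 = np.array([(b ** 2) * (ay @ Km @ ay) for b, Km in zip(betas, kernels)])
    new_betas = wnorm2 ** (1.0 / (p + 1))              # closed-form lp update
    new_betas /= np.linalg.norm(new_betas, ord=p)      # project onto the unit lp ball
    return list(new_betas), svm
```

Larger p spreads weight more evenly across base kernels, while p near 1 yields sparse kernel selection, which is the trade-off lp-norm MKL is designed to expose.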
Citations: 38