
Latest publications from the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops

Robust detection of semantically equivalent visually dissimilar objects
T. Goh, Ryan West, K. Okada
We propose a novel and robust method for detecting semantically equivalent but visually dissimilar object parts in the presence of geometric domain variations. The presented algorithms follow the part-based object learning and recognition framework proposed by Epshtein and Ullman. This approach characterizes the location of a visually dissimilar object (i.e., the root fragment) as a function of its geometric configuration relative to a set of local context patches (i.e., context fragments). This work extends the original detection algorithm to handle more realistic geometric domain variation by using robust candidate generation, exploiting the geometric invariances of a pair of similar polygons, as well as SIFT-based context descriptors. An entropic feature selection is also integrated in order to improve performance. Furthermore, robust voting in a maximum-density framework is realized by variable bandwidth mean shift, allowing better root detection performance in the presence of significant errors in detecting the corresponding context fragments. We evaluate the proposed solution on the task of detecting various facial parts using the FERET database. Our experimental results demonstrate the advantage of our solution, indicating a significant improvement in detection performance and robustness over the original system.
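The voting step lends itself to a compact illustration. Below is a minimal sketch (not the authors' implementation) of variable bandwidth mean shift over 2D root-location votes: each vote carries its own bandwidth reflecting its localization uncertainty, and the iteration climbs to the dominant mode of the resulting density. All data and bandwidth values are synthetic.

```python
import numpy as np

def variable_bandwidth_mean_shift(votes, bandwidths, start, iters=100, tol=1e-4):
    """votes: (N, 2) root-location votes; bandwidths: (N,) per-vote sigmas."""
    x = start.astype(float)
    for _ in range(iters):
        d2 = np.sum((votes - x) ** 2, axis=1)            # squared distances to x
        # simple variable-bandwidth Gaussian weights: uncertain votes count less
        w = np.exp(-0.5 * d2 / bandwidths ** 2) / bandwidths ** 2
        x_new = (w[:, None] * votes).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Toy usage: 20 consistent votes near (50, 50) plus 5 outliers.
rng = np.random.default_rng(0)
votes = np.vstack([rng.normal([50.0, 50.0], 1.0, size=(20, 2)),
                   rng.uniform(0.0, 100.0, size=(5, 2))])
sigmas = np.concatenate([np.full(20, 2.0), np.full(5, 10.0)])
print(variable_bandwidth_mean_shift(votes, sigmas, votes.mean(axis=0)))
```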
{"title":"Robust detection of semantically equivalent visually dissimilar objects","authors":"T. Goh, Ryan West, K. Okada","doi":"10.1109/CVPRW.2008.4563038","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563038","url":null,"abstract":"We propose a novel and robust detection of semantically equivalent but visually dissimilar object parts with the presence of geometric domain variations. The presented algorithms follow a part-based object learning and recognition framework proposed by Epshtein and Ullman. This approach characterizes the location of a visually dissimilar object (i.e., root fragment) as a function of its relative geometrical configuration to a set of local context patches (i.e., context fragments). This work extends the original detection algorithm for handling more realistic geometric domain variation by using robust candidate generation, exploiting geometric invariances of a pair of similar polygons, as well as SIFT-based context descriptors. An entropic feature selection is also integrated in order to improve its performance. Furthermore, robust voting in a maximum density framework is realized by variable bandwidth mean shift, allowing better root detection performance with the presence of significant errors in detecting corresponding context fragments. We evaluate the proposed solution for the task of detecting various facial parts using FERET database. Our experimental results demonstrate the advantage of our solution by indicating significant improvement of detection performance and robustness over the original system.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124949490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Visual detection of lintel-occluded doors from a single image
Zhichao Chen, Stan Birchfield
Doors are important landmarks for indoor mobile robot navigation. Most existing algorithms for door detection use range sensors or work only in limited environments because of restrictive assumptions about color, pose, or lighting. We present a vision-based door detection algorithm that achieves robustness by utilizing a variety of features, including color, texture, and intensity edges. We introduce two novel geometric features that increase performance significantly: concavity and the bottom-edge intensity profile. The features are combined using Adaboost to ensure optimal linear weighting. On a large database of images collected under a wide variety of conditions, the algorithm achieves more than 90% detection with a low false-positive rate. Additional experiments demonstrate the suitability of the algorithm for real-time applications using a mobile robot equipped with an off-the-shelf camera and laptop.
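As a sketch of the feature-combination step, the snippet below trains AdaBoost over a small vector of per-candidate feature scores (standing in for color, texture, edge, concavity, and bottom-edge profile). The feature values are synthetic, not the paper's data; sklearn's default decision-stump weak learners play the role of the per-feature weighting.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(1)
# 200 synthetic candidates x 5 feature scores; door candidates score higher.
X = np.vstack([rng.normal(0.7, 0.15, size=(100, 5)),    # door candidates
               rng.normal(0.4, 0.15, size=(100, 5))])   # non-door candidates
y = np.concatenate([np.ones(100), np.zeros(100)])

# The default weak learner is a depth-1 decision stump, so boosting yields
# a weighted vote over single-feature thresholds.
clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```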
{"title":"Visual detection of lintel-occluded doors from a single image","authors":"Zhichao Chen, Stan Birchfield","doi":"10.1109/CVPRW.2008.4563142","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563142","url":null,"abstract":"Doors are important landmarks for indoor mobile robot navigation. Most existing algorithms for door detection use range sensors or work in limited environments because of restricted assumptions about color, pose, or lighting. We present a vision-based door detection algorithm that achieves robustness by utilizing a variety of features, including color, texture, and intensity edges. We introduce two novel geometric features that increase performance significantly: concavity and bottom-edge intensity profile. The features are combined using Adaboost to ensure optimal linear weighting. On a large database of images collected in a wide variety of conditions, the algorithm achieves more than 90% detection with a low false positive rate. Additional experiments demonstrate the suitability of the algorithm for real-time applications using a mobile robot equipped with an off-the-shelf camera and laptop.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125099960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 61
Riemannian manifold optimisation for non-rigid structure from motion
Appu Shaji, S. Chandran
This paper addresses the problem of automatically extracting the 3D configurations of deformable objects from 2D features. Our focus in this work is to build on the observation that the subspace spanned by the motion parameters is a subset of a smooth manifold; we therefore hunt for the solution in this space rather than resorting to heuristics (as attempted previously). We succeed in this by attaching a canonical Riemannian metric and using a variant of the non-rigid factorisation algorithm for structure from motion. We show, qualitatively and quantitatively, that our algorithm produces better results than the state of the art.
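For orientation, the snippet below shows only the classical first step shared by non-rigid factorisation methods: a truncated SVD splits the 2F x P measurement matrix of tracked 2D points into rank-3K motion and shape-basis factors for K deformation modes. The paper's actual contribution, optimising the motion parameters over a Riemannian manifold, is not reproduced here.

```python
import numpy as np

def factorize(W, K):
    """W: (2F, P) centered 2D tracks; returns rank-3K motion/shape factors."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = 3 * K
    M = U[:, :r] * np.sqrt(s[:r])            # (2F, 3K) motion factor
    S = np.sqrt(s[:r])[:, None] * Vt[:r]     # (3K, P) shape-basis factor
    return M, S

# Toy usage: synthetic rank-3 measurements (K=1, i.e., a rigid scene).
rng = np.random.default_rng(2)
W = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 40))
M, S = factorize(W, K=1)
print("reconstruction error:", np.linalg.norm(W - M @ S))
```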
{"title":"Riemannian manifold optimisation for non-rigid structure from motion","authors":"Appu Shaji, S. Chandran","doi":"10.1109/CVPRW.2008.4563071","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563071","url":null,"abstract":"This paper address the problem of automatically extracting the 3D configurations of deformable objects from 2D features. Our focus in this work is to build on the observation that the subspace spanned by the motion parameters is a subset of a smooth manifold, and therefore we hunt for the solution in this space, rather than use heuristics (as previously attempted earlier). We succeed in this by attaching a canonical Riemannian metric, and using a variant of the non-rigid factorisation algorithm for structure from motion. We qualitatively and quantitatively show that our algorithm produces better results when compared to the state of art.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"44 12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122430170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Remote and head-motion-free gaze tracking for real environments with automated head-eye model calibrations
H. Yamazoe, A. Utsumi, Tomoko Yonezawa, Shinji Abe
We propose a gaze estimation method that substantially relaxes the practical constraints imposed by most conventional methods. Gaze estimation research has a long history, and many systems, including some commercial schemes, have been proposed. However, the application domain of gaze estimation is still limited (e.g., measurement devices for HCI studies, input devices for VDT work) due to the limitations of such systems. First, users must be close to the system (or must wear it), since most systems employ IR illumination and/or stereo cameras. Second, users are required to perform manual calibrations to obtain geometrically meaningful data. These limitations prevent applications of the system that capture and utilize useful human gaze information in daily situations. In our method, inspired by a bundle adjustment framework, the parameters of the 3D head-eye model are robustly estimated by minimizing pixel-wise re-projection errors between single-camera input images and eye model projections over multiple frames with adjacently estimated head poses. Since this process runs automatically, users do not need to be aware of it. Using the estimated parameters, 3D head poses and gaze directions for newly observed images can be determined directly in the same error-minimization manner. This mechanism enables robust gaze estimation from single-camera, low-resolution images without user-aware preparation tasks (i.e., calibration). Experimental results show the proposed method achieves 6-degree accuracy with QVGA (320 × 240) images. The proposed algorithm is not constrained by observation distance; we confirmed that our system works with long-distance observations (10 meters).
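The core numerical step, pose recovery by re-projection error minimization, can be sketched generically. Assuming known intrinsics (a hypothetical focal length) and a known 3D point model, the snippet below recovers a single-frame pose with nonlinear least squares; the full method additionally calibrates the head-eye model parameters across many frames.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

f = 500.0                                    # assumed focal length (pixels)

def project(points3d, rvec, tvec):
    R = Rotation.from_rotvec(rvec).as_matrix()
    cam = points3d @ R.T + tvec              # model -> camera coordinates
    return f * cam[:, :2] / cam[:, 2:3]      # pinhole projection

def residuals(params, pts3d, obs2d):
    # pixel-wise re-projection errors for pose params = (rvec, tvec)
    return (project(pts3d, params[:3], params[3:]) - obs2d).ravel()

rng = np.random.default_rng(3)
pts3d = rng.uniform(-1, 1, size=(8, 3))      # hypothetical model points
true = np.array([0.1, -0.2, 0.05, 0.3, -0.1, 5.0])
obs = project(pts3d, true[:3], true[3:]) + rng.normal(0, 0.5, (8, 2))
fit = least_squares(residuals, x0=np.array([0, 0, 0, 0, 0, 4.0]),
                    args=(pts3d, obs))
print("recovered pose:", fit.x.round(2))
```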
{"title":"Remote and head-motion-free gaze tracking for real environments with automated head-eye model calibrations","authors":"H. Yamazoe, A. Utsumi, Tomoko Yonezawa, Shinji Abe","doi":"10.1109/CVPRW.2008.4563184","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563184","url":null,"abstract":"We propose a gaze estimation method that substantially relaxes the practical constraints possessed by most conventional methods. Gaze estimation research has a long history, and many systems including some commercial schemes have been proposed. However, the application domain of gaze estimation is still limited (e.g, measurement devices for HCI issues, input devices for VDT works) due to the limitations of such systems. First, users must be close to the system (or must wear it) since most systems employ IR illumination and/or stereo cameras. Second, users are required to perform manual calibrations to get geometrically meaningful data. These limitations prevent applications of the system that capture and utilize useful human gaze information in daily situations. In our method, inspired by a bundled adjustment framework, the parameters of the 3D head-eye model are robustly estimated by minimizing pixel-wise re-projection errors between single-camera input images and eye model projections for multiple frames with adjacently estimated head poses. Since this process runs automatically, users does not need to be aware of it. Using the estimated parameters, 3D head poses and gaze directions for newly observed images can be directly determined with the same error minimization manner. This mechanism enables robust gaze estimation with single-camera-based low resolution images without user-aware preparation tasks (i.e., calibration). Experimental results show the proposed method achieves 6deg accuracy with QVGA (320 times 240) images. The proposed algorithm is free from observation distances. We confirmed that our system works with long-distance observations (10 meters).","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"189 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122475170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Comparison of combination methods utilizing T-normalization and second best score model
S. Tulyakov, Zhi Zhang, V. Govindaraju
The combination of biometric matching scores can be enhanced by taking into account the matching scores related to all enrolled persons, in addition to traditional combinations utilizing only the matching scores related to a single person. Identification models take into account the dependence between matching scores assigned to different persons and can be used for such enhancement. In this paper we compare the use of two such models: T-normalization and the second best score model. The comparison is performed using two combination algorithms, the likelihood ratio and the multilayer perceptron. The results show that, while the second best score model delivers a better performance improvement than T-normalization, the two models are complementary and can be used together for further improvements.
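Both identification models are simple to state in code. The sketch below shows T-normalization (rescaling a match score by the mean and standard deviation of the same probe's scores against the other enrolled identities) and the second best score feature; the score vector is illustrative only, not data from the paper.

```python
import numpy as np

def t_norm(scores, i):
    """Normalize scores[i] by mean/std of the remaining (cohort) scores."""
    cohort = np.delete(scores, i)
    return (scores[i] - cohort.mean()) / (cohort.std() + 1e-9)

def second_best_feature(scores, i):
    """Pair scores[i] with the best score among the other identities."""
    return scores[i], np.delete(scores, i).max()

# Toy usage: one probe scored against 5 enrolled identities; identity 2 matches.
scores = np.array([0.31, 0.28, 0.91, 0.35, 0.30])
print("T-norm score:", t_norm(scores, 2))
print("(score, 2nd best):", second_best_feature(scores, 2))
```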
{"title":"Comparison of combination methods utilizing T-normalization and second best score model","authors":"S. Tulyakov, Zhi Zhang, V. Govindaraju","doi":"10.1109/CVPRW.2008.4563105","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563105","url":null,"abstract":"The combination of biometric matching scores can be enhanced by taking into account the matching scores related to all enrolled persons in addition to traditional combinations utilizing only matching scores related to a single person. Identification models take into account the dependence between matching scores assigned to different persons and can be used for such enhancement. In this paper we compare the use of two such models - T-normalization and second best score model. The comparison is performed using two combination algorithms - likelihood ratio and multilayer perceptron. The results show, that while second best score model delivers better performance improvement than T-normalization, two models are complementary to each other and can be used together for further improvements.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"127 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122654300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
3D tracking in unknown environments using on-line keypoint learning for mobile augmented reality
Gerhard Schall, H. Grabner, Michael Grabner, Paul Wohlhart, D. Schmalstieg, H. Bischof
In this paper we present a natural feature tracking algorithm based on on-line boosting, used for localizing a mobile computer. Mobile augmented reality requires highly accurate and fast six-degree-of-freedom tracking in order to provide registered graphical overlays to a mobile user. With advances in mobile computer hardware, vision-based tracking approaches have the potential to provide efficient solutions that are non-invasive, in contrast to the currently dominating marker-based approaches. We propose a tracking approach that can be used in an unknown environment, i.e., the target need not be known beforehand. The core of the tracker is an on-line learning algorithm, which updates the tracker as new data become available. This makes it suitable for many mobile augmented reality applications. We demonstrate the applicability of our approach on tasks where the target objects are not known beforehand, i.e., interactive planning.
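A minimal sketch of the track-then-update pattern follows. The paper uses on-line boosting; as a simple stand-in, an incrementally trained linear classifier (sklearn's SGDClassifier with partial_fit) is updated from each new frame's keypoint and background patches, which are synthetic here.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(4)
clf = SGDClassifier(random_state=0)          # linear model, hinge loss
classes = np.array([0, 1])                   # 0 = background, 1 = keypoint

for frame in range(20):                      # simulated video stream
    keypoint_patch = rng.normal(1.0, 0.3, size=(1, 16))    # target descriptor
    background_patch = rng.normal(0.0, 0.3, size=(1, 16))
    X = np.vstack([keypoint_patch, background_patch])
    y = np.array([1, 0])
    clf.partial_fit(X, y, classes=classes)   # on-line update, this frame only

test = rng.normal(1.0, 0.3, size=(1, 16))
print("decision value (positive = keypoint):", float(clf.decision_function(test)[0]))
```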
{"title":"3D tracking in unknown environments using on-line keypoint learning for mobile augmented reality","authors":"Gerhard Schall, H. Grabner, Michael Grabner, Paul Wohlhart, D. Schmalstieg, H. Bischof","doi":"10.1109/CVPRW.2008.4563134","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563134","url":null,"abstract":"In this paper we present a natural feature tracking algorithm based on on-line boosting used for localizing a mobile computer. Mobile augmented reality requires highly accurate and fast six degrees of freedom tracking in order to provide registered graphical overlays to a mobile user. With advances in mobile computer hardware, vision-based tracking approaches have the potential to provide efficient solutions that are non-invasive in contrast to the currently dominating marker-based approaches. We propose to use a tracking approach which can use in an unknown environment, i.e. the target has not be known beforehand. The core of the tracker is an on-line learning algorithm, which updates the tracker as new data becomes available. This is suitable in many mobile augmented reality applications. We demonstrate the applicability of our approach on tasks where the target objects are not known beforehand, i.e. interactive planing.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133180596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Stereo depth with a Unified Architecture GPU
Joel Gibson, Oge Marques
This paper describes how the calculation of depth from stereo images was accelerated using a GPU. The Compute Unified Device Architecture (CUDA) from NVIDIA was employed in novel ways to compute depth using Birchfield-Tomasi (BT) cost matching and the semi-global matching algorithm. The challenges of mapping a sequential algorithm to a massively parallel thread environment are considered, along with performance optimization techniques.
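A CPU sketch of the underlying cost-volume structure is shown below: one matching-cost slice per candidate disparity, followed by a winner-take-all pass. Plain absolute difference stands in for the BT cost, and no semi-global aggregation or CUDA mapping is attempted.

```python
import numpy as np

def wta_disparity(left, right, max_disp):
    """left, right: (H, W) grayscale images; returns (H, W) disparity map."""
    H, W = left.shape
    cost = np.full((max_disp, H, W), np.inf)
    for d in range(max_disp):                # one cost slice per disparity
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :W - d])
    return cost.argmin(axis=0)               # winner-take-all over disparities

# Toy usage: a bright square shifted 4 pixels between the two views.
left = np.zeros((32, 32)); left[10:20, 14:24] = 1.0
right = np.zeros((32, 32)); right[10:20, 10:20] = 1.0
disp = wta_disparity(left, right, max_disp=8)
print("median disparity inside the square:", np.median(disp[10:20, 14:24]))
```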
{"title":"Stereo depth with a Unified Architecture GPU","authors":"Joel Gibson, Oge Marques","doi":"10.1109/CVPRW.2008.4563092","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563092","url":null,"abstract":"This paper describes how the calculation of depth from stereo images was accelerated using a GPU. The Compute Unified Device Architecture (CUDA) from NVIDIA was employed in novel ways to compute depth using BT cost matching and the semi-global matching algorithm. The challenges of mapping a sequential algorithm to a massively parallel thread environment and performance optimization techniques are considered.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133212470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 51
Integration of multiple contextual information for image segmentation using a Bayesian Network
Lei Zhang, Q. Ji
We propose a Bayesian network (BN) model to integrate multiple sources of contextual information with image measurements for image segmentation. The BN model systematically encodes the contextual relationships between regions, edges, and vertices, as well as their image measurements, together with their uncertainties. It allows principled probabilistic inference to be performed, so that image segmentation can be achieved through most probable explanation (MPE) inference in the BN model. We have achieved encouraging results on the horse images from the Weizmann dataset. We have also demonstrated possible ways to extend the BN model to incorporate other contextual information, such as global object shape and human intervention, for improving image segmentation. Human intervention is encoded as new evidence in the BN model, and its impact is propagated through belief propagation to update the states of the whole model. From the updated BN model, a new image segmentation is produced.
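MPE inference itself can be illustrated on a toy network in the paper's spirit: a region label influences an edge variable, and each has a noisy image measurement. The sketch below brute-forces the most probable explanation over the tiny state space; the paper's network over regions, edges, and vertices is far larger and uses belief propagation, and all probabilities here are invented for illustration.

```python
import itertools

p_R = {0: 0.5, 1: 0.5}                       # prior on region label
p_E_given_R = {0: {0: 0.8, 1: 0.2},          # edge mostly present when R = 1
               1: {0: 0.3, 1: 0.7}}
p_mR_given_R = {0: {0: 0.7, 1: 0.3},         # noisy region measurement
                1: {0: 0.2, 1: 0.8}}
p_mE_given_E = {0: {0: 0.9, 1: 0.1},         # noisy edge measurement
                1: {0: 0.25, 1: 0.75}}

m_R, m_E = 1, 1                              # observed image measurements
# MPE: the (R, E) assignment maximizing the joint probability of all nodes
best = max(itertools.product([0, 1], [0, 1]),
           key=lambda re: (p_R[re[0]] * p_E_given_R[re[0]][re[1]]
                           * p_mR_given_R[re[0]][m_R]
                           * p_mE_given_E[re[1]][m_E]))
print("MPE (region, edge):", best)
```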
{"title":"Integration of multiple contextual information for image segmentation using a Bayesian Network","authors":"Lei Zhang, Q. Ji","doi":"10.1109/CVPRW.2008.4563043","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563043","url":null,"abstract":"We propose a Bayesian network (BN) model to integrate multiple contextual information and the image measurements for image segmentation. The BN model systematically encodes the contextual relationships between regions, edges and vertices, as well as their image measurements with uncertainties. It allows a principled probabilistic inference to be performed so that image segmentation can be achieved through a most probable explanation (MPE) inference in the BN model. We have achieved encouraging results on the horse images from the Weizmann dataset. We have also demonstrated the possible ways to extend the BN model so as to incorporate other contextual information such as the global object shape and human intervention for improving image segmentation. Human intervention is encoded as new evidence in the BN model. Its impact is propagated through belief propagation to update the states of the whole model. From the updated BN model, new image segmentation is produced.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133951475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Not only size matters: Regularized partial matching of nonrigid shapes
A. Bronstein, M. Bronstein
Partial matching is probably one of the most challenging problems in nonrigid shape analysis. The problem consists of matching similar parts of shapes that are dissimilar on the whole and can assume different forms by undergoing nonrigid deformations. Conceptually, two shapes can be considered partially matching if they have significant similar parts, with the simplest definition of significance being the size of the parts. Thus, partial matching can be defined as a multicriterion optimization problem that tries to simultaneously maximize the similarity and the size of these parts. In this paper, we propose a different definition of significance, taking into account the regularity of parts in addition to their size. The regularity term proposed here is similar in spirit to the Mumford-Shah functional. Numerical experiments show that regularized partial matching produces semantically better results than the non-regularized one.
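The multicriterion objective can be sketched numerically. Below, a candidate part (a binary mask over shape elements on a grid) is scored by its pointwise dissimilarity, its size, and a boundary-length irregularity penalty in the spirit of Mumford-Shah; the weights and masks are illustrative only, and a compact part beats an equally sized ragged one.

```python
import numpy as np

def part_score(mask, dissim_map, lam_size=1.0, mu_reg=0.5):
    dissim = (dissim_map * mask).sum()               # (i) part dissimilarity
    size = mask.sum()                                # (ii) part size
    # (iii) boundary length: mask transitions along rows and columns
    boundary = (np.abs(np.diff(mask.astype(int), axis=0)).sum()
                + np.abs(np.diff(mask.astype(int), axis=1)).sum())
    return dissim - lam_size * size + mu_reg * boundary   # lower is better

rng = np.random.default_rng(5)
dissim_map = rng.uniform(0, 2, size=(16, 16))        # pointwise dissimilarity
compact = np.zeros((16, 16), bool); compact[4:12, 4:12] = True
ragged = rng.random((16, 16)) < 0.25                 # similar size, irregular
print("compact part score:", round(part_score(compact, dissim_map), 2))
print("ragged  part score:", round(part_score(ragged, dissim_map), 2))
```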
{"title":"Not only size matters: Regularized partial matching of nonrigid shapes","authors":"A. Bronstein, M. Bronstein","doi":"10.1109/CVPRW.2008.4563077","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563077","url":null,"abstract":"Partial matching is probably one of the most challenging problems in nonrigid shape analysis. The problem consists of matching similar parts of shapes that are dissimilar on the whole and can assume different forms by undergoing nonrigid deformations. Conceptually, two shapes can be considered partially matching if they have significant similar parts, with the simplest definition of significance being the size of the parts. Thus, partial matching can be defined as a multicriterion optimization problem trying to simultaneously maximize the similarity and the size of these parts. In this paper, we propose a different definition of significance, taking into account the regularity of parts besides their size. The regularity term proposed here is similar to the spirit of the Mumford-Shah functional. Numerical experiments show that the regularized partial matching produces semantically better results compared to the non-regularized one.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133379410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Learning the abstract motion semantics of verbs from captioned videos
Stefan Mathe, A. Fazly, Sven J. Dickinson, S. Stevenson
We propose an algorithm for learning the semantics of a (motion) verb from videos depicting the action expressed by the verb, paired with sentences describing the action participants and their roles. Acknowledging that commonalities among example videos may not exist at the level of the input features, our approximation algorithm efficiently searches the space of more abstract features for a common solution. We test our algorithm by using it to learn the semantics of a sample set of verbs; results demonstrate the usefulness of the proposed framework, while identifying directions for further improvement.
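As a rough illustration of searching for commonality above the input-feature level, the sketch below generalizes each video's concrete motion features up a hypothetical abstraction hierarchy until a description shared by all examples is found. The hierarchy, feature names, and selection rule are invented for illustration and are not the paper's representation.

```python
parents = {                                  # child feature -> its abstraction
    "hand_up_fast": "hand_up", "hand_up_slow": "hand_up",
    "hand_up": "arm_motion", "arm_swing": "arm_motion",
    "arm_motion": "body_motion",
}

def abstractions(feature):
    """All generalizations of a feature, from concrete to most abstract."""
    chain = [feature]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain

def common_semantics(videos):
    """Least abstract feature shared by every example video."""
    shared = set(abstractions(videos[0][0]))
    for feats in videos:
        shared &= set().union(*(abstractions(f) for f in feats))
    for f in abstractions(videos[0][0]):     # prefer the least abstract match
        if f in shared:
            return f
    return None

videos = [["hand_up_fast"], ["hand_up_slow"], ["arm_swing"]]
print(common_semantics(videos))              # -> "arm_motion"
```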
{"title":"Learning the abstract motion semantics of verbs from captioned videos","authors":"Stefan Mathe, A. Fazly, Sven J. Dickinson, S. Stevenson","doi":"10.1109/CVPRW.2008.4563042","DOIUrl":"https://doi.org/10.1109/CVPRW.2008.4563042","url":null,"abstract":"We propose an algorithm for learning the semantics of a (motion) verb from videos depicting the action expressed by the verb, paired with sentences describing the action participants and their roles. Acknowledging that commonalities among example videos may not exist at the level of the input features, our approximation algorithm efficiently searches the space of more abstract features for a common solution. We test our algorithm by using it to learn the semantics of a sample set of verbs; results demonstrate the usefulness of the proposed framework, while identifying directions for further improvement.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130307625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10