2009 IEEE Conference on Computer Vision and Pattern Recognition: Latest Papers

Noninvasive volumetric imaging of cardiac electrophysiology
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206717
Linwei Wang, Heye Zhang, Ken C. L. Wong, Huafeng Liu, P. Shi
Volumetric details of cardiac electrophysiology, such as transmembrane potential dynamics and tissue excitability of the myocardium, are of fundamental importance for understanding normal and pathological cardiac mechanisms, and for aiding the diagnosis and treatment of cardiac arrhythmia. Noninvasive observations, however, are made on the body surface as an integration-projection of the volumetric phenomena inside the patient's heart. We present a physiological-model-constrained statistical framework where prior knowledge of general myocardial electrical activity is used to guide the reconstruction of patient-specific volumetric cardiac electrophysiological details from body surface potential data. Sequential data assimilation with proper computational reduction is developed to estimate transmembrane potential and myocardial excitability inside the heart, which are then utilized to depict arrhythmogenic substrates. The effectiveness and validity of the framework are demonstrated through its application to evaluating the location and extent of myocardial infarct using real patient data.
Citations: 8
A streaming framework for seamless building reconstruction from large-scale aerial LiDAR data
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206760
Qian-Yi Zhou, U. Neumann
We present a streaming framework for seamless building reconstruction from huge aerial LiDAR point sets. By storing data as stream files on hard disk and using main memory as only a temporary storage for ongoing computation, we achieve efficient out-of-core data management. This gives us the ability to handle data sets with hundreds of millions of points in a uniform manner. By adapting a building modeling pipeline into our streaming framework, we create the whole urban model of Atlanta from 17.7 GB LiDAR data with 683 M points in under 25 hours using less than 1 GB memory. To integrate this complex modeling pipeline with our streaming framework, we develop a state propagation mechanism, and extend current reconstruction algorithms to handle the large scale of data.
Citations: 52
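As a rough illustration of the out-of-core idea in the streaming LiDAR abstract above, the following sketch reads a point file in bounded-size chunks and keeps only a small running summary in memory. The on-disk layout (`RECORD`, three little-endian float32 values per point), the function names, and the per-cell maximum-height summary are all assumptions made for this example; the abstract does not describe the paper's actual stream format or its state propagation mechanism.

```python
import struct
from typing import BinaryIO, Dict, Iterator, List, Tuple

Point = Tuple[float, float, float]

# Hypothetical on-disk layout: x, y, z stored as little-endian float32.
RECORD = struct.Struct("<3f")


def stream_points(f: BinaryIO, chunk_points: int = 4096) -> Iterator[List[Point]]:
    """Yield points in bounded-size chunks so memory use does not grow with file size."""
    chunk_bytes = RECORD.size * chunk_points
    while True:
        buf = f.read(chunk_bytes)
        if not buf:
            break
        # Ignore a truncated trailing record, if any.
        usable = len(buf) - len(buf) % RECORD.size
        yield [RECORD.unpack_from(buf, off) for off in range(0, usable, RECORD.size)]


def max_height_grid(
    f: BinaryIO, cell: float = 1.0, chunk_points: int = 4096
) -> Dict[Tuple[int, int], float]:
    """Accumulate the maximum z per (x, y) grid cell while streaming.

    Only the grid summary is held in memory; it is assumed to be much
    smaller than the raw point set.
    """
    grid: Dict[Tuple[int, int], float] = {}
    for chunk in stream_points(f, chunk_points):
        for x, y, z in chunk:
            key = (int(x // cell), int(y // cell))
            if z > grid.get(key, float("-inf")):
                grid[key] = z
    return grid
```

Any file-like object opened in binary mode can be passed as `f`. The chunk size trades read overhead against peak memory, which is the same trade-off the abstract describes at a much larger scale.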
Image deblurring for less intrusive iris capture
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206700
Xinyu Huang, Liu Ren, Ruigang Yang
For most iris capturing scenarios, captured iris images could easily blur when the user is out of the depth of field (DOF) of the camera, or when he or she is moving. The common solution is to let the user try the capturing process again, as the quality of these blurred iris images is not good enough for recognition. In this paper, we propose a novel iris deblurring algorithm that can be used to improve the robustness and nonintrusiveness of iris capture. Unlike other iris deblurring algorithms, the key feature of our algorithm is that we use the domain knowledge inherent in iris images and iris capture settings to improve the performance, which could be in the form of iris image statistics, characteristics of pupils or highlights, or even depth information from the iris capturing system itself. Our experiments on both synthetic and real data demonstrate that our deblurring algorithm can significantly restore blurred iris patterns and therefore improve the robustness of iris capture.
Citations: 27
Marked point processes for crowd counting
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206621
Weina Ge, R. Collins
A Bayesian marked point process (MPP) model is developed to detect and count people in crowded scenes. The model couples a spatial stochastic process governing number and placement of individuals with a conditional mark process for selecting body shape. We automatically learn the mark (shape) process from training video by estimating a mixture of Bernoulli shape prototypes along with an extrinsic shape distribution describing the orientation and scaling of these shapes for any given image location. The reversible jump Markov Chain Monte Carlo framework is used to efficiently search for the maximum a posteriori configuration of shapes, leading to an estimate of the count, location and pose of each person in the scene. Quantitative results of crowd counting are presented for two publicly available datasets with known ground truth.
Citations: 263
Active volume models for 3D medical image segmentation
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206563
Tian Shen, Hongsheng Li, Z. Qian, Xiaolei Huang
In this paper, we propose a novel predictive model for object boundary, which can integrate information from any sources. The model is a dynamic “object” model whose manifestation includes a deformable surface representing shape, a volumetric interior carrying appearance statistics, and an embedded classifier that separates object from background based on current feature information. Unlike Snakes, Level Set, Graph Cut, MRF and CRF approaches, the model is “self-contained” in that it does not model the background, but rather focuses on an accurate representation of the foreground object's attributes. As we will show, however, the model is capable of reasoning about the background statistics and thus can detect when a change is sufficient to invoke a boundary decision. The shape of the 3D model is considered as an elastic solid, with a simplex-mesh (i.e. finite element triangulation) surface made of thousands of vertices. Deformations of the model are derived from a linear system that encodes external forces from the boundary of a Region of Interest (ROI), which is a binary mask representing the object region predicted by the current model. Efficient optimization and fast convergence of the model are achieved using the Finite Element Method (FEM). Other advantages of the model include the ease of dealing with topology changes and its ability to incorporate human interactions. Segmentation and validation results are presented for experiments on noisy 3D medical images.
Citations: 28
Discriminatively trained particle filters for complex multi-object tracking
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206801
Robin Hess, Alan Fern
This work presents a discriminative training method for particle filters in the context of multi-object tracking. We are motivated by the difficulty of hand-tuning the many model parameters for such applications and also by results in many application domains indicating that discriminative training is often superior to generative training methods. Our learning approach is tightly integrated into the actual inference process of the filter and attempts to directly optimize the filter parameters in response to observed errors. We present experimental results in the challenging domain of American football where our filter is trained to track all 22 players throughout football plays. The training method is shown to significantly improve performance of the tracker and to significantly outperform two recent particle-based multi-object tracking methods.
Citations: 135
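For readers unfamiliar with the filter being trained in the particle filter abstract above, here is a minimal sketch of one generic bootstrap particle filter step for a one-dimensional state. It is not the paper's tracker or its discriminative training procedure: the random-walk motion model, the Gaussian observation likelihood, and the names `particle_filter_step`, `motion_std`, and `obs_std` are assumptions chosen for illustration. Parameters such as `motion_std` and `obs_std` stand in for the kind of model parameters that the abstract says are hard to tune by hand.

```python
import math
import random
from typing import List


def particle_filter_step(
    particles: List[float],
    observation: float,
    motion_std: float,
    obs_std: float,
    rng: random.Random,
) -> List[float]:
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: propagate each particle through an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: score each particle with an assumed Gaussian observation likelihood.
    weights = [
        math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted
    ]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)

    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))
```

Calling the step repeatedly with a stream of observations keeps the particle set concentrated around states that explain the data. The mean of the particles is one simple point estimate of the tracked state.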
An instance selection approach to Multiple instance Learning
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206655
Zhouyu Fu, A. Robles-Kelly
Multiple-instance learning (MIL) is a new paradigm of supervised learning that deals with the classification of bags. Each bag is presented as a collection of instances from which features are extracted. In MIL, we are usually confronted with a large instance space for even moderately sized data sets, since each bag may contain many instances. Hence it is important to design efficient instance pruning and selection techniques to speed up the learning process without compromising on performance. In this paper, we address the issue of instance selection in multiple-instance learning and propose IS-MIL, an instance selection framework for MIL, to tackle large-scale MIL problems. IS-MIL is based on an alternating optimisation framework that iteratively repeats the steps of instance selection/updating and classifier learning, and is guaranteed to converge. Experimental results demonstrate the utility and efficiency of the proposed approach compared to the alternatives.
Citations: 31
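The alternation between instance selection and classifier learning described in the MIL abstract above can be sketched as follows. This is only an illustration of the general loop, not IS-MIL itself: a nearest-centroid linear classifier is swapped in for whatever classifier the paper uses, each bag is initialised by its mean, the highest-scoring instance is taken as the bag's representative, and labels are assumed to be `+1` or `-1`. The convergence guarantee stated in the abstract is not claimed for this simplified version, which simply stops when the selection no longer changes or after `max_iters` rounds.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean(vectors: Sequence[Sequence[float]]) -> Vector:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_centroid_classifier(pos: List[Vector], neg: List[Vector]) -> Tuple[Vector, float]:
    """Linear rule w.x + b whose boundary bisects the two class centroids."""
    mp, mn = mean(pos), mean(neg)
    w = [a - b for a, b in zip(mp, mn)]
    mid = [(a + b) / 2.0 for a, b in zip(mp, mn)]
    return w, -dot(w, mid)


def select_and_learn(
    bags: List[List[Vector]], labels: List[int], max_iters: int = 20
) -> Tuple[Vector, float, List[int]]:
    """Alternate classifier learning and per-bag instance selection (max_iters >= 1)."""
    reps = [mean(bag) for bag in bags]  # initial representative: the bag mean
    selected = [-1] * len(bags)
    for _ in range(max_iters):
        # Step 1: learn a classifier from the current representatives.
        pos = [r for r, y in zip(reps, labels) if y == 1]
        neg = [r for r, y in zip(reps, labels) if y == -1]
        w, b = fit_centroid_classifier(pos, neg)
        # Step 2: in every bag, select the instance the classifier scores highest.
        new_selected = [
            max(range(len(bag)), key=lambda i: dot(w, bag[i]) + b) for bag in bags
        ]
        if new_selected == selected:
            break  # selection is stable, so the classifier would not change
        selected = new_selected
        reps = [bag[i] for bag, i in zip(bags, selected)]
    return w, b, selected
```

Training on one selected instance per bag, rather than on every instance, is what keeps the learning step small when bags contain many instances.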
A family of contextual measures of similarity between distributions with application to image retrieval
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206505
F. Perronnin, Yan Liu, J. Renders
We introduce a novel family of contextual measures of similarity between distributions: the similarity between two distributions q and p is measured in the context of a third distribution u. In our framework any traditional measure of similarity / dissimilarity has its contextual counterpart. We show that for two important families of divergences (Bregman and Csiszár), the contextual similarity computation consists in solving a convex optimization problem. We focus on the case of multinomials and explain how to compute in practice the similarity for several well-known measures. These contextual measures are then applied to the image retrieval problem. In such a case, the context u is estimated from the neighbors of a query q. One of the main benefits of our approach lies in the fact that using different contexts, and especially contexts at multiple scales (i.e. broad and narrow contexts), provides different views on the same problem. Combining the different views can improve retrieval accuracy. We will show on two very different datasets (one of photographs, the other of document images) that the proposed measures have a relatively small positive impact on macro Average Precision (which measures purely ranking) and a large positive impact on micro Average Precision (which measures both ranking and consistency of the scores across multiple queries).
Citations: 42
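The contextual similarity abstract above says that, for multinomials, the computation reduces to a convex optimization problem, but does not spell out the objective. The sketch below assumes one plausible formulation: the similarity of q and p in the context of u is taken to be the mixture weight ω in [0, 1] that minimises the Kullback-Leibler divergence between q and ωp + (1 − ω)u. That definition, the choice of KL divergence, and the function names are assumptions for illustration only. Because the assumed objective is convex in ω, a simple ternary search is enough.

```python
import math
from typing import Sequence


def kl_divergence(q: Sequence[float], m: Sequence[float]) -> float:
    """KL(q || m) for discrete distributions; infinite if m lacks support where q has mass."""
    total = 0.0
    for qi, mi in zip(q, m):
        if qi > 0.0:
            if mi <= 0.0:
                return math.inf
            total += qi * math.log(qi / mi)
    return total


def contextual_similarity(
    q: Sequence[float], p: Sequence[float], u: Sequence[float], iters: int = 100
) -> float:
    """Weight w in [0, 1] minimising KL(q || w*p + (1-w)*u) (assumed definition)."""

    def objective(w: float) -> float:
        mix = [w * pi + (1.0 - w) * ui for pi, ui in zip(p, u)]
        return kl_divergence(q, mix)

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        # Ternary search: valid because the objective is convex in w.
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0
```

Under this assumed definition, a weight near 1 means q is explained by p much better than by the context u, and a weight near 0 means p adds nothing beyond the context.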
Discriminative subvolume search for efficient action detection
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206671
Junsong Yuan, Zicheng Liu, Ying Wu
Actions are spatio-temporal patterns which can be characterized by collections of spatio-temporal invariant features. Detection of actions is to find the re-occurrences (e.g. through pattern matching) of such spatio-temporal patterns. This paper addresses two critical issues in pattern matching-based action detection: (1) efficiency of pattern search in 3D videos and (2) tolerance of intra-pattern variations of actions. Our contributions are two-fold. First, we propose a discriminative pattern matching method called naive-Bayes based mutual information maximization (NBMIM) for multi-class action categorization. It improves the state-of-the-art results on the standard KTH dataset. Second, a novel search algorithm is proposed to locate the optimal subvolume in the 3D video space for efficient action detection. Our method is purely data-driven and does not rely on object detection, tracking or background subtraction. It can handle the intra-pattern variations of actions such as scale and speed variations well, and is insensitive to dynamic and cluttered backgrounds and even partial occlusions. The experiments on versatile datasets including KTH and CMU action datasets demonstrate the effectiveness and efficiency of our method.
Citations: 317
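To make the notion of an optimal subvolume in the abstract above concrete, the sketch below finds the axis-aligned box with the largest total score in a small 3D grid of per-cell scores, where positive scores are assumed to favour the action and negative scores to oppose it. This is an exhaustive baseline written for clarity: it enumerates all spatial extents and applies Kadane's maximum-subarray scan along time. It is not the paper's search algorithm, which the abstract describes only as novel and efficient, and the `scores[t][y][x]` layout is an assumption.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int, int, int]  # (t0, t1, y0, y1, x0, x1), inclusive


def max_subvolume(scores: List[List[List[float]]]) -> Tuple[float, Box]:
    """Return the best total score and its box for scores indexed as scores[t][y][x]."""
    T, H, W = len(scores), len(scores[0]), len(scores[0][0])
    best = float("-inf")
    best_box: Box = (0, 0, 0, 0, 0, 0)
    for y0 in range(H):
        # col[t][x] holds the sum of scores over rows y0..y1 for each (t, x).
        col = [[0.0] * W for _ in range(T)]
        for y1 in range(y0, H):
            for t in range(T):
                for x in range(W):
                    col[t][x] += scores[t][y1][x]
            for x0 in range(W):
                # line[t] holds the sum over rows y0..y1 and columns x0..x1.
                line = [0.0] * T
                for x1 in range(x0, W):
                    for t in range(T):
                        line[t] += col[t][x1]
                    # Kadane's scan along time for this spatial window.
                    run, start = 0.0, 0
                    for t in range(T):
                        if run <= 0.0:
                            run, start = line[t], t
                        else:
                            run += line[t]
                        if run > best:
                            best = run
                            best_box = (start, t, y0, y1, x0, x1)
    return best, best_box
```

The cost grows as the square of the frame height times the square of the frame width times the number of frames, which is workable only for tiny volumes and shows why a faster search matters for real video.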
Symmetry integrated region-based image segmentation
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206570
Yu Sun, B. Bhanu
Symmetry is an important cue for machine perception that involves high-level knowledge of image components. Unlike most previous research, which only computes symmetry in an image, this paper integrates symmetry with image segmentation to improve the segmentation performance. The symmetry integration is used to optimize both the segmentation and the symmetry of regions simultaneously. Interesting points are initially extracted from an image and are further refined for detecting the symmetry axis. A symmetry affinity matrix is used explicitly as a constraint in a region growing algorithm in order to refine the symmetry of segmented regions. Experimental results and comparisons from a wide domain of images indicate a promising improvement by symmetry integrated image segmentation compared to other image segmentation methods that do not exploit symmetry.
Citations: 11