
Latest publications from the 2009 IEEE Conference on Computer Vision and Pattern Recognition

Disambiguating the recognition of 3D objects
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206683
Gutemberg Guerra-Filho
We propose novel algorithms for the detection, segmentation, recognition, and pose estimation of three-dimensional objects. Our approach initially infers geometric primitives to describe the set of 3D objects. A hierarchical structure is constructed to organize the objects in terms of shared primitives and relations between different primitives in the same object. This structure is shown to disambiguate the object models and to improve recognition rates. The primitives are obtained through our new Invariant Hough Transform. This algorithm uses geometric invariants to compute relations for subsets of points in a specific object. Each relation is stored in a hash table according to the invariant value. The hash table is used to find potential corresponding points between objects. With point matches, pose estimation is achieved by building a probability distribution of transformations. We evaluate our methods with experiments using synthetic and real 3D objects.
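The hash-table lookup the abstract describes is closely related to classical geometric hashing. The paper's exact invariants are not specified in this abstract, so the sketch below uses pairwise squared distance (invariant under rotation and translation) as a hypothetical stand-in: relations are binned by invariant value, and shared bins vote for candidate point correspondences.

```python
import itertools
from collections import defaultdict

def pair_invariant(p, q):
    """Squared Euclidean distance: invariant under rotation and translation."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def build_table(points, bin_width=0.5):
    """Hash every point pair of a model under its quantized invariant value."""
    table = defaultdict(list)
    for i, j in itertools.combinations(range(len(points)), 2):
        key = round(pair_invariant(points[i], points[j]) / bin_width)
        table[key].append((i, j))
    return table

def candidate_matches(table, query_points, bin_width=0.5):
    """Look up query pairs in the table; shared keys vote for correspondences."""
    votes = defaultdict(int)  # (query_index, model_index) -> vote count
    for i, j in itertools.combinations(range(len(query_points)), 2):
        key = round(pair_invariant(query_points[i], query_points[j]) / bin_width)
        for (mi, mj) in table.get(key, []):
            votes[(i, mi)] += 1
            votes[(j, mj)] += 1
    return votes
```

Querying a model against its own table makes each point vote most strongly for itself; the heavily voted pairs then seed the pose-estimation stage the abstract mentions.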
Citations: 4
Towards high-resolution large-scale multi-view stereo
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206617
Hoang-Hiep Vu, R. Keriven, Patrick Labatut, Jean-Philippe Pons
Boosted by the Middlebury challenge, the precision of dense multi-view stereovision methods has increased drastically in the past few years. Yet most methods, although they perform well on this benchmark, are still inapplicable to large-scale data sets taken under uncontrolled conditions. In this paper, we propose a multi-view stereo pipeline able to handle very large scenes while still producing highly detailed reconstructions within very reasonable time. The keys to these benefits are twofold: (i) a minimum s-t cut based global optimization that transforms a dense point cloud into a visibility-consistent mesh, followed by (ii) a mesh-based variational refinement that captures small details, smartly handling photo-consistency, regularization and adaptive resolution. Our method has been tested on numerous large-scale outdoor scenes. The accuracy of our reconstructions is also measured on the recent dense multi-view benchmark proposed by Strecha et al., showing our results to compare more than favorably with the current state of the art.
Citations: 279
LidarBoost: Depth superresolution for ToF 3D shape scanning
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206804
Sebastian Schuon, C. Theobalt, James Davis, S. Thrun
Depth maps captured with time-of-flight cameras have very low data quality: the image resolution is rather limited and the level of random noise contained in the depth maps is very high. Therefore, such flash lidars cannot be used out of the box for high-quality 3D object scanning. To solve this problem, we present LidarBoost, a 3D depth superresolution method that combines several low resolution noisy depth images of a static scene from slightly displaced viewpoints, and merges them into a high-resolution depth image. We have developed an optimization framework that uses a data fidelity term and a geometry prior term that is tailored to the specific characteristics of flash lidars. We demonstrate both visually and quantitatively that LidarBoost produces better results than previous methods from the literature.
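LidarBoost's actual objective and its lidar-specific geometry prior are more elaborate than this abstract can convey; the sketch below only illustrates the general recipe it names, merging several aligned noisy depth maps by minimizing a data-fidelity term plus a smoothness prior. All parameter values are illustrative assumptions.

```python
import numpy as np

def fuse_depth_maps(depth_maps, lam=0.1, iters=200, step=0.2):
    """Gradient descent on E(x) = mean_k ||x - d_k||^2 + lam * ||grad x||^2.

    depth_maps: sequence of aligned noisy depth images of equal shape.
    Note: np.roll gives periodic boundaries, acceptable for a sketch.
    """
    d = np.asarray(depth_maps, dtype=float)
    d_mean = d.mean(axis=0)        # closed-form minimizer of the data term alone
    x = d_mean.copy()
    for _ in range(iters):
        # discrete Laplacian, the gradient direction of the smoothness prior
        lap = (np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4.0 * x)
        grad = 2.0 * (x - d_mean) - 2.0 * lam * lap   # dE/dx
        x -= step * grad
    return x
```

The real method additionally handles the sub-pixel displacements between viewpoints and upsamples to a higher-resolution grid, which this toy fusion omits.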
Citations: 230
Illumination and spatially varying specular reflectance from a single view
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206764
K. Hara, K. Nishino
Estimating the illumination and the reflectance properties of an object surface from a sparse set of images is an important but inherently ill-posed problem. The problem becomes even harder if we wish to account for the spatial variation of material properties on the surface. In this paper, we derive a novel method for estimating the spatially varying specular reflectance properties of a surface of known geometry, as well as the illumination distribution, from a specular-only image, captured for instance using polarization to separate reflection components. Unlike previous work, we do not assume the illumination to be a single point light source. We model specular reflection with a spherical statistical distribution and encode the spatial variation with radial basis functions of its parameters. This allows us to formulate the simultaneous estimation of spatially varying specular reflectance and illumination as a sound probabilistic inference problem, in particular using Csiszar's I-divergence measure. To solve it, we derive an iterative algorithm similar to expectation maximization. We demonstrate the effectiveness of the method on synthetic and real-world scenes.
Citations: 1
Simultaneous image classification and annotation
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206800
Chong Wang, D. Blei, Li Fei-Fei
Image classification and annotation are important problems in computer vision, but rarely considered together. Intuitively, annotations provide evidence for the class label, and the class label provides evidence for annotations. For example, an image of class highway is more likely annotated with words “road,” “car,” and “traffic” than words “fish,” “boat,” and “scuba.” In this paper, we develop a new probabilistic model for jointly modeling the image, its class label, and its annotations. Our model treats the class label as a global description of the image, and treats annotation terms as local descriptions of parts of the image. Its underlying probabilistic assumptions naturally integrate these two sources of information. We derive approximate inference and estimation algorithms based on variational methods, as well as efficient approximations for classifying and annotating new images. We examine the performance of our model on two real-world image data sets, illustrating that a single model provides competitive annotation performance, and superior classification performance.
Citations: 612
Noninvasive volumetric imaging of cardiac electrophysiology
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206717
Linwei Wang, Heye Zhang, Ken C. L. Wong, Huafeng Liu, P. Shi
Volumetric details of cardiac electrophysiology, such as transmembrane potential dynamics and tissue excitability of the myocardium, are of fundamental importance for understanding normal and pathological cardiac mechanisms, and for aiding the diagnosis and treatment of cardiac arrhythmia. Noninvasive observations, however, are made on the body surface as an integration-projection of the volumetric phenomena inside the patient's heart. We present a physiological-model-constrained statistical framework where prior knowledge of general myocardial electrical activity is used to guide the reconstruction of patient-specific volumetric cardiac electrophysiological details from body surface potential data. Sequential data assimilation with proper computational reduction is developed to estimate transmembrane potential and myocardial excitability inside the heart, which are then utilized to depict arrhythmogenic substrates. The effectiveness and validity of the framework are demonstrated through its application to evaluating the location and extent of myocardial infarct using real patient data.
Citations: 8
A streaming framework for seamless building reconstruction from large-scale aerial LiDAR data
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206760
Qian-Yi Zhou, U. Neumann
We present a streaming framework for seamless building reconstruction from huge aerial LiDAR point sets. By storing data as stream files on hard disk and using main memory as only a temporary storage for ongoing computation, we achieve efficient out-of-core data management. This gives us the ability to handle data sets with hundreds of millions of points in a uniform manner. By adapting a building modeling pipeline into our streaming framework, we create the whole urban model of Atlanta from 17.7 GB LiDAR data with 683 M points in under 25 hours using less than 1 GB memory. To integrate this complex modeling pipeline with our streaming framework, we develop a state propagation mechanism, and extend current reconstruction algorithms to handle the large scale of data.
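The out-of-core idea in the abstract, keeping only the active chunk of a huge point stream in main memory while accumulating results, can be illustrated with a minimal generator. The whitespace-separated file format and the chunk size here are illustrative assumptions, not the authors' stream-file layout.

```python
def stream_points(path, chunk_size=1_000_000):
    """Yield fixed-size batches of (x, y, z) points from a text file so only
    one chunk resides in main memory at a time."""
    batch = []
    with open(path) as f:
        for line in f:
            x, y, z = map(float, line.split()[:3])
            batch.append((x, y, z))
            if len(batch) == chunk_size:
                yield batch
                batch = []
    if batch:                      # flush the final partial chunk
        yield batch

def streaming_bounding_box(path, chunk_size=1_000_000):
    """Example out-of-core pass: accumulate a global statistic chunk by chunk."""
    lo = [float("inf")] * 3
    hi = [float("-inf")] * 3
    for batch in stream_points(path, chunk_size):
        for p in batch:
            lo = [min(a, b) for a, b in zip(lo, p)]
            hi = [max(a, b) for a, b in zip(hi, p)]
    return lo, hi
```

The paper's pipeline propagates modeling state between chunks rather than a single running statistic, but the memory discipline is the same: peak usage depends on the chunk size, not on the total point count.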
Citations: 52
Learning rotational features for filament detection
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206511
Germán González, F. Fleuret, P. Fua
State-of-the-art approaches for detecting filament-like structures in noisy images rely on filters optimized for signals of a particular shape, such as an ideal edge or ridge. While these approaches are optimal when the image conforms to these ideal shapes, their performance quickly degrades on many types of real data where the image deviates from the ideal model, and when noise processes violate a Gaussian assumption. In this paper, we show that by learning rotational features, we can outperform state-of-the-art filament detection techniques on many different kinds of imagery. More specifically, we demonstrate superior performance for the detection of blood vessels in retinal scans, neurons in brightfield microscopy imagery, and streets in satellite imagery.
Citations: 42
A similarity measure between vector sequences with application to handwritten word image retrieval
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206783
José A. Rodríguez-Serrano, F. Perronnin, J. Lladós, Gemma Sánchez
This article proposes a novel similarity measure between vector sequences. Recently, a model-based approach was introduced to address this issue. It consists in modeling each sequence with a continuous Hidden Markov Model (C-HMM) and computing a probabilistic measure of similarity between C-HMMs. In this paper we propose to model sequences with semi-continuous HMMs (SC-HMMs): the Gaussians of the SC-HMMs are constrained to belong to a shared pool of Gaussians. This constraint provides two major benefits. First, the a priori information contained in the common set of Gaussians leads to a more accurate estimate of the HMM parameters. Second, the computation of a probabilistic similarity between two SC-HMMs can be simplified to a Dynamic Time Warping (DTW) between their mixture weight vectors, which significantly reduces the computational cost. Experimental results on a handwritten word retrieval task show that the proposed similarity outperforms both the traditional DTW between the original sequences and the model-based approach using C-HMMs. We also show that this increase in accuracy can be traded for a significant reduction of the computational cost (up to 100 times).
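The DTW reduction is the key cost saving in the abstract: instead of a probabilistic similarity between full HMMs, one only aligns the two sequences of mixture-weight vectors. A textbook DTW over vector sequences looks like this; the squared-distance cost between weight vectors is an illustrative choice, not necessarily the paper's.

```python
import math

def dtw(a, b, dist=lambda u, v: sum((x - y) ** 2 for x, y in zip(u, v))):
    """Dynamic Time Warping: minimal accumulated cost over monotonic
    alignments of sequences a and b (lists of equal-length vectors)."""
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            # extend the cheapest of: deletion, insertion, or match
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

With sequences of length n and m this runs in O(nm) vector comparisons, which is where the reported speedup over full model-based similarities comes from.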
Citations: 35
Coded exposure deblurring: Optimized codes for PSF estimation and invertibility
Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206685
Amit K. Agrawal, Yi Xu
We consider the problem of single image object motion deblurring from a static camera. It is well known that deblurring of moving objects using a traditional camera is ill-posed, due to the loss of high spatial frequencies in the captured blurred image. A coded exposure camera modulates the integration pattern of light by opening and closing the shutter within the exposure time using a binary code. The code is chosen to make the resulting point spread function (PSF) invertible, for best deconvolution performance. However, for a successful deconvolution algorithm, PSF estimation is as important as PSF invertibility. We show that PSF estimation is easier if the resulting motion blur is smooth, and that the optimal code for PSF invertibility could worsen PSF estimation, since it leads to non-smooth blur. We show that both criteria of PSF invertibility and PSF estimation can be simultaneously met, albeit with a slight increase in the deconvolution noise. We propose design rules for a code to have good PSF estimation capability and outline two search criteria for finding the optimal code of a given length. We present theoretical analysis comparing the performance of the proposed code with the code optimized solely for PSF invertibility. We also show how to easily implement coded exposure on a consumer-grade machine vision camera with no additional hardware. Real experimental results demonstrate the effectiveness of the proposed codes for motion deblurring.
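A quick way to see why a fluttered code is invertible while a plain open shutter is not: inspect the DFT magnitudes of the exposure code, since deconvolution divides by them. This check is a standard diagnostic rather than the authors' optimization, and the short codes below are toy examples; practical flutter-shutter codes are much longer.

```python
import numpy as np

def psf_condition(code):
    """Smallest DFT magnitude of a binary exposure code. Values near zero
    mean some spatial frequencies are destroyed by the motion blur, so
    deconvolution amplifies noise without bound at those frequencies."""
    spectrum = np.abs(np.fft.fft(np.asarray(code, dtype=float)))
    return float(spectrum.min())

print(psf_condition([1, 1, 1, 1]))  # ~0: box blur of a traditional shutter
print(psf_condition([1, 1, 0, 1]))  # ≈1: every frequency bin survives
```

Maximizing this minimum magnitude is one natural criterion for invertibility; the paper's point is that it must be balanced against codes whose blur remains easy to estimate.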
{"title":"Coded exposure deblurring: Optimized codes for PSF estimation and invertibility","authors":"Amit K. Agrawal, Yi Xu","doi":"10.1109/CVPR.2009.5206685","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206685","url":null,"abstract":"We consider the problem of single image object motion deblurring from a static camera. It is well-known that deblurring of moving objects using a traditional camera is ill-posed, due to the loss of high spatial frequencies in the captured blurred image. A coded exposure camera modulates the integration pattern of light by opening and closing the shutter within the exposure time using a binary code. The code is chosen to make the resulting point spread function (PSF) invertible, for best deconvolution performance. However, for a successful deconvolution algorithm, PSF estimation is as important as PSF invertibility. We show that PSF estimation is easier if the resulting motion blur is smooth and the optimal code for PSF invertibility could worsen PSF estimation, since it leads to non-smooth blur. We show that both criterions of PSF invertibility and PSF estimation can be simultaneously met, albeit with a slight increase in the deconvolution noise. We propose design rules for a code to have good PSF estimation capability and outline two search criteria for finding the optimal code for a given length. We present theoretical analysis comparing the performance of the proposed code with the code optimized solely for PSF invertibility. We also show how to easily implement coded exposure on a consumer grade machine vision camera with no additional hardware. Real experimental results demonstrate the effectiveness of the proposed codes for motion deblurring.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"287 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123445806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 89
Journal
2009 IEEE Conference on Computer Vision and Pattern Recognition