
2013 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Learning without Human Scores for Blind Image Quality Assessment
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.133
Wufeng Xue, Lei Zhang, X. Mou
General purpose blind image quality assessment (BIQA) has been recently attracting significant attention in the fields of image processing, vision and machine learning. State-of-the-art BIQA methods usually learn to evaluate the image quality by regression from human subjective scores of the training samples. However, these methods need a large number of human scored images for training, and lack an explicit explanation of how the image quality is affected by image local features. An interesting question is then: can we learn for effective BIQA without using human scored images? This paper makes a good effort to answer this question. We partition the distorted images into overlapped patches, and use a percentile pooling strategy to estimate the local quality of each patch. Then a quality-aware clustering (QAC) method is proposed to learn a set of centroids on each quality level. These centroids are then used as a codebook to infer the quality of each patch in a given image, and subsequently a perceptual quality score of the whole image can be obtained. The proposed QAC based BIQA method is simple yet effective. It not only has comparable accuracy to those methods using human scored images in learning, but also has merits such as high linearity to human perception of image quality, real-time implementation and availability of image local quality map.
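To make the QAC pipeline concrete, the following is a minimal sketch, assuming patch features and percentile-pooled pseudo quality scores are already computed; the quality binning, cluster counts, and function names (e.g. `learn_qac_codebook`) are illustrative rather than the authors' implementation.

```python
# Sketch of the quality-aware clustering (QAC) idea: learn centroids per
# quality level, then score a new image by nearest-centroid lookup per patch.
import numpy as np
from sklearn.cluster import KMeans

def learn_qac_codebook(patch_feats, patch_quality, n_levels=5, k_per_level=40):
    """Cluster patches of each (binned) quality level into centroids."""
    bins = np.linspace(patch_quality.min(), patch_quality.max(), n_levels + 1)
    levels = np.clip(np.digitize(patch_quality, bins) - 1, 0, n_levels - 1)
    centroids, centroid_quality = [], []
    for lv in range(n_levels):
        feats = patch_feats[levels == lv]
        if len(feats) == 0:
            continue
        k = min(k_per_level, len(feats))
        km = KMeans(n_clusters=k, n_init=5, random_state=0).fit(feats)
        centroids.append(km.cluster_centers_)
        centroid_quality.append(np.full(k, patch_quality[levels == lv].mean()))
    return np.vstack(centroids), np.concatenate(centroid_quality)

def infer_image_quality(patch_feats, centroids, centroid_quality):
    """Assign each patch the quality of its nearest centroid, then average."""
    d = ((patch_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return centroid_quality[d.argmin(axis=1)].mean()
```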
Citations: 353
Structured Face Hallucination
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.146
Chih-Yuan Yang, Sifei Liu, Ming-Hsuan Yang
The goal of face hallucination is to generate high-resolution images with fidelity from low-resolution ones. In contrast to existing methods based on patch similarity or holistic constraints in the image space, we propose to exploit local image structures for face hallucination. Each face image is represented in terms of facial components, contours and smooth regions. The image structure is maintained via matching gradients in the reconstructed high-resolution output. For facial components, we align input images to generate accurate exemplars and transfer the high-frequency details for preserving structural consistency. For contours, we learn statistical priors to generate salient structures in the high-resolution images. A patch matching method is utilized on the smooth regions where the image gradients are preserved. Experimental results demonstrate that the proposed algorithm generates hallucinated face images with favorable quality and adaptability.
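As a rough illustration of the detail-transfer step, here is a sketch assuming the high-resolution exemplar has already been aligned to the input and a binary component mask is given; it performs only a generic high-frequency transfer, not the full component/contour/smooth-region pipeline.

```python
# Sketch: upsample a low-res face and transfer high-frequency detail from an
# aligned high-res exemplar inside a given region mask (e.g. a facial component).
import numpy as np
from scipy.ndimage import zoom, gaussian_filter

def hallucinate_region(lr_face, exemplar_hr, mask, scale=4, sigma=2.0):
    """lr_face: HxW; exemplar_hr and mask: (H*scale)x(W*scale) arrays."""
    base = zoom(lr_face.astype(float), scale, order=3)          # bicubic upsample
    detail = exemplar_hr - gaussian_filter(exemplar_hr, sigma)  # high-frequency band
    return base + mask * detail                                 # inject detail in the region
```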
Citations: 134
A Statistical Model for Recreational Trails in Aerial Images
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.50
Andrew Predoehl, S. Morris, Kobus Barnard
We present a statistical model of aerial images of recreational trails, and a method to infer trail routes in such images. We learn a set of textons describing the images, and use them to divide the image into super-pixels represented by their textons. We then learn, for each texton, the frequency of generating on-trail and off-trail pixels, and the direction of trail through on-trail pixels. From these, we derive an image likelihood function. We combine that with a prior model of trail length and smoothness, yielding a posterior distribution for trails, given an image. We search for good values of this posterior using a novel stochastic variation of Dijkstra's algorithm. Our experiments, on trail images and ground truth collected in the western continental USA, show substantial improvement over those of the previous best trail-finding method.
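A minimal sketch of how per-pixel texton likelihoods can drive a shortest-path trail search is given below; it uses plain Dijkstra on a 4-connected pixel grid and omits the smoothness prior and the paper's stochastic variant.

```python
# Sketch: per-pixel cost = -log P(on-trail | texton), then Dijkstra over the
# pixel grid to find a minimum-cost route between two endpoints.
import heapq
import numpy as np

def trail_route(texton_map, p_on_trail, start, goal):
    """texton_map: HxW int labels; p_on_trail[t] = learned P(on-trail | texton t)."""
    cost = -np.log(np.clip(p_on_trail[texton_map], 1e-6, 1.0))
    H, W = cost.shape
    dist = np.full((H, W), np.inf)
    prev = {}
    dist[start] = cost[start]
    pq = [(dist[start], start)]
    while pq:
        d, (y, x) = heapq.heappop(pq)
        if (y, x) == goal:
            break
        if d > dist[y, x]:
            continue
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < H and 0 <= nx < W and d + cost[ny, nx] < dist[ny, nx]:
                dist[ny, nx] = d + cost[ny, nx]
                prev[(ny, nx)] = (y, x)
                heapq.heappush(pq, (dist[ny, nx], (ny, nx)))
    path, node = [], goal
    while node != start:          # walk predecessors back to the start
        path.append(node)
        node = prev[node]
    return [start] + path[::-1]
```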
Citations: 4
A Max-Margin Riffled Independence Model for Image Tag Ranking
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.399
Tian Lan, Greg Mori
We propose the Max-Margin Riffled Independence Model (MMRIM), a new method for image tag ranking that models the structured preferences among tags. The goal is to predict a ranked tag list for a given image, where tags are ordered by their importance or relevance to the image content. Our model integrates the max-margin formalism with the riffled independence factorizations proposed in [10], which naturally allows for structured learning and efficient ranking. Experimental results on the SUN Attribute and LabelMe datasets demonstrate the superior performance of the proposed model compared with baseline tag ranking methods. We also apply the predicted rank list of tags to several higher-level computer vision applications in image understanding and retrieval, and demonstrate that MMRIM significantly improves the accuracy of these applications.
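The sketch below shows only a generic pairwise max-margin tag-ranking objective (a linear scorer per tag with hinge losses on mis-ordered pairs); it does not reproduce the riffled-independence factorization, and all names such as `rank_hinge_loss` are illustrative.

```python
# Sketch of a plain pairwise max-margin tag-ranking objective: a linear scorer
# per tag, trained so tags ranked higher for an image score above lower ones.
import numpy as np

def rank_hinge_loss(W, x, ranked_tags, margin=1.0, lam=1e-3):
    """W: (n_tags, d) weights; x: (d,) image feature; ranked_tags: tag ids, best first."""
    scores = W @ x
    loss = lam * (W ** 2).sum()
    grad = 2 * lam * W
    for i, hi in enumerate(ranked_tags):          # every tag above...
        for lo in ranked_tags[i + 1:]:            # ...every tag ranked below it
            viol = margin - (scores[hi] - scores[lo])
            if viol > 0:                          # margin violated: accumulate hinge loss
                loss += viol
                grad[hi] -= x
                grad[lo] += x
    return loss, grad
```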
Citations: 17
Detecting Pulse from Head Motions in Video
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.440
Guha Balakrishnan, F. Durand, J. Guttag
We extract heart rate and beat lengths from videos by measuring subtle head motion caused by the Newtonian reaction to the influx of blood at each beat. Our method tracks features on the head and performs principal component analysis (PCA) to decompose their trajectories into a set of component motions. It then chooses the component that best corresponds to heartbeats based on its temporal frequency spectrum. Finally, we analyze the motion projected to this component and identify peaks of the trajectories, which correspond to heartbeats. When evaluated on 18 subjects, our approach reported heart rates nearly identical to an electrocardiogram device. Additionally we were able to capture clinically relevant information about heart rate variability.
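The processing chain lends itself to a short sketch, assuming feature trajectories have already been tracked; the band limits and the band-power selection rule used here are plausible stand-ins rather than the paper's exact settings.

```python
# Sketch: PCA of tracked-feature trajectories, pick the component whose spectrum
# concentrates in a plausible heart-rate band, then count peaks to estimate the rate.
import numpy as np
from scipy.signal import find_peaks

def heart_rate_from_tracks(tracks, fps, band=(0.75, 2.0)):
    """tracks: (T, N) vertical positions of N tracked head features over T frames."""
    X = tracks - tracks.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)     # PCA via SVD
    comps = X @ Vt.T                                     # component time series
    freqs = np.fft.rfftfreq(len(X), d=1.0 / fps)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    power = np.abs(np.fft.rfft(comps, axis=0)) ** 2
    # choose the component with the largest fraction of power in the HR band
    best = np.argmax(power[in_band].sum(axis=0) / power.sum(axis=0))
    peaks, _ = find_peaks(comps[:, best], distance=fps / band[1])
    return 60.0 * len(peaks) / (len(X) / fps)            # beats per minute
```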
Citations: 508
Spectral Modeling and Relighting of Reflective-Fluorescent Scenes
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.191
Antony Lam, Imari Sato
Hyperspectral reflectance data allows for highly accurate spectral relighting under arbitrary illumination, which is invaluable to applications ranging from archiving cultural e-heritage to consumer product design. Past methods for capturing the spectral reflectance of scenes have proven successful in relighting, but they all share a common assumption: none of them considers the effects of fluorescence, despite fluorescence being found in many everyday objects. In this paper, we describe the very different ways that reflectance and fluorescence interact with illuminants and show the need to explicitly consider fluorescence in the relighting problem. We then propose a robust method, based on well established theories of reflectance and fluorescence, for imaging each of these components. Finally, we show that we can relight real scenes of reflective-fluorescent surfaces with much higher accuracy in comparison to only considering the reflective component.
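A minimal sketch of the standard reflective-plus-fluorescent image-formation model that such relighting relies on: per-wavelength reflectance scaling plus a fixed emission spectrum scaled by the absorbed illuminant energy. Spectra are assumed to be sampled on a common wavelength grid; the function name and the discrete-sum integral are illustrative, not the authors' code.

```python
# Sketch: relight a reflective-fluorescent surface under a new illuminant.
# Reflectance scales the illuminant per wavelength; fluorescence re-emits a fixed
# spectrum scaled by how much of the illuminant its absorption spectrum captures.
import numpy as np

def relight(reflectance, emission, absorption, illuminant):
    """All arguments are spectra sampled at the same wavelengths (1-D arrays)."""
    reflective_part = reflectance * illuminant                  # per-wavelength scaling
    excitation = float(np.sum(absorption * illuminant))         # discrete integral of absorbed energy
    fluorescent_part = emission * excitation                    # re-emitted spectrum
    return reflective_part + fluorescent_part
```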
Citations: 24
Bayesian Depth-from-Defocus with Shading Constraints
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.35
Chen Li, Shuochen Su, Yasuyuki Matsushita, Kun Zhou, Stephen Lin
We present a method that enhances the performance of depth-from-defocus (DFD) through the use of shading information. DFD suffers from important limitations - namely coarse shape reconstruction and poor accuracy on textureless surfaces - that can be overcome with the help of shading. We integrate both forms of data within a Bayesian framework that capitalizes on their relative strengths. Shading data, however, is challenging to recover accurately from surfaces that contain texture. To address this issue, we propose an iterative technique that utilizes depth information to improve shading estimation, which in turn is used to elevate depth estimation in the presence of textures. With this approach, we demonstrate improvements over existing DFD techniques, as well as effective shape reconstruction of textureless surfaces.
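To illustrate the kind of objective such a method balances, here is a sketch of a combined defocus, shading, and smoothness energy over a candidate depth map; the Bayesian formulation and the optimizer are not reproduced, and `defocus_pred` is a placeholder for a forward blur model.

```python
# Sketch of a MAP-style energy for shading-constrained depth-from-defocus:
# a defocus data term, a Lambertian shading term, and a smoothness prior.
import numpy as np

def normals_from_depth(z):
    """Unit surface normals from a depth map z (H, W)."""
    gy, gx = np.gradient(z)
    n = np.dstack([-gx, -gy, np.ones_like(z)])
    return n / np.linalg.norm(n, axis=2, keepdims=True)

def energy(z, defocus_obs, defocus_pred, shading_obs, light,
           w_shade=1.0, w_smooth=0.1):
    """z: depth map; defocus_pred(z) -> predicted blur map; light: 3-vector."""
    data = ((defocus_pred(z) - defocus_obs) ** 2).sum()          # defocus data term
    shading = np.clip(normals_from_depth(z) @ light, 0, None)    # Lambertian shading
    shade = w_shade * ((shading - shading_obs) ** 2).sum()       # shading consistency
    gy, gx = np.gradient(z)
    smooth = w_smooth * (gx ** 2 + gy ** 2).sum()                # smoothness prior
    return data + shade + smooth
```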
Citations: 1
Block and Group Regularized Sparse Modeling for Dictionary Learning
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.55
Yu-Tseh Chi, Mohsen Ali, Ajit Rajwade, J. Ho
This paper proposes a dictionary learning framework that combines the proposed block/group (BGSC) or reconstructed block/group (R-BGSC) sparse coding schemes with the novel Intra-block Coherence Suppression Dictionary Learning algorithm. An important and distinguishing feature of the proposed framework is that all dictionary blocks are trained simultaneously with respect to each data group, while the intra-block coherence is explicitly minimized as an important objective. We provide both empirical evidence and heuristic support for this feature, which can be considered a direct consequence of incorporating both the group structure for the input data and the block structure for the dictionary in the learning process. The optimization problems for both the dictionary learning and sparse coding can be solved efficiently using block-gradient descent, and the details of the optimization algorithms are presented. We evaluate the proposed methods on well-known datasets, and favorable comparisons with state-of-the-art dictionary learning methods demonstrate the viability and validity of the proposed framework.
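A heavily simplified sketch of alternating dictionary learning with an intra-block coherence penalty follows; it is not the BGSC/R-BGSC formulation itself, and the block size, step sizes, and single soft-thresholding coder are illustrative.

```python
# Sketch: alternate a soft-threshold sparse-coding step with a gradient step on D
# that also pushes the Gram matrix of each dictionary block toward the identity.
import numpy as np

def coherence_grad(D, block_size):
    """Gradient of sum over blocks of ||D_b^T D_b - I||_F^2 with respect to D."""
    G = np.zeros_like(D)
    for s in range(0, D.shape[1], block_size):
        Db = D[:, s:s + block_size]
        G[:, s:s + block_size] = 4 * Db @ (Db.T @ Db - np.eye(Db.shape[1]))
    return G

def learn_dictionary(X, n_atoms=64, block_size=8, lam=0.1, mu=0.01,
                     lr=0.01, n_iter=100, seed=0):
    """X: (d, n) data columns. Returns a dictionary D with unit-norm atoms."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # sparse coding: one soft-thresholding step on the atom correlations
        A = D.T @ X
        A = np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)
        # dictionary update: reconstruction gradient plus coherence suppression
        grad = (D @ A - X) @ A.T + mu * coherence_grad(D, block_size)
        D -= lr * grad
        D /= np.linalg.norm(D, axis=0) + 1e-12
    return D
```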
Citations: 48
Graph Matching with Anchor Nodes: A Learning Approach
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.374
Nan Hu, R. Rustamov, L. Guibas
In this paper, we consider the weighted graph matching problem with partially disclosed correspondences between a number of anchor nodes. Our construction exploits recently introduced node signatures based on graph Laplacians, namely the Laplacian family signature (LFS) on the nodes and the pairwise heat kernel map on the edges. Without assuming an explicit form of parametric dependence or a distance metric between node signatures, we formulate an optimization problem which incorporates the knowledge of anchor nodes. Solving this problem gives us an optimized proximity measure specific to the graphs under consideration. Using this as a first-order compatibility term, we then set up an integer quadratic program (IQP) to solve for a near-optimal graph matching. Our experiments demonstrate the superior performance of our approach on randomly generated graphs and on two widely-used image sequences, when compared with other existing signature and adjacency matrix based graph matching methods.
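As a first-order illustration only, the sketch below matches nodes by signature distance under hard anchor constraints using the Hungarian algorithm; the learned proximity measure and the pairwise IQP term are omitted, and the function name is illustrative.

```python
# Sketch: match nodes of two graphs by node-signature distance, forcing the
# known anchor correspondences, via a linear assignment on the cost matrix.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_with_anchors(sig1, sig2, anchors):
    """sig1: (n1, d), sig2: (n2, d) node signatures; anchors: {i in G1: j in G2}."""
    cost = np.linalg.norm(sig1[:, None, :] - sig2[None, :, :], axis=2)
    big = cost.max() * 1e3 + 1.0
    for i, j in anchors.items():            # force anchor correspondences
        cost[i, :] = big
        cost[:, j] = big
        cost[i, j] = 0.0
    rows, cols = linear_sum_assignment(cost)
    return dict(zip(rows, cols))
```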
Citations: 30
Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines
Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.68
Roni Mittelman, Honglak Lee, B. Kuipers, S. Savarese
The use of semantic attributes in computer vision problems has been gaining increased popularity in recent years. Attributes provide an intermediate feature representation in between low-level features and the class categories, and offer several attractive properties, among which are improved learning of novel categories based on few examples, as well as allowing for zero-shot learning. However, the major caveat is that learning semantic attributes is a laborious task, requiring a significant amount of time and human intervention to provide labels. In order to address this issue, we propose a weakly supervised approach to learn mid-level features, where the only supervision is provided by the category classes of the training examples. We develop a novel extension of the restricted Boltzmann machine (RBM) with Beta-Bernoulli process priors. Unlike the standard RBM, our model uses the class labels to promote more efficient sharing of information by different categories. This tends to improve the generalization performance. By using semantic attributes for which annotations are available, we show that we can find correspondences between the mid-level features that we learn and the labeled attributes. Therefore, the mid-level features have distinct semantic characterization which is very similar to that given by the semantic attributes, even though their labeling was not used during the training process. Our experimental results in object recognition tasks show significant performance gains, outperforming methods which rely on manually labeled semantic attributes.
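For reference, a sketch of a standard binary RBM trained with one step of contrastive divergence (CD-1) is shown below; the Beta-Bernoulli process prior and the class-label sharing mechanism that distinguish the proposed model are not reproduced.

```python
# Sketch: standard binary RBM with CD-1 updates on full-batch data.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden=128, lr=0.05, n_epochs=10, seed=0):
    """V: (n, d) binary data rows. Returns (W, visible bias, hidden bias)."""
    rng = np.random.default_rng(seed)
    n, d = V.shape
    W = 0.01 * rng.standard_normal((d, n_hidden))
    b, c = np.zeros(d), np.zeros(n_hidden)
    for _ in range(n_epochs):
        ph = sigmoid(V @ W + c)                          # P(h=1 | v) on data
        h = (rng.random(ph.shape) < ph).astype(float)    # sample hidden units
        pv = sigmoid(h @ W.T + b)                        # P(v=1 | h)
        v_neg = (rng.random(pv.shape) < pv).astype(float)
        ph_neg = sigmoid(v_neg @ W + c)                  # hidden probs on reconstruction
        W += lr * (V.T @ ph - v_neg.T @ ph_neg) / n      # CD-1 gradient estimate
        b += lr * (V - v_neg).mean(axis=0)
        c += lr * (ph - ph_neg).mean(axis=0)
    return W, b, c
```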
Citations: 32