
Latest publications: 2013 IEEE International Conference on Computer Vision

Incorporating Cloud Distribution in Sky Representation
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.267
Kuan-Chuan Peng, Tsuhan Chen
Most sky models only describe the cloudiness of the overall sky by a single category or parameter such as sky index, which does not account for the distribution of the clouds across the sky. To capture variable cloudiness, we extend the concept of sky index to a random field indicating the level of cloudiness of each sky pixel in our proposed sky representation based on the Igawa sky model. We formulate the problem of solving the sky index of every sky pixel as a labeling problem, for which an approximate solution can be found efficiently. Experimental results show that, compared to the uniform sky index model, our proposed sky model has better expressiveness, greater stability with respect to variation in camera parameters, and more accurate geo-location estimation in outdoor images. Potential applications of our proposed sky model include sky image rendering, where sky images can be generated with an arbitrary cloud distribution at any time and any location, which was previously impossible with traditional sky models.
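For readers curious how the per-pixel labeling step might look in practice, here is a minimal sketch (not the authors' implementation) that assigns a discrete sky-index level to each sky pixel with iterated conditional modes; the unary costs, the Potts smoothness term, and the weight `lam` are all assumptions:

```python
import numpy as np

def icm_sky_index(unary, lam=0.5, n_iters=10):
    """Label each sky pixel with one of K discrete cloudiness levels by
    iterated conditional modes over a 4-connected grid.

    unary : (H, W, K) per-pixel cost of each sky-index level
            (hypothetical costs, e.g. from locally fitting the Igawa model).
    lam   : Potts smoothness weight (assumed value).
    """
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)                  # independent initialisation
    for _ in range(n_iters):
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].astype(float)
                # Potts penalty against the 4-neighbourhood
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        cost += lam * (np.arange(K) != labels[ny, nx])
                labels[y, x] = cost.argmin()
    return labels
```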
Citations: 5
Event Detection in Complex Scenes Using Interval Temporal Constraints
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.395
Yifan Zhang, Q. Ji, Hanqing Lu
In complex scenes with multiple atomic events happening sequentially or in parallel, detecting each individual event separately may not always yield robust and reliable results. It is essential to detect them in a holistic way that incorporates the causality and temporal dependency among them, compensating for the limitations of current computer vision techniques. In this paper, we propose an interval temporal constrained dynamic Bayesian network to extend Allen's interval algebra network (IAN) [2] from a deterministic static model to a probabilistic dynamic system, which can not only capture complex interval temporal relationships, but also model the evolution dynamics and handle the uncertainty of noisy visual observations. In the model, the topology of the IAN on each time slice and the interlinks between time slices are discovered by an advanced structure learning method. The duration of an event and the unsynchronized time lags between two correlated event intervals are captured by a duration model, so that we can better determine the temporal boundaries of events. Empirical results on two real-world datasets show the power of the proposed interval temporal constrained model.
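The interval relationships the model builds on come from Allen's interval algebra, which distinguishes thirteen qualitative relations between two intervals. A self-contained helper for classifying a pair of observed event intervals (independent of the paper's DBN machinery) might look like:

```python
def allen_relation(a, b):
    """Return the Allen interval-algebra relation between intervals
    a = (a_start, a_end) and b = (b_start, b_end)."""
    (as_, ae), (bs, be) = a, b
    if ae < bs: return "before"
    if be < as_: return "after"
    if ae == bs: return "meets"
    if be == as_: return "met-by"
    if as_ == bs and ae == be: return "equals"
    if as_ == bs: return "starts" if ae < be else "started-by"
    if ae == be: return "finishes" if as_ > bs else "finished-by"
    if bs < as_ and ae < be: return "during"
    if as_ < bs and be < ae: return "contains"
    # remaining cases: partial overlap in one direction or the other
    return "overlaps" if as_ < bs else "overlapped-by"

# e.g. allen_relation((0, 5), (3, 8)) -> "overlaps"
```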
Citations: 5
Latent Space Sparse Subspace Clustering
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.35
Vishal M. Patel, H. V. Nguyen, René Vidal
We propose a novel algorithm called Latent Space Sparse Subspace Clustering for simultaneous dimensionality reduction and clustering of data lying in a union of subspaces. Specifically, we describe a method that learns the projection of data and finds the sparse coefficients in the low-dimensional latent space. Cluster labels are then assigned by applying spectral clustering to a similarity matrix built from these sparse coefficients. An efficient optimization method is proposed, and its non-linear extensions based on kernel methods are presented. One of the main advantages of our method is that it is computationally efficient, as the sparse coefficients are found in the low-dimensional latent space. Various experiments show that the proposed method performs better than competing state-of-the-art subspace clustering methods.
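The sparse-coefficient and spectral-clustering steps can be illustrated with standard tools. The sketch below runs the classical sparse subspace clustering pipeline on raw data with a Lasso self-expression step; the latent projection the paper learns is omitted, and `alpha` is an assumed regularization weight:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def sparse_subspace_cluster(X, n_clusters, alpha=0.01):
    """Sparse self-expression followed by spectral clustering.

    X : (n_samples, n_features) data matrix.
    """
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        # express point i as a sparse combination of the other points
        mask = np.arange(n) != i
        lasso = Lasso(alpha=alpha, max_iter=5000)
        lasso.fit(X[mask].T, X[i])
        C[i, mask] = lasso.coef_
    W = np.abs(C) + np.abs(C).T          # symmetric affinity from coefficients
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)
```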
Citations: 205
Estimating the Material Properties of Fabric from Video
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.455
K. Bouman, Bei Xiao, P. Battaglia, W. Freeman
Passively estimating the intrinsic material properties of deformable objects moving in a natural environment is essential for scene understanding. We present a framework to automatically analyze videos of fabrics moving under various unknown wind forces, and recover two key material properties of the fabric: stiffness and area weight. We extend features previously developed for compactly representing static image textures so that they describe video textures, such as fabric motion. A discriminatively trained regression model is then used to predict the physical properties of fabric from these features. The success of our model is demonstrated on a new, publicly available database of fabric videos with corresponding measured ground-truth material properties. We show that our predictions are well correlated with ground-truth measurements of stiffness and density for the fabrics. Our contributions include: (a) a database that can be used for training and testing algorithms that passively predict fabric properties from video, (b) an algorithm for predicting the material properties of fabric from a video, and (c) a perceptual study of humans' ability to estimate the material properties of fabric from videos and images.
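As a toy illustration of this kind of pipeline, one could regress a material property from crude video-texture statistics. The features below are simple stand-ins for the paper's richer descriptors, and the `videos` and `stiffness_labels` names in the usage comment are hypothetical:

```python
import numpy as np
from sklearn.svm import SVR

def motion_texture_features(frames):
    """Crude video-texture statistics: temporal statistics of frame
    differences (the paper's actual features are much richer).

    frames : (T, H, W) grayscale video as a float array.
    """
    diffs = np.abs(np.diff(frames, axis=0))
    return np.array([diffs.mean(), diffs.std(),
                     np.percentile(diffs, 90), diffs.max()])

# Hypothetical training setup: videos paired with measured stiffness labels.
# X = np.stack([motion_texture_features(v) for v in videos])
# model = SVR(kernel="rbf").fit(X, stiffness_labels)
# predicted = model.predict(motion_texture_features(new_video)[None, :])
```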
Citations: 88
Event Recognition in Photo Collections with a Stopwatch HMM
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.151
Lukas Bossard, M. Guillaumin, L. Gool
The task of recognizing events in photo collections is central to automatically organizing images. It is also very challenging, because of the ambiguity of photos across different event classes and because many photos do not convey enough relevant information. Unfortunately, the field still lacks standard evaluation data sets that allow comparison of different approaches. In this paper, we introduce and release a novel data set of personal photo collections containing more than 61,000 images in 807 collections, annotated with 14 diverse social event classes. Casting collections as sequential data, we build upon recent state-of-the-art work in event recognition in videos to propose a latent sub-event approach for event recognition in photo collections. However, photos in collections are sparsely sampled over time and come in bursts, which reveals how important specific moments are to the photographer. Thus, we adapt a discriminative hidden Markov model to allow the transitions between states to be a function of the time gap between consecutive images, which we coin the Stopwatch Hidden Markov Model (SHMM). In our experiments, we show that our proposed model outperforms approaches based only on feature pooling or a classical hidden Markov model. With an average accuracy of 56%, we also highlight the difficulty of the data set and the need for future advances in event recognition in photo collections.
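The core idea of the SHMM, transitions that depend on the time gap between consecutive photos, can be sketched as a forward pass with a gap-modulated transition matrix. The exponential-decay parameterisation below is an assumption made for illustration, not the paper's exact model:

```python
import numpy as np

def stopwatch_forward(obs_lik, gaps, A0, rate, pi):
    """Forward pass of an HMM whose transition matrix depends on the time
    gap between consecutive photos.

    obs_lik : (T, K) per-photo likelihoods under each of K sub-events.
    gaps    : (T-1,) time gaps between consecutive photos.
    A0      : (K, K) row-stochastic base transition matrix.
    rate    : assumed decay rate; self-transitions shrink as the gap grows.
    pi      : (K,) initial state distribution.
    """
    T, K = obs_lik.shape
    alpha = pi * obs_lik[0]
    alpha /= alpha.sum()
    for t in range(1, T):
        stay = np.exp(-rate * gaps[t - 1])           # P(stay) shrinks with gap
        A = stay * np.eye(K) + (1 - stay) * A0       # gap-dependent transitions
        A /= A.sum(axis=1, keepdims=True)
        alpha = (alpha @ A) * obs_lik[t]
        alpha /= alpha.sum()                         # filtered state posterior
    return alpha
```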
Citations: 49
Single-Patch Low-Rank Prior for Non-pointwise Impulse Noise Removal
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.137
Ruixuan Wang, E. Trucco
This paper introduces a 'low-rank prior' for small oriented noise-free image patches: considering an oriented patch as a matrix, a low-rank matrix approximation is enough to preserve the texture details in the properly oriented patch. Based on this prior, we propose a single-patch method within a generalized joint low-rank and sparse matrix recovery framework to simultaneously detect and remove non-pointwise random-valued impulse noise (e.g., very small blobs). A weighting matrix is incorporated in the framework to encode an initial estimate of the spatial noise distribution. An accelerated proximal gradient method is adapted to estimate the optimal noise-free image patches. Experiments show the effectiveness of our framework in removing non-pointwise random-valued impulse noise.
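The prior itself is easy to visualize with a truncated SVD of an oriented patch. The sketch below only illustrates the low-rank approximation; the paper's actual method solves a joint low-rank and sparse recovery problem:

```python
import numpy as np

def low_rank_patch(patch, rank):
    """Truncated-SVD approximation of an oriented patch: for a properly
    oriented noise-free patch, a few singular components already preserve
    the texture details."""
    U, s, Vt = np.linalg.svd(patch, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]
```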
Citations: 5
Action and Event Recognition with Fisher Vectors on a Compact Feature Set
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.228
Dan Oneaţă, J. Verbeek, C. Schmid
Action recognition in uncontrolled video is an important and challenging computer vision problem. Recent progress in this area is due to new local features and models that capture spatio-temporal structure between local features, or human-object interactions. Instead of working towards more complex models, we focus on the low-level features and their encoding. We evaluate the use of Fisher vectors as an alternative to bag-of-words histograms to aggregate a small set of state-of-the-art low-level descriptors, in combination with linear classifiers. We present a large and varied set of evaluations, considering (i) classification of short actions in five datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that for basic action recognition and localization, MBH features alone are enough for state-of-the-art performance. For complex events we find that SIFT and MFCC features provide complementary cues. On all three problems we obtain state-of-the-art results, while using fewer features and less complex models.
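Fisher vector encoding itself is standard and can be sketched with a diagonal-covariance GMM: the encoding stacks normalized gradients with respect to the GMM means and variances. This follows the common improved-FV formulation rather than the paper's exact pipeline:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(descriptors, gmm):
    """Improved-Fisher-vector style encoding of a set of local descriptors.

    descriptors : (N, D) local descriptors from one video clip.
    gmm         : GaussianMixture fit with covariance_type="diag".
    """
    N, D = descriptors.shape
    q = gmm.predict_proba(descriptors)                      # (N, K) soft assignments
    mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_
    diff = (descriptors[:, None, :] - mu) / np.sqrt(var)    # (N, K, D)
    g_mu = (q[:, :, None] * diff).sum(0) / (N * np.sqrt(w)[:, None])
    g_var = (q[:, :, None] * (diff**2 - 1)).sum(0) / (N * np.sqrt(2 * w)[:, None])
    fv = np.hstack([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                  # power normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)                # L2 normalisation
```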
Citations: 423
Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.420
Chenglong Bao, Jian-Feng Cai, Hui Ji
In recent years, how to learn a dictionary from input images for sparse modelling has been a very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. They often require solving minimization problems that are very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposes a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than over-complete-dictionary-based learning methods.
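A key reason orthogonal dictionaries are fast is that both alternating steps have closed forms: sparse coding reduces to hard thresholding of D^T Y, and the dictionary update is an orthogonal Procrustes problem solved by one SVD. The sketch below illustrates this idea, not the paper's exact algorithm:

```python
import numpy as np

def orthogonal_dict_learning(Y, sparsity, n_iters=20, seed=0):
    """Alternate hard-thresholding sparse coding and an orthogonal-Procrustes
    dictionary update to learn an orthogonal dictionary D (D^T D = I).

    Y : (d, n) matrix of vectorised image patches.
    """
    d, n = Y.shape
    rng = np.random.default_rng(seed)
    D, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal init
    for _ in range(n_iters):
        # Sparse coding: with D orthogonal, the best s-sparse code is
        # hard thresholding of D^T Y (keep s largest entries per column).
        C = D.T @ Y
        idx = np.argsort(np.abs(C), axis=0)[:-sparsity]
        np.put_along_axis(C, idx, 0.0, axis=0)
        # Dictionary update: min_D ||Y - D C||_F s.t. D^T D = I is solved
        # by the SVD of Y C^T (orthogonal Procrustes).
        U, _, Vt = np.linalg.svd(Y @ C.T)
        D = U @ Vt
    return D, C
```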
Citations: 92
A Unified Video Segmentation Benchmark: Annotation, Metrics and Analysis
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.438
Fabio Galasso, N. Nagaraja, Tatiana Jimenez Cardenas, T. Brox, B. Schiele
Video segmentation research is currently limited by the lack of a benchmark dataset that covers the large variety of subproblems appearing in video segmentation and that is large enough to avoid overfitting. Consequently, there is little analysis of video segmentation that generalizes across subtasks, and it is not yet clear which information video segmentation should leverage from the still frames, as previously studied in image segmentation, and how to combine it with video-specific information such as temporal volume, motion and occlusion. In this work we provide such an analysis based on annotations of a large video dataset, where each video is manually segmented by multiple persons. Moreover, we introduce a new volume-based metric that includes the important aspect of temporal consistency, that can deal with segmentation hierarchies, and that reflects the tradeoff between over-segmentation and segmentation accuracy.
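To see what "volume-based" means in contrast to per-frame evaluation, consider a toy score that matches each machine segment to its best-overlapping human segment over the whole space-time volume. This is only an illustration of volumetric scoring, not the benchmark's actual metric:

```python
import numpy as np

def volumetric_coverage(seg, gt):
    """Toy volume-based score: each machine segment is scored by its best
    3D IoU against any ground-truth segment, weighted by segment volume.

    seg, gt : (T, H, W) integer label volumes (machine / human annotation).
    """
    score = 0.0
    for s in np.unique(seg):
        mask = seg == s
        gt_ids, counts = np.unique(gt[mask], return_counts=True)
        best = 0.0
        for g, c in zip(gt_ids, counts):
            union = mask.sum() + (gt == g).sum() - c
            best = max(best, c / union)              # 3D IoU with segment g
        score += mask.sum() * best
    return score / seg.size
```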
Citations: 168
Class-Specific Simplex-Latent Dirichlet Allocation for Image Classification
Pub Date : 2013-12-01 DOI: 10.1109/ICCV.2013.332
Mandar Dixit, Nikhil Rasiwasia, N. Vasconcelos
An extension of the latent Dirichlet allocation (LDA), denoted class-specific-simplex LDA (css-LDA), is proposed for image classification. An analysis of the supervised LDA models currently used for this task shows that the impact of class information on the topics discovered by these models is generally very weak. This implies that the discovered topics are driven by general image regularities rather than the semantic regularities of interest for classification. To address this, we introduce a model that induces supervision in topic discovery, while retaining the original flexibility of LDA to account for unanticipated structures of interest. The proposed css-LDA is an LDA model with class supervision at the level of image features. In css-LDA, topics are discovered per class, i.e. a single set of topics shared across classes is replaced by multiple class-specific topic sets. This model can be used for generative classification using the Bayes decision rule, or even extended to discriminative classification with support vector machines (SVMs). A css-LDA model can endow an image with a vector of class- and topic-specific count statistics similar to the bag-of-words (BoW) histogram. SVM-based discriminants can be learned for classes in the space of these histograms. The effectiveness of the css-LDA model in both generative and discriminative classification frameworks is demonstrated through an extensive experimental evaluation involving multiple benchmark datasets, where it is shown to outperform all existing LDA-based image classification approaches.
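The class-specific-topics idea can be loosely approximated with off-the-shelf tools: fit one topic model per class, describe each image by its topic proportions under all class models, and classify with a linear SVM. The sketch below is an analogue of that pipeline, not the paper's actual inference:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC

def css_lda_like(train_bow, train_y, test_bow, n_topics=10):
    """Per-class topic models + SVM on concatenated topic proportions.

    train_bow, test_bow : (n, V) bag-of-visual-words count matrices.
    train_y             : (n,) class labels for the training images.
    """
    models = []
    for c in np.unique(train_y):
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        lda.fit(train_bow[train_y == c])             # class-specific topic set
        models.append(lda)
    # describe each image by its topic proportions under every class model
    encode = lambda X: np.hstack([m.transform(X) for m in models])
    clf = LinearSVC().fit(encode(train_bow), train_y)
    return clf.predict(encode(test_bow))
```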
Citations: 3