An empirical Bayes approach to contextual region classification
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206690
S. Lazebnik, M. Raginsky
This paper presents a nonparametric approach to labeling of local image regions that is inspired by recent developments in information-theoretic denoising. The chief novelty of this approach rests in its ability to derive an unsupervised contextual prior over image classes from unlabeled test data. Labeled training data is needed only to learn a local appearance model for image patches (although additional supervisory information can optionally be incorporated when it is available). Instead of assuming a parametric prior such as a Markov random field for the class labels, the proposed approach uses the empirical Bayes technique of statistical inversion to recover a contextual model directly from the test data, either as a spatially varying or as a globally constant prior distribution over the classes in the image. Results on two challenging datasets convincingly demonstrate that useful contextual information can indeed be learned from unlabeled data.
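As a rough illustration of the globally constant variant described in this abstract, the sketch below fits a class prior to unlabeled test data by EM over appearance likelihoods from a pre-trained local model. The array names and toy likelihood values are assumptions for illustration, not the authors' actual appearance model.

```python
import numpy as np

def em_global_prior(likelihoods, n_iters=100, tol=1e-8):
    """Estimate a globally constant class prior from unlabeled test data.

    likelihoods: (N, C) array with likelihoods[i, c] = p(x_i | class c),
    assumed to come from a pre-trained local appearance model.
    """
    n, c = likelihoods.shape
    prior = np.full(c, 1.0 / c)                   # start from a uniform prior
    prev_ll = -np.inf
    for _ in range(n_iters):
        # E-step: posterior responsibility of each class for each patch.
        joint = likelihoods * prior               # (N, C)
        evidence = joint.sum(axis=1, keepdims=True)
        post = joint / np.maximum(evidence, 1e-300)
        # M-step: the prior maximizing the marginal likelihood of the test data.
        prior = post.mean(axis=0)
        ll = np.log(np.maximum(evidence, 1e-300)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return prior, post

# Toy usage: 3 classes, stand-in likelihoods; MAP labels use the learned prior.
rng = np.random.default_rng(0)
lik = rng.dirichlet(alpha=[0.5, 2.0, 0.8], size=500)
prior, post = em_global_prior(lik)
labels = post.argmax(axis=1)
```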
{"title":"An empirical Bayes approach to contextual region classification","authors":"S. Lazebnik, M. Raginsky","doi":"10.1109/CVPR.2009.5206690","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206690","url":null,"abstract":"This paper presents a nonparametric approach to labeling of local image regions that is inspired by recent developments in information-theoretic denoising. The chief novelty of this approach rests in its ability to derive an unsupervised contextual prior over image classes from unlabeled test data. Labeled training data is needed only to learn a local appearance model for image patches (although additional supervisory information can optionally be incorporated when it is available). Instead of assuming a parametric prior such as a Markov random field for the class labels, the proposed approach uses the empirical Bayes technique of statistical inversion to recover a contextual model directly from the test data, either as a spatially varying or as a globally constant prior distribution over the classes in the image. Results on two challenging datasets convincingly demonstrate that useful contextual information can indeed be learned from unlabeled data.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust unsupervised segmentation of degraded document images with topic models
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206606
Timothy J. Burns, Jason J. Corso
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis by synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.
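The Potts-MRF layout term lends itself to simple iterated-conditional-modes (ICM) inference. Below is a minimal sketch under the assumption that per-patch negative log-likelihoods of each topic have already been computed from the learned codebook; the topic model itself is not shown.

```python
import numpy as np

def potts_icm(unary, beta=1.0, n_sweeps=5):
    """ICM inference on a grid Potts model.

    unary: (H, W, K) per-site negative log-likelihoods of K region topics;
    beta weighs the Potts smoothness term that encourages neighboring
    patches to share a topic, capturing rough document layout.
    """
    h, w, k = unary.shape
    labels = unary.argmin(axis=2)                 # independent MAP init
    for _ in range(n_sweeps):
        changed = 0
        for y in range(h):
            for x in range(w):
                cost = unary[y, x].astype(float)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Potts penalty: +beta where a label disagrees with a neighbor.
                        cost += beta * (np.arange(k) != labels[ny, nx])
                best = int(cost.argmin())
                changed += best != labels[y, x]
                labels[y, x] = best
        if changed == 0:                          # converged to a local optimum
            break
    return labels
```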
{"title":"Robust unsupervised segmentation of degraded document images with topic models","authors":"Timothy J. Burns, Jason J. Corso","doi":"10.1109/CVPR.2009.5206606","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206606","url":null,"abstract":"Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis by synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133820220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A graph-based approach to skin mole matching incorporating template-normalized coordinates
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206725
H. Mirzaalian, G. Hamarneh, Tim K. Lee
Density of moles is a strong predictor of malignant melanoma. Some dermatologists advocate periodic full-body scans for high-risk patients. In current practice, physicians compare images taken at different times to recognize changes. There is an important clinical need to follow changes in the number of moles and in their appearance (size, color, texture, shape) across images from two different times. In this paper, we propose a method for finding corresponding moles in images of a patient's back acquired at different scanning times. First, a template of the human back is defined to calculate the moles' normalized spatial coordinates. Next, matching moles across images is modeled as a graph matching problem, and algebraic relations between nodes and edges in the graphs are incorporated into the matching cost function, which contains terms reflecting proximity regularization, angular agreement between mole pairs, and agreement between the moles' normalized coordinates calculated in the unwarped back template. We propose and discuss alternative approaches for evaluating the goodness of matching. We evaluate our method on a large set of synthetic data (hundreds of pairs) as well as 56 pairs of real dermatological images. Our proposed method compares favorably with the state of the art.
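If the pairwise proximity and angular-agreement terms are dropped, what remains is an assignment problem over the template-normalized coordinates alone, which the Hungarian algorithm solves exactly. A minimal sketch of that reduced problem (the paper's full pairwise cost requires a graph-matching solver instead):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def match_moles(coords_t1, coords_t2, unmatched_cost=0.5):
    """Match moles across two scans by template-normalized coordinates.

    coords_t1: (N, 2) and coords_t2: (M, 2) mole positions expressed in the
    unwarped back template, so they are comparable across the two images.
    Returns index pairs (i, j); pairs farther apart than unmatched_cost are
    treated as appearing/disappearing moles and discarded.
    """
    cost = cdist(coords_t1, coords_t2)            # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)      # optimal one-to-one matching
    keep = cost[rows, cols] < unmatched_cost
    return list(zip(rows[keep], cols[keep]))
```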
{"title":"A graph-based approach to skin mole matching incorporating template-normalized coordinates","authors":"H. Mirzaalian, G. Hamarneh, Tim K. Lee","doi":"10.1109/CVPR.2009.5206725","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206725","url":null,"abstract":"Density of moles is a strong predictor of malignant melanoma. Some dermatologists advocate periodic full-body scan for high-risk patients. In current practice, physicians compare images taken at different time instances to recognize changes. There is an important clinical need to follow changes in the number of moles and their appearance (size, color, texture, shape) in images from two different times. In this paper, we propose a method for finding corresponding moles in patient's skin back images at different scanning times. At first, a template is defined for the human back to calculate the moles' normalized spatial coordinates. Next, matching moles across images is modeled as a graph matching problem and algebraic relations between nodes and edges in the graphs are induced in the matching cost function, which contains terms reflecting proximity regularization, angular agreement between mole pairs, and agreement between the moles' normalized coordinates calculated in the unwarped back template. We propose and discuss alternative approaches for evaluating the goodness of matching. We evaluate our method on a large set of synthetic data (hundreds of pairs) as well as 56 pairs of real dermatological images. Our proposed method compares favorably with the state-of-the-art.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124286214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The geometry of 2D image signals
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206784
Lennart Wietzke, G. Sommer, O. Fleischmann
This paper addresses a fundamental problem of local-phase-based signal processing: the isotropic generalization of the classical 1D analytic signal to two dimensions. The well-known analytic signal enables the analysis of local phase and amplitude information of 1D signals. Local phase, amplitude, and additional orientation information can be extracted by the 2D monogenic signal, with the restriction to the subclass of intrinsically one-dimensional signals. For 2D image signals, the monogenic signal enables the rotationally invariant analysis of lines and edges. In this work we present the 2D analytic signal as a novel generalization of both the analytic signal and the 2D monogenic signal. For 2D image signals, the 2D analytic signal enables the isotropic analysis of lines, edges, corners, and junctions in one unified framework. Furthermore, we show that 2D signals exist per se in a 3D projective subspace of the homogeneous conformal space, which delivers a descriptive geometric interpretation of signals and provides new insights into the relation between geometry and 2D signals.
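For the monogenic-signal baseline that the 2D analytic signal generalizes, local amplitude, phase, and orientation follow from the Riesz transform, conveniently computed in the Fourier domain. A sketch under the assumption that the image has been bandpass filtered beforehand (here only the DC component is removed):

```python
import numpy as np

def monogenic(img):
    """Local amplitude, phase and orientation via the Riesz transform.

    The monogenic signal pairs the image with its two Riesz components;
    the paper's 2D analytic signal extends this to corners and junctions
    with higher-order terms not shown here.
    """
    f = np.fft.fft2(img)
    u = np.fft.fftfreq(img.shape[0])[:, None]
    v = np.fft.fftfreq(img.shape[1])[None, :]
    mag = np.sqrt(u**2 + v**2)
    mag[0, 0] = 1.0                               # avoid division by zero at DC
    r1 = np.real(np.fft.ifft2(f * (-1j * u / mag)))   # first Riesz component
    r2 = np.real(np.fft.ifft2(f * (-1j * v / mag)))   # second Riesz component
    even = img - img.mean()                       # crude stand-in for bandpassing
    amplitude = np.sqrt(even**2 + r1**2 + r2**2)
    phase = np.arctan2(np.hypot(r1, r2), even)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation
```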
{"title":"The geometry of 2D image signals","authors":"Lennart Wietzke, G. Sommer, O. Fleischmann","doi":"10.1109/CVPR.2009.5206784","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206784","url":null,"abstract":"This paper covers a fundamental problem of local phase based signal processing: the isotropic generalization of the classical 1D analytic signal to two dimensions. The well known analytic signal enables the analysis of local phase and amplitude information of 1D signals. Local phase, amplitude and additional orientation information can be extracted by the 2D monogenic signal with the restriction to the subclass of intrinsically one dimensional signals. In case of 2D image signals the monogenic signal enables the rotationally invariant analysis of lines and edges. In this work we present the 2D analytic signal as a novel generalization of both the analytic signal and the 2D monogenic signal. In case of 2D image signals the 2D analytic signal enables the isotropic analysis of lines, edges, corners and junctions in one unified framework. Furthermore, we show that 2D signals exist per se in a 3D projective subspace of the homogeneous conformal space which delivers a descriptive geometric interpretation of signals providing new insights on the relation of geometry and 2D signals.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122197540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Resolution-Invariant Image Representation and its applications
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206679
Jinjun Wang, Shenghuo Zhu, Yihong Gong
We present a resolution-invariant image representation (RIIR) framework in this paper. The RIIR framework includes methods for building a set of multi-resolution bases from training images, estimating the optimal sparse resolution-invariant representation of any image, and reconstructing the missing patches at any resolution level. As the proposed RIIR framework has many potential resolution-enhancement applications, we discuss three novel image magnification applications in this paper. In the first application, we apply the RIIR framework to perform Multi-Scale Image Magnification, where we also introduce a training strategy to build a compact RIIR set. In the second application, the RIIR framework is extended to conduct Continuous Image Scaling, where a new base at any resolution level can be generated on the fly using the existing RIIR set. In the third application, we further apply the RIIR framework to Content-Based Automatic Zooming. The experimental results show that in all these applications, our RIIR-based method outperforms existing methods in various respects.
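The core mechanism, a single code shared across paired resolution-level bases, can be sketched with ridge regression standing in for the sparse coding step. The dictionaries D_lo and D_hi (columns aligned across resolutions) and the absence of an explicit sparsity penalty are simplifying assumptions:

```python
import numpy as np

def code_patch(p_lo, D_lo, lam=0.1):
    """Code a flattened low-resolution patch against the low-res basis.

    Ridge regression (min ||D_lo a - p||^2 + lam ||a||^2) stands in for
    RIIR's sparse resolution-invariant representation.
    """
    g = D_lo.T @ D_lo + lam * np.eye(D_lo.shape[1])
    return np.linalg.solve(g, D_lo.T @ p_lo)

def magnify_patch(p_lo, D_lo, D_hi, lam=0.1):
    """Transfer the code to the paired high-resolution basis to synthesize
    the missing high-resolution patch (returned flattened)."""
    return D_hi @ code_patch(p_lo, D_lo, lam)
```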
{"title":"Resolution-Invariant Image Representation and its applications","authors":"Jinjun Wang, Shenghuo Zhu, Yihong Gong","doi":"10.1109/CVPR.2009.5206679","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206679","url":null,"abstract":"We present a resolution-invariant image representation (RIIR) framework in this paper. The RIIR framework includes the methods of building a set of multi-resolution bases from training images, estimating the optimal sparse resolution-invariant representation of any image, and reconstructing the missing patches of any resolution level. As the proposed RIIR framework has many potential resolution enhancement applications, we discuss three novel image magnification applications in this paper. In the first application, we apply the RIIR framework to perform Multi-Scale Image Magnification where we also introduced a training strategy to built a compact RIIR set. In the second application, the RIIR framework is extended to conduct Continuous Image Scaling where a new base at any resolution level can be generated using existing RIIR set on the fly. In the third application, we further apply the RIIR framework onto Content-Base Automatic Zooming applications. The experimental results show that in all these applications, our RIIR based method outperforms existing methods in various aspects.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124540445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bundling features for large scale partial-duplicate web image search
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206566
Zhong Wu, Qifa Ke, M. Isard, Jian Sun
In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has been shown to greatly improve retrieval precision but at high computational cost. In this paper we present a novel scheme where image features are bundled into local groups. Each group of bundled features becomes much more discriminative than a single feature, and within each group simple and robust geometric constraints can be efficiently enforced. Experiments in Web image search, with a database of more than one million images, show that our scheme achieves a 49% improvement in average precision over the baseline bag-of-words approach. Retrieval performance is comparable to existing full geometric verification approaches while being much less computationally expensive. When combined with full geometric verification we achieve a 77% precision improvement over the baseline bag-of-words approach, and a 24% improvement over full geometric verification alone.
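A much-simplified sketch of scoring one bundle against a candidate: matched visual words add to the score, and violations of their relative x-ordering inside the bundled region are penalized, so no 2D affine transformation needs to be estimated. The dict-of-ranks bundle representation is an assumption for illustration:

```python
def bundle_score(q_bundle, t_bundle, geo_weight=1.0):
    """Score a query bundle against a target bundle.

    Each bundle maps visual-word id -> x-order rank of that feature inside
    the bundled region. The membership term counts common words; the weak
    geometric term counts order inversions among the matched words.
    """
    common = sorted(set(q_bundle) & set(t_bundle), key=q_bundle.get)
    score = float(len(common))                    # shared-word membership term
    ranks = [t_bundle[w] for w in common]         # target ranks in query order
    inversions = sum(ranks[i] > ranks[j]
                     for i in range(len(ranks))
                     for j in range(i + 1, len(ranks)))
    return score - geo_weight * inversions

# Toy usage: three shared words, one out of order -> score 3 - 1 = 2.0.
print(bundle_score({7: 0, 12: 1, 40: 2}, {12: 0, 7: 1, 40: 2, 99: 3}))
```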
{"title":"Bundling features for large scale partial-duplicate web image search","authors":"Zhong Wu, Qifa Ke, M. Isard, Jian Sun","doi":"10.1109/CVPR.2009.5206566","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206566","url":null,"abstract":"In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has been shown to greatly improve retrieval precision but at high computational cost. In this paper we present a novel scheme where image features are bundled into local groups. Each group of bundled features becomes much more discriminative than a single feature, and within each group simple and robust geometric constraints can be efficiently enforced. Experiments in Web image search, with a database of more than one million images, show that our scheme achieves a 49% improvement in average precision over the baseline bag-of-words approach. Retrieval performance is comparable to existing full geometric verification approaches while being much less computationally expensive. When combined with full geometric verification we achieve a 77% precision improvement over the baseline bag-of-words approach, and a 24% improvement over full geometric verification alone.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"421 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132015186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contextual restoration of severely degraded document images
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206601
Jyotirmoy Banerjee, A. Namboodiri, C. V. Jawahar
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work with larger patch sizes allows us to deal with severe degradations including cuts, blobs, merges and vandalized documents. Our approach can also integrate document restoration and super-resolution into a single framework, thus directly generating high quality images from degraded documents. Experimental results show significant improvement in image quality on document images collected from various sources including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach. It works well with document collections such as books, even with severe degradations, and hence is ideally suited for repositories such as digital libraries.
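The script- and font-independence comes from learning the patch codebook from the degraded page itself. A minimal k-means sketch of that step (patch size, sample count, and k are illustrative choices; the contextual MRF restoration built on top of it is similar in spirit to the ICM sketch earlier in this section):

```python
import numpy as np

def learn_codebook(page, psz=12, k=64, n_samples=4000, iters=20, seed=0):
    """k-means patch codebook learned from the degraded document itself,
    so no previously learned prior for a particular script or font is needed."""
    rng = np.random.default_rng(seed)
    h, w = page.shape
    ys = rng.integers(0, h - psz, n_samples)
    xs = rng.integers(0, w - psz, n_samples)
    patches = np.stack([page[y:y + psz, x:x + psz].ravel()
                        for y, x in zip(ys, xs)]).astype(float)
    centroids = patches[rng.choice(len(patches), k, replace=False)].copy()
    for _ in range(iters):
        # Squared distances via the expansion ||p||^2 - 2 p.c + ||c||^2.
        d = ((patches ** 2).sum(1)[:, None]
             - 2.0 * patches @ centroids.T
             + (centroids ** 2).sum(1))
        assign = d.argmin(1)
        for j in range(k):
            members = patches[assign == j]
            if len(members):                      # keep empty clusters unchanged
                centroids[j] = members.mean(0)
    return centroids
```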
{"title":"Contextual restoration of severely degraded document images","authors":"Jyotirmoy Banerjee, A. Namboodiri, C. V. Jawahar","doi":"10.1109/CVPR.2009.5206601","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206601","url":null,"abstract":"We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work with larger patch sizes allows us to deal with severe degradations including cuts, blobs, merges and vandalized documents. Our approach can also integrate document restoration and super-resolution into a single framework, thus directly generating high quality images from degraded documents. Experimental results show significant improvement in image quality on document images collected from various sources including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach. It works well with document collections such as books, even with severe degradations, and hence is ideally suited for repositories such as digital libraries.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134372421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convexity and Bayesian constrained local models
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206751
U. Paquet
The accurate localization of facial features plays a fundamental role in any face recognition pipeline. Constrained local models (CLM) provide an effective approach to localization by coupling ensembles of local patch detectors for non-rigid object alignment. A recent improvement has been made by using generic convex quadratic fitting (CQF), which elegantly addresses the CLM warp update by enforcing convexity of the patch response surfaces. In this paper, CQF is generalized to a Bayesian inference problem, in which it appears as a particular maximum likelihood solution. The Bayesian viewpoint holds many advantages: for example, the task of feature localization can explicitly build on previous face detection stages, and multiple sets of patch responses can be seamlessly incorporated. A second contribution of the paper is an analytic solution to finding convex approximations to patch response surfaces, which removes CQF's reliance on a numeric optimizer. Improvements in feature localization performance are illustrated on the Labeled Faces in the Wild and BioID data sets.
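One analytic route to a convex approximation is moment matching: normalize the (assumed nonnegative) patch response map into a distribution and fit a Gaussian, whose negative log is a convex quadratic by construction. A sketch of that idea, not necessarily the paper's exact derivation:

```python
import numpy as np

def convex_quadratic_fit(response):
    """Analytic convex-quadratic approximation of a patch response surface.

    response: (H, W) nonnegative detector responses. The fitted mean gives
    the landmark update and the inverse covariance the (PSD) curvature, so
    no numeric optimizer is required:
        q(x) ~= 0.5 * (x - mu)^T A (x - mu) + const.
    """
    h, w = response.shape
    p = response / response.sum()                 # normalize to a distribution
    ys, xs = np.mgrid[0:h, 0:w]
    mu = np.array([(p * xs).sum(), (p * ys).sum()])
    dx, dy = xs - mu[0], ys - mu[1]
    cov = np.array([[(p * dx * dx).sum(), (p * dx * dy).sum()],
                    [(p * dx * dy).sum(), (p * dy * dy).sum()]])
    A = np.linalg.inv(cov)
    return mu, A
```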
{"title":"Convexity and Bayesian constrained local models","authors":"U. Paquet","doi":"10.1109/CVPR.2009.5206751","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206751","url":null,"abstract":"The accurate localization of facial features plays a fundamental role in any face recognition pipeline. Constrained local models (CLM) provide an effective approach to localization by coupling ensembles of local patch detectors for non-rigid object alignment. A recent improvement has been made by using generic convex quadratic fitting (CQF), which elegantly addresses the CLM warp update by enforcing convexity of the patch response surfaces. In this paper, CQF is generalized to a Bayesian inference problem, in which it appears as a particular maximum likelihood solution. The Bayesian viewpoint holds many advantages: for example, the task of feature localization can explicitly build on previous face detection stages, and multiple sets of patch responses can be seamlessly incorporated. A second contribution of the paper is an analytic solution to finding convex approximations to patch response surfaces, which removes CQF's reliance on a numeric optimizer. Improvements in feature localization performance are illustrated on the Labeled Faces in the Wild and BioID data sets.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134091010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the set of images modulo viewpoint and contrast changes
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206704
G. Sundaramoorthi, P. Petersen, V. Varadarajan, Stefano Soatto
We consider regions of images that exhibit smooth statistics, and pose the question of characterizing the "essence" of these regions that matters for recognition. Ideally, this would be a statistic (a function of the image) that does not depend on viewpoint and illumination, and yet is sufficient for the task. In this manuscript, we show that such statistics exist. That is, one can compute deterministic functions of the image that contain all the "information" present in the original image, except for the effects of viewpoint and illumination. We also show that such statistics are supported on a "thin" (zero-measure) subset of the image domain, and thus the "information" in an image that is relevant for recognition is sparse. Yet, from this thin set one can reconstruct an image that is equivalent to the original up to a change of viewpoint and local illumination (contrast). Finally, we formalize the notion of "information" an image contains for the purpose of viewpoint- and illumination- invariant tasks, which we call "actionable information" following ideas of J. J. Gibson.
Blind motion deblurring from a single image using sparse approximation
Pub Date : 2009-06-20 | DOI: 10.1109/CVPR.2009.5206743
Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen
Restoring a clear image from a single motion-blurred image caused by camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either remove only simple motion blur or need user interaction to handle more complex cases. In this paper, we present an approach to remove motion blur from a single image by formulating blind deblurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under suitable redundant tight frame systems (a curvelet system for kernels and a framelet system for images). Without requiring any prior information about the blur kernel as input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem. Experiments on both simulated and real images show that our algorithm can effectively remove complex motion blur from natural images.
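The linearized Bregman iteration the authors rely on is simple to state for the generic problem min ||x||_1 subject to Ax = b. The sketch below runs it on a toy compressed-sensing instance rather than the paper's joint kernel/image tight-frame formulation:

```python
import numpy as np

def linearized_bregman(A, b, mu=5.0, delta=None, n_iters=2000, tol=1e-6):
    """Linearized Bregman iteration for min ||x||_1 s.t. Ax = b."""
    n = A.shape[1]
    if delta is None:
        delta = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step: 1 / ||A||^2
    v = np.zeros(n)
    x = np.zeros(n)
    for _ in range(n_iters):
        v += A.T @ (b - A @ x)                    # Bregman/gradient update
        x = delta * np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)  # shrinkage
        if np.linalg.norm(A @ x - b) <= tol * np.linalg.norm(b):
            break
    return x

# Toy usage: recover an 8-sparse vector from 60 random measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 200)) / np.sqrt(60)
x_true = np.zeros(200)
x_true[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
x_hat = linearized_bregman(A, A @ x_true)
```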
{"title":"Blind motion deblurring from a single image using sparse approximation","authors":"Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen","doi":"10.1109/CVPR.2009.5206743","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206743","url":null,"abstract":"Restoring a clear image from a single motion-blurred image due to camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either only remove simple motion blurring, or need user interactions to work on more complex cases. In this paper, we present an approach to remove motion blurring from a single image by formulating the blind blurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under certain suitable redundant tight frame systems (curvelet system for kernels and framelet system for images). Without requiring any prior information of the blur kernel as the input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem. The experiments on both simulated images and real images showed that our algorithm can effectively removing complex motion blurring from nature images.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}