
Latest publications from the 2009 IEEE Conference on Computer Vision and Pattern Recognition

An empirical Bayes approach to contextual region classification
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206690
S. Lazebnik, M. Raginsky
This paper presents a nonparametric approach to labeling of local image regions that is inspired by recent developments in information-theoretic denoising. The chief novelty of this approach rests in its ability to derive an unsupervised contextual prior over image classes from unlabeled test data. Labeled training data is needed only to learn a local appearance model for image patches (although additional supervisory information can optionally be incorporated when it is available). Instead of assuming a parametric prior such as a Markov random field for the class labels, the proposed approach uses the empirical Bayes technique of statistical inversion to recover a contextual model directly from the test data, either as a spatially varying or as a globally constant prior distribution over the classes in the image. Results on two challenging datasets convincingly demonstrate that useful contextual information can indeed be learned from unlabeled data.
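The globally constant contextual prior described above can be illustrated with a small empirical Bayes sketch: given per-patch class likelihoods from a pre-trained local appearance model, a prior over classes is re-estimated from the unlabeled test data itself by EM. This is a minimal illustration, not the authors' implementation; the EM update and the uniform initialization are assumptions.

```python
import numpy as np

def em_global_prior(likelihoods, n_iter=100):
    """Estimate a global class prior from unlabeled test patches by EM.

    likelihoods: (N, C) array with likelihoods[i, c] = p(patch_i | class c)
    from a pre-trained local appearance model.
    Returns (prior, posteriors).
    """
    n, c = likelihoods.shape
    prior = np.full(c, 1.0 / c)                  # start from a uniform prior
    post = likelihoods / likelihoods.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior over classes for every patch under the current prior
        joint = likelihoods * prior
        post = joint / joint.sum(axis=1, keepdims=True)
        # M-step: the empirical prior is the average posterior over the test data
        prior = post.mean(axis=0)
    return prior, post
```

The returned posteriors already combine appearance and the learned context, which is the role the contextual prior plays in the abstract.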
Citations: 27
Robust unsupervised segmentation of degraded document images with topic models
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206606
Timothy J. Burns, Jason J. Corso
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis by synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.
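The discrete patch-based codebook learning step mentioned above can be sketched with plain k-means: patches are quantized to codeword indices, and a topic model can then treat each region as a distribution over those indices. A minimal sketch only, assuming flattened patch vectors; not the authors' pipeline.

```python
import numpy as np

def learn_patch_codebook(patches, k, n_iter=20, seed=0):
    """Quantize image patches into a discrete codebook with plain k-means.

    patches: (N, D) array of flattened patches.
    Returns (codebook, indices), where indices[i] is the visual-word id
    of patch i.
    """
    rng = np.random.default_rng(seed)
    codebook = patches[rng.choice(len(patches), k, replace=False)].astype(float)
    for _ in range(n_iter):
        # assign every patch to its nearest codeword
        d = ((patches[:, None, :] - codebook[None]) ** 2).sum(-1)
        idx = d.argmin(1)
        # move each codeword to the mean of its assigned patches
        for j in range(k):
            if np.any(idx == j):
                codebook[j] = patches[idx == j].mean(0)
    return codebook, idx
```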
Citations: 15
A graph-based approach to skin mole matching incorporating template-normalized coordinates
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206725
H. Mirzaalian, G. Hamarneh, Tim K. Lee
Density of moles is a strong predictor of malignant melanoma. Some dermatologists advocate periodic full-body scan for high-risk patients. In current practice, physicians compare images taken at different time instances to recognize changes. There is an important clinical need to follow changes in the number of moles and their appearance (size, color, texture, shape) in images from two different times. In this paper, we propose a method for finding corresponding moles in patient's skin back images at different scanning times. At first, a template is defined for the human back to calculate the moles' normalized spatial coordinates. Next, matching moles across images is modeled as a graph matching problem and algebraic relations between nodes and edges in the graphs are induced in the matching cost function, which contains terms reflecting proximity regularization, angular agreement between mole pairs, and agreement between the moles' normalized coordinates calculated in the unwarped back template. We propose and discuss alternative approaches for evaluating the goodness of matching. We evaluate our method on a large set of synthetic data (hundreds of pairs) as well as 56 pairs of real dermatological images. Our proposed method compares favorably with the state-of-the-art.
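The core idea — comparing moles in template-normalized coordinates so positions are comparable across two scans — can be sketched with a greedy assignment. This is a deliberate simplification of the paper's graph matching: only the normalized-coordinate term of the cost is used, and the proximity and angular-agreement terms are omitted.

```python
import numpy as np

def match_moles(coords_a, coords_b, alpha=1.0):
    """Greedy mole correspondence between two scans.

    coords_*: (N, 2) template-normalized coordinates. Cost is the squared
    distance in normalized space (a full implementation would add
    proximity-regularization and angular-agreement terms).
    Returns a list of (i, j) index pairs.
    """
    cost = alpha * ((coords_a[:, None, :] - coords_b[None]) ** 2).sum(-1)
    pairs, used = [], set()
    for i in np.argsort(cost.min(1)):        # most confident moles first
        for j in np.argsort(cost[i]):
            if int(j) not in used:           # each target mole used once
                pairs.append((int(i), int(j)))
                used.add(int(j))
                break
    return pairs
```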
Citations: 27
The geometry of 2D image signals
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206784
Lennart Wietzke, G. Sommer, O. Fleischmann
This paper covers a fundamental problem of local phase based signal processing: the isotropic generalization of the classical 1D analytic signal to two dimensions. The well known analytic signal enables the analysis of local phase and amplitude information of 1D signals. Local phase, amplitude and additional orientation information can be extracted by the 2D monogenic signal with the restriction to the subclass of intrinsically one dimensional signals. In case of 2D image signals the monogenic signal enables the rotationally invariant analysis of lines and edges. In this work we present the 2D analytic signal as a novel generalization of both the analytic signal and the 2D monogenic signal. In case of 2D image signals the 2D analytic signal enables the isotropic analysis of lines, edges, corners and junctions in one unified framework. Furthermore, we show that 2D signals exist per se in a 3D projective subspace of the homogeneous conformal space which delivers a descriptive geometric interpretation of signals providing new insights on the relation of geometry and 2D signals.
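The 2D monogenic signal the abstract builds on can be computed with the Riesz transform in the Fourier domain, yielding local amplitude, phase, and orientation. A minimal sketch (no band-pass pre-filtering, which a practical implementation would add, and sign conventions vary):

```python
import numpy as np

def monogenic_signal(img):
    """Local amplitude, phase and orientation via the Riesz transform,
    i.e. the 2D monogenic signal.
    """
    h, w = img.shape
    F = np.fft.fft2(img)
    u = np.fft.fftfreq(w)[None, :]
    v = np.fft.fftfreq(h)[:, None]
    mag = np.sqrt(u ** 2 + v ** 2)
    mag[0, 0] = 1.0                      # avoid division by zero at DC
    # the two Riesz components (spatial-domain, real parts)
    r1 = np.real(np.fft.ifft2(F * (-1j * u / mag)))
    r2 = np.real(np.fft.ifft2(F * (-1j * v / mag)))
    amplitude = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.hypot(r1, r2), img)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation
```

The 2D analytic signal proposed in the paper generalizes exactly this construction to corners and junctions.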
Citations: 28
Resolution-Invariant Image Representation and its applications
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206679
Jinjun Wang, Shenghuo Zhu, Yihong Gong
We present a resolution-invariant image representation (RIIR) framework in this paper. The RIIR framework includes the methods of building a set of multi-resolution bases from training images, estimating the optimal sparse resolution-invariant representation of any image, and reconstructing the missing patches of any resolution level. As the proposed RIIR framework has many potential resolution enhancement applications, we discuss three novel image magnification applications in this paper. In the first application, we apply the RIIR framework to perform Multi-Scale Image Magnification, where we also introduce a training strategy to build a compact RIIR set. In the second application, the RIIR framework is extended to conduct Continuous Image Scaling, where a new base at any resolution level can be generated using the existing RIIR set on the fly. In the third application, we further apply the RIIR framework to Content-Based Automatic Zooming applications. The experimental results show that in all these applications, our RIIR based method outperforms existing methods in various aspects.
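The "optimal sparse representation over a set of bases" step can be sketched with orthogonal matching pursuit: a patch is approximated with a few atoms of a dictionary built from the multi-resolution bases. OMP is an assumed stand-in here, not necessarily the solver used in the paper.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with at most k atoms
    of dictionary D (columns assumed unit-norm).
    Returns the sparse coefficient vector.
    """
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit all selected atoms jointly (the "orthogonal" step)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef
```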
Citations: 8
Bundling features for large scale partial-duplicate web image search
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206566
Zhong Wu, Qifa Ke, M. Isard, Jian Sun
In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has been shown to greatly improve retrieval precision but at high computational cost. In this paper we present a novel scheme where image features are bundled into local groups. Each group of bundled features becomes much more discriminative than a single feature, and within each group simple and robust geometric constraints can be efficiently enforced. Experiments in Web image search, with a database of more than one million images, show that our scheme achieves a 49% improvement in average precision over the baseline bag-of-words approach. Retrieval performance is comparable to existing full geometric verification approaches while being much less computationally expensive. When combined with full geometric verification we achieve a 77% precision improvement over the baseline bag-of-words approach, and a 24% improvement over full geometric verification alone.
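The bundling idea — scoring shared visual words together with a simple, robust geometric constraint inside each group — can be sketched with a toy scoring function. The `(word_id, x_position)` tuples and the order-consistency bonus are illustrative assumptions, not the paper's indexing scheme.

```python
def bundle_score(query_bundle, cand_bundle):
    """Score one bundled-feature match: the number of shared visual
    words, plus a bonus for every pair of shared words whose relative
    x-order is preserved (a weak geometric constraint).

    Each bundle is a list of (visual_word_id, x_position) tuples.
    """
    q = {w: x for w, x in query_bundle}
    # x-positions of the shared words in (query, candidate) order
    common = [(q[w], x) for w, x in cand_bundle if w in q]
    score = len(common)
    for i in range(len(common)):
        for j in range(i + 1, len(common)):
            (qx1, cx1), (qx2, cx2) = common[i], common[j]
            if (qx1 - qx2) * (cx1 - cx2) > 0:   # same relative order
                score += 1
    return score
```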
Citations: 460
Contextual restoration of severely degraded document images
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206601
Jyotirmoy Banerjee, A. Namboodiri, C. V. Jawahar
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work with larger patch sizes allows us to deal with severe degradations including cuts, blobs, merges and vandalized documents. Our approach can also integrate document restoration and super-resolution into a single framework, thus directly generating high quality images from degraded documents. Experimental results show significant improvement in image quality on document images collected from various sources including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach. It works well with document collections such as books, even with severe degradations, and hence is ideally suited for repositories such as digital libraries.
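The MRF context model can be illustrated at its simplest with iterated conditional modes on a binary image: each pixel takes the label that best agrees with its observation and its 4-neighbourhood. ICM, the binary labels, and the squared-error data term are stand-ins for exposition; the paper's model works on large patches learned from the degraded document itself.

```python
import numpy as np

def icm_restore(noisy, beta=2.0, n_sweeps=5):
    """Iterated conditional modes under a Potts smoothness prior.

    noisy: 2D integer array with values in {0, 1}.
    beta weights the penalty for disagreeing with a neighbour.
    """
    labels = noisy.copy()
    h, w = labels.shape
    for _ in range(n_sweeps):
        for y in range(h):
            for x in range(w):
                best, best_cost = labels[y, x], np.inf
                for lab in (0, 1):
                    cost = (lab - noisy[y, x]) ** 2      # data term
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            cost += beta * (lab != labels[ny, nx])
                    if cost < best_cost:
                        best, best_cost = lab, cost
                labels[y, x] = best
    return labels
```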
Citations: 55
Convexity and Bayesian constrained local models
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206751
U. Paquet
The accurate localization of facial features plays a fundamental role in any face recognition pipeline. Constrained local models (CLM) provide an effective approach to localization by coupling ensembles of local patch detectors for non-rigid object alignment. A recent improvement has been made by using generic convex quadratic fitting (CQF), which elegantly addresses the CLM warp update by enforcing convexity of the patch response surfaces. In this paper, CQF is generalized to a Bayesian inference problem, in which it appears as a particular maximum likelihood solution. The Bayesian viewpoint holds many advantages: for example, the task of feature localization can explicitly build on previous face detection stages, and multiple sets of patch responses can be seamlessly incorporated. A second contribution of the paper is an analytic solution to finding convex approximations to patch response surfaces, which removes CQF's reliance on a numeric optimizer. Improvements in feature localization performance are illustrated on the Labeled Faces in the Wild and BioID data sets.
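The quadratic-fitting step behind CQF can be sketched directly: fit z = ax² + by² + cxy + dx + ey + f to a patch-response (cost) map by least squares and read off the stationary point as the sub-pixel landmark update. A sketch only: CQF enforces convexity by construction, while here it is merely asserted after the fit.

```python
import numpy as np

def quadratic_fit_peak(response):
    """Least-squares quadratic fit to a 2D response-cost map.

    Returns the (x, y) minimizer of the fitted surface.
    """
    h, w = response.shape
    ys, xs = np.mgrid[0:h, 0:w]
    X = np.column_stack([xs.ravel() ** 2, ys.ravel() ** 2,
                         (xs * ys).ravel(), xs.ravel(), ys.ravel(),
                         np.ones(h * w)])
    a, b, c, d, e, f = np.linalg.lstsq(X, response.ravel(), rcond=None)[0]
    H = np.array([[2 * a, c], [c, 2 * b]])   # Hessian of the fitted surface
    assert np.all(np.linalg.eigvals(H) > 0), "fitted surface is not convex"
    x0, y0 = np.linalg.solve(H, [-d, -e])    # stationary point
    return x0, y0
```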
Citations: 28
On the set of images modulo viewpoint and contrast changes
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206704
G. Sundaramoorthi, P. Petersen, V. Varadarajan, Stefano Soatto
We consider regions of images that exhibit smooth statistics, and pose the question of characterizing the "essence" of these regions that matters for recognition. Ideally, this would be a statistic (a function of the image) that does not depend on viewpoint and illumination, and yet is sufficient for the task. In this manuscript, we show that such statistics exist. That is, one can compute deterministic functions of the image that contain all the "information" present in the original image, except for the effects of viewpoint and illumination. We also show that such statistics are supported on a "thin" (zero-measure) subset of the image domain, and thus the "information" in an image that is relevant for recognition is sparse. Yet, from this thin set one can reconstruct an image that is equivalent to the original up to a change of viewpoint and local illumination (contrast). Finally, we formalize the notion of "information" an image contains for the purpose of viewpoint- and illumination- invariant tasks, which we call "actionable information" following ideas of J. J. Gibson.
Citations: 57
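As a loose, hedged illustration of the kind of contrast-invariant statistic the abstract discusses (not the paper's actual construction, which is supported on a zero-measure subset of the image domain): the orientation of the image gradient is unchanged by any smooth, strictly increasing contrast map h, since the gradient becomes h'(I)·∇I with h'(I) > 0. The toy image and the contrast map below are our own illustrative choices; an affine map is used so that the invariance holds exactly even for finite-difference gradients.

```python
import numpy as np

def gradient_orientation(img):
    # Direction of the image gradient at each pixel. Under a smooth,
    # strictly increasing contrast change h, the gradient becomes
    # h'(I) * grad(I) with h'(I) > 0, so its direction is unchanged.
    gy, gx = np.gradient(img.astype(float))
    return np.arctan2(gy, gx)

# A toy image with smoothly varying intensities.
xx, yy = np.meshgrid(np.linspace(0.0, 1.0, 32), np.linspace(0.0, 1.0, 32))
img = np.sin(4.0 * xx) + np.cos(3.0 * yy)

# An affine contrast change (gain and offset). Any strictly increasing
# map preserves gradient direction in the continuum; an affine one keeps
# the invariance exact under finite differences, which keeps this demo robust.
img2 = 2.0 * img + 5.0

gy, gx = np.gradient(img)
mask = np.hypot(gx, gy) > 1e-6          # ignore pixels with no gradient

assert np.allclose(gradient_orientation(img)[mask],
                   gradient_orientation(img2)[mask])
```

Note that gradient orientation discards gradient magnitude, which is exactly the part of the local signal that a contrast change rescales; this is the general flavor of the invariants the paper studies, though its construction is far more complete.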
Blind motion deblurring from a single image using sparse approximation 利用稀疏逼近对单幅图像进行盲运动去模糊
Pub Date : 2009-06-20 DOI: 10.1109/CVPR.2009.5206743
Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen
Restoring a clear image from a single motion-blurred image due to camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either remove only simple motion blur or require user interaction to handle more complex cases. In this paper, we present an approach to removing motion blur from a single image by formulating blind deblurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under suitable redundant tight frame systems (a curvelet system for kernels and a framelet system for images). Without requiring any prior information about the blur kernel as input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm, linearized Bregman iteration, to efficiently solve the proposed minimization problem. Experiments on both simulated and real images show that our algorithm can effectively remove complex motion blur from natural images.
从由相机抖动引起的单一运动模糊图像中恢复清晰图像一直是数字成像领域的难题。现有的盲去模糊技术要么只能去除简单的运动模糊,要么需要用户交互才能处理更复杂的情况。本文提出了一种消除单幅图像运动模糊的方法,该方法将盲模糊作为一种新的联合优化问题,在适当的冗余紧帧系统(对核的曲线系统和对图像的框架系统)下,使模糊核的稀疏性和清晰图像的稀疏性同时最大化。在不需要任何模糊核的先验信息作为输入的情况下,我们提出的方法能够从给定的模糊图像中恢复高质量的图像。此外,在紧框架系统下,新的稀疏性约束使得线性化布雷格曼迭代的快速算法能够有效地解决所提出的最小化问题。在模拟图像和真实图像上的实验表明,该算法可以有效地去除自然图像中的复杂运动模糊。
{"title":"Blind motion deblurring from a single image using sparse approximation","authors":"Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen","doi":"10.1109/CVPR.2009.5206743","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206743","url":null,"abstract":"Restoring a clear image from a single motion-blurred image due to camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either only remove simple motion blurring, or need user interactions to work on more complex cases. In this paper, we present an approach to remove motion blurring from a single image by formulating the blind blurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under certain suitable redundant tight frame systems (curvelet system for kernels and framelet system for images). Without requiring any prior information of the blur kernel as the input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem. The experiments on both simulated images and real images showed that our algorithm can effectively remove complex motion blur from natural images.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 317
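The abstract names linearized Bregman iteration as the fast solver for its tight-frame sparsity problem. The following is a hedged sketch of that iteration in its simplest setting — generic ℓ1 sparse recovery with a random sensing matrix, not the paper's curvelet/framelet deblurring model. The update rule follows the form popularized by Cai, Osher, and Shen; the toy problem size and the values of mu, delta, and the iteration count are illustrative choices, not the paper's.

```python
import numpy as np

def shrink(x, mu):
    # Soft-thresholding: the proximal map of mu * ||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

def linearized_bregman(A, b, mu, delta, iters):
    # Linearized Bregman iteration for
    #   min  mu*||u||_1 + ||u||_2^2 / (2*delta)   s.t.  A u = b,
    # which for mu large enough coincides with the basis-pursuit solution.
    # The step size delta must be small enough relative to ||A A^T||_2.
    v = np.zeros(A.shape[1])
    u = np.zeros(A.shape[1])
    for _ in range(iters):
        u = delta * shrink(v, mu)       # thresholded dual variable
        v += A.T @ (b - A @ u)          # accumulate the residual correlation
    return u

# Toy compressed-sensing instance: recover a 5-sparse sign vector
# from 64 random measurements of a length-128 signal.
rng = np.random.default_rng(0)
n, m, k = 128, 64, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.choice([-1.0, 1.0], size=k)
b = A @ x_true

delta = 0.9 / np.linalg.norm(A, 2) ** 2   # step size inside the convergence range
u = linearized_bregman(A, b, mu=60.0, delta=delta, iters=20000)
```

Each iteration costs only two matrix-vector products plus a componentwise shrinkage, which is what makes the method attractive for the large tight-frame systems in the paper; there, the dense matrix products would be replaced by fast frame transforms.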
Journal: 2009 IEEE Conference on Computer Vision and Pattern Recognition