Classification with invariant scattering representations
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970362
Joan Bruna, S. Mallat
A scattering transform defines a signal representation which is invariant to translations and Lipschitz continuous with respect to deformations. It is implemented with a non-linear convolution network that iterates over wavelet and modulus operators. Lipschitz continuity locally linearizes deformations. Complex classes of signals and textures can be modeled with low-dimensional affine spaces, computed with a PCA in the scattering domain. Classification is performed with a penalized model selection. State-of-the-art results are obtained for handwritten digit recognition over small training sets, and for texture classification.
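As a rough illustration of the wavelet-modulus iteration, here is a minimal 1-D sketch in Python/NumPy. The filter bank, function names, and parameters are our own illustrative choices: a real scattering implementation uses a proper wavelet frame and keeps only frequency-decreasing paths.

```python
import numpy as np

def gabor_filters(n, num_scales):
    """Crude bank of analytic band-pass filters in the Fourier domain,
    one per octave (a stand-in for a proper wavelet frame)."""
    freqs = np.fft.fftfreq(n)
    bank = []
    for j in range(num_scales):
        center = 0.25 / 2 ** j            # center frequency halves per octave
        sigma = center / 2.0
        h = np.exp(-((freqs - center) ** 2) / (2 * sigma ** 2))
        h[freqs < 0] = 0.0                # analytic: positive frequencies only
        bank.append(h)
    return bank

def scattering(x, num_scales=4, depth=2):
    """Translation-invariant coefficients from iterated |wavelet| layers."""
    bank = gabor_filters(len(x), num_scales)
    layers, coeffs = [np.asarray(x, float)], []
    for _ in range(depth):
        next_layer = []
        for u in layers:
            U = np.fft.fft(u)
            for h in bank:
                next_layer.append(np.abs(np.fft.ifft(U * h)))  # wavelet modulus
        coeffs.extend(v.mean() for v in next_layer)  # averaging gives invariance
        layers = next_layer
    return np.array(coeffs)
```

The modulus nonlinearity pushes energy toward low frequencies, so the averaged coefficients retain information that plain averaging of the signal would destroy.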
{"title":"Classification with invariant scattering representations","authors":"Joan Bruna, S. Mallat","doi":"10.1109/IVMSPW.2011.5970362","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970362","url":null,"abstract":"A scattering transform defines a signal representation which is invariant to translations and Lipschitz continuous relatively to deformations. It is implemented with a non-linear convolution network that iterates over wavelet and modulus operators. Lipschitz continuity locally linearizes deformations. Complex classes of signals and textures can be modeled with low-dimensional affine spaces, computed with a PCA in the scattering domain. Classification is performed with a penalized model selection. State of the art results are obtained for handwritten digit recognition over small training sets, and for texture classification. 1","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"148 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116603906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Color fidelity of chromatic distributions by triad illuminant comparison
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970345
M. Lucassen, T. Gevers, A. Gijsenij
Performance measures for quantifying human color constancy and computational color constancy are very different: the former relate to measurements of individual object colors, whereas the latter relate to the accuracy of the estimated illuminant. To bridge this gap, we propose a psychophysical method in which observers judge the global color fidelity of a visual scene rendered under different illuminants. In each experimental trial, the scene is rendered under three illuminants: two chromatic test illuminants and one neutral reference illuminant. Observers indicate which of the two test illuminants leads to better color fidelity relative to the reference illuminant. Here we study multicolor scenes whose chromatic distributions are differently oriented in color space while having the same average chromaticity. We show that when these distributions are rendered under colored illumination, they lead to different perceptual estimates of color fidelity.
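For concreteness, a toy version of one triad trial, assuming colors are re-rendered with a diagonal von Kries scaling (a first-order model of our own choosing; the paper's actual rendering pipeline is more careful):

```python
import numpy as np

def render_under_illuminant(rgb, illuminant_rgb):
    """Re-render scene colors under an illuminant by diagonal (von Kries)
    scaling relative to a neutral (1, 1, 1) reference light."""
    return np.clip(np.asarray(rgb, float) * np.asarray(illuminant_rgb, float),
                   0.0, 1.0)

# One triad trial: the same scene under a neutral reference light and
# two chromatic test illuminants.
scene   = np.random.rand(100, 3)                            # 100 surface colors
neutral = render_under_illuminant(scene, (1.0, 1.0, 1.0))
test_a  = render_under_illuminant(scene, (1.1, 1.0, 0.9))   # warm
test_b  = render_under_illuminant(scene, (0.9, 1.0, 1.1))   # cool
# The observer reports which of test_a / test_b better preserves the
# colors seen under `neutral`.
```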
{"title":"Color fidelity of chromatic distributions by triad illuminant comparison","authors":"M. Lucassen, T. Gevers, A. Gijsenij","doi":"10.1109/IVMSPW.2011.5970345","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970345","url":null,"abstract":"Performance measures for quantifying human color constancy and computational color constancy are very different. The former relate to measurements on individual object colors whereas the latter relate to the accuracy of the estimated illuminant. To bridge this gap, we propose a psychophysical method in which observers judge the global color fidelity of the visual scene rendered under different illuminants. In each experimental trial, the scene is rendered under three illuminants, two chromatic test illuminants and one neutral reference illuminant. Observers indicate which of the two test illuminants leads to better color fidelity in comparison to the reference illuminant. Here we study multicolor scenes with chromatic distributions that are differently oriented in color space, while having the same average chromaticity. We show that when these distributions are rendered under colored illumination they lead to different perceptual estimates of the color fidelity.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124764996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new subjective procedure for evaluation and development of texture similarity metrics
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970366
J. Zujovic, T. Pappas, D. Neuhoff, R. Egmond, H. Ridder
In order to facilitate the development of objective texture similarity metrics and to evaluate their performance, one needs a large texture database accurately labeled with perceived similarities between images. We propose ViSiProG, a new Visual Similarity by Progressive Grouping procedure for conducting subjective experiments that organizes a texture database into clusters of visually similar images. The grouping is based on visual blending and greatly simplifies pairwise labeling. ViSiProG collects subjective data in an efficient and effective manner, so that a relatively large database of textures can be accommodated. Experimental results and comparisons with structural texture similarity metrics demonstrate both the effectiveness of the proposed subjective testing procedure and the performance of the metrics.
{"title":"A new subjective procedure for evaluation and development of texture similarity metrics","authors":"J. Zujovic, T. Pappas, D. Neuhoff, R. Egmond, H. Ridder","doi":"10.1109/IVMSPW.2011.5970366","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970366","url":null,"abstract":"In order to facilitate the development of objective texture similarity metrics and to evaluate their performance, one needs a large texture database accurately labeled with perceived similarities between images. We propose ViSiProG, a new Visual Similarity by Progressive Grouping procedure for conducting subjective experiments that organizes a texture database into clusters of visually similar images. The grouping is based on visual blending, and greatly simplifies pairwise labeling. ViSiProG collects subjective data in an efficient and effectivemanner, so that a relatively large database of textures can be accommodated. Experimental results and comparisons with structural texture similarity metrics demonstrate both the effectiveness of the proposed subjective testing procedure and the performance of the metrics.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131321566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manipulating attention in computer games
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970371
M. Bernhard, Ling Zhang, M. Wimmer
In computer games, a user's attention is focused on the current task, and task-irrelevant details remain unnoticed. This behavior, known as inattentional blindness, is a major problem for the optimal placement of information or advertisements. We propose a guiding principle based on Wolfe's theory of Guided Search, which predicts the saliency of objects during a visual search task. Assuming that computer games frequently elicit visual search tasks, we apply this model in a “reverse” direction: given a target item (e.g., an advertisement) that should be noticed by the user, we choose a frequently searched game item and modify it so that it shares some perceptual features (e.g., color or orientation) with the target item. A memory experiment with 36 participants showed that in an action video game, advertisements were more noticeable to users when this method was applied.
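As a toy illustration of the “reverse” direction, the sketch below pulls a game item's hue toward an advertisement's hue so the two share a guiding color feature; the blending rule is our own, not the authors' implementation.

```python
import colorsys

def share_color_feature(item_rgb, target_rgb, strength=0.5):
    """Shift a game item's hue toward a target item's hue (RGB in [0, 1])."""
    h_i, s_i, v_i = colorsys.rgb_to_hsv(*item_rgb)
    h_t, _, _ = colorsys.rgb_to_hsv(*target_rgb)
    d = ((h_t - h_i + 0.5) % 1.0) - 0.5      # shortest signed hue distance
    return colorsys.hsv_to_rgb((h_i + strength * d) % 1.0, s_i, v_i)

# Example: nudge a green game item halfway toward a red advertisement.
print(share_color_feature((0.1, 0.8, 0.2), (0.9, 0.1, 0.1)))
```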
{"title":"Manipulating attention in computer games","authors":"M. Bernhard, Ling Zhang, M. Wimmer","doi":"10.1109/IVMSPW.2011.5970371","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970371","url":null,"abstract":"In computer games, a user's attention is focused on the current task, and task-irrelevant details remain unnoticed. This behavior, known as inattentional blindness, is a main problem for the optimal placement of information or advertisements. We propose a guiding principle based on Wolfe's theory of Guided Search, which predicts the saliency of objects during a visual search task. Assuming that computer games elicit visual search tasks frequently, we applied this model in a “reverse” direction: Given a target item (e.g., advertisement) which should be noticed by the user, we choose a frequently searched game item and modify it so that it shares some perceptual features (e.g., color or orientation) with the target item. A memory experiment with 36 participants showed that in an action video game, advertisements were more noticeable to users when this method is applied.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"117 1-2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123566634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selective rendering with graphical saliency model
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970372
L. Dong, Weisi Lin, Ce Zhu, S. H. Soon
In this work, we first identify the shortcomings of existing work on selective image rendering. To remedy these problems, we put forward the concept and formulation of a graphical saliency model (GSM) for selective image rendering, in which the sampling rate is determined adaptively from the resulting saliency map under a computation budget. Unlike existing visual attention (VA) models, which were devised for natural image/video processing and then applied to image rendering, the GSM considers the characteristics of the rendering process and aims to detect regions that require heavy computation to render well, making good use of the budget. The proposed GSM improves a VA model by incorporating a metric of rendering complexity. Experimental results show that, under a limited computation budget, selective rendering guided by the proposed GSM achieves better perceived graphic quality than rendering based merely on a VA model.
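A sketch of the budget-driven step, assuming samples are allotted to tiles in proportion to the saliency map (the proportional rule and names are ours; the paper derives sampling rates from its GSM):

```python
import numpy as np

def allocate_samples(saliency, budget):
    """Split a fixed sample budget across tiles proportionally to saliency."""
    s = np.asarray(saliency, float)
    s = s / s.sum()
    samples = np.floor(s * budget).astype(int)
    leftover = budget - samples.sum()              # floor() under-spends a bit
    flat = samples.ravel()
    flat[np.argsort(-s.ravel())[:leftover]] += 1   # top up most salient tiles
    return samples
```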
{"title":"Selective rendering with graphical saliency model","authors":"L. Dong, Weisi Lin, Ce Zhu, S. H. Soon","doi":"10.1109/IVMSPW.2011.5970372","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970372","url":null,"abstract":"In this work, we firstly identify the shortcomings of the existing work of selective image rendering. In order to remedy the identified problems, we put forward the concept and formulation of a graphical saliency model (GSM) for selective image rendering applications, in which the sampling rate is determined adaptively according to the resultant saliency map under a computation budget. Different from the existing visual attention (VA) models which have been devised for natural image/video processing and applied to image rendering, the GSM considers the characteristics of the rendering process and aims to detect regions which require high computation to be rendered for good use of the said budget. The proposed GSM improves a VA model by incorporating a metric of rendering complexity. Experiment results show that, under a limited computation budget, selective rendering guided by the proposed GSM can achieve better perceived graphic quality, compared with that merely based upon a VA model.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129333718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A novel multifocus image fusion scheme based on pixel significance using wavelet transform
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970354
Parul Shah, T. V. Srikanth, S. N. Merchant, U. Desai
In this paper, we propose a novel fusion rule for combining multifocus images of a scene by taking their weighted average in the wavelet domain. The weights are decided adaptively by computing the significance of each pixel from information available in the finer resolution bands, so that significance depends on edge strength, giving more weight to pixels with sharper neighborhoods. The performance has been extensively tested on several pairs of multifocus images and compared quantitatively with various existing methods. The analysis shows that the proposed method significantly increases the quality of the fused image, both visually and in terms of quantitative parameters, by achieving a major reduction in artefacts.
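Assuming PyWavelets, the overall shape of such a rule can be sketched as below: detail coefficients are averaged with weights from local detail energy, a generic stand-in for the paper's edge-strength-based pixel significance.

```python
import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def fuse_multifocus(a, b, wavelet="db2", level=3):
    """Fuse two registered multifocus images by a weighted average of
    wavelet coefficients; sharper (higher-energy) neighborhoods win."""
    ca = pywt.wavedec2(np.asarray(a, float), wavelet, level=level)
    cb = pywt.wavedec2(np.asarray(b, float), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]               # average approximation band
    for da, db in zip(ca[1:], cb[1:]):            # detail bands, coarse->fine
        bands = []
        for xa, xb in zip(da, db):
            ea = uniform_filter(xa * xa, 3)       # local energy ~ edge strength
            eb = uniform_filter(xb * xb, 3)
            w = ea / (ea + eb + 1e-12)
            bands.append(w * xa + (1.0 - w) * xb)
        fused.append(tuple(bands))
    return pywt.waverec2(fused, wavelet)
```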
{"title":"A novel multifocus image fusion scheme based on pixel significance using wavelet transform","authors":"Parul Shah, T. V. Srikanth, S. N. Merchant, U. Desai","doi":"10.1109/IVMSPW.2011.5970354","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970354","url":null,"abstract":"In this paper, we propose a novel fusion rule for combining multifocus images of a scene by taking their weighted average in wavelet domain. The weights are decided adaptively by computing significance of the pixel using information available at finer resolution bands such that the proposed significance depends on the strength of edges giving more weightage to pixel with sharper neighborhood. The performance have been extensively tested on several pairs of multifocus images and compared quantitatively with various existing methods. The analysis shows that the proposed method increases the quality of the fused image significantly, both visually and in terms quantitative parameters, by achieving major reduction in artefacts.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121176516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Local correction with global constraint for image enhancement
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970351
Z. Hou, H. Eng, T. Koh
This paper presents a method to improve image contrast adaptively, taking into account both local and global image context. First, the image is analyzed to find regions containing meaningful content with good contrast and regions containing meaningful content but with poor contrast. The analysis is based on the differing responses of two edge detectors: the Canny detector and the zero-crossing detector. Statistics of the gradient field in the former regions are then used to correct the gradient field in the latter. Reconstruction of the content in the latter regions is accomplished by solving a Poisson equation with Dirichlet boundary conditions. Throughout the process, objects with poor visibility are automatically detected and adaptively enhanced without sacrificing the contrast of image content that is properly illuminated. Experiments show the advantages of the proposed method over conventional contrast enhancement methods.
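The reconstruction step is concrete enough to sketch: form the divergence of the corrected gradient field and relax the Poisson equation with the region's boundary pixels held fixed (Dirichlet). The plain Jacobi solver below is a stand-in for the faster solvers used in practice.

```python
import numpy as np

def divergence(gx, gy):
    """Discrete divergence of a gradient field (backward differences)."""
    div = np.zeros_like(gx, dtype=float)
    div[:, 1:] += gx[:, 1:] - gx[:, :-1]
    div[1:, :] += gy[1:, :] - gy[:-1, :]
    return div

def poisson_reconstruct(div, boundary, iters=2000):
    """Solve laplacian(u) = div on the interior, keeping `boundary`
    values fixed on the frame (Dirichlet conditions)."""
    u = np.asarray(boundary, float).copy()
    for _ in range(iters):
        u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:] -
                                div[1:-1, 1:-1])
    return u
```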
{"title":"Local correction with global constraint for image enhancement","authors":"Z. Hou, H. Eng, T. Koh","doi":"10.1109/IVMSPW.2011.5970351","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970351","url":null,"abstract":"This paper presents a method to improve the image contrast adaptively with account of both local and global image context. Firstly, the image is analyzed to find the region containing meaningful contents with good contrast and the region containing meaningful contents but with poor contrast. The analysis is based on the different responses from two edge detectors: the Canny and the zero-crossing detector. Then statistics of the gradient field in the former region is utilized to correct the gradient field in the latter region. Reconstruction of the contents in the latter region is accomplished through solving a Poisson equation with Dirichlet boundary conditions. Throughout the process, objects with poor visibility are automatically detected and adaptively enhanced without sacrifice of the contrast of image contents that are properly illuminated. Experiments show the advantages of the proposed method over the conventional contrast enhancement methods.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123829761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comparative study on the local-pyramid approach for Content-Based Image Retrieval
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970363
Lin Feng, Anand Bilas Ray
The local-pyramid approach to image representation and feature extraction is studied for Content-Based Image Retrieval (CBIR). Lazebnik's pyramid matching kernels and K-means clustering are used. The SIFT descriptor is deployed for feature extraction from the images, resulting in an efficient image representation scheme and a reduction in computational complexity. Histogram intersection is used to compute the similarity between the query image and the database images. The local-pyramid approach with a 3-level pyramid and a dictionary size of 100 achieves an average precision of 86.5% when retrieving images from the benchmark COREL 1K database, and 77.35% on a random image database.
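The similarity computation is easy to make concrete. Below, per-level histogram intersections are combined with the standard pyramid-match weights; SIFT extraction, the K-means dictionary, and spatial binning are omitted.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity of two L1-normalized histograms."""
    return np.minimum(h1, h2).sum()

def pyramid_match(hists1, hists2):
    """Combine per-level intersections with weights 1/2^L for level 0 and
    1/2^(L-l+1) for level l >= 1, as in Lazebnik's spatial pyramid kernel."""
    L = len(hists1) - 1
    score = histogram_intersection(hists1[0], hists2[0]) / 2 ** L
    for l in range(1, L + 1):
        score += histogram_intersection(hists1[l], hists2[l]) / 2 ** (L - l + 1)
    return score
```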
{"title":"A comparative study on the local-pyramid approach for Content-Based Image Retrieval","authors":"Lin Feng, Anand Bilas Ray","doi":"10.1109/IVMSPW.2011.5970363","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970363","url":null,"abstract":"The local-pyramid approach for image representation and feature extraction is studied for the Content-Based Image Retrieval (CBIR). Lazebnik's pyramid matching kernels and the K-means clustering is used. The SIFT descriptor is deployed for feature extraction from the images, resulting in an efficient image representation scheme and reduction of the computational complexity. Histogram intersection is used to compute the similarity between the query image and the database images. The local-pyramid approach with a 3-level pyramid and a dictionary size of 100 achieves an average precision of 86.5% in retrieving images from the benchmark database, COREL 1K, and 77.35% for that with random image database.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127379407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection of repetitive patterns in near regular texture images
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970355
Yunliang Cai, G. Baciu
Detection of repetitive patterns in texture images is a longstanding problem in texture analysis. In the textile industry, this is particularly useful for isolating repeats in woven fabric designs. Based on repetitive patterns, textile designers can identify and classify complex textures. In this paper, we propose a new method for detecting, locating, and grouping repetitive patterns, particularly in near regular textures (NRT), based on a mid-level patch descriptor. An NRT is parameterized as a vector-valued function representing a texton unit together with a set of geometric transformations. We perform shape alignment by image congealing and correlation matching. Our experiments demonstrate that our patch-based method significantly improves the performance and versatility of repetitive pattern detection in NRT images.
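The correlation-matching step can be sketched with a simplified normalized cross-correlation that locates repeats of a texton patch; the congealing-based alignment and the grouping stage are omitted, and the threshold is an arbitrary choice of ours.

```python
import numpy as np
from scipy.signal import fftconvolve

def find_repeats(image, patch, thresh=0.8):
    """Return pixel locations whose neighborhood correlates strongly
    with a zero-mean texton patch (simplified normalized correlation)."""
    img = np.asarray(image, float)
    p = np.asarray(patch, float)
    p = p - p.mean()
    corr = fftconvolve(img, p[::-1, ::-1], mode="same")       # correlation
    energy = fftconvolve(img * img, np.ones(p.shape), mode="same")
    local = np.sqrt(np.maximum(energy, 0.0))                  # guard FFT noise
    ncc = corr / (local * np.linalg.norm(p) + 1e-12)
    return np.argwhere(ncc > thresh), ncc
```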
{"title":"Detection of repetitive patterns in near regular texture images","authors":"Yunliang Cai, G. Baciu","doi":"10.1109/IVMSPW.2011.5970355","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970355","url":null,"abstract":"Detection of repetitive patterns in texture images is a longstanding problem in texture analysis. In the textile industry, this is particularly useful in isolating repeats in woven fabric designs. Based on repetitive patterns, textile designers can identify and classify complex textures. In this paper, we propose a new method for detecting, locating, and grouping the repetitive patterns, particularly for near regular textures (NRT) based on a mid-level patch descriptor. A NRT is parameterized as a vector-valued function representing a texton unit together with a set of geometric transformations. We perform shape alignment by image congealing and correlation matching. Our experiments demonstrate that our patch-based method significantly improves the performance and the versatility of repetitive pattern detection in NRT images.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116691857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Perceptual curve extraction
Pub Date : 2011-06-16 | DOI: 10.1109/IVMSPW.2011.5970361
Baptiste Magnier, Daniel Diep, P. Montesinos
In this paper we propose a new method for detecting perceptual curves in images, based on the difference of half rotating Gaussian filters. The novelty of this approach resides in combining ideas from directional filters, perceptual organization, and the DoG method. We obtain a new anisotropic DoG detector enabling very precise detection of perceptual curve points. Moreover, this detector performs correctly on perceptual curves even when they are highly bent, and is precise at perceptual junctions. The detector has been tested successfully on various image types that present genuinely difficult problems for classical detection methods.
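Our loose reading of the half rotating Gaussian construction, as a sketch: cut an anisotropic Gaussian to a half-plane, rotate it over a set of orientations, and subtract the responses of two such filters at different scales; the maximum over orientations marks candidate curve points. The kernel size, scales, and max-over-orientations rule are assumptions of ours.

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def half_gaussian_kernel(sigma_u, sigma_v, angle, size=31):
    """Anisotropic Gaussian supported on a half-plane, rotated by `angle`
    degrees (our reading of the paper's half rotating filter)."""
    ax = np.arange(size) - size // 2
    u, v = np.meshgrid(ax, ax)
    g = np.exp(-(u ** 2) / (2 * sigma_u ** 2) - (v ** 2) / (2 * sigma_v ** 2))
    g[u < 0] = 0.0                        # keep one half only
    g = rotate(g, angle, reshape=False, order=1)
    return g / g.sum()

def half_dog_response(image, angles, s1=1.5, s2=3.0):
    """Difference of two half Gaussians per orientation; max over angles."""
    img = np.asarray(image, float)
    responses = [convolve(img, half_gaussian_kernel(s1, 3 * s1, a)
                          - half_gaussian_kernel(s2, 3 * s2, a))
                 for a in angles]
    return np.max(responses, axis=0)
```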
{"title":"Perceptual curve extraction","authors":"Baptiste Magnier, Daniel Diep, P. Montesinos","doi":"10.1109/IVMSPW.2011.5970361","DOIUrl":"https://doi.org/10.1109/IVMSPW.2011.5970361","url":null,"abstract":"In this paper we propose a new perceptual curve detection method in images based on the difference of half rotating Gaussian filters. The novelty of this approach resides in the mixing of ideas coming both from directional filters, perceptual organization and DoG method. We obtain a new anisotropic DoG detector enabling very precise detection of perceptual curve points. Moreover, this detector performs correctly at perceptual curves even if highly bended, and is precise on perceptual junctions. This detector has been tested successfully on various image types presenting real difficult problems for classical detection methods.","PeriodicalId":405588,"journal":{"name":"2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125723296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}