Deconvolution for slowly time-varying systems 3D cases
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469552
S. Zenati, A. Boukrouche, P. Neveux
In the present work, we discuss an extension of the deconvolution techniques of Sekko [20] and Neveux [18] to 3D signals. The signals are assumed to be degraded by linear electronic systems whose parameters are slowly time-varying, such as sensors or other storage systems. For this purpose, Sekko et al. [20] developed a structure that was adapted to time-varying systems in [18] in order to produce an inverse filter with constant gain. This latter method was applied successfully to ordinary images [23]. The treatment of omnidirectional images requires working on the unit sphere, so the problem must be cast in 3D. In the 3D case, the deconvolution method of [18] can be applied after some manipulation. The Hopf fibration makes it possible to treat the sphere as a torus. The advantage of this approach is that Kalman filtering can then be applied, so that omnidirectional images projected on the sphere can be deconvolved.
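To make the role of Kalman filtering concrete, here is a minimal sketch of Kalman-based input estimation for a first-order system with a slowly drifting parameter. It is not the constant-gain inverse filter of [18], [20] nor the spherical/toroidal formulation; the state-space model, the name `kalman_deconvolve` and all noise levels are illustrative assumptions.

```python
import numpy as np

def kalman_deconvolve(y, a_seq, q_x=1e-4, q_u=1e-2, r=1e-2):
    """Estimate the input u_k of a slowly time-varying first-order system
        x_{k+1} = a_k * x_k + (1 - a_k) * u_k,   y_k = x_k + v_k,
    by augmenting the state with u_k modelled as a random walk and running a
    standard Kalman filter. Generic input-estimation sketch only; not the
    constant-gain inverse filter of [18], [20]."""
    z = np.zeros(2)                     # augmented state [x, u]
    P = np.eye(2)                       # state covariance
    Q = np.diag([q_x, q_u])             # process noise: u drifts slowly
    u_hat = np.zeros(len(y))
    for k, yk in enumerate(y):
        a = a_seq[k]                    # slowly time-varying parameter
        F = np.array([[a, 1.0 - a],
                      [0.0, 1.0]])
        z = F @ z                       # predict
        P = F @ P @ F.T + Q
        innov = yk - z[0]               # we observe x (the first state) only
        S = P[0, 0] + r
        K = P[:, 0] / S                 # Kalman gain
        z = z + K * innov               # update
        P = P - np.outer(K, P[0, :])
        u_hat[k] = z[1]                 # deconvolved sample
    return u_hat
```

Applied row by row to an image unwrapped on the torus, the same recursion deconvolves each scan line while tracking the slow parameter drift.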
{"title":"Deconvolution for slowly time-varying systems 3D cases","authors":"S. Zenati, A. Boukrouche, P. Neveux","doi":"10.1109/IPTA.2012.6469552","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469552","url":null,"abstract":"In the present work, we discuss an extension of the deconvolution techniques of Sekko [20] and Neveux [18] to 3D signals. The signals are assumed to be degraded by electronic linear systems, in which parameters are slowly time-varying such as sensors or other storage systems. For this purpose, Sekko & al. [20] developed a structure that has been adapted to time-varying systems [18] in order to produce an inverse filter with constant gain. This latter method was applied successfully to ordinary images [23]. The treatment of omnidirectional images requires working on the unit sphere. Therefore, the problem should be cast in 3D. In the 3D case, the deconvolution method [18] can be applied after some manipulations. The Heinz-Hopf fibration offers the possibility to consider that the sphere is similar to a torus. The advantage of this approach is that Kalman filtering can be applied and omnidirectional images projected on the sphere can be deconvolved.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121236232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards automatic reconstruction of axonal structures in volumetric microscopy images depicting only active synapses
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469549
S. Sokoll, Hagen Beelitz, M. Heine, Klaus D. Tönnies
We propose an algorithm for the three-dimensional (3D) reconstruction of axonal structures, to allow the correlation of axonal structure, individual synaptic activity and single-molecule tracking. In contrast to related work, only active synapses are stained in our acquisitions and the axonal structure is visible only through autofluorescence. We tackle this problem by detecting the medial axis line in the two-dimensional (2D) intensity projection of the 3D image and reconstructing the 3D structure by axial interpolation between connected active synapses. Because of the non-continuous staining, the detection of the medial axis line cannot rely on a global tree-like structure. Instead, we compute an initial skeleton by global segmentation and expand it by iteratively adding line segments that are locally optimal according to model knowledge of an axon. We evaluate our algorithm against a ground truth computed from co-transfection of surface molecules, which results in reliable continuous staining of the axonal structure.
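As a rough illustration of the initial-skeleton step (2D intensity projection, global segmentation, skeletonization), the sketch below uses scikit-image; the Otsu threshold and the Gaussian pre-smoothing are assumptions, not the authors' exact global segmentation.

```python
import numpy as np
from skimage.filters import threshold_otsu, gaussian
from skimage.morphology import skeletonize

def initial_skeleton(stack):
    """2D intensity projection of a 3D stack (z, y, x), followed by a global
    segmentation and skeletonization -- an illustrative stand-in for the
    paper's initial-skeleton step."""
    proj = stack.max(axis=0)                 # 2D maximum-intensity projection
    proj = gaussian(proj, sigma=1.0)         # mild denoising of autofluorescence
    mask = proj > threshold_otsu(proj)       # global segmentation
    return skeletonize(mask)                 # 1-pixel-wide, medial-axis-like skeleton
```

The resulting skeleton would then be grown by adding locally optimal line segments between detected synapses, as described above.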
{"title":"Towards automatic reconstruction of axonal structures in volumetric microscopy images depicting only active synapses","authors":"S. Sokoll, Hagen Beelitz, M. Heine, Klaus D. Tönnies","doi":"10.1109/IPTA.2012.6469549","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469549","url":null,"abstract":"We propose an algorithm for the three-dimensional (3D) reconstruction of axonal structures to allow for the correlation of axonal structure, individual synaptic activity and single molecule tracking. In contrast to related works, only active synapses are stained in our acquisitions and the axonal structure is only visible by autofluorescence. We tackle this problem by detection of the medial axis line in the two-dimensional (2D) intensity projection of the 3D image and reconstruction of the 3D structure by axial interpolation between connected active synapses. Due to the noncontinuous staining, the detection of the medial axis line cannot rely on a global tree like structure. Instead, we compute an initial skeleton by global segmentation and expand it by iteratively adding line segments that are locally optimal according to the model knowledge of an axon. We evaluate our algorithm against a ground truth computed from co-transfection of surface molecules that result in reliable continuous staining of the axonal structure.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125246188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High performance automatic number plate recognition in video streams
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469554
Arkadiusz Pawlik
We present a range of image and video analysis techniques that we have developed in connection with license plate recognition. Our methods focus on two areas: efficient image preprocessing to improve the detection rate on low-quality footage, and combining detection results from multiple frames to improve the accuracy of the recognized license plates. To evaluate our algorithms, we have implemented a complete ANPR system that detects and reads license plates. The system can process up to 110 frames per second on a single CPU core and scales well to at least 4 cores. The recognition rate varies with the quality of the video streams (amount of motion blur, resolution), but approaches 100% for clear, sharp license plate input data. The software is currently marketed commercially as CarID1. Some of our methods are more general and may have applications outside of the ANPR domain.
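As a sketch of the multi-frame combination idea, the snippet below fuses per-frame plate reads by confidence-weighted, per-position character voting; the weighting scheme and the name `combine_frame_reads` are illustrative assumptions, not the paper's exact fusion rule.

```python
from collections import Counter, defaultdict

def combine_frame_reads(reads):
    """Combine per-frame plate reads into a single result by per-position
    character voting. `reads` is a list of (plate_string, confidence) pairs
    collected from consecutive frames of the same vehicle."""
    if not reads:
        return ""
    # vote on the plate length first, then on each character position
    length = Counter(len(s) for s, _ in reads).most_common(1)[0][0]
    votes = [defaultdict(float) for _ in range(length)]
    for text, conf in reads:
        if len(text) != length:
            continue
        for i, ch in enumerate(text):
            votes[i][ch] += conf
    return "".join(max(v, key=v.get) for v in votes)

# combine_frame_reads([("AB123CD", 0.9), ("AB123CO", 0.4), ("AB123CD", 0.8)])
# -> "AB123CD"
```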
{"title":"High performance automatic number plate recognition in video streams","authors":"Arkadiusz Pawlik","doi":"10.1109/IPTA.2012.6469554","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469554","url":null,"abstract":"We present a range of image and video analysis techniques that we have developed in connection with license plate recognition. Our methods focus on two areas - efficient image preprocessing to improve low-quality detection rate and combining the detection results from multiple frames to improve the accuracy of the recognized license plates. To evaluate our algorithms, we have implemented a complete ANPR system that detects and reads license plates. The system can process up to 110 frames per second on single CPU core and scales well to at least 4 cores. The recognition rate varies depending on the quality of video streams (amount of motion blur, resolution), but approaches 100% for clear, sharp license plate input data. The software is currently marketed commercially as CarID1. Some of our methods are more general and may have applications outside of the ANPR domain.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"200 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125559339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic oil spill detection in TerraSAR-X data using multi-contextual Markov modeling on irregular graphs
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469508
S. Martinis
This paper describes the workflow of an automatic near-real-time oil spill detection approach using single-polarized, high-resolution X-band Synthetic Aperture Radar satellite data. Dark formations on the water surface are classified in a completely unsupervised way using an automatic tile-based thresholding procedure. The derived global threshold value is used to initialize a hybrid multi-contextual Markov image model which integrates scale-dependent and spatial contextual information on irregular hierarchical graph structures into the segment-based labeling of slick-covered and slick-free water surfaces. Experimental investigations performed on TerraSAR-X ScanSAR data acquired during the large-scale oil pollution in the Gulf of Mexico in May 2010 confirm the effectiveness of the proposed method with respect to accuracy and computational effort.
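A minimal sketch of tile-based global thresholding follows: Otsu thresholds are computed on tiles that appear to contain both slick and open water and are then aggregated into one global value. The contrast test, tile size and median aggregation are assumptions, not the paper's tile selection and splitting criteria.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.util import view_as_blocks

def tile_based_global_threshold(sar_db, tile=128, min_contrast=2.0):
    """Unsupervised tile-based thresholding sketch for a SAR intensity image
    (in dB): threshold only tiles that are likely bimodal, then aggregate the
    per-tile thresholds into a single global value."""
    h, w = sar_db.shape
    sar_db = np.ascontiguousarray(sar_db[: h - h % tile, : w - w % tile])
    blocks = view_as_blocks(sar_db, (tile, tile))     # (H//tile, W//tile, tile, tile)
    thresholds = []
    for i in range(blocks.shape[0]):
        for j in range(blocks.shape[1]):
            b = blocks[i, j]
            if b.std() >= min_contrast:               # simple "likely bimodal" test
                thresholds.append(threshold_otsu(b))
    return float(np.median(thresholds)) if thresholds else float(threshold_otsu(sar_db))
```

The global value obtained this way would then seed the multi-contextual Markov labeling described above.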
{"title":"Automatic oil spill detection in TerraSAR-X data using multi-contextual Markov modeling on irregular graphs","authors":"S. Martinis","doi":"10.1109/IPTA.2012.6469508","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469508","url":null,"abstract":"This paper describes the workflow of an automatic near-real time oil spill detection approach using single-polarized high resolution X-Band Synthetic Aperture Radar satellite data. Dark formations on the water surface are classified in a completely unsupervised way using an automatic tile-based thresholding procedure. The derived global threshold value is used for the initialization of a hybrid multi-contextual Markov image model which integrates scale-dependent and spatial contextual information on irregular hierarchical graph structures into the segment-based labeling process of slick-covered and slick-free water surfaces. Experimental investigations performed on TerraSAR-X ScanSAR data acquired during large-scale oil pollutions in the Gulf of Mexico in May 2010 confirm the effectiveness of the proposed method with respect to accuracy and computational effort.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126962804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A discrete Hidden Markov models recognition module for temporal series: Application to real-time 3D hand gestures
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469509
Yannick Dennemont, Guillaume Bouyer, S. Otmane, M. Mallem
This work studies, implements and evaluates a gesture recognition module based on discrete Hidden Markov Models. The module is implemented in Matlab and called from Virtools. It can be used with different inputs and therefore serves different recognition purposes. We focus on 3D positions, the information common to our devices, as inputs for gesture recognition. Experiments are carried out with an infrared-tracked flystick. The recognition rate exceeds 90% with a personalized learning base; otherwise, the results are above 70% in an evaluation with 8 users on a real-time mini-game, roughly 80% for simple gestures and 60% for complex ones.
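A minimal sketch of discrete-HMM gesture scoring is shown below: 3D positions are quantized into a small symbol alphabet and each gesture model is scored with the forward algorithm. The direction codebook (azimuth only) and the per-gesture (pi, A, B) parameters are illustrative assumptions; the paper's models are trained in Matlab.

```python
import numpy as np

def directions_to_symbols(positions, n_bins=8):
    """Quantize consecutive 3D displacements into discrete symbols using the
    azimuth of the displacement in the x-y plane (a deliberately crude codebook;
    the paper's quantization is not reproduced here)."""
    d = np.diff(np.asarray(positions, dtype=float), axis=0)
    az = np.arctan2(d[:, 1], d[:, 0])
    return ((az + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins

def log_forward(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (pi: initial probabilities, A: transition matrix, B: emission matrix)."""
    alpha = np.log(pi) + np.log(B[:, obs[0]])
    for o in obs[1:]:
        alpha = np.log(B[:, o]) + np.logaddexp.reduce(
            alpha[:, None] + np.log(A), axis=0)
    return np.logaddexp.reduce(alpha)

def recognize(positions, models):
    """Return the gesture whose HMM gives the highest sequence likelihood.
    `models` maps gesture names to (pi, A, B) tuples learned offline."""
    obs = directions_to_symbols(positions)
    return max(models, key=lambda g: log_forward(obs, *models[g]))
```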
{"title":"A discrete Hidden Markov models recognition module for temporal series: Application to real-time 3D hand gestures","authors":"Yannick Dennemont, Guillaume Bouyer, S. Otmane, M. Mallem","doi":"10.1109/IPTA.2012.6469509","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469509","url":null,"abstract":"This work studies, implements and evaluates a gestures recognition module based on discrete Hidden Markov Models. The module is implemented on Matlab and used from Virtools. It can be used with different inputs therefore serves different recognition purposes. We focus on the 3D positions, our devices common information, as inputs for gesture recognition. Experiments are realized with an infra-red tracked flystick. Finally, the recognition rate is more than 90% with a personalized learning base. Otherwise, the results are beyond 70%, for an evaluation of 8 users on a real time mini-game. The rates are basically 80% for simple gestures and 60% for complex ones.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129105058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-resolution patch and window-based priority for digital image inpainting problem
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469544
T. T. Dang, M. Larabi, Azeddine Beghdadi
Recently, digital image inpainting has attracted strong research interest because of its extensive applications in real life. The term “inpainting” refers to the automatic restoration of image defects such as scratches or blotches, as well as the removal of unwanted objects (for instance subtitles or logos), such that the modification is undetectable to viewers without reference to the original image. Many works on this subject have been published in recent years. This paper introduces a novel unsupervised image completion framework using a modified exemplar-based method in conjunction with a pyramidal representation of the image. A top-down iterative completion is performed gradually with multi-resolution patches and a window-based priority. The proposed approach is verified on different natural images, and a comparison with existing methods from the literature shows an improvement in favor of our approach.
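To illustrate the two ingredients named in the title, the sketch below computes a window-based, confidence-only priority on the hole boundary and builds a Gaussian pyramid for top-down, coarse-to-fine completion. It is a simplified stand-in for the full exemplar-based priority and patch-copy loop; the window size, pyramid depth and confidence-only priority are assumptions.

```python
import numpy as np
from skimage.transform import pyramid_gaussian

def next_patch_to_fill(mask, window=9):
    """Window-based priority sketch: for every pixel of the hole (mask == True),
    the confidence is the fraction of already-known pixels in the surrounding
    window; the boundary pixel with the highest confidence is filled first."""
    known = (~mask).astype(float)
    pad = window // 2
    padded = np.pad(known, pad, mode="constant")
    conf = np.zeros_like(known)
    for y, x in zip(*np.nonzero(mask)):
        conf[y, x] = padded[y:y + window, x:x + window].mean()
    # boundary = hole pixels with at least one known 4-neighbour
    boundary = mask & (np.roll(known, 1, 0) + np.roll(known, -1, 0)
                       + np.roll(known, 1, 1) + np.roll(known, -1, 1) > 0)
    priority = np.where(boundary, conf, -np.inf)
    return np.unravel_index(np.argmax(priority), priority.shape)

def coarse_to_fine_levels(image, mask, levels=3):
    """Top-down pyramid of (image, mask) pairs for multi-resolution completion."""
    imgs = list(pyramid_gaussian(image, max_layer=levels - 1))
    msks = list(pyramid_gaussian(mask.astype(float), max_layer=levels - 1))
    return [(im, m > 0.5) for im, m in zip(imgs, msks)][::-1]   # coarsest first
```

In the full method, the completion performed at each coarse level constrains the patch search at the next finer level.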
{"title":"Multi-resolution patch and window-based priority for digital image inpainting problem","authors":"T. T. Dang, M. Larabi, Azeddine Beghdadi","doi":"10.1109/IPTA.2012.6469544","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469544","url":null,"abstract":"Recently, digital image inpainting has attracted strong research interest because of its extensive applications in real life. The terminology “inpainting” refers to automatic restoration of image defects such as scratches or blotches as well as removal of unwanted objects as, for instance, subtitles, logos, etc, such that it is undetectable by viewers without the reference to the original image. Many works on this subject have been published in recent years. This paper introduces a novel unsupervised image completion framework using a modified exemplar-based method in conjunction with a pyramidal representation of an image. A top-down iterative completion is performed gradually with multi-resolution patches and a window-based priority. The proposed approach is verified on different natural images. Also, a comparison with some existing methods coming from literature is carried out and the results show improvement in favor of our approach.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128548629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real time watermarking to authenticate the WSQ bitstream
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469556
K. A. Saadi, Khalil Zebbiche, M. Laadjel, M. Morsli
Fingerprints are becoming popular in automated systems and for IT user authentication. They are unique to each person and allow personal identity to be established instantly in real-time applications, so enhancing their security in terms of fidelity and integrity becomes paramount. Since fingerprint images are usually compressed using Wavelet Scalar Quantization (WSQ) before they are transmitted over networks, in this paper we apply a fragile watermarking algorithm operating directly in the compressed domain to protect the evidentiary integrity of the WSQ bitstream. This work is motivated by the results obtained with previous video watermarking methods working in the variable-length codeword (VLC) domain to provide real-time detection. The principle of the method is to map codewords outside the used codespace: the watermark is embedded into the stream as forced bit errors. The developed algorithm achieves high capacity and preserves the file size of the WSQ bitstream while maintaining high perceptual quality.
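A toy illustration of the codespace idea follows: given the set of codewords actually used by a prefix code, it enumerates reachable but unused codewords, which could then carry watermark bits as deliberately forced "errors" that a detector recognizes. This is not the WSQ Huffman table handling of the paper; the code and the example codeword set are purely illustrative.

```python
def unused_codewords(used, max_len=8):
    """Enumerate binary codewords (up to max_len bits) outside the used
    codespace of a prefix code: candidates that are neither a used codeword,
    an extension of one, nor shorter than max_len while still being a prefix
    of a used codeword."""
    free = []
    stack = ["0", "1"]
    while stack:
        c = stack.pop()
        if c in used:
            continue                       # occupied leaf of the code tree
        if any(u.startswith(c) for u in used):
            if len(c) < max_len:           # internal node: descend further
                stack += [c + "0", c + "1"]
            continue
        free.append(c)                     # reachable but unused codeword
    return free

# With used = {"00", "01", "10", "110"}, the only free short codeword is "111":
print(unused_codewords({"00", "01", "10", "110"}))   # ['111']
```

Embedding then amounts to replacing selected used codewords by such "illegal" ones; a decoder that knows the mapping recovers the watermark bits and restores the original symbols, which keeps the detection cheap enough for real time.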
{"title":"Real time watermarking to authenticate the WSQ bitstream","authors":"K. A. Saadi, Khalil Zebbiche, M. Laadjel, M. Morsli","doi":"10.1109/IPTA.2012.6469556","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469556","url":null,"abstract":"Fingerprints are becoming popular in automated systems and for IT system user authentication. They are unique to each person and are designed to allow instant establishment personal identity in real time application. Enhancing their security in terms of fidelity and integrity becomes paramount. Since fingerprint images are usually compressed using Wavelet-packet Scalar Quantization (WSQ) before they are transmitted over networks, in this paper, we apply a fragile watermarking algorithm operating directly in compressed domain for protecting the evidentiary integrity of the WSQ bitstream. This work is motivated by the results obtained in previous video watermarking methods working in variable length codeword (VLC) domain to provide real time detection. The principle of the method is based on mapping the codewords to the outside of the used codespace, the watermark is embedded into stream as forced bit errors. The developed algorithm achieves high capacity and preserves the file size of WSQ bitstream while maintaining high perceptible quality.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"75 15","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114005467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An automatic level set based liver segmentation from MRI data sets
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469551
E. Goceri, M. Z. Unlu, C. Guzelis, O. Dicle
Fast and accurate liver segmentation is a challenging task in medical image analysis. Liver segmentation is an important process for computer-assisted diagnosis, pre-evaluation of liver transplantation and therapy planning of liver tumors. Magnetic resonance imaging has several advantages, such as freedom from ionizing radiation and good contrast visualization of soft tissue, and innovations in recent technology and image acquisition techniques have made it a major tool in modern medicine. However, the use of magnetic resonance images for liver segmentation has developed slowly compared with applications to the central nervous system and the musculoskeletal system. The reasons are the irregular shape, size and position of the liver, contrast-agent effects, and the similarity of the gray values of neighboring organs. Therefore, in this study, we present a fully automatic liver segmentation method based on an approximation of level-set contour evolution, applied to T2-weighted magnetic resonance data sets. The method avoids solving partial differential equations and applies only integer operations within a two-cycle segmentation algorithm. The efficiency of the proposed approach is achieved by applying the algorithm to all slices with a constant number of iterations and by performing the contour evolution without any user-defined initial contour. The results are evaluated with four different similarity measures and show that the automatic segmentation approach gives successful results.
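As a related, PDE-free stand-in (not the authors' integer-only two-cycle algorithm), the sketch below runs scikit-image's morphological Chan-Vese approximation on each T2-weighted slice with a fixed iteration count and a checkerboard initialization instead of a user-defined contour; the normalization and parameter values are assumptions.

```python
import numpy as np
from skimage.segmentation import morphological_chan_vese

def segment_liver_slice(slice_img, n_iter=40):
    """PDE-free contour evolution on one slice. The fixed iteration count
    mirrors the constant-iteration strategy described above, and the
    checkerboard initialization removes the need for a user-drawn contour."""
    img = (slice_img - slice_img.min()) / (np.ptp(slice_img) + 1e-9)
    return morphological_chan_vese(img, n_iter,
                                   init_level_set="checkerboard", smoothing=2)

# volume: 3D array (slices, rows, cols); the same settings are applied to every slice
# masks = np.stack([segment_liver_slice(s) for s in volume])
```

A post-processing step (e.g. keeping the largest connected component) would normally be needed to isolate the liver from other bright soft tissue; that step is omitted here.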
{"title":"An automatic level set based liver segmentation from MRI data sets","authors":"E. Goceri, M. Z. Unlu, C. Guzelis, O. Dicle","doi":"10.1109/IPTA.2012.6469551","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469551","url":null,"abstract":"A fast and accurate liver segmentation method is a challenging work in medical image analysis area. Liver segmentation is an important process for computer-assisted diagnosis, pre-evaluation of liver transplantation and therapy planning of liver tumors. There are several advantages of magnetic resonance imaging such as free form ionizing radiation and good contrast visualization of soft tissue. Also, innovations in recent technology and image acquisition techniques have made magnetic resonance imaging a major tool in modern medicine. However, the use of magnetic resonance images for liver segmentation has been slow when we compare applications with the central nervous systems and musculoskeletal. The reasons are irregular shape, size and position of the liver, contrast agent effects and similarities of the gray values of neighbor organs. Therefore, in this study, we present a fully automatic liver segmentation method by using an approximation of the level set based contour evolution from T2 weighted magnetic resonance data sets. The method avoids solving partial differential equations and applies only integer operations with a two-cycle segmentation algorithm. The efficiency of the proposed approach is achieved by applying the algorithm to all slices with a constant number of iteration and performing the contour evolution without any user defined initial contour. The obtained results are evaluated with four different similarity measures and they show that the automatic segmentation approach gives successful results.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121295351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frequency component extraction from color images for specific sound transformation and analysis
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469573
Gizem Akti, Dionysis Goularas
This paper presents a method for converting images into sound. Frequency components are first extracted from the original image. At this stage, the image is divided into windows that represent consecutive time periods, in the manner of a short-time Fourier transform (STFT). The dominant frequencies of each window are then mapped onto corresponding sound frequencies through Fourier analysis. This procedure is applied twice, producing two series of sound frequency components: the first originates from the brightness of the image, the second from the dominant RGB layer. The connection between the visual impression of the image and the psychoacoustic effect of the sound mapping is made by using different musical scales according to the dominant color of the image. The results revealed that the melody extracted from this analysis produces a certain psychoacoustic impression, as reported by several volunteers. Although the volunteers could not always make the association between image and sound, they could hardly believe that the music was produced by an algorithmic procedure.
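A minimal sketch of the brightness branch of this pipeline is given below: the grayscale image is split into vertical windows (consecutive time periods), the dominant spatial frequencies of each window are found by Fourier analysis, and they are mapped onto an audible band and synthesized as summed sinusoids. The window count, frequency range and synthesis choices are illustrative assumptions; the color and musical-scale branch is omitted.

```python
import numpy as np

def image_to_tone_sequence(gray, n_windows=16, sr=44100, note_dur=0.25,
                           f_lo=220.0, f_hi=1760.0, k=3):
    """Map a grayscale image to a sequence of tones, window by window."""
    h, w = gray.shape
    col_groups = np.array_split(np.arange(w), n_windows)
    t = np.arange(int(sr * note_dur)) / sr
    audio = []
    for idx in col_groups:
        profile = gray[:, idx].mean(axis=1)            # brightness along the window
        spec = np.abs(np.fft.rfft(profile - profile.mean()))
        dominant = np.argsort(spec)[-k:]               # k strongest spatial frequencies
        freqs = f_lo + (f_hi - f_lo) * dominant / max(len(spec) - 1, 1)
        tone = sum(np.sin(2 * np.pi * f * t) for f in freqs) / k
        audio.append(tone)
    return np.concatenate(audio)                       # mono signal at sample rate sr
```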
{"title":"Frequency component extraction from color images for specific sound transformation and analysis","authors":"Gizem Akti, Dionysis Goularas","doi":"10.1109/IPTA.2012.6469573","DOIUrl":"https://doi.org/10.1109/IPTA.2012.6469573","url":null,"abstract":"This paper presents a method allowing the conversion of images into sound. Initially, a frequency component extraction is realized from the original image. At this stage, the image is divided into windows in order to represent consecutive different time periods using STFT. Then, the dominant frequencies of each window are mapped into corresponding sound frequencies through Fourier analysis. This procedure is applied twice and two series of sound frequency components are produced: The first is originated from the brightness of the image, the second from the dominant RGB layer. The connection between the visual impression of the image and the psychoacoustic effect of the sound mapping is done by using different musical scales according to the dominant color of the image. The results revealed that the melody extracted from this analysis produces a certain psychoacoustic impression, as it has reported by several volunteers. Despite the fact that volunteers could not always do the association between image and sound, they could hardly believe that the music was produced by an algorithmic procedure.","PeriodicalId":267290,"journal":{"name":"2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125510566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Flickr-based semantic context to refine automatic photo annotation
Pub Date: 2012-10-01 | DOI: 10.1109/IPTA.2012.6469550
Amel Ksibi, Mouna Dammak, A. Ammar, M. Mejdoub, C. Amar
The automatic photo annotation task aims to describe semantic content by detecting high-level concepts in order to further facilitate concept-based video retrieval. Most existing approaches are based on independent semantic concept detectors and do not consider the contextual correlation between concepts, which limits the effectiveness of such systems. Recently, harnessing contextual information to improve concept detection has become a promising direction in this field. In this paper, we propose a new context-based annotation refinement process. For this purpose, we define a new semantic measure called “Second Order Co-occurrence Flickr Context Similarity” (SOCFCS), which extracts the semantic context correlation between two concepts by exploring Flickr resources (Flickr related tags). Our measure extends the FCS measure by taking into consideration the FCS values of the common Flickr related tags of the two target concepts. The proposed measure is used to build a concept network that models the semantic context inter-relationships among concepts. A Random Walk with Restart process is performed over this network to refine the annotation results by exploiting the contextual correlation among concepts. Experimental studies are conducted on the ImageCLEF 2011 collection, containing 10,000 images and 99 concepts, and the results demonstrate the effectiveness of the proposed approach.
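A minimal Random Walk with Restart sketch over a concept affinity matrix is shown below; the matrix W would be built from a context similarity such as SOCFCS, and the restart probability and convergence settings are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def random_walk_with_restart(W, scores, restart=0.4, n_iter=100, tol=1e-8):
    """Refine per-image concept scores with a Random Walk with Restart over a
    concept affinity matrix W. `scores` are the initial concept-detector
    outputs for one image; the returned vector is the refined ranking."""
    W = np.asarray(W, dtype=float)
    P = W / np.maximum(W.sum(axis=0, keepdims=True), 1e-12)   # column-normalize
    e = scores / max(scores.sum(), 1e-12)                     # restart distribution
    r = e.copy()
    for _ in range(n_iter):
        r_new = (1 - restart) * P @ r + restart * e           # r = (1-c) P r + c e
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r
```

Concepts strongly connected to confidently detected ones are boosted, while contextually isolated detections are damped, which is the refinement effect described above.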