Segmentation of Medical Ultrasound Images using Active Contours
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379878
O. Michailovich, A. Tannenbaum
Segmentation of medical ultrasound images (e.g., for the purpose of surgical or radiotherapy planning) is known to be a difficult task due to the relatively low resolution and reduced contrast of the images, as well as the discontinuity and uncertainty of segmentation boundaries caused by speckle noise. Under such conditions, useful segmentation results seem to be achievable only by means of relatively complex algorithms, which are usually computationally involved and/or require prior learning. In this paper, a different approach to the problem of segmenting medical ultrasound images is proposed. In particular, we propose to preprocess the images before they are subjected to a segmentation procedure. The proposed preprocessing modifies the images (without affecting their anatomic content) so that the resulting images can be effectively segmented by relatively simple and computationally efficient means. The performance of the proposed method is tested in a series of both in silico and in vivo experiments.
{"title":"Segmentation of Medical Ultrasound Images using Active Contours","authors":"O. Michailovich, A. Tannenbaum","doi":"10.1109/ICIP.2007.4379878","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379878","url":null,"abstract":"Segmentation of medical ultrasound images (e.g., for the purpose of surgical or radiotherapy planning) is known to be a difficult task due to the relatively low resolution and reduced contrast of the images, as well as due to the discontinuity and uncertainty of segmentation boundaries caused by speckle noise. Under such conditions, useful segmentation results seem to be only achievable by means of relatively complex algorithms, which are usually computationally involved and/or require a prior learning. In this paper, a different approach to the problem of segmentation of medical ultrasound images is proposed. In particular, we propose to preprocess the images before they are subjected to a segmentation procedure. The proposed preprocessing modifies the images (without affecting their anatomic contents) so that the resulting images can be effectively segmented by relatively simple and computationally efficient means. The performance of the proposed method is tested in a series of both in silico and in vivo experiments.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134318293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Novel Facial Feature Point Localization Method on 3D Faces
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379248
Peng Guan, Yaoliang Yu, Liming Zhang
Although 2D-based face recognition methods have made great progress in the past decades, some problems remain unsolved, such as variations in pose, illumination, and expression (PIE). Recently, more and more researchers have focused on 3D-based face recognition approaches. Among these techniques, facial feature point localization plays an important role in representing and matching 3D faces. In this paper, we present a novel feature point localization method for 3D faces that combines a global shape model with a local surface model. A Bézier surface is introduced to represent the local structure around different feature points, and the global shape model is used to constrain the local search results. Experimental results comparing our method with curvature analysis show the feasibility and efficiency of the new idea.
{"title":"A Novel Facial Feature Point Localization Method on 3D Faces","authors":"Peng Guan, Yaoliang Yu, Liming Zhang","doi":"10.1109/ICIP.2007.4379248","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379248","url":null,"abstract":"Although 2D-based face recognition methods have made great progress in the past decades, there are also some unsolved problems such as PIE. Recently, more and more researchers have focused on 3D-based face recognition approaches. Among these techniques, facial feature point localization plays an important role in representing and matching 3D faces. In this paper, we present a novel feature point localization method on 3D faces combining global shape model and local surface model. Bezier surface is introduced to represent local structure of different feature points and global shape model is utilized to constrain the local search result. Experimental results based on comparison of our method and curvature analysis show the feasibility and efficiency of the new idea.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134359246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Knowledge Structuring Technique for Image Classification
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379600
Le Dong, E. Izquierdo
A system for image analysis and classification based on a knowledge structuring technique is presented. The knowledge structuring technique automatically creates a relevance map from salient areas of natural images. It also derives a set of well-structured representations from low-level descriptions to drive the final classification. The backbone of the knowledge structuring technique is a distribution mapping strategy involving two basic modules: structured low-level feature extraction using a convolutional neural network, and a topology representation module based on a growing cell structure network. Classification is achieved by simulating high-level top-down visual information perception and classifying with an incremental Bayesian parameter estimation method. The proposed modular system architecture offers straightforward expansion to include user relevance feedback, contextual input, and multimodal information where available.
{"title":"A Knowledge Structuring Technique for Image Classification","authors":"Le Dong, E. Izquierdo","doi":"10.1109/ICIP.2007.4379600","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379600","url":null,"abstract":"A system for image analysis and classification based on a knowledge structuring technique is presented. The knowledge structuring technique automatically creates a relevance map from salient areas of natural images. It also derives a set of well-structured representations from low-level description to drive the final classification. The backbone of the knowledge structuring technique is a distribution mapping strategy involving two basic modules: structured low-level feature extraction using convolution neural network and a topology representation module based on a growing cell structure network. Classification is achieved by simulating high-level top-down visual information perception and classifying using an incremental Bayesian parameter estimation method. The proposed modular system architecture offers straightforward expansion to include user relevance feedback, contextual input, and multimodal information if available.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131475158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate Dynamic Scene Model for Moving Object Detection
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379545
Hong Yang, Yihua Tan, J. Tian, Jian Liu
The adaptive pixel-wise Gaussian mixture model (GMM) is a popular method for modeling dynamic scenes viewed by a fixed camera. However, it is not a trivial problem for a GMM to capture the accurate mean and variance of a complex pixel. This paper presents a two-layer Gaussian mixture model (TLGMM) of dynamic scenes for moving object detection. The first layer, the real model, deals specifically with gradually changing pixels; the second layer, the on-ready model, focuses on pixels that change significantly and irregularly. TLGMM can represent dynamic scenes more accurately and effectively. Additionally, long-term and short-term variances are taken into account to alleviate the transparency problems faced by pixel-based methods.
{"title":"Accurate Dynamic Scene Model for Moving Object Detection","authors":"Hong Yang, Yihua Tan, J. Tian, Jian Liu","doi":"10.1109/ICIP.2007.4379545","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379545","url":null,"abstract":"Adaptive pixel-wise Gaussian mixture model (GMM) is a popular method to model dynamic scenes viewed by a fixed camera. However, it is not a trivial problem for GMM to capture the accurate mean and variance of a complex pixel. This paper presents a two-layer Gaussian mixture model (TLGMM) of dynamic scenes for moving object detection. The first layer, namely real model, deals with gradually changing pixels specially; the second layer, called on-ready model, focuses on those pixels changing significantly and irregularly. TLGMM can represent dynamic scenes more accurately and effectively. Additionally, a long term and a short term variance are taken into account to alleviate the transparent problems faced by pixel-based methods.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"138 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132795622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using a Markov Network to Recognize People in Consumer Images
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4380061
Andrew C. Gallagher, Tsuhan Chen
Markov networks are an effective tool for the difficult but important problem of recognizing people in consumer image collections. Given a small set of labeled faces, we seek to recognize the other faces in an image collection. The constraints of the problem are exploited when forming the Markov network edge potentials. Inference is also used to suggest faces for the user to label, minimizing the work on the part of the user. In one test set containing 4 individuals, an 86% recognition rate is achieved with only 3 labeled examples.
{"title":"Using a Markov Network to Recognize People in Consumer Images","authors":"Andrew C. Gallagher, Tsuhan Chen","doi":"10.1109/ICIP.2007.4380061","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4380061","url":null,"abstract":"Markov networks are an effective tool for the difficult but important problem of recognizing people in consumer image collections. Given a small set of labeled faces, we seek to recognize the other faces in an image collection. The constraints of the problem are exploited when forming the Markov network edge potentials. Inference is also used to suggest faces for the user to label, minimizing the work on the part of the user. In one test set containing 4 individuals, an 86% recognition rate is achieved with only 3 labeled examples.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131016192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast Mode Decision for Intra Prediction in H.264/AVC Encoder
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379830
Byeongdu La, Minyoung Eom, Yoonsik Choe
The H.264/AVC video coding standard uses rate-distortion optimization (RDO) to improve compression performance in intra prediction. Although this method selects the best coding mode for the current macroblock, it increases the computational complexity compared with previous standards. In this paper, a fast intra mode decision algorithm for the H.264/AVC encoder based on the dominant edge direction (DED) is proposed. The algorithm uses an approximation of the discrete cosine transform (DCT) coefficient formula. By detecting the DED before intra prediction, 3 modes instead of 9 are chosen for the RDO calculation to decide the best mode for a 4×4 luma block. For 16×16 luma and 8×8 chroma blocks, only 2 modes instead of 4 are chosen. Experimental results show that the computation time of the proposed algorithm is reduced to about 71% of that of the full-search method in the reference code, with negligible quality loss.
{"title":"Fast Mode Decision for Intra Prediction in H.264/AVC Encoder","authors":"Byeongdu La, Minyoung Eom, Yoonsik Choe","doi":"10.1109/ICIP.2007.4379830","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379830","url":null,"abstract":"The H.264/AVC video coding standard uses the rate distortion optimization (RDO) method to improve the compression performance in the intra prediction. Whereas the computational complexity is increased comparing with previous standards due to this method, even though this standard selects the best coding mode for the current macroblock. In this paper, a fast intra mode decision algorithm for H.264/AVC encoder based on dominant edge direction (DED) is proposed. The algorithm uses the approximation of discrete cosine transform (DCT) coefficient formula. By detecting the DED before intra prediction, 3 modes instead of 9 modes are chosen for RDO calculation to decide the best mode in the 4 times 4 luma block. For the 16 times 16 luma and the 8 times 8 chroma, instead of 4 modes, only 2 modes are chosen. Experimental results show that the computation time of the proposed algorithm is decreased to about 71% of the full search method in the reference code with negligible quality loss.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131160945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Segmentation of Torn Frames using the Graph Cuts Technique
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379015
D. Corrigan, N. Harte, A. Kokaram
Film tear, the physical ripping of the film material, is a form of degradation in archived film. A tear causes displacement of a region of the degraded frame and the loss of image data along the boundary of the tear. In [1], a restoration algorithm was proposed to correct the displacement introduced by the tear by estimating the global motion of the two regions on either side of it. However, the algorithm depended on a user-defined segmentation to divide the frame. This paper presents a new fully automated segmentation algorithm which divides affected frames along the tear. The algorithm employs the graph cuts optimisation technique and uses temporal intensity differences, rather than spatial gradient, to describe the boundary properties of the segmentation. Segmentations produced with the proposed algorithm agree well with the perceived correct segmentation.
{"title":"Automated Segmentation of Torn Frames using the Graph Cuts Technique","authors":"D. Corrigan, N. Harte, A. Kokaram","doi":"10.1109/ICIP.2007.4379015","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379015","url":null,"abstract":"Film Tear is a form of degradation in archived film and is the physical ripping of the film material. Tear causes displacement of a region of the degraded frame and the loss of image data along the boundary of tear. In [1], a restoration algorithm was proposed to correct the displacement in the frame introduced by the tear by estimating the global motion of the 2 regions either side of the tear. However, the algorithm depended on a user-defined segmentation to divide the frame. This paper presents a new fully-automated segmentation algorithm which divides affected frames along the tear. The algorithm employs the graph cuts optimisation technique and uses temporal intensity differences, rather than spatial gradient, to describe the boundary properties of the segmentation. Segmentations produced with the proposed algorithm agree well with the perceived correct segmentation.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131174104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Characterizing packet-loss impairments in compressed video
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4379769
A. Reibman, D. Poole
We examine metrics that predict the visibility of packet losses in MPEG-2 and H.264 compressed video. We use subjective data spanning a wide range of parameters, including different error concealment strategies and different compression standards. We evaluate SSIM, MSE, and a slice-boundary mismatch (SBM) metric for their effectiveness at characterizing packet-loss impairments.
{"title":"Characterizing packet-loss impairments in compressed video","authors":"A. Reibman, D. Poole","doi":"10.1109/ICIP.2007.4379769","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379769","url":null,"abstract":"We examine metrics to predict the visibility of packet losses in MPEG-2 and H.264 compressed video. We use subjective data that has a wide range of parameters, including different error concealment strategies and different compression standards. We evaluate SSIM, MSE, and a slice-boundary mismatch (SBM) metric for their effectiveness at characterizing packet-loss impairments.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130953438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finding Regions of Interest in Home Videos Based on Camera Motion
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4380075
Golnaz Abdollahian, E. Delp
In this paper, we propose an algorithm for identifying regions of interest (ROIs) in video, particularly in keyframes extracted from home video. Camera motion is introduced as a new factor that can influence visual saliency. The global motion parameters are used to generate location-based importance maps, which can be combined with saliency maps calculated from other visual and high-level features. Here, we employ contrast-based saliency as an important low-level factor, along with face detection as a high-level feature.
{"title":"Finding Regions of Interest in Home Videos Based on Camera Motion","authors":"Golnaz Abdollahian, E. Delp","doi":"10.1109/ICIP.2007.4380075","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4380075","url":null,"abstract":"In this paper, we propose an algorithm for identifying regions of interest (ROIs) in video, particularly for the keyframes extracted from a home video. The camera motion is introduced as a new factor that can influence the visual saliency. The global motion parameters are used to generate location-based importance maps. These maps can be combined with other saliency maps calculated using other visual and high-level features. Here, we employed the contrast-based saliency as an important low level factor along with face detection as a high level feature in our approach.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131126565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tamper Detection Based on Regularity of Wavelet Transform Coefficients
Pub Date: 2007-11-12, DOI: 10.1109/ICIP.2007.4378975
Y. Sutcu, Baris Coskun, H. Sencar, N. Memon
Powerful digital media editing tools make producing good-quality forgeries very easy for almost anyone. Therefore, proving the authenticity and integrity of digital media becomes increasingly important. In this work, we propose a simple method to detect image tampering operations that involve sharpness/blurriness adjustment. Our approach is based on the assumption that if a digital image undergoes a copy-paste type of forgery, the average sharpness/blurriness value of the forged region is expected to differ from that of the non-tampered parts of the image. The sharpness/blurriness of an image is estimated from the regularity properties of its wavelet transform coefficients, which involves measuring the decay of the coefficients across scales. Our preliminary results show that the estimated sharpness/blurriness scores can be used to identify tampered areas of the image.
{"title":"Tamper Detection Based on Regularity of Wavelet Transform Coefficients","authors":"Y. Sutcu, Baris Coskun, H. Sencar, N. Memon","doi":"10.1109/ICIP.2007.4378975","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4378975","url":null,"abstract":"Powerful digital media editing tools make producing good quality forgeries very easy for almost anyone. Therefore, proving the authenticity and integrity of digital media becomes increasingly important. In this work, we propose a simple method to detect image tampering operations that involve sharpness/blurriness adjustment. Our approach is based on the assumption that if a digital image undergoes a copy-paste type of forgery, average sharpness/blurriness value of the forged region is expected to be different as compared to the non-tampered parts of the image. The method of estimating sharpness/blurriness value of an image is based on the regularity properties of wavelet transform coefficients which involves measuring the decay of wavelet transform coefficients across scales. Our preliminary results show that the estimated sharpness/blurriness scores can be used to identify tampered areas of the image.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132882413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}