Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957049
E. Bigorgne, C. Achard, J. Devars
This paper presents an effective use of local descriptors for object and scene recognition and indexing. The approach is in keeping with model-based recognition systems and extends standard point-to-point matching between two images. To this end, we address the use of Full-Zernike moments as a reliable local characterization of the image signal. A fundamental characteristic of the descriptors used is their ability to "absorb" a given set of potential image modifications. Their design draws principally on the theory of invariants. A built-in invariance to similarities makes it possible to handle narrowly bounded perspective transformations. Moreover, we provide a study of the substantial and virtually cost-free contribution of color information. To achieve photometric invariance, different types of normalization are evaluated on a model-based object recognition task.
Title: "A local color descriptor for efficient scene-object recognition" (Proceedings 11th International Conference on Image Analysis and Processing)
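As a concrete illustration of the kind of local characterization this abstract describes, a rotation-invariant Zernike-moment descriptor of a gray-level patch can be sketched as follows. The patch size, maximum order, and unit-disk sampling are assumptions; this is a minimal sketch, not the authors' exact Full-Zernike construction.

```python
import math
import numpy as np

def zernike_radial(n, m, rho):
    """Radial polynomial R_nm(rho) of the Zernike basis."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        out += c * rho ** (n - 2 * s)
    return out

def zernike_descriptor(patch, max_order=4):
    """Rotation-invariant |Z_nm| of a square gray patch mapped to the unit disk."""
    h, w = patch.shape
    y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    mask = rho <= 1.0
    feats = []
    for n in range(max_order + 1):
        for m in range(n % 2, n + 1, 2):      # n - |m| must be even
            basis = zernike_radial(n, m, rho) * np.exp(-1j * m * theta)
            z = (n + 1) / math.pi * (patch[mask] * np.conj(basis[mask])).sum()
            feats.append(abs(z))
    return np.array(feats)
```

Because an in-plane rotation of the patch only changes the phase of each moment, keeping the magnitudes gives the built-in rotation part of the similarity invariance the abstract refers to; scale and photometric invariance require additional normalization.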
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.956984
Yingjie Wang, C. Chua, Yeong-Khing Ho, Ying Ren
This paper presents a feature-based face recognition system based on both 3D range data and 2D gray-level facial images. Ten 2D feature points and four 3D feature points are designed to be robust against changes of facial expression and viewpoint; they are described by Gabor filter responses in the 2D domain and by point signatures in the 3D domain. Feature points in a new facial image are localized using the 3D-2D correspondence, an average layout, and a corresponding bunch (covering a wide range of possible variations at each point). Shape features extracted from the 3D feature points and texture features extracted from the 2D feature points are first projected into their own subspaces using PCA. In subspace, the corresponding shape and texture weight vectors are then integrated into an augmented vector that represents each facial image. For a given test image, the best match in the model library is identified by a classifier; a similarity function and a support vector machine (SVM) are the two classifiers considered. Experimental results involving persons with different facial expressions, captured from different viewpoints, demonstrate the efficiency of our algorithm.
Title: "Integrated 2D and 3D images for face recognition"
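The projection-and-fusion step of the abstract can be sketched as below. The feature dimensions, the equal weighting of the two modalities, and the cosine-similarity classifier are illustrative assumptions (the paper also considers an SVM classifier, not shown here).

```python
import numpy as np

def pca_fit(X, k):
    """Mean and top-k principal axes (columns) of the rows of X, via SVD."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k].T                              # shapes (d,), (d, k)

def augmented_vector(shape_feat, texture_feat, shape_pca, texture_pca):
    """Project each modality into its own subspace, then concatenate the
    weight vectors into one augmented face representation."""
    (mu_s, W_s), (mu_t, W_t) = shape_pca, texture_pca
    return np.concatenate([(shape_feat - mu_s) @ W_s,
                           (texture_feat - mu_t) @ W_t])

def best_match(query, gallery):
    """Index of the gallery vector most similar to the query (cosine)."""
    qn = query / (np.linalg.norm(query) + 1e-12)
    sims = [qn @ g / (np.linalg.norm(g) + 1e-12) for g in gallery]
    return int(np.argmax(sims))
```

Each modality gets its own PCA subspace so that shape and texture statistics are decorrelated independently before fusion.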
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957078
Jitendra Malik
We develop a two-stage framework for parsing and understanding images: a segmentation process that groups pixels into regions of coherent color and texture, and a recognition process that compares assemblies of such regions, hypothesized to correspond to a single object, with views of stored prototypes. We treat segmenting images into regions as an optimization problem: partition the image so that similarity is high within each region and low across regions. This is formalized as the minimization of the normalized cut between regions. Using ideas from spectral graph theory, the minimization can be set up as an eigenvalue problem. Visual attributes such as color, texture, contour and motion are encoded in this framework through suitable specification of the graph edge weights. The recognition problem requires us to compare assemblies of image regions with previously stored prototypical views of known objects. We have devised a novel algorithm for shape matching based on a relationship descriptor called the shape context. This lets us compute similarity measures between shapes which, together with similarity measures for texture and color, can be used for object recognition. The shape matching algorithm has yielded excellent results on a variety of 2D and 3D recognition problems.
Title: "Visual grouping and object recognition"
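The eigenvalue formulation of the normalized cut mentioned in the abstract can be sketched for a two-way partition as follows; the similarity matrix is taken as given here, whereas in the framework described above its entries would encode color, texture, contour and motion affinities.

```python
import numpy as np

def ncut_bipartition(W):
    """Two-way normalized cut of a weighted graph: threshold the second
    eigenvector of the generalized problem (D - W) y = lambda * D y,
    solved via the symmetrically normalized Laplacian."""
    d = W.sum(axis=1)
    d_isqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W                        # unnormalized Laplacian
    Lsym = d_isqrt[:, None] * L * d_isqrt[None, :]
    _, vecs = np.linalg.eigh(Lsym)            # eigenvalues ascending
    y = d_isqrt * vecs[:, 1]                  # back to generalized eigenvector
    return y > np.median(y)                   # boolean group labels
```

Recursive application of this bipartition to each group yields a multi-region segmentation.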
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957004
A. Plebe
This work describes a method for filtering image sequences degraded by noise, in which the main object moves with an almost periodic displacement. This object is assumed to be the only region of interest in the image, and tracking its movement against the background is the goal of the image processing. Under such circumstances, it is argued that a noise reduction strategy based on knowledge of the motion is more efficient than other classical methods for dynamic image sequences. This kind of problem is not unusual in the processing of scientific images, especially in the medical field, where the presence of noise is critical not only because it degrades visual quality, but also because it impairs subsequent processing tasks such as analysis and clinical interpretation.
Title: "Noise filtering of periodic image sequences"
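The motion-knowledge-based strategy the abstract argues for can be illustrated in its simplest form: motion-compensated temporal averaging with integer displacements known for each frame of a period. The periodic boundary handling via `np.roll` is a simplification, not the paper's method.

```python
import numpy as np

def periodic_average(frames, displacements):
    """Motion-compensated temporal averaging: undo the known per-frame
    displacement of the tracked object, then average over one period.
    Zero-mean noise is attenuated by ~1/sqrt(len(frames)) while the
    registered object stays sharp."""
    acc = np.zeros_like(frames[0], dtype=float)
    for frame, (dy, dx) in zip(frames, displacements):
        acc += np.roll(frame, shift=(-dy, -dx), axis=(0, 1))
    return acc / len(frames)
```

A frame-independent filter would instead blur the moving object; registering on the object first is what makes the averaging safe.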
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957064
C. Sacchi, C. Regazzoni, G. Vernazza
Interest in advanced video-based surveillance applications has been increasing lately. This is especially true in urban railway transport, where video-based surveillance can be exploited to address many relevant security issues (e.g. vandalism, overcrowding, abandoned-object detection). This paper investigates an open problem in the implementation of video-based surveillance systems for transport applications: building reliable image-understanding modules that recognize dangerous situations with low false-alarm and misdetection rates. We consider a neural network-based classifier for detecting vandal behavior in metro stations. The results show that the classifier performs very well even in the presence of high scene complexity.
Title: "A neural network-based image processing system for detection of vandal acts in unmanned railway environments"
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957073
D. Toth, T. Aach
A pattern recognition system used for industrial inspection has to be highly reliable and fast: reliability is essential to reduce the cost of incorrect decisions, while speed is necessary for real-time operation. We address the problem of inspecting optical media such as compact disks and digital versatile disks. As the disks are checked during production and the output of the production line has to be sufficiently high, the time available for the whole examination is very short, i.e., about one second per disk. In such real-time applications, the well-known minimum distance algorithm is often used as the classifier. However, its main drawback is its unreliability when the training data are not well clustered in feature space. Here we describe a method for off-line outlier detection, based on a statistical test, which cleans the training data set and yields substantially better classification results. In addition, two improved versions of the minimum distance classifier are presented, both of which yield higher rates of correct classification with practically no loss of speed. To evaluate the results, we compare them to those obtained with a standard minimum distance classifier, a k-nearest neighbor classifier, and a fuzzy k-nearest neighbor classifier.
Title: "Improved minimum distance classification with Gaussian outlier detection for industrial inspection"
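The clean-then-classify pipeline can be sketched as below. A simple per-feature z-score rule stands in for the statistical test of the paper, and the two improved classifier variants are not reproduced.

```python
import numpy as np

def clean_training_set(X, y, z_thresh=3.0):
    """Drop samples lying more than z_thresh standard deviations from
    their class mean in any feature (a per-feature Gaussian test)."""
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        mu, sd = X[idx].mean(axis=0), X[idx].std(axis=0) + 1e-12
        z = np.abs((X[idx] - mu) / sd).max(axis=1)
        keep[idx[z > z_thresh]] = False
    return X[keep], y[keep]

def min_dist_classify(x, X, y):
    """Assign x to the class whose (cleaned) mean is nearest."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(means - x, axis=1))]
```

Because the class means are recomputed after cleaning, a single gross outlier no longer drags a prototype toward the wrong region of feature space.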
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957030
Marco Carcassoni, E. Hancock
The modal correspondence method of L.S. Shapiro and J.M. Brady (see Image and Vision Computing, vol. 10, p. 283-8, 1992) aims to match point-sets by comparing the eigenvectors of a pairwise point proximity matrix. Although elegant in its matrix representation, the method is notoriously susceptible to differences in the relational structure of the point-sets under consideration. We demonstrate how the method can be made robust to structural differences by adopting a hierarchical approach. We place the modal matching problem in a probabilistic setting in which the arrangement of pairwise clusters constrains the individual point correspondences. We commence with an iterative pairwise clustering method that locates the main structure in the point-sets under study. Once the point clusters are located, we compute within-cluster and between-cluster proximity matrices. The modal coefficients of these two sets of proximity matrices are used to compute the probabilities that the detected cluster-centres are in correspondence, and the probabilities that individual points are in correspondence. We develop an evidence-combining framework which draws on these two sets of probabilities to locate point correspondences. In this way, the arrangement of the cluster-centre correspondences constrains the individual point correspondences.
Title: "A hierarchical framework for modal correspondence matching"
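The base (non-hierarchical) Shapiro-Brady step can be sketched as follows. The Gaussian proximity kernel, the sign-fixing heuristic, and the greedy row matching are illustrative assumptions, and the paper's actual contribution — the probabilistic, cluster-constrained extension — is not reproduced.

```python
import numpy as np

def modal_basis(pts, sigma=1.0, k=3):
    """Leading k eigenvectors of the Gaussian proximity matrix, with the
    per-mode sign fixed so the largest-magnitude entry is positive."""
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    vals, vecs = np.linalg.eigh(np.exp(-d2 / (2 * sigma ** 2)))
    V = vecs[:, np.argsort(-vals)[:k]]            # largest modes first
    for j in range(k):
        if V[np.argmax(np.abs(V[:, j])), j] < 0:  # resolve sign ambiguity
            V[:, j] = -V[:, j]
    return V

def modal_match(pts_a, pts_b, sigma=1.0, k=3):
    """For each point of A, the index of the modally closest point of B."""
    Va = modal_basis(pts_a, sigma, k)
    Vb = modal_basis(pts_b, sigma, k)
    cost = ((Va[:, None, :] - Vb[None, :, :]) ** 2).sum(-1)
    return cost.argmin(axis=1)
```

Each point is described by its row of modal coefficients, so corresponding points of two structurally similar sets end up with nearly identical rows.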
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957043
G.S.K. Fung, N. Yung, G. Pang, A. Lai
For accurate scene analysis in monocular image sequences, robust segmentation of moving objects from the static background is generally required. However, moving cast shadows may lead to inaccurate object segmentation and, as a result, to further erroneous scene analysis. We develop an effective method for detecting moving cast shadows in monocular color image sequences. First, drawing on the characteristics of shadow in luminance, chrominance and gradient density, we compute a shadow confidence score, an indicator of the probability that a region is cast shadow. Second, the Canny edge detector is applied to the detected region, and the resulting edge pixels are bounded by their convex hull, which estimates the position of the object. Finally, by analyzing the shadow confidence score and the bounding hull, cast shadow is identified as the regions outside the bounding hull that have a high shadow confidence score. A number of typical outdoor scenes are evaluated, and the method is shown to effectively separate the cast shadow from the object of interest.
Title: "Effective moving cast shadow detection for monocular color image sequences"
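The luminance and chrominance cues behind a shadow confidence score can be sketched per pixel as below. The luminance band and chromaticity tolerance are illustrative thresholds, and the gradient-density cue and the convex-hull step of the paper are not reproduced.

```python
import numpy as np

def shadow_confidence(frame, background, lum_band=(0.4, 0.95), chroma_tol=0.1):
    """Per-pixel shadow confidence: a cast shadow darkens luminance by a
    bounded factor while leaving the chromaticity of the underlying
    surface nearly unchanged; a foreground object usually changes both."""
    eps = 1e-6
    ratio = (frame.mean(axis=2) + eps) / (background.mean(axis=2) + eps)
    lum_ok = (ratio > lum_band[0]) & (ratio < lum_band[1])
    chroma_f = frame / (frame.sum(axis=2, keepdims=True) + eps)
    chroma_b = background / (background.sum(axis=2, keepdims=True) + eps)
    chroma_ok = np.abs(chroma_f - chroma_b).sum(axis=2) < chroma_tol
    return (lum_ok & chroma_ok).astype(float)
```

Pixels flagged here that also fall outside the object's estimated hull would then be labeled cast shadow rather than object.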
Pub Date: 2001-09-26 · DOI: 10.1109/ICIAP.2001.957002
I. Haritaoglu, M. Flickner
We describe a real-time vision system for electronic billboards that can detect and count the number of people standing in front of the billboard, determine how long they have been looking at the advertisements currently shown, and automatically obtain demographic information about the audience in order to decide when and which advertisements should be shown to reach a targeted audience. As billboard locations, such as coffee shops and stores, are very crowded areas where people are waiting or moving together, individual persons cannot be isolated and are partially or totally occluded by other people. We therefore combine silhouette- and motion-based people detection with fast infrared illumination-based pupil detection to detect people and determine whether or not they are looking at the billboard. Experimental results demonstrate the robustness and real-time performance of the algorithm.
Title: "Attentive billboards"
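One common realization of infrared illumination-based pupil detection is the bright-pupil differencing scheme sketched below; whether the system uses exactly this on-axis/off-axis arrangement is an assumption, and the image names are hypothetical.

```python
import numpy as np

def pupil_candidates(on_axis, off_axis, rel_thresh=0.5):
    """Bright-pupil candidates: under on-axis infrared illumination the
    retina retro-reflects, so pupils show up as small bright blobs in the
    difference between on-axis and off-axis illuminated frames."""
    diff = np.clip(on_axis.astype(float) - off_axis.astype(float), 0.0, None)
    mask = diff > rel_thresh * diff.max()
    return np.argwhere(mask)               # (row, col) of candidate pixels
```

Detected pupil pairs indicate a face oriented toward the camera, which is what lets the system decide whether a person is looking at the billboard.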