In this work we consider the problem of soccer team discrimination. The proposed approach starts from monocular images acquired by a stationary camera. The first step is soccer player detection, performed by means of background subtraction. An algorithm based on pixel energy content has been implemented to detect moving objects. The use of energy information, combined with a temporal sliding-window procedure, makes the detection substantially independent of motion hypotheses. Colour histograms in RGB space are extracted from each player and passed to the unsupervised classification phase. This phase is composed of two distinct modules: first, a modified version of the BSAS clustering algorithm builds the clusters for each class of objects; then, at runtime, each player is classified by evaluating its distance, in feature space, from the previously detected classes. The algorithms have been tested on different real soccer matches of the Italian Serie A.
{"title":"An Unsupervised Approach for Segmentation and Clustering of Soccer Players","authors":"P. Spagnolo, N. Mosca, M. Nitti, A. Distante","doi":"10.1109/IMVIP.2007.10","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.10","url":null,"abstract":"In this work we consider the problem of soccer team discrimination. The proposed approach starts from monocular images acquired by a stationary camera. The first step is soccer player detection, performed by means of background subtraction. An algorithm based on pixel energy content has been implemented to detect moving objects. The use of energy information, combined with a temporal sliding-window procedure, makes the detection substantially independent of motion hypotheses. Colour histograms in RGB space are extracted from each player and passed to the unsupervised classification phase. This phase is composed of two distinct modules: first, a modified version of the BSAS clustering algorithm builds the clusters for each class of objects; then, at runtime, each player is classified by evaluating its distance, in feature space, from the previously detected classes. The algorithms have been tested on different real soccer matches of the Italian Serie A.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133183729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
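The clustering and runtime classification steps described in this abstract can be illustrated with a minimal sketch. It implements the textbook Basic Sequential Algorithmic Scheme (BSAS), not the authors' modified version, and it assumes plain numeric feature vectors with Euclidean distance; the names `bsas`, `classify`, `theta`, and `max_clusters` are hypothetical.

```python
from math import dist


def bsas(samples, theta, max_clusters):
    """Textbook BSAS: a single pass that opens a new cluster whenever a
    sample is farther than `theta` from every existing centroid."""
    centroids = []  # running mean of each cluster
    counts = []     # number of samples absorbed by each cluster
    labels = []     # cluster index assigned to each sample
    for x in samples:
        if centroids:
            d, k = min((dist(x, c), i) for i, c in enumerate(centroids))
        else:
            d, k = float("inf"), -1
        if d > theta and len(centroids) < max_clusters:
            # Too far from all clusters: start a new one.
            centroids.append(list(x))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Assign to the nearest cluster and update its mean.
            counts[k] += 1
            centroids[k] = [c + (xi - c) / counts[k]
                            for c, xi in zip(centroids[k], x)]
            labels.append(k)
    return centroids, labels


def classify(x, centroids):
    """Runtime step: index of the nearest previously built cluster."""
    return min(range(len(centroids)), key=lambda i: dist(x, centroids[i]))


samples = [(0, 0), (0.2, 0), (5, 5), (5.2, 5), (0, 0.2)]
centroids, labels = bsas(samples, theta=1.0, max_clusters=4)
print(labels, classify((4.8, 5.1), centroids))
```

In the setting of the abstract the feature vectors would be RGB colour histograms of the detected players; the two-dimensional points here are only for readability.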
Users often select training images from video sequences at random, but it is hard for users to know the correctness of selection for the system. In this paper, we propose a small improvement to select training images, and incorporate a simple technique for foreground detection in grayscale video sequences.
{"title":"The Improvement of the Background Subtraction and Shadow Detection in Grayscale Video Sequences","authors":"Yung-Gi Wu, Chung-Ying Tsai","doi":"10.1109/IMVIP.2007.41","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.41","url":null,"abstract":"Users often select training images from video sequences at random, but it is hard for users to know the correctness of selection for the system. In this paper, we propose a small improvement to select training images, and incorporate a simple technique for foreground detection in grayscale video sequences.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131072373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
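As a generic illustration of foreground detection in grayscale video, the following sketch builds a per-pixel median background from a set of training frames and thresholds the absolute difference. This is a common baseline, not the method of the paper; the function names and the threshold value are assumptions, and frames are represented as nested lists of intensities.

```python
from statistics import median


def build_background(training_frames):
    """Per-pixel median over the training frames (lists of rows)."""
    rows = len(training_frames[0])
    cols = len(training_frames[0][0])
    return [[median(f[r][c] for f in training_frames) for c in range(cols)]
            for r in range(rows)]


def foreground_mask(frame, background, threshold=25):
    """True where the pixel differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


training = [
    [[10, 10], [10, 10]],
    [[12, 10], [10, 200]],  # one outlier pixel, e.g. a passing object
    [[11, 10], [10, 10]],
]
background = build_background(training)
print(foreground_mask([[11, 90], [10, 12]], background))
```

The median makes the model robust to a few training frames that contain moving objects, which is one reason the choice of training images matters.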
To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful, and hence corner detection is often used for this purpose. However, to obtain a higher level description of an image, such as identification of objects, additional information such as edges is required, and more recently detectors have been proposed that find both edges and corners. Recently, finite-element based methods have been used to develop gradient operators for edge detection that have improved angular accuracy over standard techniques. We extend this work to corner detection, enabling edge and corner detection to be integrated. In addition we present a combined operator, enabling edge and corner detection to be achieved concurrently, and we demonstrate that accuracy is comparable to well-known existing corner detectors and edge detectors, and, as standard post-smoothing of the corner map is not required, significantly reduced computation time can be achieved.
{"title":"Near-Circular Corner and Edge Detection Operators","authors":"D. Kerr, S. Coleman, B. Scotney","doi":"10.1109/IMVIP.2007.30","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.30","url":null,"abstract":"To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful, and hence corner detection is often used for this purpose. However, to obtain a higher level description of an image, such as identification of objects, additional information such as edges is required, and more recently detectors have been proposed that find both edges and corners. Recently, finite-element based methods have been used to develop gradient operators for edge detection that have improved angular accuracy over standard techniques. We extend this work to corner detection, enabling edge and corner detection to be integrated. In addition we present a combined operator, enabling edge and corner detection to be achieved concurrently, and we demonstrate that accuracy is comparable to well-known existing corner detectors and edge detectors, and, as standard post-smoothing of the corner map is not required, significantly reduced computation time can be achieved.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115133064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The research discussed in this paper documents a comparative analysis of two nonlinear dimensionality reduction techniques for the classification of facial expressions at varying degrees of intensity. These nonlinear dimensionality reduction techniques are Kernel Principal Component Analysis (KPCA) and Locally Linear Embedding (LLE). The approaches presented in this paper employ psychological tools, computer vision techniques and machine learning algorithms. In this paper we concentrate on comparing the performance of these two techniques when combined with Support Vector Machines (SVMs) at the task of classifying facial expressions across the full expression intensity range from near-neutral to extreme facial expression. Receiver Operating Characteristic (ROC) curve analysis is employed as a means of comprehensively comparing the results of these techniques.
{"title":"Non-Linear Approaches for the Classification of Facial Expressions at Varying Degrees of Intensity","authors":"J. Reilly, J. Ghent, J. McDonald","doi":"10.1109/IMVIP.2007.31","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.31","url":null,"abstract":"The research discussed in this paper documents a comparative analysis of two nonlinear dimensionality reduction techniques for the classification of facial expressions at varying degrees of intensity. These nonlinear dimensionality reduction techniques are Kernel Principal Component Analysis (KPCA) and Locally Linear Embedding (LLE). The approaches presented in this paper employ psychological tools, computer vision techniques and machine learning algorithms. In this paper we concentrate on comparing the performance of these two techniques when combined with Support Vector Machines (SVMs) at the task of classifying facial expressions across the full expression intensity range from near-neutral to extreme facial expression. Receiver Operating Characteristic (ROC) curve analysis is employed as a means of comprehensively comparing the results of these techniques.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122430533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
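The ROC curve analysis mentioned in this abstract can be sketched as follows, assuming a binary classifier that outputs one real-valued score per sample. The function names and the toy scores are hypothetical and do not come from the paper.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold
    over every distinct score, from strictest to most permissive."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]  # 1 = positive class, 0 = negative class
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

Comparing two pipelines, for example one per dimensionality reduction technique, then amounts to comparing their curves or their areas on the same test set.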
Determining the ground truth in human labeled video data is a common challenge in surveillance and medical imaging research work. In this paper we describe four statistical experiments that examine different approaches of determining ground truth in a database of hand washing videos.
{"title":"Statistical analysis of ground truth in human labeled data","authors":"Jiang Zhou, F. Vilariño, L. Gérard, Li Xuchun","doi":"10.1109/IMVIP.2007.39","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.39","url":null,"abstract":"Determining the ground truth in human labeled video data is a common challenge in surveillance and medical imaging research work. In this paper we describe four statistical experiments that examine different approaches of determining ground truth in a database of hand washing videos.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123536439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a novel technique for the automatic classification of noisy and incomplete shoeprint images, based on topological and pattern spectra. We first consider the pattern spectrum proposed by Maragos. We extend each spectrum with the spectrum for the complement image. We also propose a topological spectrum for a shoeprint image, based on repeated opening operations with increasing size of structuring element, giving a distribution of Euler numbers. The normalised differential of this series gives the topological spectrum. Second, we propose a hybrid algorithm which uses a distance measure based on a combination of both spectra as the feature of a shoeprint image. To evaluate the performance of the techniques, we use a database of 500 'clean' shoeprints to generate five test databases, each with 2500 images degraded by Gaussian noise, incompleteness, rotation, rescaling, or scene background. The statistical evaluations in terms of precision vs. recall are given in the final section. Tests show that our hybrid technique combining both spectra gives significant improvements over previously published results for edge direction histogram.
{"title":"Shoeprint Image Retrieval by Topological and Pattern Spectra","authors":"H. Su, D. Crookes, A. Bouridane","doi":"10.1109/IMVIP.2007.37","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.37","url":null,"abstract":"In this paper, we propose a novel technique for the automatic classification of noisy and incomplete shoeprint images, based on topological and pattern spectra. We first consider the pattern spectrum proposed by Maragos. We extend each spectrum with the spectrum for the complement image. We also propose a topological spectrum for a shoeprint image, based on repeated opening operations with increasing size of structuring element, giving a distribution of Euler numbers. The normalised differential of this series gives the topological spectrum. Second, we propose a hybrid algorithm which uses a distance measure based on a combination of both spectra as the feature of a shoeprint image. To evaluate the performance of the techniques, we use a database of 500 'clean' shoeprints to generate five test databases, each with 2500 images degraded by Gaussian noise, incompleteness, rotation, rescaling, or scene background. The statistical evaluations in terms of precision vs. recall are given in the final section. Tests show that our hybrid technique combining both spectra gives significant improvements over previously published results for edge direction histogram.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125780686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
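The pattern spectrum referred to in this abstract records how much image area is removed as the structuring element of a morphological opening grows. The sketch below covers only that part (not the topological spectrum or the hybrid distance) and assumes a binary image stored as a set of (row, column) foreground pixels with square structuring elements; all function names are hypothetical.

```python
def square(n):
    """Offsets of a (2n+1) x (2n+1) square structuring element."""
    return [(dr, dc) for dr in range(-n, n + 1) for dc in range(-n, n + 1)]


def erode(pixels, se):
    """Keep a pixel only if the whole structuring element fits inside the shape."""
    return {(r, c) for r, c in pixels
            if all((r + dr, c + dc) in pixels for dr, dc in se)}


def dilate(pixels, se):
    """Stamp the structuring element on every pixel."""
    return {(r + dr, c + dc) for r, c in pixels for dr, dc in se}


def opening(pixels, n):
    """Morphological opening: erosion followed by dilation."""
    se = square(n)
    return dilate(erode(pixels, se), se)


def pattern_spectrum(pixels, max_size):
    """PS(n) = area(opening of size n) - area(opening of size n + 1)."""
    areas = [len(opening(pixels, n)) for n in range(max_size + 2)]
    return [areas[n] - areas[n + 1] for n in range(max_size + 1)]


# A 3x3 block plus one isolated pixel.
shape = {(r, c) for r in range(3) for c in range(3)} | {(10, 10)}
print(pattern_spectrum(shape, 2))
```

Each entry of the spectrum counts the pixels belonging to structures that survive an opening of size n but not of size n + 1, which makes the spectrum a size distribution of the shape.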
This paper presents a survey of the research carried out to date in the area of computer-based deformable modelling. Due to their cross-disciplinary nature, deformable modelling techniques have been the subject of vigorous research over the past three decades and have found numerous applications in the fields of machine vision (image analysis, image segmentation, image matching, and motion tracking), visualisation (shape representation and data fitting), and computer graphics (shape modelling, simulation, and animation). Previous review papers have been field/application specific and have therefore been limited in their coverage of techniques. This survey focuses on general deformable models for computer-based modelling, which can be used for computer graphics, visualisation, and various image processing applications. The paper organizes the various approaches by technique and provides a description, critique, and overview of applications for each. Finally, the state of the art of deformable modelling is discussed, and areas of importance for future research are suggested.
{"title":"A Survey of Computer-Based Deformable Models","authors":"Patricia Moore, Derek Molloy","doi":"10.1109/IMVIP.2007.7","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.7","url":null,"abstract":"This paper presents a survey of the research carried out to date in the area of computer-based deformable modelling. Due to their cross-disciplinary nature, deformable modelling techniques have been the subject of vigorous research over the past three decades and have found numerous applications in the fields of machine vision (image analysis, image segmentation, image matching, and motion tracking), visualisation (shape representation and data fitting), and computer graphics (shape modelling, simulation, and animation). Previous review papers have been field/application specific and have therefore been limited in their coverage of techniques. This survey focuses on general deformable models for computer-based modelling, which can be used for computer graphics, visualisation, and various image processing applications. The paper organizes the various approaches by technique and provides a description, critique, and overview of applications for each. Finally, the state of the art of deformable modelling is discussed, and areas of importance for future research are suggested.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133907921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a video object motion segmentation method for object tracking in visual surveillance. In the first step, the frames are decomposed into small facets (regions) using colour information. Then, based on the detected motion, the motion segmentation is performed at facet level. A Bayesian approach is applied in clustering facets into moving objects and tracking moving video objects. Experiments have verified that the proposed method can efficiently tackle the complexity of video motion tracking.
{"title":"Video Object Motion Segmentation for Intelligent Visual Surveillance","authors":"M. Jiang, D. Crookes","doi":"10.1109/IMVIP.2007.43","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.43","url":null,"abstract":"This paper presents a video object motion segmentation method for object tracking in visual surveillance. In the first step, the frames are decomposed into small facets (regions) using colour information. Then, based on the detected motion, the motion segmentation is performed at facet level. A Bayesian approach is applied in clustering facets into moving objects and tracking moving video objects. Experiments have verified that the proposed method can efficiently tackle the complexity of video motion tracking.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115418387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated teller machines obviously require functions for protecting themselves from crimes, because they must handle cash. This paper discusses self-defense-technologies based on image processing and recognition to realize such functions in the machines. The technologies include (i) banknote validation for preventing machines from receiving counterfeit banknotes, (ii) form and character recognition for preventing machines from accepting remittance forms whose due dates have passed, (iii) person identification to stop machines from transacting with non-customers, and (iv) object recognition to guard machines against foreign objects such as spy cams that may be attached to them. After the outline of the system is described, technologies (i), (ii), and (iii) are introduced. This paper concentrates on the object recognition technology for detecting foreign objects. Although the technology is based on conventional background subtraction, we developed special noise reduction techniques to make it suitable for actual machines. The object recognition technology was evaluated in experiments using real 160-hour image video composed of about 8.6 times 10 frames in total, and the false-alarm rate and false-negative rate were 0.7% and 0%, respectively.
{"title":"Self-Defense-Technologies for Automated Teller Machines","authors":"H. Sako, T. Watanabe, H. Nagayoshi, T. Kagehiro","doi":"10.1109/IMVIP.2007.36","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.36","url":null,"abstract":"Automated teller machines obviously require functions for protecting themselves from crimes, because they must handle cash. This paper discusses self-defense-technologies based on image processing and recognition to realize such functions in the machines. The technologies include (i) banknote validation for preventing machines from receiving counterfeit banknotes, (ii) form and character recognition for preventing machines from accepting remittance forms whose due dates have passed, (iii) person identification to stop machines from transacting with non-customers, and (iv) object recognition to guard machines against foreign objects such as spy cams that may be attached to them. After the outline of the system is described, technologies (i), (ii), and (iii) are introduced. This paper concentrates on the object recognition technology for detecting foreign objects. Although the technology is based on conventional background subtraction, we developed special noise reduction techniques to make it suitable for actual machines. The object recognition technology was evaluated in experiments using real 160-hour image video composed of about 8.6 times 10 frames in total, and the false-alarm rate and false-negative rate were 0.7% and 0%, respectively.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133265133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, nonlinear transforms have received a lot of attention as a way to overcome the suboptimal n-term approximation power of tensor-product wavelet methods in higher dimensions. The suboptimal performance prevails when those transforms are used for a sparse representation of functions consisting of smoothly varying areas separated by smooth contours. This paper introduces a method that creates normal meshes with nonsubdivision connectivity to approximate the nonsmoothness of such images efficiently. From a domain decomposition viewpoint, the method is a triangulation refinement method preserving contours. The so-called normal offset decomposition searches from the midpoint of the edges in the previous approximation along the normal direction until it pierces the surface that represents the image and adds the piercing points to the approximation. The transform is nonlinear as it depends on the actual image. In this paper, we propose a normal offset based compression algorithm for digital images. The discrete setting causes the transform to become redundant. We also propose a model to encode the obtained coefficients. We show rate distortion curves and compare the results with the JPEG2000 encoder.
{"title":"A Nonlinear Contour Preserving Transform for Geometrical Image Compression","authors":"W. Van Aerschot, M. Jansen, A. Bultheel","doi":"10.1109/IMVIP.2007.5","DOIUrl":"https://doi.org/10.1109/IMVIP.2007.5","url":null,"abstract":"Recently, nonlinear transforms have received a lot of attention as a way to overcome the suboptimal n-term approximation power of tensor-product wavelet methods in higher dimensions. The suboptimal performance prevails when those transforms are used for a sparse representation of functions consisting of smoothly varying areas separated by smooth contours. This paper introduces a method that creates normal meshes with nonsubdivision connectivity to approximate the nonsmoothness of such images efficiently. From a domain decomposition viewpoint, the method is a triangulation refinement method preserving contours. The so-called normal offset decomposition searches from the midpoint of the edges in the previous approximation along the normal direction until it pierces the surface that represents the image and adds the piercing points to the approximation. The transform is nonlinear as it depends on the actual image. In this paper, we propose a normal offset based compression algorithm for digital images. The discrete setting causes the transform to become redundant. We also propose a model to encode the obtained coefficients. We show rate distortion curves and compare the results with the JPEG2000 encoder.","PeriodicalId":249544,"journal":{"name":"International Machine Vision and Image Processing Conference (IMVIP 2007)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131860780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}