{"title":"A Nonlinear-Shift Approach to Object Tracking Based on Shape Information","authors":"M. Asadi, A. Beoldo, C. Regazzoni","doi":"10.1109/ICIAP.2007.14","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.14","url":null,"abstract":"This paper presents a corner-based method for tracking objects in video frames. The method uses a vectorial shape representation based on the relative positions of the object's main corners, together with a non-linear voting method to evaluate the new object position at each iteration. Initialization consists of identifying an area that includes the object to be tracked. Information about the distribution of corners around a reference point is used to find the most probable target position in the next frame. The method can be used with both fixed and mobile cameras, for both vehicles and pedestrians.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132640965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
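The shape-and-voting idea in the abstract above can be sketched in a generalised-Hough style. This is an editorial illustration, not the authors' non-linear voting scheme: the integer corner coordinates, the offset model and the accumulator grid are all simplifying assumptions.

```python
import numpy as np

def vote_position(model_offsets, corners, grid_shape):
    """Generalised-Hough-style voting: every detected corner,
    tentatively matched to every model corner, votes for the
    reference-point position it would imply; the accumulator
    peak is the estimated object position."""
    votes = np.zeros(grid_shape, dtype=int)
    for cx, cy in corners:
        for ox, oy in model_offsets:
            rx, ry = cx - ox, cy - oy     # implied reference point
            if 0 <= rx < grid_shape[1] and 0 <= ry < grid_shape[0]:
                votes[ry, rx] += 1
    ry, rx = np.unravel_index(votes.argmax(), votes.shape)
    return (int(rx), int(ry)), votes
```

Corners consistent with the shape model reinforce one accumulator cell, while spurious corners scatter their votes, which is what makes such voting tolerant of partial occlusion.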
{"title":"A Novel Anchorperson Detection Algorithm Based on Spatio-temporal Slice","authors":"Anan Liu, Sheng Tang, Yongdong Zhang, Jintao Li, Zhaoxuan Yang","doi":"10.1109/ICIAP.2007.15","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.15","url":null,"abstract":"To navigate and edit news programs conveniently, it is important to segment the video into meaningful units. Effective indexing of news videos can be achieved through anchorperson shots, since they indicate the start of upcoming news stories. This paper presents a novel anchorperson detection algorithm based on the spatio-temporal slice (STS). With STS pattern analysis, clustering and decision fusion, anchorperson shots can be detected for browsing news video. Large-scale experimental results demonstrate that the algorithm is accurate, robust and effective.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132920622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
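A spatio-temporal slice is simply one image row (or column) stacked over time. The sketch below illustrates the idea with a hypothetical static-shot test based on per-column temporal variance; the threshold and the variance criterion are assumptions, not the paper's STS pattern analysis, clustering and decision-fusion pipeline.

```python
import numpy as np

def spatio_temporal_slice(frames, row):
    """Stack a single image row across T frames into a (T, W) slice.
    Static content, such as an anchorperson shot, produces vertical
    stripes, i.e. columns with low temporal variance."""
    return np.stack([np.asarray(f)[row] for f in frames])

def is_static_segment(sts, var_thresh=1.0):
    """Flag a segment as static when the mean per-column temporal
    variance of its slice falls below a threshold."""
    return bool(sts.var(axis=0).mean() < var_thresh)
```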
{"title":"Optical Flow Computation on Compute Unified Device Architecture","authors":"Y. Mizukami, Katsumi Tadamura","doi":"10.1109/ICIAP.2007.97","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.97","url":null,"abstract":"In this study, the implementation of an image processing technique on compute unified device architecture (CUDA) is discussed. CUDA is a new hardware and software architecture developed by NVIDIA Corporation for the general-purpose computation on graphics processing units. CUDA features an on-chip shared memory with very fast general read and write access, which enables threads in a block to share their data effectively. CUDA also provides a user-friendly development environment through an extension to the C programming language. This study focused on CUDA implementation of a representative optical flow computation proposed by Horn and Schunck in 1981. Their method produces the dense displacement field and has a straightforward processing procedure. A CUDA implementation of Horn and Schunck's method is proposed and investigated based on simulation results.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132537340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
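Horn and Schunck's method iterates a Jacobi-style update derived from the brightness-constancy and smoothness terms. A minimal CPU-only NumPy sketch follows (not the paper's CUDA implementation; the derivative and averaging stencils are simplified assumptions):

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    """Minimal Horn-Schunck optical flow: returns a dense (u, v)
    displacement field minimising the brightness-constancy error
    plus an alpha-weighted smoothness term."""
    im1 = im1.astype(np.float64)
    im2 = im2.astype(np.float64)

    # Image derivatives, averaged over the two frames for symmetry.
    Ix = (np.gradient(im1, axis=1) + np.gradient(im2, axis=1)) / 2
    Iy = (np.gradient(im1, axis=0) + np.gradient(im2, axis=0)) / 2
    It = im2 - im1

    def local_mean(f):
        # 4-neighbour average used in the Jacobi update.
        out = np.zeros_like(f)
        out[:-1] += f[1:];  out[1:] += f[:-1]
        out[:, :-1] += f[:, 1:];  out[:, 1:] += f[:, :-1]
        return out / 4.0

    u = np.zeros_like(im1)
    v = np.zeros_like(im1)
    for _ in range(n_iter):
        u_bar, v_bar = local_mean(u), local_mean(v)
        # Closed-form Jacobi step from the Euler-Lagrange equations.
        num = Ix * u_bar + Iy * v_bar + It
        den = alpha ** 2 + Ix ** 2 + Iy ** 2
        u = u_bar - Ix * num / den
        v = v_bar - Iy * num / den
    return u, v
```

Each pixel's update touches only its four neighbours, which is exactly why the scheme maps well onto CUDA thread blocks with shared memory.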
{"title":"Colour and Geometric based Model for Lip Localisation: Application for Lip-reading System","authors":"S. Werda, W. Mahdi, A. B. Hamadou","doi":"10.1109/ICIAP.2007.42","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.42","url":null,"abstract":"Motivated by humans' ability to lip-read, the visual component is considered a source of information for speech recognition systems. Lip-reading is the perception of speech based purely on observing the talker's lip movements. The major difficulty of a lip-reading system is the extraction of the visual speech descriptors. To ensure this task, it is necessary to carry out automatic localization and tracking of the labial gestures. We present in this paper a new automatic approach for localizing the lips and points of interest on a speaker's face, based on both the color information of the mouth and a geometric model of the lips. This hybrid solution makes our method more tolerant to noise and artifacts in the image. Experiments revealed that our lip POI localization approach for lip-reading purposes is promising. The presented results show that our system recognizes 94.64% of French visemes.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116729714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast extraction of multi-resolution Gabor features","authors":"J. Ilonen, J. Kämäräinen, H. Kälviäinen","doi":"10.1109/ICIAP.2007.67","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.67","url":null,"abstract":"Gabor filter responses are general-purpose features for computer vision and image processing and have been very successful in many application areas, for example in biometric authentication (fingerprint matching, face detection, face recognition and iris recognition). In a typical feature construction, filters are utilised in a multi-resolution structure of several filters tuned to different frequencies and orientations. The multi-resolution structure is similar to wavelets, but the non-orthogonality of Gabor functions is responsible for their main weakness: high computational cost. This complexity prevents their use in many real-time or near real-time tasks. In this study, an efficient sequential computation method for multi-resolution Gabor features is presented.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133790201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
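A multi-resolution Gabor feature bank of the kind described above can be sketched with FFT-based convolution. The kernel parametrisation and the filter-bank layout below are standard textbook choices, not the paper's optimised sequential method:

```python
import numpy as np
from numpy.fft import fft2, ifft2

def gabor_kernel(freq, theta, sigma, size=31):
    """Complex Gabor kernel tuned to spatial frequency `freq`
    (cycles/pixel) and orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # carrier axis
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * xr)

def gabor_bank(image, freqs, n_orient, sigma=4.0):
    """One complex response map per (frequency, orientation) pair,
    computed by FFT convolution (circular boundary handling)."""
    H, W = image.shape
    spectrum = fft2(image)   # image transformed once, reused per filter
    responses = []
    for f in freqs:
        for k in range(n_orient):
            g = gabor_kernel(f, k * np.pi / n_orient, sigma)
            responses.append(ifft2(spectrum * fft2(g, s=(H, W))))
    return np.stack(responses)
```

The cost the abstract refers to is visible here: every (frequency, orientation) pair needs its own convolution, since the non-orthogonal filters cannot share work the way an orthogonal wavelet pyramid does.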
{"title":"Panoramic mosaicing optimization","authors":"Lionel Robinault, S. Bres, S. Miguet","doi":"10.1109/ICIAP.2007.100","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.100","url":null,"abstract":"Motorized dome-type cameras, also called PTZ cameras, allow the creation of panoramas that represent the whole scene visible to the camera. For a PTZ camera, under certain constraints, the scene seen by the camera can be considered a sphere, and creating a panorama consists of traversing this sphere exhaustively. The acquired images are then projected onto a support such as a cylinder, a cube or another surface. The projection of the rectangular images onto a sphere inevitably involves partial overlap between images, and these overlaps lead to redundant computation. In order to limit the number of images, we propose the calculation of an optimal trajectory for the camera according to intrinsic and extrinsic constraints.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133833544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
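The trade-off being optimised, covering the view sphere with as few overlapping images as possible, can be illustrated for the pan axis alone. The constant-overlap model below is a simplifying assumption, not the paper's trajectory optimisation:

```python
import math

def pan_steps(pan_range_deg, hfov_deg, overlap):
    """Number of images needed to cover `pan_range_deg` of pan with
    horizontal field of view `hfov_deg`, keeping the fraction
    `overlap` (0 <= overlap < 1) of each image shared with the next."""
    step = hfov_deg * (1.0 - overlap)   # useful angular advance per shot
    return math.ceil(pan_range_deg / step)
```

For instance, a 60° field of view with 20% overlap advances 48° per shot, so a full 360° pan needs eight images; cutting unnecessary overlap directly cuts acquisition and stitching cost.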
{"title":"Hybrid Stereo Sensor with Omnidirectional Vision Capabilities: Overview and Calibration Procedures","authors":"S. Cagnoni, M. Mordonini, Luca Mussi, G. Adorni","doi":"10.1109/ICIAP.2007.77","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.77","url":null,"abstract":"In this paper, we present a compact hybrid video sensor that combines perspective and omnidirectional vision to achieve a 360° field of view, as well as high-resolution images. Those characteristics, in association with 3D metric reconstruction capabilities, are suitable for vision tasks such as surveillance and obstacle detection for autonomous robot navigation. We describe the sensor calibration procedure, with particular regard to mirror-to-camera positioning. We also present some results obtained in testing the accuracy of 3D reconstruction, which have confirmed the correctness of the calibration.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129320358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rectification of 3D Data Obtained from Moving Range Sensors by using Multiple View Geometry","authors":"K. Kozuka, J. Sato","doi":"10.1109/ICIAP.2007.110","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.110","url":null,"abstract":"For measuring the 3D shape of large objects, scanning with a moving range sensor is one of the most efficient methods. However, if we use moving range sensors, the obtained data have distortions due to the movement of the sensor during the scanning process. In this paper, we propose a method for recovering correct 3D range data from a moving range sensor by using multiple view geometry. We assume that the range sensor emits laser beams in raster-scan order and that they are observed by a static camera. We first show that range data can be treated as 3D space-time images, and that extended multiple view geometry can represent the relationship between the 3D space-time of the camera images and the 3D space-time of the range data. We next show that multiple view geometry under extended projections can be used for rectifying 3D data obtained by the moving range sensor. The method is implemented and tested on synthetic images and range data. The stability of the recovered 3D shape is also evaluated.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123803381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robust measure for visual correspondence","authors":"Federico Tombari, L. D. Stefano, S. Mattoccia","doi":"10.1109/ICIAP.2007.16","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.16","url":null,"abstract":"In this paper a novel measure for visual correspondence is proposed, to be adopted for common computer vision tasks such as pattern matching, stereo vision and change detection. The proposed measure implicitly exploits the concept of order preservation between neighbouring pixels and is suitable for cases where disturbance factors such as photometric distortions and occlusions occur between the images to be matched. Furthermore, the measure tends to be robust in the presence of significant amounts of noise, which can be introduced, e.g., by cheap camera sensors. Experimental results demonstrate the effectiveness of the proposed approach in a typical template matching scenario as well as in an application dealing with secure gate access control.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124070300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
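The order-preservation idea can be illustrated with the census transform, a well-known ordering-based descriptor (the paper's measure is a different construction, but it shares this kind of invariance): comparisons against the centre pixel depend only on the local ranking of intensities, so the codes survive any monotonic photometric distortion.

```python
import numpy as np

def census_transform(img):
    """8-bit census transform: encode, for each interior pixel, which
    of its 8 neighbours is darker. The code depends only on the local
    ordering of intensities, so it is invariant to any monotonic
    photometric distortion."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    code = np.zeros((h - 2, w - 2), dtype=np.uint16)
    bit = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neigh = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
            code |= (neigh < centre).astype(np.uint16) << bit
            bit += 1
    return code

def hamming_cost(c1, c2):
    """Matching cost between two census images: total Hamming distance."""
    x = np.bitwise_xor(c1, c2)
    return int(np.unpackbits(x.view(np.uint8)).sum())
```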
{"title":"Grey Weighted Polar Distance Transform for Outlining Circular and Approximately Circular Objects","authors":"K. Norell, Joakim Lindblad, S. Svensson","doi":"10.1109/ICIAP.2007.74","DOIUrl":"https://doi.org/10.1109/ICIAP.2007.74","url":null,"abstract":"We introduce the polar distance transform and the grey weighted polar distance transform for computation of minimum cost paths preferring circular shape, as well as give algorithms for implementations in a digital setting. An alternative to the polar distance transform is to transform the image to polar coordinates, and then apply a Cartesian distance transform. By using the polar distance transform, resampling of the image and interpolation of new pixel values are avoided. We also handle the case of grey weighted distance transform in a 5 × 5 neighbourhood, which, to our knowledge, is new. Initial results of using the grey weighted polar distance transform to outline annual rings in images of log end faces are presented.","PeriodicalId":118466,"journal":{"name":"14th International Conference on Image Analysis and Processing (ICIAP 2007)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127787761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
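The grey-weighted (Cartesian) distance transform that the polar variants build on can be sketched with Dijkstra's algorithm over the pixel grid. The seed initialisation and the step-cost convention below are assumptions; the paper's polar transforms additionally bias step costs to favour circular paths.

```python
import heapq
import numpy as np

def grey_weighted_dt(cost, seeds):
    """Grey-weighted distance transform: minimum accumulated grey
    value over 8-connected paths from any seed pixel, computed with
    Dijkstra's algorithm. A step costs the mean of the two pixel
    values scaled by the Euclidean step length."""
    H, W = cost.shape
    dist = np.full((H, W), np.inf)
    heap = []
    for r, c in seeds:
        dist[r, c] = cost[r, c]          # seed initialised to its own value
        heapq.heappush(heap, (dist[r, c], r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue                      # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    nd = d + np.hypot(dr, dc) * (cost[r, c] + cost[rr, cc]) / 2
                    if nd < dist[rr, cc]:
                        dist[rr, cc] = nd
                        heapq.heappush(heap, (nd, rr, cc))
    return dist
```

Running this on a polar-resampled image is the "alternative" the abstract mentions; the polar distance transform avoids that resampling by redefining the step costs directly on the original grid.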