Title: High performance GPGPU based system for matching people in a live video feed
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469540
B. Bosek, L. Horwath, Grzegorz Matecki, Arkadiusz Pawlik
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
One of the key problems in computer vision and automated surveillance is determining whether two snapshots of objects in a video feed correspond to the same real-world object. In this paper we propose an efficient GPGPU-based system for short-term matching of people in a video feed. The main contributions of our approach are image enhancement techniques, data preprocessing methods based on statistical sampling combined with local algorithms for finding Voronoi diagrams, and an efficient similarity metric based on non-crossing maximum matchings in weighted graphs. Our algorithms, thanks to their local nature, are easily parallelized. We propose a GPGPU implementation that allows real-time computation in reasonable circumstances. The results show that the described algorithms may be used in a variety of contexts.
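The abstract above names a similarity metric based on non-crossing maximum matchings in weighted graphs but does not give its details. As a rough illustration only, the following sketch computes the maximum total weight of a non-crossing matching between two ordered feature sequences with a standard dynamic program. The function names, the one-dimensional features and the `closeness` weight are hypothetical choices made for this example, not the authors' method.

```python
from typing import Callable, Sequence


def noncrossing_matching_score(
    a: Sequence[float],
    b: Sequence[float],
    weight: Callable[[float, float], float],
) -> float:
    """Maximum total weight of a non-crossing matching between a and b.

    A matching is non-crossing when matched pairs keep their order:
    if a[i] is matched to b[j] and a[k] to b[l] with i < k, then j < l.
    Weights are assumed to be non-negative.
    """
    n, m = len(a), len(b)
    # best[i][j] = best score using only the first i items of a and j of b.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],  # leave a[i-1] unmatched
                best[i][j - 1],  # leave b[j-1] unmatched
                best[i - 1][j - 1] + weight(a[i - 1], b[j - 1]),  # match them
            )
    return best[n][m]


def closeness(x: float, y: float) -> float:
    """Hypothetical edge weight: 1 for equal values, 0 once they differ by 1 or more."""
    return max(0.0, 1.0 - abs(x - y))
```

Each cell depends only on its three neighbours, so cells on the same anti-diagonal can be computed independently, which is one reason this kind of recurrence is a plausible fit for parallel hardware.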
Title: Mean sets for building 3D probabilistic liver atlas from perfusion MR images
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469559
E. Durá, J. Domingo, A. F. Rojas-Arboleda, L. Martí-Bonmatí
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
This paper is concerned with liver atlas construction. One of the most important issues in computational abdominal anatomy is defining an atlas that provides a priori information for common medical tasks such as registration and segmentation. Unlike the approaches proposed so far (to our knowledge), we use the concept of the mean set of a random compact set to build probabilistic liver atlases. This was done in two stages. First, a set of 3D images was manually segmented by a physician; we regard the different segmented 3D shapes as realizations of a random compact set. Second, elements of two known definitions of the mean set were applied to build a probabilistic atlas that captures the variability of the cases while preserving the essential shape of the liver.
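The abstract above does not say which two definitions of the mean set were used. Purely as an illustration, the sketch below assumes one well-known definition, the Vorob'ev expectation, in a simplified form: the coverage function (the fraction of segmentations containing each voxel) is thresholded at the level whose volume is closest to the mean volume. Segmentations are represented as flattened lists of 0/1 voxels; all names are hypothetical.

```python
from typing import List

Mask = List[int]  # flattened binary segmentation: 1 = liver, 0 = background


def coverage_counts(masks: List[Mask]) -> List[int]:
    """Number of segmentations that contain each voxel."""
    return [sum(voxel) for voxel in zip(*masks)]


def probabilistic_atlas(masks: List[Mask]) -> List[float]:
    """Coverage function: fraction of segmentations containing each voxel."""
    n = len(masks)
    return [count / n for count in coverage_counts(masks)]


def vorobev_mean(masks: List[Mask]) -> Mask:
    """Simplified Vorob'ev-style mean set.

    Keeps the voxels covered by at least k segmentations, where k is the
    level whose volume is closest to the mean volume of the inputs
    (ties go to the smallest k).
    """
    n = len(masks)
    counts = coverage_counts(masks)
    mean_volume = sum(map(sum, masks)) / n

    def volume_at(k: int) -> int:
        return sum(1 for count in counts if count >= k)

    best_k = min(range(1, n + 1), key=lambda k: abs(volume_at(k) - mean_volume))
    return [1 if count >= best_k else 0 for count in counts]
```

For example, with the three masks `[1, 1, 0, 0]`, `[1, 1, 1, 0]` and `[0, 1, 1, 0]`, the coverage is 2/3, 1, 2/3 and 0, and the mean set keeps the first three voxels.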
Title: Mouse neuroimaging phenotyping in the cloud
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469527
M. Minervini, Mario Damiano, V. Tucci, A. Bifone, A. Gozzi, S. Tsaftaris
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
The combined use of transgenic mouse models of human pathology (mice carrying genetic mutations) and advanced neuroimaging methods such as MRI has the potential to radically change how we approach disease understanding, diagnosis and treatment. Morphological changes occurring in the brains of transgenic animals as a result of the interaction between environment and genotype can be assessed using advanced image analysis methods, an effort described as “mouse brain phenotyping”. However, the computational methods required for the analysis of high-resolution brain images are demanding. In this paper, we propose a computationally effective cloud-based implementation of morphometric analysis of high-resolution mouse brain datasets. We show that the proposed approach is highly scalable and suited to a variety of methods for MR-based brain phenotyping. The proposed approach is easy to deploy and could become an alternative for laboratories that require instant access to large high-performance computing infrastructure.
Title: Harris, SIFT and SURF features comparison for vehicle localization based on virtual 3D model and camera
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469511
M. Dawood, C. Cappelle, Maan El Badaoui El Najjar, M. Khalil, D. Pomorski
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
This paper proposes a new vehicle geo-localization method for urban environments that integrates a new source of information: a virtual 3D city model. This 3D model provides a realistic representation of the vehicle's navigation environment. To optimize the performance of the geo-localization system, several sources of information are integrated for their complementarity and redundancy: a GPS receiver, proprioceptive sensors (odometers and a gyrometer), a video camera and the virtual 3D city model. The pose estimation algorithm used to fuse the data from the different sensors is an IMM-UKF (Interacting Multiple Model - Unscented Kalman Filter). The proprioceptive sensors allow continuous dead-reckoning estimation of the vehicle's position and orientation. This dead-reckoning pose estimate is corrected by GPS measurements. Moreover, a 3D model/camera-based observation of the vehicle pose is constructed to compensate for the drift of dead-reckoning localization when GPS measurements are unavailable for a long time. This pose observation is based on matching the virtual image extracted from the 3D city model against the real image acquired by the camera. The observation is constructed in two major parts. The first part consists of detecting and matching feature points in the real and virtual images. Three features are compared: Harris corners, SIFT (Scale Invariant Feature Transform) and SURF (Speeded-Up Robust Features). The second part is the pose computation using the POSIT algorithm and the previously matched feature set. The approach has been tested on a real sequence, and the results demonstrate its feasibility and robustness.
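The dead-reckoning step mentioned in the abstract above (odometers plus a gyrometer) can be illustrated with a textbook planar motion update. This is only a minimal sketch under simple assumptions: a 2D pose, a travelled distance per step from the odometer, and a yaw rate from the gyrometer. It is not the authors' IMM-UKF, and the function and parameter names are hypothetical.

```python
import math
from typing import Tuple

Pose = Tuple[float, float, float]  # x [m], y [m], heading [rad]


def dead_reckoning_step(pose: Pose, distance: float, yaw_rate: float, dt: float) -> Pose:
    """Advance a planar pose by one odometry/gyro sample.

    distance: path length travelled during the step (from the odometers).
    yaw_rate: angular velocity around the vertical axis (from the gyrometer).
    The displacement is applied along the heading at the middle of the step.
    """
    x, y, theta = pose
    dtheta = yaw_rate * dt
    mid_heading = theta + dtheta / 2.0
    return (
        x + distance * math.cos(mid_heading),
        y + distance * math.sin(mid_heading),
        theta + dtheta,
    )
```

Because each step adds sensor error to the previous pose, the estimate drifts over time, which is why the abstract corrects it with GPS and, when GPS is unavailable, with the camera/3D-model observation.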
Title: A Grid application to estimate 3D temporal evolution of ground deformation displacements
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469566
G. Nunnari, F. Cannavò, M. Fargetta, A. Spata
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
The aim of this paper is to propose a strategy for estimating the 3D temporal evolution of ground deformations. To this end, for a given multi-temporal dataset of DInSAR (Differential Interferometric SAR) data and GPS measurements, a Grid infrastructure is used to run the SISTEM (Simultaneous and Integrated Strain Tensor Estimation from geodetic and satellite deformation Measurements) method in parallel in order to estimate 3D ground deformation maps. An SBAS-like algorithm is then used to merge the estimated static maps into a 3D temporal evolution of deformations over the whole investigated area.
Title: Action recognition in videos
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469480
Christian Wolf, A. Baskurt
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
Applications such as video surveillance, robotics, source selection, and video indexing often require the recognition of actions based on the motion of different actors in a video. Certain applications may require assigning activities to several predefined classes, while others may rely on the detection of abnormal or infrequent activities. In this summary we provide a survey of dominant models and methods and discuss recent developments in this domain. We briefly describe two recent contributions: joint level feature and sequence learning, as well as space-time graph matching.
Title: Vision based wildfire and natural risk observers
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469518
D. Stipanicev, Ljiljana Šerić, Maja Braović, D. Krstinić, Toni Jakovcevic, M. Stula, M. Bugarić, J. Maras
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
Wildfires are natural risk phenomena that cause significant economic and environmental damage. In a wildfire fighting strategy it is important to detect the wildfire in its initial stage and to apply the most appropriate firefighting action as soon as possible. In both cases wildfire monitoring and surveillance systems are of great importance, so over the last decade interest in such systems has increased at both the research and the implementation level. This paper describes one such system, named iForestFire. It is an example of an advanced terrestrial vision-based wildfire monitoring and surveillance system that is widely used today in various Croatian National and Nature Parks and regions, and it remains under constant development and improvement at both the theoretical and the practical level. This paper describes the latest improvements to its video detection part, which are based on the notion of an observer, cogent confabulation theory and the mechanism of thought. Including cogent confabulation theory allows us to extend the use of existing wildfire observers to more general natural risk observers.
Title: Image compression based on fuzzy segmentation and anisotropic diffusion
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469532
Ahmad Shahin, W. Moudani, Fadi Chakik
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
In this paper we present a hybrid model for image compression based on fuzzy segmentation and partial differential equations. The main motivation behind our approach is to provide immediate access to objects and features of interest in a high-quality decoded image, which could be useful on smart devices, for analysis purposes, and for multimedia content-based description standards. The image is approximated as a set of uniform regions: the technique assigns well-defined members to homogeneous regions in order to achieve image segmentation. Fuzzy c-means (FCM) is used to guide the clustering of the image data. A second coding stage applies entropy coding to remove entropy redundancy across the whole image. In the decoding phase, we suggest applying nonlinear anisotropic diffusion to enhance the quality of the coded image.
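The fuzzy c-means clustering step named in the abstract above can be sketched on scalar pixel intensities using the standard update rules: memberships proportional to inverse distance ratios, and centers as membership-weighted means. This is a generic illustration under assumed settings (fuzzifier m = 2, fixed iteration count, caller-supplied initial centers). The paper's actual features, parameters and stopping rule are not given in the abstract, and the function names are hypothetical.

```python
from typing import List, Tuple


def memberships(values: List[float], centers: List[float], m: float) -> List[List[float]]:
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # x lies exactly on one or more centers: share full membership among them.
            hits = dists.count(0.0)
            rows.append([1.0 / hits if d == 0.0 else 0.0 for d in dists])
        else:
            rows.append([1.0 / sum((di / dj) ** power for dj in dists) for di in dists])
    return rows


def fuzzy_c_means(
    values: List[float],
    initial_centers: List[float],
    m: float = 2.0,
    iterations: int = 50,
) -> Tuple[List[float], List[List[float]]]:
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = memberships(values, centers, m)
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            if total > 0.0:  # keep the old center if no value supports this cluster
                centers[i] = sum(w * x for w, x in zip(weights, values)) / total
    return centers, memberships(values, centers, m)
```

Assigning each pixel to its highest-membership cluster then gives the kind of region labelling that a segmentation-based coder could encode instead of raw intensities.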
Title: Automatic recognition of flowers through color and edge based contour detection
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469535
Soon-Won Hong, L. Choi
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
Unlike the simple images processed by existing image-based search engines, flowers have a wider and more irregular range of shapes and patterns. In this paper we present an automatic flower recognition system for smartphone users. After a user transmits a flower image to the server, image processing and searching are performed entirely by the server, eliminating user interaction from the recognition process. The server detects the contour of the flower using both color-based and edge-based contour detection. We then classify its color groups and contour shapes using k-means clustering and history matching. After comparing the input image with the reference images stored on the server, the server sends the most similar image to the user. We also address recognition failures caused by lighting and camera angle through partial recognition and image recovery. We obtained a success rate of 94.8% for 500 images from 100 species.
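The k-means step that the abstract above uses to classify color groups can be illustrated with a plain Lloyd-style iteration on RGB triples. This is a generic sketch: the initial centers are passed in by the caller, the iteration count is fixed, and the color space and number of groups used by the authors are not stated in the abstract. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, ...]


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(
    points: List[Color],
    initial_centers: List[Color],
    iterations: int = 20,
) -> Tuple[List[Color], List[int]]:
    """Alternate nearest-center assignment and center averaging."""
    centers = list(initial_centers)
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        labels = [
            min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for i in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels
```

With two reddish and two bluish pixels and initial centers at pure red and pure blue, the pixels split into one red group and one blue group.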
Title: Hand gesture recognition using a dedicated geometric descriptor
Pub Date: 2012-10-01
DOI: 10.1109/IPTA.2012.6469524
Jean-François Collumeau, R. Leconge, B. Emile, H. Laurent
Venue: 2012 3rd International Conference on Image Processing Theory, Tools and Applications (IPTA)
A high proportion of hospital-acquired diseases are still transmitted during surgery despite existing asepsis preservation measures. These measures are quite drastic, prohibiting surgeons from interacting directly with non-sterile equipment. Indirect control is currently achieved through an assistant or a nurse. Gesture-based Human-Computer Interfaces are a promising approach for giving surgeons direct control over such equipment. This paper introduces a novel hand descriptor based on measurements extracted from the convex and concave extrema of the hand contour. Using a 9750-picture database created especially for this purpose, it is compared with three state-of-the-art description methods: Hu moments, SIFT features and HOG features. The effects of large hand rotations are also studied on each rotation axis independently. The results show HOG features to be best at recognizing hands from our database, closely followed by the proposed descriptor. When facing rotated hands, our descriptor is the most robust to rotation, outperforming the other descriptors by a wide margin.
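The abstract above builds its descriptor from convex and concave extrema of the hand contour but does not list the measurements themselves. As a loose illustration of the first step only, the sketch below labels the vertices of a closed polygonal contour as convex or concave from the sign of the cross product of adjacent edges. For a hand outline, convex vertices would plausibly include fingertips and concave ones the valleys between fingers, but that reading is an assumption, as are all names used here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(contour: List[Point]) -> float:
    """Shoelace formula: positive for counter-clockwise contours."""
    closed = contour[1:] + contour[:1]
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(contour, closed))


def classify_vertices(contour: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Split the vertices of a simple closed polygon into (convex, concave).

    Collinear vertices (zero cross product) are placed in neither list.
    """
    orientation = 1.0 if signed_area(contour) > 0 else -1.0
    n = len(contour)
    convex: List[Point] = []
    concave: List[Point] = []
    for i in range(n):
        (ax, ay), (bx, by), (cx, cy) = contour[i - 1], contour[i], contour[(i + 1) % n]
        # z-component of (b - a) x (c - b); its sign gives the turn direction at b.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        turn = cross * orientation
        if turn > 0:
            convex.append(contour[i])
        elif turn < 0:
            concave.append(contour[i])
    return convex, concave
```

Normalizing by the contour orientation makes the labels independent of whether the contour was traced clockwise or counter-clockwise.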