Segment-based hand pose estimation. Christopher G. Schwarz, N. Lobo. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.72

The work presented here addresses two major problems in hand pose recognition: (A) determining which pose is shown in a given input image, and (B) detecting the presence of a known pose in a given input video. It builds on the earlier work of Athitsos and Sclaroff (2003) toward solving problem A. Because that method relies upon lines found in the input data and requires computer-generated database models, it is unsuitable for the latter, video-based problem. Our reworking of this framework instead uses region-based information, allowing video frames to serve as the "database" in which to look for the test pose. It returns database images of hands in the same configuration as a query image through a series of steps based on the number and direction of visible finger protrusions, Chamfer distance, orientation histograms, and a competitive, comparison-based matching of each visible finger segment. Detailed results demonstrate the system's feasibility and potential.
Minimum Bayes error features for visual recognition by sequential feature selection and extraction. G. Carneiro, N. Vasconcelos. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.53

The extraction of optimal features, in a classification sense, is still quite challenging for large-scale classification problems (such as visual recognition) involving a large number of classes and significant amounts of training data per class. We present an algorithm for feature design, optimal in the minimum Bayes error sense, that combines the most appealing properties of the two currently dominant strategies: feature extraction (FE) and feature selection (FS). The new algorithm proceeds by interleaving pairs of FS and FE steps, which amount to a sequential search for the most discriminant directions in a collection of two-dimensional subspaces. It combines the fast convergence rate of FS with the ability of FE to uncover optimal features that are not part of the original basis functions, leading to solutions better than those achievable by either FE or FS alone, in a small number of iterations. Because the basic iteration has very low complexity, the new algorithm scales with the number of classes in the recognition problem, a property currently available only for feature extraction methods that are either sub-optimal or optimal under restrictive assumptions that do not hold for generic recognition. Experimental results show significant improvements over these methods, either through much greater robustness to local minima or through significantly faster convergence.
On processing and registration of forward-scan acoustic video imagery. S. Negahdaripour, P. Firoozfam, P. Sabzmeydani. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.57

Producing high-resolution underwater imagery across a range of visibility conditions is a capability in critical demand for a number of applications. A new generation of forward-scan acoustic video cameras, which has become commercially available in recent years, produces images with considerably more target detail than optical systems can in turbid waters. Previous computer processing of sonar imagery has predominantly involved target segmentation, classification, and recognition, exploiting 2D visual cues from texture or object/shadow shape in a single frame. Processing of video is becoming increasingly important for applications involving target tracking, object identification in search and inspection, and self-localization and mapping, among many others. This paper addresses the image registration problem for acoustic video, and the preprocessing steps applied to the raw video from a DIDSON acoustic camera for image calibration, filtering, and enhancement to achieve reliable results.
Combining local and global features for image segmentation using iterative classification and region merging. Qiyao Yu, David A. Clausi. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.27

In MRF-based unsupervised segmentation, the MRF model parameters are typically estimated globally. Such global statistics can be far from accurate in local areas when the image is highly non-stationary, and hence generate false boundaries. The problem cannot be solved unless local statistics are considered. This work incorporates the local feature of edge strength into the MRF energy function, and segmentation is obtained by minimizing the energy function using iterative classification and region merging.
Body tracking in human walk from monocular video sequences. F. Jean, R. Bergevin, A. Albu. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.24

This paper proposes a method to automatically track human body parts in the context of gait modeling and recognition. The proposed approach is based on a five-point human model (head, hands, and feet) in which the points are detected and tracked independently. Tracking is fully automatic (no manual initialization of the five points) since it will be used in a real-time surveillance system. Feet are detected in each frame by first finding the space between the legs in the human silhouette, and self-occlusion between the feet is handled using optical flow and motion correspondence. Skin color segmentation is used to find the hands in each frame, and tracking is achieved with a bounding-box overlap algorithm. The head is defined as the center of mass of a region of the upper silhouette.
Automated behavioral phenotype detection and analysis using color-based motion tracking. A. Shimoide, Ilmi Yoon, M. Fuse, Holly C. Beale, Rahul Singh. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.20
The problem of elucidating the functional significance of genes is a key challenge of modern science. Solving it can lead to fundamental advances across multiple areas, ranging from pharmaceutical drug discovery to the agricultural sciences. A commonly used approach in this context involves studying genetic influence on model organisms. These influences can be expressed at the behavioral, morphological, anatomical, or molecular level, and the expressed patterns are called phenotypes. Unfortunately, detailed studies of many phenotypes, such as the behavior of an organism, are highly complicated both by the inherent complexity of the phenotype pattern and by the fact that it may evolve over long time periods. In this paper, we propose applying color-based tracking to study ecdysis in the hornworm, a biologically highly relevant phenotype whose complexity has thus far prevented the application of automated approaches. We present experimental results that demonstrate the accuracy of tracking and phenotype determination under conditions of complex body movement, partial occlusion, and body deformation. A key additional goal of our paper is to expose the computer vision community to such novel applications, where techniques from vision and pattern analysis can have a seminal influence on other branches of modern science.
{"title":"Automated behavioral phenotype detection and analysis using color-based motion tracking","authors":"A. Shimoide, Ilmi Yoon, M. Fuse, Holly C. Beale, Rahul Singh","doi":"10.1109/CRV.2005.20","DOIUrl":"https://doi.org/10.1109/CRV.2005.20","url":null,"abstract":"The problem of elucidating the functional significance of genes is a key challenge of modern science. Solving this problem can lead to fundamental advancements across multiple areas such starting from pharmaceutical drug discovery to agricultural sciences. A commonly used approach in this context involves studying genetic influence on model organisms. These influences can be expressed at behavioral, morphological, anatomical, or molecular levels and the expressed patterns are called phenotypes. Unfortunately, detailed studies of many phenotypes, such as the behavior of an organism, is highly complicated due to the inherent complexity of the phenotype pattern and because of the fact that it may evolve over long time periods. In this paper, we propose applying color-based tracking to study Ecdysis in the hornworm - a biologically highly relevant phenotype whose complexity had thus far, prevented application of automated approaches. We present experimental results which demonstrate the accuracy of tracking and phenotype determination under conditions of complex body movement, partial occlusions, and body deformations. A key additional goal of our paper is to expose the computer vision community to such novel applications, where techniques from vision and pattern analysis can have a seminal influence on other branches of modern science.","PeriodicalId":307318,"journal":{"name":"The 2nd Canadian Conference on Computer and Robot Vision (CRV'05)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114690940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Segmentation of laparoscopic images: integrating graph-based segmentation and multistage region merging. Yueyun Shu, Guillaume-Alexandre Bilodeau, F. Cheriet. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.74
This paper presents a method that combines graph-based segmentation and multistage region merging to segment laparoscopic images. We first preprocess the images with Gaussian smoothing, brightness and contrast enhancement, and histogram thresholding, and then apply an efficient graph-based method to produce a coarse segmentation. Next, regions are further merged in a multistage process based on features such as grey-level similarity, region size, and common edge length. At each stage, regions are merged iteratively according to a merging score until convergence. Experimental results show that our approach achieves good spatial coherence, accurate edge location, and appropriately segmented regions in real surgical images.
{"title":"Segmentation of laparoscopic images: integrating graph-based segmentation and multistage region merging","authors":"Yueyun Shu, Guillaume-Alexandre Bilodeau, F. Cheriet","doi":"10.1109/CRV.2005.74","DOIUrl":"https://doi.org/10.1109/CRV.2005.74","url":null,"abstract":"This paper presents a method that combines graph-based segmentation and multistage region merging to segment laparoscopic images. Starting with image preprocessing, including Gaussian smoothing, brightness and contrast enhancement, and histogram thresholding, we then apply an efficient graph-based method to produce a coarse segmentation of laparoscopic images. Next, regions are further merged in a multistage process based on features like grey-level similarity, region size and common edge length. At each stage, regions are merged iteratively according to a merging score until convergence. Experimental results show that our approach can achieve good spatial coherence, accurate edge location and appropriately segmented regions in real surgical images.","PeriodicalId":307318,"journal":{"name":"The 2nd Canadian Conference on Computer and Robot Vision (CRV'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129265913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recognition of partially occluded objects using perfect hashing: an efficient and robust approach. R. Dinesh, D. S. Guru. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.66

This paper presents a novel method for recognizing partially occluded objects. The proposed method uses corner points and their spatial relationship, perceived through the triangular spatial relationship (TSR) of Guru and Nagabhushan (2001), by considering three consecutive corner points at a time. The TSR perceived among the corner points is used to create a model object-base using the technique of perfect hashing. The matched sequence is preserved in a two-dimensional matrix called the status matrix. Experimental results on real images of varying complexity, over a reasonably large database of objects, establish the robustness of the method.
An experimental comparison of a hierarchical range image segmentation algorithm. G. Osorio, P. Boulanger, F. Prieto. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.15

This paper describes a new algorithm that segments range images into continuous regions represented by Bezier polynomials. The main problem with many segmentation algorithms is that it is hard to accurately detect large continuous regions and their boundary locations at the same time. Here, a Bayesian framework is used to determine large continuous regions through a region-growing process. Following this process, an exact description of the boundary of each region is computed from the mutual intersection of the extracted parametric polynomials, followed by closure and approximation of this new boundary using a gradient vector flow algorithm. The algorithm is capable of segmenting not only polyhedral objects but also sculptured surfaces, creating a network of closed trimmed Bezier surfaces that are compatible with most CAD systems. Experimental results show that significant improvement in region boundary localization and closure can be achieved. A systematic comparison of our algorithm against the best-known algorithms in the literature highlights its performance.
Kinematic variables estimation using eye-in-hand robot camera system. Siddharth Verma, I. Sharf, G. Dudek. The 2nd Canadian Conference on Computer and Robot Vision (CRV'05), 2005. doi:10.1109/CRV.2005.51

Vision-based motion variable estimation has been an area of intense interest, especially for emerging applications in space robotics such as satellite maintenance, refueling, and the removal of space debris. For each of these tasks, accurate kinematic motion estimates of an object are required before a robot can approach or interact with it. In this paper, a technique is presented for autonomously identifying an object against a cluttered background while simultaneously estimating the kinematic variables of the object undergoing general 3D motion, using an eye-in-hand robot camera system. The object and marker identification strategy has been partially validated using a spherical balloon with circular markers and a stationary camera, while the kinematic variable estimation algorithm has been validated against simulated data.