Fahong Li and R. Woodham, "Analysis of player actions in selected hockey game situations." DOI: 10.1109/CRV.2005.17

We present a proof-of-concept system to represent and reason about hockey play. The system takes as input player motion trajectory data tracked from game video, supported by knowledge of hockey strategy, game situation, and specific player profiles. The raw motion trajectory data consists of space-time point sequences of player position registered to rink coordinates. The raw data is augmented with knowledge of forward/backward skating, possession of the puck, and specific player attributes (e.g., shoots left, shoots right). We use a finite state machine (FSM) model to represent our total knowledge of given situations and develop evaluation functions for primitive hockey behaviours (e.g., pass, shot). Based on the augmented trajectory data, the FSMs, and the evaluation functions, we describe what happened in each identified situation, assess the outcome, estimate when and where key play choices were made, and attempt to determine whether better alternatives were available to achieve understood goals. A textual natural language description and a simple 2D graphic animation of the analysis are produced as output. The design is flexible, allowing the substitution of different analysis modules, and extensible, allowing the inclusion of additional hockey situations.
B. Johansson and A. Moe, "Patch-duplets for object recognition and pose estimation." DOI: 10.1109/CRV.2005.58

This paper describes a view-based method for object recognition and estimation of object pose from a single image. The method is based on feature vector matching and clustering. A set of interest points is detected and combined into pairs. A pair of patches, centered around each point in the pair, is extracted from a local orientation image. The patch orientation and size depend on the relative positions of the points, which makes the features invariant to translation and rotation, and locally invariant to scale. Each pair of patches constitutes a feature vector. The method is demonstrated on a number of real images, and the patch-duplet feature is compared to the SIFT feature.
D. Forsyth, "Looking at People." DOI: 10.1109/CRV.2005.52

There is a great need for programs that can describe what people are doing from video. This is difficult because it is hard to identify and track people in video sequences, because we have no canonical vocabulary for describing what people are doing, and because the interpretation of what people are doing depends very strongly on what is nearby. Tracking is hard because it is important to track relatively small structures that can move relatively fast, for example, lower arms. I will describe research into kinematic tracking, that is, tracking that reports the kinematic configuration of the body, which has resulted in a fairly accurate, fully automatic tracker that can keep track of multiple people. Once one has tracked the body, one must interpret the results. One way to do so is to have a motion synthesis system that takes the track and produces a motion that is (a) like a human motion and (b) close to the track. Our work has produced a high-quality motion synthesis system that can produce motions that look very much like human activities. I will describe work that couples that system with a tracker to produce a description of the activities, entirely automatically. I will speculate on some of the many open problems. What should one report? How do nearby objects affect one's interpretation of activities? How can one interpret patterns of behavior?
J. Garcia, N. Lobo, M. Shah, and J. Feinstein, "Automatic detection of heads in colored images." DOI: 10.1109/CRV.2005.21

We use shape-based searching to detect head contours of arbitrary size in single images. We begin by looking for a head model that is optimal for finding head contours of various orientations. Then we develop a search technique for finding contours of arbitrary size and present a way in which the search space can be narrowed using color and intensity gradient information. We show results and discuss how the method could be improved.
M. Paulin, "Feature planning for robust execution of general robot tasks using visual servoing." DOI: 10.1109/CRV.2005.43

In this paper we present a new method for automatic feature planning for visual tracking systems employed in visual servoing control of robot manipulators. Such planning of optimal feature sets is of utmost importance for ensuring accurate and robust execution of general robot tasks using visual servoing. First, we introduce a novel platform for simulation and preparation of visual servoing systems. Subsequently, we demonstrate how this platform, together with combinatorial optimization techniques and fitness measures that consider several aspects of the tracking system's robustness, can be used to plan reliable and information-rich feature sets. Finally, we present experiments that compare the performance of a visual servoing system employing the proposed feature planning technique to that of a servoing system based on features selected using traditional methods. These experiments demonstrate that our technique not only improves the robustness of the visual tracking system but also significantly increases the accuracy of the visual servoing control loop.
Neil D. B. Bruce and John K. Tsotsos, "An attentional framework for stereo vision." DOI: 10.1109/CRV.2005.13

The necessity and utility of visual attention are discussed in the context of stereo vision in machines and primates. Specific problems that arise in this domain, including binocular rivalry and the deployment of attention in three-dimensional space, are considered. Necessary conditions are outlined for achieving appropriate attentional behaviour in both of the aforementioned domains. In this light, we outline classes of existing computational models of attention and discuss their applicability to realizing binocular attention. Finally, a stereo attention framework is presented by considering the tenets of an existing attentional architecture that extends naturally to the binocular domain, in conjunction with the connectivity of units involved in achieving stereo vision.
Luis Malagón-Borja and O. Fuentes, "An object detection system using image reconstruction with PCA." DOI: 10.1109/CRV.2005.16

We present an object detection system applied to detecting pedestrians in still images, without assuming any a priori knowledge about the image. The system works in two stages: first, a classifier examines each location in the image at different scales; second, the system tries to eliminate false detections based on heuristics. The classifier is based on the idea that principal component analysis (PCA) can optimally compress only the kind of images that were used to compute the principal components (PCs), and that any other kind of image will not be compressed well using a few components. The classifier therefore performs PCA separately on the positive examples and on the negative examples; when it needs to classify a new pattern, it projects the pattern onto both sets of PCs and compares the reconstructions. The system is able to detect frontal and rear views of pedestrians, and usually can also detect side views despite not being trained for them. Comparisons with other pedestrian detection systems are presented; our system performs better in both positive detection rate and false detection rate.
F. Woelk, S. Gehrig, and R. Koch, "A monocular collision warning system." DOI: 10.1109/CRV.2005.8

A system for the detection of independently moving objects by a moving observer by means of investigating optical flow fields is presented. The usability of the algorithm is shown by a collision detection application. Since the measurement of optical flow is a computationally expensive operation, it is necessary to restrict the number of flow measurements. The first part of the paper describes the use of a particle filter to determine the positions at which optical flow is calculated. This approach results in a fixed number of optical flow calculations, leading to robust real-time detection of independently moving objects on standard consumer PCs. The detection method for independent motion relies on knowledge of the camera motion. Even though inertial sensors provide information about the camera motion, the sensor data do not always satisfy the requirements of the proposed detection method. The second part of the paper therefore deals with the refinement of the camera motion estimate using image information. The third part specifies the final decision module, which derives a decision (whether to issue a warning or not) from the sparse detection information.
M. C. Santana, O. Déniz-Suárez, J. Lorenzo-Navarro, and M. Hernández-Tejera, "Face exemplars selection from video streams for online learning." DOI: 10.1109/CRV.2005.41

This paper tackles the problem of online acquisition of exemplars for dynamically updating classifiers for facial analysis. Most facial analysis systems apply a previously computed classifier to a set of images or, more recently, to the output of real-time face detection systems. Here we describe an approach that selects significant detected faces during interactive sessions in order to learn and modify a classifier for a given task online, with the initial help of an expert. Preliminary experiments related to gender recognition are reported.
P. Iles, David A. Clausi, Shannon M. Puddister, and G. Brodland, "Average cell orientation, shape and size estimated from tissue images." DOI: 10.1109/CRV.2005.22

Four computer vision algorithms to measure the average orientation, shape, and size of cells in images of biological tissue are proposed and tested. These properties, which can be embodied by an elliptical 'composite cell', are crucial for biomechanical tissue models. Determining them automatically is challenging due to the diverse nature of the image data, with tremendous and unpredictable variability in illumination, cell pigmentation, cell shape, and cell boundary visibility. First, a simple edge detection routine is performed on the raw images to locate cell edges and remove pigmentation variation. The edge map is then converted into the magnitude spatial-frequency domain, where the spatial patterns of the cells appear as energy impulses. Four candidate methods that analyze the spatial-frequency data to estimate the properties of the composite cell are presented and compared: least squares ellipse fitting, correlation, area moments, and Gabor filters. Robustness is demonstrated by successful application to a wide variety of real images.