Texture Structure Classification and Depth Estimation using Multi-Scale Local Autocorrelation Features
Pub Date: 2003-06-16 | DOI: 10.1109/CVPRW.2003.10067
Yousun Kang, O. Hasegawa, H. Nagahashi
Some image textures change appearance with scale while others do not. We focus on multi-scale features that capture how sensitive texture intensity is to scale change. This paper presents a new method of texture structure classification and depth estimation using multi-scale features extracted from higher-order local autocorrelation functions. The multi-scale features consist of the means and variances of the distributions of autocorrelation feature vectors across scale levels. To reduce the dimensionality of the feature vectors, we apply Principal Component Analysis (PCA) in the autocorrelation feature space. Each training texture builds its own multi-scale model in the reduced PCA feature space, and each test texture image is projected into the homogeneous PCA space of the training data. Experimental results show that the proposed multi-scale features can be used not only for texture classification but also for depth estimation in two-dimensional images with texture gradients.
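As a rough illustration of the feature pipeline described above, here is a minimal sketch of HLAC-style higher-order local autocorrelation features computed over a simple image pyramid, followed by PCA. The mask set, pyramid scheme, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hlac_features(img):
    """0th-, 1st-, and a few 2nd-order local autocorrelations on a 3x3 window (illustrative mask subset)."""
    p = img.astype(float)
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]          # displacements (dy, dx)
    feats = [p.sum()]                                   # 0th order
    for dy, dx in shifts:                               # 1st order: sum of p(r) * p(r + a)
        s = np.roll(np.roll(p, -dy, 0), -dx, 1)
        feats.append((p[1:-1, 1:-1] * s[1:-1, 1:-1]).sum())
    for (dy1, dx1), (dy2, dx2) in [((0, 1), (1, 0)), ((0, 1), (1, 1))]:
        a = np.roll(np.roll(p, -dy1, 0), -dx1, 1)       # 2nd order: p(r) * p(r+a) * p(r+b)
        b = np.roll(np.roll(p, -dy2, 0), -dx2, 1)
        feats.append((p * a * b)[1:-1, 1:-1].sum())
    return np.array(feats)

def multiscale_features(img, levels=3):
    """Concatenate HLAC features over a crude 2x-downsampling pyramid."""
    out = []
    for _ in range(levels):
        out.append(hlac_features(img))
        img = img[::2, ::2]
    return np.concatenate(out)

def pca(X, k):
    """Project rows of X (n_samples x n_dims) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```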
{"title":"Texture Structure Classification and Depth Estimation using Multi-Scale Local Autocorrelation Features","authors":"Yousun Kang, O. Hasegawa, H. Nagahashi","doi":"10.1109/CVPRW.2003.10067","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10067","url":null,"abstract":"While some image textures can be changed with scale, others cannot. We focus on a multi-scale features of determing the sensitivity of the texture intensity to change. This paper presents a new method of texture structure classification and depth estimation using multi-scale features extracted from a higher order of the local autocorrelation functions. Multi-scale features consist of the meansand variances of distributions, which are extracted from theautocorrelation feature vectors according to multi-level scale. In order to reduce dimensional feature vectors, we employ the Principal Component Analysis (PCA) in the autocorrelation feature space. Each training image texture makes its own multi-scale model in a reduced PCA feature space, and the test of the texture image is projected in the homogeneous PCA space of the training data. The experimental results show that the proposed multi-scale feature can be utilized notonly for texture classification, but also depth estimation in two dimensional images with texture gradients.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127739034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Wearable Camera System for Pointing Gesture Recognition and Detecting Indicated Objects
Pub Date: 2003-06-16 | DOI: 10.1109/CVPRW.2003.10071
T. Mashita, Y. Iwai, M. Yachida
We propose a system for pointing gesture recognition and detection of indicated objects using a vision sensor. By using random sampling and importance sampling, our method can track hands and estimate hand positions in real time. Using the concepts of a cognitive origin and a reference plane, our system can also detect the direction toward an indicated object. We use an omnidirectional vision sensor to cover the wide range of hand motions and the movement of indicated objects. The camera is head-mounted, which makes the system tolerant of occlusion. The method for detecting an indicated object uses a linear model built on the cognitive origin and reference plane concepts.
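The "random sampling and importance sampling" tracking step reads like a basic particle filter; below is a minimal sketch of one predict-weigh-resample cycle over 2-D hand-position hypotheses. The skin-likelihood callback, motion model, and parameter values are stand-in assumptions, not the paper's exact algorithm.

```python
import numpy as np

def track_step(particles, weights, likelihood, motion_std=5.0):
    """One particle-filter cycle: resample by importance weights, predict, reweigh."""
    n = len(particles)
    idx = np.random.choice(n, size=n, p=weights)          # resample
    particles = particles[idx]
    particles = particles + np.random.normal(0, motion_std, particles.shape)  # random-walk prediction
    w = np.array([likelihood(p) for p in particles])      # image likelihood (e.g. skin-color score)
    w = w / w.sum()
    return particles, w, particles.T @ w                  # weighted mean = hand position estimate

# usage with a toy Gaussian likelihood centered at (320, 240)
particles = np.random.uniform(0, 640, (500, 2))
weights = np.full(500, 1 / 500)
lik = lambda p: np.exp(-np.sum((p - [320, 240]) ** 2) / 1e4)
particles, weights, est = track_step(particles, weights, lik)
print(est)
```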
{"title":"A WearableCamera System for Pointing Gesture Recognition and Detecting Indicated Objects","authors":"T. Mashita, Y. Iwai, M. Yachida","doi":"10.1109/CVPRW.2003.10071","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10071","url":null,"abstract":"We propose a system for pointing gesture recognition and detecting indicated objects by using a vision sensor. By using random sampling and importance sampling, our method can track hands and estimate hand positions in real-time. By using the concepts of a cognitiveorigin and a reference plane, our system can also detect a direction to an indicated object. We use an omnidirectional vision sensor in order to cover the wide range of hand operations and movement of indicated objects. The camera is mounted on the head, which enables the system to be tolerant of the occlusion problem. The method for detecting an indicated object uses a linear model with the concepts of a cognitive origin and a reference plane.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115299099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Word Extraction Using Area Voronoi Diagram
Pub Date: 2003-06-16 | DOI: 10.1109/CVPRW.2003.10030
Zhe Wang, Yue Lu, C. Tan
A method of word extraction based on the area Voronoi diagram is presented in this paper. First, connected components are generated from the input image. Second, noise removal is performed, including a detection technique for the special symbols that can lie between words. Third, based on the area Voronoi diagram, we select the Voronoi edges that separate neighboring connected components. Finally, words are extracted by merging connected components according to the Voronoi edges between them. The method correctly groups words of different sizes, fonts, and arrangements, and experiments show that it achieves high accuracy.
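To make the merging step concrete, here is a minimal sketch that approximates the paper's area Voronoi diagram with a point Voronoi diagram over component centroids (Delaunay neighbors are exactly the components whose Voronoi cells share an edge). The distance threshold and the centroid approximation are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import Delaunay

def group_words(centroids, max_gap=20.0):
    """Merge Voronoi-adjacent components whose centroids are closer than max_gap."""
    centroids = np.asarray(centroids, float)
    parent = list(range(len(centroids)))
    def find(i):                                 # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    tri = Delaunay(centroids)
    for simplex in tri.simplices:                # every pair in a triangle is Voronoi-adjacent
        for i in range(3):
            a, b = simplex[i], simplex[(i + 1) % 3]
            if np.linalg.norm(centroids[a] - centroids[b]) < max_gap:
                parent[find(a)] = find(b)        # drop the separating edge: merge into one word
    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

print(group_words([[0, 0], [10, 0], [15, 2], [100, 0], [110, 1]]))
```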
{"title":"Word Extraction Using Area Voronoi Diagram","authors":"Zhe Wang, Yue Lu, C. Tan","doi":"10.1109/CVPRW.2003.10030","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10030","url":null,"abstract":"A method of word extraction based on the area Voronoi diagram is presented in this paper. Firstly, connected components are generated from the input image. Secondly, noise removal is performed including a special symbol detection technique to find some types of special symbols lying between words. Thirdly, base on the area Voronoi diagram, we select appropriate Voronoi edges which separate two neighboring connected components. Finally, words are extracted by merging the connected components based on the Voronoi edge between them. The result generated by this method is satisfactory with the ability to correctly group words of different size, font and arrangement. Experiments show that the proposed method achieves a high accuracy.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115156603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The 4D Touchpad: Unencumbered HCI With VICs
Pub Date: 2003-06-01 | DOI: 10.1109/CVPRW.2003.10052
Jason J. Corso, Darius Burschka, Gregory Hager
We present a platform for human-machine interfaces that provides functionality for robust, unencumbered interaction: the 4D Touchpad (4DT). The goal is direct interaction with interface components through intuitive actions and gestures. The 4DT is based on the 3D-2D Projection-based mode of the VICs framework. The fundamental idea behind VICs is that expensive global image processing with user modeling and tracking is not necessary in general vision-based HCI. Instead, interface components operating under simple-to-complex rules in local image regions provide more robust and less costly functionality with 3 spatial dimensions and 1 temporal dimension. A prototype realization of the 4DT platform is presented; it operates through a set of planar homographies with uncalibrated cameras.
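Since the 4DT relates uncalibrated camera views of the touchpad plane through planar homographies, a homography estimator is the core primitive. Below is a minimal sketch of the basic DLT (without the usual point normalization); the point values are made up for illustration and this is not the authors' code.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from n >= 4 point correspondences."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)                    # null vector of A, reshaped
    return H / H[2, 2]

# usage: map a fingertip seen in one view into the other view of the plane
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
dst = np.array([[10, 12], [110, 8], [115, 105], [8, 100]], float)
H = homography_dlt(src, dst)
p = H @ np.array([0.5, 0.5, 1.0])
print(p[:2] / p[2])                             # transferred point
```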
{"title":"The 4D Touchpad: Unencumbered HCI With VICs","authors":"Jason J. Corso, Darius Burschka, Gregory Hager","doi":"10.1109/CVPRW.2003.10052","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10052","url":null,"abstract":"We present a platform for human-machine interfaces that provides functionality for robust, unencumbered interaction: the 4D Touchpad (4DT). The goal is direct interaction with interface components through intuitive actions and gestures. The 4DT is based on the 3D-2D Projection-based mode of the VICs framework. The fundamental idea behind VICs is that expensive global image processing with user modeling and tracking is not necessary in general vision-based HCI. Instead, interface components operating under simple-to-complex rules in local image regions provide more robust and less costly functionality with 3 spatial dimensions and 1 temporal dimension. A prototype realization of the 4DT platform is presented; it operates through a set of planar homographies with uncalibrated cameras.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122252142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
General Data Association with Possibly Unresolved Measurements Using Linear Programming
Pub Date: 2003-06-01 | DOI: 10.1109/CVPRW.2003.10102
Huimin Chen, K. Pattipati, T. Kirubarajan, Y. Bar-Shalom
In this paper we formulate data association with possibly unresolved measurements as an augmented assignment problem. Unlike conventional measurement-to-track association via assignment, the augmented problem has much greater complexity because each measurement can have a single origin or multiple origins. The key point is that standard one-to-one assignment algorithms do not work with unresolved measurements because the constraints of the augmented assignment problem are very different. A suboptimal approach solves the resulting optimization problem via linear programming (LP) by relaxing the integer constraints. A tracker based on the probabilistic data association filter (PDAF) that uses the LP solutions is also discussed. Simulation results show that the percentage of track loss is significantly reduced by solving the augmented assignment rather than the conventional one.
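To illustrate the relaxation idea only: the sketch below relaxes 0/1 assignment variables to [0, 1] and lets a measurement serve up to two tracks (the unresolved case). This toy constraint set is my assumption, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linprog

def relaxed_assignment(score):
    """score[i, j]: log-likelihood of track i generating measurement j."""
    n_trk, n_meas = score.shape
    c = -score.ravel()                           # linprog minimizes; we maximize total score
    A_ub, b_ub = [], []
    for i in range(n_trk):                       # each track uses at most one measurement
        row = np.zeros(n_trk * n_meas)
        row[i * n_meas:(i + 1) * n_meas] = 1
        A_ub.append(row); b_ub.append(1)
    for j in range(n_meas):                      # a measurement may have up to two origins
        row = np.zeros(n_trk * n_meas)
        row[j::n_meas] = 1
        A_ub.append(row); b_ub.append(2)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, 1))
    return res.x.reshape(n_trk, n_meas)          # fractional weights, usable PDAF-style

print(relaxed_assignment(np.array([[2.0, 0.1], [1.8, 0.2], [0.1, 1.5]])))
```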
{"title":"General Data Association with Possibly Unresolved Measurements Using Linear Programming","authors":"Huimin Chen, K. Pattipati, T. Kirubarajan, Y. Bar-Shalom","doi":"10.1109/CVPRW.2003.10102","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10102","url":null,"abstract":"In this paper we formulate data association with possibly unresolved measurements as an augmented assignment problem. Unlike conventional measurement-to-track association via assignment, this augmented assignment problem has much greater complexity when each target originated measurement can be of single or multiple origins. The main point is that standard one-to-one assignment algorithms do not work in the case of unresolved measurements because the constraints in the augmented assignment problem are very different. A suboptimal approach is considered for solving the resulting optimization problem via linear programming (LP) by relaxing the integer constraints. A tracker based on probabilistic data association filter (PDAF) using the LP solutions is also discussed. Simulation results show that the percentage of track loss is significantly reduced by solving the augmented assignment rather than the conventional assignment.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":" 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132011367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Iterative Relief
Pub Date: 2003-06-01 | DOI: 10.1109/cvprw.2003.10065
B. Draper, Carol Kaito, J. Bins
Feature weighting algorithms assign weights to features according to their relevance to a particular task. Unfortunately, the best-known feature weighting algorithm, ReliefF, is biased. It decreases the relevance of some features and increases the relevance of others when irrelevant attributes are added to the data set. This paper presents an improved version of the algorithm, Iterative Relief, and shows on synthetic data that it removes the bias found in ReliefF. This paper also shows that Iterative Relief outperforms ReliefF on the task of cat and dog discrimination, using real images.
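A minimal sketch of the iterative reweighting idea: Relief-style hit/miss updates are recomputed with a distance metric that itself uses the current weights, repeating until the weights stabilize. The L1 metric, normalization, and stopping rule here are simplified assumptions, not the paper's exact algorithm.

```python
import numpy as np

def iterative_relief(X, y, n_iter=10):
    """Iteratively reweighted Relief-style feature weights for data X, labels y."""
    n, d = X.shape
    w = np.ones(d) / d
    for _ in range(n_iter):
        delta = np.zeros(d)
        for i in range(n):
            dist = (np.abs(X - X[i]) * w).sum(axis=1)   # weighted L1 distance under current w
            dist[i] = np.inf                            # exclude the point itself
            hit = np.argmin(np.where(y == y[i], dist, np.inf))   # nearest same-class neighbor
            miss = np.argmin(np.where(y != y[i], dist, np.inf))  # nearest other-class neighbor
            delta += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        w = np.maximum(delta, 0)                        # relevant features separate classes
        w = w / w.sum() if w.sum() > 0 else np.ones(d) / d
    return w

X = np.random.rand(60, 4); X[:30, 0] += 2.0            # feature 0 is the relevant one
y = np.array([0] * 30 + [1] * 30)
print(iterative_relief(X, y))
```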
{"title":"Iterative Relief","authors":"B. Draper, Carol Kaito, J. Bins","doi":"10.1109/cvprw.2003.10065","DOIUrl":"https://doi.org/10.1109/cvprw.2003.10065","url":null,"abstract":"Feature weighting algorithms assign weights to features according to their relevance to a particular task. Unfortunately, the best-known feature weighting algorithm, ReliefF, is biased. It decreases the relevance of some features and increases the relevance of others when irrelevant attributes are added to the data set. This paper presents an improved version of the algorithm, Iterative Relief, and shows on synthetic data that it removes the bias found in ReliefF. This paper also shows that Iterative Relief outperforms ReliefF on the task of cat and dog discrimination, using real images.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134495590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gesture + Play: Exploring Full-Body Navigation for Virtual Environments
Pub Date: 2003-06-01 | DOI: 10.1109/CVPRW.2003.10046
Konrad Tollmar, D. Demirdjian, Trevor Darrell
Navigating virtual environments usually requires a wired interface, game console, or keyboard. The advent of perceptual interface techniques allows a new option: passive, untethered sensing of users' pose and gesture that lets them maneuver through and manipulate virtual worlds. We describe new algorithms for interacting with 3-D environments using real-time articulated body tracking with standard cameras and personal computers. Our method is based on rigid stereo-motion estimation algorithms and uses a linear technique for enforcing articulation constraints. With our tracking system, users can navigate virtual environments using 3-D gestures and body poses. We analyze the space of possible perceptual interface abstractions for full-body navigation and present a prototype system based on this analysis. Finally, we describe an initial evaluation of the prototype with users guiding avatars through a series of 3-D virtual game worlds.
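A standard building block for the rigid stereo-motion step is least-squares rigid alignment of two matched 3-D point clouds (e.g., stereo reconstructions of a limb at consecutive frames). Below is a minimal sketch of the SVD/Procrustes solution; it is the generic algorithm, not the authors' exact estimator.

```python
import numpy as np

def rigid_motion(P, Q):
    """Least-squares R, t with Q ~ R @ P + t; P, Q are (n, 3) matched point sets."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                    # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

# usage with a synthetic rotation about z plus a translation
P = np.random.rand(20, 3)
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
Q = P @ Rz.T + np.array([1.0, 2.0, 0.5])
R, t = rigid_motion(P, Q)
print(np.allclose(R, Rz), np.allclose(t, [1.0, 2.0, 0.5]))
```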
{"title":"Gesture + Play Exploring Full-Body Navigation for Virtual Environments","authors":"Konrad Tollmar, D. Demirdjian, Trevor Darrell","doi":"10.1109/CVPRW.2003.10046","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10046","url":null,"abstract":"Navigating virtual environments usually requires a wired interface, game console, or keyboard. The advent of perceptual interface techniques allows a new option: the passive and untethered sensing of users' pose and gesture to allow them maneuver through and manipulate virtual worlds. We describe new algorithms for interacting with 3-D environments using real-time articulated body tracking with standard cameras and personal computers. Our method is based on rigid stereo-motion estimation algorithms and uses a linear technique for enforcing articulation constraints. With our tracking system users can navigate virtual environments using 3-D gesture and body poses. We analyze the space of possible perceptual interface abstractions for full-body navigation, and present a prototype system based on these results. We finally describe an initial evaluation of our prototype system with users guiding avatars through a series of 3-D virtual game worlds.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131203821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards Perceptual Interface for Visualization Navigation of Large Data Sets
Pub Date: 2003-03-20 | DOI: 10.1109/CVPRW.2003.10045
M. Shin, L. Tsap, Dmitry Goldgof
This paper presents a perceptual interface for visualization navigation using gesture recognition. Scientists are interested in interactive settings for exploring large data sets in an intuitive environment. The input consists of registered 3-D data. Bezier curves are used for trajectory analysis and classification of gestures. The method is robust and reliable: the correct hand identification rate is 99.9% (over 1641 frames), hand movement modes are classified correctly 95.6% of the time, and the recognition rate (given the correct mode) is 97.9%. An application to gesture-controlled visualization is also presented. The paper advances the state of the art in human-computer interaction with robust, attachment- and marker-free gestural information processing for visualization.
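As a concrete illustration of the trajectory-analysis step, here is a minimal sketch of fitting a cubic Bezier curve to a hand trajectory by linear least squares; the chord-length parameterization and the use of the fitted control points as a classifier feature are plausible assumptions consistent with the description above, not the paper's exact procedure.

```python
import numpy as np

def fit_cubic_bezier(points):
    """Return the 4 control points minimizing least-squares error to the trajectory."""
    pts = np.asarray(points, float)
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d / d[-1]                                        # chord-length parameter in [0, 1]
    B = np.stack([(1 - t) ** 3, 3 * t * (1 - t) ** 2,
                  3 * t ** 2 * (1 - t), t ** 3], axis=1) # cubic Bernstein basis
    ctrl, *_ = np.linalg.lstsq(B, pts, rcond=None)
    return ctrl                                          # (4, 2) control points

# the flattened control points can then feed a gesture classifier
traj = np.column_stack([np.linspace(0, 1, 30), np.sin(np.linspace(0, np.pi, 30))])
print(fit_cubic_bezier(traj))
```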
{"title":"Towards Perceptual Interface for Visualization Navigation of Large Data Sets","authors":"M. Shin, L. Tsap, Dmitry Goldgof","doi":"10.1109/CVPRW.2003.10045","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10045","url":null,"abstract":"This paper presents a perceptual interface for visualization navigation using gesture recognition. Scientists are interested in developing interactive settings for exploring large data sets in an intuitive environment. The input consists of registered 3-D data. Bezier curves are used for trajectory analysis and classification of gestures. The method is robust and reliable: correct hand identification rate is 99.9% (from 1641 frames), modes of hand movements are correct 95.6% of the time, recognition rate (given the right mode) is 97.9%. An application to gesture-controlled visualization is also presented. The paper advances the state-of-the-art of human-computer interaction with a robust attachment- and marker-free gestural information processing for visualization.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132098717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kernel Pooled Local Subspaces for Classification
Pub Date: 1900-01-01 | DOI: 10.1109/CVPRW.2003.10060
Peng Zhang, Jing Peng, C. Domeniconi
We study the use of kernel subspace methods for learning low-dimensional representations for classification. We propose a kernel pooled local discriminant subspace method and compare it against several competing techniques: Principal Component Analysis (PCA), Kernel PCA (KPCA), and linear local pooling. We evaluate the classification performance of the nearest-neighbor rule under each subspace representation. The experimental results demonstrate that the kernel pooled subspace method outperforms competing methods such as PCA and KPCA in some classification problems.
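For reference, here is a minimal sketch of one of the baselines compared above, Kernel PCA with an RBF kernel; the pooled local discriminant variant adds class-wise pooling on top of projections like these. The kernel width and component count are illustrative assumptions.

```python
import numpy as np

def kernel_pca(X, k=2, gamma=1.0):
    """Project training points X onto the top-k kernel principal components (RBF kernel)."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF Gram matrix
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                   # double-center the Gram matrix
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:k]                 # top-k eigenpairs
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                               # feature-space coordinates of X

X = np.random.rand(50, 5)
print(kernel_pca(X, k=2).shape)                      # (50, 2)
```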
{"title":"Kernel Pooled Local Subspaces for Classification","authors":"Peng Zhang, Jing Peng, C. Domeniconi","doi":"10.1109/CVPRW.2003.10060","DOIUrl":"https://doi.org/10.1109/CVPRW.2003.10060","url":null,"abstract":"We study the use of kernel subspace methods for learning low-dimensional representations for classification. We propose a kernel pooled local discriminant subspace method and compare it against several competing techniques: Principal Component Analysis (PCA), Kernel PCA (KPCA), and linear local pooling in classification problems. We evaluate the classification performance of the nearest-neighbor rule with each subspace representation. The experimental results demonstrate the effectiveness and performance superiority of the kernel pooled subspace method over competing methods such as PCA and KPCA in some classification problems.","PeriodicalId":121249,"journal":{"name":"2003 Conference on Computer Vision and Pattern Recognition Workshop","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132437644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}