Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048493
Graeme S. Chambers, S. Venkatesh, G. West, H. Bui
We present a novel technique for the recognition of complex human gestures for video annotation using accelerometers and the hidden Markov model. Our extension to the standard hidden Markov model allows us to consider gestures at different levels of abstraction through a hierarchy of hidden states. Accelerometers in the form of wrist bands are attached to humans performing intentional gestures, such as umpires in sports. Video annotation is then performed by populating the video with time stamps indicating significant events, where a particular gesture occurs. The novelty of the technique lies in the development of a probabilistic hierarchical framework for complex gesture recognition and the use of accelerometers to extract gestures and significant events for video annotation.
{"title":"Hierarchical recognition of intentional human gestures for sports video annotation","authors":"Graeme S. Chambers, S. Venkatesh, G. West, H. Bui","doi":"10.1109/ICPR.2002.1048493","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048493","url":null,"abstract":"We present a novel technique for the recognition of complex human gestures for video annotation using accelerometers and the hidden Markov model. Our extension to the standard hidden Markov model allows us to consider gestures at different levels of abstraction through a hierarchy of hidden states. Accelerometers in the form of wrist bands are attached to humans performing intentional gestures, such as umpires in sports. Video annotation is then performed by populating the video with time stamps indicating significant events, where a particular gesture occurs. The novelty of the technique lies in the development of a probabilistic hierarchical framework for complex gesture recognition and the use of accelerometers to extract gestures and significant events for video annotation.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131358551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048242
Jinwuk Seok, Jeun-Woo Lee
Recently, various types of neural network models have been applied successfully in pattern recognition, control, signal processing, and other areas. However, the previous models are not suitable for hardware implementation due to their complexity. In this paper, we present a survey of the stochastic analysis for the Langevin competitive learning algorithm, known for its easy hardware implementation. Since the Langevin competitive learning algorithm uses a time-invariant learning rate and a stochastic reinforcement term, it must be analyzed with stochastic differential or difference equations. The result of the analysis verifies that the Langevin competitive learning process is equivalent to the standard Ornstein-Uhlenbeck process and has a weak convergence property. The experimental results for Gaussian distributed data confirm the analysis provided in this paper.
{"title":"The analysis of a stochastic differential approach for Langevin competitive learning algorithm","authors":"Jinwuk Seok, Jeun-Woo Lee","doi":"10.1109/ICPR.2002.1048242","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048242","url":null,"abstract":"Recently, various types of neural network models have been used successfully to applications in pattern recognition, control, signal processing, and so on. However, the previous models are not suitable for hardware implementation due to their complexity. In this paper, we present a survey of the stochastic analysis for the Langevin competitive learning algorithm, known for its easy hardware implementation. Since the Langevin competitive learning algorithm uses a time-invariant learning rate and a stochastic reinforcement term, it is necessary to analyze with stochastic differential or difference equation. The result of the analysis verifies that the Langevin competitive learning process is equal to the standard Ornstein-Uhlenback process and has a weak convergence property. The experimental results for Gaussian distributed data confirm the analysis provided in this paper.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133513971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048381
A. Heyden, D. Huynh
A scheme is described for incorporation of scene constraints into the structure from motion problem. Specifically, the absolute quadric is recovered with constraints imposed by orthogonal scene planes. The scheme involves a number of steps. A projective reconstruction is first obtained, followed by a linear technique to form an initial estimate of the absolute quadric. A nonlinear iteration then refines this quadric and the camera intrinsic parameters to upgrade the projective reconstruction to Euclidean. Finally, a bundle adjustment algorithm optimizes the Euclidean reconstruction to give a statistically optimal result. This chain of algorithms is essentially the same as used in auto-calibration and the novelty of this paper is the inclusion of orthogonal scene plane constraints in each step. The algorithms involved are demonstrated on both simulated and real data showing the performance and usability of the proposed scheme.
{"title":"Auto-calibration via the absolute quadric and scene constraints","authors":"A. Heyden, D. Huynh","doi":"10.1109/ICPR.2002.1048381","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048381","url":null,"abstract":"A scheme is described for incorporation of scene constraints into the structure from motion problem. Specifically, the absolute quadric is recovered with constraints imposed by orthogonal scene planes. The scheme involves a number of steps. A projective reconstruction is first obtained, followed by a linear technique to form an initial estimate of the absolute quadric. A nonlinear iteration then refines this quadric and the camera intrinsic parameters to upgrade the projective reconstruction to Euclidean. Finally, a bundle adjustment algorithm optimizes the Euclidean reconstruction to give a statistically optimal result. This chain of algorithms is essentially the same as used in auto-calibration and the novelty of this paper is the inclusion of orthogonal scene plane constraints in each step. The algorithms involved are demonstrated on both simulated and real data showing the performance and usability of the proposed scheme.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133943595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048338
J. Sánchez, Xavier Binefa, J. Kender
We propose a compact descriptor of video contents based on modeling the temporal behavior of image features using coupled Markov chains. The framework allows us to combine multiple features within the same model, including the representation of the dependencies and relationships between them. The Kullback-Leibler divergence stands out as the base of a perceptually significant distance measure for our descriptor. Our experiments show that complex high-level visual contents in different domains can be characterized using very simple low-level features, such as motion and color.
{"title":"Coupled Markov chains for video contents characterization","authors":"J. Sánchez, Xavier Binefa, J. Kender","doi":"10.1109/ICPR.2002.1048338","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048338","url":null,"abstract":"We propose a compact descriptor of video contents based on modeling the temporal behavior of image features using coupled Markov chains. The framework allows us to combine multiple features within the same model, including the representation of the dependencies and relationships between them. The Kullback-Leibler divergence stands out as the base of a perceptually significant distance measure for our descriptor Our experiments show that complex highlevel visual contents in different domains can be characterized using very simple low-level features, such as motion and color.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133973086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048232
Christophe Garcia, M. Delakis
In this paper, we present a connectionist approach for detecting and precisely localizing semi-frontal human faces in complex images, making no assumption about the content or the lighting conditions of the scene, or about the size or the appearance of the faces. We propose a convolutional neural network architecture designed to recognize strongly variable face patterns directly from pixel images with no preprocessing, by automatically synthesizing its own set of feature extractors from a large training set of faces. We present in detail the optimized design of our architecture, our learning strategy, and the resulting process of face detection. We also provide experimental results to demonstrate the robustness of our approach and its capability to precisely detect extremely variable faces in uncontrolled environments.
{"title":"A neural architecture for fast and robust face detection","authors":"Christophe Garcia, M. Delakis","doi":"10.1109/ICPR.2002.1048232","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048232","url":null,"abstract":"In this paper, we present a connectionist approach for detecting and precisely localizing semi-frontal human faces in complex images, making no assumption about the content or the lighting conditions of the scene, or about the size or the appearance of the faces. We propose a convolutional neural network architecture designed to recognize strongly variable face patterns directly from pixel images with no preprocessing, by automatically synthesizing its own set of feature extractors from a large training set of faces. We present in details the optimized design of our architecture, our learning strategy and the resulting process of face detection. We also provide experimental results to demonstrate the robustness of our approach and its capability to precisely detect extremely variable faces in uncontrolled environments.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134013577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048360
J. Au, Ze-Nian Li, M. S. Drew
In this paper, we present a new technique based on feature localization for segmenting and tracking objects in videos. A video locale is a sequence of image feature locales that share similar features (color, texture, shape, and motion) in the spatio-temporal domain of videos. Image feature locales are grown from tiles (blocks of pixels) and can be non-disjoint and non-connected. To exploit the temporal redundancy in digital videos, two algorithms (intra-frame and inter-frame) are used to grow locales efficiently. Multiple motion tracking is achieved by tracking and performing tile-based dominant motion estimation for each locale separately.
{"title":"Object segmentation and tracking using video locales","authors":"J. Au, Ze-Nian Li, M. S. Drew","doi":"10.1109/ICPR.2002.1048360","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048360","url":null,"abstract":"In this paper, we present a new technique based on feature localization for segmenting and tracking objects in videos. A video locale is a sequence of image feature locales that share similar features (color, texture, shape, and motion) in the spatio-temporal domain of videos. Image feature locales are grown from tiles (blocks of pixels) and can be non-disjoint and non-connected. To exploit the temporal redundancy in digital videos, two algorithms (intra-frame and inter-frame) are used to grow locales efficiently. Multiple motion tracking is achieved by tracking and performing tile-based dominant motion estimation for each locale separately.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131821290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048380
Thomas Zöller, L. Hermes, J. Buhmann
Unsupervised image segmentation can be formulated as a clustering problem in which pixels or small image patches are grouped together based on local feature information. In this contribution, parametric distributional clustering (PDC) is presented as a novel approach to image segmentation based on color and texture clues. The objective function of the PDC model is derived from the recently proposed Information Bottleneck framework (Tishby et al., 1999), but it can equivalently be formulated in terms of a maximum likelihood solution. Its optimization is performed by deterministic annealing. Segmentation results are shown for natural wildlife imagery.
{"title":"Combined color and texture segmentation by parametric distributional clustering","authors":"Thomas Zöller, L. Hermes, J. Buhmann","doi":"10.1109/ICPR.2002.1048380","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048380","url":null,"abstract":"Unsupervised image segmentation can be formulated as a clustering problem in which pixels or small image patches are grouped together based on local feature information. In this contribution, parametric distributional clustering (PDC) is presented as a novel approach to image segmentation based on color and texture clues. The objective function of the PDC model is derived from the recently proposed Information Bottleneck framework (Tishby et al., 1999), but it can equivalently be formulated in terms of a maximum likelihood solution. Its optimization is performed by deterministic annealing. Segmentation results are shown for natural wildlife imagery.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114173675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048373
Huiguang He, Jie Tian, Jing Wang, Hong Chen, X. P. Zhang
Edge detection is a basic operation in image processing and analysis. Multiresolution sequential edge linking (MSEL; Cook and Delp, 1995) has a number of advantages over other edge detection schemes, such as lower false alarm rates while maintaining full connectivity of the edge. However, its initial value selection is not well justified, and it is time-consuming. To address these problems, we first use anisotropic diffusion to smooth the image while preserving edges, and then use a feedback method to optimize the initial value. We apply our method to a medical image, and experiments show that it is more efficient and accurate than the original MSEL.
{"title":"Improved MSEL and its medical application","authors":"Huiguang He, Jie Tian, Jing Wang, Hong Chen, X. P. Zhang","doi":"10.1109/ICPR.2002.1048373","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048373","url":null,"abstract":"Edge detection is the basic operation in image processing and analysis. Multiresolution sequential edge linking (MSEL) Cook and Delp (1995) has a number of advantages over other edge detection schemes, such as lower false alarm rates while maintaining full connectivity of the edge. However, it is not reasonable in the initial value selection and is time consuming. For this problem, we first use anisotropic diffusion to smooth the image while keeping the edge, and then use the feedback method to optimize the initial value. We apply our method to a medical image, and experiments show that our method is more efficient and accurate than the old MSEL.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117125021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048479
T. Ojala, Markus Aittola, Esa Matinmikko
This paper conducts an empirical evaluation of the color descriptors in the visual part of the MPEG-7 experimentation model (XM) on the challenging problem of content-based retrieval of semantic image categories. The performance of the four color descriptors provided in the current XM reference implementation (Color Layout, Color Structure, Dominant Color, and Scalable Color) is compared to that of the HSV autocorrelogram, which has done well in recent empirical studies. Experimental results show that Color Structure provides the best retrieval accuracy, whereas Dominant Color, the computationally most expensive descriptor, performs worst in this problem.
{"title":"Empirical evaluation of MPEG-7 XM color descriptors in content-based retrieval of semantic image categories","authors":"T. Ojala, Markus Aittola, Esa Matinmikko","doi":"10.1109/ICPR.2002.1048479","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048479","url":null,"abstract":"This paper conducts an empirical evaluation of MPEG-7 visual part of experimentation model (XM) color descriptors in a challenging problem of content-based retrieval of semantic image categories. The performance of the four color descriptors provided in the current XM reference implementation, Color Layout, Color Structure, Dominant Color and Scalable Color is compared to that of HSV autocorrelogram, which has done well in recent empirical studies. Experimental results show that Color Structure provides best retrieval accuracy, whereas the computationally most expensive descriptor Dominant Color is worst in this problem.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123309336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2002-12-10 DOI: 10.1109/ICPR.2002.1048364
Guanghui Wang, Yihong Wu, Zhanyi Hu
An approach is proposed for single-view-based plane metrology. The approach is based on a pair of vanishing points from two orthogonal sets of parallel space lines. Extensive experiments on simulated data as well as on real images show that the new approach achieves results as good as those of the homography-based approach widely used in the literature, without needing any explicit specification of space control points. Since orthogonal lines are common in many real applications, particularly in indoor environments (for example, the frame of a window or a door), the new approach is widely applicable.
{"title":"A novel approach for single view based plane metrology","authors":"Guanghui Wang, Yihong Wu, Zhanyi Hu","doi":"10.1109/ICPR.2002.1048364","DOIUrl":"https://doi.org/10.1109/ICPR.2002.1048364","url":null,"abstract":"An approach is proposed for single view based plane metrology. The approach is based on a pair of vanishing points from two orthogonal sets of space parallel lines. Extensive experiments on simulated data as well as on real images showed that our new approach can achieve as good result as that of the homography based one which is widely used in the literature, but our new approach does not need any explicit specifications of space control points. Since in many real applications, particularly, in indoor environment, orthogonal lines are not rare, for example, a frame of window or a door, our new approach is of widely applicable.","PeriodicalId":159502,"journal":{"name":"Object recognition supported by user interaction for service robots","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125146397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}