A vision-based walking motion parameters capturing system
W. Lin, Chung-Lin Huang, Shih-Chung Hsu, Hung-Wei Lin, Hau-Wei Wang
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166625
Markerless vision-based capture of human motion parameters has been widely applied in human-machine interfaces. However, it faces two problems: high-dimensional parameter estimation and self-occlusion. Here, we propose a 3-D human model with structural, kinematic, and temporal constraints to track a walking human subject from any viewing direction. Our method modifies the Annealed Particle Filter (APF) by applying a pre-trained spatial correlation map and a temporal constraint to estimate the motion parameters of the walking subject. Experiments demonstrate that the proposed method requires less computation time and produces more accurate results.
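The paper's specific modifications (the pre-trained spatial correlation map and temporal constraint) cannot be reconstructed from the abstract, but the generic annealed-particle-filter layer loop it builds on can be sketched. A minimal sketch on a toy 1-D state estimation problem; the annealing schedule, diffusion scale, and toy likelihood are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def annealed_particle_filter(log_like, n_particles=200, n_layers=5,
                             spread=2.0, rng=None):
    """Generic APF layer loop: weight particles with a progressively
    sharpened likelihood, resample, then diffuse with shrinking noise."""
    rng = rng or np.random.default_rng(0)
    particles = rng.normal(0.0, spread, n_particles)
    for m in range(n_layers):
        beta = (m + 1) / n_layers                    # annealing: soft -> sharp
        w = np.exp(beta * log_like(particles))
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)   # resample by weight
        sigma = spread * 0.5 * (1.0 - m / n_layers)       # shrinking diffusion
        particles = particles[idx] + rng.normal(0.0, sigma, n_particles)
    w = np.exp(log_like(particles))
    w /= w.sum()
    return float(np.dot(w, particles))

# Toy 1-D "pose parameter" posterior peaked at 1.5
estimate = annealed_particle_filter(lambda x: -(x - 1.5) ** 2 / 0.1)
```

The annealing exponent lets early layers survey the broad, multi-modal search space while later layers concentrate particles on the sharpest likelihood peak, which is what makes APF tractable for high-dimensional body-pose spaces.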
Building facade interpretation exploiting repetition and mixed templates
Gang Zeng, Rui Gan, H. Zha
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166611
Like many natural and man-made objects, buildings contain repeating elements. Repetition is an important cue for many applications, and it can be partial, approximate, or both. This paper presents a robust and accurate building facade interpretation algorithm that processes a single input image and efficiently discovers and extracts repeating elements (e.g., windows) without any prior knowledge of their shape, intensity, or structure. The method locally registers certain key regions in pairs and uses these matches to accumulate evidence for averaged templates. The templates are determined via the graph-theoretical concept of the minimum spanning tree (MST) and via mutual information (MI). Based on the templates, the repeating elements are then extracted from the input image. Real scene examples demonstrate the ability of the proposed algorithm to capture important high-level information about the structure of a building facade, which in turn can support further processing operations, including compression, segmentation, editing, and reconstruction.
Class-imbalance learning based discriminant analysis
Xiaoyuan Jing, Chao Lan, Min Li, Yong-Fang Yao, D. Zhang, Jing-yu Yang
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166659
Feature extraction is an important research topic in pattern recognition. The class-specific idea recasts a traditional multi-class feature extraction and recognition task into several binary-class problems, which inevitably introduces a class-imbalance problem: the minority class is the specific class, and the majority class consists of all the other classes. However, the discriminative information in binary-class problems is usually limited, and imbalanced data can degrade recognition performance. To address these problems, we propose two novel approaches for learning discriminant features from imbalanced data: class-balanced discrimination (CBD) and orthogonal CBD (OCBD). For a specific class, we select a reduced counterpart class whose data are nearest to the data of the specific class, and further divide it into smaller subsets, each of the same size as the specific class, to achieve balance. Each subset is then combined with the minority class, and linear discriminant analysis (LDA) is performed on the pair to extract discriminative vectors. To further remove redundant information, we impose an orthogonality constraint on the discriminant vectors extracted among correlated classes. Experimental results on three public image databases demonstrate that the proposed approaches outperform several related image feature extraction and recognition methods.
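The balancing step described in the abstract can be illustrated with a minimal numpy sketch: the majority class is partitioned into minority-sized subsets and a binary Fisher discriminant is computed per subset. The synthetic data, the ridge regularizer, and the random-permutation split (the paper instead selects the counterpart data nearest to the specific class) are assumptions for illustration only.

```python
import numpy as np

def fisher_direction(X0, X1):
    """Binary Fisher LDA direction: w = Sw^{-1} (mu1 - mu0), unit-normalized."""
    mu0, mu1 = X0.mean(0), X1.mean(0)
    Sw = np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), mu1 - mu0)
    return w / np.linalg.norm(w)

def class_balanced_directions(X_minority, X_majority, rng=None):
    """Split the majority class into minority-sized subsets and run one
    binary LDA per balanced pair (the core balancing idea of CBD)."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(X_majority))       # assumed split; the paper
    k = len(X_minority)                          # uses nearest-data selection
    dirs = []
    for start in range(0, len(X_majority) - k + 1, k):
        subset = X_majority[idx[start:start + k]]
        dirs.append(fisher_direction(X_minority, subset))
    return np.stack(dirs)

rng = np.random.default_rng(1)
X_min = rng.normal([3.0, 0.0], 1.0, (30, 2))    # specific (minority) class
X_maj = rng.normal([0.0, 0.0], 1.0, (90, 2))    # pooled remaining classes
D = class_balanced_directions(X_min, X_maj, rng)
```

Each of the three balanced subsets yields one unit discriminant direction, so the specific class never faces a 30-versus-90 imbalance in any single LDA fit.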
Multi-view moving objects classification via transfer learning
Jianyun Liu, Yunhong Wang, Zhaoxiang Zhang, Yi Mo
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166551
Classification of moving objects in traffic-scene videos has been a hot topic in recent years: classifying moving traffic objects into pedestrians, motor vehicles, non-motor vehicles, etc. is of significant value to intelligent traffic systems. Traditional machine learning approaches assume that source-scene objects and target-scene objects share the same distribution, which does not hold in most cases. Under this circumstance, a large amount of manual labeling of target-scene data is needed, which is time- and labor-consuming. In this paper, we introduce TrAdaBoost, a transfer learning algorithm, to bridge the gap between the source and target scenes. During training, TrAdaBoost makes full use of the source-scene data most similar to the target-scene data, so that only a small number of labeled target-scene samples is needed to improve performance significantly. The features used for classification are Histogram of Oriented Gradients (HOG) features of appearance-based instances. Experimental results show that the transfer learning method clearly outperforms a traditional machine learning algorithm.
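TrAdaBoost's core mechanism, down-weighting source samples the current learner gets wrong while up-weighting misclassified target samples, can be sketched on synthetic 1-D data. The decision-stump weak learner, the toy data with a shifted decision boundary, and all schedule constants are illustrative assumptions; the paper's HOG features and traffic scenes are not reproduced.

```python
import numpy as np

def stump(X, y, w):
    """Best weighted threshold stump on 1-D data; returns a predict function."""
    best_err, best_thr, best_pol = np.inf, 0.0, 1
    for thr in np.unique(X):
        for pol in (1, 0):
            pred = np.where(X >= thr, pol, 1 - pol)
            err = w[pred != y].sum()
            if err < best_err:
                best_err, best_thr, best_pol = err, thr, pol
    return lambda Z, t=best_thr, p=best_pol: np.where(Z >= t, p, 1 - p)

def tradaboost(Xs, ys, Xt, yt, T=10):
    """TrAdaBoost loop: source weights shrink on error, target weights grow."""
    n = len(Xs)
    X, y = np.concatenate([Xs, Xt]), np.concatenate([ys, yt])
    w = np.ones(len(X))
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / T))
    hs, betas = [], []
    for _ in range(T):
        p = w / w.sum()
        h = stump(X, y, p)
        err_t = p[n:][h(Xt) != yt].sum() / p[n:].sum()   # error on target only
        err_t = min(max(err_t, 1e-10), 0.49)
        beta_t = err_t / (1.0 - err_t)
        w[:n] *= beta_src ** (h(Xs) != ys)               # shrink bad source weights
        w[n:] *= beta_t ** -((h(Xt) != yt).astype(float))  # grow bad target weights
        hs.append(h)
        betas.append(beta_t)
    def predict(Z):
        half = len(hs) // 2          # weighted vote of the later-half learners
        score = sum(-np.log(b) * hh(Z) for hh, b in zip(hs[half:], betas[half:]))
        norm = sum(-np.log(b) for b in betas[half:])
        return (score >= 0.5 * norm).astype(int)
    return predict

rng = np.random.default_rng(0)
Xs = rng.uniform(0, 1, 200); ys = (Xs >= 0.4).astype(int)   # source boundary 0.4
Xt = rng.uniform(0, 1, 20);  yt = (Xt >= 0.5).astype(int)   # target boundary 0.5
clf = tradaboost(Xs, ys, Xt, yt)
Xq = rng.uniform(0, 1, 200)
acc = (clf(Xq) == (Xq >= 0.5)).mean()
```

With only 20 labeled target samples, the weight updates steer the ensemble away from the abundant but slightly mislabeled (for the target task) source data, which is the effect the paper exploits across camera views.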
Target-oriented shape modeling with structure constraint for image segmentation
Wuxia Zhang, Yuan Yuan, Xuelong Li, Pingkun Yan
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166707
Image segmentation plays a critical role in medical imaging applications, but it remains a challenging problem due to the complex shapes and complicated textures of structures in medical images. Model-based methods have been widely used for medical image segmentation because a priori knowledge can be incorporated. Accurate shape prior estimation is one of the major factors affecting the accuracy of model-based segmentation methods. This paper proposes a novel statistical shape modeling method that estimates a target-oriented shape prior by applying a constraint derived from the intrinsic structure of the training shape set. The proposed shape modeling method is incorporated into a deformable-model-based framework for image segmentation. Experimental results show that the proposed method achieves more accurate segmentation than other existing methods.
Learning from error: A two-level combined model for image classification
Mingyang Jiang, Chunxiao Li, Zirui Deng, Jufu Feng, Liwei Wang
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166669
We propose an error learning model for image classification. Motivated by the observation that classifiers trained on local grid regions of images are often biased, i.e., they make many classification errors, we present a two-level combined model that learns useful classification information from these errors based on Bayes' rule. We give a theoretical analysis and explanation showing that this error learning model effectively corrects the classification errors made by the local region classifiers. We conduct extensive experiments on benchmark image classification datasets and obtain promising results.
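One standard way to correct biased local classifiers with Bayes' rule is to estimate each region classifier's confusion matrix on held-out data and combine the region predictions as a naive-Bayes product. The sketch below is an assumed instantiation of that idea, not the authors' exact two-level model; all matrices and labels are synthetic.

```python
import numpy as np

def confusion(pred, true, n_classes):
    """Column-normalized confusion matrix: M[p, c] estimates P(pred = p | true = c)."""
    M = np.zeros((n_classes, n_classes))
    for p, c in zip(pred, true):
        M[p, c] += 1
    M += 1.0                      # Laplace smoothing so no probability is zero
    return M / M.sum(0)

def bayes_combine(region_preds, conf_mats, prior):
    """Posterior(c) proportional to prior(c) * prod_r P(pred_r | c)."""
    log_post = np.log(prior)
    for p, M in zip(region_preds, conf_mats):
        log_post += np.log(M[p])
    return int(np.argmax(log_post))

prior = np.array([0.5, 0.5])
M_good = np.array([[0.9, 0.1], [0.1, 0.9]])          # a reliable region classifier
M_biased = confusion([0, 0, 0, 0], [0, 0, 1, 1], 2)  # a region that always says 0
# The reliable region votes class 1, the biased one votes 0: Bayes' rule
# discounts the biased vote because its prediction carries almost no evidence.
label = bayes_combine([1, 0], [M_good, M_biased], prior)
```

Because the biased region predicts 0 regardless of the true class, its likelihood columns are nearly equal and its vote cancels out of the posterior, which is exactly the sense in which the errors themselves become usable information.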
An efficient self-learning people counting system
Jingwen Li, Lei Huang, Chang-ping Liu
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166686
People counting is a challenging task that has attracted much attention in video surveillance. In this paper, we present an efficient self-learning people counting system that counts the exact number of people in a region of interest. Based on a bag-of-features model, the system can effectively detect pedestrians, including those who are usually treated as background because they are static or moving slowly. The system can also select pedestrian and non-pedestrian samples automatically and update the classifier in real time to make it better suited to a specific scene. Experimental results on a practical public dataset, the CASIA Pedestrian Counting Dataset, show that the proposed people counting system is robust and accurate.
Elliptical symmetric distribution based maximal margin classification for hyperspectral imagery
Lin He, Z. Yu, Z. Gu, Yuanqing Li
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166571
It has been verified that hyperspectral data are statistically characterized by elliptically symmetric distributions. Accordingly, we introduce ellipsoidal discriminant boundaries and present an elliptical symmetric distribution based maximal margin (ESD-MM) classifier for hyperspectral classification. In this method, the elliptical symmetric distribution (ESD) characteristic of hyperspectral data is combined with the maximal margin rule. This strategy enables the ESD-MM classifier to achieve good performance, especially when it follows dimensionality reduction. Experimental results on real Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data demonstrate that the ESD-MM classifier outperforms the commonly used Bayes classifier, Fisher linear discriminant (FLD), and linear support vector machine (SVM).
Interclass visual similarity based visual vocabulary learning
Guangming Chang, Chunfen Yuan, Weiming Hu
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166597
Visual vocabularies are now widely used in many video analysis tasks, such as event detection, video retrieval, and video classification. In most approaches the vocabularies are based solely on statistics of visual features and generated by clustering; little attention has been paid to the interclass similarity among different events or actions. In this paper, we present a novel approach that statistically mines interclass visual similarity and then uses it to supervise the generation of the visual vocabulary. We construct a measure of interclass similarity, embed it into the Euclidean distance, and use the refined distance to generate the visual vocabulary iteratively. Experiments on the Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary-based approach by about 5%.
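Iterative vocabulary generation under a refined distance can be sketched as k-means with a metric matrix A in the assignment step. The paper's similarity-derived refinement of A is not reconstructed here, so the example runs with the identity metric on synthetic features; A, the data, and the initialization are all assumptions.

```python
import numpy as np

def kmeans_metric(X, k, A, iters=20, rng=None):
    """k-means whose assignment step uses d(x, c)^2 = (x - c)^T A (x - c);
    A = I recovers ordinary Euclidean k-means vocabulary generation."""
    rng = rng or np.random.default_rng(0)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        diff = X[:, None, :] - centers[None, :, :]       # (n, k, d)
        d2 = np.einsum('nkd,de,nke->nk', diff, A, diff)  # metric distances
        labels = d2.argmin(1)
        for j in range(k):                               # center (visual word) update
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return centers, labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),   # two synthetic feature clusters
               rng.normal(3.0, 0.3, (50, 2))])
A = np.eye(2)   # identity metric; the paper would refine A from interclass similarity
centers, labels = kmeans_metric(X, 2, A)
```

Swapping in a non-identity A learned from interclass similarity changes which features fall into the same visual word without altering the clustering loop itself, which is why the refinement can be applied iteratively.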
Saliency based natural image understanding
Qingshan Li, Yue Zhou, Lei Xu
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166648
This paper presents a novel method for natural image understanding. We first improve saliency detection for the purpose of image segmentation. Graph cuts are then used to find a globally optimal segmentation of the N-dimensional image. After that, we adopt a supervised learning scheme to classify the scene type of the image. The main advantages of our method are: first, we revise an existing sparse saliency model to better suit image segmentation; second, we propose a new color modeling method for the GrabCut segmentation process; finally, we combine object-level top-down information with low-level image cues to distinguish image types. Experiments show that our scheme obtains performance comparable to other approaches.