Interclass visual similarity based visual vocabulary learning
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166597
Guangming Chang, Chunfen Yuan, Weiming Hu
Visual vocabularies are now widely used in many video analysis tasks, such as event detection, video retrieval, and video classification. In most approaches, the vocabularies are based solely on statistics of visual features and are generated by clustering; little attention has been paid to the interclass similarity among different events or actions. In this paper, we present a novel approach that statistically mines interclass visual similarity and then uses it to supervise the generation of the visual vocabulary. We construct a measure of interclass similarity, embed it into the Euclidean distance, and use the refined distance to generate the visual vocabulary iteratively. Experiments on the Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary-based approach by about 5%.
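The abstract does not give the refinement formula, so the following is a minimal sketch of one plausible instantiation rather than the authors' method: k-means alternates with a sample re-weighting derived from how strongly each visual word is shared across classes. The function name `iterative_vocabulary`, the sharing measure, and all parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def iterative_vocabulary(features, labels, n_words=200, n_iter=3):
    """Hedged sketch: alternate k-means with a sample re-weighting
    derived from interclass similarity. Features landing in words that
    many classes share contribute less to the next clustering round,
    so the vocabulary drifts toward class-discriminative words."""
    classes = np.unique(labels)
    sample_w = np.ones(len(features))
    for _ in range(n_iter):
        km = KMeans(n_clusters=n_words, n_init=4, random_state=0)
        words = km.fit_predict(features, sample_weight=sample_w)
        # Normalised per-class word histograms.
        hists = np.array([np.bincount(words[labels == c], minlength=n_words)
                          for c in classes], dtype=float)
        hists /= hists.sum(axis=1, keepdims=True) + 1e-12
        # Interclass similarity of each word: how evenly its mass is
        # spread across classes (1 = shared by all, ~0 = class-specific).
        share = hists.min(axis=0) / (hists.max(axis=0) + 1e-12)
        sample_w = 1.0 - share[words]   # emphasise discriminative words
    return km
```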
{"title":"Interclass visual similarity based visual vocabulary learning","authors":"Guangming Chang, Chunfen Yuan, Weiming Hu","doi":"10.1109/ACPR.2011.6166597","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166597","url":null,"abstract":"Visual vocabulary is now widely used in many video analysis tasks, such as event detection, video retrieval and video classification. In most approaches the vocabularies are solely based on statistics of visual features and generated by clustering. Little attention has been paid to the interclass similarity among different events or actions. In this paper, we present a novel approach to mine the interclass visual similarity statistically and then use it to supervise the generation of visual vocabulary. We construct a measurement of interclass similarity, embed the similarity to the Euclidean distance and use the refined distance to generate visual vocabulary iteratively. The experiments in Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary based approach by about 5%.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116086711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elliptical symmetric distribution based maximal margin classification for hyperspectral imagery
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166571
Lin He, Z. Yu, Z. Gu, Yuanqing Li
It has been verified that hyperspectral data are statistically characterized by an elliptically symmetric distribution. Accordingly, we introduce ellipsoidal discriminant boundaries and present an elliptical symmetric distribution based maximal margin (ESD-MM) classifier for hyperspectral classification. In this method, the elliptically symmetric distribution (ESD) of hyperspectral data is combined with the maximal-margin rule. This strategy enables the ESD-MM classifier to achieve good performance, especially after dimensionality reduction. Experimental results on real Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data demonstrate that the ESD-MM classifier outperforms the commonly used Bayes classifier, Fisher linear discriminant (FLD), and linear support vector machine (SVM).
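As a rough stand-in (not the authors' formulation, which the abstract does not detail), a maximal-margin classifier over a degree-2 feature expansion yields quadric, i.e. ellipsoidal, decision boundaries in the original space:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

def esd_mm_sketch():
    """Hedged stand-in for ESD-MM: a max-margin linear classifier over
    a degree-2 feature map, so the decision boundary is a quadric
    (ellipsoid) in the original coordinates."""
    return make_pipeline(StandardScaler(),
                         PolynomialFeatures(degree=2, include_bias=False),
                         LinearSVC(C=1.0))

# Usage: model = esd_mm_sketch().fit(X_train, y_train)
```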
{"title":"Elliptical symmetric distribution based maximal margin classification for hyperspectral imagery","authors":"Lin He, Z. Yu, Z. Gu, Yuanqing Li","doi":"10.1109/ACPR.2011.6166571","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166571","url":null,"abstract":"It has been verified that hyperspectral data is statistically characterized by elliptical symmetric distribution. Accordingly, we introduce the ellipsoidal discriminant boundaries and present an elliptical symmetric distribution based maximal margin (ESD-MM) classifier for hypespectral classification. In this method, the characteristic of elliptical symmetric distribution (ESD) of hyperspectral data is combined with the maximal margin rule. This strategy enables the ESD-MM classifier to achieve good performance, especially when follows dimensionality reduction. Experimental results on real Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) data demonstrated that ESD-MM classifier has better performance than commonly used Bayes classifier, Fisher linear discriminant (FLD) and linear support vector machine (SVM).","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116076639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards exaggerated image stereotypes
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166569
Cheng Chen, F. Lauze, C. Igel, Aasa Feragen, M. Loog, M. Nielsen
Given a training set of images and a binary classifier, we introduce the notion of an exaggerated image stereotype for an image class of interest, which emphasizes the characteristic patterns in an image and visualizes which visual information the classification relies on. This is useful for gaining insight into the classification mechanism. An exaggerated image stereotype realizes a trade-off between classification accuracy and the likelihood of being generated from the class of interest. It is obtained by optimizing an objective function consisting of a discriminative term based on the classification result and a generative term based on the assumed class distribution. We instantiate this idea with Fisher's linear discriminant rule and assume a multivariate normal distribution for the samples within a class. The proposed framework is first applied to handwritten digit data, illustrating the specific features that differentiate digits; it is then applied to a face dataset using an Active Appearance Model (AAM), where male face stereotypes are evolved from initial female faces.
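The stated objective suggests gradient ascent on a weighted sum of an FLD score and a Gaussian log-likelihood. The sketch below assumes that form; the trade-off weight `lam`, step size, and iteration count are illustrative, not from the paper.

```python
import numpy as np

def exaggerate(x0, w, mu, cov_inv, lam=0.1, step=0.05, n_steps=200):
    """Minimal sketch of the assumed objective:
        maximize  J(x) = w.x + lam * log N(x; mu, Sigma),
    i.e. push the FLD score up (w points toward the class of interest)
    while staying likely under the class Gaussian.
    Uses  grad log N(x; mu, Sigma) = Sigma^{-1} (mu - x)."""
    x = x0.astype(float)
    for _ in range(n_steps):
        grad = w + lam * cov_inv @ (mu - x)
        x += step * grad
    return x
```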
{"title":"Towards exaggerated image stereotypes","authors":"Cheng Chen, F. Lauze, C. Igel, Aasa Feragen, M. Loog, M. Nielsen","doi":"10.1109/ACPR.2011.6166569","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166569","url":null,"abstract":"Given a training set of images and a binary classifier, we introduce the notion of an exaggerated image stereotype for some image class of interest, which emphasizes/exaggerates the characteristic patterns in an image and visualizes which visual information the classification relies on. This is useful for gaining insight into the classification mechanism. The exaggerated image stereotypes results in a proper trade-off between classification accuracy and likelihood of being generated from the class of interest. This is done by optimizing an objective function which consists of a discriminative term based on the classification result, and a generative term based on the assumption of the class distribution. We use this idea with Fisher's Linear Discriminant rule, and assume a multivariate normal distribution for samples within a class. The proposed framework has been applied on handwritten digit data, illustrating specific features differentiating digits. Then it is applied to a face dataset using Active Appearance Model (AAM), where male faces stereotypes are evolved from initial female faces.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122837763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
What is happening in a still picture?
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166555
Piji Li, Jun Ma
We consider the problem of automatically generating concise sentences to describe still pictures. We treat objects in images (nouns in sentences) as hidden information of actions (verbs); the sentence-generation problem can therefore be transformed into action detection and scene classification problems. We employ Latent Multiple Kernel Learning (L-MKL) to learn the action detectors from "Exemplarlets", and utilize MKL to learn the scene classifiers. The image features employed include the distribution of edges, dense visual words, and feature descriptors at different levels of a spatial pyramid. For a new image, we detect the action using a sliding-window detector learnt via L-MKL, predict the scene the action happens in, and build ⟨action, scene⟩ tuples. Finally, these tuples are translated into concise sentences according to a predefined grammar template. We show both the classification and sentence-generation results on our newly collected dataset of six actions and demonstrate improved performance over existing methods.
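The grammar template itself is not published in the abstract; the toy function below merely illustrates the final template step that turns a ⟨action, scene⟩ tuple into a sentence (detector and classifier outputs are assumed as inputs).

```python
def tuple_to_sentence(action: str, scene: str) -> str:
    """Toy template step: render an <action, scene> tuple as a
    concise sentence. The paper's actual grammar template is unknown."""
    article = "an" if scene[0] in "aeiou" else "a"
    return f"A person is {action} in {article} {scene}."

print(tuple_to_sentence("riding a bike", "park"))
# -> "A person is riding a bike in a park."
```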
{"title":"What is happening in a still picture?","authors":"Piji Li, Jun Ma","doi":"10.1109/ACPR.2011.6166555","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166555","url":null,"abstract":"We consider the problem of generating concise sentences to describe still pictures automatically. We treat objects in images (nouns in sentences) as hidden information of actions (verbs). Therefore, the sentence generation problem can be transformed into action detection and scene classification problems. We employ Latent Multiple Kernel Learning (L-MKL) to learn the action detectors from “Exemplarlets”, and utilize MKL to learn the scene classifiers. The image features employed include distribution of edges, dense visual words and feature descriptors at different levels of spatial pyramid. For a new image we can detect the action using a sliding-window detector learnt via L-MKL, predict the scene the action happened in and build haction, scenei tuples. Finally, these tuples will be translated into concise sentences according to previously defined grammar template. We show both the classification and sentence generating results on our newly collected dataset of six actions as well as demonstrate improved performance over existing methods.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127903171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Saliency based natural image understanding
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166648
Qingshan Li, Yue Zhou, Lei Xu
This paper presents a novel method for natural image understanding. We first improve saliency detection for the purpose of image segmentation. Graph cuts are then used to find a globally optimal segmentation of the N-dimensional image. After that, we adopt a supervised learning scheme to classify the scene type of the image. The main advantages of our method are as follows: first, we revise the existing sparse saliency model to better suit image segmentation; second, we propose a new color modeling method for the GrabCut segmentation process; finally, we combine object-level top-down information with low-level image cues to distinguish image types. Experiments show that the proposed scheme obtains performance comparable to other approaches.
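A plausible version of the saliency-seeded GrabCut step, using OpenCV's standard `cv2.grabCut` API; the seeding thresholds below are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

def saliency_grabcut(img_bgr, saliency):
    """Hedged pipeline sketch: seed GrabCut from a saliency map.
    `img_bgr` is an 8-bit 3-channel image; `saliency` is assumed to be
    float in [0, 1] with the same HxW shape. Returns a binary mask."""
    mask = np.full(saliency.shape, cv2.GC_PR_BGD, np.uint8)
    mask[saliency > 0.5] = cv2.GC_PR_FGD   # probably foreground
    mask[saliency > 0.8] = cv2.GC_FGD      # confident foreground
    mask[saliency < 0.1] = cv2.GC_BGD      # confident background
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(img_bgr, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)
```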
{"title":"Saliency based natural image understanding","authors":"Qingshan Li, Yue Zhou, Lei Xu","doi":"10.1109/ACPR.2011.6166648","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166648","url":null,"abstract":"This paper presents a novel method for natural image understanding. We improved the effect of saliency detection for the purpose of image segmentation at first. Then Graph cuts are used to find global optimal segmentation of N-dimensional image. After that, we adopt the scheme of supervised learning to classify the scene type of the image. The main advantages of our method are that: Firstly we revised the existed sparse saliency model to better suit for image segmentation, Secondly we propose a new color modeling method during the process of GrabCut segmentation. Finally we extract object-level top down information and low-level image cues together to distinguish the type of images. Experiments show that our proposed scheme can obtain comparable performance to other approaches.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"256 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122502641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An improvment of weight scheme on adaBoost in the presence of noisy data
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166557
Shihai Wang, Geng Li
The first strand of this research is concerned with the classification noise issue. Classification noise (wrong labeling) is a consequence of the difficulty of accurately labeling real training data. To efficiently reduce the negative influence of noisy samples, we propose a new weighting scheme for the boosting algorithm, based on a nonlinear model with a local proximity assumption. The effectiveness of our method has been evaluated on a set of University of California Irvine Machine Learning Repository (UCI) [1] benchmarks, and we report promising results.
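The abstract does not spell out the nonlinear weighting model, so the sketch below is a hedged guess at the idea, not the authors' scheme: standard discrete AdaBoost, except that the up-weighting of a misclassified sample is damped when its nearest neighbours mostly disagree with its label — the local-proximity signal of probable label noise.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

def noise_damped_adaboost(X, y, n_rounds=50, k=5):
    """Hedged sketch: discrete AdaBoost on stumps, with the exponential
    up-weight of a misclassified sample scaled by the fraction of its
    k nearest neighbours that share its label. Suspected-noisy points
    therefore cannot accumulate extreme weights. X, y are NumPy arrays;
    final prediction is the usual alpha-weighted vote of the stumps."""
    n = len(y)
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    idx = nbrs.kneighbors(X, return_distance=False)[:, 1:]  # drop self
    agree = (y[idx] == y[:, None]).mean(axis=1)             # in [0, 1]
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err <= 0 or err >= 0.5:
            break
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((alpha, stump))
        up = np.where(pred != y,
                      np.exp(alpha * agree),  # damped up-weight on mistakes
                      np.exp(-alpha))         # standard down-weight
        w *= up
        w /= w.sum()
    return ensemble
```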
{"title":"An improvment of weight scheme on adaBoost in the presence of noisy data","authors":"Shihai Wang, Geng Li","doi":"10.1109/ACPR.2011.6166557","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166557","url":null,"abstract":"The first strand of this research is concerned with the classification noise issue. Classification noise, (worry labeling), is a further consequence of the difficulties in accurately labeling the real training data. For efficient reduction of the negative influence produced by noisy samples, we propose a new weight scheme with a nonlinear model with the local proximity assumption for the Boosting algorithm. The effectiveness of our method has been evaluated by using a set of University of California Irvine Machine Learning Repository (UCI) [1] benchmarks. We report promising results.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126556984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fusion of features and classifiers for off-line handwritten signature verification
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166701
Juan Hu, Youbin Chen
A method for writer-independent off-line handwritten signature verification based on grey-level feature extraction and the Real AdaBoost algorithm is proposed. First, global and local features are used simultaneously: texture information such as the grey-level co-occurrence matrix and local binary patterns is analyzed and used as features. Second, Support Vector Machines (SVMs) and the squared Mahalanobis distance classifier are introduced. Finally, the Real AdaBoost algorithm is applied to fuse them. Experiments on the public signature database GPDS Corpus show that the proposed method achieves an FRR of 5.64% and an FAR of 5.37%, the best results published so far.
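For illustration, the two texture features named above can be computed with scikit-image; the distances, angles, and LBP parameters here are assumptions, not the paper's settings.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
# (older scikit-image releases spell these greycomatrix/greycoprops)

def texture_features(gray_img):
    """Illustrative extraction of GLCM statistics plus a uniform-LBP
    histogram from a uint8 grayscale signature image."""
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    glcm_feats = [graycoprops(glcm, p).ravel()
                  for p in ("contrast", "correlation",
                            "energy", "homogeneity")]
    lbp = local_binary_pattern(gray_img, P=8, R=1, method="uniform")
    # "uniform" LBP with P=8 yields 10 distinct codes (0..9).
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate(glcm_feats + [lbp_hist])
```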
{"title":"Fusion of features and classifiers for off-line handwritten signature verification","authors":"Juan Hu, Youbin Chen","doi":"10.1109/ACPR.2011.6166701","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166701","url":null,"abstract":"A method for writer-independent off-line handwritten signature verification based on grey level feature extraction and Real Adaboost algorithm is proposed. Firstly, both global and local features are used simultaneously. The texture information such as co-occurrence matrix and local binary pattern are analyzed and used as features. Secondly, Support Vector Machines (SVMs) and the squared Mahalanobis distance classifier are introduced. Finally, Real Adaboost algorithm is applied. Experiments on the public signature database GPDS Corpus show that our proposed method has achieved the FRR 5.64% and the FAR 5.37% which are the best so far compared with other published results.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126536137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A local learning based Image-To-Class distance for image classification
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166577
Xinyuan Cai, Baihua Xiao, Chunheng Wang, Rongguo Zhang
The Image-to-Class distance was first proposed in the Naive-Bayes Nearest-Neighbor (NBNN) classifier. NBNN is a feature-based image classifier that can achieve impressive classification accuracy; however, its performance relies heavily on a large number of training samples and degrades when only a few are available. The goal of this paper is to address this issue. Our main contribution is a robust Image-to-Class distance based on local learning. We define the patch-to-class distance as the distance between an input patch and its nearest neighbor in one class, reconstructed in the local manifold space; the image-to-class distance is then the sum of the patch-to-class distances. Furthermore, we take advantage of the large-margin metric learning framework to obtain a proper Mahalanobis metric for each class. We evaluate the proposed method on four benchmark datasets: Caltech, Corel, Scene13, and Graz. The results show that our Image-to-Class distance is more robust than NBNN and Optimal-NBNN, and that, combined with the learned per-class metric, our method achieves significant improvements over previously reported results on these datasets.
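For context, the baseline NBNN Image-to-Class distance that the paper builds on can be sketched as follows; the local-manifold reconstruction and the learned per-class Mahalanobis metric are omitted here.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def image_to_class_distance(query_patches, class_patches):
    """Baseline NBNN I2C distance: sum, over the query image's patch
    descriptors, of the squared distance to the nearest patch of the
    candidate class."""
    nn = NearestNeighbors(n_neighbors=1).fit(class_patches)
    dists, _ = nn.kneighbors(query_patches)
    return float((dists ** 2).sum())

def classify(query_patches, patches_by_class):
    """Assign the image to the class with the smallest I2C distance.
    `patches_by_class` maps class label -> array of patch descriptors."""
    return min(patches_by_class,
               key=lambda c: image_to_class_distance(query_patches,
                                                     patches_by_class[c]))
```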
{"title":"A local learning based Image-To-Class distance for image classification","authors":"Xinyuan Cai, Baihua Xiao, Chunheng Wang, Rongguo Zhang","doi":"10.1109/ACPR.2011.6166577","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166577","url":null,"abstract":"Image-To-Class distance is first proposed in Naive-Bayes Nearest-Neighbor. NBNN is a feature-based image classifier, and can achieve impressive classification accuracy. However, the performance of NBNN relies heavily on the large number of training samples. If using small number of training samples, the performance will degrade. The goal of this paper is to address this issue. The main contribution of this paper is that we propose a robust Image-to-Class distance by local learning. We define the patch-to-class distance as the distance between the input patch to its nearest neighbor in one class, which is reconstructed in the local manifold space; and then our image-to-class distance is the sum of patch-to-class distance. Furthermore, we take advantage of large-margin metric learning framework to obtain a proper Mahalanobis metric for each class. We evaluate the proposed method on four benchmark datasets: Caltech, Corel, Scene13, and Graz. The results show that our defined Image-To-Class Distance is more robust than NBNN and Optimal-NBNN, and by combining with the learned metric for each class, our method can achieve significant improvement over previous reported results on these datasets.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124944082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
View sequence generation for view-based outdoor navigation
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166691
Y. Kaneko, J. Miura
This paper describes a method of generating a new view sequence for view-based outdoor navigation. View-based navigation approaches have been shown to be effective, but they have the drawback that a view sequence for the route to be navigated is needed beforehand. This is an issue especially for navigation in an open space where numerous potential routes exist; it is almost impossible to capture view sequences for all of the routes by actually moving along them. We therefore develop a method of generating a view sequence for an arbitrary route from an omnidirectional view sequence taken during a limited movement. The method is based on map generation from visual odometry and image-to-image morphing using homography. The effectiveness of the method is validated through view-based localization experiments.
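A minimal sketch of the homography-morphing step only (mapping and visual odometry are out of scope), assuming ORB matches and a naive linear blend between the identity and the estimated homography; the paper's exact interpolation may well differ.

```python
import cv2
import numpy as np

def synthesize_view(src_img, dst_img, alpha=0.5):
    """Hedged sketch: estimate the homography H between two captured
    grayscale views from ORB matches, then warp the source toward an
    intermediate viewpoint by blending H with the identity."""
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(src_img, None)
    k2, d2 = orb.detectAndCompute(dst_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(d1, d2)
    src_pts = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst_pts = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    H_mid = (1 - alpha) * np.eye(3) + alpha * H   # naive interpolation
    h, w = src_img.shape[:2]
    return cv2.warpPerspective(src_img, H_mid, (w, h))
```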
{"title":"View sequence generation for view-based outdoor navigation","authors":"Y. Kaneko, J. Miura","doi":"10.1109/ACPR.2011.6166691","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166691","url":null,"abstract":"This paper describes a method of generating a new view sequence for view-based outdoor navigation. View-based navigation approaches have been shown to be effective but have a drawback that a view sequence for the route to be navigated is needed beforehand. This will be an issue especially for navigation in an open space where numerous potential routes exist; it is almost impossible to take view sequences for all of the routes by actually moving on them. We therefore develop a method of generating a view sequence for arbitrary routes from an omnidirectional view sequence taken at a limited movement. The method is based on visual odometry-based map generation and an image-to-image morphing using homography. The effectiveness of the method is validated by view-based localization experiments.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129055030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Identical object segmentation through level sets with similarity constraint
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166609
Hongbin Xie, Gang Zeng, Rui Gan, H. Zha
Unsupervised identical-object segmentation remains a challenging problem in vision research due to the difficulty of obtaining high-level structural knowledge about the scene. In this paper, we present a level-set algorithm with a novel similarity constraint term for segmenting identical objects. The key component of the proposed algorithm is to embed the similarity constraint into the curve evolution, where the evolving speed is high in regions of similar appearance and low in areas with distinct content. The algorithm starts from a pair of seed matches (e.g., SIFT) and evolves small initial circles into large similar regions under the similarity constraint. The similarity constraint relies on a local alignment, under the assumption that the warp between identical objects is an affine transformation. The correct warp aligns the identical objects and promotes the growth of the similar regions. Alignment and expansion alternate until the curve reaches the boundaries of the similar objects. Experiments on real images validate the efficiency and effectiveness of the proposed algorithm.
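The energy terms are not given in the abstract, so the following shows only the generic level-set evolution step, with a placeholder `speed` map standing in for the similarity constraint — high where the two appearances match under the current affine alignment, low elsewhere.

```python
import numpy as np

def evolve_level_set(phi, speed, n_steps=100, dt=0.5):
    """Generic sketch of the evolution step only: grow the zero level
    set of phi (region = {phi < 0}) outward with a spatially varying
    speed map, solving phi_t + F |grad phi| = 0 by explicit steps."""
    for _ in range(n_steps):
        gy, gx = np.gradient(phi)
        grad_norm = np.sqrt(gx ** 2 + gy ** 2)
        phi = phi - dt * speed * grad_norm   # F > 0 moves the front outward
    return phi
```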
{"title":"Identical object segmentation through level sets with similarity constraint","authors":"Hongbin Xie, Gang Zeng, Rui Gan, H. Zha","doi":"10.1109/ACPR.2011.6166609","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166609","url":null,"abstract":"Unsupervised identical object segmentation remains a challenging problem in vision research due to the difficulties in obtaining high-level structural knowledge about the scene. In this paper, we present an algorithm based on level set with a novel similarity constraint term for identical objects segmentation. The key component of the proposed algorithm is to embed the similarity constraint into curve evolution, where the evolving speed is high in regions of similar appearance and becomes low in areas with distinct contents. The algorithm starts with a pair of seed matches (e.g. SIFT) and evolve the small initial circle to form large similar regions under the similarity constraint. The similarity constraint is related to local alignment with assumption that the warp between identical objects is affine transformation. The right warp aligns the identical objects and promotes the similar regions growth. The alignment and expansion alternate until the curve reaches the boundaries of similar objects. Real experiments validates the efficiency and effectiveness of the proposed algorithm.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"88 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123523327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}