Spatio-Temporal Interest Points Chain (STIPC) for activity recognition
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166581
Fei Yuan, Gui-Song Xia, H. Sahbi, V. Prinet
We present a novel feature, named Spatio-Temporal Interest Points Chain (STIPC), for activity representation and recognition. This feature consists of a set of trackable spatio-temporal interest points that correspond to a series of discontinuous motions within the long-term motion of an object or its parts. With this chain feature, we not only capture the discriminative motion information that space-time interest point-like features pursue, but also build the connections between them. Specifically, we first extract point trajectories from the image sequences, then partition the points on each trajectory into two different yet closely related kinds: discontinuous motion points and continuous motion points. We extract local space-time features around the discontinuous motion points and use a chain model to represent them. Furthermore, we introduce a chain descriptor to encode the temporal relationships between these interdependent local space-time features. Experimental results on challenging datasets show that our STIPC features improve on local space-time features and achieve state-of-the-art results.
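A minimal sketch of the trajectory-partitioning step described above: points along a tracked trajectory are labeled "discontinuous" when the local motion changes abruptly. The velocity-turn heuristic and both thresholds are assumptions for illustration, not the authors' actual criterion.

```python
import numpy as np

def partition_trajectory(points, angle_thresh=np.pi / 4, speed_ratio_thresh=2.0):
    """points: (T, 2) array of (x, y) positions along one trajectory.
    Returns a boolean mask marking discontinuous-motion points."""
    v = np.diff(points, axis=0)                      # frame-to-frame displacement
    speed = np.linalg.norm(v, axis=1) + 1e-8
    mask = np.zeros(len(points), dtype=bool)
    for t in range(1, len(v)):
        cos_a = np.dot(v[t], v[t - 1]) / (speed[t] * speed[t - 1])
        turn = np.arccos(np.clip(cos_a, -1.0, 1.0))  # direction change between steps
        ratio = max(speed[t], speed[t - 1]) / min(speed[t], speed[t - 1])
        if turn > angle_thresh or ratio > speed_ratio_thresh:
            mask[t] = True                           # abrupt motion change at this point
    return mask
```

Local space-time descriptors would then be extracted only around the points where the mask is true, and chained in temporal order.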
{"title":"Spatio-Temporal Interest Points Chain (STIPC) for activity recognition","authors":"Fei Yuan, Gui-Song Xia, H. Sahbi, V. Prinet","doi":"10.1109/ACPR.2011.6166581","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166581","url":null,"abstract":"We present a novel feature, named Spatio-Temporal Interest Points Chain (STIPC), for activity representation and recognition. This new feature consists of a set of trackable spatio-temporal interest points, which correspond to a series of discontinuous motion among a long-term motion of an object or its part. By this chain feature, we can not only capture the discriminative motion information which space-time interest point-like feature try to pursue, but also build the connection between them. Specifically, we first extract the point trajectories from the image sequences, then partition the points on each trajectory into two kinds of different yet close related points: discontinuous motion points and continuous motion points. We extract local space-time features around discontinuous motion points and use a chain model to represent them. Furthermore, we introduce a chain descriptor to encode the temporal relationships between these interdependent local space-time features. The experimental results on challenging datasets show that our STIPC features improves local space-time features and achieve state-of-the-art results.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123790948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling spectral smoothness principle for monaural voiced speech separation
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166549
Wei Jiang, Wenju Liu, Pengfei Hu
The smoothness of the spectral envelope is a well-known attribute of clean speech. In this study, this principle is modeled through the oscillation degree of each time-frequency (T-F) unit and then incorporated into a computational auditory scene analysis (CASA) system for monaural voiced speech separation. Specifically, the oscillation degrees of the autocorrelation function (ODACF) and of the envelope autocorrelation function (ODEACF) are extracted for each T-F unit and then utilized in T-F unit labeling. Experimental results indicate that target units and interference units are distinguished more effectively by incorporating the spectral smoothness principle than by using the harmonic principle alone, and clear segregation improvements are obtained.
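A hedged sketch of one plausible way to score how oscillatory a T-F unit's autocorrelation function is; the paper's exact ODACF/ODEACF definitions are not reproduced here, and the extrema-counting measure below is an assumption used purely to illustrate the idea.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation of a 1-D T-F unit signal."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]
    return acf / (acf[0] + 1e-12)

def oscillation_degree(x):
    """Crude oscillation score: density of local extrema in the ACF."""
    acf = autocorrelation(x)
    d = np.diff(acf)
    sign_changes = np.sum(np.diff(np.sign(d)) != 0)  # number of local extrema
    return sign_changes / len(acf)
```

Units whose (envelope) autocorrelation oscillates strongly would then be treated differently during T-F unit labeling.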
{"title":"Modeling spectral smoothness principle for monaural voiced speech separation","authors":"Wei Jiang, Wenju Liu, Pengfei Hu","doi":"10.1109/ACPR.2011.6166549","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166549","url":null,"abstract":"The smoothness of spectral envelope is a commonly known attribute of clean speech. In this study, this principle is modeled through oscillation degree of each time-frequency (T-F) unit, and then incorporated into a computational auditory scene analysis (CASA) system for monaural voiced speech separation. Specifically, oscillation degrees of autocorrelation function (ODACF) and of envelope autocorrelation function (ODEACF) are extracted for each T-F unit, which are then utilized in T-F unit labeling. Experiment results indicate that target units and interference units are distinguished more effectively by incorporating the spectral smoothness principle than by using the harmonic principle alone, and obvious segregation improvements are obtained.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122940358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sparse bilinear preserving projections
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166647
Zhihui Lai, Qingcai Chen, Zhong Jin
Linear dimensionality reduction techniques have attracted wide attention in computer vision and pattern recognition. In this paper, we propose a novel framework called Sparse Bilinear Preserving Projections (SBPP) for image feature extraction, generalizing image-based bilinear preserving projections to the sparse case. Unlike popular bilinear projection techniques, the projections of SBPP are sparse, i.e. most elements in the projections are zeros. In the proposed framework, we first use a local neighborhood graph to model the manifold structure of the data set, and then combine spectral analysis with L1-norm regression via the Elastic Net to iteratively learn the sparse bilinear projections, which optimally preserve the local geometric structure of the image manifold. Experiments on several databases show that SBPP is competitive with some state-of-the-art techniques.
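A minimal sketch of the spectral-analysis-plus-Elastic-Net step in the spirit of SBPP, but for a single one-sided projection; the bilinear, iterative two-sided update of the paper is not reproduced, and the graph-construction parameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.manifold import SpectralEmbedding

def sparse_projections(X, n_components=10, alpha=0.01, l1_ratio=0.5):
    """X: (n_samples, n_features) vectorized images.
    Returns a (n_features, n_components) sparse projection matrix."""
    # Spectral analysis of a local neighborhood graph gives a target embedding.
    Y = SpectralEmbedding(n_components=n_components, n_neighbors=10).fit_transform(X)
    # Regress each embedding coordinate on X with an Elastic Net penalty,
    # yielding sparse projection vectors (most coefficients are exactly zero).
    W = np.zeros((X.shape[1], n_components))
    for k in range(n_components):
        W[:, k] = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                             max_iter=5000).fit(X, Y[:, k]).coef_
    return W
```

In the bilinear setting, analogous regressions would be alternated for the left and right projection matrices until convergence.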
{"title":"Sparse bilinear preserving projections","authors":"Zhihui Lai, Qingcai Chen, Zhong Jin","doi":"10.1109/ACPR.2011.6166647","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166647","url":null,"abstract":"The techniques of linear dimensionality reduction have been attracted widely attention in the fields of computer vision and pattern recognition. In this paper, we propose a novel framework called Sparse Bilinear Preserving Projections (SBPP) for image feature extraction. We generalized the image-based bilinear preserving projections into sparse case for feature extraction. Different from the popular bilinear linear projection techniques, the projections of SBPP are sparse, i.e. most elements in the projections are zeros. In the proposed framework, we use the local neighborhood graph to model the manifold structure of the data set at first, and then spectral analysis and L1-norm regression by using the Elastic Net are combined together to iteratively learn the sparse bilinear projections, which optimal preserve the local geometric structure of the image manifold. Experiments on some databases show that SBPP is competitive to some state-of-the-art techniques.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127670946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matrix Exponential LPP for face recognition
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166706
Sujing Wang, Chengcheng Jia, Huiling Chen, Bo Wu, Chunguang Zhou
Face recognition plays an important role in computer vision. Recent research shows that high-dimensional face images lie on or close to a low-dimensional manifold. Locality Preserving Projections (LPP) is a widely used manifold-based dimensionality reduction technique, but it suffers from two problems: (1) the Small Sample Size problem, and (2) sensitivity of performance to the neighborhood size k. To address these problems, this paper proposes Matrix Exponential LPP. To avoid the singular matrix, the proposed algorithm introduces the matrix exponential to obtain more valuable information for LPP. Experiments were conducted on two face databases, Yale and Georgia Tech, and the results show that the proposed algorithm performs better than LPP.
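A hedged sketch of the matrix-exponential idea applied to LPP: replace the (possibly singular) scatter matrices with their matrix exponentials, which are always full rank, then solve the same generalized eigenproblem. Graph weights, neighborhood size, and binary adjacency are assumptions; this is not the authors' exact implementation.

```python
import numpy as np
from scipy.linalg import expm, eigh
from sklearn.neighbors import kneighbors_graph

def melpp(X, n_components=20, k=5):
    """X: (n_samples, n_features). Returns an (n_features, n_components) projection."""
    W = kneighbors_graph(X, n_neighbors=k, mode="connectivity", include_self=False)
    W = 0.5 * (W + W.T).toarray()          # symmetric 0/1 adjacency of the neighborhood graph
    D = np.diag(W.sum(axis=1))
    L = D - W                              # graph Laplacian
    SL = expm(X.T @ L @ X)                 # matrix exponentials are nonsingular, so the
    SD = expm(X.T @ D @ X)                 # small-sample-size problem is avoided
    evals, evecs = eigh(SL, SD)            # generalized eigenproblem, eigenvalues ascending
    return evecs[:, :n_components]         # LPP keeps the smallest-eigenvalue directions
```

The matrix exponential of a symmetric matrix is symmetric positive definite, which is why the generalized eigenproblem is well posed even when the original scatter matrices are rank deficient.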
{"title":"Matrix Exponential LPP for face recognition","authors":"Sujing Wang, Chengcheng Jia, Huiling Chen, Bo Wu, Chunguang Zhou","doi":"10.1109/ACPR.2011.6166706","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166706","url":null,"abstract":"Face recognition plays a important role in computer vision. Recent researches show that high dimensional face images lie on or close to a low dimensional manifold. LPP is a widely used manifold reduced dimensionality technique. But it suffers two problem: (1) Small Sample Size problem; (2)the performance is sensitive to the neighborhood size k. In order to address the problems, this paper proposed a Matrix Exponential LPP. To void the singular matrix, the proposed algorithm introduced the matrix exponential to obtain more valuable information for LPP. The experiments were conducted on two face database, Yale and Georgia Tech. And the results proved the performances of the proposed algorithm was better than that of LPP.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133535962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Discriminative model selection for Gaussian mixture models for classification
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166658
Xiao-Hua Liu, Cheng-Lin Liu
The Gaussian mixture model (GMM) has been widely used in pattern recognition for clustering and probability density estimation. Given the number of mixture components (the model order), the parameters of a GMM can be estimated by the EM algorithm; model order selection, however, remains an open problem. For classification purposes, we propose a discriminative model selection method that optimizes the orders of all classes. Starting from GMMs initialized in some way, the orders of all classes are adjusted heuristically to improve the cross-validated classification accuracy. Model orders selected in this discriminative way are expected to give higher generalization accuracy than classwise model selection. Our experimental results on several UCI datasets demonstrate the superior classification performance of the proposed method.
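A minimal sketch of discriminative order selection: greedily increase a class's number of mixture components whenever the change improves the cross-validated accuracy of the resulting GMM Bayes classifier. The greedy schedule and the candidate orders are assumptions, not the paper's exact heuristic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import StratifiedKFold

def cv_accuracy(X, y, orders, n_splits=5):
    """Cross-validated accuracy of a GMM Bayes classifier with per-class orders."""
    classes = np.unique(y)
    correct = 0
    for tr, te in StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0).split(X, y):
        models, priors = [], []
        for c in classes:
            Xc = X[tr][y[tr] == c]
            models.append(GaussianMixture(n_components=orders[c], random_state=0).fit(Xc))
            priors.append(len(Xc) / len(tr))
        scores = np.stack([np.log(p) + m.score_samples(X[te])
                           for m, p in zip(models, priors)], axis=1)
        correct += np.sum(classes[np.argmax(scores, axis=1)] == y[te])
    return correct / len(y)

def select_orders(X, y, max_order=5):
    orders = {c: 1 for c in np.unique(y)}
    best = cv_accuracy(X, y, orders)
    improved = True
    while improved:
        improved = False
        for c in list(orders):               # try raising one class's order at a time
            if orders[c] < max_order:
                trial = dict(orders)
                trial[c] = orders[c] + 1
                acc = cv_accuracy(X, y, trial)
                if acc > best:
                    best, orders, improved = acc, trial, True
    return orders, best
```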
{"title":"Discriminative model selection for Gaussian mixture models for classification","authors":"Xiao-Hua Liu, Cheng-Lin Liu","doi":"10.1109/ACPR.2011.6166658","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166658","url":null,"abstract":"The Gaussian mixture model (GMM) has been widely used in pattern recognition problems for clustering and probability density estimation. Given the number of mixture components (model order), the parameters of GMM can be estimated by the EM algorithm. The model order selection, however, remains an open problem. For classification purpose, we propose a discriminative model selection method to optimize the orders of all classes. Based on the GMMs initialized in some way, the orders of all classes are adjusted heuristically to improve the cross-validated classification accuracy. The model orders selected in this discriminative way are expected to give higher generalized accuracy than classwise model selection. Our experimental results on some UCI datasets demonstrate the superior classification performance of the proposed method.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124703264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new approach of color image quantization based on Normalized Cut algorithm
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166589
Jin Zhang, Yonghong Song, Yuanlin Zhang, Xiaobing Wang
This paper presents a novel color quantization method based on the Normalized Cut clustering algorithm, aimed at generating a quantized image with minimum loss of information and maximum compression ratio, which benefits the storage and transmission of color images. The method uses a deformed Median Cut algorithm as a coarse partition of color pixels in the RGB color space, and then takes the average color of each partition as the representative color of a node to construct a condensed graph. By applying the Normalized Cut clustering algorithm, we obtain a palette with the desired number of colors and then reconstruct the quantized image. Experiments on commonly used test images demonstrate that our method is very competitive with state-of-the-art color quantization methods in terms of image quality, compression ratio and computation time.
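A hedged sketch of the two-stage idea: a coarse partition of RGB space produces a small set of representative colors, which are then grouped by a spectral (Normalized-Cut-style) clustering to form the final palette. A uniform grid stands in for the paper's deformed Median Cut, and sklearn's SpectralClustering stands in for the exact Normalized Cut implementation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def quantize(image, n_colors=16, bins=8):
    """image: (H, W, 3) uint8 RGB array. Returns the quantized image."""
    pixels = image.reshape(-1, 3).astype(float)
    # Coarse partition: index each pixel into a bins^3 grid cell.
    cell = (pixels // (256 / bins)).astype(int)
    key = cell[:, 0] * bins * bins + cell[:, 1] * bins + cell[:, 2]
    uniq, inv = np.unique(key, return_inverse=True)
    # Representative color of each occupied cell = mean color (one graph node).
    reps = np.stack([pixels[inv == i].mean(axis=0) for i in range(len(uniq))])
    # Normalized-Cut-style grouping of the condensed graph of representatives.
    labels = SpectralClustering(n_clusters=n_colors, affinity="rbf", gamma=1e-3,
                                random_state=0).fit_predict(reps)
    palette = np.stack([reps[labels == c].mean(axis=0) if np.any(labels == c) else reps[c % len(reps)]
                        for c in range(n_colors)])
    return palette[labels[inv]].reshape(image.shape).astype(np.uint8)
```

Clustering the few hundred representative colors instead of millions of pixels is what keeps the graph-cut step tractable.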
{"title":"A new approach of color image quantization based on Normalized Cut algorithm","authors":"Jin Zhang, Yonghong Song, Yuanlin Zhang, Xiaobing Wang","doi":"10.1109/ACPR.2011.6166589","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166589","url":null,"abstract":"This paper presents a novel color quantization method based on Normalized Cut clustering algorithm, in order to generate a quantized image with the minimum loss of information and the maximum compression ratio, which benefits the storage and transmission of the color image. This new method uses a deformed Median Cut algorithm as a coarse partition of color pixels in the RGB color space, and then take the average color of each partition as the representative color of a node to construct a condensed graph. By employing the Normalized Cut clustering algorithm, we could get the palette with defined color number, and then reconstruct the quantized image. Experiments on common used test images demonstrate that our method is very competitive with state-of-the-art color quantization methods in terms of image quality, compression ratio and computation time.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134223912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3D LIDAR-based ground segmentation
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166587
Tongtong Chen, Bin Dai, Daxue Liu, Bo Zhang, Qixu Liu
Obtaining a comprehensive model of large and complex ground surfaces is crucial for autonomous driving in both urban and countryside environments. This paper presents an improved ground segmentation method for 3D LIDAR point clouds. Our approach builds on a polar grid map that is divided into sectors; a 1D Gaussian process (GP) regression model and the Incremental Sample Consensus (INSAC) algorithm are then used to extract the ground in each sector. Experiments are carried out on an autonomous vehicle in different outdoor scenes, and the results are compared with those of an existing method. We show that our method achieves more promising performance.
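A minimal sketch of the per-sector pipeline: bin LIDAR points into angular sectors of a polar grid and fit a 1D Gaussian process of height versus radial distance to low-height seed points; points close to the GP prediction are labeled ground. The INSAC seed-growing loop is simplified here to a single pass, and all thresholds and kernel parameters are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def segment_ground(points, n_sectors=180, seed_height=0.3, tol=0.2):
    """points: (N, 3) array of x, y, z coordinates. Returns a boolean ground mask."""
    angles = np.arctan2(points[:, 1], points[:, 0])
    radii = np.hypot(points[:, 0], points[:, 1])
    sector = ((angles + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    ground = np.zeros(len(points), dtype=bool)
    for s in range(n_sectors):
        idx = np.where(sector == s)[0]
        if len(idx) < 5:
            continue
        seeds = idx[points[idx, 2] < seed_height]        # likely-ground seed points
        if len(seeds) < 3:
            continue
        gp = GaussianProcessRegressor(kernel=RBF(5.0) + WhiteKernel(0.01))
        gp.fit(radii[seeds, None], points[seeds, 2])     # 1D GP: height as a function of range
        z_pred = gp.predict(radii[idx, None])
        ground[idx] = np.abs(points[idx, 2] - z_pred) < tol
    return ground
```

In the full method, the seed set would be grown incrementally (INSAC) by re-fitting the GP as newly accepted ground points are added.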
{"title":"3D LIDAR-based ground segmentation","authors":"Tongtong Chen, Bin Dai, Daxue Liu, Bo Zhang, Qixu Liu","doi":"10.1109/ACPR.2011.6166587","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166587","url":null,"abstract":"Obtaining a comprehensive model of large and complex ground typically is crucial for autonomous driving both in urban and countryside environments. This paper presents an improved ground segmentation method for 3D LIDAR point clouds. Our approach builds on a polar grid map, which is divided into some sectors, then 1D Gaussian process (GP) regression model and Incremental Sample Consensus (INSAC) algorithm is used to extract ground for every sector. Experiments are carried out at the autonomous vehicle in different outdoor scenes, and results are compared to those of the existing method. We show that our method can get more promising performance.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115022478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classification based character segmentation guided by Fast-Hessian-Affine regions
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166546
Takahiro Ota, T. Wada
This paper presents a method for fast and accurate character localization for OCR (Optical Character Reader). We previously proposed an acceleration framework for arbitrary classifiers, classifier molding, for real-time verification of characters printed by an Industrial Ink Jet Printer (IIJP). In this framework, the behavior of an accurate but slow character classifier is learned by a linear regression tree. The resulting classifier is up to 1,500 times faster than the original but is still not fast enough for a real-time pyramidal scan of VGA images, which is necessary for scale-free character recognition. To address this problem, we also proposed CCS (Classification based Character Segmentation), which finds the character arrangement that maximizes the sum of the likelihoods of character regions, assuming that all characters are horizontally aligned at almost regular intervals. This assumption is not always true, even for characters printed by an IIJP. To address this, we extend the idea of CCS to arbitrarily located characters. Our method first generates character-region candidates based on local elliptical regions, named Fast-Hessian-Affine regions, and then finds the most likely character arrangement. Through experiments, we confirmed that our method quickly and accurately recognizes non-uniformly arranged characters.
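A hedged sketch of the "arrangement that maximizes the sum of likelihoods" idea, reduced to a 1D weighted-interval selection over candidate character regions (each scored by a character classifier). The paper's regular-interval constraint and the extension to arbitrarily located characters are not modeled here; this only illustrates the maximization.

```python
from bisect import bisect_right

def best_arrangement(candidates):
    """candidates: list of (x_left, x_right, likelihood) tuples for candidate regions.
    Returns the maximum total likelihood over sets of non-overlapping regions."""
    cands = sorted(candidates, key=lambda c: c[1])       # sort by right edge
    rights = [c[1] for c in cands]
    best = [0.0] * (len(cands) + 1)                      # best[i] = optimum over first i candidates
    for i, (left, right, score) in enumerate(cands, start=1):
        j = bisect_right(rights, left, 0, i - 1)         # last region ending before this one starts
        best[i] = max(best[i - 1], best[j] + score)      # either skip this region or take it
    return best[-1]
```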
{"title":"Classification based character segmentation guided by Fast-Hessian-Affine regions","authors":"Takahiro Ota, T. Wada","doi":"10.1109/ACPR.2011.6166546","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166546","url":null,"abstract":"This paper presents a method of fast and accurate character localization for OCR (Optical Character Reader). We already proposed an acceleration framework of arbitrary classifiers, classifier molding, for real-time verification of characters printed by Industrial Ink Jet Printer (IIJP). In this framework, the behavior of accurate but slow character classifier is learnt by linear regression tree. The resulted classifier is up to 1,500 times faster than the original one but is not fast enough for real-time pyramidal scan of VGA images, which is necessary for scale-free character recognition. For solving this problem, we also proposed CCS (Classification based Character Segmentation). This method finds character arrangement that maximizes the sum of the likelihood of character regions assuming that all characters are horizontally aligned with almost regular intervals. This assumption is not always true even for the characters printed by IIJP. For solving this problem, we extended the idea of CCS to arbitrary located characters. Our method first generates character-region candidates based on local elliptical regions, named Fast-Hessian-Affine regions, and finds most likely character arrangement. Through experiments, we confirmed that our method quickly and accurately recognizes non-uniformly arranged characters.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122404767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting multiple symmetries with extended SIFT
Pub Date: 2011-11-01 | DOI: 10.1109/ACPR.2011.6166683
Qian Chen, Haiyuan Wu, H. Taki
This paper describes an effective method for detecting multiple symmetric objects in an image. A “pseudo-affine invariant SIFT” is used to detect symmetric feature pairs in perspective images. Candidate symmetry axes are estimated from every two symmetric feature pairs, and the axis supported by the most symmetric feature pairs is taken as the most relevant symmetry axis of a symmetric object. The symmetric feature pairs supporting that axis are then used to detect other symmetry axes of the same symmetric object. After eliminating the pairs that support already detected axes, this procedure is applied repeatedly to the remaining symmetric feature pairs until all symmetric objects in the image are detected. The effectiveness of the method has been confirmed through several experiments on real images and common image databases.
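A minimal sketch of the axis-voting step: each mirrored feature pair votes for its perpendicular bisector, parameterized as (theta, rho) in a Hough-style accumulator, and the most-supported cell gives the dominant symmetry axis. The pseudo-affine invariant SIFT matching that produces the pairs is assumed to have run already; the bin sizes are arbitrary choices.

```python
import numpy as np

def dominant_axis(pairs, n_theta=180, rho_bin=2.0):
    """pairs: list of ((x1, y1), (x2, y2)) mirrored feature locations.
    Returns (theta, rho, support) of the most-voted symmetry axis."""
    votes = {}
    for (x1, y1), (x2, y2) in pairs:
        mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0       # midpoint lies on the bisector
        theta = np.arctan2(y2 - y1, x2 - x1)            # bisector normal is along p -> q
        if theta < 0:
            theta += np.pi                              # fold direction into [0, pi)
        rho = mx * np.cos(theta) + my * np.sin(theta)   # line: x cos(t) + y sin(t) = rho
        key = (int(round(theta / np.pi * n_theta)) % n_theta, int(round(rho / rho_bin)))
        votes[key] = votes.get(key, 0) + 1
    (t_bin, r_bin), support = max(votes.items(), key=lambda kv: kv[1])
    return t_bin * np.pi / n_theta, r_bin * rho_bin, support
```

The pairs that voted for the winning cell would then be removed before re-running the vote to find further axes and objects.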
{"title":"Detecting multiple symmetries with extended SIFT","authors":"Qian Chen, Haiyuan Wu, H. Taki","doi":"10.1109/ACPR.2011.6166683","DOIUrl":"https://doi.org/10.1109/ACPR.2011.6166683","url":null,"abstract":"This paper describes an effective method for detecting multiple symmetric objects in an image. A “pseudo-affine invariant SIFT” is used for detecting symmetric feature pairs in perspective images. Candidates of symmetric axes are estimated from every two symmetric feature pairs, and the one supported by the most symmetric feature pairs is detected as the most relevant symmetric axis of a symmetric object. The symmetric feature pairs supporting the symmetric axis are then used to detect other symmetric axes in the same symmetric object. This procedure is applied repeatedly to the symmetric feature pairs after eliminating the ones that support the already detected symmetric axes to detect all symmetric objects in the image. The effectiveness of this method has been confirmed through several experiments using real images and common image databases.","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129295517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interesting region detection in aerial video using Bayesian topic models
Pub Date: 2011-11-01 | DOI: 10.1109/acpr.2011.6166550
Jiewei Wang, Yunhong Wang, Zhaoxiang Zhang
{"title":"Interesting region detection in aerial video using Bayesian topic models","authors":"Jiewei Wang, Yunhong Wang, Zhaoxiang Zhang","doi":"10.1109/acpr.2011.6166550","DOIUrl":"https://doi.org/10.1109/acpr.2011.6166550","url":null,"abstract":"","PeriodicalId":287232,"journal":{"name":"The First Asian Conference on Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130860536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}