{"title":"Planar Segmentation from Point Clouds via Graph Laplacian Regularized K-Planes","authors":"Wei Sui, Lingfeng Wang, Huai-Yu Wu, Chunhong Pan","doi":"10.1109/ACPR.2013.15","DOIUrl":"https://doi.org/10.1109/ACPR.2013.15","url":null,"abstract":"Extracting planar surfaces from 3D point clouds is an important and challenging step in generating building models, as the acquired data are typically noisy, incomplete and unorganised. In this paper, we present a novel graph Laplacian regularized K-planes method for segmenting piecewise planar surfaces of urban building point clouds. Our model rests on two ideas: 1) a linear projection model is utilized to fit planar surfaces globally, and 2) a graph Laplacian regularization is applied to preserve the smoothness of each plane locally. The two terms are combined into an objective function, which is minimized via an iterative updating algorithm. Comparative experiments on both synthetic and real data sets demonstrate the effectiveness and efficiency of our method.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114665793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
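The K-planes fitting this abstract builds on can be sketched as a simple alternation between refitting each plane to its assigned points and reassigning every point to its nearest plane. This is a minimal illustration of the global fitting term only; the paper's graph Laplacian smoothness term and exact update rules are not reproduced here, and the function name and signature are assumptions.

```python
import numpy as np

def k_planes(points, labels, iters=10):
    """Minimal K-planes alternation: refit each plane to its assigned
    points by least squares (PCA normal), then reassign every point to
    its nearest plane. Sketches only the global fitting term; the
    graph Laplacian smoothness term of the paper is omitted."""
    k = int(labels.max()) + 1
    for _ in range(iters):
        normals = np.zeros((k, 3))
        offsets = np.full(k, np.inf)   # empty clusters attract no points
        for j in range(k):
            pts = points[labels == j]
            if len(pts) < 3:
                continue
            c = pts.mean(axis=0)
            # plane normal = direction of least variance of the cluster
            _, _, vt = np.linalg.svd(pts - c, full_matrices=False)
            normals[j] = vt[-1]
            offsets[j] = -normals[j] @ c
        # point-to-plane distances, shape (n_points, k); take nearest
        labels = np.abs(points @ normals.T + offsets).argmin(axis=1)
    return labels
```

Given a rough initial assignment (e.g. from clustering), a few iterations typically snap points lying on distinct planes into clean groups.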
{"title":"Consistent Segmentation Based Color Correction for Coarsely Registered Images","authors":"Haoxing Wang, Longquan Dai, Xiaopeng Zhang","doi":"10.1109/ACPR.2013.72","DOIUrl":"https://doi.org/10.1109/ACPR.2013.72","url":null,"abstract":"Local color correction methods transfer colors between corresponding regions. However, inconsistent segmentation between the source and target images tends to degrade the correction result. In this paper, we propose a local color correction technique for coarsely registered images. In the segmentation step, it enforces consistent segmentation on both the source and target images to alleviate the inaccurate registration problem. In the color transfer step, it uses region confidences and bilateral-filter-like color influence maps to improve the color correction result. Experiments show that the proposed method achieves improved color correction results compared with global methods and recent local color correction methods.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"22 3 Suppl 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115350663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
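For context, the classic global baseline such methods are compared against (Reinhard-style per-channel mean/standard-deviation matching) can be sketched as follows. This is not the authors' segmentation-based method, and the function name is illustrative only.

```python
import numpy as np

def global_color_transfer(src, tgt):
    """Global color transfer baseline: shift and scale each channel of
    the source image so its mean and standard deviation match those of
    the target image (Reinhard-style, applied here directly in the
    image's own color space for simplicity)."""
    src = src.astype(float)
    tgt = tgt.astype(float)
    out = np.empty_like(src)
    for c in range(src.shape[-1]):
        s, t = src[..., c], tgt[..., c]
        scale = t.std() / (s.std() + 1e-12)   # guard against flat channels
        out[..., c] = (s - s.mean()) * scale + t.mean()
    return out
```

Because the transform is global, it cannot adapt to region-level color differences, which is exactly the gap the local, segmentation-consistent approach addresses.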
{"title":"Vehicle Detection in Satellite Images by Parallel Deep Convolutional Neural Networks","authors":"Xueyun Chen, Shiming Xiang, Cheng-Lin Liu, Chunhong Pan","doi":"10.1109/ACPR.2013.33","DOIUrl":"https://doi.org/10.1109/ACPR.2013.33","url":null,"abstract":"Deep convolutional neural networks (DNNs) are the state-of-the-art machine learning method and have been used in many recognition tasks, including handwritten digits, Chinese characters and traffic signs. However, training and testing a DNN are time-consuming, while practical vehicle detection demands both speed and accuracy, so accelerating DNNs while keeping their high accuracy matters for many recognition and detection applications. We introduce parallel branches into the DNN: the feature maps of each layer are divided into several parallel branches with equal numbers of maps and no direct connections between branches. Our parallel DNN (PNN) keeps the structure and dimensions of the original DNN while reducing the total number of connections between maps. The more branches the maps are divided into, the faster the PNN runs; the conventional DNN is the special case of a PNN with a single branch. Experiments on a large vehicle database show that the detection accuracy of PNN drops only slightly as the speed increases. Even the fastest PNN (10 times faster than the DNN), with only two maps per branch, clearly outperforms traditional feature-based methods (such as HOG and LBP). PNN thus offers a good way to trade off the speed and accuracy requirements of many applications.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"314 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122751171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
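The connection-count saving from splitting layers into parallel branches follows from simple arithmetic: with no cross-branch links, a layer pair with m_in and m_out maps and k x k kernels keeps only 1/g of its map-to-map weights when split into g branches. A hypothetical helper to check this (the counting ignores biases, which the abstract does not discuss):

```python
def conv_connections(m_in, m_out, k, groups=1):
    """Number of kernel weights between two convolutional layers with
    m_in input maps, m_out output maps and k x k kernels, when the maps
    are split into `groups` parallel branches with no connections
    between branches. groups=1 is the conventional fully-connected
    map topology."""
    assert m_in % groups == 0 and m_out % groups == 0
    return groups * (m_in // groups) * (m_out // groups) * k * k

full = conv_connections(32, 64, 5)             # conventional DNN layer
split = conv_connections(32, 64, 5, groups=4)  # four parallel branches
```

Here `full` is four times `split`, matching the claim that more branches mean proportionally fewer connections.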
{"title":"Real-Time Binary Descriptor Based Background Modeling","authors":"Wan-Chen Liu, Shu-Zhe Lin, Min-Hsiang Yang, Chun-Rong Huang","doi":"10.1109/ACPR.2013.125","DOIUrl":"https://doi.org/10.1109/ACPR.2013.125","url":null,"abstract":"In this paper, we propose a new binary descriptor based background modeling approach that is robust to lighting changes and dynamic backgrounds in the environment. Instead of using traditional parametric models, our background models are constructed from background instances, i.e., binary descriptors computed from observed backgrounds. As shown in the experiments, our method achieves better foreground detection results and fewer false alarms than state-of-the-art methods.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122795885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
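A generic sample-based sketch of the idea, assuming the binary descriptors (e.g. LBP-like bitstrings) are already computed per pixel: a pixel is background if its current descriptor is near any stored background instance in Hamming distance. The class, thresholds and update policy below are illustrative, not the paper's exact model.

```python
def hamming(a, b):
    """Hamming distance between two descriptors stored as integers."""
    return bin(a ^ b).count("1")

class BinaryBackgroundModel:
    """Per-pixel background model holding up to n_samples binary
    descriptors. An observation matching any stored sample within
    Hamming distance tau is background and refreshes the model;
    otherwise (once bootstrapped) it is flagged as foreground."""
    def __init__(self, n_samples=5, tau=2):
        self.n_samples, self.tau = n_samples, tau
        self.samples = []

    def observe(self, desc):
        """Return True if `desc` looks like foreground."""
        if any(hamming(desc, s) <= self.tau for s in self.samples):
            self.samples.append(desc)              # refresh the model
            self.samples = self.samples[-self.n_samples:]
            return False
        if len(self.samples) < self.n_samples:     # still bootstrapping
            self.samples.append(desc)
            return False
        return True
```

Because matching is bitwise, each test is a cheap XOR plus popcount, which is what makes such instance-based binary models attractive for real-time use.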
{"title":"Learning Fingerprint Orientation Fields Using Continuous Restricted Boltzmann Machines","authors":"M. Sahasrabudhe, A. Namboodiri","doi":"10.1109/ACPR.2013.37","DOIUrl":"https://doi.org/10.1109/ACPR.2013.37","url":null,"abstract":"We aim to learn local orientation field patterns in fingerprints and correct distorted field patterns in noisy fingerprint images. This is formulated as a learning problem and achieved using two continuous restricted Boltzmann machines. The learnt orientation fields are then used in conjunction with traditional Gabor based algorithms for fingerprint enhancement. Orientation fields extracted by gradient-based methods are local, and do not consider neighboring orientations. If some amount of noise is present in a fingerprint, then these methods perform poorly when enhancing the image, affecting fingerprint matching. This paper presents a method to correct the resulting noisy regions over patches of the fingerprint by training two continuous restricted Boltzmann machines. The continuous RBMs are trained with clean fingerprint images and applied to overlapping patches of the input fingerprint. Experimental results show that one can successfully restore patches of noisy fingerprint images.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126147391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
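As a rough illustration of the train-on-clean-patches, reconstruct-noisy-patches workflow, here is a minimal Bernoulli RBM with one contrastive-divergence (CD-1) step. The paper uses continuous RBMs on orientation values, so this binary sketch is a simplified stand-in; all names and hyper-parameters are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyRBM:
    """Bernoulli RBM trained with single-step contrastive divergence.
    A simplified stand-in for the paper's continuous RBMs, showing the
    train-on-clean / reconstruct workflow on flattened patches."""
    def __init__(self, n_vis, n_hid, lr=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_vis, n_hid))
        self.a = np.zeros(n_vis)   # visible bias
        self.b = np.zeros(n_hid)   # hidden bias
        self.lr = lr

    def train_step(self, v0):
        h0 = sigmoid(v0 @ self.W + self.b)
        hs = (self.rng.random(h0.shape) < h0).astype(float)  # sample hidden
        v1 = sigmoid(hs @ self.W.T + self.a)                 # reconstruction
        h1 = sigmoid(v1 @ self.W + self.b)
        self.W += self.lr * (np.outer(v0, h0) - np.outer(v1, h1))
        self.a += self.lr * (v0 - v1)
        self.b += self.lr * (h0 - h1)

    def reconstruct(self, v):
        """One up-down pass: project a (possibly noisy) patch through
        the hidden layer and back."""
        return sigmoid(sigmoid(v @ self.W + self.b) @ self.W.T + self.a)
```

Applied to overlapping patches, `reconstruct` plays the role of pulling a noisy patch toward the learnt clean-patch manifold.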
{"title":"Automatic Elements Extraction of Chinese Web News Using Prior Information of Content and Structure","authors":"Chengru Song, Shifeng Weng, Changshui Zhang","doi":"10.1109/ACPR.2013.52","DOIUrl":"https://doi.org/10.1109/ACPR.2013.52","url":null,"abstract":"We propose a set of efficient processes for extracting all four elements of Chinese news web pages, namely the news title, release date, news source and main text. Our approach is based on a deep analysis of the content and structure features of current Chinese news. We take content indicators as the key to recovering the tree structure of the main text. Additionally, we introduce the concept of the Length-Distance Ratio to further improve performance. Our method barely depends on sample selection and, requiring no training process, has strong generalization ability, distinguishing itself from most existing methods. We have tested our approach on 1721 labeled Chinese news pages from 429 web sites. Results show 87% accuracy for news source extraction and over 95% accuracy for the other three elements.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129354926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Saliency Detection Using Color Spatial Variance Weighted Graph Model","authors":"Xiaoyun Yan, Yuehuang Wang, Mengmeng Song, Man Jiang","doi":"10.1109/ACPR.2013.93","DOIUrl":"https://doi.org/10.1109/ACPR.2013.93","url":null,"abstract":"Saliency detection, a recently active research field of computer vision, has a wide range of applications, such as pattern recognition, image retrieval, adaptive compression and target detection. In this paper, we propose a saliency detection method based on a color spatial variance weighted graph model, designed around a background prior. First, the original image is partitioned into small patches; mean-shift clustering on these patches then yields clustering centers that represent the main colors of the whole image. In the modeling stage, all patches and clustering centers are denoted as nodes of a graph. The saliency of each patch is defined as the weighted sum of the shortest-path costs from the patch to all clustering centers, with each shortest path weighted according to color spatial variance. Our saliency detection method is computationally efficient and outperforms state-of-the-art methods in both precision and recall in our evaluation on the popular MSRA1000 database.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129293559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
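The path-based saliency definition in this abstract, where each patch is scored by a weighted sum of its shortest-path costs to the color cluster-center nodes, can be sketched with plain Dijkstra over a small patch graph. The graph construction and the exact color-spatial-variance weighting are not specified here, so the edge weights and center weights below are placeholders.

```python
import heapq

def dijkstra(n, edges, src):
    """Shortest-path costs from `src` on an undirected weighted graph.
    edges: dict mapping a node pair (u, v) to a non-negative weight."""
    adj = {i: [] for i in range(n)}
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def patch_saliency(n, edges, centers, center_weights):
    """Saliency of each patch node: weighted sum, over the cluster-center
    nodes, of its shortest-path cost to that center. The weights stand
    in for the color spatial variance term of the abstract."""
    per_center = [dijkstra(n, edges, c) for c in centers]
    return [sum(w * d[i] for w, d in zip(center_weights, per_center))
            for i in range(n)]
```

Patches far (in accumulated edge cost) from the dominant background colors receive high scores, which is the intended effect of the background prior.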
{"title":"New Banknote Number Recognition Algorithm Based on Support Vector Machine","authors":"S. Gai, Guowei Yang, S. Zhang, M. Wan","doi":"10.1109/ACPR.2013.115","DOIUrl":"https://doi.org/10.1109/ACPR.2013.115","url":null,"abstract":"Detecting the banknote serial number is an important task in business transactions. In this paper, we propose a new banknote number recognition method. Preprocessing of each banknote image locates the banknote number region. Each number image is divided into non-overlapping partitions, and the average gray value of each partition is used as the feature vector for recognition. The optimal kernel function is obtained by semi-definite programming (SDP). Experimental results show that the proposed method outperforms MASK, BP, HMM and single-SVM classifiers.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131465893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
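The feature extraction step this abstract describes, non-overlapping partitions with one mean gray value each, is straightforward to sketch. The grid size is a free parameter and the helper is illustrative, not the paper's exact preprocessing.

```python
import numpy as np

def partition_features(img, rows, cols):
    """Divide a grayscale digit image into rows x cols non-overlapping
    partitions and return each partition's mean gray value as one
    feature. Assumes the image dimensions divide evenly by the grid."""
    h, w = img.shape
    assert h % rows == 0 and w % cols == 0
    # reshape into (row-block, within-row, col-block, within-col)
    blocks = img.reshape(rows, h // rows, cols, w // cols)
    return blocks.mean(axis=(1, 3)).ravel()   # length rows * cols
```

The resulting low-dimensional vector is what would be fed to the SVM classifier.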
{"title":"A Multi-resolution Action Recognition Algorithm Using Wavelet Domain Features","authors":"H. Imtiaz, U. Mahbub, G. Schaefer, Md Atiqur Rahman Ahad","doi":"10.1109/ACPR.2013.143","DOIUrl":"https://doi.org/10.1109/ACPR.2013.143","url":null,"abstract":"This paper proposes a novel approach for human action recognition using multi-resolution feature extraction based on the two-dimensional discrete wavelet transform (2D-DWT). Action representations can be considered as image templates, which are useful for understanding various actions or gestures as well as for recognition and analysis. An action recognition scheme is developed based on extracting features from the frames of a video sequence. The proposed feature selection algorithm offers very low feature dimensionality and therefore a lower computational burden. It is shown that wavelet-domain features enhance the distinguishability of different actions, resulting in very high within-class compactness and between-class separability of the extracted features, while certain undesirable phenomena, such as camera movement and changes in camera distance from the subject, are less severe in the frequency domain. Principal component analysis is performed to further reduce the dimensionality of the feature space. Extensive experiments on a standard benchmark database confirm that the proposed approach offers not only computational savings but also very high recognition accuracy.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126963557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
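The two reduction steps named in the abstract, a 2D-DWT followed by PCA, can be sketched as below: one Haar level keeping only the low-pass (LL) subband (here as a plain 2x2 block average, i.e. up to normalisation), then projection onto the top principal components. Dimensions and component counts are placeholders, not the paper's settings.

```python
import numpy as np

def haar_ll(frame):
    """One level of a 2-D Haar DWT, keeping only the low-pass (LL)
    subband: each coefficient averages a 2x2 block (up to the usual
    Haar normalisation), so dimensionality drops by 4x per level."""
    h, w = frame.shape
    f = frame[: h - h % 2, : w - w % 2]       # trim odd edges
    return 0.25 * (f[0::2, 0::2] + f[0::2, 1::2]
                   + f[1::2, 0::2] + f[1::2, 1::2])

def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components
    (computed via SVD of the centred feature matrix)."""
    X = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T
```

A per-frame pipeline would flatten `haar_ll(frame)` into a row of `features` and run `pca_reduce` over the collected rows.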
{"title":"Correlation-Based Facade Parsing Using Shape Grammar","authors":"Runze Zhang, Ruiling Deng, Xin He, Gang Zeng, Rui Gan, H. Zha","doi":"10.1109/ACPR.2013.81","DOIUrl":"https://doi.org/10.1109/ACPR.2013.81","url":null,"abstract":"Semantic information, with its strong cues about hierarchical and repetitive structures, has been widely used in dealing with urban scenes. In this paper, we present a super-pixel-based facade parsing framework that combines top-down shape grammar splitting with bottom-up information aggregation: machine learning forecasts prior classes, super-pixels improve compactness, and boundary estimation divides the splitting into two procedures - raw and fine - with the former providing a reasonable initial guess for the latter to achieve better random walk optimization results. We also introduce correlation judgments between floors to balance the reduction of degrees of freedom against style variety and flexibility, and incorporate them as an alignment constraint term extending the probability energy. Experiments show that our method converges fast and achieves state-of-the-art results for different facade styles. Further work on scene understanding and reconstruction exploiting these results is in progress.","PeriodicalId":365633,"journal":{"name":"2013 2nd IAPR Asian Conference on Pattern Recognition","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116170001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}