Alessandro Zamberletti, I. Gallo, S. Albertini, L. Noce
Barcode reading mobile applications that identify products from pictures acquired by mobile devices are widely used by customers all over the world to perform online price comparisons or to access reviews written by other customers. Most of the currently available 1D barcode reading applications focus on effectively decoding barcodes and treat the underlying detection task as a side problem to be solved with general-purpose object detection methods. However, the majority of mobile devices do not meet the minimum working requirements of such complex general-purpose object detection algorithms, and most of the efficient, specifically designed 1D barcode detection algorithms require user interaction to work properly. In this work, we present a novel method for 1D barcode detection in camera-captured images, based on a supervised machine learning algorithm that identifies the characteristic visual patterns of 1D barcodes’ parallel bars in the two-dimensional Hough Transform space of the processed images. The proposed method is angle invariant, requires no user interaction and can be executed efficiently on a mobile device; it achieves excellent results on two standard 1D barcode datasets: the WWU Muenster Barcode Database and the ArTe-Lab 1D Medium Barcode Dataset. Moreover, we show that it is possible to enhance the performance of a state-of-the-art 1D barcode reading library by coupling it with our detection method.
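The key cue the method exploits is that a barcode's parallel bars produce many strong accumulator peaks sharing a single θ (one peak per bar, at different ρ) in Hough space. The Python sketch below only illustrates that cue: it builds a plain Hough accumulator and picks the orientation with the most strong peaks, using an untrained heuristic in place of the paper's supervised neural classifier; the Canny thresholds and the 0.5 peak threshold are arbitrary assumptions.

```python
import numpy as np
import cv2

def hough_accumulator(edges, n_theta=180):
    """Plain (rho, theta) Hough accumulator of a binary edge map."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        rhos = np.round(x * cos_t + y * sin_t).astype(int) + diag
        acc[rhos, np.arange(n_theta)] += 1
    return acc

def barcode_orientation(image_path):
    """Heuristic stand-in for the trained classifier: parallel bars show up as
    many strong peaks stacked in one theta column of the accumulator."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, 50, 150)
    acc = hough_accumulator(edges)
    strong = acc > 0.5 * acc.max()
    peaks_per_theta = strong.sum(axis=0)
    return int(np.argmax(peaks_per_theta))  # dominant bar orientation, in degrees
```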
{"title":"Neural 1D Barcode Detection Using the Hough Transform","authors":"Alessandro Zamberletti, I. Gallo, S. Albertini, L. Noce","doi":"10.2197/ipsjtcva.7.1","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.1","url":null,"abstract":"Barcode reading mobile applications to identify products from pictures acquired by mobile devices are widely used by customers from all over the world to perform online price comparisons or to access reviews written by other customers. Most of the currently available 1D barcode reading applications focus on effectively decoding barcodes and treat the underlying detection task as a side problem that needs to be solved using general purpose object detection methods. However, the majority of mobile devices do not meet the minimum working requirements of those complex general purpose object detection algorithms and most of the efficient specifically designed 1D barcode detection algorithms require user interaction to work properly. In this work, we present a novel method for 1D barcode detection in camera captured images, based on a supervised machine learning algorithm that identifies the characteristic visual patterns of 1D barcodes’ parallel bars in the two-dimensional Hough Transform space of the processed images. The method we propose is angle invariant, requires no user interaction and can be effectively executed on a mobile device; it achieves excellent results for two standard 1D barcode datasets: WWU Muenster Barcode Database and ArTe-Lab 1D Medium Barcode Dataset. Moreover, we prove that it is possible to enhance the performance of a state-of-the-art 1D barcode reading library by coupling it with our detection method.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"2 1","pages":"1-9"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85591940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents automatic Martian dust storm detection from multiple wavelength data based on decision-level fusion. In the proposed method, visual features are first extracted from the multiple wavelength data, and optimal features are selected for Martian dust storm detection using the minimal-redundancy-maximal-relevance (mRMR) algorithm. Second, the selected visual features are used to train Support Vector Machine classifiers, one constructed for each data source. Furthermore, as the main contribution of this paper, the proposed method integrates the multiple detection results obtained from the heterogeneous data by decision-level fusion, taking each classifier's detection performance into account to obtain accurate final detection results. Consequently, the proposed method realizes successful Martian dust storm detection.
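A minimal sketch of the pipeline described above using scikit-learn: per-band feature selection, one SVM per band, and decision-level fusion weighted by each classifier's validation accuracy. mRMR is not available in scikit-learn, so a mutual-information filter stands in for it here, and the band dictionary, split ratio and accuracy-based weights are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def train_per_band(band_features, labels, k=20):
    """band_features: dict mapping band name -> (n_samples, n_features) array."""
    models, weights = {}, {}
    for band, X in band_features.items():
        X_tr, X_va, y_tr, y_va = train_test_split(X, labels, test_size=0.3, random_state=0)
        clf = make_pipeline(SelectKBest(mutual_info_classif, k=min(k, X.shape[1])),
                            SVC(probability=True))
        clf.fit(X_tr, y_tr)
        models[band] = clf
        weights[band] = clf.score(X_va, y_va)   # weight each band by validation accuracy
    return models, weights

def fuse_predictions(models, weights, band_features):
    """Weighted decision-level fusion of the per-band dust-storm probabilities."""
    total = sum(weights.values())
    score = sum(weights[b] * models[b].predict_proba(band_features[b])[:, 1]
                for b in models) / total
    return (score > 0.5).astype(int)
```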
{"title":"Automatic Martian Dust Storm Detection from Multiple Wavelength Data Based on Decision Level Fusion","authors":"Keisuke Maeda, Takahiro Ogawa, M. Haseyama","doi":"10.2197/ipsjtcva.7.79","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.79","url":null,"abstract":"This paper presents automatic Martian dust storm detection from multiple wavelength data based on decision level fusion. In our proposed method, visual features are first extracted from multiple wavelength data, and optimal features are selected for Martian dust storm detection based on the minimal-Redundancy-Maximal-Relevance algorithm. Second, the selected visual features are used to train the Support Vector Machine classifiers that are constructed on each data. Furthermore, as a main contribution of this paper, the proposed method integrates the multiple detection results obtained from heterogeneous data based on decision level fusion, while considering each classifier’s detection performance to obtain accurate final detection results. Consequently, the proposed method realizes successful Martian dust storm detection.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"51 7 1","pages":"79-83"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83401976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a fast and accurate object detection algorithm based on binary co-occurrence features. In our method, the co-occurrences of all possible pairs of binary elements in a block of binarized HOG are enumerated by logical operations, e.g., circular shift and XOR. This results in extremely fast co-occurrence extraction. Our experiments revealed that our method can process a VGA-size image at 64.6 fps, more than twice the camera frame rate (30 fps), on a single CPU core (Intel Core i7-3820, 3.60 GHz), while at the same time achieving higher classification accuracy than the original (real-valued) HOG on a pedestrian detection task.
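The speed comes from covering every element pair of a binary block with one vectorized bitwise operation per circular shift. The sketch below shows this enumeration idea in Python/NumPy; the mean-threshold binarization and the choice of AND/XOR outputs are assumptions for illustration, not the paper's exact feature definition.

```python
import numpy as np

def binarize_hog(hog_block, threshold=None):
    """Binarize a real-valued HOG block (here: simple mean thresholding)."""
    t = hog_block.mean() if threshold is None else threshold
    return (hog_block.ravel() > t).astype(np.uint8)

def cooccurrence_features(b):
    """Enumerate all element pairs of a binary vector via circular shifts.

    For each shift d, one vectorized bitwise op covers every pair (i, i+d),
    which is what makes the extraction fast."""
    n = len(b)
    feats = []
    for d in range(1, n):
        shifted = np.roll(b, -d)
        feats.append(b & shifted)   # co-activation of the pair
        feats.append(b ^ shifted)   # disagreement pattern (XOR, as mentioned above)
    return np.concatenate(feats)
```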
{"title":"Fast and Accurate Object Detection Based on Binary Co-occurrence Features","authors":"Mitsuru Ambai, Taketo Kimura, Chiori Sakai","doi":"10.2197/ipsjtcva.7.55","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.55","url":null,"abstract":"In this paper, we propose a fast and accurate object detection algorithm based on binary co-occurrence features. In our method, co-occurrences of all the possible pairs of binary elements in a block of binarized HOG are enumerated by logical operations, i.g. circular shift and XOR. This resulted in extremely fast co-occurrence extraction. Our experiments revealed that our method can process a VGA-size image at 64.6 fps, that is two times faster than the camera frame rate (30 fps), on only a single core of CPU (Intel Core i7-3820 3.60 GHz), while at the same time achieving a higher classification accuracy than original (real-valued) HOG in the case of a pedestrian detection task.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"6 1","pages":"55-58"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76032269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Fukuda, Shintaro Hara, K. Asakawa, H. Ishikawa, M. Noshiro, Mituaki Katuya
Dichromats are color-blind persons missing one of the three cone systems. We consider a computer simulation of color confusion for dichromats, for any color on any video device, which transforms the color of each pixel into a representative color among the set of its confusion colors. As a guiding principle of the simulation we adopt the proportionality law between pre-transformed and post-transformed colors, which ensures that the same color is never transformed into two or more different colors, apart from intensity differences. We show that a simulation algorithm satisfying the proportionality law is unique for video displays whose gamut, projected onto the plane perpendicular to the color confusion axis in LMS space, is a hexagon. Almost all video displays, including sRGB, satisfy this condition, and we demonstrate this unique simulation on an sRGB video display. As a corollary, we show that it is impossible to build an appropriate algorithm if we instead demand the additivity law, which is mathematically stronger than the proportionality law and would enable additive mixture among post-transformed colors, as it does for dichromats.
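The proportionality law is satisfied automatically by any linear map, and projecting each LMS color onto a plane of representative colors along the confusion axis is exactly such a map. The sketch below illustrates this for a protanope-like case; the two plane-spanning vectors are placeholder values, not calibrated LMS coordinates, and the code only verifies the proportionality property rather than reproducing the hexagonal-gamut construction described above.

```python
import numpy as np

# Confusion axis for a protanope in LMS space: the L direction (missing cone).
axis = np.array([1.0, 0.0, 0.0])

# Plane of representative colors, spanned by two anchor stimuli in LMS.
# These vectors are placeholders, not measured LMS coordinates.
v1 = np.array([0.9, 1.0, 1.0])   # e.g., an achromatic anchor (assumed)
v2 = np.array([0.1, 0.2, 1.0])   # e.g., a short-wavelength anchor (assumed)

def projection_matrix(axis, v1, v2):
    """Project LMS colors onto span{v1, v2} along the confusion axis."""
    n = np.cross(v1, v2)                              # normal of the target plane
    # Oblique projection: c -> c - (n.c / n.axis) * axis
    return np.eye(3) - np.outer(axis, n) / n.dot(axis)

P = projection_matrix(axis, v1, v2)
c = np.array([0.4, 0.7, 0.2])
for k in (0.25, 0.5, 2.0):
    # A linear transform satisfies the proportionality law T(k*c) = k*T(c).
    assert np.allclose(P @ (k * c), k * (P @ c))
```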
{"title":"Computer Simulation of Color Confusion for Dichromats in Video Device Gamut under Proportionality Law","authors":"H. Fukuda, Shintaro Hara, K. Asakawa, H. Ishikawa, M. Noshiro, Mituaki Katuya","doi":"10.2197/ipsjtcva.7.41","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.41","url":null,"abstract":"Dichromats are color-blind persons missing one of the three cone systems. We consider a computer simulation of color confusion for dichromats for any colors on any video device, which transforms color in each pixel into a representative color among the set of its confusion colors. As a guiding principle of the simulation we adopt the proportionality law between the pre-transformed and post-transformed colors, which ensures that the same colors are not transformed to two or more different colors apart from intensity. We show that such a simulation algorithm with the proportionality law is unique for the video displays whose projected gamut onto the plane perpendicular to the color confusion axis in the LMS space is hexagon. Almost all video display including sRGB satisfy this condition and we demonstrate this unique simulation in sRGB video display. As a corollary we show that it is impossible to build an appropriate algorithm if we demand the additivity law, which is mathematically stronger than the proportionality law and enable the additive mixture among post-transformed colors as well as for dichromats.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"75 1","pages":"41-49"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86286249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christos Varytimidis, Konstantinos Rapantzikos, Yannis Avrithis, S. Kollias
Local feature detection has been an essential part of many methods for computer vision applications such as large-scale image retrieval, object detection, and tracking. Recently, structure-guided feature detectors have been proposed that exploit image edges to accurately capture local shape. Among them, the WαSH detector [Varytimidis et al., 2012] starts from sampling binary edges and exploits α-shapes, a computational geometry representation that describes local shape at different scales. In this work, we propose a novel image sampling method based on dithering smooth image functions other than intensity. Samples are extracted on image contours representing the underlying shapes, with the sampling density determined by image functions such as the gradient or Hessian response, rather than being fixed. We thoroughly evaluate the parameters of the method and achieve state-of-the-art performance on a series of matching and retrieval experiments.
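The sampling idea can be illustrated by error-diffusion dithering of a smooth response map: where the response is high, samples fall densely; where it is low, they are sparse. Below is a Python sketch using Floyd-Steinberg dithering of the gradient magnitude; the Sobel-based response and the density_scale normalization are assumptions for illustration, not the exact functions evaluated in the paper.

```python
import numpy as np
import cv2

def dither_sample(image_path, density_scale=0.2):
    """Sample points by Floyd-Steinberg dithering of a gradient-magnitude map.

    Sampling density follows the chosen response rather than a fixed step;
    density_scale simply rescales the response into [0, 1] to control how
    many samples are produced overall."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    f = np.hypot(gx, gy)
    f = density_scale * f / (f.max() + 1e-9)

    h, w = f.shape
    samples = []
    for y in range(h):
        for x in range(w):
            new = 1.0 if f[y, x] >= 0.5 else 0.0
            if new:
                samples.append((x, y))
            err = f[y, x] - new
            # Standard Floyd-Steinberg error-diffusion weights.
            if x + 1 < w:               f[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     f[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               f[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: f[y + 1, x + 1] += err * 1 / 16
    return samples
```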
{"title":"Dithering-based Sampling and Weighted α-shapes for Local Feature Detection","authors":"Christos Varytimidis, Konstantinos Rapantzikos, Yannis Avrithis, S. Kollias","doi":"10.2197/ipsjtcva.7.189","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.189","url":null,"abstract":"Local feature detection has been an essential part of many methods for computer vision applications like large scale image retrieval, object detection, or tracking. Recently, structure-guided feature detectors have been proposed, exploiting image edges to accurately capture local shape. Among them, the WαSH detector [Varytimidis et al., 2012] starts from sampling binary edges and exploits α-shapes, a computational geometry representation that describes local shape in different scales. In this work, we propose a novel image sampling method, based on dithering smooth image functions other than intensity. Samples are extracted on image contours representing the underlying shapes, with sampling density determined by image functions like the gradient or Hessian response, rather than being fixed. We thoroughly evaluate the parameters of the method, and achieve state-of-the-art performance on a series of matching and retrieval experiments.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"4 1","pages":"189-200"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84697153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Facial expression recognition (FER) is a crucial technology and a challenging task for human–computer interaction. Previous methods have used different feature descriptors for FER, and comparative studies are lacking. In this paper, we aim to identify the best feature descriptor for FER by empirically evaluating five feature descriptors, namely Gabor, Haar, Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG), and Binary Robust Independent Elementary Features (BRIEF) descriptors. We examine each feature descriptor with six classification methods, including k-Nearest Neighbors (k-NN), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Adaptive Boosting (AdaBoost), on four facial expression datasets. In addition to test accuracies, we present confusion matrices of FER. We also analyze the effect of combined features and image resolution on FER performance. Our study indicates that the HOG descriptor works best for FER when the resolution of a detected face is higher than 48×48 pixels.
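As one concrete instance of the evaluated pairings, the sketch below computes HOG descriptors on pre-cropped face images and scores a linear SVM by cross-validation (scikit-image and scikit-learn). The HOG cell sizes, the 64×64 crop size and the 5-fold protocol are assumptions, not the paper's exact settings.

```python
import numpy as np
from skimage.feature import hog
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def hog_features(face_crops):
    """face_crops: iterable of grayscale face images, all resized alike (e.g., 64x64)."""
    return np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for img in face_crops])

def evaluate_fer(face_crops, labels):
    """Cross-validated accuracy of the HOG + linear SVM pairing."""
    X = hog_features(face_crops)
    return cross_val_score(LinearSVC(), X, labels, cv=5).mean()
```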
{"title":"Facial Expression Recognition and Analysis: A Comparison Study of Feature Descriptors","authors":"Chun Fui Liew, T. Yairi","doi":"10.2197/ipsjtcva.7.104","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.104","url":null,"abstract":"Facial expression recognition (FER) is a crucial technology and a challenging task for human–computer interaction. Previous methods have been using different feature descriptors for FER and there is a lack of comparison study. In this paper, we aim to identify the best features descriptor for FER by empirically evaluating five feature descriptors, namely Gabor, Haar, Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG), and Binary Robust Independent Elementary Features (BRIEF) descriptors. We examine each feature descriptor by considering six classification methods, such as k-Nearest Neighbors (k-NN), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Adaptive Boosting (AdaBoost) with four unique facial expression datasets. In addition to test accuracies, we present confusion matrices of FER. We also analyze the effect of combined features and image resolutions on FER performance. Our study indicates that HOG descriptor works the best for FER when image resolution of a detected face is higher than 48×48 pixels.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"17 1","pages":"104-120"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89472112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Processing Ultra High Definition TV videos requires a lot of resources in terms of memory and computation time. In this paper we present a block-propagation background subtraction (BPBGS) method that spreads to neighboring blocks if a part of an object is detected on the borders of the current block. This allows us to avoid processing unnecessary areas that do not contain any object, thus saving memory and computation time. The results show that our method is particularly efficient on sequences where objects occupy a small portion of the scene, even in the presence of considerable background motion. At the same scale, our BPBGS runs much faster than state-of-the-art methods for a similar detection quality.
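A simplified sketch of the block-propagation idea: only a sparse set of seed blocks is examined per frame, and a neighbouring block is added to the work queue whenever detected foreground touches the shared border. The frame-difference model, block size and seed stride below are assumptions standing in for the paper's actual background model.

```python
import numpy as np
from collections import deque

def bpbgs_frame(frame, background, block=64, thresh=25, seed_stride=4):
    """One frame of a simplified block-propagative background subtraction.

    frame, background: grayscale images of equal size (dimensions assumed to be
    multiples of `block`). Returns a boolean foreground mask; blocks that no
    propagation path reaches are never processed."""
    h, w = frame.shape
    nby, nbx = h // block, w // block
    fg = np.zeros_like(frame, dtype=bool)
    visited = np.zeros((nby, nbx), dtype=bool)
    queue = deque((by, bx) for by in range(0, nby, seed_stride)
                           for bx in range(0, nbx, seed_stride))
    while queue:
        by, bx = queue.popleft()
        if by < 0 or bx < 0 or by >= nby or bx >= nbx or visited[by, bx]:
            continue
        visited[by, bx] = True
        ys, xs = by * block, bx * block
        diff = np.abs(frame[ys:ys+block, xs:xs+block].astype(int)
                      - background[ys:ys+block, xs:xs+block].astype(int)) > thresh
        fg[ys:ys+block, xs:xs+block] = diff
        # Propagate to the neighbour behind any border the object touches.
        if diff[0, :].any():  queue.append((by - 1, bx))
        if diff[-1, :].any(): queue.append((by + 1, bx))
        if diff[:, 0].any():  queue.append((by, bx - 1))
        if diff[:, -1].any(): queue.append((by, bx + 1))
    return fg
```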
{"title":"Block-Propagative Background Subtraction System for UHDTV Videos","authors":"A. Beaugendre, S. Goto","doi":"10.2197/ipsjtcva.7.31","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.31","url":null,"abstract":"The process of Ultra High Definition TV videos requires a lot of resources in terms of memory and computation time. In this paper we consider a block-propagation background subtraction (BPBGS) method which spreads to neighboring blocks if a part of an object is detected on the borders of the current block. This allows us to avoid processing unnecessary areas which do not contain any object thus saving memory and computational time. The results show that our method is particularly efficient in sequences where objects occupy a small portion of the scene despite the fact that there are a lot of background movements. At same scale our BPBGS performs much faster than the state-of-art methods for a similar detection quality.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"74 1","pages":"31-34"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90604125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a novel interest point detector stemming from the intuition that image patches which are highly dissimilar over a relatively large extent of their surroundings hold the property of being repeatable and distinctive. This concept of contextual self-dissimilarity reverses the key paradigm of recent successful techniques such as the Local Self-Similarity descriptor and the Non-Local Means filter, which build upon the presence of similar rather than dissimilar patches. Moreover, our approach extends to contextual information the local self-dissimilarity notion embedded in established detectors of corner-like interest points, thereby achieving enhanced repeatability, distinctiveness and localization accuracy. As the key principle and machinery of our method are amenable to a variety of data kinds, including multi-channel images and organized 3D measurements, we delineate how to extend the basic formulation in order to deal with range and RGB-D images, such as those provided by consumer depth cameras.
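A toy version of the contextual self-dissimilarity score: a location is interesting when even the most similar patch in its surrounding ring is far away in SSD terms. The patch size, ring radius and stride below are arbitrary assumptions, and the brute-force loops are only meant to make the definition concrete, not to reflect the paper's efficient machinery.

```python
import numpy as np

def self_dissimilarity_score(gray, p=8, ring=16, stride=4):
    """Score each location by how dissimilar its patch is from its context.

    For every candidate patch we take the minimum SSD to patches sampled in a
    surrounding ring: a high minimum means no similar patch exists nearby.
    Quadratic per pixel, so this toy version only suits small images."""
    h, w = gray.shape
    score = np.zeros((h, w))
    g = gray.astype(np.float32)
    for y in range(ring, h - ring - p, stride):
        for x in range(ring, w - ring - p, stride):
            center = g[y:y+p, x:x+p]
            best = np.inf
            for dy in range(-ring, ring + 1, p):
                for dx in range(-ring, ring + 1, p):
                    if dy == 0 and dx == 0:
                        continue
                    cand = g[y+dy:y+dy+p, x+dx:x+dx+p]
                    best = min(best, float(((center - cand) ** 2).sum()))
            score[y, x] = best
    return score
```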
{"title":"The Maximal Self-dissimilarity Interest Point Detector","authors":"Federico Tombari, L. D. Stefano","doi":"10.2197/ipsjtcva.7.175","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.175","url":null,"abstract":"We propose a novel interest point detector stemming from the intuition that image patches which are highly dissimilar over a relatively large extent of their surroundings hold the property of being repeatable and distinctive. This concept of contextual self-dissimilarity reverses the key paradigm of recent successful techniques such as the Local Self-Similarity descriptor and the Non-Local Means filter, which build upon the presence of similar rather than dissimilar patches. Moreover, our approach extends to contextual information the local self-dissimilarity notion embedded in established detectors of corner-like interest points, thereby achieving enhanced repeatability, distinctiveness and localization accuracy. As the key principle and machinery of our method are amenable to a variety of data kinds, including multi-channel images and organized 3D measurements, we delineate how to extend the basic formulation in order to deal with range and RGB-D images, such as those provided by consumer depth cameras.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"28 1","pages":"175-188"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81513428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Masaki Hayashi, Kyoko Oshima, Masamoto Tanabiki, Y. Aoki
We propose a human lower-body pose estimation method for team sport videos that is integrated with a tracking-by-detection technique. The proposed Label-Grid classifier uses the grid histogram feature of the window provided by the tracker and estimates the position of a specific lower-body joint as the class label of a multiclass classifier, whose classes correspond to the candidate joint positions on the grid. By learning various player poses and scales of Histogram-of-Oriented-Gradients features within one team sport, our method can estimate poses even from motion-blurred and low-resolution images, without requiring the motion-model regression or part-based models that are popular in vision-based human pose estimation. Moreover, our method can estimate poses with part occlusions and non-upright side poses, which are difficult for part-detector-based methods to estimate with only one model. Experimental results show the advantage of our method for side running poses and non-walking poses, as well as its robustness to a large variety of poses and scales in team sports videos.
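The sketch below illustrates the Label-Grid idea: joint localization is recast as multiclass classification over grid cells of the tracked window. HOG features, logistic regression as the multiclass classifier, and the 8×8 grid are assumptions used for illustration; the paper's own feature layout and classifier may differ.

```python
import numpy as np
from skimage.feature import hog
from sklearn.linear_model import LogisticRegression

GRID = 8  # the tracked window is divided into GRID x GRID candidate cells

def cell_label(joint_xy, window_wh):
    """Map a ground-truth joint position inside the window to a grid-cell class."""
    gx = min(int(joint_xy[0] / window_wh[0] * GRID), GRID - 1)
    gy = min(int(joint_xy[1] / window_wh[1] * GRID), GRID - 1)
    return gy * GRID + gx

def train_label_grid(windows, joint_positions, window_wh):
    """windows: equally sized grayscale crops from a tracker; the class labels
    are grid cells rather than continuous coordinates."""
    X = np.array([hog(w, pixels_per_cell=(8, 8), cells_per_block=(2, 2)) for w in windows])
    y = np.array([cell_label(j, window_wh) for j in joint_positions])
    return LogisticRegression(max_iter=1000).fit(X, y)

def predict_joint(clf, window, window_wh):
    """Return the predicted joint position as the centre of the winning cell."""
    feat = hog(window, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    c = int(clf.predict(feat[None])[0])
    cx = (c % GRID + 0.5) * window_wh[0] / GRID
    cy = (c // GRID + 0.5) * window_wh[1] / GRID
    return cx, cy
```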
{"title":"Lower Body Pose Estimation in Team Sports Videos Using Label-Grid Classifier Integrated with Tracking-by-Detection","authors":"Masaki Hayashi, Kyoko Oshima, Masamoto Tanabiki, Y. Aoki","doi":"10.2197/ipsjtcva.7.18","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.18","url":null,"abstract":"We propose a human lower body pose estimation method for team sport videos, which is integrated with tracking-by-detection technique. The proposed Label-Grid classifier uses the grid histogram feature of the tracked window from the tracker and estimates the lower body joint position of a specific joint as the class label of the multiclass classifiers, whose classes correspond to the candidate joint positions on the grid. By learning various types of player poses and scales of Histogram-of-Oriented Gradients features within one team sport, our method can estimate poses even if the players are motion-blurred and low-resolution images without requiring a motion-model regression or part-based model, which are popular vision-based human pose estimation techniques. Moreover, our method can estimate poses with part-occlusions and non-upright side poses, which part-detector-based methods find it difficult to estimate with only one model. Experimental results show the advantage of our method for side running poses and non-walking poses. The results also show the robustness of our method for a large variety of poses and scales in team sports videos.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"14 1","pages":"18-30"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82481270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xueyun Chen, Shiming Xiang, Cheng-Lin Liu, Chunhong Pan
Features play a crucial role in the performance of classifiers for object detection from high-resolution remote sensing images. In this paper, we implemented two types of deep learning methods, a deep convolutional neural network (DNN) and a deep belief net (DBN), and compared their performance with that of traditional methods (handcrafted features with a shallow classifier) on the task of aircraft detection. These methods learn robust features from a large set of training samples to obtain better performance. Their depth (>6 layers) grants them the ability to extract stable and large-scale features from the image. Our experiments show that both deep learning methods reduce the false alarm rate of the traditional methods (HOG, LBP + SVM) by at least 40%, and that the DNN performs slightly better than the DBN. We also fed multiple preprocessed versions of the images simultaneously to one DNN model, and found that this practice clearly improves the performance of the model without adding any extra computational burden.
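For reference, a small PyTorch patch classifier in the spirit of the DNN described above (aircraft vs. background on sliding windows). The layer sizes, 64×64 RGB input and two-class head are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AircraftNet(nn.Module):
    """A small patch classifier (aircraft vs. background); sizes are assumed."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 8 -> 4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(128 * 4 * 4, 256), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(256, 2),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Sliding-window detection would score image patches with this network and
# keep the high-scoring windows as aircraft candidates.
model = AircraftNet()
scores = model(torch.randn(8, 3, 64, 64))   # 8 candidate patches -> 8 x 2 logits
```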
{"title":"Aircraft Detection by Deep Convolutional Neural Networks","authors":"Xueyun Chen, Shiming Xiang, Cheng-Lin Liu, Chunhong Pan","doi":"10.2197/ipsjtcva.7.10","DOIUrl":"https://doi.org/10.2197/ipsjtcva.7.10","url":null,"abstract":"Features play crucial role in the performance of classifier for object detection from high-resolution remote sensing images. In this paper, we implemented two types of deep learning methods, deep convolutional neural network (DNN) and deep belief net (DBN), comparing their performances with that of the traditional methods (handcrafted features with a shallow classifier) in the task of aircraft detection. These methods learn robust features from a large set of training samples to obtain a better performance. The depth of their layers (>6 layers) grants them the ability to extract stable and large-scale features from the image. Our experiments show both deep learning methods reduce at least 40% of the false alarm rate of the traditional methods (HOG, LBP+SVM), and DNN performs a little better than DBN. We also fed some multi-preprocessed images simultaneously to one DNN model, and found that such a practice helps to improve the performance of the model obviously with no extra-computing burden adding.","PeriodicalId":38957,"journal":{"name":"IPSJ Transactions on Computer Vision and Applications","volume":"5 1","pages":"10-17"},"PeriodicalIF":0.0,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86847670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}