A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
Xiaoyang Wang, Q. Ji
2013 IEEE International Conference on Computer Vision, pp. 2120-2127. DOI: 10.1109/ICCV.2013.264

This paper proposes a unified probabilistic model of the relationships between attributes and objects for attribute prediction and object recognition. As semantically meaningful properties of objects, attributes generally relate to each other statistically. The proposed model automatically discovers and captures both the object-dependent and object-independent attribute relationships, and exploits the captured relationships to benefit both attribute prediction and object recognition. Experiments on four benchmark attribute datasets demonstrate the effectiveness of the unified model for improving attribute prediction as well as object recognition in both standard and zero-shot learning settings.
Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification
Marco San-Biagio, M. Crocco, M. Cristani, Samuele Martelli, Vittorio Murino
2013 IEEE International Conference on Computer Vision, pp. 809-816. DOI: 10.1109/ICCV.2013.105

Capturing the essential characteristics of visual objects by considering how their features are inter-related is a recent philosophy of object classification. In this paper, we embed this principle in a novel image descriptor, dubbed Heterogeneous Auto-Similarities of Characteristics (HASC). HASC is applied to heterogeneous dense feature maps, encoding linear relations by covariances and nonlinear associations through information-theoretic measures such as mutual information and entropy. In this way, highly complex structural information can be expressed in a compact, scale-invariant and robust manner. The effectiveness of HASC is tested on diverse detection and classification scenarios, considering objects, textures and pedestrians, on widely known benchmarks (Caltech-101, Brodatz, Daimler Multi-Cue). In all cases, the results obtained with standard classifiers demonstrate the superiority of HASC over the most widely adopted local feature descriptors, such as SIFT, HOG, LBP and feature covariances. In addition, HASC sets the state of the art on the Brodatz texture dataset and the Daimler Multi-Cue pedestrian dataset without resorting to sophisticated ad-hoc classifiers.
Fingerspelling Recognition with Semi-Markov Conditional Random Fields
Taehwan Kim, Gregory Shakhnarovich, Karen Livescu
2013 IEEE International Conference on Computer Vision, pp. 1521-1528. DOI: 10.1109/ICCV.2013.192

Recognition of gesture sequences is in general a very difficult problem, but in certain domains the difficulty may be mitigated by exploiting the domain's "grammar". One such grammatically constrained gesture sequence domain is sign language. In this paper we investigate the case of fingerspelling recognition, which can be very challenging due to the quick, small motions of the fingers. Most prior work on this task has assumed a closed vocabulary of fingerspelled words; here we study the more natural open-vocabulary case, where the only domain knowledge is the set of possible fingerspelled letters and statistics of their sequences. We develop a semi-Markov conditional model approach in which feature functions are defined over segments of video and their corresponding letter labels. We use classifiers of letters and linguistic handshape features, along with expected motion profiles, to define segmental feature functions. This approach improves the letter error rate (Levenshtein distance between hypothesized and correct letter sequences) from 16.3% with a hidden Markov model baseline to 11.6% with the proposed semi-Markov model.
Efficient Image Dehazing with Boundary Constraint and Contextual Regularization
Gaofeng Meng, Ying Wang, Jiangyong Duan, Shiming Xiang, Chunhong Pan
2013 IEEE International Conference on Computer Vision, pp. 617-624. DOI: 10.1109/ICCV.2013.82

Images captured in foggy weather conditions often suffer from poor visibility. In this paper, we propose an efficient regularization method to remove haze from a single input image. Our method benefits greatly from an exploration of the inherent boundary constraint on the transmission function. This constraint, combined with a weighted L1-norm based contextual regularization, is modeled as an optimization problem to estimate the unknown scene transmission. A highly efficient algorithm based on variable splitting is also presented to solve the problem. The proposed method requires only a few general assumptions and can restore a high-quality haze-free image with faithful colors and fine image details. Experimental results on a variety of hazy images demonstrate the effectiveness and efficiency of the proposed method.
Coupling Alignments with Recognition for Still-to-Video Face Recognition
Zhiwu Huang, Xiaowei Zhao, S. Shan, Ruiping Wang, Xilin Chen
2013 IEEE International Conference on Computer Vision, pp. 3296-3303. DOI: 10.1109/ICCV.2013.409

Still-to-Video (S2V) face recognition systems typically need to match faces in low-quality videos captured under unconstrained conditions against high-quality still face images, which is very challenging because of noise, image blur, low face resolution, varying head pose, complex lighting, and alignment difficulty. One solution is to select the frames of "best quality" from the videos (hereinafter called quality alignment). Meanwhile, the faces in the selected frames should also be geometrically aligned to the still faces, which are well aligned offline in the gallery. In this paper, we observe that the three tasks (quality alignment, geometric alignment and face recognition) can benefit from each other and should therefore be performed jointly. With this in mind, we propose a Coupling Alignments with Recognition (CAR) method that tightly couples these tasks via low-rank regularized sparse representation in a unified framework. Our method makes the three tasks promote each other through a joint optimization solved with an Augmented Lagrange Multiplier routine. Extensive experiments on two challenging S2V datasets demonstrate that our method clearly outperforms state-of-the-art methods.
No Matter Where You Are: Flexible Graph-Guided Multi-task Learning for Multi-view Head Pose Classification under Target Motion
Yan Yan, E. Ricci, Subramanian Ramanathan, O. Lanz, N. Sebe
2013 IEEE International Conference on Computer Vision, pp. 1177-1184. DOI: 10.1109/ICCV.2013.150

We propose a novel multi-task learning framework (FEGA-MTL) for classifying the head pose of a person who moves freely in an environment monitored by multiple large-field-of-view surveillance cameras. As the target (person) moves, distortions in facial appearance owing to camera perspective and scale severely impede the performance of traditional head pose classification methods. FEGA-MTL operates on a dense uniform spatial grid and learns appearance relationships across partitions, as well as partition-specific appearance variations for a given head pose, to build region-specific classifiers. Guided by two graphs that a priori model appearance similarity among (i) grid partitions based on camera geometry and (ii) head pose classes, the learner efficiently clusters appearance-wise related grid partitions to derive the optimal partitioning. For pose classification, once the target's position has been determined with a person tracker, the appropriate region-specific classifier is invoked. Experiments confirm that FEGA-MTL achieves state-of-the-art classification with little training data.
Learning to Predict Gaze in Egocentric Video
Yin Li, A. Fathi, James M. Rehg
2013 IEEE International Conference on Computer Vision, pp. 3216-3223. DOI: 10.1109/ICCV.2013.399

We present a model for gaze prediction in egocentric video that leverages the implicit cues present in the camera wearer's behavior. Specifically, we compute the camera wearer's head motion and hand location from the video and combine them to estimate where the eyes look. We further model the dynamic behavior of the gaze, in particular fixations, as latent variables to improve the gaze prediction. Our gaze prediction results outperform state-of-the-art algorithms by a large margin on publicly available egocentric vision datasets. In addition, we demonstrate a significant performance boost in recognizing daily actions and segmenting foreground objects when our gaze predictions are plugged into state-of-the-art methods.
Efficient and Robust Large-Scale Rotation Averaging
Avishek Chatterjee, V. Govindu
2013 IEEE International Conference on Computer Vision, pp. 521-528. DOI: 10.1109/ICCV.2013.70

In this paper we address the problem of robust and efficient averaging of relative 3D rotations. Apart from having an interesting geometric structure, robust rotation averaging addresses the need for a good initialization of the large-scale optimization used in structure-from-motion pipelines. Such pipelines often use unstructured image datasets harvested from the internet, and therefore require an initialization method that is robust to outliers. Our approach works on the Lie group structure of 3D rotations and solves the problem of large-scale robust rotation averaging in two ways. First, we use modern ℓ1 optimizers to carry out robust averaging of relative rotations that is efficient, scalable and robust to outliers. In addition, we develop a two-step method that uses the ℓ1 solution as an initialization for an iteratively reweighted least squares (IRLS) approach. These methods achieve excellent results on large-scale, real-world datasets and significantly outperform existing methods, namely the state-of-the-art discrete-continuous optimization method of [3] and the Weiszfeld method of [8]. We demonstrate the efficacy of our method on two large-scale real-world datasets and provide the results of the two aforementioned methods for comparison.
High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination
Yudeog Han, Joon-Young Lee, In-So Kweon
2013 IEEE International Conference on Computer Vision, pp. 1617-1624. DOI: 10.1109/ICCV.2013.204

We present a novel framework to estimate the detailed shape of diffuse objects with uniform albedo from a single RGB-D image. To estimate lighting accurately in a natural illumination environment, we introduce a general lighting model consisting of two components: a global model and a local model. The global lighting model is estimated from the RGB-D input using the low-dimensional characteristic of a diffuse reflectance model. The local lighting model represents spatially varying illumination and is estimated by exploiting the smoothly varying characteristic of illumination. With both the global and local lighting models, we can accurately estimate complex lighting variations under uncontrolled natural illumination. For high-quality shape capture, a shape-from-shading approach is applied with the estimated lighting model. Since the entire process uses a single RGB-D input, our method is capable of capturing the high-quality shape details of a dynamic object under natural illumination. Experimental results demonstrate the feasibility and effectiveness of our method, which dramatically improves the shape details of the rough depth input.
Supervised Binary Hash Code Learning with Jensen Shannon Divergence
Lixin Fan
2013 IEEE International Conference on Computer Vision, pp. 2616-2623. DOI: 10.1109/ICCV.2013.325

This paper proposes to learn binary hash codes within a statistical learning framework, in which an upper bound on the probability of Bayes decision errors is derived for different forms of hash functions, and a rigorous proof of the convergence of the upper bound is presented. Consequently, minimizing this upper bound leads to consistent performance improvements of existing hash code learning algorithms, regardless of whether the original algorithms are unsupervised or supervised. The paper also presents a fast hash coding method that exploits simple binary tests to achieve orders-of-magnitude improvements in coding speed over projection-based methods.