
Latest Publications: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

A Multi-level Contextual Model for Person Recognition in Photo Albums
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.145
Haoxiang Li, Jonathan Brandt, Zhe L. Lin, Xiaohui Shen, G. Hua
In this work, we present a new framework for person recognition in photo albums that exploits contextual cues at multiple levels, spanning individual persons, individual photos, and photo groups. Through experiments, we show that the information available at each of these distinct contextual levels provides complementary cues as to person identities. At the person level, we leverage clothing and body appearance in addition to facial appearance, to compensate for instances where the faces are not visible. At the photo level, we leverage a learned prior on the joint distribution of identities on the same photo to guide the identity assignments. Going beyond a single photo, we are able to infer natural groupings of photos with shared context in an unsupervised manner. By exploiting this shared contextual information, we are able to reduce the identity search space and exploit higher intra-personal appearance consistency within photo groups. Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset. Our results outperform competing methods by a significant margin, while being computationally efficient and practical in a real-world application.
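To make the multi-level fusion concrete, here is a minimal sketch (not the authors' code) of how person-level appearance scores might be combined with a photo-level identity co-occurrence prior; the score matrices and the learned prior `cooc` are assumed inputs, and the mean-field-style refinement is a simplification of the paper's inference.

```python
import numpy as np

def assign_identities(face_s, body_s, cooc, alpha=0.6, beta=0.2, iters=5):
    """face_s, body_s: (P, K) appearance scores for P persons over K identities.
    cooc: (K, K) learned co-occurrence prior for identities in the same photo."""
    s = alpha * face_s + (1.0 - alpha) * body_s        # person-level fusion
    p = np.exp(s - s.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    for _ in range(iters):                             # photo-level refinement
        ctx = (p.sum(axis=0, keepdims=True) - p) @ cooc  # beliefs of the other persons
        logits = s + beta * ctx
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)              # renormalize to probabilities
    return p.argmax(axis=1)                            # one identity per person
```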
Citations: 27
DeepHand: Robust Hand Pose Estimation by Completing a Matrix Imputed with Deep Features
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.450
Ayan Sinha, Chiho Choi, K. Ramani
We propose DeepHand to estimate the 3D pose of a hand using depth data from commercial 3D sensors. We discriminatively train convolutional neural networks to output a low-dimensional activation feature given a depth map. This activation feature vector is representative of the global or local joint angle parameters of a hand pose. We efficiently identify 'spatial' nearest neighbors to the activation feature, from a database of features corresponding to synthetic depth maps, and store some 'temporal' neighbors from previous frames. Our matrix completion algorithm uses these 'spatio-temporal' activation features and the corresponding known pose parameter values to estimate the unknown pose parameters of the input feature vector. Our database of activation features supplements large viewpoint coverage and our hierarchical estimation of pose parameters is robust to occlusions. We show that our approach compares favorably to state-of-the-art methods while achieving real-time performance (≈ 32 FPS) on a standard computer.
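As a rough illustration of the retrieval-plus-completion step, the sketch below (an assumption-laden simplification, not the released implementation) expresses the query feature as a regularized linear combination of the retrieved spatio-temporal neighbors and applies the same combination to their known poses, a least-squares shortcut that stands in for completing the missing pose block of the stacked [features | poses] matrix.

```python
import numpy as np

def estimate_pose(f_query, F_nb, P_nb, lam=1e-3):
    """f_query: (d,) activation feature of the input frame; F_nb: (n, d)
    retrieved spatio-temporal neighbor features; P_nb: (n, m) their known
    pose parameters. All inputs are assumed precomputed."""
    # Solve for combination weights w such that F_nb.T @ w ~= f_query,
    # with ridge regularization for stability.
    G = F_nb @ F_nb.T + lam * np.eye(len(F_nb))
    w = np.linalg.solve(G, F_nb @ f_query)
    return w @ P_nb          # (m,) estimated pose parameters
```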
Citations: 151
Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.374
Zheng Zhang, J. Girard, Yue Wu, Xing Zhang, Peng Liu, U. Ciftci, Shaun J. Canavan, M. Reale, Andy Horowitz, Huiyuan Yang, J. Cohn, Q. Ji, L. Yin
Emotion is expressed in multiple modalities, yet most research has considered at most one or two. This stems in part from the lack of large, diverse, well-annotated, multimodal databases with which to develop and test algorithms. We present a well-annotated, multimodal, multidimensional spontaneous emotion corpus of 140 participants. Emotion inductions were highly varied. Data were acquired from a variety of face sensors, including high-resolution 3D dynamic imaging, high-resolution 2D video, and thermal (infrared) sensing, as well as contact physiological sensors measuring electrical conductivity of the skin, respiration, blood pressure, and heart rate. Experts in the Facial Action Coding System (FACS) annotated the 2D video for both the occurrence and intensity of facial action units. The corpus further includes derived features from 3D, 2D, and IR (infrared) sensors and baseline results for facial expression and action unit detection. The entire corpus will be made available to the research community.
Citations: 299
Single Image Camera Calibration with Lenticular Arrays for Augmented Reality
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.358
Ian Schillebeeckx, Robert Pless
We consider the problem of camera pose estimation for a scenario where the camera may have continuous and unknown changes in its focal length. Understanding frame-by-frame changes in camera focal length is vital to accurately estimating camera pose and vital to accurately rendering virtual objects in a scene with the correct perspective. However, most approaches to camera calibration require geometric constraints from many frames or the observation of a 3D calibration object, neither of which may be feasible in augmented reality settings. This paper introduces a calibration object based on a flat lenticular array that creates a color-coded light field whose observed color changes depending on the angle from which it is viewed. We derive an approach to estimate the focal length of the camera and the relative pose of an object from a single image. We characterize the performance of camera calibration across various focal lengths and camera models, and we demonstrate the advantages of the focal length estimation in rendering a virtual object in a video with constant zooming.
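Under the simplifying assumptions that the lenticular target is fronto-parallel, the color-to-angle decoding is given, and the principal point is known, the focal length follows from the pinhole relation tan(θ) = r/f; the sketch below fits it in least squares (a hypothetical helper, not the paper's estimator, which also recovers relative pose).

```python
import numpy as np

def focal_from_lenticular(pixel_xy, view_angles, principal_point):
    """pixel_xy: (N, 2) image locations on the lenticular target; view_angles:
    (N,) ray angles (radians) decoded from the observed colors; the principal
    point is assumed known and the target fronto-parallel."""
    r = np.linalg.norm(pixel_xy - principal_point, axis=1)  # radial distance (px)
    t = np.tan(view_angles)
    # Pinhole relation tan(theta_i) = r_i / f, fit for f in least squares.
    return float(r @ t / (t @ t))
```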
Citations: 10
Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.453
George Trigeorgis, Patrick Snape, M. Nicolaou, Epameinondas Antonakos, S. Zafeiriou
Cascaded regression has recently become the method of choice for solving non-linear least squares problems such as deformable image alignment. Given a sizeable training set, cascaded regression learns a set of generic rules that are sequentially applied to minimise the least squares problem. Despite the success of cascaded regression for problems such as face alignment and head pose estimation, there are several shortcomings arising in the strategies proposed thus far. Specifically, (a) the regressors are learnt independently, (b) the descent directions may cancel one another out and (c) handcrafted features (e.g., HOG, SIFT, etc.) are mainly used to drive the cascade, which may be sub-optimal for the task at hand. In this paper, we propose a combined and jointly trained convolutional recurrent neural network architecture that allows the training of an end-to-end system that attempts to alleviate the aforementioned drawbacks. The recurrent module facilitates the joint optimisation of the regressors by assuming the cascades form a nonlinear dynamical system, in effect fully utilising the information between all cascade levels by introducing a memory unit that shares information across all levels. The convolutional module allows the network to extract features that are specialised for the task at hand and are experimentally shown to outperform hand-crafted features. We show that the application of the proposed architecture for the problem of face alignment results in a strong improvement over the current state-of-the-art.
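The control flow of such a recurrent cascade can be sketched in a few lines; below, the convolutional module is replaced by a placeholder feature extractor and the RNN cell by plain linear maps, so only the structure, a shared memory threading all descent steps, is illustrative (weights are random here, learned end-to-end in the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, L = 128, 64, 68 * 2            # patch-feature, memory, landmark dims

Wf = rng.normal(0, 0.01, (H, D))     # feature-to-memory weights
Wh = rng.normal(0, 0.01, (H, H))     # memory-to-memory (recurrent) weights
Wo = rng.normal(0, 0.01, (L, H))     # memory-to-landmark-update weights

def patch_features(image, landmarks):
    # Placeholder for the convolutional module: a descriptor of the patches
    # around the current landmark estimate (random stand-in here).
    return rng.normal(size=D)

def mnemonic_descent(image, landmarks0, steps=4):
    h = np.zeros(H)                   # memory shared across all cascade steps
    x = landmarks0.copy()
    for _ in range(steps):            # each step plays one cascade regressor
        f = patch_features(image, x)
        h = np.tanh(Wf @ f + Wh @ h)  # recurrent update ties the steps together
        x = x + Wo @ h                # predicted landmark increment
    return x
```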
Citations: 340
Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.549
Yao-Hung Hubert Tsai, Yi-Ren Yeh, Y. Wang
While domain adaptation (DA) aims to associate the learning tasks across data domains, heterogeneous domain adaptation (HDA) particularly deals with learning from cross-domain data that are described by different types of features. In other words, for HDA, data from source and target domains are observed in separate feature spaces and thus exhibit distinct distributions. In this paper, we propose a novel learning algorithm of Cross-Domain Landmark Selection (CDLS) for solving the above task. With the goal of deriving a domain-invariant feature subspace for HDA, our CDLS is able to identify representative cross-domain data, including the unlabeled ones in the target domain, for performing adaptation. In addition, the adaptation capabilities of such cross-domain landmarks can be determined accordingly. This is the reason why our CDLS is able to achieve promising HDA performance when compared to state-of-the-art HDA methods. We conduct classification experiments using data across different features, domains, and modalities, which successfully verify the effectiveness of our proposed method.
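A toy stand-in for the CDLS idea, aligning heterogeneous domains via labeled class means and then iteratively promoting confidently-predicted unlabeled target samples to landmarks, might look as follows; every function, weighting, and threshold here is an illustrative assumption, and the paper learns a joint subspace rather than this one-sided ridge map.

```python
import numpy as np

def cdls_like_align(Xs, ys, Xt_l, yt_l, Xt_u, lam=1.0, iters=3):
    """Xs/ys: source features and labels; Xt_l/yt_l: labeled target data
    (assumed to cover every class); Xt_u: unlabeled target data. Source and
    target feature dimensions may differ (heterogeneous features)."""
    K = int(ys.max()) + 1
    Ms = np.stack([Xs[ys == k].mean(0) for k in range(K)])        # (K, ds)
    Xl, yl, w = Xt_l, yt_l, np.ones(len(yt_l))
    for _ in range(iters):
        Mt = np.stack([np.average(Xl[yl == k], axis=0, weights=w[yl == k])
                       for k in range(K)])                         # (K, dt)
        # Ridge map from target space to source space fitted on class means.
        A = np.linalg.solve(Mt.T @ Mt + lam * np.eye(Mt.shape[1]), Mt.T @ Ms)
        Z = Xt_u @ A                                               # target -> source
        d = ((Z[:, None, :] - Ms[None]) ** 2).sum(-1)              # dists to means
        pred, conf = d.argmin(1), 1.0 / (1e-9 + d.min(1))
        keep = conf >= np.median(conf)                             # new landmarks
        Xl = np.concatenate([Xt_l, Xt_u[keep]])
        yl = np.concatenate([yt_l, pred[keep]])
        w = np.concatenate([np.ones(len(yt_l)), conf[keep] / conf[keep].max()])
    return A, pred
```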
Citations: 144
Do It Yourself Hyperspectral Imaging with Everyday Digital Cameras
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.270
Seoung Wug Oh, M. S. Brown, M. Pollefeys, Seon Joo Kim
Capturing hyperspectral images requires expensive and specialized hardware that is not readily accessible to most users. Digital cameras, on the other hand, are significantly cheaper in comparison and can be easily purchased and used. In this paper, we present a framework for reconstructing hyperspectral images by using multiple consumer-level digital cameras. Our approach works by exploiting the different spectral sensitivities of different camera sensors. In particular, due to the differences in spectral sensitivities of the cameras, different cameras yield different RGB measurements for the same spectral signal. We introduce an algorithm that is able to combine and convert these different RGB measurements into a single hyperspectral image for both indoor and outdoor scenes. This camera-based approach allows hyperspectral imaging at a fraction of the cost of most existing hyperspectral hardware. We validate the accuracy of our reconstruction against ground-truth hyperspectral images (using both synthetic and real cases) and demonstrate its use in relighting applications.
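The core inverse problem is linear: each camera's RGB triplet is its (3×B) spectral sensitivity matrix applied to the same B-band spectrum, so stacking C cameras gives 3C equations. A minimal sketch with a smoothness prior follows; the paper's actual solver and spectral basis differ.

```python
import numpy as np

def reconstruct_spectrum(rgbs, sensitivities, lam=1e-2):
    """rgbs: list of (3,) RGB measurements of one scene point from C cameras;
    sensitivities: list of (3, B) spectral sensitivity matrices, one per
    camera (assumed known from calibration). Returns the (B,) spectrum."""
    A = np.vstack(sensitivities)          # (3C, B) stacked camera responses
    y = np.concatenate(rgbs)              # (3C,) stacked measurements
    B = A.shape[1]
    D = np.diff(np.eye(B), axis=0)        # first-difference smoothness prior
    return np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ y)
```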
Citations: 70
6D Dynamic Camera Relocalization from Single Reference Image
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.439
Wei Feng, Fei-Peng Tian, Qian Zhang, Ji-zhou Sun
Dynamic relocalization of a 6D camera pose from a single reference image is a costly and challenging task that requires delicate hand-eye calibration and a precision positioning platform for 3D mechanical rotation and translation. In this paper, we show that high-quality camera relocalization can be achieved in a much less expensive way. Based on an inexpensive platform with unreliable absolute repositioning accuracy (ARA), we propose a hand-eye calibration-free strategy to actively relocate the camera into the same 6D pose that produced the input reference image, by sequentially correcting 3D relative rotation and translation. We theoretically prove that, by this strategy, both the rotational and translational relative pose can be effectively reduced to zero, with bounded unknown hand-eye pose displacement. To conquer the 3D rotation and translation ambiguity, this theoretical strategy is further refined into a practical relocalization algorithm with a faster convergence rate and more reliability by jointly adjusting 3D relative rotation and translation. Extensive experiments validate the effectiveness and superior accuracy of the proposed approach in laboratory tests and challenging real-world applications.
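The overall strategy reads as a feedback loop: estimate the relative pose against the reference image, correct rotation first, then translation, and repeat until both residuals vanish. A schematic sketch of that loop, with `platform` and `estimate_relative_pose` as assumed interfaces rather than the paper's algorithm:

```python
import numpy as np

def relocalize(platform, estimate_relative_pose, ref_image,
               rot_tol=1e-3, trans_tol=1e-3, max_iters=20):
    """`platform` is assumed to expose capture()/rotate(R)/translate(t);
    `estimate_relative_pose` returns the current-to-reference rotation R
    and translation t computed from two images."""
    for _ in range(max_iters):
        cur = platform.capture()
        R, t = estimate_relative_pose(cur, ref_image)
        ang = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
        if ang > rot_tol:
            platform.rotate(R)          # correct rotation first ...
        elif np.linalg.norm(t) > trans_tol:
            platform.translate(t)       # ... then residual translation
        else:
            return True                 # pose matches the reference image
    return False
```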
Citations: 18
Constrained Deep Transfer Feature Learning and Its Applications
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.551
Yue Wu, Q. Ji
Feature learning with deep models has achieved impressive results for both data representation and classification for various vision tasks. Deep feature learning, however, typically requires a large amount of training data, which may not be feasible for some application domains. Transfer learning can be one of the approaches to alleviate this problem by transferring data from a data-rich source domain to a data-scarce target domain. Existing transfer learning methods typically perform one-shot transfer learning and often ignore the specific properties that the transferred data must satisfy. To address these issues, we introduce a constrained deep transfer feature learning method that performs transfer learning and feature learning simultaneously, carrying out transfer learning iteratively in a progressively improving feature space in order to better narrow the gap between the target domain and the source domain for effective transfer of data from source to target. Furthermore, we propose to exploit the target domain knowledge and incorporate such prior knowledge as a constraint during transfer learning to ensure that the transferred data satisfies certain properties of the target domain. To demonstrate the effectiveness of the proposed constrained deep transfer feature learning method, we apply it to thermal feature learning for eye detection by transferring from the visible domain. We also applied the proposed method to cross-view facial expression recognition as a second application. The experimental results demonstrate the effectiveness of the proposed method for both applications.
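The alternating scheme can be skeletonized as below; `transfer_fn`, `constraint_ok`, and the `model` interface are placeholders for illustration only, since the paper realizes these with deep networks and target-domain priors.

```python
def constrained_deep_transfer(source_data, target_data, model,
                              transfer_fn, constraint_ok, rounds=5):
    """Iterative transfer + feature learning skeleton (a sketch of the idea,
    not the authors' implementation). `transfer_fn` maps a source sample into
    the current target feature space; `constraint_ok` checks the target-domain
    prior that transferred samples must satisfy."""
    train_set = list(target_data)
    for _ in range(rounds):
        model.fit(train_set)                          # feature learning step
        transferred = [transfer_fn(s, model) for s in source_data]
        accepted = [x for x in transferred if constraint_ok(x)]
        train_set = list(target_data) + accepted      # transfer step feeds back
    return model
```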
Citations: 34
CoMaL: Good Features to Match on Object Boundaries
Pub Date : 2016-06-27 DOI: 10.1109/CVPR.2016.43
Swarna Kamlam Ravindran, Anurag Mittal
Traditional feature detectors and trackers aggregate information in 2D patches to detect and match discriminative patches. However, this information does not remain the same at object boundaries when there is object motion against a significantly varying background. In this paper, we propose a new approach for feature detection, tracking and re-detection that gives significantly improved results at the object boundaries. We utilize level lines or iso-intensity curves that often remain stable and can be reliably detected even at the object boundaries, which they often trace. Stable portions of long level lines are detected, and points of high curvature on such curves serve as corners. Further, the level line is used to separate the portions belonging to the two objects, which is then used for robust matching of such points. While such CoMaL (Corners on Maximally-stable Level Line Segments) points were found to be much more reliable at the object boundary regions, they perform comparably at the interior regions as well. This is illustrated in exhaustive experiments on real-world datasets.
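A toy rendition of the detector side can be written with OpenCV contours and a turning-angle curvature test; note that every level set is sampled here, whereas the paper keeps only maximally stable level line segments.

```python
import cv2
import numpy as np

def level_line_corners(gray, levels=8, min_len=80, k=7, curv_thresh=0.3):
    """gray: single-channel uint8 image. Samples iso-intensity curves and
    keeps high-curvature points on long level lines."""
    corners = []
    for lv in np.linspace(32, 224, levels):
        _, bw = cv2.threshold(gray, lv, 255, cv2.THRESH_BINARY)
        cnts, _ = cv2.findContours(bw, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
        for c in cnts:
            c = c[:, 0, :].astype(np.float64)          # (N, 2) contour points
            if len(c) < min_len:                       # keep only long level lines
                continue
            fwd = np.roll(c, -k, axis=0) - c           # forward chord
            bwd = c - np.roll(c, k, axis=0)            # backward chord
            fwd /= np.linalg.norm(fwd, axis=1, keepdims=True) + 1e-9
            bwd /= np.linalg.norm(bwd, axis=1, keepdims=True) + 1e-9
            turn = 1.0 - (fwd * bwd).sum(1)            # 0 = straight, 2 = U-turn
            corners.extend(c[turn > curv_thresh])      # high-curvature points
    return np.array(corners)
```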
Citations: 7