
Latest Publications: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Unconstrained Face Alignment Without Face Detection
Xiaohu Shao, Junliang Xing, Jiang-Jing Lv, C. Xiao, Pengcheng Liu, Youji Feng, Cheng Cheng
This paper introduces our submission to the 2nd Facial Landmark Localisation Competition. We present a deep architecture that directly detects facial landmarks without using face detection as an initialization. The architecture consists of two stages: a Basic Landmark Prediction Stage and a Whole Landmark Regression Stage. In the first stage, given an input image, the basic landmarks of all faces are detected by a sub-network that predicts landmark heatmaps and affinity fields. In the second stage, a Pose Splitting Layer generates a coarse canonical face and its pose from the visible basic landmarks. According to its pose, each canonical face is routed to the corresponding branch of the shape regression sub-networks for whole-landmark detection. Experimental results show that our method obtains promising results on the 300-W dataset and outperforms the baselines in both the semi-frontal and profile categories of this competition.
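To make the first stage concrete, here is a minimal sketch of how basic landmarks can be decoded from predicted heatmaps: take the per-channel argmax and treat channels whose peak response is weak as invisible landmarks. The helper name, the threshold value, and the (K, H, W) tensor layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def decode_landmarks(heatmaps, threshold=0.2):
    """Recover (x, y) coordinates from landmark heatmaps.

    heatmaps: array of shape (K, H, W), one channel per basic landmark.
    Channels whose peak falls below `threshold` are treated as
    invisible landmarks and reported as None.
    """
    landmarks = []
    for hm in heatmaps:
        if hm.max() < threshold:
            landmarks.append(None)  # landmark not visible
            continue
        y, x = np.unravel_index(hm.argmax(), hm.shape)
        landmarks.append((float(x), float(y)))
    return landmarks
```

The visible subset of these decoded points is what a pose-splitting step would then consume.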
{"title":"Unconstrained Face Alignment Without Face Detection","authors":"Xiaohu Shao, Junliang Xing, Jiang-Jing Lv, C. Xiao, Pengcheng Liu, Youji Feng, Cheng Cheng","doi":"10.1109/CVPRW.2017.258","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.258","url":null,"abstract":"This paper introduces our submission to the 2nd Facial Landmark Localisation Competition. We present a deep architecture to directly detect facial landmarks without using face detection as an initialization. The architecture consists of two stages, a Basic Landmark Prediction Stage and a Whole Landmark Regression Stage. At the former stage, given an input image, the basic landmarks of all faces are detected by a sub-network of landmark heatmap and affinity field prediction. At the latter stage, the coarse canonical face and the pose can be generated by a Pose Splitting Layer based on the visible basic landmarks. According to its pose, each canonical state is distributed to the corresponding branch of the shape regression sub-networks for the whole landmark detection. Experimental results show that our method obtains promising results on the 300-W dataset, and achieves superior performances over the baselines of the semi-frontal and the profile categories in this competition.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"2069-2077"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83230703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Exploring the Granularity of Sparsity in Convolutional Neural Networks
Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, W. Dally
Sparsity helps reduce the computational complexity of DNNs by skipping multiplications with zero. The granularity of sparsity affects both the efficiency of the hardware architecture and the prediction accuracy. In this paper we quantitatively measure the accuracy-sparsity relationship at different granularities. Coarse-grained sparsity yields a more regular sparsity pattern, making hardware acceleration easier, and our experimental results show that coarse-grained sparsity has very little impact on the achievable sparsity ratio given no loss of accuracy. Moreover, owing to the index-saving effect, coarse-grained sparsity obtains similar or even better compression rates than fine-grained sparsity at the same accuracy threshold. Our analysis, based on the framework of a recent sparse convolutional neural network (SCNN) accelerator, further demonstrates that it saves 30-35% of memory references compared with fine-grained sparsity.
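To make the notion of granularity concrete, the NumPy sketch below contrasts fine-grained (per-weight) pruning with coarser kernel-level pruning of a convolutional weight tensor. The function names and the 80% sparsity example are illustrative; the paper itself studies a range of granularities.

```python
import numpy as np

def fine_grained_mask(W, sparsity):
    """Prune individual weights with the smallest magnitudes."""
    thresh = np.quantile(np.abs(W), sparsity)
    return np.abs(W) >= thresh

def kernel_level_mask(W, sparsity):
    """Prune whole 2-D kernels of a (out_ch, in_ch, kh, kw) tensor by L1 norm.

    The coarser granularity yields a more regular pattern and cheaper
    indexing: one index per surviving kernel rather than per weight.
    """
    norms = np.abs(W).sum(axis=(2, 3))          # L1 norm of each kernel
    thresh = np.quantile(norms, sparsity)
    mask = (norms >= thresh)[:, :, None, None]  # broadcast over kh, kw
    return np.broadcast_to(mask, W.shape)

# Example: a 64x32 conv layer with 3x3 kernels at 80% sparsity.
W = np.random.randn(64, 32, 3, 3)
print(fine_grained_mask(W, 0.8).mean())   # ~0.2 of weights kept
print(kernel_level_mask(W, 0.8).mean())   # same ratio, kernel-aligned
```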
{"title":"Exploring the Granularity of Sparsity in Convolutional Neural Networks","authors":"Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, W. Dally","doi":"10.1109/CVPRW.2017.241","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.241","url":null,"abstract":"Sparsity helps reducing the computation complexity of DNNs by skipping the multiplication with zeros. The granularity of sparsity affects the efficiency of hardware architecture and the prediction accuracy. In this paper we quantitatively measure the accuracy-sparsity relationship with different granularity. Coarse-grained sparsity brings more regular sparsity pattern, making it easier for hardware acceleration, and our experimental results show that coarsegrained sparsity have very small impact on the sparsity ratio given no loss of accuracy. Moreover, due to the index saving effect, coarse-grained sparsity is able to obtain similar or even better compression rates than fine-grained sparsity at the same accuracy threshold. Our analysis, which is based on the framework of a recent sparse convolutional neural network (SCNN) accelerator, further demonstrates that it saves 30% – 35% of memory references compared with fine-grained sparsity.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"66 1","pages":"1927-1934"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75816539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 118
Fast Simplex-HMM for One-Shot Learning Activity Recognition
Mario Rodríguez, C. Orrite-Uruñuela, C. Medrano, D. Makris
The work presented in this paper addresses the challenging task of learning an activity-class representation from a single training sequence. The Simplex-HMM framework has recently been shown to be an efficient representation for activity classes; however, its high computational cost makes it impractical in several situations. This paper presents a dimensionality reduction of the feature space based on Maximum a Posteriori adaptation, combined with a fast estimation of the optimal parameters of the Expectation-Maximization algorithm. As confirmed by the experimental results, these two modifications not only reduce the computational cost but also maintain, and even improve, performance. The suitability of the process is confirmed experimentally on the human activity datasets Weizmann, KTH, and IXMAS and on the gesture dataset ChaLearn.
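The paper's specific reduction is not reproduced here, but as a rough illustration of the MAP-adaptation ingredient, the sketch below shrinks the sparse per-state codeword counts available from a single training sequence toward a background prior. The relevance factor tau and all array shapes are hypothetical.

```python
import numpy as np

def map_adapt_emissions(counts, prior, tau=5.0):
    """MAP-adapt multinomial emission probabilities from sparse counts.

    counts: (n_states, n_symbols) codeword counts from the single
            training sequence; prior: (n_symbols,) background
            distribution (must sum to 1); tau: relevance factor.
    With only one sequence the raw counts are unreliable, so each
    state's distribution is pulled toward the background prior.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum(axis=1, keepdims=True)
    return (counts + tau * prior) / (n + tau)
```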
{"title":"Fast Simplex-HMM for One-Shot Learning Activity Recognition","authors":"Mario Rodríguez, C. Orrite-Uruñuela, C. Medrano, D. Makris","doi":"10.1109/CVPRW.2017.166","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.166","url":null,"abstract":"The work presented in this paper deals with the challenging task of learning an activity class representation using a single sequence for training. Recently, Simplex-HMM framework has been shown to be an efficient representation for activity classes, however, it presents high computational costs making it impractical in several situations. A dimensionality reduction of the features spaces based on a Maximum at Posteriori adaptation combined with a fast estimation of the optimal parameters in the Expectation Maximization algorithm are presented in this paper. As confirmed by the experimental results, these two modifications not only reduce the computational cost but also maintain the performance or even improve it. The process suitability is experimentally confirmed using the human activity datasets Weizmann, KTH and IXMAS and the gesture dataset ChaLearn.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"35 1","pages":"1259-1266"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81079084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Locally Adaptive Color Correction for Underwater Image Dehazing and Matching
C. Ancuti, Cosmin Ancuti, C. Vleeschouwer, Rafael García
Underwater images are known to be strongly deteriorated by a combination of wavelength-dependent light attenuation and scattering. This results in complex color casts that depend both on the scene depth map and on the light spectrum. Color transfer, a technique of choice to counterbalance color casts, assumes stationary casts defined by global parameters and is therefore not directly applicable to the locally variable color casts encountered in underwater scenarios. To fill this gap, this paper introduces an original fusion-based strategy that exploits color transfer while tuning the color correction locally, as a function of the light attenuation level estimated from the red channel. The Dark Channel Prior (DCP) is then used to restore the color-compensated image by inverting the simplified Koschmieder light transmission model, as in outdoor dehazing. Our technique enhances image contrast quite effectively and also supports accurate transmission map estimation. Our extensive experiments also show that our color correction strongly improves the effectiveness of local keypoint matching.
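For reference, here is a minimal sketch of the standard Dark Channel Prior restoration step the paper builds on, applied to an already color-compensated RGB image in [0, 1] with a known airlight vector. The fusion-based local color correction itself is not reproduced, and the parameter values follow common DCP practice rather than the paper.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB followed by a local minimum filter."""
    return minimum_filter(img.min(axis=2), size=patch)

def estimate_transmission(img, airlight, omega=0.95, patch=15):
    """Koschmieder-model transmission: t(x) = 1 - omega * dark(I / A)."""
    normalized = img / np.maximum(airlight, 1e-6)
    return 1.0 - omega * dark_channel(normalized, patch)

def dehaze(img, airlight, t0=0.1):
    """Invert I = J*t + A*(1 - t) to recover scene radiance J."""
    t = np.clip(estimate_transmission(img, airlight), t0, 1.0)[..., None]
    return (img - airlight) / t + airlight
```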
{"title":"Locally Adaptive Color Correction for Underwater Image Dehazing and Matching","authors":"C. Ancuti, Cosmin Ancuti, C. Vleeschouwer, Rafael García","doi":"10.1109/CVPRW.2017.136","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.136","url":null,"abstract":"Underwater images are known to be strongly deteriorated by a combination of wavelength-dependent light attenuation and scattering. This results in complex color casts that depend both on the scene depth map and on the light spectrum. Color transfer, which is a technique of choice to counterbalance color casts, assumes stationary casts, defined by global parameters, and is therefore not directly applicable to the locally variable color casts encountered in underwater scenarios. To fill this gap, this paper introduces an original fusion-based strategy to exploit color transfer while tuning the color correction locally, as a function of the light attenuation level estimated from the red channel. The Dark Channel Prior (DCP) is then used to restore the color compensated image, by inverting the simplified Koschmieder light transmission model, as for outdoor dehazing. Our technique enhances image contrast in a quite effective manner and also supports accurate transmission map estimation. Our extensive experiments also show that our color correction strongly improves the effectiveness of local keypoints matching.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"26 3 1","pages":"997-1005"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78820364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 39
Earth Observation Using SAR and Social Media Images
Yuanyuan Wang, Xiaoxiang Zhu
Earth Observation (EO) is mostly carried out through centralized optical and synthetic aperture radar (SAR) missions. Despite the controlled quality of their products, such observation is restricted by the characteristics of the sensor platform, e.g. the revisit time. Over the last decade, the rapid development of social media has accumulated a vast amount of online images. Despite their uncontrolled quality, their sheer volume may contain useful information that can complement the EO missions, especially the SAR missions. This paper presents preliminary work on fusing social media and SAR images. The two have distinct imaging geometries, which make them nearly impossible to coregister without a precise 3-D model. We describe a general approach to coregistering them without using an external 3-D model. We demonstrate that one can obtain a new kind of 3-D city model that includes both the optical texture, for better scene understanding, and the precise deformation retrieved from SAR interferometry.
{"title":"Earth Observation Using SAR and Social Media Images","authors":"Yuanyuan Wang, Xiaoxiang Zhu","doi":"10.1109/CVPRW.2017.202","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.202","url":null,"abstract":"Earth Observation (EO) is mostly carried out through centralized optical and synthetic aperture radar (SAR) missions. Despite the controlled quality of their products, such observation is restricted by the characteristics of the sensor platform, e.g. the revisit time. Over the last decade, the rapid development of social media has accumulated vast amount of online images. Despite their uncontrolled quality, the sheer volume may contain useful information that can complement the EO missions, especially the SAR missions.,,,,,, This paper presents a preliminary work of fusing social media and SAR images. They have distinct imaging geometries, which are nearly impossible to even coregister without a precise 3-D model. We describe a general approach to coregister them without using external 3-D model. We demonstrate that, one can obtain a new kind of 3-D city model that includes the optical texture for better scene understanding and the precise deformation retrieved from SAR interferometry.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"21 1","pages":"1580-1588"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90184514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Parsimonious Coding and Verification of Offline Handwritten Signatures
E. Zois, Ilias Theodorakopoulos, Dimitrios Tsourounis, G. Economou
A common practice for verifying the presence or consent of a person in many transactions is to utilize the handwritten signature. Among others, the offline, or static, signature is a valuable tool in forensics-related studies; verifying static handwritten signatures thus remains a challenging task. Throughout the literature, gray-level images composed of handwritten signature traces are subjected to numerous processing stages, the outcome of which is the mapping of any input signature image into a corresponding feature space. Pattern recognition techniques operate in this feature space, usually posing verification as a binary problem. In this work, sparse dictionary learning and coding are employed for the first time to provide a feature space for offline signature verification; the space intuitively adapts to a small set of randomly selected genuine reference samples, making the approach attractive for forensic cases. In this context, the K-SVD dictionary learning algorithm is employed to create a writer-oriented lexicon. For any signature sample, sparse representation using the writer's lexicon and the Orthogonal Matching Pursuit algorithm generates a weight matrix; features are then extracted by applying simple average pooling to the generated sparse codes. The performance of the proposed scheme is demonstrated on the popular CEDAR, MCYT75, and GPDS300 signature datasets, delivering state-of-the-art results.
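Below is a hedged sketch of the coding pipeline described above, using scikit-learn's dictionary learner as a stand-in for K-SVD (which scikit-learn does not implement), with Orthogonal Matching Pursuit as the sparse coder and average pooling over the resulting codes. All shapes and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Local descriptors extracted from a writer's genuine reference
# signatures, one row per descriptor (hypothetical shapes).
reference_patches = np.random.rand(5000, 64)

# Writer-oriented lexicon; OMP is used as the sparse coder.
dico = MiniBatchDictionaryLearning(
    n_components=128,
    transform_algorithm='omp',
    transform_n_nonzero_coefs=5,
)
dico.fit(reference_patches)

def signature_feature(sample_patches):
    """Sparse-code a questioned signature's descriptors against the
    writer's lexicon, then average-pool into one fixed-length vector."""
    codes = dico.transform(sample_patches)  # (n_patches, 128) weights
    return codes.mean(axis=0)               # simple average pooling

feature = signature_feature(np.random.rand(300, 64))  # 128-D feature
```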
{"title":"Parsimonious Coding and Verification of Offline Handwritten Signatures","authors":"E. Zois, Ilias Theodorakopoulos, Dimitrios Tsourounis, G. Economou","doi":"10.1109/CVPRW.2017.92","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.92","url":null,"abstract":"A common practice for addressing the problem of verifying the presence, or the consent of a person in many transactions is to utilize the handwritten signature. Among others, the offline or static signature is a valuable tool in forensic related studies. Thus, the importance of verifying static handwritten signatures still poses a challenging task. Throughout the literature, gray-level images, composed of handwritten signature traces are subjected to numerous processing stages; their outcome is the mapping of any input signature image in a so-called corresponding feature space. Pattern recognition techniques utilize this feature space, usually as a binary verification problem. In this work, sparse dictionary learning and coding are for the first time employed as a means to provide a feature space for offline signature verification, which intuitively adapts to a small set of randomly selected genuine reference samples, thus making it attractable for forensic cases. In this context, the K-SVD dictionary learning algorithm is employed in order to create a writer oriented lexicon. For any signature sample, sparse representation with the use of the writer's lexicon and the Orthogonal Matching Pursuit algorithm generates a weight matrix; features are then extracted by applying simple average pooling to the generated sparse codes. The performance of the proposed scheme is demonstrated using the popular CEDAR, MCYT75 and GPDS300 signature datasets, delivering state of the art results.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"36 1","pages":"636-645"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89923691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
Caught Red-Handed: Toward Practical Video-Based Subsequences Matching in the Presence of Real-World Transformations
Yi Xu, True Price, F. Monrose, Jan-Michael Frahm
Every minute, staggering amounts of user-generated videos are uploaded to online social networks. These videos can generate significant advertising revenue, providing a strong incentive for unscrupulous individuals who wish to capitalize on this bonanza by pirating short clips from popular content and altering the copied media in ways that might bypass detection. Unfortunately, while the challenges posed by the use of skillful transformations have been known for quite some time, current state-of-the-art methods still suffer from severe limitations; indeed, most of today's techniques perform poorly in the face of real-world copies. To address this, we propose a novel approach that leverages temporal characteristics to identify subsequences of a video that were copied from elsewhere. Our approach exploits a new temporal feature to index a reference library in a manner that is robust to the spatial and temporal transformations popular in pirated videos. Our experimental evaluation on 27 hours of video obtained from social networks demonstrates that our technique significantly outperforms existing state-of-the-art approaches with respect to accuracy, resilience, and efficiency.
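The paper's temporal feature is only summarized above, so the sketch below uses a deliberately simple stand-in, the sign of frame-to-frame mean-intensity change, to illustrate the indexing idea: windowed temporal patterns key a hash table of reference videos and survive many spatial edits, since they depend only on relative changes over time. The window size and helper names are assumptions.

```python
import numpy as np
from collections import defaultdict

def temporal_signature(frame_means, window=16):
    """Windowed sign-of-change patterns over per-frame mean intensity."""
    deltas = np.sign(np.diff(np.asarray(frame_means)))
    return [tuple(deltas[i:i + window])
            for i in range(len(deltas) - window + 1)]

def build_index(reference_videos, window=16):
    """Map each windowed signature to its (video_id, start) occurrences."""
    index = defaultdict(list)
    for vid, means in reference_videos.items():
        for start, sig in enumerate(temporal_signature(means, window)):
            index[sig].append((vid, start))
    return index

def query(index, frame_means, window=16):
    """Vote for reference videos sharing temporal patterns with a clip."""
    hits = defaultdict(int)
    for sig in temporal_signature(frame_means, window):
        for vid, _ in index.get(sig, ()):
            hits[vid] += 1
    return sorted(hits.items(), key=lambda kv: -kv[1])
```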
{"title":"Caught Red-Handed: Toward Practical Video-Based Subsequences Matching in the Presence of Real-World Transformations","authors":"Yi Xu, True Price, F. Monrose, Jan-Michael Frahm","doi":"10.1109/CVPRW.2017.182","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.182","url":null,"abstract":"Every minute, staggering amounts of user-generated videos are uploaded to on-line social networks. These videos can generate significant advertising revenue, providing strong incentive for unscrupulous individuals that wish to capitalize on this bonanza by pirating short clips from popular content and altering the copied media in ways that might bypass detection. Unfortunately, while the challenges posed by the use of skillful transformations has been known for quite some time, current state-of-the-art methods still suffer from severe limitations. Indeed, most of today's techniques perform poorly in the face of real world copies. To address this, we propose a novel approach that leverages temporal characteristics to identify subsequences of a video that were copied from elsewhere. Our approach takes advantage of a new temporal feature to index a reference library in a manner that is robust to popular spatial and temporal transformations in pirated videos. Our experimental evaluation on 27 hours of video obtained from social networks demonstrates that our technique significantly outperforms the existing state-of-the-art approaches with respect to accuracy, resilience, and efficiency.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"51 1","pages":"1397-1406"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89160497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Human Activity Recognition Using Combinatorial Deep Belief Networks
Shreyank N. Gowda
Human activity recognition is a topic undergoing a great amount of research, mainly because of the number of practical applications built on activity recognition. This paper proposes an approach to human activity recognition using a combination of deep belief networks. One network obtains features from motion; for this we propose a modified Weber descriptor. Another network obtains features from images; for this we propose a modification of the standard local binary patterns (LBP) descriptor that yields a concatenated histogram of lower dimensionality. This encodes the spatial and temporal information of the various actions occurring in a frame and helps overcome the dimensionality problem of LBP. The extracted features are then passed to a CNN that classifies the activity. A few standard activities are considered, such as walking, sprinting, and hugging. Results show that the proposed algorithm achieves a high level of classification accuracy.
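As an illustration of the dimensionality point, the sketch below computes a concatenated histogram of uniform LBP codes over a grid of cells; the 'uniform' mapping keeps only P + 2 = 10 bins per cell instead of 2^P = 256, which is one common way to tame LBP's dimensionality. The paper's own modification of LBP is not reproduced here.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def grid_lbp_histogram(gray, grid=(4, 4), P=8, R=1.0):
    """Uniform LBP per grid cell, per-cell histograms concatenated."""
    lbp = local_binary_pattern(gray, P, R, method='uniform')
    n_bins = P + 2                       # uniform codes lie in [0, P+1]
    h_step = gray.shape[0] // grid[0]
    w_step = gray.shape[1] // grid[1]
    hists = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            cell = lbp[i * h_step:(i + 1) * h_step,
                       j * w_step:(j + 1) * w_step]
            hist, _ = np.histogram(cell, bins=n_bins,
                                   range=(0, n_bins), density=True)
            hists.append(hist)
    return np.concatenate(hists)         # 4 * 4 * 10 = 160-D descriptor
```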
{"title":"Human Activity Recognition Using Combinatorial Deep Belief Networks","authors":"Shreyank N. Gowda","doi":"10.1109/CVPRW.2017.203","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.203","url":null,"abstract":"Human activity recognition is a topic undergoing a great amount of research. The main reason for that is the number of practical applications that are developed using activity recognition as the base. This paper proposes an approach to human activity recognition using a combination of deep belief networks. One network is used to obtain features from motion and to do this we propose a modified Weber descriptor. Another network is used to obtain features from images and to do this we propose the modification of the standard local binary patterns descriptor to obtain a concatenated histogram of lower dimensions. This helps to encode spatial and temporal information of various actions happening in a frame. This further helps to overcome the dimensionality problem that occurs with LBP. The features extracted are then passed onto a CNN that classifies the activity. Few standard activities are considered such as walking, sprinting, hugging etc. Results showed that the proposed algorithm gave a high level of accuracy for classification.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"1589-1594"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76460459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 32
A Cost-Effective Framework for Automated Vehicle-Pedestrian Near-Miss Detection Through Onboard Monocular Vision
Ruimin Ke, J. Lutin, J. Spears, Yinhai Wang
Onboard monocular cameras have been widely deployed in both public transit and personal vehicles. Obtaining vehicle-pedestrian near-miss event data from onboard monocular vision systems may be more cost-effective than onboard multi-sensor systems or traffic surveillance videos, but extracting near-misses from onboard monocular vision is challenging and little work has been published. This paper fills the gap by developing a framework to automatically detect vehicle-pedestrian near-misses through onboard monocular vision. The proposed framework can estimate depth and real-world motion information through monocular vision with a moving video background. Experimental results based on processing over 30 hours of video data demonstrate the ability of the system to capture near-misses, validated against the events logged by the Rosco/MobilEye Shield+ system, which uses four cameras working cooperatively. With properly set thresholds, the detection overlap rate exceeds 90%.
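The paper's depth and motion estimation pipeline is not reproduced here; as a hedged illustration of one common monocular heuristic for flagging near-misses, the sketch below estimates time-to-collision from the expansion of a pedestrian's bounding box between frames. Under a pinhole model with constant closing speed, apparent height scales as 1/Z, giving TTC = dt * h_prev / (h_curr - h_prev); the threshold value is hypothetical.

```python
def time_to_collision(h_prev, h_curr, dt):
    """Monocular TTC from bounding-box height expansion.

    h ~ 1/Z under a pinhole model with constant closing speed, so
    TTC = dt * h_prev / (h_curr - h_prev).
    """
    growth = h_curr - h_prev
    if growth <= 0:
        return float('inf')  # not closing in on the pedestrian
    return dt * h_prev / growth

def is_near_miss(h_prev, h_curr, dt=1 / 30, ttc_threshold=1.5):
    """Flag a near-miss when estimated TTC drops below a threshold
    (the 1.5 s value here is an illustrative choice)."""
    return time_to_collision(h_prev, h_curr, dt) < ttc_threshold
```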
{"title":"A Cost-Effective Framework for Automated Vehicle-Pedestrian Near-Miss Detection Through Onboard Monocular Vision","authors":"Ruimin Ke, J. Lutin, J. Spears, Yinhai Wang","doi":"10.1109/CVPRW.2017.124","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.124","url":null,"abstract":"Onboard monocular cameras have been widely deployed in both public transit and personal vehicles. Obtaining vehicle-pedestrian near-miss event data from onboard monocular vision systems may be cost-effective compared with onboard multiple-sensor systems or traffic surveillance videos. But extracting near-misses from onboard monocular vision is challenging and little work has been published. This paper fills the gap by developing a framework to automatically detect vehicle-pedestrian near-misses through onboard monocular vision. The proposed framework can estimate depth and real-world motion information through monocular vision with a moving video background. The experimental results based on processing over 30-hours video data demonstrate the ability of the system to capture near-misses by comparison with the events logged by the Rosco/MobilEye Shield+ system which includes four cameras working cooperatively. The detection overlap rate reaches over 90% with the thresholds properly set.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"97 1","pages":"898-905"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79938720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 22
Cartooning for Enhanced Privacy in Lifelogging and Streaming Videos
Eman T. Hassan, Rakibul Hasan, Patrick Shaffer, David J. Crandall, Apu Kapadia
We describe an object replacement approach whereby privacy-sensitive objects in videos are replaced by abstract cartoons taken from clip art. Our approach uses a combination of computer vision, deep learning, and image processing techniques to detect objects, abstract away their details, and replace them with cartoon clip art. We conducted a user study (N=85) to assess the utility and effectiveness of our cartoon replacement technique. The results suggest that our object replacement approach preserves a video's semantic content while improving its privacy by obscuring object details.
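A minimal sketch of the replacement step, pasting alpha-masked clip art over detected boxes with Pillow; the detector producing the boxes, the labels, and the file paths are hypothetical, and the detection and detail-abstraction stages are out of scope here.

```python
from PIL import Image

def cartoonize_objects(frame_path, detections, clip_art):
    """Replace each detected privacy-sensitive region with clip art.

    detections: list of (label, (left, top, right, bottom)) boxes from
    any off-the-shelf object detector; clip_art: dict mapping a label
    to an RGBA clip-art Image.
    """
    frame = Image.open(frame_path).convert('RGB')
    for label, (l, t, r, b) in detections:
        art = clip_art[label].resize((r - l, b - t))
        frame.paste(art, (l, t), mask=art)  # alpha mask hides original
    return frame
```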
{"title":"Cartooning for Enhanced Privacy in Lifelogging and Streaming Videos","authors":"Eman T. Hassan, Rakibul Hasan, Patrick Shaffer, David J. Crandall, Apu Kapadia","doi":"10.1109/CVPRW.2017.175","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.175","url":null,"abstract":"We describe an object replacement approach whereby privacy-sensitive objects in videos are replaced by abstract cartoons taken from clip art. Our approach uses a combination of computer vision, deep learning, and image processing techniques to detect objects, abstract details, and replace them with cartoon clip art. We conducted a user study (N=85) to discern the utility and effectiveness of our cartoon replacement technique. The results suggest that our object replacement approach preserves a video's semantic content while improving its privacy by obscuring details of objects.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"20 1","pages":"1333-1342"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91366679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 46