
Latest publications from the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Blur vs. Block: Investigating the Effectiveness of Privacy-Enhancing Obfuscation for Images
Yifang Li, Nishant Vishwamitra, Bart P. Knijnenburg, Hongxin Hu, Kelly E. Caine
Computer vision can lead to privacy issues such as unauthorized disclosure of private information and identity theft, but it may also be used to preserve user privacy. For example, using computer vision, we may be able to identify sensitive elements of an image and obfuscate those elements, thereby protecting private information or identity. However, there is a lack of research investigating the effectiveness of applying obfuscation techniques to parts of images as a privacy enhancing technology. In particular, we know very little about how well obfuscation works for human viewers or about users' attitudes towards using these mechanisms. In this paper, we report results from an online experiment with 53 participants that investigates the effectiveness of two exemplar obfuscation techniques: "blurring" and "blocking", and explores users' perceptions of these obfuscations in terms of image satisfaction, information sufficiency, enjoyment, and social presence. Results show that although "blocking" is more effective at de-identification compared to "blurring" or leaving the image "as is", users' attitudes towards "blocking" are the most negative, which creates a conflict between privacy protection and user experience. Future work should explore alternative obfuscation techniques that could protect users' privacy while also providing a good viewing experience.
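To make the two obfuscation conditions concrete, here is a minimal Python/OpenCV sketch of region-level "blurring" and "blocking"; the region coordinates, kernel size, and file name are illustrative assumptions, not the authors' experimental parameters.

```python
import cv2

def blur_region(image, x, y, w, h, ksize=51):
    # Gaussian-blur a rectangular region (one common "blurring" variant).
    roi = image[y:y + h, x:x + w]
    image[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (ksize, ksize), 0)
    return image

def block_region(image, x, y, w, h, color=(0, 0, 0)):
    # Overwrite a rectangular region with a solid box ("blocking").
    cv2.rectangle(image, (x, y), (x + w, y + h), color, thickness=-1)
    return image

# Hypothetical usage: obfuscate a sensitive bounding box found by any detector.
img = cv2.imread("photo.jpg")
blurred = blur_region(img.copy(), 120, 80, 60, 60)
blocked = block_region(img.copy(), 120, 80, 60, 60)
```

Note that Gaussian blurring preserves low-frequency content, which is consistent with its weaker de-identification in the study, while blocking removes the region entirely.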
{"title":"Blur vs. Block: Investigating the Effectiveness of Privacy-Enhancing Obfuscation for Images","authors":"Yifang Li, Nishant Vishwamitra, Bart P. Knijnenburg, Hongxin Hu, Kelly E. Caine","doi":"10.1109/CVPRW.2017.176","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.176","url":null,"abstract":"Computer vision can lead to privacy issues such as unauthorized disclosure of private information and identity theft, but it may also be used to preserve user privacy. For example, using computer vision, we may be able to identify sensitive elements of an image and obfuscate those elements thereby protecting private information or identity. However, there is a lack of research investigating the effectiveness of applying obfuscation techniques to parts of images as a privacy enhancing technology. In particular, we know very little about how well obfuscation works for human viewers or users' attitudes towards using these mechanisms. In this paper, we report results from an online experiment with 53 participants that investigates the effectiveness two exemplar obfuscation techniques: \"blurring\" and \"blocking\", and explores users' perceptions of these obfuscations in terms of image satisfaction, information sufficiency, enjoyment, and social presence. Results show that although \"blocking\" is more effective at de-identification compared to \"blurring\" or leaving the image \"as is\", users' attitudes towards \"blocking\" are the most negative, which creates a conflict between privacy protection and users' experience. Future work should explore alternative obfuscation techniques that could protect users' privacy and also provide a good viewing experience.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"65 1","pages":"1343-1351"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85714754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 62
The First Automatic Method for Mapping the Pothole in Seagrass
M. Rahnemoonfar, M. Yari, Abdullah F. Rahman, Richard J. Kline
There is a vital need to map seagrass ecosystems in order to determine worldwide abundance and distribution. Currently there is no established method for mapping potholes or scars in seagrass. Detection of seagrass with optical remote sensing is challenged by the fact that light is attenuated as it passes through the water column and reflects back from the benthos. Optical remote sensing of seagrass is only possible if the water is shallow and relatively clear. In reality, coastal waters are commonly turbid, and seagrasses can grow under 10 meters of water or even deeper. One of the most precise sensors for mapping seagrass disturbance is side scan sonar. Underwater acoustic mapping produces high-definition, two-dimensional sonar images of seagrass ecosystems. This paper proposes a methodology that detects seagrass potholes in sonar images. Side scan sonar images usually contain speckle noise and uneven illumination across the image. Moreover, the disturbance presents complex patterns on which most segmentation techniques fail. In this paper, image quality is first improved using adaptive thresholding and wavelet denoising techniques. In the next step, a novel level set technique is applied to identify the pothole patterns. Our method is robust to noise and uneven illumination. Moreover, it can detect complex pothole patterns. We tested our proposed approach on a collection of underwater sonar images taken from Laguna Madre in Texas. Experimental results in comparison with the ground truth show the effectiveness of the proposed method.
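The pre-processing stage (wavelet denoising plus adaptive thresholding) can be approximated with off-the-shelf tools. The sketch below assumes PyWavelets and OpenCV with illustrative parameter values; it is not the authors' exact pipeline, and their level set stage is omitted.

```python
import cv2
import numpy as np
import pywt

def denoise_wavelet(gray, wavelet="db4", level=2, sigma=10.0):
    # Soft-threshold the detail coefficients (a simple wavelet-denoising scheme).
    coeffs = pywt.wavedec2(gray.astype(np.float32), wavelet, level=level)
    thresh = sigma * np.sqrt(2 * np.log(gray.size))  # universal threshold
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(d, thresh, mode="soft") for d in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)

def adaptive_binarize(gray, block=51, c=2):
    # Local thresholding compensates for uneven illumination across the image.
    return cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, block, c)

sonar = cv2.imread("sonar_tile.png", cv2.IMREAD_GRAYSCALE)
clean = np.clip(denoise_wavelet(sonar), 0, 255).astype(np.uint8)
mask = adaptive_binarize(clean)  # candidate regions for a level-set stage
```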
{"title":"The First Automatic Method for Mapping the Pothole in Seagrass","authors":"M. Rahnemoonfar, M. Yari, Abdullah F. Rahman, Richard J. Kline","doi":"10.1109/CVPRW.2017.39","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.39","url":null,"abstract":"There is a vital need to map seagrass ecosystems in order to determine worldwide abundance and distribution. Currently there is no established method for mapping the pothole or scars in seagrass. Detection of seagrass with optical remote sensing is challenged by the fact that light is attenuated as it passes through the water column and reflects back from the benthos. Optical remote sensing of seagrass is only possible if the water is shallow and relatively clear. In reality, coastal waters are commonly turbid, and seagrasses can grow under 10 meters of water or even deeper. One of the most precise sensors to map the seagrass disturbance is side scan sonar. Underwater acoustics mapping produces a high definition, two-dimensional sonar image of seagrass ecosystems. This paper proposes a methodology which detects seagrass potholes in sonar images. Side scan sonar images usually contain speckle noise and uneven illumination across the image. Moreover, disturbance presents complex patterns where most segmentation techniques will fail. In this paper, the quality of image is improved in the first stage using adaptive thresholding and wavelet denoising techniques. In the next step, a novel level set technique is applied to identify the pothole patterns. Our method is robust to noise and uneven illumination. Moreover it can detect the complex pothole patterns. We tested our proposed approach on a collection of underwater sonar images taken from Laguna Madre in Texas. Experimental results in comparison with the ground-truth show the efficiency of the proposed method.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"16 1","pages":"267-274"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85996780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Classification of Puck Possession Events in Ice Hockey
Moumita Roy Tora, Jianhui Chen, J. Little
Group activity recognition in sports is often challenging due to the complex dynamics and interaction among the players. In this paper, we propose a recurrent neural network to classify puck possession events in ice hockey. Our method extracts features from the whole frame and from the appearances of the players using a pre-trained convolutional neural network. In this way, our model captures contextual information, individual attributes, and interactions among the players. Our model requires only the player positions on the image and does not need any explicit annotations of individual actions or player trajectories, greatly simplifying the input of the system. We evaluate our model on a new Ice Hockey Dataset. Experimental results show that our model produces competitive results on this challenging dataset with much simpler inputs compared with previous work.
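As a rough illustration of this architecture, the PyTorch sketch below runs a recurrent network over pre-extracted per-frame CNN features; the feature dimension, hidden size, and number of event classes are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class PuckEventClassifier(nn.Module):
    # Toy recurrent classifier over per-frame CNN features.
    def __init__(self, feat_dim=2048, hidden=256, num_events=5):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_events)

    def forward(self, frame_feats):           # (batch, time, feat_dim)
        _, (h_n, _) = self.rnn(frame_feats)   # last hidden state summarizes clip
        return self.head(h_n[-1])             # (batch, num_events) logits

model = PuckEventClassifier()
clip = torch.randn(4, 16, 2048)               # 4 clips, 16 frames each
logits = model(clip)
```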
{"title":"Classification of Puck Possession Events in Ice Hockey","authors":"Moumita Roy Tora, Jianhui Chen, J. Little","doi":"10.1109/CVPRW.2017.24","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.24","url":null,"abstract":"Group activity recognition in sports is often challenging due to the complex dynamics and interaction among the players. In this paper, we propose a recurrent neural network to classify puck possession events in ice hockey. Our method extracts features from the whole frame and appearances of the players using a pre-trained convolutional neural network. In this way, our model captures the context information, individual attributes and interaction among the players. Our model requires only the player positions on the image and does not need any explicit annotations for the individual actions or player trajectories, greatly simplifying the input of the system. We evaluate our model on a new Ice Hockey Dataset. Experimental results show that our model produces competitive results on this challenging dataset with much simpler inputs compared with the previous work.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"9 1","pages":"147-154"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88687015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
Exploration of Social and Web Image Search Results Using Tensor Decomposition
Liuqing Yang, E. Papalexakis
How do socially popular images differ from authoritative images indexed by web search engines? Empirically, social images on platforms such as Twitter often tend to look more diverse and ultimately more "personal", contrary to images returned by web image search, some of which are so-called "stock" images. Are there image features that we can automatically learn which differentiate the two types of image search results, or features that the two have in common? This paper outlines the vision towards achieving this result. We propose a tensor-based approach that learns key features of social and web image search results, and provides a comprehensive framework for analyzing and understanding the similarities and differences between the two types of content. We demonstrate our preliminary results on a small-scale study, and conclude with future research directions for this exciting and novel application.
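One way to realize such a tensor-based analysis is a CP (PARAFAC) decomposition. The sketch below uses TensorLy with a hypothetical (images x features x source) tensor; this layout and the rank are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Hypothetical tensor: (images x features x source), source in {social, web}.
tensor = tl.tensor(np.random.rand(100, 64, 2))

weights, factors = parafac(tensor, rank=5)
image_f, feature_f, source_f = factors
# Rows of source_f (shape 2 x 5) show how strongly each latent component
# loads on social vs. web results; comparing the two rows highlights shared
# and source-specific structure.
```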
{"title":"Exploration of Social and Web Image Search Results Using Tensor Decomposition","authors":"Liuqing Yang, E. Papalexakis","doi":"10.1109/CVPRW.2017.239","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.239","url":null,"abstract":"How do socially popular images differ from authoritative images indexed by web search engines? Empirically, social images on e.g., Twitter often tend to look more diverse and ultimately more \"personal\", contrary to images that are returned by web image search, some of which are so-called \"stock\" images. Are there image features, that we can automatically learn, which differentiate the two types of image search results, or features that the two have in common? This paper outlines the vision towards achieving this result. We propose a tensor-based approach that learns key features of social and web image search results, and provides a comprehensive framework for analyzing and understanding the similarities and differences between the two types types of content. We demonstrate our preliminary results on a small-scale study, and conclude with future research directions for this exciting and novel application.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"24 1","pages":"1915-1920"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84697351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FORMS-Locks: A Dataset for the Evaluation of Similarity Measures for Forensic Toolmark Images
M. Keglevic, Robert Sablatnig
We present a toolmark dataset created using lock cylinders seized during criminal investigations of break-ins. A total of 197 cylinders from 48 linked criminal cases were photographed under a comparison microscope used by forensic experts for toolmark comparisons. In order to allow an assessment of the influence of different lighting conditions, all images were captured using a ring light with 11 different lighting settings. Further, matching image regions in the toolmark images were manually annotated. In addition to the annotated toolmark images and the annotation tool, extracted toolmark patches are provided for training and testing to allow a quantitative comparison of the performance of different similarity measures. Finally, results from an evaluation using a publicly available state-of-the-art image descriptor based on deep learning are presented to provide a baseline for future publications.
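A classical baseline one might benchmark on such patch pairs is normalized cross-correlation; the sketch below is a generic similarity measure, not the dataset's official evaluation protocol or the deep descriptor used in the paper.

```python
import numpy as np

def ncc(patch_a, patch_b):
    # Normalized cross-correlation: +1 for identical patterns, ~0 for unrelated.
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float(np.dot(a, b) / a.size)

# Scoring every candidate pair and ranking: annotated matching regions
# should score higher than non-matching ones if the measure works.
```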
{"title":"FORMS-Locks: A Dataset for the Evaluation of Similarity Measures for Forensic Toolmark Images","authors":"M. Keglevic, Robert Sablatnig","doi":"10.1109/CVPRW.2017.236","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.236","url":null,"abstract":"We present a toolmark dataset created using lock cylinders seized during criminal investigations of break-ins. A total number of 197 cylinders from 48 linked criminal cases were photographed under a comparison microscope used by forensic experts for toolmark comparisons. In order to allow an assessment of the influence of different lighting conditions, all images were captured using a ring light with 11 different lighting settings. Further, matching image regions in the toolmark images were manually annotated. In addition to the annotated toolmark images and the annotation tool, extracted toolmark patches are provided for training and testing to allow a quantitative comparison of the performance of different similarity measures. Finally, results from an evaluation using a publicly available state-of-the-art image descriptor based on deep learning are presented to provide a baseline for future publications.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"21 1","pages":"1890-1897"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87917495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Robust FEC-CNN: A High Accuracy Facial Landmark Detection System
Zhenliang He, Jie Zhang, Meina Kan, S. Shan, Xilin Chen
Facial landmark detection, as a typical and crucial task in computer vision, is widely used in face recognition, face animation, facial expression analysis, etc. In the past decades, many efforts have been devoted to designing robust facial landmark detection algorithms. However, it remains a challenging task due to extreme poses, exaggerated facial expressions, unconstrained illumination, etc. In this work, we propose an effective facial landmark detection system, referred to as Robust FEC-CNN (RFC), which achieves impressive results on facial landmark detection in the wild. Considering the favorable ability of deep convolutional neural networks, we resort to FEC-CNN as a basic method to characterize the complex nonlinearity from face appearance to shape. Moreover, a face bounding box invariance technique is adopted to reduce the sensitivity of landmark localization to the face detector, while a model ensemble strategy further enhances localization performance. We participated in the Menpo Facial Landmark Localisation in-the-Wild Challenge, and our RFC significantly outperforms the baseline approach APS. Extensive experiments on the Menpo Challenge dataset and the IBUG dataset demonstrate the superior performance of the proposed RFC.
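The model ensemble strategy can be as simple as averaging the landmark coordinates predicted by several independently trained models; the sketch below assumes three models and 68-point landmarks, both illustrative values rather than the paper's configuration.

```python
import numpy as np

def ensemble_landmarks(predictions):
    # Average per-model landmark estimates (a simple ensembling strategy).
    # predictions: list of (num_landmarks, 2) arrays from different models.
    return np.mean(np.stack(predictions, axis=0), axis=0)

# Hypothetical usage: three models, 68 landmarks each.
preds = [np.random.rand(68, 2) for _ in range(3)]
fused = ensemble_landmarks(preds)  # (68, 2) fused landmark positions
```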
{"title":"Robust FEC-CNN: A High Accuracy Facial Landmark Detection System","authors":"Zhenliang He, Jie Zhang, Meina Kan, S. Shan, Xilin Chen","doi":"10.1109/CVPRW.2017.255","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.255","url":null,"abstract":"Facial landmark detection, as a typical and crucial task in computer vision, is widely used in face recognition, face animation, facial expression analysis, etc. In the past decades, many efforts are devoted to designing robust facial landmark detection algorithms. However, it remains a challenging task due to extreme poses, exaggerated facial expression, unconstrained illumination, etc. In this work, we propose an effective facial landmark detection system, recorded as Robust FEC-CNN (RFC), which achieves impressive results on facial landmark detection in the wild. Considering the favorable ability of deep convolutional neural network, we resort to FEC-CNN as a basic method to characterize the complex nonlinearity from face appearance to shape. Moreover, face bounding box invariant technique is adopted to reduce the landmark localization sensitivity to the face detector while model ensemble strategy is adopted to further enhance the landmark localization performance. We participate the Menpo Facial Landmark Localisation in-the-Wild Challenge and our RFC significantly outperforms the baseline approach APS. Extensive experiments on Menpo Challenge dataset and IBUG dataset demonstrate the superior performance of the proposed RFC.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"52 1","pages":"2044-2050"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82259484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
Real-Time Driver Drowsiness Detection for Embedded System Using Model Compression of Deep Neural Networks
B. Reddy, Ye-Hoon Kim, Sojung Yun, Chanwon Seo, Junik Jang
Driver status is crucial because one of the main causes of motor vehicle accidents is driver inattention or drowsiness. A drowsiness detector in a car can prevent numerous accidents. Accidents occur because of a single moment of negligence, so a driver monitoring system that works in real time is necessary. This detector should be deployable on an embedded device and perform at high accuracy. In this paper, a novel approach to real-time drowsiness detection based on deep learning is proposed that can be implemented on a low-cost embedded board while performing with high accuracy. The main contribution of our paper is the compression of a heavy baseline model into a lightweight model deployable on an embedded board. Moreover, a minimized network structure was designed based on facial landmark input to recognize whether the driver is drowsy. The proposed model achieved an accuracy of 89.5% on 3-class classification and a speed of 14.9 frames per second (FPS) on a Jetson TK1.
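The abstract does not spell out the compression recipe. One common way to compress a heavy baseline into a lightweight student is knowledge distillation, sketched below in PyTorch with assumed temperature and weighting values; this is a generic recipe, not necessarily the authors' method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft-target distillation: the student matches the teacher's softened
    # output distribution while still fitting the hard labels.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```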
{"title":"Real-Time Driver Drowsiness Detection for Embedded System Using Model Compression of Deep Neural Networks","authors":"B. Reddy, Ye-Hoon Kim, Sojung Yun, Chanwon Seo, Junik Jang","doi":"10.1109/CVPRW.2017.59","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.59","url":null,"abstract":"Driver’s status is crucial because one of the main reasons for motor vehicular accidents is related to driver’s inattention or drowsiness. Drowsiness detector on a car can reduce numerous accidents. Accidents occur because of a single moment of negligence, thus driver monitoring system which works in real-time is necessary. This detector should be deployable to an embedded device and perform at high accuracy. In this paper, a novel approach towards real-time drowsiness detection based on deep learning which can be implemented on a low cost embedded board and performs with a high accuracy is proposed. Main contribution of our paper is compression of heavy baseline model to a light weight model deployable to an embedded board. Moreover, minimized network structure was designed based on facial landmark input to recognize whether driver is drowsy or not. The proposed model achieved an accuracy of 89.5% on 3-class classification and speed of 14.9 frames per second (FPS) on Jetson TK1.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"16 1","pages":"438-445"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90194478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 158
Intel(R) RealSense(TM) Stereoscopic Depth Cameras
L. Keselman, J. Woodfill, A. Grunnet-Jepsen, A. Bhowmik
We present a comprehensive overview of the stereoscopic Intel RealSense RGBD imaging systems. We discuss these systems' mode of operation and functional behavior, and include models of their expected performance, shortcomings, and limitations. We provide information about the systems' optical characteristics and their correlation algorithms, and how these properties can affect different applications, including 3D reconstruction and gesture recognition. Our discussion covers the Intel RealSense R200 and RS400.
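The R200/RS400 ASICs implement proprietary correlation, but the general disparity-to-depth relationship they rely on can be illustrated with OpenCV block matching on a rectified stereo pair; the file names, matcher parameters, and camera intrinsics below are assumed values for illustration.

```python
import cv2
import numpy as np

# Hypothetical rectified stereo pair.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point

focal_px, baseline_m = 700.0, 0.055   # assumed focal length and baseline
# depth = f * B / d; invalid (non-positive) disparities are zeroed out.
depth_m = np.where(disparity > 0, focal_px * baseline_m / disparity, 0.0)
```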
{"title":"Intel(R) RealSense(TM) Stereoscopic Depth Cameras","authors":"L. Keselman, J. Woodfill, A. Grunnet-Jepsen, A. Bhowmik","doi":"10.1109/CVPRW.2017.167","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.167","url":null,"abstract":"We present a comprehensive overview of the stereoscopic Intel RealSense RGBD imaging systems. We discuss these systems' mode-of-operation, functional behavior and include models of their expected performance, shortcomings, and limitations. We provide information about the systems' optical characteristics, their correlation algorithms, and how these properties can affect different applications, including 3D reconstruction and gesture recognition. Our discussion covers the Intel RealSense R200 and RS400.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"13 1","pages":"1267-1276"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90313840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 116
Robust Hand Detection and Classification in Vehicles and in the Wild
T. Le, Kha Gia Quach, Chenchen Zhu, C. Duong, Khoa Luu, M. Savvides
Robust hand detection and classification is one of the most crucial pre-processing steps to support human computer interaction, driver behavior monitoring, virtual reality, etc. This problem, however, is very challenging due to numerous variations of hand images in real-world scenarios. This work presents a novel approach named Multiple Scale Region-based Fully Convolutional Networks (MS-RFCN) to robustly detect and classify human hand regions under various challenging conditions, e.g. occlusions, illumination, and low resolutions. In this approach, the whole image is passed through the proposed fully convolutional network to compute score maps. Those score maps, with their position-sensitive properties, can help to efficiently address the dilemma between translation invariance in classification and translation variance in detection. The method is evaluated on challenging hand databases, i.e. the Vision for Intelligent Vehicles and Applications (VIVA) Challenge and the Oxford hand dataset, and compared against various recent hand detection methods. The experimental results show that our proposed MS-RFCN approach consistently achieves state-of-the-art hand detection results, i.e. Average Precision (AP) / Average Recall (AR) of 95.1% / 94.5% at level 1 and 86.0% / 83.4% at level 2 on the VIVA challenge. In addition, the proposed method achieves state-of-the-art results for the left/right hand and driver/passenger classification tasks on the VIVA database, with significant improvements in AP/AR of ~7% and ~13% for the two classification tasks, respectively. The hand detection performance of MS-RFCN reaches 75.1% AP and 77.8% AR on the Oxford database.
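To illustrate what position-sensitive score maps buy, here is a miniature NumPy sketch of position-sensitive ROI pooling in the spirit of R-FCN; the shapes and the k=3 grid are illustrative assumptions, not the paper's network configuration.

```python
import numpy as np

def ps_roi_pool(score_maps, roi, k=3):
    # Position-sensitive ROI pooling: each of the k*k spatial bins of the ROI
    # pools only from its own dedicated score map, so the maps can encode
    # position (e.g. "top-left of a hand") while the final vote is ROI-level.
    # score_maps: (k*k, C, H, W); roi: (x0, y0, x1, y1), >= k pixels per side.
    x0, y0, x1, y1 = roi
    xs = np.linspace(x0, x1, k + 1).astype(int)
    ys = np.linspace(y0, y1, k + 1).astype(int)
    votes = np.zeros(score_maps.shape[1])
    for i in range(k):          # bin row
        for j in range(k):      # bin column
            bin_scores = score_maps[i * k + j, :,
                                    ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            votes += bin_scores.mean(axis=(1, 2))
    return votes / (k * k)      # averaged per-class score for this ROI

# Hypothetical usage: 9 position-sensitive maps, 2 classes, 40x40 maps.
maps = np.random.rand(9, 2, 40, 40)
print(ps_roi_pool(maps, (5, 5, 35, 35)))
```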
{"title":"Robust Hand Detection and Classification in Vehicles and in the Wild","authors":"T. Le, Kha Gia Quach, Chenchen Zhu, C. Duong, Khoa Luu, M. Savvides","doi":"10.1109/CVPRW.2017.159","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.159","url":null,"abstract":"Robust hand detection and classification is one of the most crucial pre-processing steps to support human computer interaction, driver behavior monitoring, virtual reality, etc. This problem, however, is very challenging due to numerous variations of hand images in real-world scenarios. This work presents a novel approach named Multiple Scale Region-based Fully Convolutional Networks (MSRFCN) to robustly detect and classify human hand regions under various challenging conditions, e.g. occlusions, illumination, low-resolutions. In this approach, the whole image is passed through the proposed fully convolutional network to compute score maps. Those score maps with their position-sensitive properties can help to efficiently address a dilemma between translation-invariance in classification and detection. The method is evaluated on the challenging hand databases, i.e. the Vision for Intelligent Vehicles and Applications (VIVA) Challenge, Oxford hand dataset and compared against various recent hand detection methods. The experimental results show that our proposed MS-FRCN approach consistently achieves the state-of-the-art hand detection results, i.e. Average Precision (AP) / Average Recall (AR) of 95.1% / 94.5% at level 1 and 86.0% / 83.4% at level 2, on the VIVA challenge. In addition, the proposed method achieves the state-of-the-art results for left/right hand and driver/passenger classification tasks on the VIVA database with a significant improvement on AP/AR of ~7% and ~13% for both classification tasks, respectively. The hand detection performance of MS-RFCN reaches to 75.1% of AP and 77.8% of AR on Oxford database.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"32 1","pages":"1203-1210"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91384722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 54
Investigating Nuisance Factors in Face Recognition with DCNN Representation
C. Ferrari, G. Lisanti, S. Berretti, A. Bimbo
Deep learning based approaches have proved to be dramatically effective at addressing many computer vision applications, including "face recognition in the wild". It has been extensively demonstrated that methods exploiting Deep Convolutional Neural Networks (DCNNs) are powerful enough to overcome, to a great extent, many problems that negatively affect computer vision algorithms based on hand-crafted features. These problems include variations in illumination, pose, expression, and occlusion, to name a few. The DCNNs' excellent discriminative power comes from the fact that they learn low- and high-level representations directly from the raw image data. Considering this, it can be assumed that the performance of a DCNN is influenced by the characteristics of the raw image data fed to the network. In this work, we evaluate the effect of different bounding box dimensions, alignment, positioning, and data source on face recognition using DCNNs, and present a thorough evaluation on two well-known, public DCNN architectures.
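One of the nuisance factors studied, bounding box dimension, can be probed by re-cropping each face with different margins before feeding it to the network; the sketch below is a generic illustration with assumed margin values, not the paper's exact protocol.

```python
import cv2

def crop_with_margin(image, box, margin=0.3):
    # Crop a face box enlarged by a relative margin (one "nuisance factor").
    x, y, w, h = box
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(x - dx, 0), max(y - dy, 0)
    x1 = min(x + w + dx, image.shape[1])
    y1 = min(y + h + dy, image.shape[0])
    return image[y0:y1, x0:x1]

# Sweeping margin over e.g. {0.0, 0.1, 0.3, 0.5} and re-running the DCNN
# quantifies how sensitive recognition accuracy is to box dimensions.
```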
{"title":"Investigating Nuisance Factors in Face Recognition with DCNN Representation","authors":"C. Ferrari, G. Lisanti, S. Berretti, A. Bimbo","doi":"10.1109/CVPRW.2017.86","DOIUrl":"https://doi.org/10.1109/CVPRW.2017.86","url":null,"abstract":"Deep learning based approaches proved to be dramatically effective to address many computer vision applications, including \"face recognition in the wild\". It has been extensively demonstrated that methods exploiting Deep Convolutional Neural Networks (DCNN) are powerful enough to overcome to a great extent many problems that negatively affected computer vision algorithms based on hand-crafted features. These problems include variations in illumination, pose, expression and occlusion, to mention some. The DCNNs excellent discriminative power comes from the fact that they learn low-and high-level representations directly from the raw image data. Considering this, it can be assumed that the performance of a DCNN are influenced by the characteristics of the raw image data that are fed to the network. In this work, we evaluate the effect of different bounding box dimensions, alignment, positioning and data source on face recognition using DCNNs, and present a thorough evaluation on two well known, public DCNN architectures.","PeriodicalId":6668,"journal":{"name":"2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"19 1","pages":"583-591"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81111976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11