A comparison of CNN-based face and head detectors for real-time video surveillance applications

Le Thanh Nguyen-Meidine, Eric Granger, M. Kiran, Louis-Antoine Blais-Morin
Published in: 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), 2017-11-01
DOI: 10.1109/IPTA.2017.8310113 (https://doi.org/10.1109/IPTA.2017.8310113)
Citations: 37

Abstract

Detecting faces and heads appearing in video feeds is a challenging task in real-world video surveillance applications due to variations in appearance, occlusions, and complex backgrounds. Recently, several CNN architectures have been proposed to increase the accuracy of detectors, although their computational complexity can be an issue, especially for real-time applications, where faces and heads must be detected live using high-resolution cameras. This paper compares the accuracy and complexity of state-of-the-art CNN architectures that are suitable for face and head detection. Single-pass and region-based architectures are reviewed and compared empirically to baseline techniques in terms of accuracy and of time and memory complexity, on images from several challenging datasets. The viability of these architectures is analyzed with real-time video surveillance applications in mind. Results suggest that, although CNN architectures can achieve a very high level of accuracy compared to traditional detectors, their computational cost can represent a limitation for many practical real-time applications.
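
As an illustration of the time and memory comparison the abstract describes, the following is a minimal sketch of how a detector's per-frame latency could be checked against a real-time budget. It is not the authors' benchmarking code. The names `profile_detector` and `dummy_detect`, the 30 frames-per-second target, and the use of `tracemalloc` (which only tracks Python-level allocations, not GPU memory) are all assumptions made for this example.

```python
import time
import tracemalloc
from statistics import mean


def profile_detector(detect, frames, target_fps=30.0):
    """Time a detector on each frame and compare it to a frame-rate budget.

    `detect` is a hypothetical callable: frame -> list of bounding boxes.
    `target_fps` is an assumed real-time threshold, not a value from the paper.
    """
    latencies = []
    tracemalloc.start()  # tracks Python heap allocations only
    for frame in frames:
        start = time.perf_counter()
        detect(frame)
        latencies.append(time.perf_counter() - start)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    mean_latency = mean(latencies)
    fps = 1.0 / mean_latency if mean_latency > 0 else float("inf")
    return {
        "mean_latency_s": mean_latency,
        "fps": fps,
        "peak_python_mem_bytes": peak_bytes,
        "real_time": fps >= target_fps,
    }


def dummy_detect(frame):
    # Stand-in for a real detector: sleeps briefly and returns one fake box.
    time.sleep(0.001)
    return [(0, 0, 10, 10)]


if __name__ == "__main__":
    frames = [[0] * 100 for _ in range(5)]  # placeholder "images"
    print(profile_detector(dummy_detect, frames, target_fps=30.0))
```

In practice `dummy_detect` would be replaced by a call into an actual face or head detector, and GPU memory would need to be measured with the framework's own tools.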