Performance of object recognition in wearable videos

Alberto Sabater, L. Montesano, A. C. Murillo
Published in: 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), pp. 1813-1820, September 2019.
DOI: 10.1109/ETFA.2019.8869019
Citations: 2

Abstract

Wearable technologies are enabling many new applications of computer vision, from life logging to health assistance. Many of these applications are required to recognize the elements of interest in the scene captured by the camera. This work studies the problem of object detection and localization in videos captured by this type of camera. Wearable videos are a much more challenging scenario for object detection than standard images or even other types of video, due to the lower image quality (e.g., poor focus) and the high clutter and occlusion common in wearable recordings. Existing work typically focuses on detecting the objects in focus or those being manipulated by the user wearing the camera. We perform a more general evaluation of the task of object detection in this type of video, because numerous applications, such as marketing studies, also need to detect objects that are not in the user's focus. This work presents a thorough study of the well-known YOLO architecture, which offers an excellent trade-off between accuracy and speed, for the particular case of object detection in wearable video. We focus our study on the public ADL Dataset, but we also use additional public data for complementary evaluations. We run an exhaustive set of experiments with different variations of the original architecture and its training strategy. Our experiments lead to several conclusions about the most promising directions for our goal and point us to further research steps to improve detection in wearable videos.
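Evaluations like the one described above typically score a detector frame by frame, matching each predicted bounding box to a ground-truth box when their intersection-over-union (IoU) exceeds a threshold. The sketch below is illustrative only (not the paper's code); the box format `(x1, y1, x2, y2)`, the greedy matching, and the 0.5 threshold are common conventions assumed here.

```python
# Illustrative sketch: frame-level matching of detections to ground truth
# via IoU, as commonly used to score object detectors such as YOLO.
# Box format (x1, y1, x2, y2) and the 0.5 threshold are assumptions.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_frame(preds, gts, thr=0.5):
    """Greedily match predictions to ground truth at IoU >= thr.

    Returns (true_positives, false_positives, false_negatives)
    for a single video frame.
    """
    unmatched = list(range(len(gts)))
    tp = 0
    for p in preds:
        best, best_iou = None, thr
        for j in unmatched:
            v = iou(p, gts[j])
            if v >= best_iou:
                best, best_iou = j, v
        if best is not None:
            unmatched.remove(best)
            tp += 1
    return tp, len(preds) - tp, len(unmatched)
```

Accumulating these per-frame counts over a whole recording yields precision and recall, the basis of the mAP-style metrics usually reported for detection benchmarks such as ADL.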