Egocentric recognition of handled objects: Benchmark and analysis

Xiaofeng Ren, Matthai Philipose
{"title":"Egocentric recognition of handled objects: Benchmark and analysis","authors":"Xiaofeng Ren, Matthai Philipose","doi":"10.1109/CVPRW.2009.5204360","DOIUrl":null,"url":null,"abstract":"Recognizing objects being manipulated in hands can provide essential information about a person's activities and have far-reaching impacts on the application of vision in everyday life. The egocentric viewpoint from a wearable camera has unique advantages in recognizing handled objects, such as having a close view and seeing objects in their natural positions. We collect a comprehensive dataset and analyze the feasibilities and challenges of the egocentric recognition of handled objects. We use a lapel-worn camera and record uncompressed video streams as human subjects manipulate objects in daily activities. We use 42 day-to-day objects that vary in size, shape, color and textureness. 10 video sequences are shot for each object under different illuminations and backgrounds. We use this dataset and a SIFT-based recognition system to analyze and quantitatively characterize the main challenges in egocentric object recognition, such as motion blur and hand occlusion, along with its unique constraints, such as hand color, location prior and temporal consistency. SIFT-based recognition has an average recognition rate of 12%, and reaches 20% through enforcing temporal consistency. We use simulations to estimate the upper bound for SIFT-based recognition at 64%, the loss of accuracy due to background clutter at 20%, and that of hand occlusion at 13%. Our quantitative evaluations show that the egocentric recognition of handled objects is a challenging but feasible problem with many unique characteristics and many opportunities for future research.","PeriodicalId":431981,"journal":{"name":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"80 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"114","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW.2009.5204360","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 114

Abstract

Recognizing objects being manipulated in hands can provide essential information about a person's activities and have far-reaching impacts on the application of vision in everyday life. The egocentric viewpoint from a wearable camera has unique advantages in recognizing handled objects, such as having a close view and seeing objects in their natural positions. We collect a comprehensive dataset and analyze the feasibilities and challenges of the egocentric recognition of handled objects. We use a lapel-worn camera and record uncompressed video streams as human subjects manipulate objects in daily activities. We use 42 day-to-day objects that vary in size, shape, color and textureness. 10 video sequences are shot for each object under different illuminations and backgrounds. We use this dataset and a SIFT-based recognition system to analyze and quantitatively characterize the main challenges in egocentric object recognition, such as motion blur and hand occlusion, along with its unique constraints, such as hand color, location prior and temporal consistency. SIFT-based recognition has an average recognition rate of 12%, and reaches 20% through enforcing temporal consistency. We use simulations to estimate the upper bound for SIFT-based recognition at 64%, the loss of accuracy due to background clutter at 20%, and that of hand occlusion at 13%. Our quantitative evaluations show that the egocentric recognition of handled objects is a challenging but feasible problem with many unique characteristics and many opportunities for future research.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
被处理对象的自我中心识别:基准与分析
识别手中被操纵的物体可以提供关于人的活动的基本信息,并对视觉在日常生活中的应用产生深远的影响。可穿戴相机的自我中心视角在识别处理过的物体方面具有独特的优势,例如可以近距离观察和看到物体的自然位置。我们收集了一个完整的数据集,分析了以自我为中心识别被处理物体的可行性和挑战。我们使用一个挂在翻领上的摄像头,记录人类在日常活动中操纵物体时未压缩的视频流。我们使用42种日常物品,它们的大小、形状、颜色和质感各不相同。在不同光照和背景下,为每个物体拍摄10个视频序列。我们使用该数据集和基于sift的识别系统来分析和定量表征自我中心目标识别中的主要挑战,如运动模糊和手部遮挡,以及其独特的约束,如手部颜色、位置先验和时间一致性。基于sift的识别平均识别率为12%,通过增强时间一致性,识别率达到20%。我们使用模拟来估计基于sift的识别的上限为64%,背景杂波导致的准确度损失为20%,手部遮挡导致的准确度损失为13%。我们的定量评估表明,自我中心识别是一个具有挑战性但可行的问题,具有许多独特的特征和许多未来研究的机会。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Robust real-time 3D modeling of static scenes using solely a Time-of-Flight sensor Image matching in large scale indoor environment Learning to segment using machine-learned penalized logistic models Modeling and exploiting the spatio-temporal facial action dependencies for robust spontaneous facial expression recognition Fuzzy statistical modeling of dynamic backgrounds for moving object detection in infrared videos
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1