Hand-Priming in Object Localization for Assistive Egocentric Vision.

Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri
{"title":"Hand-Priming in Object Localization for Assistive Egocentric Vision.","authors":"Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri","doi":"10.1109/wacv45572.2020.9093353","DOIUrl":null,"url":null,"abstract":"<p><p>Egocentric vision holds great promises for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the user; the object may not even be included in the frame due to challenges in camera aiming without visual feedback. Also, gaze information, commonly used to infer the area of interest in egocentric vision, is often not dependable. However, blind users often tend to include their hand either interacting with the object that they wish to recognize or simply placing it in proximity for better camera aiming. We propose localization models that leverage the presence of the hand as the contextual information for priming the center area of the object of interest. In our approach, hand segmentation is fed to either the entire localization network or its last convolutional layers. Using egocentric datasets from sighted and blind individuals, we show that the hand-priming achieves higher precision than other approaches, such as fine-tuning, multi-class, and multi-task learning, which also encode hand-object interactions in localization.</p>","PeriodicalId":73325,"journal":{"name":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7423407/pdf/nihms-1609047.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Winter Conference on Applications of Computer Vision. IEEE Winter Conference on Applications of Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/wacv45572.2020.9093353","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2020/5/14 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Egocentric vision holds great promise for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the user; the object may not even be included in the frame due to challenges in camera aiming without visual feedback. Also, gaze information, commonly used to infer the area of interest in egocentric vision, is often not dependable. However, blind users often tend to include their hand, either interacting with the object that they wish to recognize or simply placing it in proximity for better camera aiming. We propose localization models that leverage the presence of the hand as contextual information for priming the center area of the object of interest. In our approach, hand segmentation is fed to either the entire localization network or its last convolutional layers. Using egocentric datasets from sighted and blind individuals, we show that hand-priming achieves higher precision than other approaches that also encode hand-object interactions in localization, such as fine-tuning, multi-class, and multi-task learning.
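
To make the two fusion points concrete, below is a minimal PyTorch sketch, not the authors' released code: the hand-segmentation mask is fused either with the input to the entire localization network ("input") or with its last convolutional feature map ("last_conv"). The backbone depth, channel widths, layer names, and the (cx, cy, w, h) box head are illustrative assumptions; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class HandPrimedLocalizer(nn.Module):
    """Toy localizer primed with a hand-segmentation mask (illustrative only)."""

    def __init__(self, fuse_at: str = "input"):
        super().__init__()
        assert fuse_at in ("input", "last_conv")
        self.fuse_at = fuse_at
        # Early fusion: RGB (3 channels) + hand mask (1 channel) = 4 channels.
        in_ch = 4 if fuse_at == "input" else 3
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Late fusion: the downsampled mask is concatenated with the
        # last convolutional feature map instead of the raw input.
        head_ch = 64 + (1 if fuse_at == "last_conv" else 0)
        self.head = nn.Sequential(
            nn.Conv2d(head_ch, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 4),  # hypothetical (cx, cy, w, h) box output
        )

    def forward(self, image: torch.Tensor, hand_mask: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); hand_mask: (B, 1, H, W) hand segmentation.
        if self.fuse_at == "input":
            feats = self.backbone(torch.cat([image, hand_mask], dim=1))
        else:
            feats = self.backbone(image)
            mask_small = nn.functional.interpolate(
                hand_mask, size=feats.shape[-2:], mode="bilinear",
                align_corners=False,
            )
            feats = torch.cat([feats, mask_small], dim=1)
        return self.head(feats)


# Example: either fusion variant maps an image + hand mask to one box.
model = HandPrimedLocalizer(fuse_at="last_conv")
boxes = model(torch.randn(2, 3, 224, 224), torch.rand(2, 1, 224, 224))
print(boxes.shape)  # torch.Size([2, 4])
```

Either variant lets the network condition its prediction on where the hand is, which is the core idea of hand-priming; the early-fusion variant exposes the mask to every layer, while the late-fusion variant only modulates the final localization head.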
