Information Fusion in Visual-Task Inference

Amin Haji Abolhassani, James J. Clark
DOI: 10.1109/CRV.2012.14
Published in: 2012 Ninth Conference on Computer and Robot Vision, 2012-05-28
Citations: 1

Abstract

Eye movement is a rich modality that can provide a window into a person's mind. In typical human-human interaction, we can gather information about the behavioral state of others by examining their eye movements. For instance, when a poker player looks into an opponent's eyes, he searches for any indication of bluffing in the dynamics of the eye movements. However, the eyes are not the only source of information in human-human interaction; other modalities, such as speech or gesture, also help us infer the behavioral state of others. Most of the time, this fusion of information refines our decisions and helps us better infer people's cognitive and behavioral activity from their actions. In this paper, we develop a probabilistic framework that fuses different sources of information to infer the ongoing task in a visual search activity, given the viewer's eye movement data. We propose to use a dynamic programming method called token passing in an eye-typing application to reveal what a subject is typing during a search process, by observing the direction of gaze during the execution of the task. Token passing is a computationally simple technique that allows us to fuse higher-order constraints into the inference process and to build models dynamically, so the number of hypotheses is unbounded. In the experiments, we examine the effect of higher-order information, in the form of a lexicon, on task recognition accuracy.
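To make the abstract's central idea concrete, the token-passing decoder it describes can be sketched as a Viterbi-style search in which each token carries a partial word hypothesis and an accumulated log-score, and a lexicon supplies the higher-order constraint by pruning hypotheses that cannot extend to a dictionary word. The keyboard layout, the Gaussian gaze-noise model, and the toy lexicon below are all illustrative assumptions, not the paper's actual experimental setup.

```python
import math

# Hypothetical on-screen keyboard: each letter gets a 2-D key position (x, y).
KEY_POS = {c: (i % 6, i // 6) for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}

# Toy lexicon providing the higher-order constraint: only transitions that
# extend a prefix of some dictionary word may carry a token forward.
LEXICON = {"cat", "cab", "bat"}
PREFIXES = {w[:i] for w in LEXICON for i in range(1, len(w) + 1)}

def gaze_log_likelihood(fixation, letter, sigma=0.7):
    """Log-likelihood of a gaze fixation given the intended key,
    assuming isotropic Gaussian gaze noise around the key center."""
    kx, ky = KEY_POS[letter]
    fx, fy = fixation
    return -((fx - kx) ** 2 + (fy - ky) ** 2) / (2 * sigma ** 2)

def token_passing(fixations):
    """Return the most likely typed word for a sequence of gaze fixations.

    Each token is a (partial word -> log-score) entry; at every fixation the
    surviving tokens are extended by one letter, and the lexicon prunes
    hypotheses that cannot become a dictionary word.
    """
    tokens = {"": 0.0}  # start with the empty prefix at log-probability 0
    for fix in fixations:
        new_tokens = {}
        for prefix, score in tokens.items():
            for letter in KEY_POS:
                word = prefix + letter
                if word not in PREFIXES:
                    continue  # higher-order constraint: prune non-words early
                s = score + gaze_log_likelihood(fix, letter)
                if s > new_tokens.get(word, -math.inf):
                    new_tokens[word] = s  # keep only the best token per prefix
        tokens = new_tokens
    complete = {w: s for w, s in tokens.items() if w in LEXICON}
    return max(complete, key=complete.get) if complete else None
```

With three fixations landing near the keys for "c", "a", and "t", the decoder recovers "cat"; because only the best token per prefix is kept, the cost per fixation stays linear in the number of live prefixes, which is what makes token passing computationally simple even as hypotheses are built dynamically.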