An audiovisual attention model for natural conversation scenes

A. Coutrot, Nathalie Guyader
DOI: 10.1109/ICIP.2014.7025219
Published in: 2014 IEEE International Conference on Image Processing (ICIP), pp. 1100-1104, 27 October 2014
Citations: 36

Abstract

Classical visual attention models neither consider social cues, such as faces, nor auditory cues, such as speech. However, faces are known to capture visual attention more than any other visual features, and recent studies showed that speech turn-taking affects the gaze of non-involved viewers. In this paper, we propose an audiovisual saliency model able to predict the eye movements of observers viewing other people having a conversation. Thanks to a speaker diarization algorithm, our audiovisual saliency model increases the saliency of the speakers compared to the addressees. We evaluated our model with eye-tracking data, and found that it significantly outperforms visual attention models using an equal and constant saliency value for all faces.
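The core idea — boosting the saliency of face regions, with a larger boost for the current speaker identified by diarization than for the addressees — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, weights, and Gaussian face model are hypothetical, and the speaker index stands in for the output of an actual speaker-diarization step.

```python
import numpy as np

def audiovisual_saliency(base_map, faces, speaker_idx,
                         w_speaker=2.0, w_addressee=1.0, sigma=20.0):
    """Boost face regions in a low-level saliency map, weighting the
    active speaker more than the addressees.

    base_map    : 2-D array, visual saliency from any bottom-up model.
    faces       : list of (row, col) face centres.
    speaker_idx : index into `faces` of the active speaker, as would be
                  provided by a speaker-diarization step; None if silent.
    Weights and the Gaussian spread are illustrative values only.
    """
    h, w = base_map.shape
    rows, cols = np.mgrid[0:h, 0:w]
    out = base_map.astype(float).copy()
    for i, (r, c) in enumerate(faces):
        weight = w_speaker if i == speaker_idx else w_addressee
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
        out += weight * blob
    return out / out.max()  # normalise to [0, 1]
```

With a uniform base map and two faces, the map value at the speaker's face exceeds the value at the addressee's face, which is the asymmetry the paper exploits; a model assigning all faces an equal, constant weight would produce identical values at both.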