The acoustics of eye contact: detecting visual attention from conversational audio cues

GazeIn '13 Pub Date : 2013-12-13 DOI:10.1145/2535948.2535949
F. Eyben, F. Weninger, L. Paletta, Björn Schuller
Citations: 11

Abstract

An important aspect of short dialogues is attention, as manifested by eye contact between subjects. In this study we provide a first analysis of whether such visual attention is evident in the acoustic properties of a speaker's voice. To this end we introduce the multi-modal GRAS2 corpus, which was recorded to analyse attention in short, daily-life human-to-human interactions with strangers in public places in Graz, Austria. The corpus contains recordings of four test subjects equipped with eye-tracking glasses, three audio recording devices, and motion sensors. We describe how we robustly identify speech segments from the subjects and other people in an unsupervised manner from the multi-channel recordings. We then discuss correlations between the acoustics of the voice in these segments and the subjects' point of visual attention. We find a significant relation between the acoustic features and the distance between the subject's point of gaze and the eye region of the dialogue partner. Further, we show that automatic binary classification of eye contact vs. no eye contact from acoustic features alone is feasible, with an Unweighted Average Recall of up to 70%.
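The reported evaluation metric, Unweighted Average Recall (UAR), is the mean of the per-class recalls, so both classes count equally even when one (e.g. eye contact) is far more frequent than the other. The abstract does not give the paper's implementation; the sketch below is only a minimal illustration of how UAR is computed, with hypothetical label names.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: each class contributes equally,
    regardless of how many samples it has (robust to class imbalance)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

# Hypothetical imbalanced example: 8 "eye" frames, 2 "no_eye" frames.
y_true = ["eye"] * 8 + ["no_eye"] * 2
y_pred = ["eye"] * 8 + ["no_eye", "eye"]
# recall(eye) = 8/8 = 1.0, recall(no_eye) = 1/2 = 0.5
print(unweighted_average_recall(y_true, y_pred))  # → 0.75
```

Note that plain accuracy on this example would be 9/10 = 0.9; UAR's lower 0.75 reflects the poor recall on the minority class, which is why it is the standard metric in this line of paralinguistics work.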