
GazeIn '13: Latest Publications

A dominance estimation mechanism using eye-gaze and turn-taking information
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535956
Misato Yatsushiro, Naoya Ikeda, Yuki Hayashi, Y. Nakano
With the goal of contributing to multiparty conversation management, this paper proposes a mechanism for estimating conversational dominance in group interaction. Based on our corpus analysis, we have already established a regression model for dominance estimation using speech and gaze information. In this study, we implement the model as a dominance estimation mechanism and propose using it to moderate multiparty conversations between a conversational robot and three human users. The system decides whom it should talk to based on the dominance level of each user.
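The abstract does not spell out the regression features or how the robot acts on the scores. As a rough sketch only, the snippet below assumes per-user speech share, turn share, and received gaze as inputs, uses placeholder weights in place of the learned coefficients, and adopts one plausible moderation policy (addressing the least dominant user).

```python
# Hypothetical sketch of a regression-based dominance estimator; the real
# features and coefficients come from the authors' corpus analysis and are
# not reproduced here -- the numbers below are placeholders.

def dominance_score(speech_share, turn_share, gaze_received,
                    weights=(0.5, 0.3, 0.2), bias=0.0):
    """Linear regression over speech and gaze features (placeholder weights)."""
    w1, w2, w3 = weights
    return bias + w1 * speech_share + w2 * turn_share + w3 * gaze_received

def pick_addressee(users):
    """One plausible policy: address the least dominant user to balance participation."""
    return min(users, key=lambda u: dominance_score(
        u["speech_share"], u["turn_share"], u["gaze_received"]))

users = [
    {"name": "A", "speech_share": 0.60, "turn_share": 0.57, "gaze_received": 0.55},
    {"name": "B", "speech_share": 0.27, "turn_share": 0.29, "gaze_received": 0.25},
    {"name": "C", "speech_share": 0.13, "turn_share": 0.14, "gaze_received": 0.20},
]
print(pick_addressee(users)["name"])  # -> "C"
```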
Citations: 0
Learning aspects of interest from Gaze
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535955
Kei Shimonishi, H. Kawashima, Ryo Yonetani, Erina Ishikawa, T. Matsuyama
This paper presents a probabilistic framework to model the gaze generative process when a user is browsing content consisting of multiple regions. The model enables us to learn multiple aspects of interest from gaze data, to represent and estimate the user's interest as a mixture of aspects, and to predict gaze behavior in a unified framework. We recorded gaze data from subjects while they browsed a digital pictorial book and confirmed the effectiveness of the proposed model in predicting the gaze target.
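The generative model itself is not given in the abstract. As a loose illustration of "interest as a mixture of aspects", the sketch below assumes each aspect is a fixed categorical distribution over content regions and fits the mixture weights to observed fixation counts with a few EM steps; the aspect distributions and counts are invented.

```python
import numpy as np

aspects = np.array([           # rows: aspects, cols: content regions (hypothetical)
    [0.7, 0.2, 0.1],           # aspect 0 mostly attends to region 0
    [0.1, 0.3, 0.6],           # aspect 1 mostly attends to region 2
])
counts = np.array([12, 5, 9])  # observed fixation counts per region

weights = np.full(len(aspects), 1.0 / len(aspects))
for _ in range(50):                              # EM for the mixture weights
    resp = weights[:, None] * aspects            # responsibility numerator
    resp /= resp.sum(axis=0, keepdims=True)      # normalize per region
    weights = (resp * counts).sum(axis=1)
    weights /= weights.sum()

predicted_gaze = weights @ aspects               # predicted region distribution
print(weights.round(3), predicted_gaze.round(3))
```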
Citations: 4
The acoustics of eye contact: detecting visual attention from conversational audio cues
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535949
F. Eyben, F. Weninger, L. Paletta, Björn Schuller
An important aspect of short dialogues is attention, as manifested by eye contact between subjects. In this study we provide a first analysis of whether such visual attention is evident in the acoustic properties of a speaker's voice. We introduce the multi-modal GRAS2 corpus, which was recorded for analysing attention in short daily-life human-to-human interactions with strangers in public places in Graz, Austria. The corpus contains recordings of four test subjects equipped with eye-tracking glasses, three audio recording devices, and motion sensors. We describe how we robustly identify speech segments from the subjects and other people in an unsupervised manner from the multi-channel recordings. We then discuss correlations between the acoustics of the voice in these segments and the subjects' point of visual attention. A significant relation is found between the acoustic features and the distance between the point of view and the eye region of the dialogue partner. Further, we show that automatic binary classification of eye-contact vs. no eye-contact from acoustic features alone is feasible, with an Unweighted Average Recall of up to 70%.
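For readers unfamiliar with the reported metric, the helper below shows how Unweighted Average Recall is computed for the binary eye-contact / no-eye-contact case; the labels are made up for illustration.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so both classes count equally regardless of size."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

y_true = ["eye", "eye", "eye", "no", "no", "no", "no", "no"]
y_pred = ["eye", "eye", "no",  "no", "no", "eye", "no", "no"]
print(round(unweighted_average_recall(y_true, y_pred), 3))  # 0.733
```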
Citations: 11
Mutual disambiguation of eye gaze and speech for sight translation and reading
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535953
Rucha Kulkarni, Kritika Jain, H. Bansal, S. Bangalore, M. Carl
Researchers are proposing interactive machine translation as a potential method to make the language translation process more efficient and usable. The introduction of different modalities, such as eye gaze and speech, is being explored to add to the interactivity of language translation systems. Unfortunately, the raw data provided by Automatic Speech Recognition (ASR) and eye tracking are very noisy and error-prone. This paper describes a technique for reducing the errors of the two modalities, speech and eye gaze, with the help of each other in the context of sight translation and reading. Lattice representation and composition of the two modalities were used for integration. F-measure for eye gaze and Word Accuracy for ASR were used as metrics to evaluate our results. In the reading task, we demonstrated a significant improvement in both eye-gaze F-measure and speech Word Accuracy. In the sight translation task, a significant improvement was found in gaze F-measure but not in ASR.
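The paper's actual integration composes speech and gaze lattices; as a far simpler stand-in that only conveys the mutual-disambiguation idea, the sketch below rescores an n-best list of ASR word hypotheses by boosting words that match gaze-fixated tokens. All words, scores, and the bonus weight are hypothetical.

```python
def rescore(asr_nbest, fixated_tokens, gaze_bonus=0.2):
    """asr_nbest: list of (word, acoustic_score); returns the best word after fusion."""
    fixated = {t.lower() for t in fixated_tokens}
    fused = [(w, s + (gaze_bonus if w.lower() in fixated else 0.0))
             for w, s in asr_nbest]
    return max(fused, key=lambda ws: ws[1])

asr_nbest = [("bank", 0.41), ("tank", 0.39), ("thank", 0.15)]
fixated_tokens = ["tank", "water"]           # words under the reader's fixations
print(rescore(asr_nbest, fixated_tokens)[0])  # 'tank' wins after the gaze boost
```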
Citations: 2
Situated multi-modal dialog system in vehicles
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535951
Teruhisa Misu, Antoine Raux, Ian Lane, Joan Devassy, Rakesh Gupta
In this paper, we present Townsurfer, a situated multi-modal dialog system in vehicles. The system integrates multi-modal inputs of speech, geo-location, gaze (face direction), and dialog history to answer drivers' queries about their surroundings. To select the appropriate data source for answering queries, we apply belief tracking across the above modalities. We conducted a preliminary data collection and an evaluation focusing on the effect of gaze (head direction) and geo-location estimations. We report the results and our analysis of the data.
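The belief tracker itself is not detailed in the abstract. The toy sketch below only illustrates the general idea of tracking a distribution over candidate points of interest and updating it with per-modality likelihoods (speech, gaze direction, geo-location); every name and number is invented.

```python
candidates = ["cafe", "gas_station", "museum"]
belief = {c: 1.0 / len(candidates) for c in candidates}   # uniform prior

def update(belief, likelihood):
    """Multiply in one modality's likelihood and renormalize."""
    posterior = {c: belief[c] * likelihood.get(c, 1e-6) for c in belief}
    z = sum(posterior.values())
    return {c: p / z for c, p in posterior.items()}

belief = update(belief, {"cafe": 0.6, "gas_station": 0.3, "museum": 0.1})  # speech
belief = update(belief, {"cafe": 0.2, "gas_station": 0.7, "museum": 0.1})  # gaze direction
belief = update(belief, {"cafe": 0.5, "gas_station": 0.4, "museum": 0.1})  # geo-location
print(max(belief, key=belief.get), {c: round(p, 2) for c, p in belief.items()})
```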
Citations: 22
Agent-assisted multi-viewpoint video viewer and its gaze-based evaluation
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535952
Takatsugu Hirayama, Takafumi Marutani, Daishi Tanoue, Shogo Tokai, S. Fels, K. Mase
Humans see things from various viewpoints but nobody attempts to see anything from every viewpoint owing to physical restrictions and the great effort required. Intelligent interfaces for viewing multi-viewpoint videos may remove the restrictions in effective ways and direct us toward a new visual world. We propose an agent-assisted multi-viewpoint video viewer that incorporates (1) target-centered viewpoint switching and (2) social viewpoint recommendation. The viewer stabilizes an object at the center of the display field using the former function, which helps to fix the user's gaze on the target object. To identify the popular viewing behavior for particular content, the latter function exploits a histogram of the viewing log in terms of time, viewpoints, and the target of many personal viewing experiences. We call this knowledge source of the director agent a viewgram. The agent automatically constructs the preferred viewpoint sequence for each target. We conducted user studies to analyze user behavior, especially eye movement, while using the viewer. The results of statistical analyses showed that the viewpoint sequence extracted from a viewgram includes a more distinct perspective for each target, and the target-centered viewpoint switching encourages the user to gaze at the display center where the target is located during the viewing. The proposed viewer can provide more effective perspectives for the main attractions in scenes.
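The exact construction of the viewgram is not given here; the sketch below shows one plausible reading of it: aggregate many viewers' logs into a histogram over (time bin, viewpoint) for a target and recommend the most popular viewpoint per time bin. The log format and values are assumptions.

```python
from collections import Counter

viewing_log = [          # (time_bin, viewpoint_id) entries for one target
    (0, "cam1"), (0, "cam1"), (0, "cam3"),
    (1, "cam2"), (1, "cam2"), (1, "cam1"),
    (2, "cam3"), (2, "cam3"), (2, "cam3"),
]

viewgram = Counter(viewing_log)                  # histogram over (time, viewpoint)
time_bins = sorted({t for t, _ in viewing_log})
preferred = [max((v for tb, v in viewgram if tb == t),
                 key=lambda v: viewgram[(t, v)]) for t in time_bins]
print(preferred)  # ['cam1', 'cam2', 'cam3']
```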
Citations: 1
Finding the timings for a guide agent to intervene inter-user conversation in considering their gaze behaviors
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535957
Shochi Otogi, Hung-Hsuan Huang, R. Hotta, K. Kawagoe
With the advance of embodied conversational agent (ECA) technologies, there are more and more real-world deployments of ECAs, such as guides in museums or exhibitions. However, in those situations, the agent systems are usually used by groups of visitors rather than individuals. Such multi-user situations are much more complex than single-user ones and require specific features. One of them is the ability for the agent to smoothly intervene in user-user conversation. This feature is supposed to facilitate mixed-initiative human-agent conversation and more proactive service for the users. This paper presents the results of the first step of our project, which aims to build an information-providing agent for collaborative decision-making tasks: finding the timings for the agent to intervene in user-user conversation to provide active support by focusing on the users' gaze. To realize this, a Wizard-of-Oz (WOZ) experiment was first conducted to collect human interaction data. By analyzing the collected corpus, eight kinds of timings at which the agent could potentially intervene were found. Second, a method was developed to automatically identify four of the eight kinds of timings using only nonverbal cues: gaze direction, body posture, and speech information. Although the performance of the method is moderate (F-measure 0.4), it should be possible to improve it by integrating context information in the future.
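The eight timing categories and the classifier features are not enumerated in the abstract; the fragment below is only a schematic stand-in that flags a candidate intervention point when both users have been silent for a while and neither is gazing at the other. Thresholds and field names are invented.

```python
def is_intervention_timing(frame, silence_threshold=2.0):
    """frame: dict with per-user silence durations (s) and current gaze targets."""
    both_silent = all(sil >= silence_threshold
                      for sil in frame["silence"].values())
    mutually_disengaged = all(tgt not in frame["silence"]   # not looking at another user
                              for tgt in frame["gaze_target"].values())
    return both_silent and mutually_disengaged

frame = {
    "silence":     {"userA": 2.8, "userB": 3.1},           # seconds since last utterance
    "gaze_target": {"userA": "exhibit", "userB": "floor"},  # whom/what each user looks at
}
print(is_intervention_timing(frame))  # True
```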
Citations: 1
Unravelling the interaction strategies and gaze in collaborative learning with online video lectures
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535959
R. Bednarik, Marko Kauppinen
Using dual eye tracking, we performed a study characterising the differences in interaction patterns when learning from online materials individually or with a peer. The findings show that in the majority of cases, users prefer to use the online learning materials in parallel when working on a learning task with their own tool. Collaborative learning took longer due to negotiation overheads, and most attention was paid to the materials. However, collaboration did not have an effect on the overall distribution of gaze.
Citations: 2
Context aware addressee estimation for human robot interaction
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535958
Samira Sheikhi, D. Jayagopi, Vasil Khalidov, J. Odobez
The paper investigates the problem of addressee recognition, that is, determining to whom a speaker's utterance is intended, in a setting involving a humanoid robot interacting with multiple persons. More specifically, as it is well known that the addressee can primarily be derived from the speaker's visual focus of attention (VFOA), defined as whom or what a person is looking at, we address the following questions: how much does performance degrade when using VFOA automatically extracted from head pose instead of the VFOA ground truth? Can the conversational context improve addressee recognition, either directly as a side cue in the addressee classifier, or indirectly by improving the VFOA recognition, or in both ways? Finally, from a computational perspective, which VFOA features and normalizations are better, and does it matter whether the VFOA recognition module only monitors whether a person looks at potential addressee targets (the robot, people) or also considers objects of interest in the environment (paintings in our case) as additional VFOA targets? Experiments on the public Vernissage database, where a humanoid Nao robot gives a quiz to two participants, show that reducing VFOA confusion (either through context or by ignoring VFOA targets) improves addressee recognition.
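The classifiers and context features are not specified in the abstract. As a schematic sketch only, the snippet below labels the addressee of an utterance as the VFOA target the speaker looked at most, optionally reweighted by a hypothetical conversational-context prior; all values are invented.

```python
def estimate_addressee(vfoa_proportions, context_prior=None):
    """vfoa_proportions: fraction of the utterance spent looking at each target."""
    scores = dict(vfoa_proportions)
    if context_prior:                      # optional context side cue
        scores = {t: scores[t] * context_prior.get(t, 1.0) for t in scores}
    return max(scores, key=scores.get)

vfoa = {"robot": 0.15, "person_left": 0.50, "person_right": 0.25, "painting": 0.10}
context = {"robot": 1.5, "person_left": 1.0, "person_right": 1.0}  # robot just asked a question
print(estimate_addressee(vfoa))           # person_left
print(estimate_addressee(vfoa, context))  # still person_left (0.5 > 0.225)
```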
Citations: 12
Feature selection for gaze, pupillary, and EEG signals evoked in a 3D environment
Pub Date : 2013-12-13 DOI: 10.1145/2535948.2535950
D. Jangraw, P. Sajda
As we navigate our environment, we are constantly assessing the objects we encounter and deciding on their subjective interest to us. In this study, we investigate the neural and ocular correlates of this assessment as a step towards their potential use in a mobile human-computer interface (HCI). Past research has shown that multiple physiological signals are evoked by objects of interest during visual search in the laboratory, including gaze, pupil dilation, and neural activity; these have been exploited for use in various HCIs. We use a virtual environment to explore which of these signals are also evoked during exploration of a dynamic, free-viewing 3D environment. Using a hierarchical classifier and sequential forward floating selection (SFFS), we identify a small, robust set of features across multiple modalities that can be used to distinguish targets from distractors in the virtual environment. The identification of these features may serve as an important factor in the design of mobile HCIs.
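SFFS itself is a standard algorithm; the sketch below shows a simplified version of it, with a toy scoring function standing in for the paper's hierarchical-classifier evaluation. Feature names and scores are invented, and a full implementation would also track the best subset found at each size.

```python
def sffs(features, score, k):
    """Simplified SFFS: greedy forward inclusion plus conditional backward exclusion."""
    selected = []
    while len(selected) < k:
        # forward step: include the best remaining feature
        best = max((f for f in features if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
        # floating exclusion: drop earlier features while that improves the score,
        # never removing the feature that was just added
        improved = True
        while improved and len(selected) > 2:
            improved = False
            for f in selected[:-1]:
                reduced = [x for x in selected if x != f]
                if score(reduced) > score(selected):
                    selected = reduced
                    improved = True
                    break
    return selected

def toy_score(subset):
    """Stand-in for classifier accuracy on held-out data."""
    useful = {"pupil_dilation": 0.30, "fixation_duration": 0.25, "eeg_p300": 0.35}
    redundancy = 0.05 * max(0, len(subset) - 2)
    return sum(useful.get(f, 0.01) for f in subset) - redundancy

pool = ["pupil_dilation", "fixation_duration", "eeg_p300", "saccade_rate", "blink_rate"]
print(sffs(pool, toy_score, k=3))  # ['eeg_p300', 'pupil_dilation', 'fixation_duration']
```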
Citations: 7