
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Attention and Engagement Aware Multimodal Conversational Systems
Zhou Yu
Despite their ability to complete certain tasks, dialog systems still suffer from poor adaptation to users' engagement and attention. We observe human behaviors in different conversational settings to understand human communication dynamics and then transfer the knowledge to multimodal dialog system design. To focus solely on maintaining engaging conversations, we design and implement a non-task oriented multimodal dialog system, which serves as a framework for controlled multimodal conversation analysis. We design computational methods to model user engagement and attention in real time by leveraging automatically harvested multimodal human behaviors, such as smiles and speech volume. We aim to design and implement a multimodal dialog system to coordinate with users' engagement and attention on the fly via techniques such as adaptive conversational strategies and incremental speech production.
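As an illustration of how such real-time engagement modeling might look, the sketch below fuses two of the cues named in the abstract, smile probability and speech volume, into a rolling score. The class name, weights, and window size are illustrative assumptions, not the author's published model.

```python
# A minimal sketch of real-time engagement scoring from two automatically
# harvested cues (smile probability, speech volume). Weights and window size
# are assumptions for illustration only.
from collections import deque

class EngagementEstimator:
    """Rolling engagement score over a short window of per-frame observations."""

    def __init__(self, window: int = 30, smile_weight: float = 0.6,
                 volume_weight: float = 0.4):
        self.smiles = deque(maxlen=window)   # per-frame smile probability in [0, 1]
        self.volumes = deque(maxlen=window)  # per-frame normalized RMS volume in [0, 1]
        self.smile_weight = smile_weight
        self.volume_weight = volume_weight

    def update(self, smile_prob: float, volume: float) -> float:
        """Ingest one frame of observations; return the current engagement score."""
        self.smiles.append(smile_prob)
        self.volumes.append(volume)
        smile_avg = sum(self.smiles) / len(self.smiles)
        volume_avg = sum(self.volumes) / len(self.volumes)
        return self.smile_weight * smile_avg + self.volume_weight * volume_avg

estimator = EngagementEstimator()
score = estimator.update(smile_prob=0.8, volume=0.5)  # 0.68 on this first frame
```

A dialog manager could poll such a score each turn and switch to an adaptive conversational strategy when it drops below a threshold.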
{"title":"Attention and Engagement Aware Multimodal Conversational Systems","authors":"Zhou Yu","doi":"10.1145/2818346.2823309","DOIUrl":"https://doi.org/10.1145/2818346.2823309","url":null,"abstract":"Despite their ability to complete certain tasks, dialog systems still suffer from poor adaptation to users' engagement and attention. We observe human behaviors in different conversational settings to understand human communication dynamics and then transfer the knowledge to multimodal dialog system design. To focus solely on maintaining engaging conversations, we design and implement a non-task oriented multimodal dialog system, which serves as a framework for controlled multimodal conversation analysis. We design computational methods to model user engagement and attention in real time by leveraging automatically harvested multimodal human behaviors, such as smiles and speech volume. We aim to design and implement a multimodal dialog system to coordinate with users' engagement and attention on the fly via techniques such as adaptive conversational strategies and incremental speech production.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"185 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82917580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
ECA Control using a Single Affective User Dimension
Fred Charles, Florian Pecune, Gabor Aranyi, C. Pelachaud, M. Cavazza
User interaction with Embodied Conversational Agents (ECA) should involve a significant affective component to achieve realism in communication. This aspect has been studied through different frameworks describing the relationship between user and ECA, for instance alignment, rapport and empathy. We conducted an experiment to explore how an ECA's non-verbal expression can be controlled to respond to a single affective dimension generated by users as input. Our system is based on the mapping of a high-level affective dimension, approach/avoidance, onto a new ECA control mechanism in which Action Units (AU) are activated through a neural network. Since 'approach' has been associated with prefrontal cortex activation, we use a measure of prefrontal cortex left-asymmetry through fNIRS as a single input signal representing the user's attitude towards the ECA. We carried out the experiment with 10 subjects, who were instructed to express a positive mental attitude towards the ECA. In return, the ECA's facial expression would reflect the perceived attitude under a neurofeedback paradigm. Our results suggest that users are able to successfully interact with the ECA and perceive its response as consistent and realistic, both in terms of ECA responsiveness and in terms of relevance of facial expressions. From a system perspective, the empirical calibration of the network supports a progressive recruitment of various AUs, which provides a principled description of the ECA response and its intensity. Our findings suggest that complex ECA facial expressions can be successfully aligned with one high-level affective dimension. Furthermore, this use of a single dimension as input could support experiments in the fine-tuning of AU activation or their personalization to user-preferred modalities.
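The sketch below illustrates the shape of the control mechanism described here: a scalar approach/avoidance value driving Action Unit intensities through a small neural network. The AU subset, architecture, and random weights are placeholders; the paper's network was calibrated empirically rather than initialized like this.

```python
# A minimal sketch of mapping one affective dimension (approach/avoidance,
# here in [-1, 1]) to Action Unit activations through a tiny network.
# AU list, layer sizes, and weights are illustrative placeholders.
import numpy as np

AUS = ["AU6_cheek_raiser", "AU12_lip_corner_puller",
       "AU1_inner_brow_raiser", "AU4_brow_lowerer"]  # assumed subset

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 1)), np.zeros(8)        # hidden layer
W2, b2 = rng.normal(size=(len(AUS), 8)), np.zeros(len(AUS))

def au_activations(approach: float) -> dict:
    """Map one approach/avoidance value to per-AU activation levels in [0, 1]."""
    h = np.tanh(W1 @ np.array([approach]) + b1)
    out = 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))       # sigmoid squashes to [0, 1]
    return dict(zip(AUS, out.round(2)))

# With trained weights, stronger 'approach' would progressively recruit more AUs.
print(au_activations(0.7))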
{"title":"ECA Control using a Single Affective User Dimension","authors":"Fred Charles, Florian Pecune, Gabor Aranyi, C. Pelachaud, M. Cavazza","doi":"10.1145/2818346.2820730","DOIUrl":"https://doi.org/10.1145/2818346.2820730","url":null,"abstract":"User interaction with Embodied Conversational Agents (ECA) should involve a significant affective component to achieve realism in communication. This aspect has been studied through different frameworks describing the relationship between user and ECA, for instance alignment, rapport and empathy. We conducted an experiment to explore how an ECA's non-verbal expression can be controlled to respond to a single affective dimension generated by users as input. Our system is based on the mapping of a high-level affective dimension, approach/avoidance, onto a new ECA control mechanism in which Action Units (AU) are activated through a neural network. Since 'approach' has been associated to prefrontal cortex activation, we use a measure of prefrontal cortex left-asymmetry through fNIRS as a single input signal representing the user's attitude towards the ECA. We carried out the experiment with 10 subjects, who have been instructed to express a positive mental attitude towards the ECA. In return, the ECA facial expression would reflect the perceived attitude under a neurofeedback paradigm. Our results suggest that users are able to successfully interact with the ECA and perceive its response as consistent and realistic, both in terms of ECA responsiveness and in terms of relevance of facial expressions. From a system perspective, the empirical calibration of the network supports a progressive recruitment of various AUs, which provides a principled description of the ECA response and its intensity. Our findings suggest that complex ECA facial expressions can be successfully aligned with one high-level affective dimension. Furthermore, this use of a single dimension as input could support experiments in the fine-tuning of AU activation or their personalization to user preferred modalities.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87401564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Personality Trait Classification via Co-Occurrent Multiparty Multimodal Event Discovery
S. Okada, O. Aran, D. Gática-Pérez
This paper proposes a novel feature extraction framework for multi-party multimodal conversation, for inference of personality traits and emergent leadership. The proposed framework represents multimodal features as the combination of each participant's nonverbal activity and group activity. This feature representation makes it possible to compare the nonverbal patterns extracted from participants of different groups in a metric space. It captures how the target member produces nonverbal behavior observed in a group (e.g. the member speaks while all members move their bodies), and can be applied to any kind of multiparty conversation task. Frequent co-occurrent events are discovered from the multimodal sequences using graph clustering. The proposed framework is applied to the ELEA corpus, an audio-visual dataset collected from group meetings. We evaluate the framework on the binary classification task of 10 personality traits. Experimental results show that the model trained with co-occurrence features obtained higher accuracy than previous related work on 8 out of 10 traits. In addition, the co-occurrence features improve accuracy by between 2% and 17%.
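A minimal sketch of one co-occurrence feature of the kind described above: how often a target participant speaks while every group member is moving. The frame-level binary representation and event names are assumptions for illustration; the paper discovers such patterns automatically via graph clustering rather than hand-coding them.

```python
# A minimal sketch of a single co-occurrence feature: the fraction of frames
# where the target speaks while all group members move. Event names and the
# binary frame representation are illustrative assumptions.
import numpy as np

def co_occurrence_rate(target_speaking: np.ndarray,
                       group_moving: np.ndarray) -> float:
    """
    target_speaking: (frames,) binary array, 1 = target is speaking.
    group_moving:    (participants, frames) binary array, 1 = body movement.
    Returns the fraction of frames where the target speaks while all move.
    """
    all_moving = group_moving.all(axis=0)
    co_occur = target_speaking.astype(bool) & all_moving
    return float(co_occur.mean())

speaking = np.array([1, 1, 0, 1, 0, 1])
moving = np.array([[1, 1, 1, 0, 1, 1],
                   [1, 0, 1, 1, 1, 1]])
print(co_occurrence_rate(speaking, moving))  # 0.333...: frames 0 and 5 qualify
```

A vector of such rates, one per discovered event pattern, would form the per-participant feature representation fed to the trait classifier.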
{"title":"Personality Trait Classification via Co-Occurrent Multiparty Multimodal Event Discovery","authors":"S. Okada, O. Aran, D. Gática-Pérez","doi":"10.1145/2818346.2820757","DOIUrl":"https://doi.org/10.1145/2818346.2820757","url":null,"abstract":"This paper proposes a novel feature extraction framework from mutli-party multimodal conversation for inference of personality traits and emergent leadership. The proposed framework represents multi modal features as the combination of each participant's nonverbal activity and group activity. This feature representation enables to compare the nonverbal patterns extracted from the participants of different groups in a metric space. It captures how the target member outputs nonverbal behavior observed in a group (e.g. the member speaks while all members move their body), and can be available for any kind of multiparty conversation task. Frequent co-occurrent events are discovered using graph clustering from multimodal sequences. The proposed framework is applied for the ELEA corpus which is an audio visual dataset collected from group meetings. We evaluate the framework for binary classification task of 10 personality traits. Experimental results show that the model trained with co-occurrence features obtained higher accuracy than previously related work in 8 out of 10 traits. In addition, the co-occurrence features improve the accuracy from 2 % up to 17 %.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87982317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 49
Interaction Studies with Social Robots
K. Dautenhahn
Over the past 10 years we have seen worldwide an immense growth of research and development into companion robots. Those are robots that fulfil particular tasks, but do so in a socially acceptable manner. The companionship aspect reflects the repeated and long-term nature of such interactions, and the potential of people to form relationships with such robots, e.g. as friendly assistants. A number of companion and assistant robots have been entering the market; two of the latest examples are Aldebaran's Pepper robot and Jibo (Cynthia Breazeal). Companion robots increasingly target particular application areas, e.g. as home assistants or therapeutic tools. Research into companion robots needs to address many fundamental research problems concerning perception, cognition, action and learning, but regardless of how sophisticated our robotic systems may be, the potential users need to be taken into account from the early stages of development. The talk will emphasize the need for a highly user-centred approach towards design, development and evaluation of companion robots. An important challenge is to evaluate robots in realistic and long-term scenarios, in order to capture as closely as possible those key aspects that will play a role when using such robots in the real world. In order to illustrate these points, my talk will give examples of interaction studies that my research team has been involved in. These include studies into how people perceive robots' non-verbal cues, creating and evaluating realistic scenarios for home companion robots using narrative framing, and verbal and tactile interaction of children with the therapeutic and social robot Kaspar. The talk will highlight the issues we encountered when we proceeded from laboratory-based experiments and prototypes to real-world applications.
{"title":"Interaction Studies with Social Robots","authors":"K. Dautenhahn","doi":"10.1145/2818346.2818347","DOIUrl":"https://doi.org/10.1145/2818346.2818347","url":null,"abstract":"Over the past 10 years we have seen worldwide an immense growth of research and development into companion robots. Those are robots that fulfil particular tasks, but do so in a socially acceptable manner. The companionship aspect reflects the repeated and long-term nature of such interactions, and the potential of people to form relationships with such robots, e.g. as friendly assistants. A number of companion and assistant robots have been entering the market, two of the latest examples are Aldebaran's Pepper robot, or Jibo (Cynthia Breazeal). Companion robots are more and more targeting particular application areas, e.g. as home assistants or therapeutic tools. Research into companion robots needs to address many fundamental research problems concerning perception, cognition, action and learning, but regardless how sophisticated our robotic systems may be, the potential users need to be taken into account from the early stages of development. The talk will emphasize the need for a highly user-centred approach towards design, development and evaluation of companion robots. An important challenge is to evaluate robots in realistic and long-term scenarios, in order to capture as closely as possible those key aspects that will play a role when using such robots in the real world. In order to illustrate these points, my talk will give examples of interaction studies that my research team has been involved in. This includes studies into how people perceive robots' non-verbal cues, creating and evaluating realistic scenarios for home companion robots using narrative framing, and verbal and tactile interaction of children with the therapeutic and social robot Kaspar. The talk will highlight the issues we encountered when we proceeded from laboratory-based experiments and prototypes to real-world applications.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78326714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Deep Learning for Emotion Recognition on Small Datasets using Transfer Learning
Hongwei Ng, Viet Dung Nguyen, Vassilios Vonikakis, Stefan Winkler
This paper presents the techniques employed in our team's submissions to the 2015 Emotion Recognition in the Wild contest, for the sub-challenge of Static Facial Expression Recognition in the Wild. The objective of this sub-challenge is to classify the emotions expressed by the primary human subject in static images extracted from movies. We follow a transfer learning approach for deep Convolutional Neural Network (CNN) architectures. Starting from a network pre-trained on the generic ImageNet dataset, we perform supervised fine-tuning on the network in a two-stage process, first on datasets relevant to facial expressions, followed by the contest's dataset. Experimental results show that this cascading fine-tuning approach achieves better results, compared to a single stage fine-tuning with the combined datasets. Our best submission exhibited an overall accuracy of 48.5% in the validation set and 55.6% in the test set, which compares favorably to the respective 35.96% and 39.13% of the challenge baseline.
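A minimal PyTorch sketch of the cascaded two-stage fine-tuning described above, assuming a recent torchvision; the ResNet-18 backbone, hyperparameters, and dummy data loaders are placeholders standing in for the paper's actual architecture and datasets.

```python
# A minimal sketch of two-stage supervised fine-tuning from ImageNet weights:
# stage 1 on auxiliary facial-expression data, stage 2 on the contest data.
# Backbone, hyperparameters, and loaders are illustrative placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

def fine_tune(model, loader, epochs=2, lr=1e-4):
    """One supervised fine-tuning stage over a labeled image loader."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    return model

# Start from a network pre-trained on generic ImageNet data.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 7)  # 7 basic emotion classes

# Dummy datasets standing in for (1) auxiliary facial-expression data and
# (2) the contest training set; real loaders replace these.
dummy = TensorDataset(torch.randn(8, 3, 224, 224), torch.randint(0, 7, (8,)))
expression_loader = DataLoader(dummy, batch_size=4)
contest_loader = DataLoader(dummy, batch_size=4)

model = fine_tune(model, expression_loader)  # stage 1: expression datasets
model = fine_tune(model, contest_loader)     # stage 2: contest dataset
```

The point of the cascade is that the intermediate stage adapts the generic ImageNet features toward faces before the small contest set is ever seen, which is what the reported accuracy gains reflect.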
{"title":"Deep Learning for Emotion Recognition on Small Datasets using Transfer Learning","authors":"Hongwei Ng, Viet Dung Nguyen, Vassilios Vonikakis, Stefan Winkler","doi":"10.1145/2818346.2830593","DOIUrl":"https://doi.org/10.1145/2818346.2830593","url":null,"abstract":"This paper presents the techniques employed in our team's submissions to the 2015 Emotion Recognition in the Wild contest, for the sub-challenge of Static Facial Expression Recognition in the Wild. The objective of this sub-challenge is to classify the emotions expressed by the primary human subject in static images extracted from movies. We follow a transfer learning approach for deep Convolutional Neural Network (CNN) architectures. Starting from a network pre-trained on the generic ImageNet dataset, we perform supervised fine-tuning on the network in a two-stage process, first on datasets relevant to facial expressions, followed by the contest's dataset. Experimental results show that this cascading fine-tuning approach achieves better results, compared to a single stage fine-tuning with the combined datasets. Our best submission exhibited an overall accuracy of 48.5% in the validation set and 55.6% in the test set, which compares favorably to the respective 35.96% and 39.13% of the challenge baseline.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88925689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 568
Connections: 2015 ICMI Sustained Accomplishment Award Lecture
E. Horvitz
Our community has long pursued principles and methods for enabling fluid and effortless collaborations between people and computing systems. Forging deep connections between people and machines has come into focus over the last 25 years as a grand challenge at the intersection of artificial intelligence, human-computer interaction, and cognitive psychology. I will review experiences and directions with leveraging advances in perception, learning, and reasoning in pursuit of our shared dreams.
{"title":"Connections: 2015 ICMI Sustained Accomplishment Award Lecture","authors":"E. Horvitz","doi":"10.1145/2818346.2835500","DOIUrl":"https://doi.org/10.1145/2818346.2835500","url":null,"abstract":"Our community has long pursued principles and methods for enabling fluid and effortless collaborations between people and computing systems. Forging deep connections between people and machines has come into focus over the last 25 years as a grand challenge at the intersection of artificial intelligence, human-computer interaction, and cognitive psychology. I will review experiences and directions with leveraging advances in perception, learning, and reasoning in pursuit of our shared dreams.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89114812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Computational Model of Culture-Specific Emotion Detection for Artificial Agents in the Learning Domain
Ganapreeta Renunathan Naidu
Nowadays, intelligent agents are expected to be affect-sensitive, as agents are becoming essential entities that support computer-mediated tasks, especially in teaching and training. These agents use common natural modalities, such as facial expressions, gestures and eye gaze, in order to recognize a user's affective state and respond accordingly. However, these nonverbal cues may not be universal, as emotion recognition and expression differ from culture to culture. It is important that intelligent interfaces are equipped with the abilities to meet the challenge of cultural diversity to facilitate human-machine interaction, particularly in Asia. Asians are known to be more passive and to possess certain traits such as indirectness and non-confrontationalism, which lead to emotions such as (culture-specific forms of) shyness and timidity. Therefore, a model based on another culture may not be applicable in an Asian setting, ruling out a one-size-fits-all approach. This study is initiated to identify the discriminative markers of culture-specific emotions based on multimodal interactions.
{"title":"A Computational Model of Culture-Specific Emotion Detection for Artificial Agents in the Learning Domain","authors":"Ganapreeta Renunathan Naidu","doi":"10.1145/2818346.2823307","DOIUrl":"https://doi.org/10.1145/2818346.2823307","url":null,"abstract":"Nowadays, intelligent agents are expected to be affect-sensitive as agents are becoming essential entities that supports computer-mediated tasks, especially in teaching and training. These agents use common natural modalities-such as facial expressions, gestures and eye gaze in order to recognize a user's affective state and respond accordingly. However, these nonverbal cues may not be universal as emotion recognition and expression differ from culture to culture. It is important that intelligent interfaces are equipped with the abilities to meet the challenge of cultural diversity to facilitate human-machine interaction particularly in Asia. Asians are known to be more passive and possess certain traits such as indirectness and non-confrontationalism, which lead to emotions such as (culture-specific form of) shyness and timidity. Therefore, a model based on other culture may not be applicable in an Asian setting, out-rulling a one-size-fits-all approach. This study is initiated to identify the discriminative markers of culture-specific emotions based on the multimodal interactions.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75694221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Multimodal Selfies: Designing a Multimodal Recording Device for Students in Traditional Classrooms
Federico Domínguez, K. Chiluiza, Vanessa Echeverría, X. Ochoa
The traditional recording of student interaction in classrooms has raised privacy concerns among both students and academics. However, the same students are happy to share their daily lives through social media. Perception of data ownership is the key factor in this paradox. This article proposes the design of a personal Multimodal Recording Device (MRD) that could capture the actions of its owner during lectures. The MRD would be able to capture close-range video, audio, writing, and other environmental signals. Unlike traditional centralized recording systems, students would have control over their own recorded data. They could decide to share their information in exchange for access to the recordings of the instructor, notes from their classmates, and analysis of, for example, their attention performance. By sharing their data, students participate in the co-creation of enhanced and synchronized course notes that will benefit all the participating students. This work presents details about how such a device could be built from available components. It also discusses and evaluates the design of such a device, including its foreseeable costs, scalability, flexibility, intrusiveness and recording quality.
{"title":"Multimodal Selfies: Designing a Multimodal Recording Device for Students in Traditional Classrooms","authors":"Federico Domínguez, K. Chiluiza, Vanessa Echeverría, X. Ochoa","doi":"10.1145/2818346.2830606","DOIUrl":"https://doi.org/10.1145/2818346.2830606","url":null,"abstract":"The traditional recording of student interaction in classrooms has raised privacy concerns in both students and academics. However, the same students are happy to share their daily lives through social media. Perception of data ownership is the key factor in this paradox. This article proposes the design of a personal Multimodal Recording Device (MRD) that could capture the actions of its owner during lectures. The MRD would be able to capture close-range video, audio, writing, and other environmental signals. Differently from traditional centralized recording systems, students would have control over their own recorded data. They could decide to share their information in exchange of access to the recordings of the instructor, notes form their classmates, and analysis of, for example, their attention performance. By sharing their data, students participate in the co-creation of enhanced and synchronized course notes that will benefit all the participating students. This work presents details about how such a device could be build from available components. This work also discusses and evaluates the design of such device, including its foreseeable costs, scalability, flexibility, intrusiveness and recording quality.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75843857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Providing Real-time Feedback for Student Teachers in a Virtual Rehearsal Environment
R. Barmaki, C. Hughes
Research in learning analytics and educational data mining has recently become prominent in the fields of computer science and education. Most scholars in the field emphasize student learning and student data analytics; however, it is also important to focus on teaching analytics and teacher preparation because of their key roles in student learning, especially in K-12 learning environments. Nonverbal communication strategies play an important role in teachers' successful interpersonal communication with their students. In order to assist novice or practicing teachers with exhibiting open and affirmative nonverbal cues in their classrooms, we have designed a multimodal teaching platform with provisions for online feedback. We used an interactive teaching rehearsal software, TeachLivE, as our basic research environment. TeachLivE employs a digital puppetry paradigm as its core technology. Individuals walk into this virtual environment and interact with virtual students displayed on a large screen. They can practice classroom management, pedagogy and content delivery skills with a teaching plan in the TeachLivE environment. We have designed an experiment to evaluate the impact of an online nonverbal feedback application. In this experiment, different types of multimodal data were collected across two experimental settings. These data include talk time and nonverbal behaviors of the virtual students, captured in log files; talk time and full-body tracking data of the participant; and video recordings of the virtual classroom with the participant. Thirty-four student teachers participated in this 30-minute experiment. In each of the settings, the participants were provided with teaching plans from which they taught. All the participants took part in both of the experimental settings. In order to have a balanced experimental design, half of the participants received nonverbal online feedback in their first session and the other half received this feedback in the second session. A visual indication was used for feedback each time the participant exhibited a closed, defensive posture. Based on recorded full-body tracking data, we observed that only those who received feedback in their first session demonstrated a significant number of open postures in the session containing no feedback. However, the post-questionnaire information indicated that all participants were more mindful of their body postures while teaching after they had participated in the study.
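The sketch below shows one plausible rule for triggering the visual feedback described above: flag a closed posture when both wrists cross the body midline (arms folded). The Kinect-style joint names and the geometric criterion are assumptions; the study's actual posture coding is not reproduced here.

```python
# A minimal sketch of a closed-posture trigger from full-body tracking data.
# Joint names follow Kinect v2 conventions; the crossing-the-midline rule is
# an illustrative assumption, not the study's published criterion.
def is_closed_posture(joints: dict) -> bool:
    """joints maps joint names to (x, y, z) in meters, x increasing toward
    the subject's right side."""
    midline_x = joints["SpineMid"][0]
    left_wrist_x = joints["WristLeft"][0]
    right_wrist_x = joints["WristRight"][0]
    # Closed (arms folded) if each wrist has crossed to the opposite side.
    return left_wrist_x > midline_x and right_wrist_x < midline_x

frame = {"SpineMid": (0.0, 1.0, 2.0),
         "WristLeft": (0.15, 1.1, 1.9),    # crossed to the right of midline
         "WristRight": (-0.12, 1.1, 1.9)}  # crossed to the left of midline
print(is_closed_posture(frame))  # True -> show the on-screen feedback cue
```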
{"title":"Providing Real-time Feedback for Student Teachers in a Virtual Rehearsal Environment","authors":"R. Barmaki, C. Hughes","doi":"10.1145/2818346.2830604","DOIUrl":"https://doi.org/10.1145/2818346.2830604","url":null,"abstract":"Research in learning analytics and educational data mining has recently become prominent in the fields of computer science and education. Most scholars in the field emphasize student learning and student data analytics; however, it is also important to focus on teaching analytics and teacher preparation because of their key roles in student learning, especially in K-12 learning environments. Nonverbal communication strategies play an important role in successful interpersonal communication of teachers with their students. In order to assist novice or practicing teachers with exhibiting open and affirmative nonverbal cues in their classrooms, we have designed a multimodal teaching platform with provisions for online feedback. We used an interactive teaching rehearsal software, TeachLivE, as our basic research environment. TeachLivE employs a digital puppetry paradigm as its core technology. Individuals walk into this virtual environment and interact with virtual students displayed on a large screen. They can practice classroom management, pedagogy and content delivery skills with a teaching plan in the TeachLivE environment. We have designed an experiment to evaluate the impact of an online nonverbal feedback application. In this experiment, different types of multimodal data have been collected during two experimental settings. These data include talk-time and nonverbal behaviors of the virtual students, captured in log files; talk time and full body tracking data of the participant; and video recording of the virtual classroom with the participant. 34 student teachers participated in this 30-minute experiment. In each of the settings, the participants were provided with teaching plans from which they taught. All the participants took part in both of the experimental settings. In order to have a balanced experiment design, half of the participants received nonverbal online feedback in their first session and the other half received this feedback in the second session. A visual indication was used for feedback each time the participant exhibited a closed, defensive posture. Based on recorded full-body tracking data, we observed that only those who received feedback in their first session demonstrated a significant number of open postures in the session containing no feedback. However, the post-questionnaire information indicated that all participants were more mindful of their body postures while teaching after they had participated in the study.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76959158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 57
Spoken Interruptions Signal Productive Problem Solving and Domain Expertise in Mathematics
S. Oviatt, Kevin Hang, Jianlong Zhou, Fang Chen
Prevailing social norms prohibit interrupting another person when they are speaking. In this research, simultaneous speech was investigated in groups of students as they jointly solved math problems and peer-tutored one another. Analyses were based on the Math Data Corpus, which includes ground-truth performance coding and speech transcriptions. Simultaneous speech was elevated by 120-143% during the most productive phase of problem solving, compared with matched intervals. It was also elevated by 18-37% in students who were domain experts, compared with non-experts. Qualitative analyses revealed that experts differed from non-experts in the function of their interruptions. Analysis of these functional asymmetries produced nine key behaviors that were used to identify the dominant math expert in a group with 95-100% accuracy within three minutes. This research demonstrates that overlapped speech is a marker of group problem-solving progress and domain expertise. It provides valuable information for the emerging field of learning analytics.
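The basic quantity behind these percentages can be computed from time-stamped speaker segments; the sketch below measures the fraction of a session during which two or more people speak at once. The segment format is an assumption for illustration, not the Math Data Corpus transcription format.

```python
# A minimal sketch of measuring simultaneous speech: sweep over segment
# start/end events and accumulate time with >= 2 concurrent speakers.
def overlap_fraction(segments, session_end):
    """segments: list of (speaker, start_s, end_s) tuples.
    Returns the fraction of the session with two or more people speaking."""
    events = []
    for _, start, end in segments:
        events.append((start, +1))
        events.append((end, -1))
    events.sort()  # ends (-1) sort before starts (+1) at equal timestamps
    active, last_t, overlap = 0, 0.0, 0.0
    for t, delta in events:
        if active >= 2:
            overlap += t - last_t
        active += delta
        last_t = t
    return overlap / session_end

segs = [("A", 0.0, 5.0), ("B", 3.0, 8.0), ("C", 7.5, 9.0)]
print(overlap_fraction(segs, session_end=10.0))  # 0.25: overlap at 3-5 and 7.5-8
```

Comparing this fraction between productive and matched non-productive intervals gives elevation figures of the kind the abstract reports.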
{"title":"Spoken Interruptions Signal Productive Problem Solving and Domain Expertise in Mathematics","authors":"S. Oviatt, Kevin Hang, Jianlong Zhou, Fang Chen","doi":"10.1145/2818346.2820743","DOIUrl":"https://doi.org/10.1145/2818346.2820743","url":null,"abstract":"Prevailing social norms prohibit interrupting another person when they are speaking. In this research, simultaneous speech was investigated in groups of students as they jointly solved math problems and peer tutored one another. Analyses were based on the Math Data Corpus, which includes ground-truth performance coding and speech transcriptions. Simultaneous speech was elevated 120-143% during the most productive phase of problem solving, compared with matched intervals. It also was elevated 18-37% in students who were domain experts, compared with non-experts. Qualitative analyses revealed that experts differed from non-experts in the function of their interruptions. Analysis of these functional asymmetries produced nine key behaviors that were used to identify the dominant math expert in a group with 95-100% accuracy in three minutes. This research demonstrates that overlapped speech is a marker of group problem-solving progress and domain expertise. It provides valuable information for the emerging field of learning analytics.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"29 3 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79862327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14