
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Hierarchical Committee of Deep CNNs with Exponentially-Weighted Decision Fusion for Static Facial Expression Recognition
Bo-Kyeong Kim, Hwaran Lee, Jihyeon Roh, Soo-Young Lee
We present a pattern recognition framework to improve committee machines of deep convolutional neural networks (deep CNNs) and its application to static facial expression recognition in the wild (SFEW). In order to generate enough diversity of decisions, we trained multiple deep CNNs by varying network architectures, input normalization, and weight initialization as well as by adopting several learning strategies to use large external databases. Moreover, with these deep models, we formed hierarchical committees using the validation-accuracy-based exponentially-weighted average (VA-Expo-WA) rule. Through extensive experiments, the great strengths of our committee machines were demonstrated in both structural and decisional ways. On the SFEW2.0 dataset released for the 3rd Emotion Recognition in the Wild (EmotiW) sub-challenge, a test accuracy of 57.3% was obtained from the best single deep CNN, while the single-level committees yielded 58.3% and 60.5% with the simple average rule and with the VA-Expo-WA rule, respectively. Our final submission based on the 3-level hierarchy using the VA-Expo-WA achieved 61.6%, significantly higher than the SFEW baseline of 39.1%.
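To make the decision-fusion step concrete, here is a minimal Python sketch of what a validation-accuracy-based exponentially-weighted average could look like: each committee member's class-probability vector is weighted in proportion to exp(validation accuracy) before averaging. The function name, the temperature knob, and the exact weighting form are assumptions for illustration, not the authors' published implementation.

```python
import numpy as np

def va_expo_wa_fusion(member_probs, val_accuracies, temperature=1.0):
    """Fuse committee members' class probabilities with weights that grow
    exponentially with each member's validation accuracy.

    member_probs: array of shape (n_members, n_classes) -- softmax outputs.
    val_accuracies: array of shape (n_members,) -- validation accuracies in [0, 1].
    temperature: controls how sharply better members dominate (assumed knob,
                 not necessarily part of the published rule).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    val_accuracies = np.asarray(val_accuracies, dtype=float)

    # Exponential weighting: better-validated members get exponentially more say.
    weights = np.exp(val_accuracies / temperature)
    weights /= weights.sum()

    # Weighted average of the members' probability vectors.
    fused = (weights[:, None] * member_probs).sum(axis=0)
    return int(np.argmax(fused)), fused

# Toy usage: three members, two classes.
probs = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
accs = [0.55, 0.58, 0.50]
label, fused = va_expo_wa_fusion(probs, accs)
print(label, fused)
```

A lower temperature lets the best-validated members dominate the vote, which is the intuition behind preferring such a rule over a simple average.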
DOI: 10.1145/2818346.2830590 | Published: 2015-11-09
Citations: 124
Session details: Oral Session 5: Interaction Techniques
S. Oviatt
DOI: 10.1145/3252450 | Published: 2015-11-09
Citations: 0
Session details: Keynote Address 1
Zhengyou Zhang
DOI: 10.1145/3252443 | Published: 2015-11-09
Citations: 0
Nakama: A Companion for Non-verbal Affective Communication
Christian J. A. M. Willemse, G. M. Munters, J. V. Erp, D. Heylen
We present "Nakama": A communication device that supports affective communication between a child and its - geographically separated - parent. Nakama consists of a control unit at the parent's end and an actuated teddy bear for the child. The bear contains several communication channels, including social touch, temperature, and vibrotactile heartbeats; all aimed at increasing the sense of presence. The current version of Nakama is suitable for user evaluations in lab settings, with which we aim to gain a more thorough understanding of the opportunities and limitations of these less traditional communication channels.
DOI: 10.1145/2818346.2823299 | Published: 2015-11-09
Citations: 1
Session details: Oral Session 6: Mobile and Wearable
M. Johnston
DOI: 10.1145/3252451 | Published: 2015-11-09
Citations: 0
Detecting and Identifying Tactile Gestures using Deep Autoencoders, Geometric Moments and Gesture Level Features
Dana Hughes, N. Farrow, Halley P. Profita, N. Correll
While several sensing modalities and transduction approaches have been developed for tactile sensing in robotic skins, there has been much less work towards extracting features for or identifying high-level gestures performed on the skin. In this paper, we investigate using deep neural networks with hidden Markov models (DNN-HMMs), geometric moments and gesture level features to identify a set of gestures performed on robotic skins. We demonstrate that these features are useful for identifying gestures, and predict a set of gestures from a 14-class dataset with 56% accuracy, and a 7-class dataset with 71% accuracy.
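As an illustration of the geometric-moment features mentioned above, the following small Python sketch computes raw and central moments of one tactile pressure frame. The patch size, moment orders, and normalization are illustrative assumptions rather than the paper's exact feature set.

```python
import numpy as np

def geometric_moments(frame, max_order=2):
    """Raw and central geometric moments of a single tactile pressure frame.

    frame: 2-D array of per-taxel pressure values (hypothetical 8x8 skin patch).
    Returns a feature vector of central moments up to `max_order`; which orders
    the paper actually uses is an assumption here.
    """
    frame = np.asarray(frame, dtype=float)
    ys, xs = np.mgrid[0:frame.shape[0], 0:frame.shape[1]]

    def raw(p, q):
        return float((xs**p * ys**q * frame).sum())

    m00 = raw(0, 0) + 1e-9                       # total pressure (avoid divide-by-zero)
    cx, cy = raw(1, 0) / m00, raw(0, 1) / m00    # pressure centroid

    feats = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            # Central moments are translation-invariant descriptors of the contact shape.
            feats.append(float((((xs - cx)**p) * ((ys - cy)**q) * frame).sum()))
    return np.array(feats)

# Toy usage: a synthetic swipe-like pressure blob on an 8x8 patch.
frame = np.zeros((8, 8))
frame[3, 2:6] = [0.2, 0.8, 0.8, 0.2]
print(geometric_moments(frame))
```

Per-frame vectors like this could then be fed to a sequence model such as the DNN-HMMs described in the abstract.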
DOI: 10.1145/2818346.2830601 | Published: 2015-11-09
Citations: 26
Interaction Studies with Social Robots
K. Dautenhahn
Over the past 10 years we have seen worldwide an immense growth of research and development into companion robots. Those are robots that fulfil particular tasks, but do so in a socially acceptable manner. The companionship aspect reflects the repeated and long-term nature of such interactions, and the potential of people to form relationships with such robots, e.g. as friendly assistants. A number of companion and assistant robots have been entering the market; two of the latest examples are Aldebaran's Pepper robot and Jibo (Cynthia Breazeal). Companion robots are more and more targeting particular application areas, e.g. as home assistants or therapeutic tools. Research into companion robots needs to address many fundamental research problems concerning perception, cognition, action and learning, but regardless of how sophisticated our robotic systems may be, the potential users need to be taken into account from the early stages of development. The talk will emphasize the need for a highly user-centred approach towards design, development and evaluation of companion robots. An important challenge is to evaluate robots in realistic and long-term scenarios, in order to capture as closely as possible those key aspects that will play a role when using such robots in the real world. In order to illustrate these points, my talk will give examples of interaction studies that my research team has been involved in. This includes studies into how people perceive robots' non-verbal cues, creating and evaluating realistic scenarios for home companion robots using narrative framing, and verbal and tactile interaction of children with the therapeutic and social robot Kaspar. The talk will highlight the issues we encountered when we proceeded from laboratory-based experiments and prototypes to real-world applications.
DOI: 10.1145/2818346.2818347 | Published: 2015-11-09
Citations: 1
Capturing AU-Aware Facial Features and Their Latent Relations for Emotion Recognition in the Wild
Anbang Yao, Junchao Shao, Ningning Ma, Yurong Chen
The Emotion Recognition in the Wild (EmotiW) Challenge has been held for three years. Previous winning teams primarily focus on designing specific deep neural networks or fusing diverse hand-crafted and deep convolutional features. They all neglect to explore the significance of the latent relations among changing features resulting from facial muscle motions. In this paper, we study this recognition challenge from the perspective of analyzing the relations among expression-specific facial features in an explicit manner. Our method has three key components. First, we propose a pair-wise learning strategy to automatically seek a set of facial image patches which are important for discriminating two particular emotion categories. We found these learnt local patches are in part consistent with the locations of expression-specific Action Units (AUs), thus the features extracted from such facial patches are named AU-aware facial features. Second, in each pair-wise task, we use an undirected graph structure, which takes learnt facial patches as individual vertices, to encode feature relations between any two learnt facial patches. Finally, a robust emotion representation is constructed by concatenating all task-specific graph-structured facial feature relations sequentially. Extensive experiments on the EmotiW 2015 Challenge testify to the efficacy of the proposed approach. Without using additional data, our final submissions achieved competitive results on both sub-challenges, including the image based static facial expression recognition (we got 55.38% recognition accuracy, outperforming the baseline of 39.13% by a margin of 16.25%) and the audio-video based emotion recognition (we got 53.80% recognition accuracy, outperforming the baseline of 39.33% and the 2014 winning team's final result of 50.37% by margins of 14.47% and 3.43%, respectively).
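To illustrate the graph-structured relation encoding described here, the sketch below builds an emotion representation from every unordered pair of learnt patches and concatenates the pairwise relations. The specific relation function (element-wise difference plus a cosine similarity) is an assumed stand-in for whatever encoding the paper actually uses.

```python
import numpy as np
from itertools import combinations

def pairwise_relation_features(patch_feats):
    """Encode relations between every pair of learnt facial patches.

    patch_feats: array of shape (n_patches, d) -- one descriptor per AU-aware patch.
    The relation used here (element-wise difference plus a cosine similarity)
    is an illustrative choice, not necessarily the paper's exact encoding.
    """
    patch_feats = np.asarray(patch_feats, dtype=float)
    relations = []
    for i, j in combinations(range(len(patch_feats)), 2):  # undirected graph: each pair once
        a, b = patch_feats[i], patch_feats[j]
        cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
        relations.append(np.concatenate([a - b, [cos]]))
    # Concatenate all pairwise relations into one representation vector.
    return np.concatenate(relations)

# Toy usage: 4 hypothetical patches with 3-D descriptors.
feats = np.random.default_rng(0).normal(size=(4, 3))
print(pairwise_relation_features(feats).shape)  # 6 pairs x (3 + 1) dims -> (24,)
```

In the paper's pair-wise setting, a separate representation like this would be built for each emotion-pair task before the final concatenation.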
DOI: 10.1145/2818346.2830585 | Published: 2015-11-09
Citations: 101
Video and Image based Emotion Recognition Challenges in the Wild: EmotiW 2015
Abhinav Dhall, O. V. R. Murthy, Roland Göcke, Jyoti Joshi, Tom Gedeon
The third Emotion Recognition in the Wild (EmotiW) 2015 challenge consists of an audio-video based emotion classification sub-challenge and a static image based facial expression classification sub-challenge, both of which mimic real-world conditions. The two sub-challenges are based on the Acted Facial Expression in the Wild (AFEW) 5.0 and the Static Facial Expression in the Wild (SFEW) 2.0 databases, respectively. The paper describes the data, baseline method, challenge protocol and the challenge results. A total of 12 and 17 teams participated in the video based emotion and image based expression sub-challenges, respectively.
DOI: 10.1145/2818346.2829994 | Published: 2015-11-09
Citations: 281
Attention and Engagement Aware Multimodal Conversational Systems
Zhou Yu
Despite their ability to complete certain tasks, dialog systems still suffer from poor adaptation to users' engagement and attention. We observe human behaviors in different conversational settings to understand human communication dynamics and then transfer the knowledge to multimodal dialog system design. To focus solely on maintaining engaging conversations, we design and implement a non-task oriented multimodal dialog system, which serves as a framework for controlled multimodal conversation analysis. We design computational methods to model user engagement and attention in real time by leveraging automatically harvested multimodal human behaviors, such as smiles and speech volume. We aim to design and implement a multimodal dialog system to coordinate with users' engagement and attention on the fly via techniques such as adaptive conversational strategies and incremental speech production.
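As a toy illustration of combining automatically harvested cues such as smiles and speech volume into a real-time engagement estimate, the sketch below applies a weighted sum with exponential smoothing. The cue weights, smoothing factor, and class itself are hypothetical and not the model described in the abstract.

```python
class EngagementEstimator:
    """Toy real-time engagement estimate from two multimodal cues.

    Combines smile intensity and speech volume (both assumed normalized to
    [0, 1]) with an exponential moving average; the weights and smoothing
    factor are illustrative, not the system's actual model.
    """

    def __init__(self, w_smile=0.6, w_volume=0.4, alpha=0.2):
        self.w_smile, self.w_volume, self.alpha = w_smile, w_volume, alpha
        self.score = 0.5  # start from a neutral engagement level

    def update(self, smile_intensity, speech_volume):
        instant = self.w_smile * smile_intensity + self.w_volume * speech_volume
        # Smooth over time so momentary blips do not swing the estimate.
        self.score = (1 - self.alpha) * self.score + self.alpha * instant
        return self.score

# Toy usage: engagement rises as smiles and speech volume increase.
est = EngagementEstimator()
for smile, vol in [(0.1, 0.2), (0.7, 0.5), (0.9, 0.6)]:
    print(round(est.update(smile, vol), 3))
```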
DOI: 10.1145/2818346.2823309 | Published: 2015-11-09
Citations: 5