Online Affect Tracking with Multimodal Kalman Filters

Krishna Somandepalli, Rahul Gupta, Md. Nasir, Brandon M. Booth, Sungbok Lee, Shrikanth S. Narayanan
{"title":"Online Affect Tracking with Multimodal Kalman Filters","authors":"Krishna Somandepalli, Rahul Gupta, Md. Nasir, Brandon M. Booth, Sungbok Lee, Shrikanth S. Narayanan","doi":"10.1145/2988257.2988259","DOIUrl":null,"url":null,"abstract":"Arousal and valence have been widely used to represent emotions dimensionally and measure them continuously in time. In this paper, we introduce a computational framework for tracking these affective dimensions from multimodal data as an entry to the Multimodal Affect Recognition Sub-Challenge of the 2016 Audio/Visual Emotion Challenge and Workshop (AVEC2016). We propose a linear dynamical system approach with a late fusion method that accounts for the dynamics of the affective state evolution (i.e., arousal or valence). To this end, single-modality predictions are modeled as observations in a Kalman filter formulation in order to continuously track each affective dimension. Leveraging the inter-correlations between arousal and valence, we use the predicted arousal as an additional feature to improve valence predictions. Furthermore, we propose a conditional framework to select Kalman filters of different modalities while tracking. This framework employs voicing probability and facial posture cues to detect the absence or presence of each input modality. Our multimodal fusion results on the development and the test set provide a statistically significant improvement over the baseline system from AVEC2016. The proposed approach can be potentially extended to other multimodal tasks with inter-correlated behavioral dimensions.","PeriodicalId":432793,"journal":{"name":"Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2988257.2988259","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Arousal and valence have been widely used to represent emotions dimensionally and measure them continuously in time. In this paper, we introduce a computational framework for tracking these affective dimensions from multimodal data as an entry to the Multimodal Affect Recognition Sub-Challenge of the 2016 Audio/Visual Emotion Challenge and Workshop (AVEC 2016). We propose a linear dynamical system approach with a late fusion method that accounts for the dynamics of the affective state evolution (i.e., arousal or valence). To this end, single-modality predictions are modeled as observations in a Kalman filter formulation in order to continuously track each affective dimension. Leveraging the inter-correlations between arousal and valence, we use the predicted arousal as an additional feature to improve valence predictions. Furthermore, we propose a conditional framework to select Kalman filters of different modalities while tracking. This framework employs voicing probability and facial posture cues to detect the absence or presence of each input modality. Our multimodal fusion results on the development and test sets provide a statistically significant improvement over the AVEC 2016 baseline system. The proposed approach can potentially be extended to other multimodal tasks with inter-correlated behavioral dimensions.
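The sketch below is an illustrative reconstruction of the late-fusion idea described in the abstract, not the authors' implementation. It assumes a scalar random-walk state for each affective dimension and a fixed observation-noise variance per modality; a boolean availability mask stands in for the voicing-probability and facial-posture gating. The class and parameter names (MultimodalKalmanFusion, process_var, obs_var) are our own.

```python
import numpy as np

class MultimodalKalmanFusion:
    """Scalar Kalman filter treating per-modality affect
    predictions as noisy observations of one latent state."""

    def __init__(self, n_modalities, process_var=1e-3, obs_var=None):
        self.n = n_modalities
        self.x = 0.0            # latent affective state (arousal or valence)
        self.P = 1.0            # estimate variance
        self.Q = process_var    # random-walk process noise (assumption)
        # Per-modality observation noise; would be fit on development data.
        self.R = (np.full(n_modalities, 0.1) if obs_var is None
                  else np.asarray(obs_var, dtype=float))

    def step(self, z, available):
        """One frame: z holds per-modality predictions; available is a
        boolean mask gating modalities detected as present."""
        # Predict: identity state transition, x_t = x_{t-1} + w_t.
        self.P += self.Q
        # Sequentially fold in each available modality; for a scalar state
        # with independent noises this equals a joint update.
        for m in range(self.n):
            if not available[m]:
                continue
            K = self.P / (self.P + self.R[m])   # scalar Kalman gain
            self.x += K * (z[m] - self.x)       # innovation correction
            self.P *= (1.0 - K)
        return self.x

# Example frame: three single-modality predictions, one stream missing.
fusion = MultimodalKalmanFusion(n_modalities=3)
preds = np.array([0.2, 0.35, 0.1])
mask = np.array([True, True, False])
estimate = fusion.step(preds, mask)
```

In the paper, the predicted arousal trajectory is additionally fed back as an input feature to the valence predictors; that cross-dimension coupling, and the tuning of the noise parameters on the development set, are omitted here for brevity.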