An Active Learning Paradigm for Online Audio-Visual Emotion Recognition

IEEE Transactions on Affective Computing · IF 9.8 · CAS Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) · Pub Date: 2019-12-20 · DOI: 10.1109/TAFFC.2019.2961089
Ioannis Kansizoglou;Loukas Bampis;Antonios Gasteratos
Vol. 13, No. 2, pp. 756-768 · https://ieeexplore.ieee.org/document/8937495/
Citations: 54

Abstract

The advancement of Human-Robot Interaction (HRI) drives research into the development of advanced emotion identification architectures that fathom audio-visual (A-V) modalities of human emotion. State-of-the-art methods in multi-modal emotion recognition mainly focus on the classification of complete video sequences, leading to systems with no online potentialities. Such techniques are capable of predicting emotions only when the videos are concluded, thus restricting their applicability in practical scenarios. This article provides a novel paradigm for online emotion classification, which exploits both audio and visual modalities and produces a responsive prediction when the system is confident enough. We propose two deep Convolutional Neural Network (CNN) models for extracting emotion features, one for each modality, and a Deep Neural Network (DNN) for their fusion. In order to conceive the temporal quality of human emotion in interactive scenarios, we train in cascade a Long Short-Term Memory (LSTM) layer and a Reinforcement Learning (RL) agent –which monitors the speaker– thus stopping feature extraction and making the final prediction. The comparison of our results on two publicly available A-V emotional datasets viz., RML and BAUM-1s, against other state-of-the-art models, demonstrates the beneficial capabilities of our work.
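The online behaviour described in the abstract (accumulate evidence over time, then stop and predict once the system is confident) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `online_predict`, the `threshold` parameter, and the toy scores are all hypothetical. A running sum of per-step class scores stands in for the LSTM state, and a fixed confidence threshold stands in for the learned RL agent that decides when to stop.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def online_predict(
    step_scores: Iterable[Sequence[float]],
    threshold: float = 0.8,
) -> Tuple[int, int, float]:
    """Consume per-step class scores and stop as soon as one class is confident.

    Returns (predicted_label, steps_used, confidence). If the threshold is
    never reached, the prediction after the last step is returned.
    """
    evidence: List[float] = []
    label, steps, confidence = -1, 0, 0.0
    for steps, scores in enumerate(step_scores, start=1):
        # Running sum of scores: an assumed stand-in for the recurrent (LSTM) state.
        if evidence:
            evidence = [e + s for e, s in zip(evidence, scores)]
        else:
            evidence = list(scores)
        probs = softmax(evidence)
        label = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[label]
        # Fixed threshold: an assumed stand-in for the learned RL stopping decision.
        if confidence >= threshold:
            break
    if steps == 0:
        raise ValueError("step_scores must contain at least one step")
    return label, steps, confidence


if __name__ == "__main__":
    # Toy stream of fused audio-visual class scores, one list per time step.
    stream = [[0.2, 0.1, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
    label, steps, confidence = online_predict(stream)
    print(f"label={label} steps={steps} confidence={confidence:.3f}")
```

With the toy stream above, the loop stops after the third step and never reads the fourth, which is the practical point of an online classifier: the prediction arrives before the sequence ends. A learned stopping policy, as in the paper, replaces the hand-set threshold with a decision trained to balance earliness against accuracy.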
Source journal
IEEE Transactions on Affective Computing
Categories: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, CYBERNETICS
CiteScore: 15.00
Self-citation rate: 6.20%
Articles published: 174
Journal description: The IEEE Transactions on Affective Computing is an international and interdisciplinary journal. Its primary goal is to share research findings on the development of systems capable of recognizing, interpreting, and simulating human emotions and related affective phenomena. The journal publishes original research on the underlying principles and theories that explain how and why affective factors shape human-technology interactions. It also focuses on how techniques for sensing and simulating affect can enhance our understanding of human emotions and processes. Additionally, the journal explores the design, implementation, and evaluation of systems that prioritize the consideration of affect in their usability. We also welcome surveys of existing work that provide new perspectives on the historical and future directions of this field.