Emotional Climate Recognition in Interactive Conversational Speech Using Deep Learning

Ghada Alhussein, M. Alkhodari, Ahsan Khandokher, L. Hadjileontiadis
Published in: 2022 IEEE International Conference on Digital Health (ICDH), 2022-07-01
DOI: 10.1109/ICDH55609.2022.00023
Citations: 2

Abstract

Emotions play a pivotal role in an individual's overall physical health, and interest in emotion recognition in conversation (ERC) has been growing steadily. In this work, we propose models based on bidirectional long short-term memory (Bi-LSTM), a convolutional neural network (CNN), and a combined CNN-BiLSTM to predict the emotional climate that peers establish during a conversation. The peers' speech signals across the conversation are analyzed using Mel-frequency cepstral coefficients (MFCCs), which are then fed to the Bi-LSTM, CNN, and CNN-BiLSTM models to predict the valence and arousal emotional climate cues. The proposed approach was tested on a publicly available dataset, K-EmoCon, which includes emotion labels and the peers' speech signals recorded during their conversation. The Bi-LSTM, CNN, and CNN-BiLSTM models achieved classification accuracies (arousal/valence) of 67.5%/57.7%, 73.3%/66.9%, and 75.1%/68.3%, respectively. These encouraging results show that a combination of deep learning schemes could increase the classification accuracy and provide efficient emotional climate recognition in naturalistic conversation environments.
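The abstract names MFCCs as the speech features fed to the models but gives no extraction settings. The following is a minimal standard-library sketch of one common way to compute MFCCs (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). The frame length, hop size, filter count, coefficient count, and the synthetic 440 Hz test tone are all illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula (one of several conventions in use).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_coeffs=5):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All default sizes are assumed for illustration only.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        # Small constant avoids log(0) for empty or silent bands.
        log_energy = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                      for row in bank]
        # DCT-II decorrelates the log filterbank energies.
        features.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_coeffs)
        ])
    return features


# Synthetic stand-in for a speech segment (assumption: 8 kHz, 256 samples).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
features = mfcc(tone, sample_rate)
print(len(features), len(features[0]))
```

The result is a frames-by-coefficients matrix. A recurrent model such as a Bi-LSTM would typically read it frame by frame as a sequence, while a CNN would treat it as a 2-D input; how the paper's models actually consume the features is not stated in the abstract.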