情感识别的多重损失

Anwer Slimi, M. Zrigui, H. Nicolas
{"title":"情感识别的多重损失","authors":"Anwer Slimi, M. Zrigui, H. Nicolas","doi":"10.1145/3512527.3531406","DOIUrl":null,"url":null,"abstract":"With the rise of human-machine interactions, it has become necessary for machines to better understand humans in order to respond appropriately. Hence, in order to increase communication and interaction, it would be ideal for machines to automatically detect human emotions. Speech Emotion Recognition (SER) has been a focus of a lot of studies in the past few years. However, they can be considered poor in accuracy and must be improved. In our work, we propose a new loss function that aims to encode speeches instead of classifying them directly as the majority of the existing models do. The encoding will be done in a way that utterances with the same labels would have similar encodings. The encoded speeches were tested on two datasets and we managed to get 88.19% accuracy with the RAVDESS (Ryerson Audiovisual Database of Emotional Speech and Song) dataset and 91.66% accuracy with the RML (Ryerson Multimedia Research Lab) dataset.","PeriodicalId":179895,"journal":{"name":"Proceedings of the 2022 International Conference on Multimedia Retrieval","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"MuLER: Multiplet-Loss for Emotion Recognition\",\"authors\":\"Anwer Slimi, M. Zrigui, H. Nicolas\",\"doi\":\"10.1145/3512527.3531406\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the rise of human-machine interactions, it has become necessary for machines to better understand humans in order to respond appropriately. Hence, in order to increase communication and interaction, it would be ideal for machines to automatically detect human emotions. Speech Emotion Recognition (SER) has been a focus of a lot of studies in the past few years. However, they can be considered poor in accuracy and must be improved. In our work, we propose a new loss function that aims to encode speeches instead of classifying them directly as the majority of the existing models do. The encoding will be done in a way that utterances with the same labels would have similar encodings. The encoded speeches were tested on two datasets and we managed to get 88.19% accuracy with the RAVDESS (Ryerson Audiovisual Database of Emotional Speech and Song) dataset and 91.66% accuracy with the RML (Ryerson Multimedia Research Lab) dataset.\",\"PeriodicalId\":179895,\"journal\":{\"name\":\"Proceedings of the 2022 International Conference on Multimedia Retrieval\",\"volume\":\"41 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 International Conference on Multimedia Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3512527.3531406\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 International Conference on Multimedia Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3512527.3531406","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

随着人机交互的兴起,机器有必要更好地了解人类,以便做出适当的反应。因此,为了增加交流和互动,机器能够自动检测人类的情绪将是理想的。语音情感识别(SER)是近年来研究的热点之一。然而,它们的准确性很差,必须加以改进。在我们的工作中,我们提出了一个新的损失函数,旨在对语音进行编码,而不是像大多数现有模型那样直接对语音进行分类。编码将以具有相同标签的话语具有相似编码的方式进行。编码后的语音在两个数据集上进行了测试,我们使用RAVDESS (Ryerson audio - visual Database of Emotional Speech and Song)数据集获得了88.19%的准确率,使用RML (Ryerson Multimedia Research Lab)数据集获得了91.66%的准确率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
MuLER: Multiplet-Loss for Emotion Recognition
With the rise of human-machine interactions, it has become necessary for machines to better understand humans in order to respond appropriately. Hence, in order to increase communication and interaction, it would be ideal for machines to automatically detect human emotions. Speech Emotion Recognition (SER) has been a focus of a lot of studies in the past few years. However, they can be considered poor in accuracy and must be improved. In our work, we propose a new loss function that aims to encode speeches instead of classifying them directly as the majority of the existing models do. The encoding will be done in a way that utterances with the same labels would have similar encodings. The encoded speeches were tested on two datasets and we managed to get 88.19% accuracy with the RAVDESS (Ryerson Audiovisual Database of Emotional Speech and Song) dataset and 91.66% accuracy with the RML (Ryerson Multimedia Research Lab) dataset.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Self-Lifting: A Novel Framework for Unsupervised Voice-Face Association Learning DMPCANet: A Low Dimensional Aggregation Network for Visual Place Recognition Revisiting Performance Measures for Cross-Modal Hashing MFGAN: A Lightweight Fast Multi-task Multi-scale Feature-fusion Model based on GAN Weakly Supervised Fine-grained Recognition based on Combined Learning for Small Data and Coarse Label
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1