Cross-language acoustic emotion recognition: An overview and some tendencies

S. M. Feraru, Dagmar M. Schuller, Björn Schuller
Published in: 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 125-131
Publication date: 2015-09-21
DOI: 10.1109/ACII.2015.7344561 (https://doi.org/10.1109/ACII.2015.7344561)
Citations: 58

Abstract

Automatic emotion recognition from speech has matured close to the point where it attracts broader commercial interest. One of the last major limiting factors is the ability to deal with multilingual input, which a real-life system will encounter in many if not most cases. Because speech in real-life scenarios often mixes languages, more experience is needed on how cross-language recognition affects performance. In this contribution, we first provide an overview of the languages covered in research on emotion and speech, finding that only roughly two thirds of native speakers' languages have been touched upon so far. We then shed light on mismatched versus matched condition emotion recognition across a variety of languages. We intentionally include less-researched languages from more distant language families, such as Burmese, Romanian or Turkish. Binary arousal and valence mapping is employed so that we can train and test across databases that were originally labelled with diverse categories. As one may expect, arousal recognition works considerably better across languages than valence recognition, and cross-language recognition falls considerably behind within-language recognition. However, recognition within a language family seems to provide an 'emergency solution' when language resources are missing, and the notable differences observed depending on the combination of languages show a number of interesting effects.
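
The binary arousal and valence mapping mentioned in the abstract can be pictured with a minimal sketch. The category names and their quadrant assignments below are illustrative assumptions based on a common convention, not the mapping table used in the paper, and the function names are hypothetical.

```python
# Illustrative mapping only: the categories and quadrant assignments are
# assumptions following a common convention, not the paper's actual table.
AROUSAL_VALENCE = {
    "anger": ("high", "negative"),
    "fear": ("high", "negative"),
    "joy": ("high", "positive"),
    "sadness": ("low", "negative"),
    "boredom": ("low", "negative"),
    "neutral": ("low", "positive"),
}


def to_binary_labels(category):
    """Return the (arousal, valence) pair for one categorical emotion label."""
    key = category.strip().lower()
    if key not in AROUSAL_VALENCE:
        raise KeyError(f"no arousal/valence mapping for {category!r}")
    return AROUSAL_VALENCE[key]


def remap_corpus(labels):
    """Map a list of categorical labels to parallel arousal and valence lists."""
    pairs = [to_binary_labels(label) for label in labels]
    arousal = [a for a, _ in pairs]
    valence = [v for _, v in pairs]
    return arousal, valence
```

Once every database is remapped this way, a classifier trained on one language can be tested on another even when the original label sets differ, which is what makes the matched versus mismatched comparison possible.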