Detection of negative emotions in speech signals using bags-of-audio-words

2015 International Conference on Affective Computing and Intelligent Interaction (ACII)
Pub Date: 2015-09-21
DOI: 10.1109/ACII.2015.7344678

Florian B. Pokorny, F. Graf, F. Pernkopf, Björn Schuller

Pages: 879-884

Citations: 35

Abstract

Boosted by a wide range of potential applications, emotional speech recognition, i.e., the automatic computer-aided identification of human emotional states based on speech signals, is currently a popular field of research. However, many studies, especially those concentrating on the recognition of negative emotions, have neglected the specific requirements of real-world scenarios, for example, robustness, real-time capability, and realistic speech corpora. Motivated by these facts, a robust, low-complexity classification system for the detection of negative emotions in speech signals was implemented on the basis of a spontaneous, strongly emotionally colored speech corpus. To this end, an approach that is innovative in the field of emotion recognition was applied as the core of the system: the bag-of-words approach originally known from text and image document retrieval applications. Thorough performance evaluations were carried out, and a promising recognition accuracy of 65.6% for the 2-class paradigm of negative versus non-negative emotional states attests to the potential of bags-of-words in speech emotion recognition in the wild.
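The abstract does not spell out how the bag-of-words idea carries over to audio, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes that each utterance is already represented as a list of frame-level feature vectors, that the codebook of "audio words" is learned with a toy k-means (an assumption; the abstract does not name the method), and that the resulting normalized histogram would then be passed to some 2-class classifier. All function names and the 2-D toy features are hypothetical.

```python
import random
from math import dist


def nearest(frame, centroids):
    """Index of the centroid ("audio word") closest to one feature frame."""
    return min(range(len(centroids)), key=lambda i: dist(frame, centroids[i]))


def learn_codebook(frames, k, iters=10, seed=0):
    """Toy k-means over training frames; returns k centroids (assumed method)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            clusters[nearest(f, centroids)].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bag_of_audio_words(frames, centroids):
    """Quantize each frame to its nearest audio word and return a
    normalized histogram of word counts for the whole utterance."""
    counts = [0] * len(centroids)
    for f in frames:
        counts[nearest(f, centroids)] += 1
    total = len(frames) or 1  # avoid division by zero for an empty utterance
    return [c / total for c in counts]


# Hypothetical 2-D frame features, for illustration only.
train_frames = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
codebook = learn_codebook(train_frames, k=2)
utterance = [(0.05, 0.05), (4.9, 5.0), (5.2, 5.1)]
histogram = bag_of_audio_words(utterance, codebook)
print(histogram)
```

The histogram has a fixed length (the codebook size) regardless of utterance duration, which is what lets a standard classifier separate negative from non-negative utterances on top of it.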
