Extreme Learning Machine for Automatic Language Identification Utilizing Emotion Speech Data

Musatafa Abbas Abbood Albadr, S. Tiun, M. Ayob, Fahad Taha Al-Dhief, Taj-Aldeen Naser Abdali, A. F. Abbas
Published in: 2021 International Conference on Electrical, Communication, and Computer Engineering (ICECCE)
Publication date: 2021-06-12
DOI: 10.1109/ICECCE52056.2021.9514107 (https://doi.org/10.1109/ICECCE52056.2021.9514107)
Citations: 9

Abstract

Spoken Language Identification (LID) is the task of recognizing a language from an utterance of speech. It plays an important role in human-computer interaction and can be applied in call centers, speaker diarization in multilingual environments, and speech-to-speech translation systems. However, most LID studies have focused on neutral speech only, even though handling emotional speech is crucial in real applications. This study therefore investigates the performance of the Extreme Learning Machine (ELM) in an LID system using emotional speech. The system is evaluated on two languages, German and English, using the Berlin Emotional Speech Dataset (BESD) for German and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) for English. Four evaluation scenarios were conducted: All Dataset (AD), Normal-Speech Dependent (N-SD), Gender-Female Dependent (G-FD), and Gender-Male Dependent (G-MD). The experimental results show that the highest accuracies achieved were 99.08%, 100.00%, 98.22%, and 99.37% for the AD, N-SD, G-FD, and G-MD scenarios, respectively.
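The abstract names the classifier but gives no details of its configuration. As a rough illustration of the general ELM technique, the sketch below trains a single-hidden-layer network whose hidden weights are drawn at random and left fixed, and whose output weights are obtained in closed form by regularized least squares. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name `ELMClassifier`, the sigmoid activation, the hidden-layer size, the ridge term `reg`, and the synthetic 13-dimensional features from `make_toy_data`, which merely stand in for acoustic features and are not BESD or RAVDESS data.

```python
import numpy as np


class ELMClassifier:
    """Generic ELM sketch: random fixed hidden layer, closed-form output weights."""

    def __init__(self, n_hidden=50, reg=1e-3, seed=0):
        self.n_hidden = n_hidden  # number of hidden units (assumed value)
        self.reg = reg            # ridge regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-hot encode the labels as the regression targets.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden weights and biases are sampled once and never trained.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: solve (H^T H + reg * I) beta = H^T T.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def make_toy_data(n_per_class=100, n_features=13, seed=1):
    """Synthetic two-class data standing in for per-utterance features."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=-1.0, size=(n_per_class, n_features))  # class 0
    X1 = rng.normal(loc=1.0, size=(n_per_class, n_features))   # class 1
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


X, y = make_toy_data()
model = ELMClassifier().fit(X, y)
accuracy = float(np.mean(model.predict(X) == y))  # training accuracy on toy data
```

Because only the output weights are fitted, training reduces to a single linear solve, which is the main practical appeal of an ELM. The accuracy computed here is on synthetic training data and says nothing about the figures reported in the abstract.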