Emotion Modeling in Speech Signals: Discrete Wavelet Transform and Machine Learning Tools for Emotion Recognition System
K. Daqrouq, A. Balamesh, O. Alrusaini, A. Alkhateeb, A. S. Balamash
Applied Computational Intelligence and Soft Computing (impact factor 2.4; JCR Q3, Computer Science, Artificial Intelligence), published 2024-04-02
DOI: https://doi.org/10.1155/2024/7184018
Abstract
Speech emotion recognition (SER) is challenging because emotions are complex and subtle. This study proposes a novel approach to emotion modeling from speech signals that combines the discrete wavelet transform (DWT) with linear prediction coding (LPC). Several classifiers, namely support vector machine (SVM), K-Nearest Neighbors (KNN), Efficient Logistic Regression, Naive Bayes, Ensemble, and Neural Network, were evaluated for emotion classification on the EMO-DB dataset. Evaluation used area under the curve (AUC) and average prediction accuracy, together with cross-validation. The results indicate that the KNN and SVM classifiers distinguished sadness from other emotions with high accuracy. Ensemble methods and Neural Networks also performed strongly in sadness classification. Efficient Logistic Regression and Naive Bayes were competitive but slightly less accurate than the other classifiers, and Efficient Logistic Regression exhibited the lowest accuracies overall. The proposed feature extraction method yielded the highest average accuracy, and combining it with formants or wavelet entropy further improved classification accuracy. The distinguishing contribution of this study is that it investigates a combined feature extraction method and compares it against various other feature combinations. The aims of the investigation are improved classifier performance, a highly effective system, and applicability to emotion classification tasks. These findings can guide the selection of appropriate classifiers and feature extraction methods in future research and real-world applications. Further investigations can focus on refining classifiers and exploring additional feature extraction techniques to enhance emotion classification accuracy.
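To make the DWT-plus-LPC idea in the abstract concrete, the following is a minimal sketch of how LPC coefficients might be computed per wavelet subband and concatenated into a feature vector. The abstract does not state the wavelet family, the decomposition depth, the LPC order, or how subbands are combined, so the Haar wavelet, the `levels` and `order` parameters, and the helper names (`haar_dwt`, `lpc`, `dwt_lpc_features`) are all assumptions made for illustration and should not be read as the authors' implementation.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    n = len(signal) - len(signal) % 2  # ignore a trailing odd sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, n, 2)]
    return approx, detail


def lpc(signal, order):
    """LPC coefficients a[1..order] via autocorrelation and Levinson-Durbin.

    Predictor convention: x[n] is approximated by sum_k a[k] * x[n - k].
    """
    n = len(signal)
    r = [sum(signal[i] * signal[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    if r[0] == 0.0:  # silent subband: nothing to predict
        return [0.0] * order
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:  # numerically perfect prediction: stop early
            break
    return a[1:]


def dwt_lpc_features(signal, levels=3, order=4):
    """Concatenate LPC coefficients of each detail subband and the final
    approximation subband (an assumed way of combining DWT and LPC)."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.extend(lpc(detail, order))
    features.extend(lpc(approx, order))
    return features


if __name__ == "__main__":
    # Synthetic stand-in for a speech frame (two sinusoids), not EMO-DB data.
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(256)]
    print(len(dwt_lpc_features(frame)))
```

With `levels=3` and `order=4` the vector has (3 + 1) × 4 entries; in practice such a vector would be computed per utterance or per frame and passed to a classifier such as KNN or SVM.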
About the journal:
Applied Computational Intelligence and Soft Computing focuses on the disciplines of computer science, engineering, and mathematics. Its scope includes the development of applications related to all aspects of the natural and social sciences using the technologies of computational intelligence and soft computing. Although computational intelligence and soft computing are established fields, their new applications are still in development and can be regarded as an emerging field, which is the focus of this journal.