{"title":"A 2D Convolution Neural Network Based Method for Human Emotion Classification from Speech Signal","authors":"Rakhi Rani Paul, S. Paul, Md. Ekramul Hamid","doi":"10.1109/ICCIT57492.2022.10054811","DOIUrl":null,"url":null,"abstract":"recognizing emotions from speech signals is one of the active research fields in the area of human information processing as well as man-machine interaction. Different persons have different emotions and altogether different ways of expressing them. In this paper, a 2D Convolutional Neural Network (CNN) based method is presented for human emotion classification. We consider RAVDESS and SAVEE datasets to evaluate the performance of the model. Initially, Mel-frequency cepstral coefficients MFCC features are extracted from the speech signals which are used for the training purpose. Here, we consider only forty (40) cepstrum coefficients per frame. The proposed 2D CNN model is trained to classify seven different emotional states (neutral, calm, happy, sad, angry, scared, disgust, surprised). We achieve 89.86% overall accuracy from our proposed model for the RAVDESS dataset and 83.57% for the SAVEE dataset respectively. It is found that happy class is classified with an accuracy of 96% for the RAVDESS dataset and 92% for the SAVEE dataset. Lastly, the result of our proposed model is compared with the other recent existing works. The performance of our proposed model is good enough because it achieves better accuracy than other models. This work has many real-life applications such as man-machine interaction, auto supervision, auxiliary lie detection, the discovery of dissatisfaction with the client’s mode, detecting neurological disordered patients and so on.","PeriodicalId":255498,"journal":{"name":"2022 25th International Conference on Computer and Information Technology (ICCIT)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 25th International Conference on Computer and Information Technology (ICCIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIT57492.2022.10054811","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Recognizing emotions from speech signals is one of the active research fields in human information processing and man-machine interaction. Different people experience emotions differently and express them in altogether different ways. In this paper, a 2D Convolutional Neural Network (CNN) based method is presented for human emotion classification. We use the RAVDESS and SAVEE datasets to evaluate the performance of the model. First, Mel-frequency cepstral coefficient (MFCC) features are extracted from the speech signals and used for training; only forty (40) cepstral coefficients per frame are retained. The proposed 2D CNN model is trained to classify eight emotional states (neutral, calm, happy, sad, angry, scared, disgust, surprised). Our proposed model achieves 89.86% overall accuracy on the RAVDESS dataset and 83.57% on the SAVEE dataset. The happy class is classified with an accuracy of 96% on RAVDESS and 92% on SAVEE. Lastly, the results of our proposed model are compared with other recent existing works, and it achieves better accuracy than these models. This work has many real-life applications, such as man-machine interaction, automatic supervision, auxiliary lie detection, detecting dissatisfaction in a client's mood, detecting patients with neurological disorders, and so on.
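To make the pipeline concrete, the sketch below shows one plausible way to extract 40 MFCCs per frame and feed them as a 2D time-frequency "image" into a small 2D CNN. The library choices (librosa, Keras), the fixed frame count, and the layer configuration are assumptions for illustration; the abstract does not specify the authors' exact architecture or hyperparameters.

```python
# Minimal sketch: 40 MFCCs per frame stacked into a 2D array, classified by a small 2D CNN.
# librosa/Keras and all architecture details below are assumptions, not the authors' exact pipeline.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

N_MFCC = 40      # forty cepstral coefficients per frame, as stated in the abstract
N_FRAMES = 128   # assumed fixed number of frames (each clip is padded or truncated)
N_CLASSES = 8    # neutral, calm, happy, sad, angry, scared, disgust, surprised

def extract_mfcc(path, sr=22050):
    """Load a speech clip and return an (N_MFCC, N_FRAMES, 1) MFCC feature map."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)   # shape: (40, n_frames)
    # Pad or truncate along the time axis so every sample has the same width.
    if mfcc.shape[1] < N_FRAMES:
        mfcc = np.pad(mfcc, ((0, 0), (0, N_FRAMES - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :N_FRAMES]
    return mfcc[..., np.newaxis]

def build_model():
    """A small 2D CNN over the MFCC time-frequency representation (assumed layout)."""
    model = models.Sequential([
        layers.Input(shape=(N_MFCC, N_FRAMES, 1)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(N_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage (wav_paths and labels are placeholders for a RAVDESS/SAVEE file list):
# X = np.stack([extract_mfcc(p) for p in wav_paths])
# model = build_model()
# model.fit(X, labels, epochs=50, validation_split=0.2)
```

Treating the MFCC matrix as a single-channel image is what makes 2D convolutions applicable here: the kernels learn local patterns jointly across cepstral coefficients and time, rather than treating each frame independently.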