Nehul Gupta, Vedangi Thakur, Vaishnavi Patil, Tamanna Vishnoi, K. Bhangale
Title: Analysis of Affective Computing for Marathi Corpus using Deep Learning
Published in: 2023 4th International Conference for Emerging Technology (INCET)
Publication date: 2023-05-26
DOI: 10.1109/INCET57972.2023.10170346
URL: https://doi.org/10.1109/INCET57972.2023.10170346
Citations: 0
Abstract
Speech Emotion Recognition (SER) has a wide range of potential uses, including strengthening human-computer interaction in virtual reality and gaming settings, improving the detection and tracking of mental health disorders, and improving the precision of speech-based assistants and chatbots. It faces several challenges: cross-corpus SER, variations in intonation and dialect, and prosodic changes in language due to age, gender, region, religion, and other factors. This paper presents a deep convolutional neural network (DCNN) based SER system for the Marathi language. Our novel Marathi data set consists of 300 recordings from 15 speakers covering the Anger, Happy, Sad and Neutral emotions. The performance of the proposed DCNN is evaluated on this data set in terms of accuracy, precision, recall and F1-score. The proposed scheme achieves an overall accuracy of 0.4750, 0.4076 and 0.3927 on raw data for 5, 10 and 15 speakers respectively, and 0.6652, 0.6361 and 0.5800 after feature extraction for 5, 10 and 15 speakers respectively, showing an improvement over existing state-of-the-art SER methods for Marathi corpora.
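The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how precision, recall and F1-score are averaged across the four emotion classes, so macro averaging is an assumption here. The function name `classification_metrics` and the toy labels are hypothetical and are not taken from the paper or its data set.

```python
from collections import Counter

# The four emotion classes named in the abstract.
EMOTIONS = ["Anger", "Happy", "Sad", "Neutral"]


def classification_metrics(y_true, y_pred, labels=EMOTIONS):
    """Return overall accuracy and macro-averaged precision, recall and F1.

    Macro averaging is an assumption; the abstract does not state which
    averaging scheme was used. All labels are assumed to come from `labels`.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Confusion counts keyed by (true label, predicted label).
    pairs = Counter(zip(y_true, y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        predicted = sum(pairs[(t, c)] for t in labels)  # everything predicted as c
        actual = sum(pairs[(c, p)] for p in labels)     # everything truly c
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not results from the paper.
    toy_true = ["Anger", "Anger", "Happy", "Happy", "Sad", "Sad", "Neutral", "Neutral"]
    toy_pred = ["Anger", "Happy", "Happy", "Happy", "Sad", "Neutral", "Neutral", "Neutral"]
    print(classification_metrics(toy_true, toy_pred))
```

With balanced classes, as in the toy labels, macro-averaged recall equals overall accuracy, which is one reason precision and F1-score are worth reporting alongside accuracy.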