{"title":"Emotion Recognition and Multi-class Classification in Music with MFCC and Machine Learning","authors":"Gilsang Yoo, Sungdae Hong, Hyeocheol Kim","doi":"10.18517/ijaseit.14.3.18671","DOIUrl":null,"url":null,"abstract":"Background music in OTT services significantly enhances narratives and conveys emotions, yet users with hearing impairments might not fully experience this emotional context. This paper illuminates the pivotal role of background music in user engagement on OTT platforms. It introduces a novel system designed to mitigate the challenges the hearing-impaired face in appreciating the emotional nuances of music. This system adeptly identifies the mood of background music and translates it into textual subtitles, making emotional content accessible to all users. The proposed method extracts key audio features, including Mel Frequency Cepstral Coefficients (MFCC), Root Mean Square (RMS), and MEL Spectrograms. It then harnesses the power of leading machine learning algorithms Logistic Regression, Random Forest, AdaBoost, and Support Vector Classification (SVC) to analyze the emotional traits embedded in the music and accurately identify its sentiment. Among these, the Random Forest algorithm, applied to MFCC features, demonstrated exceptional accuracy, reaching 94.8% in our tests. The significance of this technology extends beyond mere feature identification; it promises to revolutionize the accessibility of multimedia content. By automatically generating emotionally resonant subtitles, this system can enrich the viewing experience for all, particularly those with hearing impairments. 
This advancement not only underscores the critical role of music in storytelling and emotional engagement but also highlights the vast potential of machine learning in enhancing the inclusivity and enjoyment of digital entertainment across diverse audiences.","PeriodicalId":14471,"journal":{"name":"International Journal on Advanced Science, Engineering and Information Technology","volume":"15 4","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal on Advanced Science, Engineering and Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18517/ijaseit.14.3.18671","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Agricultural and Biological Sciences","Score":null,"Total":0}
Citations: 0
Abstract
Background music in OTT services significantly enhances narratives and conveys emotions, yet users with hearing impairments may not fully experience this emotional context. This paper highlights the pivotal role of background music in user engagement on OTT platforms and introduces a novel system designed to mitigate the challenges that hearing-impaired viewers face in appreciating the emotional nuances of music. The system identifies the mood of background music and translates it into textual subtitles, making emotional content accessible to all users. The proposed method extracts key audio features, including Mel Frequency Cepstral Coefficients (MFCC), Root Mean Square (RMS) energy, and Mel spectrograms. It then applies established machine learning algorithms (Logistic Regression, Random Forest, AdaBoost, and Support Vector Classification (SVC)) to analyze the emotional traits embedded in the music and identify its sentiment. Among these, the Random Forest algorithm applied to MFCC features demonstrated the highest accuracy, reaching 94.8% in our tests. The significance of this technology extends beyond feature identification; it promises to improve the accessibility of multimedia content. By automatically generating emotionally resonant subtitles, this system can enrich the viewing experience for all users, particularly those with hearing impairments. This advancement not only underscores the critical role of music in storytelling and emotional engagement but also highlights the potential of machine learning to enhance the inclusivity and enjoyment of digital entertainment across diverse audiences.
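The pipeline the abstract describes (MFCC feature vectors fed to a Random Forest classifier, with accuracy evaluated on held-out clips) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the emotion labels are hypothetical, and synthetic Gaussian clusters stand in for real MFCC features, which would normally be extracted from audio with a library such as librosa.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical label set; the paper does not list its emotion classes here.
EMOTIONS = ["happy", "sad", "tense", "calm"]

rng = np.random.default_rng(0)
n_per_class, n_mfcc = 100, 20  # 20 MFCCs per clip is a common choice

# Synthetic stand-in for per-clip MFCC vectors: each emotion class
# clusters around a different mean so the task is learnable.
X = np.vstack([
    rng.normal(loc=i, scale=1.0, size=(n_per_class, n_mfcc))
    for i in range(len(EMOTIONS))
])
y = np.repeat(np.arange(len(EMOTIONS)), n_per_class)

# Hold out a stratified test split, then fit the Random Forest.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)

acc = accuracy_score(y_te, clf.predict(X_te))
print(f"held-out accuracy: {acc:.3f}")
```

With real data, the feature matrix `X` would instead hold MFCCs averaged (or otherwise pooled) over each music clip; the training and evaluation steps are unchanged.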
Journal Introduction:
International Journal on Advanced Science, Engineering and Information Technology (IJASEIT) is an international peer-reviewed journal dedicated to the interchange of high-quality research results in all aspects of science, engineering, and information technology. The journal publishes state-of-the-art papers on fundamental theory, experiments and simulation, as well as applications, with a systematically proposed method, a sufficient review of previous work, an expanded discussion, and a concise conclusion. As part of its commitment to the advancement of science and technology, IJASEIT follows an open access policy that makes published articles freely available online without any subscription. The journal's scope includes (but is not limited to) the following:
- Science: Bioscience & Biotechnology, Chemistry & Food Technology, Environmental, Health Science, Mathematics & Statistics, Applied Physics
- Engineering: Architecture, Chemical & Process, Civil & Structural, Electrical, Electronic & Systems, Geological & Mining Engineering, Mechanical & Materials
- Information Science & Technology: Artificial Intelligence, Computer Science, E-Learning & Multimedia, Information System, Internet & Mobile Computing