Audio-Mood Classification Using Acoustic-Textual Feature Fusion

R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. V
{"title":"基于声-文特征融合的音频-情绪分类","authors":"R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. V","doi":"10.1109/ICMSS53060.2021.9673592","DOIUrl":null,"url":null,"abstract":"Listeners browse songs based on artist or genre, but a significant amount of queries are based on emotions like happy, sad, calm etc. and therefore, automatic music mood classification is gaining importance. People search for songs based on the emotions they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfying results, but part of the semantic information of songs resides exclusively in the lyrics. In this paper, we present a study on the fusion approach of music mood classification. As both audio and lyrical information is complimentary, creating a hybrid model to classify music based on mood provides enhanced accuracy. Where a single song might fall under two different categories based on audio or lyrical information, a hybrid model helps us achieve more accurate results by merging both the information. In this work, we extracted features using librosa from audio, used TF-IDF for text, and experimented with the Bi-LSTM network. The performance evaluation is done on corpus consists of 776 songs. The multimodal approach achieved average precision, recall and F1-score of 0.66, 0.65 and 0.65 respectively.","PeriodicalId":274597,"journal":{"name":"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Audio-Mood Classification Using Acoustic-Textual Feature Fusion\",\"authors\":\"R. Rajan, Joshua Antony, Riya Ann Joseph, Jijohn M. Thomas, Chandr Dhanush H, A. 
V\",\"doi\":\"10.1109/ICMSS53060.2021.9673592\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Listeners browse songs based on artist or genre, but a significant amount of queries are based on emotions like happy, sad, calm etc. and therefore, automatic music mood classification is gaining importance. People search for songs based on the emotions they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfying results, but part of the semantic information of songs resides exclusively in the lyrics. In this paper, we present a study on the fusion approach of music mood classification. As both audio and lyrical information is complimentary, creating a hybrid model to classify music based on mood provides enhanced accuracy. Where a single song might fall under two different categories based on audio or lyrical information, a hybrid model helps us achieve more accurate results by merging both the information. In this work, we extracted features using librosa from audio, used TF-IDF for text, and experimented with the Bi-LSTM network. The performance evaluation is done on corpus consists of 776 songs. 
The multimodal approach achieved average precision, recall and F1-score of 0.66, 0.65 and 0.65 respectively.\",\"PeriodicalId\":274597,\"journal\":{\"name\":\"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMSS53060.2021.9673592\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMSS53060.2021.9673592","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Listeners typically browse songs by artist or genre, but a significant share of queries are based on emotions such as happy, sad, or calm, so automatic music mood classification is gaining importance. People search for songs that match the emotion they are feeling or the emotion they hope to feel. Audio-based techniques can achieve satisfying results, but part of a song's semantic information resides exclusively in its lyrics. In this paper, we present a study of a fusion approach to music mood classification. Because audio and lyrical information are complementary, a hybrid model that classifies music by mood provides enhanced accuracy. Where a single song might fall under two different categories depending on whether audio or lyrical information is used, a hybrid model achieves more accurate results by merging both sources of information. In this work, we extracted audio features using librosa, used TF-IDF for the lyric text, and experimented with a Bi-LSTM network. The performance evaluation is carried out on a corpus of 776 songs. The multimodal approach achieved average precision, recall, and F1-score of 0.66, 0.65, and 0.65 respectively.
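The fusion idea described in the abstract can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: the lyrics and labels are invented toy data, the acoustic features (which the paper extracts with librosa, e.g. MFCCs) are stubbed with random vectors, and a logistic-regression classifier stands in for the Bi-LSTM network, so that only the early-fusion step itself — concatenating textual TF-IDF features with acoustic features per song — is shown.

```python
# Sketch of acoustic-textual early fusion for mood classification.
# Hypothetical data throughout; acoustic features are random stand-ins
# for librosa features, and LogisticRegression replaces the Bi-LSTM.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy lyric snippets and mood labels (invented for illustration).
lyrics = [
    "sunshine dancing smile together",
    "tears alone rain goodbye",
    "quiet river gentle evening",
    "jump loud party tonight",
]
moods = ["happy", "sad", "calm", "happy"]

# Textual features: TF-IDF over the lyric text.
tfidf = TfidfVectorizer()
text_feats = tfidf.fit_transform(lyrics).toarray()   # shape (4, 16)

# Acoustic features: stubbed as random 20-dim vectors; in the paper
# these would come from librosa (e.g. MFCCs, spectral features).
audio_feats = rng.normal(size=(len(lyrics), 20))

# Early fusion: concatenate the two feature blocks per song.
fused = np.hstack([text_feats, audio_feats])         # shape (4, 36)

clf = LogisticRegression(max_iter=1000).fit(fused, moods)
print(fused.shape)
print(clf.predict(fused[:1]))
```

Because the fused vector carries both modalities, a song whose audio alone suggests one mood and whose lyrics alone suggest another is classified from the combined evidence, which is the motivation for the hybrid model above.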