Musical Genre Classification Using Advanced Audio Analysis and Deep Learning Techniques

Mumtahina Ahmed;Uland Rozario;Md Mohshin Kabir;Zeyar Aung;Jungpil Shin;M. F. Mridha
DOI: 10.1109/OJCS.2024.3431229
Journal: IEEE Open Journal of the Computer Society, vol. 5, pp. 457-467
Published: 2024-07-19 (journal article)
Article: https://ieeexplore.ieee.org/document/10605044/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10605044
Citations: 0

Abstract

Classifying music genres has become a significant problem in the era of seamless music streaming platforms and abundant content creation. Accurate music genre classification is a fundamental task with applications in music recommendation, content organization, and understanding musical trends. This study presents a comprehensive approach to music genre classification using deep learning and advanced audio analysis techniques. It examines the task of genre categorization by evaluating Convolutional Neural Network (CNN), Feedforward Neural Network (FNN), Support Vector Machine (SVM), k-Nearest Neighbors (kNN), and Long Short-Term Memory (LSTM) models, primarily on the GTZAN dataset. After feature extraction from pre-processed audio data, each model's output is cross-validated and its performance evaluated. The modified CNN model outperforms the conventional neural network models owing to its capacity to capture complex spectrogram patterns. These results highlight how deep learning algorithms can improve music genre classification systems, with implications for a range of music-related applications and user interfaces. To date, an accuracy of 92.7% has been achieved on the GTZAN dataset and 91.6% on the ISMIR2004 Ballroom dataset.
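The pipeline the abstract describes — pre-process audio, extract spectrogram-based features, then classify — can be sketched on toy data. The snippet below is a minimal NumPy stand-in, not the paper's implementation: the spectrogram is a plain short-time FFT rather than the advanced features used in the study, the classifier is the kNN baseline rather than the modified CNN, and the synthetic "genre" clips (pure tones at different frequencies) are purely illustrative assumptions in place of GTZAN audio.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via short-time FFT with a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (frames, freq bins)

# Toy stand-ins for genre clips: noisy pure tones, one frequency per "genre".
rng = np.random.default_rng(0)
sr = 8000                       # assumed sample rate for this sketch
t = np.arange(sr) / sr          # one second of audio

def make_clip(freq):
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(sr)

# Three training clips per class: class 0 ~ 220 Hz, class 1 ~ 880 Hz.
train = [(make_clip(f), label)
         for label, f in enumerate([220, 880])
         for _ in range(3)]

def features(clip):
    # Time-averaged spectrum: collapse the spectrogram to one vector.
    return spectrogram(clip).mean(axis=0)

def knn_predict(clip, train, k=3):
    """Majority vote among the k nearest training clips in feature space."""
    x = features(clip)
    dists = sorted((np.linalg.norm(x - features(c)), y) for c, y in train)
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)

print(knn_predict(make_clip(225), train))  # tone near 220 Hz -> class 0
print(knn_predict(make_clip(870), train))  # tone near 880 Hz -> class 1
```

In the study's deeper models, the same spectrogram representation is fed to a CNN instead of being averaged over time, which is what lets the network exploit the time-frequency patterns the abstract refers to.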