三种改进的深度学习架构在音乐体裁分类中的比较分析

Quazi Ghulam Rafi, Mohammed Noman, Sadia Zahin Prodhan, S. Alam, Dipannyta Nandi
{"title":"三种改进的深度学习架构在音乐体裁分类中的比较分析","authors":"Quazi Ghulam Rafi, Mohammed Noman, Sadia Zahin Prodhan, S. Alam, Dipannyta Nandi","doi":"10.5815/IJITCS.2021.02.01","DOIUrl":null,"url":null,"abstract":"Among the many music information retrieval (MIR) tasks, music genre classification is noteworthy. The categorization of music into different groups that came to existence through a complex interplay of cultures, musicians, and various market forces to characterize similarities between compositions and organize collections is known as a music genre. The past researchers extracted various hand-crafted features and developed classifiers based on them. But the major drawback of this approach was the requirement of field expertise. However, in recent times researchers, because of the remarkable classification accuracy of deep learning models, have used similar models for MIR tasks. Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and the hybrid model, Convolutional Recurrent Neural Network (CRNN), are such prominently used deep learning models for music genre classification along with other MIR tasks and various architectures of these models have achieved state-of-the-art results. In this study, we review and discuss three such architectures of deep learning models, already used for music genre classification of music tracks of length of 29-30 seconds. In particular, we analyze improved CNN, RNN, and CRNN architectures named Bottom-up Broadcast Neural Network (BBNN) [1], Independent Recurrent Neural Network (IndRNN) [2] and CRNN in Time and Frequency dimensions (CRNNTF) [3] respectively, almost all of the architectures achieved the highest classification accuracy among the variants of their base deep learning model. Hence, this study holds a comparative analysis of the three most impressive architectural variants of the main deep learning models that are prominently used to classify music genre and presents the three architecture, hence the models (CNN, RNN, and CRNN) in one study. We also propose two ways that can improve the performances of the RNN (IndRNN) and CRNN (CRNN-TF) architectures.","PeriodicalId":130361,"journal":{"name":"International Journal of Information Technology and Computer Science","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Comparative Analysis of Three Improved Deep Learning Architectures for Music Genre Classification\",\"authors\":\"Quazi Ghulam Rafi, Mohammed Noman, Sadia Zahin Prodhan, S. Alam, Dipannyta Nandi\",\"doi\":\"10.5815/IJITCS.2021.02.01\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Among the many music information retrieval (MIR) tasks, music genre classification is noteworthy. The categorization of music into different groups that came to existence through a complex interplay of cultures, musicians, and various market forces to characterize similarities between compositions and organize collections is known as a music genre. The past researchers extracted various hand-crafted features and developed classifiers based on them. But the major drawback of this approach was the requirement of field expertise. However, in recent times researchers, because of the remarkable classification accuracy of deep learning models, have used similar models for MIR tasks. Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and the hybrid model, Convolutional Recurrent Neural Network (CRNN), are such prominently used deep learning models for music genre classification along with other MIR tasks and various architectures of these models have achieved state-of-the-art results. In this study, we review and discuss three such architectures of deep learning models, already used for music genre classification of music tracks of length of 29-30 seconds. In particular, we analyze improved CNN, RNN, and CRNN architectures named Bottom-up Broadcast Neural Network (BBNN) [1], Independent Recurrent Neural Network (IndRNN) [2] and CRNN in Time and Frequency dimensions (CRNNTF) [3] respectively, almost all of the architectures achieved the highest classification accuracy among the variants of their base deep learning model. Hence, this study holds a comparative analysis of the three most impressive architectural variants of the main deep learning models that are prominently used to classify music genre and presents the three architecture, hence the models (CNN, RNN, and CRNN) in one study. We also propose two ways that can improve the performances of the RNN (IndRNN) and CRNN (CRNN-TF) architectures.\",\"PeriodicalId\":130361,\"journal\":{\"name\":\"International Journal of Information Technology and Computer Science\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Information Technology and Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5815/IJITCS.2021.02.01\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Information Technology and Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5815/IJITCS.2021.02.01","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

摘要

在众多的音乐信息检索(MIR)任务中,音乐类型分类是值得注意的。通过文化、音乐家和各种市场力量的复杂相互作用,将音乐分类为不同的群体,以表征作品之间的相似性并组织收藏,这被称为音乐流派。过去的研究人员提取各种手工制作的特征,并在此基础上开发分类器。但是这种方法的主要缺点是需要现场专家。然而,近年来,由于深度学习模型具有显著的分类准确性,研究人员已经将类似的模型用于MIR任务。卷积神经网络(CNN)、循环神经网络(RNN)和混合模型卷积循环神经网络(CRNN)是用于音乐类型分类的深度学习模型,以及其他MIR任务,这些模型的各种架构已经取得了最先进的结果。在本研究中,我们回顾并讨论了三种深度学习模型的架构,这些模型已经用于长度为29-30秒的音乐曲目的音乐类型分类。特别是,我们分别分析了自底向上广播神经网络(BBNN)[1]、独立递归神经网络(IndRNN)[2]和时间和频率维度的CRNN (CRNNTF)[3]等改进的CNN、RNN和CRNN架构,几乎所有的架构在其基础深度学习模型的变体中都达到了最高的分类精度。因此,本研究对主要深度学习模型的三种最令人印象深刻的架构变体进行了比较分析,这些模型主要用于对音乐类型进行分类,并呈现了三种架构,因此在一项研究中提出了模型(CNN, RNN和CRNN)。我们还提出了两种可以提高RNN (IndRNN)和CRNN (CRNN- tf)体系性能的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Comparative Analysis of Three Improved Deep Learning Architectures for Music Genre Classification
Among the many music information retrieval (MIR) tasks, music genre classification is noteworthy. The categorization of music into different groups that came to existence through a complex interplay of cultures, musicians, and various market forces to characterize similarities between compositions and organize collections is known as a music genre. The past researchers extracted various hand-crafted features and developed classifiers based on them. But the major drawback of this approach was the requirement of field expertise. However, in recent times researchers, because of the remarkable classification accuracy of deep learning models, have used similar models for MIR tasks. Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and the hybrid model, Convolutional Recurrent Neural Network (CRNN), are such prominently used deep learning models for music genre classification along with other MIR tasks and various architectures of these models have achieved state-of-the-art results. In this study, we review and discuss three such architectures of deep learning models, already used for music genre classification of music tracks of length of 29-30 seconds. In particular, we analyze improved CNN, RNN, and CRNN architectures named Bottom-up Broadcast Neural Network (BBNN) [1], Independent Recurrent Neural Network (IndRNN) [2] and CRNN in Time and Frequency dimensions (CRNNTF) [3] respectively, almost all of the architectures achieved the highest classification accuracy among the variants of their base deep learning model. Hence, this study holds a comparative analysis of the three most impressive architectural variants of the main deep learning models that are prominently used to classify music genre and presents the three architecture, hence the models (CNN, RNN, and CRNN) in one study. We also propose two ways that can improve the performances of the RNN (IndRNN) and CRNN (CRNN-TF) architectures.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Enhancing Healthcare Provision in Conflict Zones: Queuing System Models for Mobile and Flexible Medical Care Units with a Limited Number of Treatment Stations A Machine Learning Based Intelligent Diabetic and Hypertensive Patient Prediction Scheme and A Mobile Application for Patients Assistance Mimicking Nature: Analysis of Dragonfly Pursuit Strategies Using LSTM and Kalman Filter Securing the Internet of Things: Evaluating Machine Learning Algorithms for Detecting IoT Cyberattacks Using CIC-IoT2023 Dataset Analyzing Test Performance of BSIT Students and Question Quality: A Study on Item Difficulty Index and Item Discrimination Index for Test Question Improvement
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1