Music time signature detection using ResNet18

IF 1.9 3区计算机科学 Q2 ACOUSTICS Eurasip Journal on Audio Speech and Music Processing Pub Date : 2024-06-13 DOI:10.1186/s13636-024-00346-6

Jeremiah Abimbola, Daniel Kostrzewa, Pawel Kasprowski

{"title":"Music time signature detection using ResNet18","authors":"Jeremiah Abimbola, Daniel Kostrzewa, Pawel Kasprowski","doi":"10.1186/s13636-024-00346-6","DOIUrl":null,"url":null,"abstract":"Time signature detection is a fundamental task in music information retrieval, aiding in music organization. In recent years, the demand for robust and efficient methods in music analysis has amplified, underscoring the significance of advancements in time signature detection. In this study, we explored the effectiveness of residual networks for time signature detection. Additionally, we compared the performance of the residual network (ResNet18) to already existing models such as audio similarity matrix (ASM) and beat similarity matrix (BSM). We also juxtaposed with traditional algorithms such as support vector machine (SVM), random forest, K-nearest neighbor (KNN), naive Bayes, and that of deep learning models, such as convolutional neural network (CNN) and convolutional recurrent neural network (CRNN). The evaluation is conducted using Mel-frequency cepstral coefficients (MFCCs) as feature representations on the Meter2800 dataset. Our results indicate that ResNet18 outperforms all other models thereby showing the potential of deep learning models for accurate time signature detection.","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"61 1","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Eurasip Journal on Audio Speech and Music Processing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1186/s13636-024-00346-6","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ACOUSTICS","Score":null,"Total":0}

引用次数: 0

Abstract

Time signature detection is a fundamental task in music information retrieval, aiding in music organization. In recent years, the demand for robust and efficient methods in music analysis has amplified, underscoring the significance of advancements in time signature detection. In this study, we explored the effectiveness of residual networks for time signature detection. Additionally, we compared the performance of the residual network (ResNet18) to already existing models such as audio similarity matrix (ASM) and beat similarity matrix (BSM). We also juxtaposed with traditional algorithms such as support vector machine (SVM), random forest, K-nearest neighbor (KNN), naive Bayes, and that of deep learning models, such as convolutional neural network (CNN) and convolutional recurrent neural network (CRNN). The evaluation is conducted using Mel-frequency cepstral coefficients (MFCCs) as feature representations on the Meter2800 dataset. Our results indicate that ResNet18 outperforms all other models thereby showing the potential of deep learning models for accurate time signature detection.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

使用 ResNet 检测音乐时间特征18

时间特征检测是音乐信息检索的一项基本任务，有助于音乐的组织。近年来，音乐分析对稳健高效方法的需求日益增长，这凸显了时间特征检测技术进步的重要意义。在这项研究中，我们探讨了残差网络在时间特征检测中的有效性。此外，我们还将残差网络（ResNet18）的性能与音频相似性矩阵（ASM）和节拍相似性矩阵（BSM）等现有模型进行了比较。我们还将其与支持向量机 (SVM)、随机森林、K-近邻 (KNN)、天真贝叶斯等传统算法以及卷积神经网络 (CNN) 和卷积递归神经网络 (CRNN) 等深度学习模型进行了比较。评估是在 Meter2800 数据集上使用 Mel-frequency cepstral coefficients (MFCC) 作为特征表示进行的。结果表明，ResNet18 优于所有其他模型，从而显示了深度学习模型在准确检测时间特征方面的潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Eurasip Journal on Audio Speech and Music Processing ACOUSTICS-ENGINEERING, ELECTRICAL & ELECTRONIC

CiteScore

4.10

自引率

4.20%

发文量

审稿时长

12 months

期刊介绍： The aim of “EURASIP Journal on Audio, Speech, and Music Processing” is to bring together researchers, scientists and engineers working on the theory and applications of the processing of various audio signals, with a specific focus on speech and music. EURASIP Journal on Audio, Speech, and Music Processing will be an interdisciplinary journal for the dissemination of all basic and applied aspects of speech communication and audio processes.

期刊最新文献

Diffraction perception in L-shaped rooms using virtual reality. Singing to speech conversion with generative flow. Robust and early howling detection based on a sparsity measure. Compression of room impulse responses for compact storage and fast low-latency convolution Guest editorial: AI for computational audition—sound and music processing