Deep learning for audio signal-based tempo classification scenarios

IAES International Journal of Artificial Intelligence (IJ-AI) | Pub Date: 2024-06-01 | DOI: 10.11591/ijai.v13.i2.pp1687-1701

Muljono Muljono, Pulung Nurtantio Andono, Sari Ayu Wulandari, Harun Al Azies, Muhammad Naufal


Citations: 0

Abstract

This article explains how to determine the tempo of the kendhang, a traditional Indonesian melodic instrument. Technological research on gamelan instruments has rarely been carried out so far, and the novelty of this work lies in recognizing kendhang tempo types from the sounds produced, with the aim of building an automatic system that can recognize the kendhang tempo during a gamelan performance. The experiments classify kendhang tempo into three categories (slow, medium, and fast) using one of two proposed scenario models: mel frequency cepstral coefficients (MFCC) with a convolutional neural network (CNN) in the first scenario, and a mel spectrogram with a CNN in the second. The data set consists of original kendhang audio that was captured in real time and later enhanced. Based on the experimental results, the model 1 scenario, which uses MFCC for feature extraction and a CNN for classification, is the best scenario in this research. Compared with the other proposed modeling scenario, model 1 reaches an average accuracy level of 97% and a gain value of 96.67%, making it a reliable aid for accurate kendhang tempo recognition.
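The two feature types named in the abstract are closely related, and a small sketch may make the difference concrete: a log-mel spectrogram (scenario 2) is obtained by passing each frame's power spectrum through mel-spaced triangular filters, and MFCCs (scenario 1) are a discrete cosine transform of those log-mel values. The abstract does not give the authors' settings, so the frame length, hop size, number of mel bands, number of coefficients, and the synthetic test tone below are all illustrative assumptions, and the CNN classifier is omitted. The code uses only the Python standard library to stay dependency-free; it is not the authors' implementation, and an audio library would normally handle this step.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2 / n
            for k in range(n // 2 + 1)]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    return [[max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
             for f in bin_hz]
            for lo, mid, hi in zip(edges, edges[1:], edges[2:])]


def log_mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Scenario 2 style input: one log-mel vector per frame (assumed settings)."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = [signal[s:s + n_fft]
              for s in range(0, len(signal) - n_fft + 1, hop)]
    spec = []
    for frame in frames:
        power = power_spectrum(frame)
        spec.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                     for row in bank])
    return spec


def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """Scenario 1 style input: DCT-II of each log-mel vector, first n_mfcc terms."""
    n_mels = len(log_mel[0])
    return [[sum(v * math.cos(math.pi * k * (m + 0.5) / n_mels)
                 for m, v in enumerate(frame))
             for k in range(n_mfcc)]
            for frame in log_mel]


# Synthetic stand-in for a recording: a 440 Hz tone, 1024 samples at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(1024)]

mel_spec = log_mel_spectrogram(tone, SAMPLE_RATE)  # frames x mel bands
mfcc = mfcc_from_log_mel(mel_spec)                 # frames x coefficients
print(len(mel_spec), len(mel_spec[0]), len(mfcc[0]))  # prints: 7 20 13
```

Either feature matrix would then be fed to a classifier that outputs one of the three tempo classes (slow, medium, fast). The DCT step compresses each 20-band frame into 13 decorrelated coefficients, which is the main practical difference between the two scenarios' inputs.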
