Voting-Based Music Genre Classification Using Melspectogram and Convolutional Neural Network
S. Sugianto, S. Suyanto
2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2019-12-01
DOI: 10.1109/ISRITI48646.2019.9034644
A music genre is a categorical label that people assign to describe music. The volume of digital music now available makes manual classification costly in both effort and time, so an automatic system capable of classifying musical genres is needed. Most existing systems are built on Mel Frequency Cepstral Coefficients (MFCC), but they give low accuracy. This paper proposes a new system that combines a mel spectrogram with a Convolutional Neural Network (CNN) and a voting scheme. The mel spectrogram provides a better representation than MFCC because it retains information about the music such as frequency, time, and amplitude. It serves as the input for training the CNN, which learns patterns characteristic of each musical genre. Evaluation on the GTZAN dataset shows that the proposed system is capable of predicting music genres: the voting scheme reaches an accuracy of 71.87%, compared with 63.49% for the commonly used single scheme.
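The abstract does not say how the votes are collected, so the following is only a minimal sketch of one plausible reading of the voting scheme: a track is cut into fixed-length segments, each segment is classified on its own, and the most frequent label is taken as the genre of the whole track. The names `split_into_segments`, `majority_vote`, `classify_track`, and `toy_classifier` are hypothetical, and `toy_classifier` is a stand-in for a trained CNN operating on mel spectrogram segments, not the authors' model.

```python
from collections import Counter
from typing import Callable, List, Sequence


def split_into_segments(samples: Sequence[float], segment_len: int) -> List[Sequence[float]]:
    """Cut a track into consecutive, non-overlapping segments.

    Assumption: a trailing piece shorter than segment_len is dropped.
    """
    return [
        samples[i:i + segment_len]
        for i in range(0, len(samples) - segment_len + 1, segment_len)
    ]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one segment prediction")
    return Counter(labels).most_common(1)[0][0]


def classify_track(
    samples: Sequence[float],
    segment_len: int,
    classify_segment: Callable[[Sequence[float]], str],
) -> str:
    """Classify every segment independently, then vote on the track label."""
    segments = split_into_segments(samples, segment_len)
    predictions = [classify_segment(segment) for segment in segments]
    return majority_vote(predictions)


def toy_classifier(segment: Sequence[float]) -> str:
    """Hypothetical stand-in for a trained CNN: thresholds the mean value."""
    return "rock" if sum(segment) / len(segment) > 0.5 else "jazz"


# Three segments of four samples each: predicted rock, jazz, rock.
track = [0.9] * 4 + [0.1] * 4 + [0.8] * 4
print(classify_track(track, segment_len=4, classify_segment=toy_classifier))
```

Under this reading, the "single scheme" corresponds to one prediction per track, whereas voting lets a few misclassified segments be outvoted by the rest. That is one possible explanation for the accuracy gap reported above, not something the abstract states.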