{"title":"结合Resnet和空间金字塔池进行乐器识别","authors":"Christine Dewi, Rung-Ching Chen","doi":"10.2478/cait-2022-0007","DOIUrl":null,"url":null,"abstract":"Abstract Identifying similar objects is one of the most challenging tasks in computer vision image recognition. The following musical instruments will be recognized in this study: French horn, harp, recorder, bassoon, cello, clarinet, erhu, guitar saxophone, trumpet, and violin. Numerous musical instruments are identical in size, form, and sound. Further, our works combine Resnet 50 with Spatial Pyramid Pooling (SPP) to identify musical instruments that are similar to one another. Next, the Resnet 50 and Resnet 50 SPP model evaluation performance includes the Floating-Point Operations (FLOPS), detection time, mAP, and IoU. Our work can increase the detection performance of musical instruments similar to one another. The method we propose, Resnet 50 SPP, shows the highest average accuracy of 84.64% compared to the results of previous studies.","PeriodicalId":45562,"journal":{"name":"Cybernetics and Information Technologies","volume":null,"pages":null},"PeriodicalIF":1.2000,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Combination of Resnet and Spatial Pyramid Pooling for Musical Instrument Identification\",\"authors\":\"Christine Dewi, Rung-Ching Chen\",\"doi\":\"10.2478/cait-2022-0007\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract Identifying similar objects is one of the most challenging tasks in computer vision image recognition. The following musical instruments will be recognized in this study: French horn, harp, recorder, bassoon, cello, clarinet, erhu, guitar saxophone, trumpet, and violin. Numerous musical instruments are identical in size, form, and sound. Further, our works combine Resnet 50 with Spatial Pyramid Pooling (SPP) to identify musical instruments that are similar to one another. Next, the Resnet 50 and Resnet 50 SPP model evaluation performance includes the Floating-Point Operations (FLOPS), detection time, mAP, and IoU. Our work can increase the detection performance of musical instruments similar to one another. The method we propose, Resnet 50 SPP, shows the highest average accuracy of 84.64% compared to the results of previous studies.\",\"PeriodicalId\":45562,\"journal\":{\"name\":\"Cybernetics and Information Technologies\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.2000,\"publicationDate\":\"2022-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cybernetics and Information Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2478/cait-2022-0007\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cybernetics and Information Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/cait-2022-0007","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Combination of Resnet and Spatial Pyramid Pooling for Musical Instrument Identification
Abstract Identifying similar objects is one of the most challenging tasks in computer vision image recognition. The following musical instruments will be recognized in this study: French horn, harp, recorder, bassoon, cello, clarinet, erhu, guitar saxophone, trumpet, and violin. Numerous musical instruments are identical in size, form, and sound. Further, our works combine Resnet 50 with Spatial Pyramid Pooling (SPP) to identify musical instruments that are similar to one another. Next, the Resnet 50 and Resnet 50 SPP model evaluation performance includes the Floating-Point Operations (FLOPS), detection time, mAP, and IoU. Our work can increase the detection performance of musical instruments similar to one another. The method we propose, Resnet 50 SPP, shows the highest average accuracy of 84.64% compared to the results of previous studies.