Authors: Satoru Fukayama, Masataka Goto
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016-03-20
DOI: https://doi.org/10.1109/ICASSP.2016.7471639
Music emotion recognition with adaptive aggregation of Gaussian process regressors
This paper describes a novel method for estimating the emotions elicited by a piece of music from its acoustic signal. Previous research in this field has centered on finding effective acoustic features and regression methods that relate those features to emotions. The state-of-the-art method is based on multi-stage regression, which aggregates the outputs of several regressors fitted to the training data. After training, however, the aggregation is fixed and cannot adapt to acoustic signals with different musical properties. We propose a method that adapts the aggregation to each new acoustic input. Because the emotions elicited by a new input are not known beforehand, the aggregation weights cannot be tuned against ground truth for that input; instead, we adapt them by exploiting the deviation observed in the training data, using Gaussian process regression. An experiment comparing different aggregation approaches confirmed that our adaptive aggregation improves recognition accuracy.
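The contrast between fixed and input-adaptive aggregation can be illustrated with a minimal sketch. The abstract does not state the weighting rule, so the inverse-variance weighting below is an assumption swapped in for illustration, not the authors' formulation. The function names, the three regressors, and all numbers are hypothetical. Each regressor is assumed to return a predictive mean and variance for the new input, as a Gaussian process regressor does.

```python
from typing import Sequence


def fixed_aggregate(means: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average whose weights are frozen after training."""
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, means)) / total


def adaptive_aggregate(means: Sequence[float], variances: Sequence[float],
                       eps: float = 1e-9) -> float:
    """Weighted average whose weights depend on the current input.

    Assumption (not from the paper): each regressor is weighted by the
    inverse of its predictive variance, so regressors that are less
    certain about this particular input contribute less.
    """
    precisions = [1.0 / (v + eps) for v in variances]
    total = sum(precisions)
    return sum(p * m for p, m in zip(precisions, means)) / total


# Hypothetical outputs of three regressors for one new music clip.
means = [0.2, 0.6, 0.4]          # made-up predicted emotion values
variances = [0.01, 0.09, 0.04]   # made-up predictive variances

fixed = fixed_aggregate(means, [1.0, 1.0, 1.0])
adaptive = adaptive_aggregate(means, variances)
print(f"fixed: {fixed:.3f}, adaptive: {adaptive:.3f}")
```

With these made-up numbers the adaptive estimate moves toward the first regressor, which reports the smallest variance for this input, whereas the fixed average treats all three equally regardless of the input.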