{"title":"模拟人工耳蜗调制谱声情感转换的可行性","authors":"Zhi Zhu, Ryota Miyauchi, Yukiko Araki, M. Unoki","doi":"10.23919/EUSIPCO.2017.8081526","DOIUrl":null,"url":null,"abstract":"Cochlear implant (CI) listeners were found to have great difficulty with vocal emotion recognition because of the limited spectral cues provided by CI devices. Previous studies have shown that the modulation spectral features of temporal envelopes may be important cues for vocal emotion recognition of noise-vocoded speech (NVS) as simulated CIs. In this paper, the feasibility of vocal emotion conversion on a modulation spectrogram for simulated CIs for correctly recognizing vocal emotion is confirmed. A method based on a linear prediction scheme is proposed to modify the modulation spectrogram and its features of neutral speech to match that of emotional speech. The logic of this approach is that if vocal emotion perception of NVS is based on the modulation spectral features, NVS with similar modulation spectral features of emotional speech will be recognized as the same emotion. As a result, it was found that the modulation spectrogram of neutral speech can be successfully converted to that of emotional speech. The results of the evaluation experiment showed the feasibility of vocal emotion conversion on the modulation spectrogram for simulated CIs. The vocal emotion enhancement on the modulation spectrogram was also further discussed.","PeriodicalId":346811,"journal":{"name":"2017 25th European Signal Processing Conference (EUSIPCO)","volume":"24 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants\",\"authors\":\"Zhi Zhu, Ryota Miyauchi, Yukiko Araki, M. Unoki\",\"doi\":\"10.23919/EUSIPCO.2017.8081526\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cochlear implant (CI) listeners were found to have great difficulty with vocal emotion recognition because of the limited spectral cues provided by CI devices. Previous studies have shown that the modulation spectral features of temporal envelopes may be important cues for vocal emotion recognition of noise-vocoded speech (NVS) as simulated CIs. In this paper, the feasibility of vocal emotion conversion on a modulation spectrogram for simulated CIs for correctly recognizing vocal emotion is confirmed. A method based on a linear prediction scheme is proposed to modify the modulation spectrogram and its features of neutral speech to match that of emotional speech. The logic of this approach is that if vocal emotion perception of NVS is based on the modulation spectral features, NVS with similar modulation spectral features of emotional speech will be recognized as the same emotion. As a result, it was found that the modulation spectrogram of neutral speech can be successfully converted to that of emotional speech. The results of the evaluation experiment showed the feasibility of vocal emotion conversion on the modulation spectrogram for simulated CIs. The vocal emotion enhancement on the modulation spectrogram was also further discussed.\",\"PeriodicalId\":346811,\"journal\":{\"name\":\"2017 25th European Signal Processing Conference (EUSIPCO)\",\"volume\":\"24 1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 25th European Signal Processing Conference (EUSIPCO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/EUSIPCO.2017.8081526\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 25th European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/EUSIPCO.2017.8081526","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Feasibility of vocal emotion conversion on modulation spectrogram for simulated cochlear implants
Cochlear implant (CI) listeners were found to have great difficulty with vocal emotion recognition because of the limited spectral cues provided by CI devices. Previous studies have shown that the modulation spectral features of temporal envelopes may be important cues for vocal emotion recognition of noise-vocoded speech (NVS) as simulated CIs. In this paper, the feasibility of vocal emotion conversion on a modulation spectrogram for simulated CIs for correctly recognizing vocal emotion is confirmed. A method based on a linear prediction scheme is proposed to modify the modulation spectrogram and its features of neutral speech to match that of emotional speech. The logic of this approach is that if vocal emotion perception of NVS is based on the modulation spectral features, NVS with similar modulation spectral features of emotional speech will be recognized as the same emotion. As a result, it was found that the modulation spectrogram of neutral speech can be successfully converted to that of emotional speech. The results of the evaluation experiment showed the feasibility of vocal emotion conversion on the modulation spectrogram for simulated CIs. The vocal emotion enhancement on the modulation spectrogram was also further discussed.