D. Shitov, E. Pirogova, M. Lech. 2018 12th International Conference on Signal Processing and Communication Systems (ICSPCS), 2018-12-01. DOI: 10.1109/ICSPCS.2018.8631770 (https://doi.org/10.1109/ICSPCS.2018.8631770)
Assessment of Features for Neurocomputational Modeling of Speech Acquisition
This study aims to determine the most suitable speech representation (features) for neurocomputational modeling of the speech acquisition process. The majority of existing techniques use mel-frequency cepstral coefficients (MFCCs). Recent advances in deep learning have created an opportunity to use deep network parameters to represent speech signals. In this study, two experiments were conducted to obtain both qualitative and quantitative assessments of the modeling suitability of four types of features: formants, MFCCs, MFCCs-PCA, and neural network features. The results show that features extracted from a modified Convolutional Neural Network with a Long Short-Term Memory layer (CNN-LSTM) clearly outperformed all other feature types.
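For readers unfamiliar with the mel scale underlying MFCCs, the following is a minimal sketch of the commonly used Hz-to-mel conversion and of filter centre frequencies spaced evenly on that scale. The abstract does not state which mel formula, filter count, or frequency range the authors used, so the function names, the 26 filters, and the 0 to 8000 Hz range below are illustrative assumptions only.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mels using the widely used 2595 * log10(1 + f/700) formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> List[float]:
    """Centre frequencies (Hz) of n_filters bands spaced evenly on the mel scale.

    Hypothetical helper for illustration; not taken from the paper.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Assumed settings: 26 filters between 0 Hz and 8000 Hz.
centers = mel_center_frequencies(n_filters=26, f_min=0.0, f_max=8000.0)
```

Because the spacing is uniform in mels, the centre frequencies in Hz lie close together at low frequencies and spread out at high frequencies, which is the perceptual weighting that MFCC-based features inherit.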