{"title":"音素识别的时间扭曲神经网络","authors":"K. Aikawa","doi":"10.1109/IJCNN.1991.170701","DOIUrl":null,"url":null,"abstract":"The author investigates a feedforward neural network that can accept phonemes with an arbitrary duration coping with nonlinear time warping. The time-warping neural network is characterized by the time-warping functions embedded between the input layer and the first hidden layer in the network. The input layer accesses three different time points. The accessing points are determined by the time-warping functions. The input spectrum sequence itself is not warped but the accessing-point sequence is warped. The advantage of this network architecture is that the input layer can access the original spectrum sequence. The proposed network demonstrated higher phoneme recognition accuracy than the baseline recognizer based on conventional feedforward neural networks. The recognition accuracy was even higher than that achieved with discrete hidden Markov models.<<ETX>>","PeriodicalId":211135,"journal":{"name":"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1991-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Time-warping neural network for phoneme recognition\",\"authors\":\"K. Aikawa\",\"doi\":\"10.1109/IJCNN.1991.170701\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The author investigates a feedforward neural network that can accept phonemes with an arbitrary duration coping with nonlinear time warping. The time-warping neural network is characterized by the time-warping functions embedded between the input layer and the first hidden layer in the network. The input layer accesses three different time points. The accessing points are determined by the time-warping functions. The input spectrum sequence itself is not warped but the accessing-point sequence is warped. The advantage of this network architecture is that the input layer can access the original spectrum sequence. The proposed network demonstrated higher phoneme recognition accuracy than the baseline recognizer based on conventional feedforward neural networks. The recognition accuracy was even higher than that achieved with discrete hidden Markov models.<<ETX>>\",\"PeriodicalId\":211135,\"journal\":{\"name\":\"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1991-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IJCNN.1991.170701\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.1991.170701","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Time-warping neural network for phoneme recognition
The author investigates a feedforward neural network that can accept phonemes with an arbitrary duration coping with nonlinear time warping. The time-warping neural network is characterized by the time-warping functions embedded between the input layer and the first hidden layer in the network. The input layer accesses three different time points. The accessing points are determined by the time-warping functions. The input spectrum sequence itself is not warped but the accessing-point sequence is warped. The advantage of this network architecture is that the input layer can access the original spectrum sequence. The proposed network demonstrated higher phoneme recognition accuracy than the baseline recognizer based on conventional feedforward neural networks. The recognition accuracy was even higher than that achieved with discrete hidden Markov models.<>