{"title":"Time-warping neural network for phoneme recognition","authors":"K. Aikawa","doi":"10.1109/IJCNN.1991.170701","DOIUrl":null,"url":null,"abstract":"The author investigates a feedforward neural network that can accept phonemes with an arbitrary duration coping with nonlinear time warping. The time-warping neural network is characterized by the time-warping functions embedded between the input layer and the first hidden layer in the network. The input layer accesses three different time points. The accessing points are determined by the time-warping functions. The input spectrum sequence itself is not warped but the accessing-point sequence is warped. The advantage of this network architecture is that the input layer can access the original spectrum sequence. The proposed network demonstrated higher phoneme recognition accuracy than the baseline recognizer based on conventional feedforward neural networks. The recognition accuracy was even higher than that achieved with discrete hidden Markov models.<<ETX>>","PeriodicalId":211135,"journal":{"name":"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1991-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"[Proceedings] 1991 IEEE International Joint Conference on Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.1991.170701","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
The author investigates a feedforward neural network that accepts phonemes of arbitrary duration and copes with nonlinear time warping. The time-warping neural network is characterized by time-warping functions embedded between the input layer and the first hidden layer. The input layer accesses three different time points, which are determined by the time-warping functions. The input spectrum sequence itself is not warped; only the sequence of accessing points is warped. The advantage of this architecture is that the input layer can access the original spectrum sequence. The proposed network achieved higher phoneme recognition accuracy than a baseline recognizer based on conventional feedforward neural networks, and its accuracy was even higher than that achieved with discrete hidden Markov models.
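The idea of warping the accessing points rather than the spectrum sequence can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not give the form of the time-warping functions, so the power-law `warp` function, its `alpha` parameter, the normalized positions `(0.25, 0.5, 0.75)`, and all function names below are hypothetical choices made only for illustration.

```python
import numpy as np

def warp(u, alpha):
    """Hypothetical monotone warping of normalized time u in [0, 1].

    Assumption for illustration only; alpha = 1.0 gives the identity
    (no warping). The paper's actual warping functions are not stated
    in the abstract.
    """
    return u ** alpha

def access_points(num_frames, alpha, normalized_points=(0.25, 0.5, 0.75)):
    """Map three normalized time points to frame indices of a segment."""
    u = np.asarray(normalized_points, dtype=float)
    warped = warp(u, alpha)
    # Scale to the segment length and round to the nearest frame index.
    idx = np.rint(warped * (num_frames - 1)).astype(int)
    return np.clip(idx, 0, num_frames - 1)

def input_layer(spectra, alpha):
    """Read the original, unwarped spectral frames at the warped access points.

    spectra has shape (num_frames, num_channels). The result always has
    3 * num_channels values, whatever the segment duration.
    """
    idx = access_points(spectra.shape[0], alpha)
    return spectra[idx].reshape(-1)

# Two segments of different duration yield inputs of the same size.
short_segment = np.random.rand(9, 16)
long_segment = np.random.rand(30, 16)
x_short = input_layer(short_segment, alpha=1.0)
x_long = input_layer(long_segment, alpha=2.0)
```

In this sketch the spectral frames are only indexed, never resampled or interpolated, which mirrors the stated advantage that the input layer sees the original spectrum sequence. Changing `alpha` moves the three accessing points along the time axis without altering the data they read.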