Time-warping neural network for phoneme recognition

[Proceedings] 1991 IEEE International Joint Conference on Neural Networks. Pub Date: 1991-11-18. DOI: 10.1109/IJCNN.1991.170701

K. Aikawa


Citations: 2

Abstract

The author investigates a feedforward neural network that accepts phonemes of arbitrary duration and copes with nonlinear time warping. The time-warping neural network is characterized by time-warping functions embedded between the input layer and the first hidden layer. The input layer accesses three different time points, and these accessing points are determined by the time-warping functions. The input spectrum sequence itself is not warped; only the accessing-point sequence is warped. The advantage of this architecture is that the input layer can access the original spectrum sequence. The proposed network demonstrated higher phoneme recognition accuracy than a baseline recognizer based on conventional feedforward neural networks, and its accuracy was even higher than that achieved with discrete hidden Markov models.
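The accessing-point idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the form of the time-warping functions, so the three functions in `WARPS`, the base positions `(0.0, 0.5, 1.0)`, and the names `accessing_points` and `gather_input` are all assumptions made for illustration. The sketch only shows that warping the three accessing points, not the spectrum, yields a fixed-size input from a phoneme of any duration while reading the original frames unchanged.

```python
from typing import Callable, Dict, List, Sequence

Frame = List[float]

# Hypothetical warping functions on normalized time [0, 1] (assumed, not from the paper).
WARPS: Dict[str, Callable[[float], float]] = {
    "linear": lambda u: u,         # no warping
    "early": lambda u: u ** 2,     # middle point moves toward the start
    "late": lambda u: u ** 0.5,    # middle point moves toward the end
}


def accessing_points(num_frames: int,
                     warp: Callable[[float], float],
                     base_positions: Sequence[float] = (0.0, 0.5, 1.0)) -> List[int]:
    """Map three normalized base positions through a warping function
    and convert them to frame indices of the unwarped sequence."""
    last = num_frames - 1
    points = []
    for u in base_positions:
        w = min(max(warp(u), 0.0), 1.0)  # keep the warped position inside [0, 1]
        points.append(int(round(w * last)))
    return points


def gather_input(spectrum: Sequence[Frame],
                 warp: Callable[[float], float]) -> List[float]:
    """Concatenate the original frames found at the three accessing points.
    The spectrum itself is never resampled or modified."""
    vector: List[float] = []
    for index in accessing_points(len(spectrum), warp):
        vector.extend(spectrum[index])
    return vector


if __name__ == "__main__":
    # Toy 9-frame "spectrum" with 2 values per frame (made-up data).
    spectrum = [[float(t), float(t) * 10.0] for t in range(9)]
    for name, warp in WARPS.items():
        print(name, accessing_points(len(spectrum), warp), gather_input(spectrum, warp))
```

In this sketch each warping function produces one candidate input vector of the same length, regardless of how many frames the phoneme spans. How the network in the paper combines or selects among the warped access patterns is not stated in the abstract and is left out here.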
