{"title":"Speech articulatory analysis through time delay neural networks","authors":"F. Lavagetto","doi":"10.1109/ANNES.1995.499495","DOIUrl":null,"url":null,"abstract":"The approach described is based on the use of time delay neural networks for solving the task of articulatory estimation from acoustic speech and on image vector quantization as far as the visual synthesis is concerned. Once the system has been trained on a reference speaker, the association of visual cues is performed in real time to each 20 ms of incoming speech. Preliminary results are reported with reference to the ongoing experimentation both with normal hearing people and with deaf persons to estimate some of the many perceptual thresholds involved in the complex task of speech reading from synthetic images. This experimental phase is carried on in cooperation with FIADDA, the Italian association of the families of hearing impaired children, and is based on a flexible simulation environment.","PeriodicalId":123427,"journal":{"name":"Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ANNES.1995.499495","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The approach described is based on time delay neural networks for estimating articulatory parameters from acoustic speech, and on image vector quantization for visual synthesis. Once the system has been trained on a reference speaker, visual cues are associated in real time with each 20 ms frame of incoming speech. Preliminary results are reported from ongoing experiments with both normal-hearing and deaf subjects, aimed at estimating some of the many perceptual thresholds involved in the complex task of speech reading from synthetic images. This experimental phase is carried out in cooperation with FIADDA, the Italian association of families of hearing-impaired children, and is based on a flexible simulation environment.
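To make the acoustic-to-articulatory mapping concrete, below is a minimal sketch of a time delay neural network operating on 20 ms acoustic frames. The abstract does not specify the network architecture, feature set, or number of articulatory parameters, so every dimension here (13 acoustic features per frame, the layer sizes, the temporal context widths, and 6 output parameters) is an illustrative assumption, not a value from the paper. The sketch uses 1-D convolutions over time, the standard way to realize TDNN delay taps.

```python
# Hypothetical TDNN sketch: acoustic frames -> articulatory parameters.
# All dimensions are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

class ArticulatoryTDNN(nn.Module):
    def __init__(self, n_acoustic=13, n_articulatory=6):
        super().__init__()
        # 1-D convolutions over the time axis implement the "time delay"
        # taps: each hidden unit sees a short window of neighboring frames.
        self.net = nn.Sequential(
            nn.Conv1d(n_acoustic, 32, kernel_size=5, padding=2),
            nn.Tanh(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1),
            nn.Tanh(),
            nn.Conv1d(32, n_articulatory, kernel_size=1),
        )

    def forward(self, x):
        # x: (batch, n_acoustic, n_frames) -> (batch, n_articulatory, n_frames)
        return self.net(x)

# One second of speech segmented into 20 ms frames gives 50 frames;
# the network emits one articulatory estimate per frame, so a trained
# model could drive visual synthesis frame by frame in real time.
frames = torch.randn(1, 13, 50)
params = ArticulatoryTDNN()(frames)
print(params.shape)  # torch.Size([1, 6, 50])
```

In the paper's pipeline, each per-frame articulatory estimate would then select a stored mouth image via vector quantization; the codebook construction itself is not detailed in the abstract and is omitted here.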